ZhiKe AI
May 23, 2026 · Artificial Intelligence
Zhipu AI Unveils GLM-5.1-HighSpeed, Achieving 400 Tokens/s and 6× Faster Generation
On May 22, 2026, Zhipu AI released GLM-5.1-HighSpeed, a variant that generates up to 400 tokens per second, more than six times the speed of the standard GLM-5.1 and twice that of Google's Gemini-3.5-Flash. The speedup comes from optimizations across inference, attention, and sequence parallelism, and the variant retains the full capabilities of the base model.
GLM-5.1-HighSpeed · Inference Optimization · LLM
