May 23, 2026 · Artificial Intelligence

Zhipu AI Unveils GLM-5.1-HighSpeed, Achieving 400 Tokens/s and 6× Faster Generation

On May 22, 2026, Zhipu AI released GLM-5.1-HighSpeed, a variant that generates up to 400 tokens per second. That is more than six times the speed of the standard GLM-5.1 and twice that of Google's Gemini-3.5-Flash. The speedup comes from optimizations along several dimensions, including inference, attention, and sequence parallelism, and the variant preserves the model's full capabilities.
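To put the quoted ratios in concrete terms, here is a minimal back-of-the-envelope sketch. It uses only the three figures from the announcement (400 tokens/s, "over six times", "twice"). The derived baseline rates are implied estimates rather than measured numbers, and the 2,000-token response length is a hypothetical chosen purely for illustration. It also treats the peak rate as sustained and ignores time to first token.

```python
# Figures stated in the announcement.
HIGHSPEED_TPS = 400.0        # peak tokens per second ("up to")
SPEEDUP_VS_STANDARD = 6.0    # "over six times", so the implied baseline is an upper bound
SPEEDUP_VS_GEMINI = 2.0      # "twice"

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 2000


def implied_baseline(tps: float, speedup: float) -> float:
    """Throughput of the slower model implied by a speedup ratio."""
    return tps / speedup


def generation_seconds(num_tokens: int, tps: float) -> float:
    """Decode time at a sustained rate, ignoring time to first token."""
    return num_tokens / tps


standard_tps = implied_baseline(HIGHSPEED_TPS, SPEEDUP_VS_STANDARD)
gemini_tps = implied_baseline(HIGHSPEED_TPS, SPEEDUP_VS_GEMINI)

if __name__ == "__main__":
    for name, tps in [
        ("GLM-5.1-HighSpeed", HIGHSPEED_TPS),
        ("Gemini-3.5-Flash (implied)", gemini_tps),
        ("GLM-5.1 standard (implied, at most)", standard_tps),
    ]:
        secs = generation_seconds(RESPONSE_TOKENS, tps)
        print(f"{name}: {tps:.1f} tokens/s, {secs:.1f} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions, a 2,000-token response would take about 5 seconds at the stated peak rate, compared with roughly 10 seconds and 30 seconds at the two implied baseline rates.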

GLM-5.1-HighSpeed · Inference Optimization · LLM

