Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

The article dissects GLM-5’s 744B‑parameter MoE design, 28.5T‑token training corpus, novel Muon Split and MLA‑256 optimizations, DSA sparse attention, fully asynchronous RL pipeline, and extensive domestic chip adaptation, along with benchmark results that place it on par with Claude Opus 4.5 and ahead of Gemini 3 Pro.

AI Architecture · DSA · GLM-5

0 likes · 13 min read

Baidu Intelligent Cloud Tech Hub

Oct 28, 2025 · Artificial Intelligence

How Baidu’s New MTP Inference Code Doubles DeepSeek‑V3.2 Throughput

Baidu Baige and the SGLang community have open‑sourced a production‑tested MTP inference engine that more than doubles DeepSeek‑V3.2 decoding speed while staying stable, thanks to a DSA‑optimized architecture that predicts multiple tokens in a single forward pass.

AI · DSA · DeepSeek

0 likes · 4 min read

Fun with Large Models

Sep 30, 2025 · Artificial Intelligence

DeepSeek-V3.2 Architecture Breakthrough: A 5‑Minute Guide to Its Core Features

The article introduces DeepSeek-V3.2 and its new DeepSeek Sparse Attention (DSA), which boosts training and inference efficiency by up to 50% and sharply cuts model usage costs. It also explains the updated API endpoints and details the four‑stage post‑training pipeline that underpins the model’s performance improvements.

AI Architecture · DSA · DeepSeek-V3.2

0 likes · 8 min read

Architects' Tech Alliance

Mar 30, 2025 · Industry Insights

Why Memory, Not Compute, Is the Bottleneck for Next‑Gen AI Chips

The article analyzes the rapid growth of AI model memory and compute demands against the slow increase in chip memory capacity. It argues that memory bandwidth and energy consumption, rather than raw compute, will dominate AI chip design, and it emphasizes multi‑tenancy, DSA flexibility, and data‑flow optimization.

AI chips · DSA · Memory Bandwidth

0 likes · 7 min read

Architects' Tech Alliance

Mar 27, 2025 · Artificial Intelligence

What Makes AI Chips Different? A Deep Dive into Training and Inference Processors

This article explains the rise of AI‑specific processors, defines AI chips, compares their architectures, and examines the distinct requirements of training versus inference chips while outlining the main technology routes (GPU, FPGA, ASIC) and future outlook.

AI chips · ASIC · DSA

0 likes · 9 min read

DataFunTalk

Dec 1, 2021 · Artificial Intelligence

AI DSA: Architecture Features, Industry Trends, and Software Stack Challenges

The article summarizes Dr. Tang Shan's presentation on AI domain‑specific architectures, covering their background, the explosion of diverse AI hardware designs, and the significant software‑stack challenges that arise from fragmented tools and the need for full‑stack solutions.

AI · DSA · Hardware

0 likes · 14 min read
