Tagged articles
6 articles
Old Zhang's AI Learning
Feb 19, 2026 · Artificial Intelligence

Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

The article dissects GLM-5’s 744B‑parameter MoE design, its 28.5T‑token training corpus, the novel Muon Split and MLA‑256 optimizations, DSA sparse attention, a fully asynchronous RL pipeline, and extensive adaptation to domestic chips. It also covers benchmark results that place the model on par with Claude Opus 4.5 and ahead of Gemini 3 Pro.

AI Architecture · DSA · GLM-5
0 likes · 13 min read
Fun with Large Models
Sep 30, 2025 · Artificial Intelligence

DeepSeek-V3.2 Architecture Breakthrough: A 5‑Minute Guide to Its Core Features

The article introduces DeepSeek-V3.2 and its new DeepSeek Sparse Attention (DSA), which boosts training and inference efficiency by up to 50% and sharply cuts model usage costs. It also explains the updated API endpoints and details the four‑stage post‑training pipeline behind the model’s performance improvements.

AI Architecture · DSA · DeepSeek-V3.2
0 likes · 8 min read
Architects' Tech Alliance
Mar 30, 2025 · Industry Insights

Why Memory, Not Compute, Is the Bottleneck for Next‑Gen AI Chips

The article contrasts the rapid growth of AI models' memory and compute demands with the slow increase in chip memory capacity. It argues that memory bandwidth and energy consumption, rather than raw compute, will dominate AI chip design, and it emphasizes multi‑tenancy, DSA (domain‑specific architecture) flexibility, and data‑flow optimization.

AI chips · DSA · Memory Bandwidth
0 likes · 7 min read
DataFunTalk
Dec 1, 2021 · Artificial Intelligence

AI DSA: Architecture Features, Industry Trends, and Software Stack Challenges

The article summarizes Dr. Tang Shan's presentation on AI domain‑specific architectures (DSAs), covering their background, the explosion of diverse AI hardware designs, and the software‑stack challenges created by fragmented tools and the need for full‑stack solutions.

AI · DSA · Hardware
0 likes · 14 min read