How Tsinghua & Tencent Mixed‑X Won the MLSys 2026 MoE Inference Challenge with a 4.1× Speedup
The Tsinghua‑Tencent Mixed‑X team won the MLSys 2026 MoE inference optimization challenge by analyzing NPU bottlenecks and redesigning data movement. Combining expert‑level sharding, continuous DMA, PSUM batching, and an agent‑based optimizer, the team achieved a 4.1× end‑to‑end speedup while preserving bit‑level output fidelity.
