Tagged articles
2 articles
Machine Heart
May 4, 2026 · Artificial Intelligence

Mega MoE vs SonicMoE: Which Will Lead the Next AI Speed Race?

SonicMoE, a new ultra‑fast Mixture‑of‑Experts (MoE) system from Tri Dao and Ion Stoica’s team, reaches peak throughput on Nvidia Blackwell GPUs and outperforms DeepSeek’s DeepGEMM. Its algorithmic redesign decouples activation memory from expert granularity and fuses I/O‑aware kernels, running up to twice as fast as existing MoE frameworks.

AI Performance · Blackwell · GPU Acceleration
12 min read
Machine Heart
Apr 17, 2026 · Artificial Intelligence

DeepSeek Introduces Mega MoE and FP4 Indexer – Inside the New GPU Fusion Kernel

DeepSeek's latest DeepGEMM update adds Mega MoE, a GPU kernel that fuses the entire Mixture‑of‑Experts pipeline and overlaps computation with NVLink communication. The update also unveils an FP4 indexer and FP8×FP4 precision experiments, signaling a push toward highly efficient large‑scale AI training.

DeepGEMM · DeepSeek · FP4 Indexer
5 min read