Inside Grok-5 and MiniMax-M3: Massive Model Upscale and New Sparse Attention Gains
The article reports that xAI’s upcoming Grok-5 (Grok V9-Medium) will be a 1.5-trillion-parameter model trained on extensive programming data from Cursor, a detail that highlights a strategic partnership among SpaceX, Cursor, and xAI. Separately, MiniMax-M3 introduces a new sparse-attention architecture that speeds up prefill by 9.7× and decoding by 15.6×.
