Solo Development of GQLA: Challenging DeepSeek’s MLA and DSA
This article presents GQLA, a variant of MLA developed by a single author that eliminates three hardware‑related drawbacks of MLA. It shows how GQLA balances compute and memory performance on both the high‑end H100 and the more modest H20 GPU, then details conversion methods (TransGQLA) and sparse extensions, with concrete benchmark results.
