Huawei Ascend 950 NPU Architecture Deep Dive – Full Whitepaper Inside
This article gives a detailed technical analysis of Huawei's Ascend 950 NPU series. It covers the one-chip, dual-structure design for training and inference, SIMD/SIMT dual-mode compute, ultra-fine memory access granularity, prefill/decode (PD) separation, native FP4 support, a high-bandwidth 2.0 interconnect, and a fully self-developed yet CUDA-compatible ecosystem.
