Tagged articles
Baidu Intelligent Cloud Tech Hub
May 16, 2025 · Artificial Intelligence

How Baidu Cloud Achieved 4µs End-to-End Latency for Large-Scale PD Inference

Baidu Intelligent Cloud built an HPN cluster with 4µs end-to-end latency, optimized traffic management and communication operators, and introduced dynamic expert balancing. Together these changes substantially improved the performance of large-scale PD-separated (prefill/decode) inference services and show how closely network infrastructure can be integrated with AI workloads.

AI inference · All-to-All · HPN
14 min read