Baidu Intelligent Cloud Tech Hub
May 16, 2025 · Artificial Intelligence
How Baidu Cloud Achieved 4µs End-to-End Latency for Large-Scale PD Inference
Baidu Intelligent Cloud built an HPN cluster with 4µs end-to-end latency, optimized traffic management and communication operators, and introduced dynamic expert balancing. Together, these changes dramatically improve the performance of large-scale PD-separated (prefill/decode) inference services and show how deeply network infrastructure can be integrated with AI workloads.
AI inference · All-to-All · HPN
14 min read
