Kuaishou Tech
Jul 16, 2021 · Artificial Intelligence
Bagua: An Open‑Source Distributed Training Framework for Deep Learning
Bagua is a distributed training framework co‑developed by Kuaishou and ETH Zürich that combines algorithmic and system‑level optimizations—such as decentralized, asynchronous, and compressed communication—to achieve up to 60% higher performance than existing frameworks like PyTorch‑DDP, Horovod, and BytePS across various AI workloads.
BaguaGPU scalingPyTorch
0 likes · 15 min read