Tag

Sparse Parameters

Baidu Geek Talk
Oct 31, 2022 · Artificial Intelligence

PaddleBox: A GPU‑Based Ultra‑Large‑Scale Sparse DNN Training Framework

PaddleBox is Baidu’s GPU‑based ultra‑large‑scale sparse DNN training framework that combines a three‑tier hierarchical parameter server (SSD, DRAM, HBM) with pipelined scheduling and multi‑machine multi‑GPU communication, delivering 5–40× cost‑performance gains over traditional CPU solutions and powering Baidu’s advertising services.

GPU · Large-Scale Models · PaddleBox
15 min read
DataFunTalk
Dec 23, 2021 · Artificial Intelligence

Deep Customization and Optimization of TensorFlow for Large-Scale Sparse Training at Meituan

This article details Meituan's heavily customized internal TensorFlow 1.x implementation, covering large‑scale sparse parameter support, distributed training challenges, communication bottlenecks, and pipeline optimizations. The work achieved more than tenfold scalability improvements and significant per‑node performance gains on recommendation system workloads.

Sparse Parameters · TensorFlow · distributed training
32 min read