Tag

RSS


DataFunTalk
Jun 22, 2024 · Big Data

Migrating Spark Shuffle Service from ESS to RSS (Celeborn) at Zhihu: Design, Implementation, and Benefits

This article details Zhihu's migration of massive Spark and MapReduce shuffle workloads from the External Shuffle Service (ESS) to a push‑based Remote Shuffle Service (RSS) powered by Celeborn, covering background problems, evaluation of open‑source implementations, deployment architecture, encountered issues, solutions, performance gains, and future plans.

Big Data · Celeborn · RSS
Baidu Geek Talk
Jun 7, 2023 · Big Data

Optimization Practices for Offline Big Data Computing and Storage at Baidu MEG

Baidu MEG’s offline big‑data platform cut costs and boosted efficiency by applying intelligent scheduling, storage‑separation, tide‑power workload profiling, remote shuffle services, and dynamic quota resizing. Compute utilization rose from 55% to 80% and storage utilization from 63% to 78%, annual expenses fell by roughly ¥70 million, and task duration dropped by about 30%.

Big Data · Offline Computing · RSS