Forum on Building Ultra‑Scale Storage Systems: Insights from Baidu, Meituan, Ant Group, Xiaomi and Baidu Cloud
The forum gathers senior experts from Baidu, Meituan, Ant Group, Xiaomi and Baidu Cloud to share practical experience and discuss future trends in building ultra‑large‑scale file, block, KV and NoSQL storage systems, with a focus on low‑cost, high‑performance solutions and architectural challenges.
With the rapid growth of enterprise intelligence, cloud computing, AI and big data, data storage volumes are expanding exponentially, making the construction of highly scalable, ultra‑large‑scale storage systems a critical challenge.
Speaker: Duan Liguo – Baidu Cloud Storage Architect
Master’s graduate of Northeastern University (2011) with 10 years of storage development experience, currently leading Baidu Object Storage (BOS).
Speaker: Ma Jingwei – Baidu Cloud Architect
Ph.D. from Nankai University and author of multiple papers at CCF A/B‑ranked venues. Joined Baidu Cloud in 2016 and led the design and implementation of the Append engine and EC engine for Baidu’s large‑scale block storage (CDS), dramatically reducing PB‑level storage costs and improving performance.
Talk Title: Building Large‑Scale Block Storage EC Systems
Outline: Comparison of data fault‑tolerance methods; technical challenges of large‑scale block storage EC; Baidu’s implementation and business impact.
Audience Takeaways: Difficulties and solutions for building EC engines in massive block storage systems.
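For readers less familiar with the trade‑off behind the first outline item, the following is a minimal sketch rather than anything taken from the talk. It compares the storage overhead of full replication with that of a k+m erasure code, then uses single XOR parity (the simplest erasure code, with m = 1) to rebuild a lost block. The function names and the 3‑copy and 8+4 parameters are illustrative assumptions, not Baidu CDS's actual configuration.

```python
from functools import reduce
from typing import Sequence


def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte when keeping `copies` full copies."""
    return float(copies)


def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per logical byte with k data blocks + m parity blocks."""
    return (k + m) / k


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical parameters: 3 replicas survive 2 losses at 3.0x raw storage,
# while an 8+4 code survives 4 losses at 1.5x raw storage.
print(replication_overhead(3))  # 3.0
print(ec_overhead(8, 4))        # 1.5

# Single-parity example: three data blocks plus one XOR parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# Pretend data[1] was lost; XOR of the survivors and the parity restores it.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])     # True
```

The lower overhead is paid for at repair time: rebuilding one block means reading the surviving blocks of the stripe, which is one reason EC is harder to apply to latency‑sensitive block storage than to cold data.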
Speaker: Qi Zeben – Meituan Researcher (Basic Technology Department)
Responsible for KV and file storage at Meituan, with over 10 years of distributed storage R&D and operations experience. Previously contributed to Baidu’s MFS and BDRP projects.
Talk Title: Challenges and Architectural Practices of Meituan’s Large‑Scale KV Storage
Outline: Overview of Meituan’s KV services, which handle trillions of requests per day at five‑nines (99.999%) availability; introduction of two KV systems, Cellar (high reliability) and Squirrel (high throughput); scalability and availability challenges; future trends.
Audience Takeaways: 1) Understanding scalability and availability challenges of massive KV storage; 2) Differences between distributed cache and persistent KV architectures; 3) Future directions of KV storage technology.
Speaker: He Yuchen – Senior Software Engineer, Xiaomi
Holds bachelor’s and master’s degrees from Renmin University. Joined Xiaomi in 2017 and works on the distributed KV storage system Pegasus, contributing major features such as Bulk Load and Partition Split, and serving as a Pegasus PPMC member since the project joined the Apache Incubator.
Talk Title: Apache Pegasus: Architecture, Use Cases, and Future Roadmap
Outline: Overall architecture, core functionalities, typical user scenarios, and upcoming plans for Pegasus.
Audience Takeaways: Suitable application scenarios for Pegasus and how to contribute to open‑source projects.
Speaker: Huang Hua – Ant Group Graph Computing Technology Expert
Decades of experience in storage devices, engines, and large‑scale database storage systems.
Talk Title: Perfect‑Hash‑Based Read‑Optimized Storage Systems
Outline: 1) Batch‑update storage systems using perfect‑hash indexes and their use in Ant Group; 2) Building high‑efficiency, low‑cost ultra‑large‑scale KV storage with perfect‑hash; 3) Transforming static perfect‑hash indexes into real‑time read‑write storage.
Audience Takeaways: 1) Designing compact, high‑efficiency point‑lookup storage with perfect‑hash; 2) Enabling real‑time read/write capabilities for static hash indexes.
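As background for the takeaways above, here is a minimal sketch of a static point‑lookup table built on a perfect hash. It is not Ant Group's design: the `slot` helper, the `PerfectHashTable` class and the brute‑force seed search are assumptions chosen for brevity, and the search is only practical for small key sets (large systems use more scalable constructions). The idea it illustrates is that for a fixed key set, a seed can be found under which every key maps to its own slot, so a lookup is one hash plus one array access with no collision chains.

```python
import hashlib


def slot(key: str, seed: int, n: int) -> int:
    """Map a key to a slot in [0, n) using a seeded, deterministic hash."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


class PerfectHashTable:
    """Static table: built once from a fixed key set, then read-only."""

    def __init__(self, items: dict, max_seed: int = 1_000_000):
        if not items:
            raise ValueError("key set must not be empty")
        n = len(items)
        # Brute-force a seed under which all n keys land in n distinct slots.
        for seed in range(max_seed):
            if len({slot(k, seed, n) for k in items}) == n:
                break
        else:
            raise ValueError("no collision-free seed found")
        self.seed, self.n = seed, n
        self.slots = [None] * n
        for k, v in items.items():
            self.slots[slot(k, seed, n)] = (k, v)

    def get(self, key: str):
        # Every slot is occupied, so a key outside the build set still hits
        # some entry; comparing the stored key rejects it.
        stored_key, value = self.slots[slot(key, self.seed, self.n)]
        return value if stored_key == key else None


# Hypothetical sample data.
table = PerfectHashTable(
    {"user:1": "alice", "user:2": "bob", "user:3": "carol",
     "user:4": "dave", "user:5": "erin"}
)
print(table.get("user:3"))   # carol
print(table.get("user:99"))  # None
```

Because the seed depends on the exact key set, adding a key means rebuilding the table. That fits batch‑update workloads well and is also why turning such an index into a real‑time read‑write store takes extra machinery.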
Speaker: Zheng Pengfei – Baidu Cloud Senior Architect
Ph.D. from University of Chinese Academy of Sciences; leads Baidu Cloud’s file storage direction with 8 years of distributed storage experience across block, object, cache, and file systems.
Talk Title: Building a Billion‑File Scale Distributed File System
Outline: 1) Factors affecting the scalability of distributed file systems; 2) Evolution of metadata systems; 3) Core design of Baidu Cloud CFS metadata system.
Audience Takeaways: 1) Fundamental scalability issues of distributed file systems; 2) Abstract concepts of distributed file system design; 3) How Baidu Cloud solved metadata scalability.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.