A Decade of Alibaba's Big Data Platform Evolution Through Double 11
The article chronicles Alibaba's ten‑year journey of building and scaling its big data platform, from early Oracle clusters and the Hadoop‑based Cloud‑Ladder 1 to the self‑developed ODPS/MaxCompute, the real‑time Blink engine, and the unified DataWorks ecosystem. It highlights the key technical milestones, performance breakthroughs, and operational challenges behind successive Double 11 shopping festivals.
Every year, the Double 11 shopping festival is a high‑stakes test for Alibaba's data teams, and ten years of festivals trace the evolution of its massive data processing platform.
Stage 1: Early Days (2009‑2010)
Alibaba relied on Oracle clusters and the early Hadoop‑based Cloud‑Ladder 1. As transaction volume surged, it quickly outgrew these systems, which prompted the creation of the proprietary distributed storage (Pangu) and scheduler (Fuxi) components of the “Feitian” cloud platform.
Stage 2: First‑Generation ODPS (2010‑2011)
ODPS became the core compute engine, but stability issues forced teams to work overnight to process daily batch jobs, highlighting the need for more robust scheduling and resource management.
Stage 3: New‑Generation Platform (2012‑2015)
Key projects such as Ice‑Fire‑Bird, the 5K challenge, and the “Moon” migration unified offline and real‑time workloads, expanded cluster size from 1,500 to over 10,000 servers, and introduced tools like DataX, TT, and the upgraded Tianwang scheduler.
Stage 4: Global Expansion (2015‑2017)
Alibaba began offering its data platform to external customers, launched MaxCompute (formerly ODPS), StreamCompute, and DataWorks, and achieved record‑breaking processing speeds during Double 11 (e.g., 4.72 × 10⁸ rows/s in the real‑time Blink engine).
Stage 5: Innovation Era (2017‑present)
Further advancements include MaxCompute Lightning for sub‑second interactive queries, storage‑compute separation, hybrid‑cloud deployments, and the DataHub bus for cross‑region real‑time data replication, all supported by automated SRE tools like TeslaV3.0.
Overall, the continuous technical breakthroughs and operational refinements have turned Alibaba’s big data platform into a core, enterprise‑grade service that powers not only internal business but also global cloud customers.
Alibaba Cloud Infrastructure
For uninterrupted computing services