
Building and Evolving the Dada‑JD Daojia Big Data Platform: Architecture, Strategies, and Lessons Learned

This article presents a comprehensive case study of the Dada‑JD Daojia big data platform, detailing its evolution from a MySQL‑based warehouse to a multi‑layered One Data, One Platform, One Service, Many Apps architecture, the technical challenges faced, and the strategic approaches adopted to ensure coverage, accuracy, stability, and scalability.

Dada Group Technology

Author Bio: Cai Zhiwu, head of Dada‑JD Daojia big data product development, graduated from Shanghai Jiao Tong University Software School, with extensive experience in data warehouse and platform construction and team management.

The Dada‑JD Daojia big data platform was built to support the rapid growth of the company's logistics and retail businesses, drawing on industry best practices while tailoring its own implementation strategy to ensure continuous, sustainable development.

Construction Review: In 2016, the DRP platform ran on MySQL and served short‑term reporting needs. In 2017, the warehouse migrated to Hive, and the team built unified tools for permissions, metadata, self‑service reporting, querying, and data exchange. In 2018, the focus shifted to rebuilding the scheduling system, adding demand management and data quality monitoring, and developing E‑SQL to improve usability. In 2019, the team began treating data as an asset, establishing an ecosystem of compute, storage, security, and synchronization engines and moving to product‑driven application development.

The overall framework consists of four pillars: One Data (unified data warehouse), One Platform (integrated data platform), One Service (centralized data services), and Many Apps (various data applications).

One Data: Covers 22 domain topics across logistics and home‑delivery, handling over 5,000 offline tasks, 120+ real‑time tasks, and petabytes of data. Emphasis is placed on coverage, accuracy, and stability, achieved through unified data sources, modeling, centralized ETL, data quality monitoring, source probing, ETL optimization, cluster‑based scheduling, multi‑level alerts, and professional operations.
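
To make the idea of data quality monitoring with multi‑level alerts more concrete, here is a minimal Python sketch. It is not the platform's actual implementation: the `RowCountRule` type, the `check_row_count` function, and the drift thresholds are all hypothetical, and a row‑count comparison is only one of many checks such a system might run.

```python
from dataclasses import dataclass
from enum import IntEnum


class AlertLevel(IntEnum):
    """Hypothetical alert tiers; the article only says alerts are multi-level."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


@dataclass(frozen=True)
class RowCountRule:
    """Hypothetical rule: compare a table's row count against a baseline."""
    table: str
    warn_drift: float      # relative drift that triggers a warning, e.g. 0.1 = 10%
    critical_drift: float  # relative drift that triggers a critical alert


def check_row_count(rule: RowCountRule, baseline: int, actual: int) -> AlertLevel:
    """Classify the relative difference between actual and baseline row counts."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    drift = abs(actual - baseline) / baseline
    if drift >= rule.critical_drift:
        return AlertLevel.CRITICAL
    if drift >= rule.warn_drift:
        return AlertLevel.WARNING
    return AlertLevel.OK


if __name__ == "__main__":
    rule = RowCountRule(table="orders", warn_drift=0.1, critical_drift=0.5)
    for actual in (1020, 800, 100):
        print(actual, check_row_count(rule, baseline=1000, actual=actual).name)
```

In a setup like this, each level could be routed to a different channel (for example, a dashboard for warnings and on‑call paging for critical alerts), though the article does not describe how its alerts are delivered.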

One Platform: Provides unified permission management, a scheduling development platform, data source and ingestion capabilities, storage engines (Hive, HDFS, HBase, Kafka, Redis, Elasticsearch, Druid, MySQL), resource management via YARN, compute engines (Flink for streaming, Spark SQL for near‑line, Hive for batch, Presto for ad‑hoc queries), and a suite of data applications.
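
The engine assignments listed above can be summarized as a small lookup. The Python sketch below only restates the pairing given in the text (Flink, Spark SQL, Hive, Presto); the `Workload` enum and the `pick_engine` function are hypothetical names introduced for illustration, and the platform's real dispatch logic is not described in the article.

```python
from enum import Enum


class Workload(Enum):
    """Workload categories named in the article."""
    STREAMING = "streaming"
    NEAR_LINE = "near-line"
    BATCH = "batch"
    AD_HOC = "ad-hoc"


# Pairings taken from the text; the lookup itself is an illustrative assumption.
ENGINE_BY_WORKLOAD = {
    Workload.STREAMING: "Flink",
    Workload.NEAR_LINE: "Spark SQL",
    Workload.BATCH: "Hive",
    Workload.AD_HOC: "Presto",
}


def pick_engine(workload: Workload) -> str:
    """Return the compute engine the article associates with a workload type."""
    return ENGINE_BY_WORKLOAD[workload]


if __name__ == "__main__":
    for workload in Workload:
        print(f"{workload.value}: {pick_engine(workload)}")
```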

One Service: Consolidates data delivery through middle‑layer services, file storage services, and configurable API services, enabling efficient data access for internal and external applications without extensive custom development.
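
One way to picture a configurable API service is a registry in which each endpoint is defined by configuration (a query template plus its required parameters) rather than by custom code. The Python sketch below is an assumption‑laden illustration, not the platform's design: `API_CONFIG`, `call_api`, `demo_connection`, and the `orders` table are hypothetical, and an in‑memory SQLite database stands in for whatever storage the real service reads from.

```python
import sqlite3

# Hypothetical registry: adding an endpoint means adding an entry, not writing code.
API_CONFIG = {
    "orders_by_city": {
        "sql": (
            "SELECT city, COUNT(*) AS order_count "
            "FROM orders WHERE city = :city GROUP BY city"
        ),
        "params": ["city"],
    },
}


def call_api(conn: sqlite3.Connection, name: str, **kwargs):
    """Run the configured query for `name` and return rows as dictionaries."""
    spec = API_CONFIG[name]
    missing = [p for p in spec["params"] if p not in kwargs]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    # Named placeholders keep caller input out of the SQL text itself.
    bound = {p: kwargs[p] for p in spec["params"]}
    cursor = conn.execute(spec["sql"], bound)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table so the sketch can run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT)")
    conn.executemany(
        "INSERT INTO orders (city) VALUES (?)",
        [("Shanghai",), ("Shanghai",), ("Beijing",)],
    )
    return conn


if __name__ == "__main__":
    print(call_api(demo_connection(), "orders_by_city", city="Shanghai"))
```

The design choice illustrated here is that consumers depend on a stable endpoint name and parameter list, so the underlying query can change without touching the calling applications.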

Many Apps: Includes BI self‑built applications (Cangqiong platform, DRP, MiniReport, MyQuery, E‑SQL), shared platform applications (self‑service reporting and querying), and co‑built business applications (CRM, logistics algorithms, order and marketing systems), all leveraging the data middle‑platform to accelerate development and improve data utilization.

In the concluding outlook, the platform is likened to a young adult entering its prime, poised to make further advances in data empowerment and technical depth.

Recruitment: The Dada‑JD Daojia big data product development team invites talented individuals in data product, data warehouse, data platform, and data visualization to join the effort; interested candidates should send their resumes to [email protected].
