Evolution of JD Baitiao’s Data Architecture: From MySQL to Apache ShardingSphere
This article chronicles JD Baitiao’s journey from a monolithic MySQL setup through NoSQL and DBRep to a mature ShardingSphere‑based sharding solution, highlighting the technical motivations, architectural decoupling strategies, evaluation criteria, performance comparison, and the operational benefits achieved for a high‑traffic financial service.
JD Baitiao, a fast‑growing financial consumption platform of JD.com, has continuously evolved its data architecture to support billions of users and massive daily traffic. The article outlines the progression from the early MySQL‑centric design to a multi‑stage evolution involving Solr + HBase, MongoDB, and finally a DBRep‑driven pipeline feeding Elasticsearch and HBase.
1. Technology lifecycle – MySQL → NoSQL → DBRep
During 2014‑2015, a Solr + HBase solution split read and write loads but added operational overhead. In 2015‑2016, MongoDB was adopted with month‑based sharding, which sped up queries on hot data but consumed large amounts of memory and scaled poorly. By 2016‑2017, the data volume had passed the hundreds‑of‑billions mark, prompting the introduction of DBRep, which captures MySQL binlog changes and routes them to a messaging hub; the changes are ultimately persisted to Elasticsearch and HBase for real‑time analytics.
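The binlog fan‑out described above can be pictured with a minimal Java sketch. It is not DBRep's actual code: `RowChange` and `ChangeHub` are hypothetical names, and the hub is an in‑memory stand‑in for the messaging hub, with each subscriber playing the role of a downstream store such as Elasticsearch or HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical row-change event, loosely modeled on what a binlog reader might emit.
final class RowChange {
    final String table;      // source table name
    final String op;         // "INSERT", "UPDATE" or "DELETE"
    final String primaryKey; // primary key of the changed row

    RowChange(String table, String op, String primaryKey) {
        this.table = table;
        this.op = op;
        this.primaryKey = primaryKey;
    }
}

// Hypothetical in-memory stand-in for the messaging hub: every published
// change is delivered to every subscribed sink, in subscription order.
final class ChangeHub {
    private final List<Consumer<RowChange>> sinks = new ArrayList<>();

    void subscribe(Consumer<RowChange> sink) {
        sinks.add(sink);
    }

    void publish(RowChange change) {
        for (Consumer<RowChange> sink : sinks) {
            sink.accept(change);
        }
    }
}
```

A search sink and an archive sink would each call `subscribe` once; the writer side only calls `publish`, so adding another downstream store does not touch the code that reads the binlog.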
2. Decoupling the backend architecture
To reduce code coupling, the team identified four decoupling dimensions: data‑architecture, technical‑architecture, business‑relationship, and development‑process. They evaluated sharding middleware against four essential criteria—product maturity, extreme performance, massive data handling, and flexible extensibility—concluding that a mature, stable solution was required for a financial‑grade system.
3. Choosing Apache ShardingSphere
A comparative test between a self‑developed sharding framework and ShardingSphere showed that both delivered high performance, but ShardingSphere offered lower code coupling, less intrusion into business code, easier upgrades, and better scalability. The comparison table below summarizes the results:
Criterion | Self‑Developed Sharding | ShardingSphere
Performance | High | High
Code Coupling | High | Low
Business Intrusion | High | Low
Upgrade Difficulty | High | Low
Scalability | Average | Good
Consequently, JD Baitiao adopted Apache ShardingSphere as its primary sharding middleware for financial‑grade data partitioning.
4. ShardingSphere‑JDBC solution
ShardingSphere‑JDBC is a lightweight Java framework that acts as an enhanced JDBC driver, fully compatible with existing ORM tools and requiring no additional deployment. Its key features—mature product, excellent performance, minimal migration effort, and flexible extensibility—matched JD Baitiao’s requirements.
5. Implementation and benefits
The migration was cut over across four weeks and used a custom HASH sharding strategy that produced nearly ten thousand data nodes. DBRep fed changes into ShardingSphere, and running clusters in parallel allowed thorough data validation. Post‑migration benefits include simpler upgrade paths, reduced development effort, and flexible scaling to handle peak events such as “618” and “11.11”.
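As an illustration of how a hash strategy can map a sharding key onto that many data nodes, here is a minimal Java sketch. It is not JD Baitiao's actual algorithm, which the article does not disclose. The class name `HashShardRouter`, the 100 × 100 layout, the CRC32 hash, and the `ds_N.t_order_M` naming are all assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: routes a sharding key to one of dbCount * tablesPerDb data nodes.
final class HashShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    HashShardRouter(int dbCount, int tablesPerDb) {
        if (dbCount <= 0 || tablesPerDb <= 0) {
            throw new IllegalArgumentException("dbCount and tablesPerDb must be positive");
        }
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    int totalNodes() {
        return dbCount * tablesPerDb;
    }

    // CRC32 yields a non-negative value that is stable across JVMs,
    // so the same key always lands in the same slot.
    int slot(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % totalNodes());
    }

    // The slot is split into a database index and a table index within that database.
    int dbIndex(String key) {
        return slot(key) / tablesPerDb;
    }

    int tableIndex(String key) {
        return slot(key) % tablesPerDb;
    }

    // Assumed naming scheme, for illustration only.
    String dataNode(String key) {
        return "ds_" + dbIndex(key) + ".t_order_" + tableIndex(key);
    }
}
```

With `new HashShardRouter(100, 100)`, every key resolves to exactly one of 10,000 nodes. In ShardingSphere‑JDBC, logic of this kind would normally live inside a custom sharding algorithm class rather than in business code, which is the low‑intrusion property noted in the comparison above.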
6. Future outlook – Database Plus
Facing an increasingly fragmented database ecosystem, JD Baitiao embraces the “Database Plus” concept promoted by ShardingSphere’s community, aiming to provide a pluggable, unified data‑governance layer that can add capabilities like horizontal scaling and encryption without altering underlying databases.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.