Evolution of JD Baitiao’s Data Architecture: From MySQL to Apache ShardingSphere

This article chronicles JD Baitiao’s journey from a monolithic MySQL setup through NoSQL and DBRep to a mature ShardingSphere‑based sharding solution, highlighting the technical motivations, architectural decoupling strategies, evaluation criteria, performance comparison, and the operational benefits achieved for a high‑traffic financial service.

Architecture Digest

May 16, 2022

JD Baitiao, a fast‑growing financial consumption platform of JD.com, has continuously evolved its data architecture to support billions of users and massive daily traffic. The article outlines the progression from the early MySQL‑centric design to a multi‑stage evolution involving Solr + HBase, MongoDB, and finally a DBRep‑driven pipeline feeding Elasticsearch and HBase.

1. Technology lifecycle – MySQL → NoSQL → DBRep

During 2014-2015, a Solr + HBase solution separated read and write loads but introduced operational overhead. In 2015-2016, MongoDB was adopted with month-based sharding, which sped up queries on hot data but came with high memory consumption and limited scalability. By 2016-2017, the data volume had passed the hundreds-of-billions mark, prompting the introduction of DBRep, which captures MySQL binlog changes and routes them to a messaging hub; the changes are ultimately persisted to Elasticsearch and HBase for real-time analytics.
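The binlog-to-hub-to-store flow can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: the source does not describe DBRep's interfaces, so `ChangeEvent`, `MessageHub`, and `replay` are hypothetical names, and two plain lists stand in for Elasticsearch and HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of a change-capture fan-out; not DBRep's real API.
class ChangePipelineSketch {

    /** One row-level change, shaped loosely like a decoded binlog event (assumed fields). */
    static final class ChangeEvent {
        final String table;
        final String op; // e.g. INSERT, UPDATE, DELETE
        final Map<String, Object> row;

        ChangeEvent(String table, String op, Map<String, Object> row) {
            this.table = table;
            this.op = op;
            this.row = row;
        }
    }

    /** Stand-in for the messaging hub: every subscriber receives every published event. */
    static final class MessageHub {
        private final List<Consumer<ChangeEvent>> subscribers = new ArrayList<>();

        void subscribe(Consumer<ChangeEvent> sink) {
            subscribers.add(sink);
        }

        void publish(ChangeEvent event) {
            for (Consumer<ChangeEvent> sink : subscribers) {
                sink.accept(event);
            }
        }
    }

    /** Wires two in-memory sinks to the hub, replays the events, and returns both sinks. */
    static List<List<ChangeEvent>> replay(List<ChangeEvent> events) {
        List<ChangeEvent> searchSink = new ArrayList<>(); // stands in for Elasticsearch
        List<ChangeEvent> wideSink = new ArrayList<>();   // stands in for HBase
        MessageHub hub = new MessageHub();
        hub.subscribe(searchSink::add);
        hub.subscribe(wideSink::add);
        for (ChangeEvent event : events) {
            hub.publish(event);
        }
        return Arrays.asList(searchSink, wideSink);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("bill_id", 1L);
        List<List<ChangeEvent>> sinks = replay(Arrays.asList(
                new ChangeEvent("bill", "INSERT", row),
                new ChangeEvent("bill", "UPDATE", row)));
        System.out.println("search sink: " + sinks.get(0).size()
                + " events, wide-column sink: " + sinks.get(1).size() + " events");
    }
}
```

The point of the hub in the middle is that the MySQL side publishes each change once, while downstream stores can be added or removed as subscribers without touching the capture step.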

2. Decoupling the backend architecture

To reduce code coupling, the team identified four decoupling dimensions: data architecture, technical architecture, business relationships, and development process. They then evaluated sharding middleware against four essential criteria: product maturity, extreme performance, massive data handling, and flexible extensibility. Their conclusion was that a financial-grade system requires a mature, stable solution.

3. Choosing Apache ShardingSphere

A comparative test between a self-developed sharding framework and ShardingSphere showed that both delivered high performance, but ShardingSphere offered lower code coupling, less business intrusion, easier upgrades, and better scalability. The comparison table below summarizes the results:

| Criterion | Self-Developed Sharding | ShardingSphere |
| --- | --- | --- |
| Performance | High | High |
| Code Coupling | High | Low |
| Business Intrusion | High | Low |
| Upgrade Difficulty | High | Low |
| Scalability | Average | Good |

Consequently, JD Baitiao adopted Apache ShardingSphere as its primary sharding middleware for financial‑grade data partitioning.

4. ShardingSphere‑JDBC solution

ShardingSphere-JDBC is a lightweight Java framework that acts as an enhanced JDBC driver. It is fully compatible with existing ORM tools and requires no additional deployment. Its key strengths (a mature product, excellent performance, minimal migration effort, and flexible extensibility) matched JD Baitiao's requirements.

5. Implementation and benefits

The migration involved a four‑week cut‑over using a custom HASH strategy, creating nearly ten thousand data nodes. DBRep fed changes into ShardingSphere, while parallel clusters allowed thorough data validation. Post‑migration benefits include simplified upgrade paths, reduced development effort, and flexible scaling to handle peak events such as “618” and “11.11”.
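The source does not spell out the custom HASH strategy, so the following is only a minimal sketch of how hash-based routing to a large grid of data nodes can work. Everything specific here is an assumption made for illustration: the CRC32 hash, the 100 databases × 100 tables layout (chosen only because it yields ten thousand nodes), and the `ds_` / `t_bill_` naming. It is plain Java and does not use the ShardingSphere algorithm interfaces.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical hash router; not JD Baitiao's actual strategy or a ShardingSphere class.
class HashShardRouter {
    private final int databaseCount;
    private final int tablesPerDatabase;

    HashShardRouter(int databaseCount, int tablesPerDatabase) {
        if (databaseCount <= 0 || tablesPerDatabase <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        this.databaseCount = databaseCount;
        this.tablesPerDatabase = tablesPerDatabase;
    }

    /** Total number of data nodes (database/table pairs). */
    int nodeCount() {
        return databaseCount * tablesPerDatabase;
    }

    /** Maps a sharding key to a stable slot in [0, nodeCount). */
    int slotFor(String shardingKey) {
        CRC32 crc = new CRC32();
        crc.update(shardingKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so the remainder is never negative.
        return (int) (crc.getValue() % nodeCount());
    }

    /** Resolves the slot into a "database.table" data node name (assumed naming scheme). */
    String nodeFor(String shardingKey) {
        int slot = slotFor(shardingKey);
        int database = slot / tablesPerDatabase;
        int table = slot % tablesPerDatabase;
        return "ds_" + database + ".t_bill_" + table;
    }

    public static void main(String[] args) {
        HashShardRouter router = new HashShardRouter(100, 100);
        System.out.println(router.nodeCount() + " data nodes");
        System.out.println("user-42 -> " + router.nodeFor("user-42"));
    }
}
```

Because the slot depends only on the key and the node count, the same key always lands on the same node, which is what makes it possible to validate migrated data by comparing the old and new clusters key by key.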

6. Future outlook – Database Plus

Facing an increasingly fragmented database ecosystem, JD Baitiao embraces the “Database Plus” concept promoted by ShardingSphere’s community, aiming to provide a pluggable, unified data‑governance layer that can add capabilities like horizontal scaling and encryption without altering underlying databases.
