Why Split Databases? An Introduction to Vertical and Horizontal Sharding
This article explains the reasons for splitting databases, describes vertical and horizontal sharding techniques, compares their benefits and drawbacks, and advises performing horizontal sharding before vertical partitioning to improve scalability and performance in large‑scale applications.
1. Why Split Databases?
In a monolithic project, a single MySQL instance is often sufficient. As data volume and load grow, as they do on large e‑commerce platforms like Taobao, the original architecture can no longer meet performance requirements. This prompts a move from MySQL to Oracle and eventually to a multi‑node, split‑database design.
Database splitting can be done in two ways: vertical partitioning and horizontal partitioning. The recommended order is to perform horizontal splitting first, followed by vertical splitting.
2. What Is Vertical Partitioning?
Vertical partitioning splits a database by business domain: tables that belong to the same domain are placed in their own independent database, and each other domain gets a separate database.
For example, an e‑commerce system might place all product‑related tables in one database (forming a product system) and all inventory‑related tables in another database (forming a warehouse system).
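The product/warehouse split above can be sketched as a lookup from table name to owning database. This is only an illustration: the table names, the database names product_db and warehouse_db, and the function database_for are all hypothetical and do not come from any particular middleware.

```python
# Hypothetical mapping from table name to the database that owns its
# business domain. All names here are illustrative assumptions.
TABLE_TO_DATABASE = {
    "product": "product_db",
    "product_category": "product_db",
    "product_image": "product_db",
    "inventory": "warehouse_db",
    "warehouse": "warehouse_db",
}


def database_for(table: str) -> str:
    """Return the database that holds the given table."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise ValueError(f"no database configured for table {table!r}") from None


print(database_for("product"))    # product_db
print(database_for("inventory"))  # warehouse_db
```

Note that the whole table moves to one database; no table is made smaller by this kind of split.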
Problems Solved by Vertical Partitioning
It reduces the load on a single database node by distributing tables across multiple MySQL instances, thereby lowering per‑node read/write pressure.
Problems Not Solved by Vertical Partitioning
It does not shrink individual tables; large tables (e.g., a product table with over 20 million rows) still suffer performance degradation.
3. What Is Horizontal Partitioning?
Horizontal partitioning splits data according to a rule on a specific field, distributing rows of a large table into multiple tables, which effectively reduces table size.
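As a minimal sketch of such a rule, the following assumes a user table split into four physical tables by taking the id modulo the table count. The shard count, the user_N naming scheme, and the helper names are assumptions made for illustration only.

```python
from collections import defaultdict

SHARD_COUNT = 4  # assumed number of physical tables


def shard_table(user_id: int, shard_count: int = SHARD_COUNT) -> str:
    """Pick the physical table for a row from its id (modulo rule)."""
    return f"user_{user_id % shard_count}"


def distribute(rows, shard_count: int = SHARD_COUNT):
    """Group rows by the physical table the rule assigns them to."""
    tables = defaultdict(list)
    for row in rows:
        tables[shard_table(row["id"], shard_count)].append(row)
    return dict(tables)


# 1000 logical rows end up as four tables of 250 rows each.
rows = [{"id": i, "name": f"user{i}"} for i in range(1, 1001)]
tables = distribute(rows)
for name in sorted(tables):
    print(name, len(tables[name]))
```

Each physical table holds a quarter of the rows, which is the reduction in table size that vertical partitioning cannot provide.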
A common misconception is that horizontal partitions must reside on different MySQL nodes. They can also be stored on the same node using MySQL’s built‑in partitioning feature, which can spread a table's data across multiple disks to increase I/O capacity.
Uses of Horizontal Partitioning
It is suitable for large tables such as user, product, address, or order tables in e‑commerce systems; smaller tables like brand or supplier tables usually do not need partitioning.
Drawbacks of Horizontal Partitioning
Different tables may require different partitioning rules, so a middleware with rich sharding capabilities (e.g., MyCat, Atlas, ProxySQL) is needed. MyCat, written in Java, acts as an SQL router and supports various sharding strategies such as modulo, range, and date‑based partitioning.
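The three strategies named above can be sketched as plain functions, with one rule registered per table. This is not MyCat's configuration format or API; the registry SHARDING_RULES, the column names, and the thresholds are hypothetical and only show why different tables may need different rules.

```python
from datetime import date


def by_modulo(shard_count: int):
    """Modulo rule: spreads consecutive ids evenly over a fixed shard count."""
    def rule(value: int) -> int:
        return value % shard_count
    return rule


def by_range(rows_per_shard: int):
    """Range rule: ids 0..N-1 go to shard 0, N..2N-1 to shard 1, and so on."""
    def rule(value: int) -> int:
        return value // rows_per_shard
    return rule


def by_month(value: date) -> str:
    """Date rule: one shard per calendar month."""
    return f"{value.year:04d}{value.month:02d}"


# Hypothetical per-table configuration: (sharding column, rule).
SHARDING_RULES = {
    "user": ("id", by_modulo(4)),
    "product": ("id", by_range(5_000_000)),
    "order": ("created_on", by_month),
}


def route(table: str, row: dict) -> str:
    """Return the physical table name for a row of the given logical table."""
    column, rule = SHARDING_RULES[table]
    return f"{table}_{rule(row[column])}"


print(route("user", {"id": 10}))                          # user_2
print(route("product", {"id": 12_000_000}))               # product_2
print(route("order", {"created_on": date(2019, 6, 15)}))  # order_201906
```

In a real deployment the middleware applies the rule while routing SQL, so the application keeps addressing the logical table name.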
Another drawback is the high cost of scaling: adding a new shard often requires 4–8 MySQL nodes. A cheaper alternative is to separate hot data from cold data and archive the cold data periodically. TokuDB, reported to offer 6–14× the write speed of InnoDB, is recommended as the storage engine for the archive.
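Hot‑cold separation can be sketched as a periodic job that moves rows older than a cutoff out of the live table. The 365‑day retention window, the created_on field, and the function name below are assumptions; a real job would copy the cold rows into the archive database and then delete them from the live table in batches.

```python
from datetime import date, timedelta


def split_hot_cold(rows, today: date, keep_days: int = 365):
    """Split rows into (hot, cold) by age.

    Rows created on or after the cutoff stay hot; older rows are cold
    and are candidates for the archive.
    """
    cutoff = today - timedelta(days=keep_days)
    hot, cold = [], []
    for row in rows:
        (hot if row["created_on"] >= cutoff else cold).append(row)
    return hot, cold


orders = [
    {"id": 1, "created_on": date(2022, 6, 1)},
    {"id": 2, "created_on": date(2023, 12, 1)},
]
hot, cold = split_hot_cold(orders, today=date(2024, 1, 1))
print([r["id"] for r in hot])   # [2]
print([r["id"] for r in cold])  # [1]
```

Because the live tables stay small, the point at which another shard becomes necessary is pushed further out.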
4. Why Perform Horizontal Partitioning Before Vertical Partitioning?
When data volume grows, the first step is to partition large tables across multiple disks on the same node, which increases I/O and storage capacity at low cost.
Later, as data continues to expand, horizontal sharding can be combined with middleware (e.g., MyCat) and read/write separation, while business systems adopt load balancing and distributed architectures.
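Read/write separation can be sketched as a router that sends writes to a primary node and rotates reads across replicas. The class ReadWriteRouter and the node names are hypothetical, and classifying a statement by its first keyword is a deliberate simplification; middleware parses the SQL properly.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary: str, replicas: list):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def node_for(self, sql: str) -> str:
        # Simplification: only the first keyword is inspected. Statements
        # such as SELECT ... FOR UPDATE would need to go to the primary.
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica1", "replica2"])
print(router.node_for("SELECT * FROM user_2 WHERE id = 10"))  # replica1
print(router.node_for("UPDATE user_2 SET name = 'a' WHERE id = 10"))  # primary
```

Combined with a sharding rule, the table name selects the shard and the statement type selects the node within that shard.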
After horizontal sharding reaches its limits, vertical partitioning is introduced to separate independent business modules into distinct databases, preventing a single system from becoming a bottleneck.
Source: http://www.imooc.com/article/288363