Database Sharding: Vertical vs Horizontal Partitioning, Hash Modulo and Range Schemes

This article explains vertical and horizontal database sharding, compares hash‑modulo and range‑based partitioning schemes, and discusses their advantages, drawbacks, and the impact on scaling and hotspot problems in large‑scale applications.

Architecture Digest

Preface

In medium‑to‑large projects, when data volume becomes large, developers usually split the data. There are two main approaches: vertical splitting and horizontal splitting.

Vertical Splitting

Vertical splitting is straightforward: a single database is divided into multiple databases from a business perspective, such as separating an order database and a user database.

Horizontal Splitting

Horizontal splitting divides the rows of a single business data set across multiple tables (or databases) that share the same structure, once the volume of that data set grows too large for one table.

For example, suppose the order table reaches 40 million rows. A common rule of thumb (not a hard MySQL limit) is to keep a single table at or below roughly 10 million rows; the exact threshold depends on row size, indexes, and hardware. Without splitting, performance degrades, so the data can be divided into four tables (or more), possibly combined with database‑level splitting.

Sharding Schemes

Common sharding schemes include hash‑modulo and range‑based partitioning. The core of any scheme is a routing algorithm that maps a routing key to a specific shard.

Hash‑Modulo Scheme

Assume we expect 40 million orders and each table can hold 10 million rows. We can then design four tables.

The routing works by taking the routing key (e.g., id) modulo the total number of tables. For example, id=12 modulo 4 yields 0, so the order goes to table 0; id=13 modulo 4 yields 1, so it goes to table 1. The modulo base is 4 because there are four tables.
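The routing rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a production router: the function names and the `order_N` table naming convention are assumptions made for the example, and the routing key is assumed to be a non-negative integer.

```python
def route_hash_modulo(order_id: int, table_count: int = 4) -> int:
    """Return the index of the table that should hold order_id."""
    if table_count <= 0:
        raise ValueError("table_count must be positive")
    return order_id % table_count


def table_name(order_id: int, table_count: int = 4) -> str:
    # Hypothetical naming convention: order_0, order_1, ...
    return f"order_{route_hash_modulo(order_id, table_count)}"


print(table_name(12))  # order_0
print(table_name(13))  # order_1
```

If the routing key is not numeric, it would first need to be turned into an integer with a hash that is stable across processes; Python's built-in `hash()` for strings is salted per process by default, so it is not suitable for this.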

Advantages

Orders are evenly distributed across the four tables, avoiding hotspot issues.

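Hotspot definition: A hotspot occurs when operations concentrate on a single table while other tables see little activity. Order IDs typically increase over time, so if consecutive IDs were stored together, the most recent (and most active) orders would all land in the same table and create a pressure imbalance. Taking the ID modulo the table count spreads consecutive IDs across all tables instead.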
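To make the even distribution concrete, the sketch below counts where the 1,000 most recent orders would be written under modulo routing. It assumes IDs are consecutive integers and that the newest ID is just below 40 million; both are illustrative assumptions, not details from a real system.

```python
from collections import Counter

TABLE_COUNT = 4


def route_hash_modulo(order_id: int, table_count: int = TABLE_COUNT) -> int:
    return order_id % table_count


# The 1,000 most recent orders, assuming consecutive integer IDs.
recent_ids = range(39_999_000, 40_000_000)

writes_per_table = Counter(route_hash_modulo(i) for i in recent_ids)
print(sorted(writes_per_table.items()))
# [(0, 250), (1, 250), (2, 250), (3, 250)]
```

Each table receives the same share of recent writes, which is why this scheme does not develop a hotspot for sequential IDs.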

Disadvantages

Data migration and scaling become difficult. If the business grows beyond 40 million rows and we need to add more tables, the modulo base changes, causing existing rows to map to different tables and making data inaccessible without migration.

For example, adding four more tables changes the modulo base to 8; an order that previously mapped to table 0 (e.g., id=12) would now map to table 4, so a lookup under the new layout misses the record, which still sits in table 0.
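The scale of the problem can be estimated with a short sketch that compares the old and new routes for a sample of IDs. The helper names and the sample range are assumptions chosen for illustration.

```python
def route(order_id: int, table_count: int) -> int:
    return order_id % table_count


def moved_fraction(ids, old_count: int, new_count: int) -> float:
    """Fraction of ids whose target table changes when the table count changes."""
    ids = list(ids)
    moved = sum(1 for i in ids if route(i, old_count) != route(i, new_count))
    return moved / len(ids)


print(route(12, 4), route(12, 8))               # 0 4
print(moved_fraction(range(1, 100_001), 4, 8))  # 0.5
```

When the table count doubles from 4 to 8, half of the sampled rows must move. Growing by a non-multiple (for example from 4 to 5 tables) relocates an even larger share, which is one reason modulo-based layouts are often grown by doubling.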

To accommodate the new tables, a full data migration is required, which can be painful for large companies that cannot afford downtime.
Migration tools can be built to automate the process, but each scaling event still incurs significant operational effort.

Is there a scheme that avoids migration? See below.

Range Scheme

The range scheme splits data by predefined ID ranges.

For example, if each table covers a range of 10 million IDs, id=12 goes to table 0 and id=13,000,000 goes to table 1. The ranges are defined in advance, and routing compares the ID against them.
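A range router can be sketched with a sorted list of range boundaries and a binary search. The boundaries below assume four tables of 10 million IDs each, with table 0 covering IDs from 0 up to (but not including) 10,000,000; the names `UPPER_BOUNDS` and `route_range` are made up for this example.

```python
from bisect import bisect_right

ROWS_PER_TABLE = 10_000_000

# Exclusive upper bound of each table's range:
# table 0 holds [0, 10M), table 1 holds [10M, 20M), and so on.
UPPER_BOUNDS = [ROWS_PER_TABLE * n for n in range(1, 5)]


def route_range(order_id: int, upper_bounds=UPPER_BOUNDS) -> int:
    """Return the index of the table whose range contains order_id."""
    if order_id < 0:
        raise ValueError("order_id must be non-negative")
    index = bisect_right(upper_bounds, order_id)
    if index == len(upper_bounds):
        raise LookupError(f"no table configured for id {order_id}")
    return index


print(route_range(12))          # 0
print(route_range(13_000_000))  # 1
```

Raising an error for IDs past the last boundary is a design choice in this sketch: it makes a missing range visible instead of silently writing to the wrong table.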

Advantages

Future scaling does not require data migration; newly added tables handle IDs beyond the previous maximum, while existing ranges stay unchanged.
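The sketch below illustrates why no migration is needed: appending a new range leaves the route of every existing ID unchanged. The boundaries (10 million IDs per table) and the sample IDs are assumptions for the example.

```python
from bisect import bisect_right


def route_range(order_id: int, upper_bounds) -> int:
    index = bisect_right(upper_bounds, order_id)
    if index == len(upper_bounds):
        raise LookupError(f"no table configured for id {order_id}")
    return index


old_bounds = [10_000_000, 20_000_000, 30_000_000, 40_000_000]
new_bounds = old_bounds + [50_000_000]  # scaling: append table 4

existing_ids = [12, 13_000_000, 39_999_999]
unchanged = all(
    route_range(i, old_bounds) == route_range(i, new_bounds) for i in existing_ids
)
print(unchanged)                            # True
print(route_range(40_000_001, new_bounds))  # 4
```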

Disadvantages

Hotspots can still appear because IDs increase monotonically, causing recent orders to concentrate in the highest‑range table.
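The same kind of count used to reason about write distribution shows the hotspot directly. This sketch assumes four ranges of 10 million IDs each and treats the 1,000 most recent orders as consecutive IDs just below 40 million.

```python
from bisect import bisect_right
from collections import Counter

# Exclusive upper bounds: table 0 holds [0, 10M), table 1 holds [10M, 20M), ...
UPPER_BOUNDS = [10_000_000, 20_000_000, 30_000_000, 40_000_000]

# The 1,000 most recent orders, assuming consecutive integer IDs.
recent_ids = range(39_999_000, 40_000_000)

writes_per_table = Counter(bisect_right(UPPER_BOUNDS, i) for i in recent_ids)
print(dict(writes_per_table))  # {3: 1000}
```

All recent writes hit table 3 while tables 0 to 2 receive none, which is the pressure imbalance described above.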

Summary

Hash‑Modulo Scheme: No hotspot issue, but scaling requires painful data migration.

Range Scheme: No migration needed for scaling, but it may create hotspot problems.
