Why a Single MySQL Table Should Not Exceed 20 Million Rows: Theory and Estimation
The article explains that, although MySQL’s theoretical limits are far higher, practical B+‑tree page and index calculations suggest keeping a single table under about twenty million one‑kilobyte rows to avoid performance degradation, illustrated with interview examples and estimation formulas.
A qualified DBA often advises sharding when a single table exceeds ten million rows, claiming that such size degrades query performance. This article digs into the underlying principles to explain why a single table should generally stay below twenty million rows.
Interview scenario : The author recounts a technical interview where the candidate described a case of splitting tables and databases because daily operation logs exceeded 50 million rows. The interviewer asked why a single table would hurt performance, and the candidate answered that the industry consensus is to keep a MySQL table under 20 million rows.
1. Auto‑increment primary‑key perspective
If the primary key is INT , the maximum value is 2³²‑1 (~2.1 billion). If it is BIGINT , the theoretical limit is 2⁶⁴‑1, which is far beyond disk capacity. A TINYINT maxes at 255, causing an overflow at ID = 256.
2. Data‑page perspective
In InnoDB each table is stored in a .ibd file divided into 16 KB pages. After accounting for page number, pointers, directory, and checksum, roughly 15 KB is usable for actual rows. Assuming a BIGINT primary key (8 bytes) and a 4‑byte page number, each row occupies 12 bytes of index space, allowing about 1280 index entries per page (X ≈ 1280). If a typical row size is 1 KB, each page can hold about 15 rows (Y ≈ 15).
Data pages are linked via B+‑tree structures: leaf pages store rows, non‑leaf pages store pointers, checksums, and a page directory. The height of the B+‑tree (N) determines how many leaf pages exist: X^(N‑1) leaf pages.
3. Estimation
For a 2‑level B+‑tree (N = 2): M ≈ 1280¹ × 15 ≈ 2 × 10⁴ rows.
For a 3‑level tree (N = 3): M ≈ 1280² × 15 ≈ 2.5 × 10⁵ rows.
For a 4‑level tree (N = 4): M ≈ 1280³ × 15 ≈ 3 × 10⁹ rows.
Based on these calculations, the practical recommendation is to keep a single table under twenty million rows (assuming ~1 KB per row). This is a guideline, not a hard rule.
Thought exercise
Question: How many records can a 4‑level B+‑tree store if the primary key is BIGINT and each record averages 1 KB?
Answer: Using M = X^(N‑1) × Y, with X = 1280, N = 4, Y = 15, we get M = 1280³ × 15 ≈ 3 × 10⁹ records.
The article concludes with a reminder that the numbers are based on simplified assumptions (1 KB rows, no fragmentation) and encourages readers to share the knowledge.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.