How to Solve MySQL Deep Pagination: Causes and 6 Effective Optimization Strategies
Deep pagination in MySQL becomes a performance bottleneck when tables reach millions of rows: LIMIT queries with large offsets force excessive row scans and costly index lookups. This article explains the root cause and presents six practical optimization techniques, including keyset pagination, delayed joins, covering indexes, data pre-processing, partitioning, and external search engines, to dramatically improve query speed.
In everyday development we often need paginated lists and report statistics. MySQL performs well on small data sets, but once tables reach millions of rows, deep pagination queries become very slow or time out.
1. Causes of Deep Pagination
MySQL (with InnoDB) has a clustered primary key index and secondary indexes. The leaf nodes of the primary key index store the actual row data, while non-leaf nodes store key values and pointers to child pages. The leaf nodes of a secondary index store only the indexed columns plus the primary key value; to retrieve full rows the database must perform a "back-to-table" lookup through the primary key index.
Back‑to‑table query: fetch the IDs from the secondary index leaf nodes, then use the primary index to retrieve the full rows.
Example SQL:
<code>SELECT * FROM `order` WHERE pay_time > '2021-10-01 10:00:00' LIMIT 10000, 10;</code>
Although only 10 rows are needed, MySQL must locate 10,010 rows, discard the first 10,000, and return the last 10, causing heavy I/O and a long execution time. (Note that order is a reserved word in MySQL, so the table name must be backtick-quoted.)
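You can see this cost with EXPLAIN. The sketch below assumes an illustrative schema with a secondary index on pay_time (the table and index names are assumptions, not from the original):

```sql
-- Hypothetical schema for illustration
CREATE TABLE `order` (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  pay_time DATETIME NOT NULL,
  amount DECIMAL(10, 2),
  KEY idx_pay_time (pay_time)
);

-- EXPLAIN reports the estimated rows MySQL must examine;
-- for a deep offset this is roughly offset + page size (here ~10,010)
EXPLAIN SELECT * FROM `order`
WHERE pay_time > '2021-10-01 10:00:00'
LIMIT 10000, 10;
```

The larger the offset, the more rows are examined and discarded, which is exactly the waste the techniques below try to eliminate.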
2. Optimization Techniques
To mitigate deep pagination we aim to reduce back‑to‑table operations and avoid scanning irrelevant rows.
2.1 Keyset Pagination (Tag Method)
Record the last ID from the previous page and query the next page starting from that ID, eliminating the need to scan earlier rows.
<code>SELECT * FROM `order` WHERE id > #{last_id} AND pay_time > '2021-10-01 10:00:00' ORDER BY id LIMIT 10;</code>
Drawbacks: it requires a monotonically increasing key (such as an auto-increment primary key), and random page jumps are not supported; only sequential "next page" navigation works.
2.2 Delayed Join (Late Association)
Let a subquery page over the secondary index alone, returning only primary key values, and then use an inner join to fetch the full rows for just the requested page.
<code>SELECT * FROM `order` AS t INNER JOIN (SELECT id FROM `order` AS w WHERE w.pay_time >= '2021-10-01 10:00:00' ORDER BY w.pay_time LIMIT 10000, 10) AS tmp ON tmp.id = t.id;</code>
The subquery walks the secondary index on pay_time and produces only primary keys, so no back-to-table lookups happen while skipping the first 10,000 rows; only the final 10 rows are fetched from the clustered index. Drawbacks: possible inaccurate results when ORDER BY depends on joined tables, increased query complexity, and the optimizer may choose a sub-optimal execution plan.
2.3 Covering Index
A covering index allows MySQL to satisfy the query directly from the index without accessing the table, reducing I/O.
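As a sketch, assuming the illustrative order table from earlier and a query that only needs id and pay_time (the index name is an assumption):

```sql
-- A composite index on (pay_time, id) covers this query: both the WHERE
-- condition and every selected column can be read from the index alone.
CREATE INDEX idx_pay_time_id ON `order` (pay_time, id);

-- EXPLAIN should show "Using index" in the Extra column,
-- meaning no back-to-table lookups are performed.
SELECT id, pay_time FROM `order`
WHERE pay_time > '2021-10-01 10:00:00'
ORDER BY pay_time
LIMIT 10000, 10;
```

In InnoDB, secondary index entries already include the primary key, so a plain index on pay_time would also cover this particular query; the explicit composite index simply makes the intent obvious.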
2.4 Data Pre‑processing
Pre‑compute query results during data updates and store them in a separate table; queries then read from this table, avoiding deep pagination. This approach is rarely used in practice due to maintenance overhead.
2.5 Partitioning and Sharding
Distribute data across multiple tables or partitions (e.g., by ID range) to limit the amount of data scanned per query.
<code># Create shard tables with the same structure as the original table
CREATE TABLE order1 LIKE `order`;
CREATE TABLE order2 LIKE `order`;

# Route rows into shards by id range
INSERT INTO order1 SELECT * FROM `order` WHERE id <= 5000;
INSERT INTO order2 SELECT * FROM `order` WHERE id > 5000;</code>
2.6 External Search Engines
For full-text search and complex queries, tools like Elasticsearch or Solr provide better pagination performance, though they have deep pagination limits of their own (Elasticsearch, for example, caps from + size offsets by default and offers search_after for paging deep into result sets).
Choosing the appropriate solution depends on the specific business scenario and requirements.
Lobster Programming
Sharing technical insights and discussion; making life better through technology.