Optimizing MySQL LIMIT Offset Queries with Subqueries to Reduce Table Scans
This article demonstrates how to dramatically improve MySQL LIMIT offset query performance by moving the filter into a subquery that selects only primary keys, then joining to retrieve full rows, and validates the approach with buffer pool statistics and execution time measurements.
There is a financial transaction table containing 9,555,695 rows that originally used a simple LIMIT pagination query. The unoptimized query took 16.938 seconds (execution: 16.831 s, fetching: 107 ms). After applying the optimization described below, the same query completed in 347 ms (execution: 163 ms, fetching: 184 ms).
Operation: Move the filtering condition into a subquery that only selects the primary‑key IDs, then join the subquery result back to the main table to fetch the remaining columns.
Principle: Reduce the number of table‑row lookups (back‑to‑table operations).
-- Optimized before SQL
SELECT various_fields
FROM `table_name`
WHERE various_conditions
LIMIT 0,10; -- Optimized after SQL
SELECT various_fields
FROM `table_name` main_table
RIGHT JOIN (
SELECT primary_key_only
FROM `table_name`
WHERE various_conditions
LIMIT 0,10
) temp_table ON temp_table.primary_key = main_table.primary_key;1. Introduction
MySQL version used: 5.7.17.
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.17 |
+-----------+
1 row in set (0.00 sec)Table structure:
mysql> desc test;
+--------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| val | int(10) unsigned | NO | MUL | 0 | |
| source | int(10) unsigned | NO | | 0 | |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)The table holds about 5.24 million rows. When using LIMIT offset, count with a large offset, performance degrades because MySQL must scan many rows before returning the requested subset.
mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id | val | source |
+---------+-----+--------+
| 3327622 | 4 | 4 |
| 3327632 | 4 | 4 |
| 3327642 | 4 | 4 |
| 3327652 | 4 | 4 |
| 3327662 | 4 | 4 |
+---------+-----+--------+
5 rows in set (15.98 sec)Rewriting the query as an inner join with a subquery that first selects the primary keys dramatically reduces the execution time:
mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id | val | source | id |
+---------+-----+--------+---------+
| 3327622 | 4 | 4 | 3327622 |
| 3327632 | 4 | 4 | 3327632 |
| 3327642 | 4 | 4 | 3327642 |
| 3327652 | 4 | 4 | 3327652 |
| 3327662 | 4 | 4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)The reason for the speedup is that the original query forces MySQL to read 300,005 index leaf nodes and then 300,005 clustered‑index rows, discarding the first 300,000 rows. The rewritten query only reads the 5 needed leaf nodes and then fetches the corresponding 5 clustered rows, resulting in far fewer random I/O operations.
To verify this behavior, the buffer‑pool page statistics were examined before and after each query. The first query loaded 4,098 data pages and 208 index pages into the buffer pool, while the second query loaded only 5 data pages and 390 index pages.
-- Buffer pool stats after first query
SELECT index_name, COUNT(*)
FROM information_schema.INNODB_BUFFER_PAGE
WHERE INDEX_NAME IN ('val','primary') AND TABLE_NAME LIKE '%test%'
GROUP BY index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY | 4098 |
| val | 208 |
+------------+----------+
2 rows in set (0.04 sec) -- Buffer pool stats after optimized query
SELECT index_name, COUNT(*)
FROM information_schema.INNODB_BUFFER_PAGE
WHERE INDEX_NAME IN ('val','primary') AND TABLE_NAME LIKE '%test%'
GROUP BY index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY | 5 |
| val | 390 |
+------------+----------+
2 rows in set (0.03 sec)This demonstrates that the optimized query not only runs faster but also reduces buffer‑pool pollution by loading far fewer pages.
To ensure a clean buffer pool on each MySQL restart, the following options should be disabled:
innodb_buffer_pool_dump_at_shutdown
innodb_buffer_pool_load_at_startupReferences
1. https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/ 2. https://dev.mysql.com/doc/refman/5.7/en/innodb-information-schema-buffer-pool-tables.html
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.