Databases 9 min read

Optimizing MySQL LIMIT Offset Queries with Subqueries to Reduce Table Scans

This article demonstrates how to dramatically improve MySQL LIMIT offset query performance by moving the filter into a subquery that selects only primary keys, then joining to retrieve full rows, and validates the approach with buffer pool statistics and execution time measurements.

Architect

Dec 10, 2020

Optimizing MySQL LIMIT Offset Queries with Subqueries to Reduce Table Scans

There is a financial transaction table containing 9,555,695 rows that originally used a simple LIMIT pagination query. The unoptimized query took 16.938 seconds (execution: 16.831 s, fetching: 107 ms). After applying the optimization described below, the same query completed in 347 ms (execution: 163 ms, fetching: 184 ms).

Operation: Move the filtering condition into a subquery that only selects the primary‑key IDs, then join the subquery result back to the main table to fetch the remaining columns.

Principle: Reduce the number of table‑row lookups (back‑to‑table operations).

-- Optimized before SQL
SELECT  various_fields
FROM `table_name`
WHERE  various_conditions
LIMIT 0,10;

-- Optimized after SQL
SELECT  various_fields
FROM `table_name` main_table
RIGHT JOIN (
    SELECT  primary_key_only
    FROM `table_name`
    WHERE  various_conditions
    LIMIT 0,10
) temp_table ON temp_table.primary_key = main_table.primary_key;

1. Introduction

MySQL version used: 5.7.17.

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.17    |
+-----------+
1 row in set (0.00 sec)

Table structure:

mysql> desc test;
+--------+---------------------+------+-----+---------+----------------+
| Field  | Type                | Null | Key | Default | Extra          |
+--------+---------------------+------+-----+---------+----------------+
| id     | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| val    | int(10) unsigned    | NO   | MUL | 0       |                |
| source | int(10) unsigned    | NO   |     | 0       |                |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

The table holds about 5.24 million rows. When using LIMIT offset, count with a large offset, performance degrades because MySQL must scan many rows before returning the requested subset.

mysql> select * from test where val=4 limit 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 | 4   | 4      |
| 3327632 | 4   | 4      |
| 3327642 | 4   | 4      |
| 3327652 | 4   | 4      |
| 3327662 | 4   | 4      |
+---------+-----+--------+
5 rows in set (15.98 sec)

Rewriting the query as an inner join with a subquery that first selects the primary keys dramatically reduces the execution time:

mysql> select * from test a inner join (select id from test where val=4 limit 300000,5) b on a.id=b.id;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 | 4   | 4      | 3327622 |
| 3327632 | 4   | 4      | 3327632 |
| 3327642 | 4   | 4      | 3327642 |
| 3327652 | 4   | 4      | 3327652 |
| 3327662 | 4   | 4      | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)

The reason for the speedup is that the original query forces MySQL to read 300,005 index leaf nodes and then 300,005 clustered‑index rows, discarding the first 300,000 rows. The rewritten query only reads the 5 needed leaf nodes and then fetches the corresponding 5 clustered rows, resulting in far fewer random I/O operations.

To verify this behavior, the buffer‑pool page statistics were examined before and after each query. The first query loaded 4,098 data pages and 208 index pages into the buffer pool, while the second query loaded only 5 data pages and 390 index pages.

-- Buffer pool stats after first query
SELECT index_name, COUNT(*)
FROM information_schema.INNODB_BUFFER_PAGE
WHERE INDEX_NAME IN ('val','primary') AND TABLE_NAME LIKE '%test%'
GROUP BY index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    | 4098     |
| val        | 208      |
+------------+----------+
2 rows in set (0.04 sec)

-- Buffer pool stats after optimized query
SELECT index_name, COUNT(*)
FROM information_schema.INNODB_BUFFER_PAGE
WHERE INDEX_NAME IN ('val','primary') AND TABLE_NAME LIKE '%test%'
GROUP BY index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    | 5        |
| val        | 390      |
+------------+----------+
2 rows in set (0.03 sec)

This demonstrates that the optimized query not only runs faster but also reduces buffer‑pool pollution by loading far fewer pages.

To ensure a clean buffer pool on each MySQL restart, the following options should be disabled:

innodb_buffer_pool_dump_at_shutdown
innodb_buffer_pool_load_at_startup

References

1. https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/ 2. https://dev.mysql.com/doc/refman/5.7/en/innodb-information-schema-buffer-pool-tables.html

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

query optimization MySQL buffer pool subquery limit offset

Written by

Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.