Performance Testing and Optimization of Large‑Scale MySQL Queries
This article demonstrates how to generate, insert, and query millions of rows in a MySQL 5.7 table, measures pagination performance under varying offsets and result sizes, and presents several optimization techniques—including sub‑queries, indexed look‑ups, and column selection—to dramatically reduce query latency.
Preparation of Test Data
The article begins by creating a table user_operation_log with an auto‑increment primary key and a dozen varchar columns. The full CREATE TABLE statement is provided.
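The statement itself is not reproduced here; a minimal sketch of such a schema, with hypothetical column names chosen for illustration, might look like:

```sql
-- Hypothetical reconstruction: the article's actual column names and
-- counts may differ. An auto-increment primary key plus VARCHAR payload.
CREATE TABLE `user_operation_log` (
  `id`      INT(11) NOT NULL AUTO_INCREMENT,
  `user_id` VARCHAR(20)  DEFAULT NULL,
  `ip`      VARCHAR(20)  DEFAULT NULL,
  `op_data` VARCHAR(255) DEFAULT NULL,
  `attr1`   VARCHAR(255) DEFAULT NULL,
  `attr2`   VARCHAR(255) DEFAULT NULL,
  `attr3`   VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```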
Data Generation Script
A stored procedure batch_insert_log() is defined to insert ten million rows using batch inserts, committing every 1,000 rows to improve speed. The complete procedure code is shown inside a pre block.
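A sketch of what such a procedure could look like, assuming the hypothetical columns above (the article's actual body may differ):

```sql
-- Hypothetical sketch of batch_insert_log(): disable autocommit and
-- commit every 1,000 rows so each INSERT doesn't pay a full fsync.
DELIMITER ;;
CREATE PROCEDURE batch_insert_log()
BEGIN
  DECLARE i INT DEFAULT 1;
  SET autocommit = 0;
  WHILE i <= 10000000 DO
    INSERT INTO `user_operation_log` (`user_id`, `ip`, `op_data`)
    VALUES (CONCAT('user_', i), '127.0.0.1', CONCAT('op_', i));
    IF i % 1000 = 0 THEN
      COMMIT;   -- batch commit every 1,000 rows
    END IF;
    SET i = i + 1;
  END WHILE;
  COMMIT;       -- flush the final partial batch
END;;
DELIMITER ;
```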
Test Execution
On a low‑end Windows 11 machine (i7 CPU, SSD with ~800 MB/s throughput), 3,148,000 rows are inserted (short of the procedure's ten‑million‑row target), consuming about 5 GB of disk space and taking 38 minutes. A simple `SELECT count(1) FROM user_operation_log` confirms the row count.
Normal Pagination Queries
Using MySQL's LIMIT clause, the author runs three queries retrieving 10 rows from offset 10,000 and records execution times of 59 ms, 49 ms, and 50 ms, showing acceptable latency on a local database.
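These queries take the familiar `LIMIT offset, count` form; a representative example:

```sql
-- Fetch 10 rows starting at offset 10,000.
SELECT * FROM `user_operation_log` LIMIT 10000, 10;
```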
Same Offset, Different Result Sizes
Queries with the same offset (10,000) but increasing row limits (10, 100, 1,000, 10,000, 100,000, 1,000,000) are executed. The following table summarises the three runs for each size:
| Rows | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| 10 | 53 ms | 52 ms | 47 ms |
| 100 | 50 ms | 60 ms | 55 ms |
| 1,000 | 61 ms | 74 ms | 60 ms |
| 10,000 | 164 ms | 180 ms | 217 ms |
| 100,000 | 1,609 ms | 1,741 ms | 1,764 ms |
| 1,000,000 | 16,219 ms | 16,889 ms | 17,081 ms |
Result: beyond a few thousand rows, execution time grows roughly linearly with the size of the result set.
Same Data Size, Different Offsets
Queries retrieving 100 rows from offsets 100, 1,000, 10,000, 100,000, and 1,000,000 are executed. Execution times are shown below:
| Offset | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| 100 | 36 ms | 40 ms | 36 ms |
| 1,000 | 31 ms | 38 ms | 32 ms |
| 10,000 | 53 ms | 48 ms | 51 ms |
| 100,000 | 622 ms | 576 ms | 627 ms |
| 1,000,000 | 4,891 ms | 5,076 ms | 4,856 ms |
Result: latency grows with the offset, because `LIMIT offset, count` forces MySQL to read and discard all of the skipped rows before returning any.
How to Optimize
Optimizing Large Offsets
Instead of scanning and discarding all the skipped rows, first locate the id at the desired offset, then fetch rows starting from that id. Example queries are provided, and timings show that this indexed look‑up on id reduces the third query from ~4 s to ~0.2 s.
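A sketch of this pattern (the offset of 1,000,000 is chosen for illustration):

```sql
-- The sub-query walks only the primary-key index to find the boundary id,
-- then the outer query reads just the 100 rows that are actually needed.
SELECT * FROM `user_operation_log`
WHERE id >= (SELECT id FROM `user_operation_log` ORDER BY id LIMIT 1000000, 1)
ORDER BY id
LIMIT 100;
```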
For tables where id is not strictly incremental, a nested SELECT ... IN (SELECT ... LIMIT ...) pattern is suggested.
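MySQL does not accept `LIMIT` directly inside an `IN (...)` sub-query, so in practice the pattern is written with a derived table; a sketch:

```sql
-- Wrapping the LIMIT sub-query in a derived table (aliased t) works around
-- MySQL's "LIMIT & IN/ALL/ANY/SOME subquery" restriction.
SELECT * FROM `user_operation_log`
WHERE id IN (
  SELECT t.id FROM (
    SELECT id FROM `user_operation_log` ORDER BY id LIMIT 1000000, 100
  ) AS t
);
```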
Another approach is to use a range condition:
```sql
SELECT * FROM `user_operation_log` WHERE id BETWEEN 1000000 AND 1000100 LIMIT 100;
```

Timing results (22 ms and 21 ms) demonstrate the speed of range scans.
Optimizing Large Result Sets
Fetching only the required columns instead of `*` dramatically reduces execution time. Three variants are compared: `SELECT *`, selecting only id, and explicitly listing the needed columns; `SELECT *` takes ~15 s, while the other two finish within 7–8 s.
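The three variants might look like the following (column names here are illustrative, not the article's):

```sql
-- Full row payload: every column is read and shipped to the client.
SELECT * FROM `user_operation_log` LIMIT 1000000;

-- Index-only: satisfied entirely from the primary key.
SELECT id FROM `user_operation_log` LIMIT 1000000;

-- Explicit column list: only the data the application actually needs.
SELECT id, user_id, ip FROM `user_operation_log` LIMIT 1000000;
```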
The article also explains why SELECT * is discouraged: it forces the server to resolve many objects and increases network payload, especially when the client is remote.
Conclusion
The experiments confirm two key observations: (1) query time grows with both the amount of data scanned and the offset used; (2) using indexed sub‑queries, range conditions, or limiting selected columns can mitigate these performance penalties.
Laravel Tech Community
Specializing in Laravel development, we continuously publish fresh content and grow alongside the elegant, stable Laravel framework.