Optimizing MySQL Join Queries: Algorithms, Execution Plans, and Practical Tips
This article explains the characteristics of various MySQL join algorithms—including Simple Nested‑Loop, Index Nested‑Loop, Block Nested‑Loop, Hash Join, and Batched Key Access—demonstrates how to examine execution plans, configure optimizer settings, and apply best‑practice optimizations such as indexing, choosing the small driver table, and upgrading MySQL versions.
This article introduces the optimization of MySQL join queries, beginning with a summary of join algorithm characteristics. The main algorithms covered are Simple Nested‑Loop Join, Index Nested‑Loop Join, Block Nested‑Loop Join, Hash Join, and Batched Key Access (BKA).
1. Join Algorithms Overview
Simple Nested‑Loop Join scans each row of the driver table and performs a full table scan on the inner table for matching rows, resulting in O(m·n) row scans when the driver table has m rows and the inner table has n rows.
Index Nested‑Loop Join uses an index on the inner table's join column to locate matching rows for each driver row, greatly reducing scan cost.
Block Nested‑Loop Join reads the driver table into a join buffer and then compares each inner‑table row against this buffer, which is used when the inner join column lacks an index (pre‑MySQL 8.0.20).
Hash Join loads the driver table into memory, builds a hash table on the join column, and probes it with each inner‑table row, offering lower I/O than Block Nested‑Loop.
Batched Key Access (BKA) combines the strengths of NLJ and BNL: it batches join column values from the driver table, sends them to the Multi‑Range Read (MRR) interface, which sorts the primary‑key IDs and reads rows sequentially, reducing random I/O.
2. Example Setup
use martin;
DROP TABLE IF EXISTS t1;
CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`a` int DEFAULT NULL,
`b` int DEFAULT NULL,
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'record creation time',
`update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'record update time',
PRIMARY KEY (`id`),
KEY `idx_a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
DROP PROCEDURE IF EXISTS insert_t1;
DELIMITER ;;
CREATE PROCEDURE insert_t1()
BEGIN
DECLARE i int;
SET i = 1;
WHILE i <= 10000 DO
INSERT INTO t1(a,b) VALUES(i,i);
SET i = i + 1;
END WHILE;
END;;
DELIMITER ;
CALL insert_t1();
DROP TABLE IF EXISTS t2;
CREATE TABLE t2 LIKE t1;
INSERT INTO t2 SELECT * FROM t1 LIMIT 100;Two tables (t1 and t2) are created with indexed column a and non‑indexed column b for demonstration.
3. Execution Plan Inspection
Running EXPLAIN SELECT * FROM t1 INNER JOIN t2 ON t1.a = t2.a; shows that MySQL chooses the inner table (t2) as the driver because it is smaller, and the plan uses NLJ (no "Using join buffer" in the Extra column).
When joining on non‑indexed column b , the plan shows Using join buffer (Block Nested Loop) on MySQL 5.7, while MySQL 8.0.25 shows Using join buffer (hash join) , indicating the switch to Hash Join after version 8.0.20.
4. Enabling BKA
SET optimizer_switch='mrr=on,mrr_cost_based=off,batched_key_access=on';After enabling these switches, the execution plan displays Using join buffer (Batched Key Access) , confirming that BKA is active.
5. Practical Optimization Recommendations
Add indexes on join columns to allow the optimizer to use Index Nested‑Loop Join or BKA instead of Block Nested‑Loop or Hash Join.
Prefer the smaller table as the driver (or use STRAIGHT_JOIN to force it) to reduce the number of scanned rows.
Upgrade to MySQL 8.0.20 or later so that Hash Join replaces Block Nested‑Loop Join, reducing I/O.
For large datasets, enable BKA (via optimizer_switch ) to batch key lookups and benefit from MRR.
Examples comparing execution time show that joining on indexed column a (Index Nested‑Loop) takes ~0.01 s, while joining on non‑indexed column b (Block Nested‑Loop) takes ~0.08 s.
Choosing the small table as the driver further reduces scanned rows, as demonstrated with SELECT * FROM t2 STRAIGHT_JOIN t1 ON t2.a = t1.a; versus the opposite order.
In summary, to optimize MySQL join queries: add appropriate indexes, let the optimizer pick the smallest driver table, enable BKA for large joins, and consider upgrading to newer MySQL versions for better join algorithms.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.