When Does MySQL Use Internal Temporary Tables? Analysis of UNION, GROUP BY and Other Scenarios
This article explains the circumstances under which MySQL creates internal temporary tables, illustrating the behavior with UNION, UNION ALL, GROUP BY, various hints, and additional query patterns, and provides practical examples and execution plans to help developers understand and optimise their queries.
Temporary tables are used by MySQL to store intermediate results that need to be cleared when the session ends. MySQL has two kinds of temporary tables: external temporary tables created explicitly with CREATE TEMPORARY TABLE , and internal temporary tables that the optimizer creates automatically.
The article investigates the question: When does MySQL use internal temporary tables? It analyses common scenarios such as UNION and GROUP BY , and lists many other cases where internal temporary tables appear.
1 UNION Scenario
First a test table employees is created and populated with 5,000 rows using a Python script that inserts random data via pymysql . The script is shown below:
CREATE TABLE `employees` (
`id` int NOT NULL AUTO_INCREMENT,
`first_name` varchar(100) COLLATE utf8mb4_bin DEFAULT NULL,
`last_name` varchar(100) COLLATE utf8mb4_bin DEFAULT NULL,
`sex` enum('M','F') COLLATE utf8mb4_bin DEFAULT NULL,
`age` int DEFAULT NULL,
`birth_date` date DEFAULT NULL,
`hire_date` date DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `last_name` (`last_name`),
KEY `hire_date` (`hire_date`)
) ENGINE=InnoDB AUTO_INCREMENT=500002 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
# Python script (simplified)
import random, pymysql
from faker import Faker
from datetime import datetime, timedelta
fake = Faker()
db_params = {
'host': 'localhost',
'user': 'root',
'password': 'root',
'db': 'db1',
'port': 3311
}
connection = pymysql.connect(**db_params)
cursor = connection.cursor()
for i in range(5000):
id = i + 1
first_name = fake.first_name()
last_name = fake.last_name()
sex = random.choice(['M', 'F'])
age = random.randint(20, 60)
birth_date = fake.date_between(start_date='-60y', end_date='-20y')
hire_date = fake.date_between(start_date='-30y', end_date='today')
query = f"""INSERT INTO employees (id, first_name, last_name, sex, age, birth_date, hire_date)
VALUES ('{id}', '{first_name}', '{last_name}', '{sex}', {age}, '{birth_date}', '{hire_date}');"""
cursor.execute(query)
if (i + 1) % 1000 == 0:
connection.commit()
connection.commit()
cursor.close()
connection.close()After the data is ready, the following UNION query is explained:
explain (select 5000 as res from dual) union (select id from employees order by id desc limit 2);The execution plan shows that the second SELECT uses the primary key index, and the UNION result uses a temporary table (Extra = "Using temporary"). UNION creates an in‑memory internal temporary table to eliminate duplicates.
When UNION ALL is used, the optimizer does not need a temporary table because duplicate elimination is not required. The plan for UNION ALL shows no "Using temporary" entry.
2 GROUP BY Scenario
The article then examines a GROUP BY query that groups rows by id % 5 and counts the rows in each group:
explain select id%5 as complement, count(*) from employees group by complement order by 1;The plan contains Using index; Using temporary; Using filesort , indicating that MySQL creates an internal temporary table to store the grouping columns and then sorts the result.
A note explains that MySQL 5.7 automatically orders by the GROUP BY column, while MySQL 8.0 does not, so an explicit ORDER BY is added to reproduce the ordering.
During GROUP BY (or DISTINCT) processing, MySQL can be instructed to avoid the in‑memory temporary table with hints such as SQL_BIG_RESULT (forces a disk‑based temporary table) or SQL_SMALL_RESULT (forces an in‑memory temporary table). The article provides a table of these hints:
hint
description
SQL_BIG_RESULT
Explicitly use a disk‑based internal temporary table; suitable for large data sets; works with InnoDB and Memory engines.
SQL_SMALL_RESULT
Explicitly use an in‑memory internal temporary table; faster for small data sets; works with Memory engine.
An example using SQL_BIG_RESULT shows that the Using Temporary flag disappears from the Extra column, confirming that the hint avoided the in‑memory temporary table.
3 Other Scenarios
Beyond UNION and GROUP BY, MySQL may create internal temporary tables in many other situations, including:
Evaluating UNION statements (with some exceptions).
Evaluating certain views that use the TEMPTABLE algorithm, UNION, or aggregation.
Evaluating derived tables.
Evaluating common table expressions.
Materialising tables for subqueries or semi‑joins.
Evaluating statements with ORDER BY and different GROUP BY clauses, especially when they reference columns from tables other than the first in the join order.
Combining DISTINCT with ORDER BY.
Queries using the SQL_SMALL_RESULT modifier (unless they need disk storage).
INSERT … SELECT statements that select from the same table they insert into.
Multi‑table UPDATE statements.
Evaluating GROUP_CONCAT() or COUNT(DISTINCT) expressions.
Evaluating window functions when a temporary table is required.
Some conditions force MySQL to use a disk‑based temporary table instead of an in‑memory one, such as tables containing BLOB/TEXT columns, very long string columns in UNION/UNION ALL, or SHOW COLUMNS/DESCRIBE results that involve BLOB types.
References
[1] Ding Qi, MySQL 45 Lectures , Chapter 37 – When are internal temporary tables used?
[2] MySQL 8.0 Reference Manual – Internal Temporary Table Use: https://dev.mysql.com/doc/refman/8.0/en/internal-temporary-tables.html
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.