Lessons Learned from Misusing INSERT INTO SELECT in MySQL: Avoid Full Table Scans and OOM
This article recounts a costly MySQL data‑migration failure caused by an unchecked INSERT INTO SELECT that triggered full‑table scans, lock contention and out‑of‑memory errors, and explains how proper indexing and awareness of transaction isolation can prevent such incidents.
The author warns that using INSERT INTO SELECT carelessly can lead to severe production problems, recounting a real incident where a colleague’s migration task caused a loss of nearly 100,000 CNY and resulted in termination.
Background
The company processes millions of rows daily in a single MySQL table without sharding, and needed to migrate historical data to keep the table performant.
Two migration strategies were considered:
Programmatically query data, insert into a history table, then delete the original rows.
Let MySQL handle everything with INSERT INTO SELECT .
The first approach OOMed when loading all data at once; after batch processing it caused excessive I/O and latency, so the team chose the second approach, which passed tests and was deployed.
First Approach Code (Pseudo‑code)
// 1. Query data to migrate
List
list = selectData();
// 2. Insert data into history table
insertData(list);
// 3. Delete original rows
deleteByIds(ids);The OOM issue is obvious: the entire dataset is loaded into memory.
Second Approach Details
The migration kept only the last ten days of data (about 10 k rows) by executing a statement similar to:
INSERT INTO target SELECT * FROM source WHERE dateTime < (NOW() - INTERVAL 10 DAY)
The job was scheduled as a nightly task at 20:00. While the test environment handled the operation fine, the production run caused intermittent insert failures after midnight, leading to missing payment records and a financial discrepancy.
Root Cause Analysis
EXPLAIN of the SELECT part showed a full‑table scan, meaning the migration took a long time and held locks on the large table.
Under MySQL’s default isolation level, INSERT INTO a SELECT b locks table a entirely while locking rows of b one by one. This caused lock contention: some rows were locked and could not be inserted, resulting in time‑outs or outright failures.
The test missed the problem because the test data volume and concurrent workload did not reflect the real production scenario.
Solution
Prevent full‑table scans by adding appropriate indexes on the columns used in the WHERE clause so that the SELECT part can use an index lookup.
After adding the index, the migration runs quickly without locking the whole table, and the intermittent failures disappear.
Can INSERT INTO SELECT Still Be Used?
Yes, but only when the query is properly indexed and the impact on locking is understood.
Conclusion
When using INSERT INTO SELECT for large‑scale data migration, always ensure the SELECT statement is indexed to avoid full‑table scans and be aware of the locking behavior under the current transaction isolation level.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.