Backend Development 7 min read

Lessons Learned from Misusing INSERT INTO SELECT for MySQL Data Migration

A real‑world case study shows how using INSERT INTO SELECT without proper indexing caused full‑table scans, lock contention, and data loss, ultimately leading to a costly incident and a colleague's dismissal.

Selected Java Interview Questions

Nov 13, 2023

Lessons Learned from Misusing INSERT INTO SELECT for MySQL Data Migration

A blood‑curdling lesson: careless use of INSERT INTO SELECT caused a colleague to be fired after the company lost nearly 100,000 RMB.

1. Origin of the Issue

The company processes a large volume of transactions on a single MySQL table, with daily increments of about one million rows and no sharding, so data migration was needed to maintain performance.

A colleague proposed two solutions:

Fetch data via a program, insert into a history table, then delete the original rows.

Let the database handle everything with INSERT INTO SELECT.

The first approach caused an out‑of‑memory (OOM) error when loading all data at once; batch processing would generate excessive I/O and take too long, so the second approach was chosen, tested successfully, and deployed—until the colleague was dismissed.

2. What Actually Happened?

First solution – pseudo code

// 1. Query the data to be migrated
List<Object> list = selectData();

// 2. Insert data into the history table
insertData(list);

// 3. Delete the original rows
deleteByIds(ids);

The OOM occurred because the entire dataset was loaded into memory at once.

Second solution – using INSERT INTO SELECT with a date filter:

```sql INSERT INTO target_table SELECT * FROM source_table WHERE dateTime < (NOW() - INTERVAL 10 DAY); ```

The task was scheduled at 8 PM, assuming low load. The test environment with 10 k rows passed, but after deployment the nightly job caused payment‑record insert failures, leading to missing data and financial mismatches.

Investigation showed that the migration task was the root cause; stopping the task eliminated the problem.

3. Post‑mortem

Where was the problem?

The INSERT INTO SELECT statement performed a full‑table scan, as shown by its EXPLAIN plan, resulting in long execution time and extensive locking.

When the condition was changed to use an indexed column, the full scan disappeared and the issue was resolved.

Conclusion: The full‑table scan caused the incident.

Root Cause

Under the default transaction isolation level, INSERT INTO a SELECT b locks table a entirely while locking rows of table b one by one. This leads to lock contention on the payment‑flow table, causing timeouts or failures for some rows.

Why did testing not reveal the bug?

The test environment used real‑world data but did not simulate the massive concurrent inserts that occur during the actual migration, so the problem only manifested in production.

4. Solution

Avoid full‑table scans by adding appropriate indexes on the columns used in the WHERE clause so that the SELECT part can use an index.

5. Can INSERT INTO SELECT still be used?

Yes, it can be used safely if proper indexing and locking considerations are applied.

6. Summary

When using INSERT INTO SELECT, be cautious and ensure that the query is indexed to prevent full‑table scans and lock contention.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Data Migration Performance indexing INSERT INTO SELECT

Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.