Performance Comparison of Different Batch Insert Strategies in MySQL
This article evaluates single‑threaded bulk‑insert performance in MySQL by testing various batch sizes, analyzes memory usage and transaction merging, and provides practical recommendations for efficiently generating millions of records.
Background
In many projects we need to insert large amounts of data, for example when generating test data, performing simple data migrations, or building statistical reports. Inserting or updating such volumes naively, one row at a time, can take far too long to meet expectations.
This article explores how to quickly generate millions of rows by optimizing single-threaded batch inserts to achieve the greatest possible speedup.
Technology Stack
Java 1.8
MyBatis
MySQL 5.7
Preparation
A simple user table is created for testing:
create table zcy_user (
    id          int auto_increment primary key,
    user_name   varchar(32)  null,
    sex         int          null,
    email       varchar(50)  null,
    address     varchar(100) null,
    create_time timestamp    null
);

The MyBatis XML mapper is extended with a batch-insert statement:
<insert id="insertBatch">
    insert into zcy_user(`user_name`, `sex`, `email`, `address`) values
    <foreach collection="list" item="emp" separator=",">
        (#{emp.userName}, #{emp.sex}, #{emp.email}, #{emp.address})
    </foreach>
</insert>

Test Timing (Single-Threaded)
Varying only the batch size and holding every other factor constant, each test inserts 1,000,000 rows in a single thread and records the elapsed time.
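The measurement loop can be sketched as follows. The `sink` callback stands in for the MyBatis `insertBatch` mapper call; the method names and timing details here are illustrative assumptions, not the original benchmark code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchInsertDriver {
    // Split the row list into fixed-size batches and hand each batch to a sink;
    // in the real test the sink would be mapper.insertBatch(batch).
    public static <T> int insertInBatches(List<T> rows, int batchSize, Consumer<List<T>> sink) {
        int batches = 0;
        for (int i = 0; i < rows.size(); i += batchSize) {
            sink.accept(rows.subList(i, Math.min(i + batchSize, rows.size())));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) rows.add(i);

        long start = System.currentTimeMillis();
        int batches = insertInBatches(rows, 10_000, batch -> { /* mapper.insertBatch(batch) */ });
        System.out.println(batches + " batches in " + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Only the batch size changes between runs, which is what makes the timings below comparable.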
Insert 1 row per batch
Inserting one row at a time, 10,000 rows take 311.65 seconds, so 1,000,000 rows are extrapolated to roughly 31,165 seconds.
Insert 100 rows per batch
Inserting 1,000,000 rows in batches of 100 takes 373.58 seconds.
Insert 1,000 rows per batch
In batches of 1,000, the same 1,000,000 rows take 62.08 seconds.
Insert 10,000 rows per batch
Batches of 10,000 take 31.83 seconds.
Insert 50,000 rows per batch
Batches of 50,000 take 29.30 seconds.
Insert 100 000 rows per batch
The operation fails because the statement exceeds MySQL's max_allowed_packet limit (5,900,064 bytes > 4,194,304 bytes, the 4 MB default).
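The failure is simple arithmetic against max_allowed_packet (default 4 MB = 4,194,304 bytes). The per-row figure below (~59 bytes per VALUES tuple) is an estimate back-derived from the 5,900,064-byte error for 100,000 rows, not a measured constant:

```java
public class PacketSizeCheck {
    // MySQL 5.7's default max_allowed_packet is 4 MB.
    static final long MAX_ALLOWED_PACKET = 4_194_304L;
    // Approximate bytes per VALUES tuple for this table, back-derived from
    // the error message (5,900,064 bytes / 100,000 rows) -- an estimate.
    static final long BYTES_PER_ROW = 59L;

    public static boolean fits(int batchSize) {
        return batchSize * BYTES_PER_ROW <= MAX_ALLOWED_PACKET;
    }

    public static void main(String[] args) {
        for (int batch : new int[]{10_000, 50_000, 100_000}) {
            System.out.println(batch + " rows -> " + (batch * BYTES_PER_ROW)
                    + " bytes, fits: " + fits(batch));
        }
    }
}
```

This explains why 50,000-row batches still succeed (about 2.95 MB per statement) while 100,000-row batches do not.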
Comparison Table

| Batch Size | Time (s) | Memory     | Speed  | Speed-up Ratio |
|-----------:|---------:|------------|--------|----------------|
|          1 |    31165 | Idle       | Slow   | /              |
|        100 |      373 | Good       | Slow   | 83×            |
|       1000 |       62 | Good       | Medium | 502×           |
|      10000 |       31 | Medium     | Fast   | 1005×          |
|      50000 |       29 | Large      | Fast   | 1074×          |
|     100000 |        / | Very Large | /      | /              |
Extended Knowledge
MySQL Memory Structure
According to the MySQL documentation, the InnoDB storage engine consists of memory storage (Buffer Pool, Log Buffer) and disk storage. Since random disk I/O is slow, InnoDB first reads data into the Buffer Pool and writes back via I/O threads.
Buffer Pool
The Buffer Pool caches frequently accessed data and index pages to avoid disk I/O. It is managed with an LRU algorithm, but a plain LRU is easily polluted by full-table scans. InnoDB therefore splits the list into a hot region (about 5/8) and a cold region (about 3/8): newly read pages enter the head of the cold region, and a page is promoted to the hot region only if it is accessed again after innodb_old_blocks_time (default 1000 ms) has elapsed.
# View the hot/cold promotion window (default 1000 ms)
mysql> show variables like 'innodb_old_blocks_time';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| innodb_old_blocks_time | 1000  |
+------------------------+-------+
1 row in set (0.00 sec)

Change Buffer
When a secondary-index page to be modified is not in the Buffer Pool, InnoDB records the change in the Change Buffer instead of reading the page from disk (this applies only to non-unique secondary indexes). When the page is later loaded, the buffered changes are merged into it, saving random disk reads and improving write throughput. If the server crashes first, the redo log ensures the buffered changes are not lost.
Performance Improvement Analysis
Transaction Merging
Batch inserts combine many VALUES tuples into a single statement and a single transaction, so the redo log is flushed once at commit instead of once per row, greatly reducing disk I/O. The gain is bounded, however: a statement cannot exceed max_allowed_packet, and very large transactions put pressure on the redo log and Buffer Pool, so throughput stops improving beyond a certain batch size.
# Single transaction: one statement with many VALUES tuples
insert into zcy_user(`user_name`,`sex`,`email`,`address`) values
(1,1,1,1),(2,2,2,2),...,(n,n,n,n);

# Alternative: many single-row statements inside one explicit transaction
begin;
insert into zcy_user(`user_name`,`sex`,`email`,`address`) values (1,1,1,1);
insert into zcy_user(`user_name`,`sex`,`email`,`address`) values (2,2,2,2);
...
insert into zcy_user(`user_name`,`sex`,`email`,`address`) values (n,n,n,n);
commit;

Ensuring Sequential Data
InnoDB stores table data in a clustered B+-tree ordered by primary key. Inserting rows with monotonically increasing keys appends to the rightmost page, producing sequential I/O; random keys trigger page splits and scattered page writes, i.e. costly random I/O.
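When primary keys are generated by the application rather than by auto_increment, sorting each batch by key before inserting approximates the append-only pattern described above. A minimal sketch (the key values are illustrative):

```java
import java.util.Arrays;

public class KeyOrdering {
    public static void main(String[] args) {
        // Application-generated keys arrive in arbitrary order.
        long[] keys = {42L, 7L, 99L, 1L};

        // Sorting the batch by primary key lets InnoDB fill B+-tree pages
        // left to right instead of splitting pages at random positions.
        Arrays.sort(keys);

        System.out.println(Arrays.toString(keys)); // [1, 7, 42, 99]
    }
}
```

In practice this means sorting the row list by its key field before handing it to the batch-insert mapper.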
Use Cases
Data report import
Bulk data initialization
Massive data migration (millions of rows)
Log table recording
Conclusion
When a project needs to handle millions of rows, performance must be engineered deliberately. The tests show that merging many single-row inserts into larger batches cuts execution time by three orders of magnitude. Combining rows into a single transaction, choosing a batch size that stays under the packet limit, and keeping primary keys sequential minimize database load and latency, resulting in a more stable application.
About the ZCY Technology Team
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.