
What’s the Optimal Batch Size for MySQL Inserts? A Deep Performance Test

This article investigates how many rows should be inserted per batch in MySQL by measuring the impact of packet size limits, buffer pool usage, insert buffers, transaction handling and index structures, and it provides practical recommendations based on tests with millions of rows.

Liangxu Linux

Introduction

When inserting large volumes of data (millions of rows, for example) into MySQL, batch insertion is the preferred technique, but the optimal batch size is not obvious. The article explores how batch size affects performance when inserting into a temporary table.

Preparation

The author initially used a loop that inserted 1000 rows per batch, mirroring other projects, and decided to experiment with different batch sizes.
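The batching loop described above can be sketched in Python as follows. This is a minimal illustration, not the author's code: the helper names `iter_batches` and `build_insert` are hypothetical, the table and column names are borrowed from the transaction example later in the article, and the `%s` placeholder style is an assumption that matches common MySQL drivers.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def build_insert(table, columns, batch):
    """Build one multi-row INSERT statement plus its flat parameter list."""
    column_list = ", ".join(f"`{c}`" for c in columns)
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([one_row] * len(batch))
    sql = f"INSERT INTO `{table}` ({column_list}) VALUES {values}"
    params = [value for row in batch for value in row]
    return sql, params


# Example: 2500 rows split into batches of 1000 (the author's starting point).
rows = [(str(i), f"userid_{i}", f"content_{i}", 0) for i in range(2500)]
columns = ["datetime", "uid", "content", "type"]
for batch in iter_batches(rows, 1000):
    sql, params = build_insert("insert_table", columns, batch)
    # A real program would pass sql and params to a driver's cursor.execute().
    print(len(batch), len(params))
```

Changing the second argument of `iter_batches` is all it takes to try a different batch size, which is the experiment the rest of the article runs.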

Checking MySQL version

mysql> select version();
+------------+
| version()  |
+------------+
| 5.6.34-log |
+------------+
1 row in set (0.00 sec)

Table schema

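The temporary table has four columns: three int(10) fields and one varchar(10) field. A single row is estimated at 4+4+4+40 = 52 bytes: 4 bytes per int, plus up to 40 bytes for the varchar, assuming utf8mb4 at a maximum of 4 bytes per character. This is an upper bound on the column data and ignores per-row storage overhead.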

Time distribution of a single insert

Connection setup (30%)
Send query (20%)
Parse query (20%)
Insert operation (10% * row count)
Insert index (10% * index count)
Close connection (10%)

The dominant cost is connection and parsing, not the actual INSERT execution, which motivates larger batch sizes to reduce round‑trip overhead.
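To see why the breakdown above favours larger batches, the percentages can be plugged into a toy cost model. This is only a sketch under stated assumptions: one connection is reused for the whole load, send and parse costs are charged once per statement regardless of statement length, and the row and index costs are charged once per row. The function name `relative_cost` is hypothetical, and the result is in arbitrary units, not seconds.

```python
# Relative weights copied from the breakdown above (arbitrary units).
CONNECT, SEND, PARSE, PER_ROW, PER_INDEX, CLOSE = 30, 20, 20, 10, 10, 10


def relative_cost(total_rows, batch_size, index_count=1):
    """Estimate the relative cost of loading total_rows in batches.

    Assumptions (not from the article): a single connection is opened and
    closed once, send/parse is paid once per statement, and the insert and
    index work is paid once per row.
    """
    statements = -(-total_rows // batch_size)  # ceiling division
    connection_cost = CONNECT + CLOSE
    statement_cost = statements * (SEND + PARSE)
    row_cost = total_rows * (PER_ROW + PER_INDEX * index_count)
    return connection_cost + statement_cost + row_cost


single = relative_cost(1000, 1)      # 1000 single-row statements
batched = relative_cost(1000, 1000)  # one statement with 1000 rows
print(single, batched)  # 60040 20080
```

In this model the per-row work is identical in both cases. The entire difference comes from paying the send and parse overhead 1000 times instead of once.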

SQL size limits

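The max_allowed_packet variable caps the size of a single packet, and therefore of a single SQL statement. The article works from a conservative 1 MiB limit for a client query; for reference, the server default was 1 MiB before MySQL 5.6.6 and is 4 MiB in later 5.6 and 5.7 releases, and the mysql command-line client defaults to 16 MiB. The current setting on the test server is 33554432 bytes (32 MiB).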

Maximum rows per batch

Using the 1 MiB limit and the 52‑byte row size, the theoretical maximum is (1024*1024)/52 ≈ 20164 rows per batch. To stay safely below the limit, the author caps the batch at 20000 rows per MiB, which scales to 20000 * 32 = 640000 rows for a 32 MiB packet.
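The arithmetic can be checked with a few lines of Python. This sketch uses only the row size and packet sizes given in the article; the function name `max_rows_per_packet` is hypothetical. It counts column data only, so a real statement, which also carries the INSERT prefix, parentheses, commas and quoting, fits somewhat fewer rows.

```python
ROW_BYTES = 4 + 4 + 4 + 40  # three ints plus a 10-character utf8mb4 varchar
MIB = 1024 * 1024


def max_rows_per_packet(packet_bytes, row_bytes=ROW_BYTES):
    """Upper bound on rows whose raw column data fits in one packet."""
    return packet_bytes // row_bytes


print(max_rows_per_packet(1 * MIB))   # 20164
print(max_rows_per_packet(32 * MIB))  # 645277
print(20000 * 32)                     # 640000, the rounded-down cap
```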

Performance tests

110,000 rows

110000 rows, batch 10      → 2.361 s
110000 rows, batch 600     → 0.523 s
110000 rows, batch 1000    → 0.429 s
110000 rows, batch 20000   → 0.426 s
110000 rows, batch 80000   → 0.352 s

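Increasing the batch size reduces total time, but the gains shrink quickly: moving from 10 to 1,000 rows per batch cuts the time from 2.361 s to 0.429 s, while moving from 1,000 to 80,000 saves less than 0.1 s more.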
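One way to read these timings is to count how many INSERT statements each batch size needs for 110,000 rows. The sketch below is illustrative only (the function name `statement_count` is hypothetical) and says nothing about how long each statement takes.

```python
def statement_count(total_rows, batch_size):
    """Number of INSERT statements needed, using ceiling division."""
    return -(-total_rows // batch_size)


for batch_size in (10, 600, 1000, 20000, 80000):
    print(batch_size, statement_count(110000, batch_size))
# 10 11000
# 600 184
# 1000 110
# 20000 6
# 80000 2
```

Going from 10 to 1,000 rows per batch removes almost 10,900 round trips, while going from 1,000 to 80,000 removes only about a hundred more, which is consistent with the flattening curve.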

241,397 rows (about 240,000)

241397 rows, batch 10      → 4.445 s
241397 rows, batch 600     → 1.187 s
241397 rows, batch 1000    → 1.130 s
241397 rows, batch 20000   → 0.933 s
241397 rows, batch 80000   → 0.753 s

Here the largest batch tested, 80,000 rows, is again the fastest, which suggests that the 20,000-row cap derived from the 1 MiB limit is not the performance ceiling on this server.

418,859 rows (about 420,000)

418859 rows, batch 1000    → 2.216 s
418859 rows, batch 80000   → 1.777 s
418859 rows, batch 160000  → 1.523 s
418859 rows, batch 200000  → 1.432 s
418859 rows, batch 300000  → 1.362 s
418859 rows, batch 400000  → 1.764 s

Performance improves up to about 300,000 rows per batch and then degrades at 400,000, likely because the statement grows toward the max_allowed_packet limit (400,000 rows of 52 bytes is roughly 19.8 MiB of the 32 MiB allowed) and memory consumption rises.

Other factors affecting insert speed

Buffer pool

If less than 25% of the InnoDB buffer pool remains free, inserts can fail with DB_LOCK_TABLE_FULL. The test server has innodb_buffer_pool_size = 128 MiB, which leaves at most 64 MiB for the insert buffer (50% of the pool is the ceiling for innodb_change_buffer_max_size; the default is 25%).

Insert buffer (IBUF)

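InnoDB uses an insert buffer (part of the change buffer since MySQL 5.5) to batch updates to non‑unique secondary indexes, merging them later instead of performing random I/O for every row. When the insert buffer occupies a large share of the buffer pool, it leaves less room for cached data pages and can slow other operations.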

Transactions

Wrapping many inserts in a single transaction reduces per‑statement overhead. Example:

START TRANSACTION;
INSERT INTO `insert_table` (`datetime`,`uid`,`content`,`type`) VALUES ('0','userid_0','content_0',0);
INSERT INTO `insert_table` ...
COMMIT;

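The redo log generated by one transaction should stay below innodb_log_buffer_size (≈64 MiB). A larger transaction forces InnoDB to write the log buffer to disk before the commit, which slows the insert.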
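The single-transaction pattern can be tried without a MySQL server using Python's built-in sqlite3 module. Note the swap: SQLite stands in for MySQL here purely so the sketch is runnable, the column types are simplified, and the helper name `insert_in_one_transaction` is hypothetical. With a MySQL driver the structure is the same (begin, many inserts, one commit), but the timings will differ.

```python
import sqlite3


def insert_in_one_transaction(conn, rows):
    """Insert all rows and commit once; roll back everything on error."""
    with conn:  # the connection context manager commits or rolls back
        conn.executemany(
            "INSERT INTO insert_table (datetime, uid, content, type) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )


# In-memory SQLite database as a stand-in for the MySQL test table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insert_table "
    "(datetime TEXT, uid TEXT, content TEXT, type INTEGER)"
)

rows = [(str(i), f"userid_{i}", f"content_{i}", 0) for i in range(1000)]
insert_in_one_transaction(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM insert_table").fetchone()[0]
print(count)  # 1000
```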

Configuration tuning

Increasing innodb_buffer_pool_size can improve read/write throughput if memory is available.

Adjusting max_allowed_packet upward allows larger batches but does not guarantee optimal performance.

Indexes

Multiple indexes increase insert cost because each row must update the index structures. Inserting in primary‑key order (append‑only) is fastest; inserting into the middle of a B‑tree forces page splits and more disk I/O.

Conclusion

The author recommends a batch that fills roughly half of max_allowed_packet (for example, 300,000 to 320,000 rows of 52 bytes for a 32 MiB packet) as a practical compromise between speed and memory usage. However, the true optimum depends on server configuration, buffer pool size, transaction limits, and index design, so testing with real data is essential.
