
Optimizing MySQL Batch Insert Performance: Determining the Ideal Batch Size

This article analyzes MySQL batch insertion performance, explains how row size, max_allowed_packet, buffer pool, transaction handling, and index design affect throughput, and presents empirical tests that suggest using roughly half of the max_allowed_packet size as the optimal batch size for large data loads.

Top Architect

When inserting massive amounts of data into MySQL, batch inserts are far more efficient than single-row statements, but the optimal batch size depends on several factors such as row size, server configuration, and storage engine behavior.

1. Introduction

Large tables or log files often require bulk loading; the key question is how many rows should be inserted per batch for best performance.

2. Preparation for Batch Inserts

The author originally used a loop inserting 1,000 rows per batch because other projects did so, but decided to test other sizes.

First, check the MySQL version because behavior varies across versions.

mysql> select version();
+------------+
| version()  |
+------------+
| 5.6.34-log |
+------------+
1 row in set (0.00 sec)

2.1 Table Schema

The temporary table has four columns: three int(10) fields and one varchar(10) field, keeping each row small.

field1 int(10)
field2 int(10)
field3 int(10)
field4 varchar(10)

2.2 Row Size Calculation

In InnoDB, each int occupies 4 bytes, and a varchar(10) column in utf8mb4 can take up to 40 bytes of character data, so a row carries at most about 52 bytes of data (3 × 4 + 40), not counting per-row storage overhead.
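The arithmetic above can be written out as a small sketch. It assumes the worst case of 4 bytes per utf8mb4 character and ignores InnoDB's per-row overhead (record header, length bytes, hidden columns); the function name is illustrative only.

```python
# Worst-case data bytes for one row of the test table in section 2.1.
# Assumptions: utf8mb4 needs at most 4 bytes per character, and InnoDB's
# per-row storage overhead is ignored.

INT_BYTES = 4           # a MySQL INT is always 4 bytes; (10) is only a display width
UTF8MB4_MAX_BYTES = 4   # worst case per character in utf8mb4


def max_row_bytes(int_columns: int, varchar_chars: int) -> int:
    """Return the worst-case data bytes for a row of INTs plus one VARCHAR."""
    return int_columns * INT_BYTES + varchar_chars * UTF8MB4_MAX_BYTES


print(max_row_bytes(int_columns=3, varchar_chars=10))  # 3*4 + 10*4 = 52
```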

2.3 Time Distribution of an Insert

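Opening the connection (30%)
Sending the query to the server (20%)
Parsing the query (20%)
Insert operation (10% × number of rows)
Index insert (10% × number of indexes)
Closing the connection (10%)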

In this rough model, connecting, sending the query, and parsing it account for about 70 % of a single-row insert. Batch inserts reduce that fixed cost by putting many rows into one round‑trip.
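To see why batching helps under this model, here is a minimal sketch that treats the percentages as relative cost units. It assumes every statement pays the fixed costs once and each row adds one insert unit; the index term is left out, and a real loader that reuses one connection would pay the connect and close costs only once. The numbers are illustrative, not measurements.

```python
import math

# Relative cost units taken from the breakdown above.
# Assumption: each statement pays connect + send + parse + close once.
FIXED_PER_STATEMENT = 30 + 20 + 20 + 10
PER_ROW = 10  # insert operation, per row


def relative_cost(total_rows: int, batch_size: int) -> int:
    """Return the modelled cost of loading total_rows in batches of batch_size."""
    statements = math.ceil(total_rows / batch_size)
    return statements * FIXED_PER_STATEMENT + total_rows * PER_ROW


for batch_size in (1, 10, 1000):
    print(batch_size, relative_cost(1000, batch_size))
```

In this model the per-row cost is constant, so the gain from larger batches shrinks quickly once the fixed cost is spread over a few hundred rows.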

3. Batch Insert Experiments

Tests were run with different batch sizes on datasets of 110 k, 240 k, and 420 k rows.

3.1 SQL Size Limit

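The max_allowed_packet variable sets the maximum size of a single packet, and therefore of a single SQL statement. The server default was 1 MiB before MySQL 5.6.6 and is 4 MiB from 5.6.6 onward; the mysql client defaults to 16 MiB. In the author's environment it is 32 MiB.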

show variables like '%max_allowed_packet%';
+--------------------------+------------+
| Variable_name            | Value      |
+--------------------------+------------+
| max_allowed_packet       | 33554432   |
| slave_max_allowed_packet | 1073741824 |
+--------------------------+------------+

3.2 Calculating Maximum Rows per Batch

Using a 1 MiB limit and a 52‑byte row, about 20 000 rows fit safely; with a 32 MiB limit, up to 640 000 rows are possible.
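The estimate above is a plain integer division, sketched below. Note the assumption: it divides the packet limit by the 52-byte row estimate, whereas the limit actually applies to the SQL text sent over the wire (quotes, commas, parentheses, numbers written as digits), so real statements may hit the limit earlier or later. The `fraction` parameter is a hypothetical knob for leaving headroom, such as the one-half the article later settles on.

```python
MIB = 1024 * 1024


def max_rows_per_batch(packet_bytes: int, row_bytes: int, fraction: float = 1.0) -> int:
    """Return how many rows of row_bytes fit into fraction of packet_bytes."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return int(packet_bytes * fraction) // row_bytes


print(max_rows_per_batch(1 * MIB, 52))        # 20164, about 20 000
print(max_rows_per_batch(32 * MIB, 52))       # 645277, about 640 000
print(max_rows_per_batch(32 * MIB, 52, 0.5))  # 322638, about 320 000
```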

3.3 Test Results

For 110 k rows:

10 rows   → 2.361 s
600 rows → 0.523 s
1 000 rows → 0.429 s
20 000 rows → 0.426 s
80 000 rows → 0.352 s

For 240 k rows:

10 rows   → 4.445 s
600 rows → 1.187 s
1 000 rows → 1.130 s
20 000 rows → 0.933 s
80 000 rows → 0.753 s

For 420 k rows, performance improves up to ~300 k rows per batch, then degrades, which the author attributes to memory pressure.

1 000 rows → 2.216 s
80 000 rows → 1.777 s
160 000 rows → 1.523 s
200 000 rows → 1.432 s
300 000 rows → 1.362 s
400 000 rows → 1.764 s

The author concludes that the optimal batch size is roughly half of max_allowed_packet, i.e., around 320 k rows for a 32 MiB packet, balancing throughput and memory usage.
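A loader that applies a chosen batch size might look like the following sketch. It is not the author's test harness: the helper names are hypothetical, and the `%s` placeholder style is an assumption that matches common Python MySQL drivers. The functions only build the statement and its parameters; executing them is left to whatever driver is in use.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple


def chunked(rows: Iterable[Sequence], batch_size: int) -> Iterator[List[Sequence]]:
    """Yield successive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def build_multi_row_insert(
    table: str, columns: Sequence[str], batch: Sequence[Sequence]
) -> Tuple[str, list]:
    """Return one multi-row INSERT statement and its flattened parameters."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    column_list = ", ".join(f"`{c}`" for c in columns)
    sql = (
        f"INSERT INTO `{table}` ({column_list}) VALUES "
        + ", ".join([row_placeholder] * len(batch))
    )
    params = [value for row in batch for value in row]
    return sql, params
```

Each batch then becomes a single statement and a single round-trip, for example `cursor.execute(sql, params)` with a DB-API cursor.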

4. Other Factors Influencing Insert Performance

4.1 Buffer Pool Utilization

If the InnoDB buffer pool has less than 25 % free space, inserts can fail with DB_LOCK_TABLE_FULL, unrelated to max_allowed_packet.

4.2 Insert Buffer

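InnoDB's Insert Buffer (part of the change buffer since MySQL 5.5) caches inserts into non-unique secondary indexes and merges them later, reducing random I/O. It also occupies part of the buffer pool: up to 25 % by default and at most 50 %, as set by innodb_change_buffer_max_size.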

4.3 Transactions

Wrapping many inserts in a single transaction reduces per‑statement overhead. Example:

START TRANSACTION;
INSERT INTO `insert_table` (`datetime`,`uid`,`content`,`type`) VALUES ('0','userid_0','content_0',0);
INSERT INTO `insert_table` (`datetime`,`uid`,`content`,`type`) VALUES ('1','userid_1','content_1',1);
... 
COMMIT;

However, an overly large transaction can fill the redo log buffer (innodb_log_buffer_size, 64 MiB in the test environment). InnoDB then has to write log data to disk before the transaction commits, which degrades performance.
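One way to keep transactions bounded is to commit after a fixed number of rows. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs without a MySQL server; this is a substitution, not the article's setup. With a MySQL driver the same DB-API pattern applies, usually with `%s` placeholders instead of `?`. The function names and the 1,000-row commit interval are assumptions.

```python
import sqlite3
from itertools import islice

INSERT_SQL = (
    "INSERT INTO insert_table (datetime, uid, content, type) VALUES (?, ?, ?, ?)"
)


def insert_with_bounded_transactions(conn, rows, rows_per_commit):
    """Insert rows, committing once per rows_per_commit rows; return the commit count."""
    it = iter(rows)
    commits = 0
    while True:
        chunk = list(islice(it, rows_per_commit))
        if not chunk:
            return commits
        # In sqlite3's default mode a transaction is opened implicitly here.
        conn.executemany(INSERT_SQL, chunk)
        conn.commit()
        commits += 1


def demo():
    """Load 2,500 generated rows into an in-memory table, 1,000 rows per commit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE insert_table (datetime TEXT, uid TEXT, content TEXT, type INTEGER)"
    )
    rows = ((str(i), f"userid_{i}", f"content_{i}", i) for i in range(2500))
    commits = insert_with_bounded_transactions(conn, rows, rows_per_commit=1000)
    (count,) = conn.execute("SELECT COUNT(*) FROM insert_table").fetchone()
    conn.close()
    return commits, count
```

The commit interval is a separate knob from the statement batch size: several multi-row statements can share one transaction as long as the total stays within what the log buffer handles comfortably.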

4.4 Configuration Tweaks

Increasing innodb_buffer_pool_size improves read/write speed if memory is available.

4.5 Index Impact

Multiple indexes increase insert cost; inserting rows in primary‑key order minimizes B‑tree splits.
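If the source data is not already ordered, sorting it by primary key before batching is one simple way to get sequential inserts. The sketch below assumes the rows are tuples with the primary key at a known position; the article does not state which column of the test table, if any, is the primary key.

```python
from operator import itemgetter


def sort_by_primary_key(rows, pk_index=0):
    """Return rows ordered by the value at pk_index (assumed to be the primary key)."""
    return sorted(rows, key=itemgetter(pk_index))


unsorted_rows = [(3, "c"), (1, "a"), (2, "b")]
print(sort_by_primary_key(unsorted_rows))
```

For data sets too large to sort in memory, ordering at the source (for example with ORDER BY in the export query) achieves the same effect.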

5. Summary

Empirical testing shows that the best batch size is about half of the max_allowed_packet value, but the true optimum also depends on buffer pool size, transaction size, and index design. Tuning these parameters together yields the highest insert throughput.
