What’s the Optimal Batch Size for MySQL Inserts? A Deep Performance Test
This article investigates how many rows should be inserted per batch in MySQL, analyzes the limits of SQL statement size, buffer pool, transaction overhead, and indexes, and presents empirical tests that reveal the most efficient batch size for different data volumes.
Introduction
Batch inserting rows into MySQL reduces the number of client‑server round‑trips and is the preferred method when loading large tables or log files. The following experiment uses a temporary InnoDB table to identify a practical batch size.
Preparation before batch insert
Table definition
The test table has four columns: three int(10) columns and one varchar(10) column. Keeping the schema minimal limits memory consumption.
Row size calculation
In InnoDB an int occupies 4 bytes regardless of the display width. A varchar(10) that stores Chinese characters under utf8mb4 can use up to 40 bytes (4 bytes per character). Therefore a single row consumes roughly 4 + 4 + 4 + 40 = 52 bytes.
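The row-size arithmetic can be checked with a short Python sketch. The function and constant names are illustrative assumptions, and the estimate deliberately ignores InnoDB record headers and the varchar length byte, as the text does.

```python
# Rough worst-case payload for one row; InnoDB per-record overhead is ignored.
INT_BYTES = 4                # an InnoDB INT is 4 bytes, whatever the display width
BYTES_PER_UTF8MB4_CHAR = 4   # worst case per character under utf8mb4


def estimate_row_bytes(int_columns: int, varchar_chars: int) -> int:
    """Return the worst-case data bytes for a row of INTs plus one VARCHAR."""
    return int_columns * INT_BYTES + varchar_chars * BYTES_PER_UTF8MB4_CHAR


row_bytes = estimate_row_bytes(int_columns=3, varchar_chars=10)
print(row_bytes)  # 52
```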
Typical time distribution of a single INSERT
Connection establishment – 30 %
Sending query to server – 20 %
Parsing query – 20 %
Insert operation – 10 % × number of rows
Index updates – 10 % × number of indexes
Connection close – 10 %
Because most of the time goes to per-statement overhead (connecting, sending, parsing, and closing) rather than to the insert itself, batch inserts aim to minimise the number of statements executed.
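To see why fewer statements help, the weights in the list can be turned into a toy cost model. This is only a sketch: it assumes that every statement pays the fixed overhead once and that index maintenance is paid per row and per index, neither of which the list states explicitly, and the units are arbitrary rather than measured time.

```python
# Toy model: relative weights copied from the list above, in arbitrary units.
FIXED_PER_STATEMENT = 30 + 20 + 20 + 10   # connect + send + parse + close
PER_ROW = 10                              # insert operation, per row
PER_INDEX = 10                            # index update, per index


def relative_cost(total_rows: int, batch_size: int, indexes: int = 1) -> int:
    """Relative cost of loading total_rows in statements of batch_size rows.

    Assumption (not from the source): each statement pays the fixed overhead
    once, and index maintenance is charged per row and per index.
    """
    statements = -(-total_rows // batch_size)   # ceiling division
    per_row = PER_ROW + PER_INDEX * indexes
    return statements * FIXED_PER_STATEMENT + total_rows * per_row


print(relative_cost(1000, 1))     # 100000
print(relative_cost(1000, 1000))  # 20080
```

In this model the per-row work is the same in both cases; only the fixed overhead shrinks as the batch grows.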
Batch insert tests
SQL statement size limit
The server variable max_allowed_packet controls the maximum packet size. On the test server it is set to 32 MiB (33,554,432 bytes), so each INSERT statement must stay well below this limit.
Maximum rows per batch
Using the 52‑byte row size, a 1 MiB packet can hold about 20,165 rows. To stay safe the author caps a batch at 20,000 rows. With the full 32 MiB limit the theoretical maximum is roughly 640,000 rows.
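The row limits above follow from dividing the packet size by the row size. A minimal sketch, with a hypothetical helper name; it uses floor division, so it reports one row fewer than the rounded 1 MiB figure, and it counts only raw row data, not the quotes, commas, and parentheses of the SQL text, so the real ceiling is lower.

```python
MIB = 1024 * 1024


def max_rows_per_packet(packet_bytes: int, row_bytes: int) -> int:
    """Upper bound on rows whose raw data fits in one packet."""
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    return packet_bytes // row_bytes


print(max_rows_per_packet(1 * MIB, 52))    # 20164
print(max_rows_per_packet(32 * MIB, 52))   # 645277
```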
Empirical insertion speed
110 000 rows were inserted with different batch sizes. The measured execution times are:
Batch size → Time (seconds)
10 → 2.361
600 → 0.523
1 000 → 0.429
20 000 → 0.426
80 000 → 0.352
Increasing the batch size reduces total time, but the improvement plateaus after a few thousand rows. Additional tests with 240 000 and 420 000 rows show that performance continues to improve up to roughly 300 000 rows; beyond that the time rises again, likely because the server’s memory allocation for the packet becomes a bottleneck.
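A test like this needs the client to split the rows into batches and build one multi-row INSERT per batch. The sketch below shows one way to do that in Python; it is not the author's test harness. The helper names are hypothetical, the `%s` placeholder style is the one common MySQL drivers accept, and the table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[int, int, str, int]


def chunked(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def build_insert(table: str, columns: Sequence[str], row_count: int) -> str:
    """Build one multi-row INSERT with a %s placeholder per value."""
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([one_row] * row_count)
    )


rows = [(i, i, "text", 0) for i in range(110_000)]
batches = list(chunked(rows, 20_000))
print(len(batches), len(batches[-1]))  # 6 10000
```

Each batch would then be sent with the driver's `execute`, passing the flattened row values as parameters.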
Other factors affecting insert performance
Buffer‑pool usage
If the InnoDB buffer pool has less than 25 % free space, inserts can fail with DB_LOCK_TABLE_FULL. The test server’s buffer pool is 128 MiB, leaving up to 64 MiB for the insert buffer.
Insert buffer
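InnoDB’s insert buffer (called the change buffer in current MySQL versions) caches changes to secondary indexes, reducing random I/O but consuming buffer-pool memory. Older versions allowed it to occupy up to half of the buffer pool; in MySQL 5.6 and later its share is set by innodb_change_buffer_max_size, which defaults to 25 percent and can be raised to at most 50.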
Transactions
START TRANSACTION;
INSERT INTO insert_table (datetime, uid, content, type) VALUES (...);
... (multiple rows)
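COMMIT;
Wrapping many rows in a single transaction eliminates the per-row commit overhead. The redo log generated by the transaction should ideally fit within innodb_log_buffer_size (≈64 MiB on the test server); a larger transaction still succeeds, but InnoDB then has to write log to disk before the commit, which slows the load.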
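From client code, the same pattern is a loop of batched inserts followed by a single commit. The sketch below uses Python's built-in sqlite3 module purely as a runnable stand-in for a MySQL connection; that substitution is an assumption for illustration, and a MySQL driver would use `%s` placeholders instead of `?`. The function name is hypothetical.

```python
import sqlite3


def insert_in_one_transaction(conn, rows, batch_size):
    """Insert rows in batches, committing once at the end; roll back on error."""
    cur = conn.cursor()
    try:
        for start in range(0, len(rows), batch_size):
            cur.executemany(
                "INSERT INTO insert_table (datetime, uid, content, type) "
                "VALUES (?, ?, ?, ?)",
                rows[start:start + batch_size],
            )
        conn.commit()      # one commit covers every batch
    except Exception:
        conn.rollback()    # discard all batches if any one fails
        raise


# sqlite3 in-memory database standing in for the MySQL test table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insert_table "
    "(datetime INTEGER, uid INTEGER, content TEXT, type INTEGER)"
)
rows = [(i, i, "text", 0) for i in range(1000)]
insert_in_one_transaction(conn, rows, batch_size=250)
print(conn.execute("SELECT COUNT(*) FROM insert_table").fetchone()[0])  # 1000
```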
Configuration tweaks
Increasing innodb_buffer_pool_size improves read/write throughput, provided sufficient RAM is available.
Index impact
Each secondary index adds work to an INSERT because the index structures must be updated. Inserting rows in primary‑key order minimizes B‑tree splits and page reorganisations, yielding higher throughput.
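One simple way to get primary-key order is to sort the rows on the client before batching. The sketch below assumes, hypothetically, that the second column (uid) is the primary key; the source does not say which column is.

```python
from operator import itemgetter


def sort_by_primary_key(rows, key_index):
    """Return rows ordered by the column at position key_index."""
    return sorted(rows, key=itemgetter(key_index))


rows = [
    (1700000003, 42, "c", 0),
    (1700000001, 7, "a", 1),
    (1700000002, 19, "b", 0),
]
ordered = sort_by_primary_key(rows, key_index=1)  # uid assumed to be the key
print([r[1] for r in ordered])  # [7, 19, 42]
```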
Conclusion
The most reliable rule of thumb is to size each batch so that the statement occupies roughly half of max_allowed_packet. On the tested system (row size ≈ 52 bytes, max_allowed_packet = 32 MiB) half the packet is 16 MiB, which translates to about 300,000 rows per INSERT and is consistent with the optimum measured above. The exact optimum, however, depends on row size, index layout, buffer-pool health, and transaction configuration, so each workload should be tuned individually.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
