How an INSERT Statement Writes to Disk in MySQL InnoDB
This article explains the complete disk‑write journey of a MySQL INSERT statement, detailing the roles and order of redo log, binlog, double‑write, insert buffer, and data files, and clarifying InnoDB’s buffer‑pool and flushing mechanisms.
Authors: Huang Yan, Chief Technology Officer at iKex; Wang Yue, a member of the iKex R&D team, responsible for database-management-platform development and troubleshooting.
Source: Reprinted from the public account "Illustrated MySQL". All rights reserved; please contact the editor for permission.
Question
Which files are involved when an INSERT statement is written to disk, and what is the order of these operations?
Below we use two diagrams to illustrate the INSERT statement’s disk‑write journey.
Figure 1: Log file writes before transaction commit
The INSERT reaches the server layer where preliminary checks are performed; no disk I/O occurs at this stage.
After passing the checks, the statement enters the storage engine layer. InnoDB caches data pages in the in-memory buffer pool, so the INSERT modifies the cached page without immediately writing to disk.
The dirty page in the buffer pool does not have to be flushed before the transaction commits; flushing is handled later by background threads.
If innodb_flush_log_at_trx_commit=1, each transaction commit triggers a flush of the redo log to disk (the redo log is written sequentially, which is far more efficient than updating the data files in place).
If binary logging is enabled, the transaction’s logical changes are also written to the binlog; setting sync_binlog=1 forces the binlog to be flushed to disk on every commit.
In summary, when the buffer pool is sufficiently large and the two parameters above are set to 1, the actual disk writes performed on a successful INSERT are to the redo log and the binlog, not directly to the MySQL data files.
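The commit-time ordering above can be sketched as a toy model. This is a minimal illustration, not InnoDB internals: a dict stands in for the buffer pool, plain lists stand in for the fsync'ed redo log and binlog, and ToyEngine and its method names are invented for this example.

```python
class ToyEngine:
    def __init__(self):
        self.buffer_pool = {}    # page_id -> list of rows (memory only)
        self.data_files = {}     # page_id -> list of rows (on "disk")
        self.redo_log_disk = []  # redo records forced to disk at commit
        self.binlog_disk = []    # logical changes forced to disk at commit
        self._redo_pending = []  # redo records still in the in-memory buffer

    def insert(self, page_id, row):
        # The INSERT only touches the buffer pool and the redo-log buffer.
        self.buffer_pool.setdefault(page_id, []).append(row)
        self._redo_pending.append((page_id, row))

    def commit(self):
        # innodb_flush_log_at_trx_commit=1: force the redo log to disk first;
        # sync_binlog=1: then force the binlog. Data files are NOT written here.
        self.redo_log_disk.extend(self._redo_pending)
        self.binlog_disk.extend(("INSERT", p, r) for p, r in self._redo_pending)
        self._redo_pending.clear()

    def flush_dirty_pages(self):
        # Happens later, in the background (see Figure 2).
        self.data_files.update({p: rows[:] for p, rows in self.buffer_pool.items()})

engine = ToyEngine()
engine.insert(1, "row-a")
engine.commit()
assert engine.data_files == {}                 # no data-file write at commit
assert engine.redo_log_disk == [(1, "row-a")]  # redo made durable at commit
```

The point of the model is the ordering: at commit, only the two sequential logs are forced to disk, while the modified page stays dirty in memory.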
Double‑write background
InnoDB’s buffer‑pool pages are 16 KB each, but the file system and disk typically write in smaller units. If a crash occurs after only part of a page (e.g., the first 4 KB) has been written, the on‑disk page is torn and its data is lost. To avoid this, InnoDB employs a double‑write mechanism: before a dirty page is written to its final location, it is first written to a double‑write area in the shared tablespace, and that write is forced to disk. The redo log alone cannot repair a torn page, because it records changes applied at offsets within a page rather than the full page content; applying redo requires an intact page image to start from.
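The recovery logic that double‑write enables can be sketched as follows. This is a simplified model under stated assumptions: zlib.crc32 stands in for InnoDB's page checksum, dicts stand in for the double‑write area and the data file, and flush_page/recover are illustrative names, not real APIs.

```python
import zlib

PAGE = b"x" * 16384           # a full 16 KB page
doublewrite_area = {}         # staging copy in the shared tablespace
data_file = {}                # final page location

def checksum(page):
    return zlib.crc32(page)

def flush_page(page_id, page):
    # 1) Write the full page to the double-write area and force it to disk.
    doublewrite_area[page_id] = (page, checksum(page))
    # 2) Only then write the page to its final location in the data file.
    data_file[page_id] = (page, checksum(page))

def recover(page_id):
    page, crc = data_file[page_id]
    if checksum(page) != crc:                     # torn page in the data file
        page, crc = doublewrite_area[page_id]     # restore the intact copy
        data_file[page_id] = (page, crc)
    return data_file[page_id][0]

flush_page(7, PAGE)
data_file[7] = (PAGE[:4096], checksum(PAGE))      # simulate a torn 4 KB write
assert recover(7) == PAGE                         # the staged copy restores it
```

Because the double‑write copy is completed and durable before the final write begins, a crash can tear at most one of the two copies, and recovery always has an intact image.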
Insert buffer background
InnoDB stores table data in clustered‑index order, so inserts that follow the primary‑key order result in sequential disk writes. Inserts into secondary (non‑clustered) indexes, however, often land on scattered pages, requiring non‑sequential writes that slow insertion. The insert buffer collects these secondary‑index changes, merges entries destined for the same page, and applies them to the actual index pages in batches.
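The batching effect can be sketched like this. A toy model, not InnoDB internals: buffer_insert, merge, and apply_to_index_page are invented names, and dicts of lists stand in for buffered entries and on‑disk index pages.

```python
from collections import defaultdict

buffered = defaultdict(list)     # index_page_id -> pending entries
index_pages = defaultdict(list)  # the "on-disk" secondary-index pages
page_writes = 0                  # counts random index-page writes

def buffer_insert(index_page_id, key):
    buffered[index_page_id].append(key)   # no index-page I/O yet

def apply_to_index_page(page_id, keys):
    index_pages[page_id].extend(sorted(keys))

def merge():
    # One pass applies every buffered entry for a page in a single write.
    global page_writes
    for page_id, keys in sorted(buffered.items()):
        apply_to_index_page(page_id, keys)
        page_writes += 1
    buffered.clear()

for i in range(100):
    buffer_insert(i % 4, f"key-{i:03d}")  # 100 inserts touch only 4 pages
merge()
assert page_writes == 4                   # 4 batched writes instead of 100
```

The win is turning many small random index writes into a few larger, per‑page batch writes, at the cost of the changes being temporarily visible only via the buffer.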
Figure 2: Data file writes after transaction commit
When enough dirty pages accumulate in the buffer pool, or when I/O pressure is low, InnoDB triggers a flush of dirty pages to disk.
If double‑write is enabled, dirty pages are first copied to the double‑write buffer; because the double‑write area is stored contiguously on disk, this extra write is sequential and incurs little performance overhead.
Regardless of double‑write, the dirty pages are eventually flushed to the tablespace data files, after which the corresponding buffer‑pool space is released.
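The trigger condition for this background flush can be sketched as a threshold check. Assumptions are flagged in the comments: the 75% threshold mirrors the MySQL 5.7 default of innodb_max_dirty_pages_pct (8.0 raised the default to 90), and mark_dirty/maybe_flush are illustrative names, not real functions.

```python
BUFFER_POOL_PAGES = 8   # toy pool size; real pools hold millions of pages
MAX_DIRTY_PCT = 75      # mirrors the 5.7 default innodb_max_dirty_pages_pct

dirty_pages = set()
flushed = []

def mark_dirty(page_id):
    dirty_pages.add(page_id)
    maybe_flush()

def maybe_flush():
    # Flush when dirty pages exceed the configured share of the pool.
    if len(dirty_pages) / BUFFER_POOL_PAGES * 100 > MAX_DIRTY_PCT:
        flushed.extend(sorted(dirty_pages))  # via double-write, then data files
        dirty_pages.clear()                  # buffer-pool space is reclaimed

for p in range(7):
    mark_dirty(p)
assert dirty_pages == set()     # 7/8 = 87.5% > 75% triggered a flush
assert flushed == list(range(7))
```

Real InnoDB flushing is adaptive (it also tracks redo‑log headroom and I/O capacity), but the dirty‑page‑ratio threshold is the core idea this step describes.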
The insert buffer is part of the buffer pool; when the pool needs to evict pages, insert‑buffer pages may be written to the shared tablespace’s insert‑buffer file.
If innodb_stats_persistent=ON, InnoDB statistics are also flushed to the system tables innodb_table_stats and innodb_index_stats, avoiding real‑time recomputation.
In certain cases, double‑write can be bypassed, such as when it is disabled or for operations that do not require double‑write protection (e.g., DROP TABLE).
In short, an INSERT statement’s data traverses the redo log, binlog, (double‑write, insert buffer) shared tablespace, and finally lands in the user’s tablespace.