Understanding InnoDB Storage Engine: Logs, Buffer Pool, and Data Write Process
This article explains the core components of MySQL's InnoDB storage engine—including undo log, redo log, buffer pool, change buffer, and binlog—detailing their roles in data writing, crash recovery, MVCC, and master‑slave synchronization, and describing related flushing strategies and LRU management.
Introduction
This article focuses on the InnoDB storage engine. In MySQL, various logs and buffers such as binlog, undo log, and redo log play crucial roles—for example, binlog enables master‑slave recovery, while undo log supports data rollback.
The article explains the purpose of each region during MySQL runtime.
Data Write Process
All read/write operations in MySQL are performed in memory. The diagram below shows the detailed data‑write flow.
Write an undo log before the actual data to enable rollback.
Write to the buffer pool or change buffer to record data in memory.
Record a redo log to protect against crashes; the redo log is first written to its buffer and then flushed to disk according to the chosen flushing strategy.
After the redo log, write a binlog entry to support master‑slave synchronization and data recovery.
Use a two‑phase commit: after binlog succeeds, write a commit record to the redo log buffer to keep binlog and redo log consistent.
Three redo‑log flushing strategies are described.
Flush the buffer pool to disk.
undo log
The undo log records the state of data before a transaction starts, enabling rollback and supporting MVCC for non‑locking reads.
Rollback: The undo log stores the pre‑transaction state so that a rollback can restore the original data.
MVCC: The undo log allows reading historical versions of rows, providing snapshot isolation without locks.
buffer pool
Most queries and writes in InnoDB operate on the buffer pool , which acts as the engine's cache. Its usage can be inspected with show engine innodb status and its size configured via innodb_buffer_pool_size . InnoDB employs a variant of the LRU algorithm, dividing the pool into a young (hot) region and an old (cold) region to avoid polluting the cache with large table scans.
The flush list manages dirty pages: when data is written, it first goes to the buffer pool , becomes dirty, and is later flushed to the .ibd data file.
change buffer
The change buffer temporarily stores updates for pages that are not yet in the buffer pool , allowing InnoDB to avoid loading those pages from disk. Its size relative to the buffer pool can be set with innodb_change_buffer_max_size . However, if a page must be loaded (e.g., for a unique‑index update), the change buffer cannot be used.
redo log and redo log buffer
The redo log protects against data loss when MySQL crashes before the buffer pool is flushed. It is a circular file that records physical page changes sequentially, enabling fast writes and recovery on restart. Writes go first to the redo log buffer and are flushed to the redo log file according to the innodb_flush_log_at_trx_commit setting, which offers three strategies:
Log buffer written to file; background thread flushes to disk every second.
Log buffer written to file and flushed to disk immediately.
Log buffer written to file; background thread flushes to disk every second (same as 1 but described differently).
For high data integrity, set the parameter to 1.
Note: In step 5 of the first diagram, a second commit occurs. If a transaction is found uncommitted during recovery, MySQL checks the binlog; if the binlog contains the data, it is restored, otherwise it is discarded.
binlog
The binary log records data‑change events in a compact binary format, enabling efficient storage, data recovery, and replication. By default, binlog is disabled; it can be enabled with log-bin and server-id in the MySQL configuration. Binlog records are written at transaction commit and can be stored in three formats:
Statement: Stores each executed SQL statement, but may cause inconsistencies with nondeterministic functions.
Row: Stores the exact row changes, eliminating statement‑level inconsistencies but increasing size for large updates.
Mixed: Combines statement and row formats, chosen per statement.
Conclusion
In InnoDB, operations prioritize in‑memory processing for performance. The undo log enables rollback and MVCC, the redo log safeguards against crashes, and the binlog supports data synchronization and recovery. This article provides a brief overview of each component; future articles will delve deeper into each area.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.