Understanding MySQL Binlog, Undo Log, Redo Log, and Change Buffer
This article provides a comprehensive overview of MySQL’s logging mechanisms—including Binlog, Undo log, Redo log, and Change Buffer—explaining their concepts, roles in replication and crash‑recovery, recording formats, flush timing, two‑phase commit, checkpoint handling, and how they interact during data modifications.
Writing Background
Everyone knows that logs are an important component of MySQL databases, recording various state information during operation. MySQL logs mainly include error log , general query log , slow query log , binary log (binlog) , and transaction logs (redo log, undo log).
Among them, the binary log and transaction logs are especially important, but most people only understand them superficially. This article consolidates weeks of research on MySQL logs to help readers master these mechanisms.
Necessary Concept Dictionary Introduction
Before diving deeper, it is useful to review essential terminology. A separate “concept dictionary” article is referenced for readers who encounter unfamiliar terms.
Understanding Binary Log (Binlog)
Binlog Concept
Binlog is a logical log that records write operations (queries are not recorded). The Server layer records it independently of the storage engine. It is written in an append‑only fashion, and its size can be limited by the max_binlog_size parameter.
Binlog Role
Binlog is mainly used in two scenarios:
Master‑Slave Replication: the master enables Binlog, which is then sent to slaves for replay, ensuring data consistency.
Data Recovery: the mysqlbinlog tool can be used to restore data.
Binlog Recording Process and Flush Timing
Binlog is first written to the Binlog Buffer, then flushed to the OS buffer, and finally persisted to the Binlog file on disk according to the sync_binlog setting (0, 1, or N).
Binlog Format
Before MySQL 5.7.7 the default format was STATEMENT ; from that version onward the default is ROW . The format can be changed with the binlog-format parameter.
Understanding Transaction Log (Undo Log)
Undo Log Concept
Undo log is a logical log used for rollback. For example, an UPDATE that adds 3 will generate an Undo log entry that subtracts 3, allowing the transaction to be rolled back if an error occurs.
Undo Log Role
Rollback: ensures atomicity by reverting to the state before the transaction.
MVCC Consistency View: provides the version information needed for multi‑version concurrency control.
Undo Log Summary
The key points of Undo log are rollback and MVCC . Long‑running transactions generate many Undo log entries; they are created when a transaction starts and deleted when no longer needed.
Understanding Transaction Log (Redo Log)
Redo Log Concept
Redo log is a physical log belonging to the InnoDB engine. It records changes to data pages (e.g., adding a record to a page).
Redo Log Role
Crash‑Safe Forward Operation: guarantees that committed transactions are not lost after a power failure.
Performance Improvement: writes are sequential (WAL), reducing random‑write I/O.
Redo Log Two‑Phase Commit
After updating memory, the engine writes a Redo log entry and marks the transaction as prepare . The server then writes a Binlog entry and finally commits, marking the transaction as commit . This ensures consistency between Binlog and Redo log.
Redo Log Disaster Recovery Process
If the Redo log is complete (commit), it is used directly for recovery. If it is only in the prepare state, the Binlog is checked; a complete Binlog allows committing the Redo log, otherwise the transaction is rolled back.
Redo Log Flush Timing
Redo log is first written to the Redo Log Buffer, then flushed to the OS buffer and finally to disk according to innodb_flush_log_at_trx_commit (values 0, 1, 2). Value 1 is the safest and default.
Redo Log Storage Method
Redo log files are stored in a circular buffer: when the end is reached, writing wraps to the beginning. Two pointers— write pos and checkpoint —track the current write location and the oldest data to be overwritten.
Redo Log Checkpoint
Checkpoints are triggered under various conditions (e.g., buffer pool pressure, log size). During a checkpoint, dirty pages are flushed to disk and the checkpoint pointer advances.
Redo Log LSN (Log Sequence Number)
LSN is a logical sequence number that grows as logs are written. It is used to locate positions within the circular log files and to coordinate checkpoints, data pages, and redo logs.
Understanding Change Buffer
Why Mention Change Buffer
Change Buffer and Redo log are often confused; both use memory to reduce disk I/O, but they optimize different stages of the update process.
Change Buffer Concept and Role
When an update targets a page that is not in memory, the change is recorded in the Change Buffer instead of reading the page from disk. This saves random‑read I/O; the buffered changes are later merged.
Difference Between Change Buffer and Redo Log
Redo log reduces random‑write I/O by converting writes to sequential logs, while Change Buffer reduces random‑read I/O by avoiding page reads.
Change Buffer Merge Process
During a merge, the engine reads the affected page into memory, applies all buffered changes, and then writes a new redo log entry reflecting the final page state.
Log Integration (U‑R‑B) – Putting It All Together
Demo Data
Test statements:
1、insert into ta(a,b) values(2,5),(7,5)
2、select * from t where a in (2, 7)Assume page1 is in memory, page2 is not. The first row lands on page1, the second on page2.
Process Without Logs or Change Buffer
1. Read the target page from disk into memory. 2. Apply the modification in memory. 3. Write the updated page back to disk.
Process With All Logs and Change Buffer (InnoDB Flow)
Two‑Phase Commit
Update page1 directly in memory; record change for page2 in Change Buffer.
Write Undo log (cached, flushed according to its own parameters).
Write Redo log (including buffered changes).
Mark log state as prepare .
Write Binlog.
Commit transaction, changing state to commit .
Merge Process
When a query accesses a page not in memory (e.g., a=7), the engine merges the Change Buffer entry:
Read page2 into memory.
Apply buffered changes to obtain the latest page version.
Write a new Redo log entry reflecting the final page state.
Data Flush Process
Data pages are flushed to disk under four conditions: Redo log full, memory pressure, idle system, or normal shutdown. Flushing also advances the checkpoint, coordinating with Redo log persistence.
Conclusion
The article covered Binlog, Undo log, Redo log, and introduced Change Buffer, providing a holistic view of MySQL’s logging architecture. Readers are encouraged to experiment and ask questions, such as why Binlog lacks crash‑safe features and why two separate logs are needed for crash‑safety.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.