Understanding MVCC and Transaction Isolation Levels in InnoDB
This article explains the fundamentals of MVCC, transaction isolation levels, and related concepts such as ACID properties, dirty reads, non‑repeatable reads, phantom reads, hidden columns, undo logs, version chains, read views, and demonstrates how InnoDB implements these mechanisms with practical SQL examples and diagrams.
Preface
MVCC implementation is a high‑frequency interview topic; during the holiday we discuss its principles.
1. Review of Related Database Knowledge
1.1 What Is a Database Transaction and Why It Exists
A transaction is a finite sequence of database operations that must either all succeed or all fail, forming an indivisible unit of work.
Example: A transfers 100 ¥ to B. If the debit from A succeeds but the credit to B fails due to a system error, the transaction must roll back to keep the balances consistent.
Why need transactions? To guarantee final data consistency.
1.2 Transaction Characteristics (ACID)
Transactions have four typical properties: Atomicity, Consistency, Isolation, Durability.
Atomicity: All operations in the transaction are executed together or not at all.
Consistency: Data remains valid before and after the transaction.
Isolation: Concurrent transactions do not interfere with each other.
Durability: Once committed, changes are permanently stored.
1.3 Concurrency Problems in Transactions
Concurrent transactions can cause dirty reads, non‑repeatable reads, and phantom reads .
1.3.1 Dirty Read
A dirty read occurs when a transaction reads data modified by another uncommitted transaction.
Example: Transaction B updates a row but has not committed; Transaction A reads the uncommitted value, resulting in a dirty read.
1.3.2 Non‑Repeatable Read
The same transaction reads the same row twice and gets different results.
Example: Transaction A reads a row, Transaction B modifies and commits it, then Transaction A reads the same row again and sees the new value.
1.3.3 Phantom Read
A phantom read occurs when a transaction re‑executes a range query and sees rows that were inserted by another transaction after the first execution.
Example: Transaction A queries rows with id > 2, Transaction B inserts a new row with id = 4 and commits, then Transaction A repeats the query and sees an extra row.
1.4 Four Isolation Levels
To solve dirty, non‑repeatable, and phantom reads, databases define four isolation levels: Read Uncommitted, Read Committed, Repeatable Read, Serializable .
1.4.1 Read Uncommitted
Allows reading uncommitted changes, leading to dirty, non‑repeatable, and phantom reads.
1.4.2 Read Committed
Only reads data committed by other transactions, eliminating dirty reads but still allowing non‑repeatable and phantom reads.
1.4.3 Repeatable Read
Prevents non‑repeatable reads; however, phantom reads may still occur unless gap locks are used.
1.4.4 Serializable
The strictest level; transactions execute in a serial order, avoiding all three concurrency problems at the cost of performance.
Isolation Level
Dirty Read
Non‑Repeatable Read
Phantom Read
Read Uncommitted
✓
✓
✓
Read Committed
✗
✓
✓
Repeatable Read
✗
✗
✓
Serializable
✗
✗
✗
1.5 How Databases Ensure Transaction Isolation
Isolation is mainly achieved through locking . For example, the Serializable level uses strict locks, which can degrade performance.
To mitigate lock‑induced performance loss, databases adopt MVCC (Multi‑Version Concurrency Control) , allowing reads without locks while writes still create new versions.
2. What Is MVCC?
MVCC stands for Multi‑Version Concurrency Control. It maintains multiple versions of a row, each identified by a hidden transaction ID, enabling snapshot reads without locking.
In MVCC, each row version carries a hidden trx_id and a roll_pointer to the undo log; visibility is decided by the transaction’s read view.
Both Read‑Committed and Repeatable‑Read isolation levels in InnoDB rely on MVCC.
3. Key MVCC Concepts
3.1 Transaction Version Number
Each transaction obtains an auto‑incremented ID; this ID determines the order of operations and serves as the version number.
3.2 Implicit Columns
In InnoDB each row has hidden columns trx_id , roll_pointer , and optionally row_id when no primary key exists.
Column
Required
Description
row_id
No
Monotonically increasing row identifier (6 bytes).
trx_id
Yes
Transaction ID that modified the row.
roll_pointer
Yes
Pointer to the undo‑log entry for the previous version.
3.3 Undo Log
The undo log records the previous state of a row before it is modified, enabling rollback and snapshot reads.
3.4 Version Chain
Each row’s multiple versions are linked via roll_pointer , forming a version chain.
3.5 Snapshot Read vs Current Read
Snapshot read: Reads a visible version without locking (ordinary SELECT).
select * from core_user where id > 2;Current read: Reads the latest version and acquires locks (SELECT … FOR UPDATE, SELECT … LOCK IN SHARE MODE).
select * from core_user where id > 2 for update;
select * from account where id > 2 lock in share mode;3.6 Read View
A read view captures the state of active transactions at a point in time. Its key fields are:
m_ids : IDs of currently active (uncommitted) transactions.
min_limit_id : Smallest active transaction ID.
max_limit_id : Next transaction ID to be assigned.
creator_trx_id : ID of the transaction that created the read view.
Visibility rules compare a row’s trx_id with these limits to decide if the version is visible.
4. MVCC Implementation Analysis
4.1 Querying a Record Using MVCC
Obtain the transaction’s own version number.
Generate a read view.
Fetch rows and compare each row’s trx_id with the read view.
If the version is not visible, retrieve the appropriate snapshot from the undo log.
Return the visible data.
In InnoDB, MVCC is realized by combining Read View and Undo Log .
4.2 Read‑Committed (RC) Isolation – Non‑Repeatable Read Example
Two transactions A and B operate on core_user . Transaction A reads the row, B updates it, commits, and A reads again. Because RC creates a new read view for each SELECT, A sees different values, demonstrating a non‑repeatable read.
4.3 Repeatable‑Read (RR) Isolation – Solving Non‑Repeatable Reads
RR creates a single read view for the whole transaction, so repeated SELECTs see the same snapshot, eliminating non‑repeatable reads.
4.4 Does MVCC Solve Phantom Reads?
RR uses gap locks for SELECT … LOCK IN SHARE MODE, preventing inserts into the queried range and thus avoiding phantom rows for current reads. However, snapshot reads can still exhibit phantom behavior in certain scenarios, so phantom reads are not completely eliminated.
References
[1] Database Fundamentals (Four) – InnoDB MVCC Implementation Principles: https://zhuanlan.zhihu.com/p/52977862
Recommended reading and community links follow the article.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.