
Understanding MySQL Index Structures, Execution Plans, Transaction Isolation Levels, MVCC, and Buffer Pool Mechanisms

This article provides a comprehensive technical guide on MySQL internals, covering the underlying data structures of indexes (B‑tree, B+‑tree, hash), how execution plans are generated and interpreted, the four transaction isolation levels, MVCC implementation, and the InnoDB buffer‑pool architecture.

Top Architect
1. MySQL Index Data Structures and Algorithms

Indexes are essential for MySQL to retrieve data efficiently and in sorted order. MySQL mainly uses B+ trees and hash structures; hash is fast for direct lookups but unsuitable for ordering.

Binary search trees offer O(log n) lookups when balanced, but unbalanced trees can degenerate, and each additional level of tree height typically costs one more disk I/O.

Binary Tree : In the worst case it degrades to a linked list, increasing query time.

Red‑Black Tree : A balanced binary tree that mitigates degeneration but still faces height growth.

B‑Tree : A multi‑way tree that stores many keys per node, greatly reducing height; however, every node (not just the leaves) also stores the row data, which lowers the number of entries that fit in a fixed‑size page and limits how flat the tree can be.

B+‑Tree : An evolution of B‑Tree where only leaf nodes store row data and internal nodes contain only keys; leaf nodes are linked (MySQL uses a double‑linked list) to support efficient range scans.

In InnoDB, each index node occupies one page, 16 KB by default (innodb_page_size). Internal nodes hold only keys and child pointers, so a single page fits hundreds to thousands of entries; a tree three or four levels deep can index tens of millions of rows, and page‑sized reads benefit from OS read‑ahead.
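As a quick sanity check, the page size and a typical index layout can be inspected directly; the users table below is a hypothetical example, not from the original article:

```sql
-- Each B+-tree node occupies one page; 16 KB is the default.
SHOW VARIABLES LIKE 'innodb_page_size';

-- Hypothetical table: the clustered index leaves hold full rows,
-- while the secondary index leaves hold only (email, id).
CREATE TABLE users (
  id    BIGINT PRIMARY KEY,
  email VARCHAR(100),
  INDEX idx_email (email)
);
```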

2. SQL Execution Plan Analysis

Execution plans are crucial for SQL tuning. Using EXPLAIN you can view columns such as:

id : the identifier of each SELECT in the statement; higher ids execute first, and equal ids execute top to bottom.

table : the table being accessed.

type : optimization level (system > const > eq_ref > ref > range > index > ALL).

key : the actual index used.

rows : estimated rows examined.

Extra : additional info (e.g., Using index, Using where, Using temporary, Using filesort).
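A minimal EXPLAIN run might look like the following; the orders table, the idx_customer index, and the plan shown are illustrative assumptions, not output from a real server:

```sql
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42
ORDER BY created_at;

-- A plausible plan if an index on customer_id exists:
--   type: ref  key: idx_customer  rows: 120  Extra: Using filesort
-- A composite index (customer_id, created_at) could remove the
-- filesort by returning rows already in the requested order.
```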

3. How a SQL Statement Executes in MySQL

The client maintains a long‑lived TCP connection to the MySQL server.

In MySQL 5.7 and earlier, the server first checks the query cache for a cached result set keyed by the exact SQL text; the query cache was deprecated in 5.7 and removed entirely in MySQL 8.0.

The parser performs lexical and syntactic analysis and builds a parse tree.

The optimizer generates an execution plan, honoring hints such as FORCE INDEX where given.

The executor invokes the appropriate storage engine (InnoDB, MyISAM, Memory, etc.).

The execution engine uses the chosen index to locate rows and updates the index as needed.
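When the optimizer's choice is demonstrably worse, an index hint can override it; the table and index names here are hypothetical:

```sql
SELECT order_id, status
FROM orders FORCE INDEX (idx_status)
WHERE status = 'PAID'
  AND created_at > '2024-01-01';
```

FORCE INDEX is best treated as a last resort: it pins the plan even after changes in data distribution make the hint counterproductive.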

4. MySQL Locks and Transaction Isolation Levels

Concurrent transactions can cause dirty writes, dirty reads, non‑repeatable reads, and phantom reads. MySQL solves these with transaction isolation levels and locking mechanisms.

ACID properties of a transaction are:

Atomicity

Consistency

Isolation

Durability

The four isolation levels are:

Read Uncommitted : can see uncommitted changes.

Read Committed : sees only committed data.

Repeatable Read : the same query returns the same rows throughout a transaction (InnoDB's default level).

Serializable : the highest isolation level; InnoDB implicitly converts plain SELECTs into locking reads, so conflicting transactions effectively run one at a time.

Lock types include optimistic vs. pessimistic, read (shared) vs. write (exclusive), and table‑level vs. row‑level locks. InnoDB supports row‑level locks and transactions, while MyISAM does not.
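The shared/exclusive distinction maps directly onto locking‑read syntax; the accounts table is a hypothetical example:

```sql
-- Shared (read) lock: others may also read-lock, but not modify, the row.
SELECT * FROM accounts WHERE id = 1 FOR SHARE;   -- MySQL 8.0 syntax
-- SELECT ... LOCK IN SHARE MODE;                -- pre-8.0 equivalent

-- Exclusive (write) lock: blocks both shared and exclusive requests.
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
```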

Example commands:

```sql
SHOW VARIABLES LIKE '%isolation%';
SET transaction_isolation = 'REPEATABLE-READ';
```
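A two‑session sketch of Repeatable Read behavior (the accounts table and the values are made up for illustration):

```sql
-- Session A
SET SESSION transaction_isolation = 'REPEATABLE-READ';
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- e.g. returns 100

-- Session B (concurrently)
UPDATE accounts SET balance = 200 WHERE id = 1;
COMMIT;

-- Session A, same transaction
SELECT balance FROM accounts WHERE id = 1;  -- still 100: same snapshot
COMMIT;
```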

5. Deep Dive into MVCC (Multi‑Version Concurrency Control)

MVCC allows reads without locking and provides high concurrency for read‑heavy workloads. It works via a ReadView and an undo_log version chain .

Each row carries hidden columns: trx_id (the ID of the transaction that last modified the row) and roll_pointer (a pointer into the undo log). Following roll_pointer yields a singly linked version chain whose head, the row stored in the table, is the latest version; older versions live in the undo log.

A ReadView captures the set of transaction IDs that were active (uncommitted) at the moment it was created. Under Read Committed a new ReadView is built for every query, so each statement sees the latest committed data; under Repeatable Read the ReadView from the first read is reused for the whole transaction, which keeps snapshot reads repeatable and, for non‑locking reads, prevents phantoms as well. Locking (current) reads rely on next‑key locks instead.
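The snapshot‑versus‑current‑read distinction can be seen inside one transaction; the accounts table is assumed for illustration:

```sql
START TRANSACTION WITH CONSISTENT SNAPSHOT;  -- pin the ReadView now, not at first read

SELECT balance FROM accounts WHERE id = 1;            -- snapshot read via the version chain
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE; -- current read: latest committed
                                                      -- version, plus a next-key lock
COMMIT;
```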

6. InnoDB Buffer Pool Mechanism

MySQL does not write directly to disk for each request; instead it uses a Buffer Pool to cache pages in memory, reducing random I/O.

Query loads the required page from disk into the Buffer Pool.

Before modification, the original row is written to the undo log.

The Buffer Pool page is updated.

Changes are recorded in the redo‑log buffer.

On transaction commit, the redo‑log buffer is flushed to the redo log file (with innodb_flush_log_at_trx_commit=1, this includes an fsync on every commit).

As part of the same commit, the server writes the binary log; the redo log and binlog are kept consistent through an internal two‑phase commit (prepare, write binlog, commit).

Once both logs are durable the commit is complete; the modified (dirty) pages are flushed to the data files later by background threads.

This design enables high‑throughput reads and writes while ensuring data consistency.
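The relevant knobs and counters are visible from the client; the values in the comments are defaults or examples, not guarantees:

```sql
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';        -- default 128 MB; size it to the hot data set
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'; -- 1 = fsync redo log on every commit
SHOW ENGINE INNODB STATUS\G                           -- see the BUFFER POOL AND MEMORY section
```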

--- For further reading, the original article includes many diagrams and screenshots illustrating the concepts.

Tags: Indexing, MySQL, SQL Optimization, Transaction Isolation, MVCC, Buffer Pool
Written by Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
