MySQL Metadata Lock (MDL): Usage, Implementation, Deadlock Detection, and Lock‑Free Optimizations
This article explains MySQL's Metadata Lock (MDL) subsystem, describing how MDL protects metadata, the lock types and modes, the transaction‑level implementation introduced in MySQL 5.5 and refined in 5.7, the lock acquisition and release algorithms, deadlock detection mechanisms, and the lock‑free hash table used to store lock objects.
1. Introduction
Metadata Lock (MDL) protects metadata in MySQL. It was introduced in MySQL 5.5, moving from statement‑level protection to transaction‑level protection, which solves two major problems: the inability to achieve repeatable‑read (RR) isolation and replication errors caused by metadata lock handling.
Example scenario where RR isolation cannot be achieved:
tx1: BEGIN;
tx1: SELECT * FROM tbl; -- acquires metadata lock (c1, c2), then releases it
tx2: ALTER TABLE tbl DROP COLUMN c2; -- acquires metadata lock, succeeds, releases it
tx1: SELECT * FROM tbl; -- returns (c1), cannot repeat‑read
tx1: END;Another scenario where replication stops due to lock ordering problems:
tx1: BEGIN;
tx1: INSERT INTO tbl (c1, c2) VALUES (v1, v2);
tx2: ALTER TABLE tbl DROP COLUMN c2; -- commits and writes binlog 1
tx1: END; -- commits, writes binlog 2, replica replays binlog 1 then binlog 2 and failsMySQL 5.5 introduced MDL with transaction‑level semantics, and later versions added many optimizations such as a new hash function for lock keys, partitioned lock hash tables, and lock‑free MDL hash tables (5.7).
2. MDL Usage
Key concepts for using the MDL subsystem:
MDL_key : a three‑tuple namespace:dbname:objectname that uniquely identifies the lock target. Namespaces classify objects (e.g., GLOBAL, SCHEMA, TABLESPACE, COMMIT for scoped locks; TABLE for object locks).
MDL_type : the lock mode, such as IX, S, X, SH, SR, SW, SU, etc. Each mode has specific semantics for metadata and table access.
MDL_duration : lifetime of the lock – statement, transaction, or explicit release.
MDL_request : a structure that bundles type, duration, and key and is passed to the lock‑acquire API.
3. MDL Implementation
3.1 Basic Elements of the Lock
The implementation relies on two fundamental building blocks:
Atomic operations : a boolean variable (0/1) represents the lock state. Atomic primitives such as TAS or CAS (e.g., xchg, cmpxchg) are used to modify this variable safely.
Wait queue, sleep and wake‑up : when a lock cannot be obtained, the thread is placed on a wait queue and sleeps. Wake‑up occurs when the lock is released. MySQL uses pthread mutexes, condition variables, and the futex system call for this purpose.
MySQL’s implementation of these primitives is based on libpthread (pthread_mutex_t, pthread_cond_t, etc.).
3.2 Detailed Implementation
Important data structures:
struct MDL_key {
enum enum_mdl_namespace { GLOBAL, TABLESPACE, SCHEMA, TABLE, COMMIT, ... };
char m_ptr[]; // triple‑string stored consecutively
}; enum enum_mdl_type { MDL_INTENTION_EXCLUSIVE, MDL_SHARED, MDL_SHARED_HIGH_PRIO, MDL_SHARED_READ, MDL_SHARED_WRITE, MDL_SHARED_UPGRADABLE, MDL_SHARED_NO_WRITE, MDL_SHARED_NO_READ_WRITE, MDL_EXCLUSIVE, ... }; enum enum_mdl_duration { MDL_STATEMENT, MDL_TRANSACTION, MDL_EXPLICIT }; class MDL_request {
enum_mdl_type type;
enum_mdl_duration duration;
MDL_ticket *ticket; // result of lock acquisition
MDL_key key;
};Core classes:
class MDL_ticket {
enum_mdl_type m_type;
enum_mdl_duration m_duration;
MDL_context *m_ctx;
MDL_lock *m_lock;
}; class MDL_wait {
enum enum_wait_status m_wait_status;
mysql_mutex_t m_LOCK_wait_status;
mysql_cond_t m_COND_wait_status;
void timed_wait();
}; class MDL_context {
MDL_wait m_wait;
Ticket_list m_tickets; // all locks held by the thread
MDL_wait_for_subgraph *m_waiting_for; // the lock the thread is currently waiting for
}; class MDL_map {
LF_HASH m_locks; // lock‑free hash table for all non‑GLOBAL/COMMIT locks
MDL_lock *m_global_lock; // single GLOBAL lock
MDL_LOCK *m_commit_lock; // single COMMIT lock
};The lock acquisition flow:
Call try_acquire_lock_impl() . If the lock can be granted immediately, return.
If not, create an MDL_ticket and insert it into the appropriate MDL_lock 's wait queue.
Run the deadlock detector ( find_deadlock() ). If no deadlock is found, the thread sleeps via MDL_wait::timed_wait() .
When awakened, the thread checks whether the lock was granted, timed‑out, or aborted, and cleans up accordingly.
Fast‑path acquisition is possible for non‑obtrusive modes (IX, S, SH, SR, SW). The lock’s m_fast_path_state holds counters for each mode; a CAS loop atomically increments the appropriate counter, granting the lock without taking the heavy rwlock.
If the fast path cannot be used, the algorithm takes the lock’s rwlock, checks the compatibility matrix (granted vs. waiting bitmaps), and either grants the lock or places the ticket on the waiting list.
Release logic mirrors acquisition:
Fast‑path: atomically decrement the corresponding counter and call reschedule_waiters() to wake a waiting thread.
Slow‑path: take the rwlock, remove the ticket from the granted list, update the fast‑path state, and invoke reschedule_waiters() .
4. Deadlock Detection
Each MDL_context can wait on at most one lock (stored in m_waiting_for ). The lock object knows which threads hold it ( m_granted ) and which threads are waiting ( m_waiting ). By traversing these relationships a wait‑for graph is built.
If a cycle is found, the detector selects the victim with the lowest dead‑lock weight (provided by MDL_ticket::getdeadlockweight() ) and forces it to release its locks, breaking the cycle. The detection algorithm combines a breadth‑first scan of immediate neighbours with a depth‑first search for deeper dependencies, using the Deadlock_detection_visitor helper class.
5. Lock‑Free Hash Table (LF_HASH)
All lock objects (except GLOBAL and COMMIT) are stored in a global lock‑free hash table. The table uses a copy‑on‑write strategy: updates create a new version of the bucket and atomically replace the global pointer.
Memory reclamation is performed with hazard pointers (implemented as LFPIN structures). Each thread records the pointers it is currently accessing; when a bucket is replaced, the old version is placed in a “purgatory” list. Once no thread’s hazard pointer references an old version, the memory is returned to the LFALLOCATOR pool.
The double‑check pattern ensures that a thread never dereferences a pointer that has been reclaimed:
void *ref = NULL;
while (1) {
ref = g_ptr; // 1. read global pointer
g_hp_local_array[thread_id] = ref; // 2. publish as hazard pointer
if (ref == g_ptr) break; // 3. verify it was not changed
}This mechanism guarantees safe concurrent reads and writes without global locks.
6. Summary
The MDL subsystem provides transaction‑level metadata protection in MySQL, offering a rich set of lock modes, fast‑path lock‑free acquisition, a robust deadlock detector, and a lock‑free hash table for scalable lock storage.
Tencent Database Technology
Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.