
Understanding MySQL Parallel Replication: From Lag to Group Commit

This article explains why master‑slave lag occurs in MySQL, describes the evolution of parallel replication schemes—including group‑commit, Commit‑Parent‑Based, Lock‑Based, and WRITESET approaches—shows benchmark results, and provides practical configuration steps to enable high‑performance parallel replication.

Efficient Ops

Anyone who has maintained MySQL in production knows that master‑slave replication lag is a painful problem that can cause stale reads and affect high‑availability failover.

Contents:

Impact of replication lag

Overview of parallel replication schemes

Group‑commit based parallel replication (Commit‑Parent‑Based and Lock‑Based)

WRITESET scheme

Benchmark results

How to enable parallel replication

1. Impact of Replication Lag

Lag leads to two main issues: (1) read‑write split workloads may read stale data, and (2) large lag hampers the speed of failover because the replica must apply all pending binlog events before switching.

If the replica waits for all binlog changes before failover, service availability is reduced.

If it switches immediately, un‑applied changes are lost, which many applications cannot tolerate.

2. Parallel Replication Overview

MySQL has introduced several parallel replication schemes over time:

MySQL 5.6 – database-level (per-schema) parallelism, which helps little when the workload is concentrated in a single database with many tables.

MySQL 5.7 – group‑commit based parallelism.

MySQL 8.0 – WRITESET based parallelism (also back-ported to MySQL 5.7.22).

3. Group‑Commit Based Parallel Replication

3.1 Commit‑Parent‑Based Scheme

Transactions are split into a Prepare phase and a Commit phase. Because InnoDB uses pessimistic locking, any lock conflict would block one transaction before it reaches Prepare; therefore, if two transactions are both in the Prepare phase at the same time, they hold no conflicting locks and can safely be replayed in parallel on the replica.

Implementation details:

The master maintains a global counter that is incremented before a transaction commits at the storage engine level.

Before entering Prepare, the current counter value is stored in the transaction as its commit-parent.

The commit‑parent is written into the binlog header.

During replay, if two transactions share the same commit‑parent they are executed in parallel.

Example of seven transactions:

<code>Trx1 ------------P----------C--------------------------------&gt;</code>
<code>                            |</code>
<code>Trx2 ----------------P------+---C----------------------------&gt;</code>
<code>                            |   |</code>
<code>Trx3 -------------------P---+---+-----C----------------------&gt;</code>
<code>                            |   |     |</code>
<code>Trx4 -----------------------+-P-+-----+----C-----------------&gt;</code>
<code>                            |   |     |    |</code>
<code>Trx5 -----------------------+---+-P---+----+---C-------------&gt;</code>
<code>                            |   |     |    |   |</code>
<code>Trx6 -----------------------+---+---P-+----+---+---C----------&gt;</code>
<code>                            |   |     |    |   |   |</code>
<code>Trx7 -----------------------+---+-----+----+---+-P-+--C-------&gt;</code>
<code>                            |   |     |    |   |   |  |</code>

Result:

Trx1, Trx2, Trx3 execute in parallel.

Trx4 executes serially.

Trx5, Trx6 execute in parallel.

Trx7 executes serially.
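The grouping rule can be sketched in a few lines of Python. This is an illustration, not MySQL source; the commit-parent values below are read off the timeline diagram above, and the helper name is made up:

```python
from itertools import groupby

# Commit-parent values read off the timeline above: the global counter
# steps at every engine-level commit, and each transaction records the
# counter value it saw just before entering Prepare.
commit_parents = {
    "Trx1": 0, "Trx2": 0, "Trx3": 0,  # all prepared before any commit
    "Trx4": 1,                        # prepared after Trx1 committed
    "Trx5": 2, "Trx6": 2,             # prepared after Trx2 committed
    "Trx7": 5,                        # prepared after Trx5 committed
}

def parallel_groups(parents):
    """Consecutive transactions sharing a commit-parent form one parallel batch."""
    items = sorted(parents.items(), key=lambda kv: kv[0])
    return [[trx for trx, _ in grp]
            for _, grp in groupby(items, key=lambda kv: kv[1])]

print(parallel_groups(commit_parents))
# -> [['Trx1', 'Trx2', 'Trx3'], ['Trx4'], ['Trx5', 'Trx6'], ['Trx7']]
```

The output reproduces the result list above: three batches of parallel work and two serial transactions.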

3.2 Lock‑Based Scheme

This scheme introduces the concept of a locking interval, defined from the moment the last DML statement acquires its locks in the Prepare phase to the moment the locks are released just before the storage-engine commit.

If two transactions' locking intervals overlap, neither one was blocked waiting for the other's locks, so they have no conflict and can be replayed in parallel.

<code>Trx1 -----L---------C------------&gt;</code>
<code>Trx2 ----------L---------C-------&gt;</code>

Conversely, non‑overlapping intervals cannot be parallelized:

<code>Trx1 -----L----C-----------------&gt;</code>
<code>Trx2 ---------------L----C-------&gt;</code>
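The overlap test itself is the standard interval-intersection check. A minimal sketch (the function name and timestamps are illustrative):

```python
def intervals_overlap(lock1, commit1, lock2, commit2):
    """Two locking intervals [L, C) overlap iff each transaction takes
    its last lock before the other releases its locks at commit."""
    return lock1 < commit2 and lock2 < commit1

# Mirrors the two timelines above:
assert intervals_overlap(5, 17, 10, 21)        # overlapping -> parallel OK
assert not intervals_overlap(5, 10, 15, 20)    # disjoint -> serial replay
```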

The master tracks four variables:

global.transaction_counter – transaction counter.

transaction.sequence_number – per‑transaction sequence number.

global.max_committed_transaction – maximum committed sequence number.

transaction.last_committed – maximum committed sequence number before the transaction enters Prepare.

These values are written to the binlog (in the GTID_LOG_EVENT for GTID replication, or the ANONYMOUS_GTID_LOG_EVENT otherwise).
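The master-side bookkeeping of those four values can be sketched as follows. This is an illustrative model, not MySQL source; the class and method names are made up, but the fields mirror the variables listed above:

```python
import threading

class DependencyTracker:
    """Illustrative sketch of the master-side bookkeeping -- not MySQL
    source. Fields mirror the four tracked values described above."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.transaction_counter = 0   # global.transaction_counter
        self.max_committed = 0         # global.max_committed_transaction

    def on_prepare(self):
        # Assign the transaction's sequence_number and snapshot the
        # committed high-water mark as its last_committed.
        with self._mutex:
            self.transaction_counter += 1
            return {"sequence_number": self.transaction_counter,
                    "last_committed": self.max_committed}

    def on_commit(self, trx):
        # At engine-level commit, advance the committed high-water mark.
        with self._mutex:
            self.max_committed = max(self.max_committed,
                                     trx["sequence_number"])

tracker = DependencyTracker()
t1 = tracker.on_prepare()   # {'sequence_number': 1, 'last_committed': 0}
t2 = tracker.on_prepare()   # {'sequence_number': 2, 'last_committed': 0}
tracker.on_commit(t1)
t3 = tracker.on_prepare()   # {'sequence_number': 3, 'last_committed': 1}
```

Here t1 and t2 share last_committed = 0, so a replica may replay them together, while t3's last_committed = 1 forces it to wait for transaction 1.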

3.3 Parallel Replay Logic on the Replica

The replica maintains an ordered transaction_sequence queue, sorted by sequence_number, of the transactions currently executing. A new transaction can start executing in parallel only if its last_committed is smaller than the sequence_number of the first (oldest) transaction in the queue:

<code>transaction.last_committed &lt; transaction_sequence[0].sequence_number</code>

Applying the Lock-Based rule to the earlier seven-transaction example admits at least the groups found by the Commit-Parent scheme; because it only requires locking intervals to overlap rather than commit-parent values to match exactly, the Lock-Based scheme achieves higher overall parallelism.
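The replica's admission check can be written out directly (the function name is illustrative):

```python
def can_schedule(last_committed, inflight_sequence_numbers):
    """A transaction may start executing only when every transaction it
    depends on (sequence_number <= last_committed) has already left the
    ordered in-flight queue."""
    if not inflight_sequence_numbers:
        return True
    return last_committed < inflight_sequence_numbers[0]

# The in-flight queue currently holds transactions 5 and 6:
assert can_schedule(4, [5, 6])        # depends only on trx <= 4 -> run now
assert not can_schedule(5, [5, 6])    # must wait until trx 5 finishes
```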

4. WRITESET Scheme

The WRITESET scheme (available from MySQL 5.7.22 and 8.0) was originally used by Group Replication for conflict detection during the certification phase: two concurrent transactions from different nodes are considered non-conflicting if they do not modify the same rows.

4.1 Generating the Writeset

Extract primary‑key, unique‑index, and foreign‑key information for each modified row and concatenate them into a string.

Hash the string using the algorithm defined by transaction_write_set_extraction (default XXHASH64).

Insert the hash value into the transaction’s writeset.
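The three steps can be sketched as follows. This is an illustration, not MySQL source: each row's identifying key values are concatenated with the table name and hashed to 64 bits. MySQL uses XXHASH64 (per transaction_write_set_extraction); truncated SHA-1 stands in here because xxhash is not in the Python standard library.

```python
import hashlib
import struct

def row_writeset_hashes(table, row_keys):
    """Build a writeset: one 64-bit hash per identifying key of each
    modified row (illustrative sketch, not MySQL source)."""
    hashes = set()
    for key_name, key_value in row_keys:
        ident = "\0".join((table, key_name, str(key_value)))
        (h64,) = struct.unpack("<Q", hashlib.sha1(ident.encode()).digest()[:8])
        hashes.add(h64)
    return hashes

# Two transactions depend on each other iff their writesets intersect:
ws_a = row_writeset_hashes("db1.t1", [("PRIMARY", 1), ("PRIMARY", 2)])
ws_b = row_writeset_hashes("db1.t1", [("PRIMARY", 2)])
ws_c = row_writeset_hashes("db1.t1", [("PRIMARY", 3)])
assert ws_a & ws_b            # same row modified -> must stay ordered
assert not (ws_a & ws_c)      # disjoint rows -> eligible for parallel replay
```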

4.2 Implementation Details

<code>void Writeset_trx_dependency_tracker::get_dependency(THD *thd,
                                                     int64 &amp;sequence_number,
                                                     int64 &amp;commit_parent) {
  Rpl_transaction_write_set_ctx *write_set_ctx =
      thd-&gt;get_transaction()-&gt;get_transaction_write_set_ctx();
  std::vector&lt;uint64&gt; *writeset = write_set_ctx-&gt;get_write_set();
  // ... (logic to decide whether WRITESET can be used, update m_writeset_history, etc.)
}
</code>

The function determines whether a transaction can use WRITESET based on factors such as writeset size, a matching transaction_write_set_extraction setting, foreign-key relationships, and history-size limits.

If WRITESET cannot be used, the transaction falls back to the Lock‑Based scheme.

4.3 Relevant Parameters

binlog_transaction_dependency_tracking – selects the dependency‑tracking scheme (COMMIT_ORDER, WRITESET, WRITESET_SESSION).

transaction_write_set_extraction – hash algorithm for writeset (OFF, MURMUR32, XXHASH64).

binlog_transaction_dependency_history_size – maximum number of entries stored in the writeset history (default 25000).

5. Benchmark Results

MySQL’s official benchmarks compare COMMIT_ORDER, WRITESET_SESSION, and WRITESET under three workloads (OLTP read/write, indexed‑column update, write‑only) on a 16‑core SSD master with 8 M rows across 16 tables.

Key findings:

COMMIT_ORDER benefits from higher master concurrency; replication speed increases with more threads.

WRITESET’s performance is largely independent of master concurrency; even a single thread outperforms COMMIT_ORDER with 256 threads.

WRITESET_SESSION behaves like COMMIT_ORDER but still achieves good throughput at lower thread counts (4–8).

6. Enabling Parallel Replication

On the replica, set the following parameters (requires a replication restart):

<code>slave_parallel_type = LOGICAL_CLOCK</code>
<code>slave_parallel_workers = 16</code>
<code>slave_preserve_commit_order = ON</code>

To use the WRITESET scheme on the master, configure:

<code>binlog_transaction_dependency_tracking = WRITESET_SESSION</code>
<code>transaction_write_set_extraction = XXHASH64</code>
<code>binlog_transaction_dependency_history_size = 25000</code>
<code>binlog_format = ROW</code>

WRITESET works only when the binlog format is ROW.
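After restarting replication, the settings can be double-checked from the client (a config-check fragment; note that from MySQL 8.0.26 the slave_* variables are also available under replica_* names):

```shell
# Inspect the replica-side parallel-apply settings:
mysql -e "SHOW VARIABLES LIKE 'slave_parallel%';"
# Inspect the master-side dependency-tracking settings:
mysql -e "SHOW VARIABLES LIKE 'binlog_transaction_dependency%';"
mysql -e "SHOW VARIABLES LIKE 'transaction_write_set_extraction';"
```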

7. References

WL#6314: MTS – Prepared transactions slave parallel applier

WL#6813: MTS – ordered commits (sequential consistency)

WL#7165: MTS – Optimizing MTS scheduling by increasing the parallelization window on master

WL#8440: Group Replication – Parallel applier support

WL#9556: Writeset‑based MTS dependency tracking on master

WriteSet parallel replication (Chinese article)

Improving the Parallel Applier with Writeset‑based Dependency Tracking

Performance Benchmark · MySQL · binlog · group commit · writeset · parallel replication

Written by Efficient Ops

Efficient Ops is a public account maintained by Xiaotianguo and friends, regularly publishing original technical articles with a focus on operations transformation.

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.