Understanding MySQL Parallel Replication: From Lag to Group Commit
This article explains why master‑slave lag occurs in MySQL, describes the evolution of parallel replication schemes—including group‑commit, Commit‑Parent‑Based, Lock‑Based, and WRITESET approaches—shows benchmark results, and provides practical configuration steps to enable high‑performance parallel replication.
Anyone who has maintained MySQL in production knows that master‑slave replication lag is a painful problem that can cause stale reads and affect high‑availability failover.
Contents:
Impact of replication lag
Overview of parallel replication schemes
Group‑commit based parallel replication (Commit‑Parent‑Based and Lock‑Based)
WRITESET scheme
Benchmark results
How to enable parallel replication
1. Impact of Replication Lag
Lag leads to two main issues: (1) read‑write split workloads may read stale data, and (2) large lag hampers the speed of failover because the replica must apply all pending binlog events before switching.
If the replica waits for all binlog changes before failover, service availability is reduced.
If it switches immediately, un‑applied changes are lost, which many applications cannot tolerate.
2. Parallel Replication Overview
MySQL has introduced several parallel replication schemes over time:
MySQL 5.6 – database‑level parallelism (of little use when a single database holds many tables, since such workloads offer nothing to parallelize across databases).
MySQL 5.7 – group‑commit based parallelism.
MySQL 8.0 – WRITESET based parallelism.
3. Group‑Commit Based Parallel Replication
3.1 Commit‑Parent‑Based Scheme
Transactions go through a Prepare phase and a Commit phase. Because InnoDB uses pessimistic locking, two transactions that are both in the Prepare phase at the same time cannot hold conflicting locks (otherwise one would be blocked waiting for the other), so they can safely be replayed in parallel on the replica.
Implementation details:
The master maintains a global counter that is incremented before a transaction commits at the storage engine level.
Before entering Prepare, the current counter value is stored in the transaction as its commit‑parent.
The commit‑parent is written into the binlog header.
During replay, if two transactions share the same commit‑parent they are executed in parallel.
Example of seven transactions:
<code>Trx1 ------------P----------C--------------------------------></code>
<code>                            |</code>
<code>Trx2 ----------------P------+---C----------------------------></code>
<code>                            |   |</code>
<code>Trx3 -------------------P---+---+-----C----------------------></code>
<code>                            |   |     |</code>
<code>Trx4 -----------------------+-P-+-----+----C-----------------></code>
<code>                            |   |     |    |</code>
<code>Trx5 -----------------------+---+-P---+----+---C-------------></code>
<code>                            |   |     |    |   |</code>
<code>Trx6 -----------------------+---+---P-+----+---+---C----------></code>
<code>                            |   |     |    |   |   |</code>
<code>Trx7 -----------------------+---+-----+----+---+-P-+--C-------></code>
<code>                            |   |     |    |   |   |  |</code>
Result:
Trx1, Trx2, Trx3 execute in parallel.
Trx4 executes serially.
Trx5, Trx6 execute in parallel.
Trx7 executes serially.
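The counter bookkeeping described above can be sketched as follows. This is a minimal simulation, not MySQL source; all names are illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Global counter, incremented each time a transaction commits at the
// storage-engine level (illustrative stand-in for the real counter).
static int64_t g_commit_counter = 0;

struct Trx {
  int64_t commit_parent = -1;  // counter value captured before Prepare
};

// Called when a transaction enters the Prepare phase on the master.
void on_prepare(Trx &t) { t.commit_parent = g_commit_counter; }

// Called when a transaction commits in the storage engine.
void on_commit(Trx &) { ++g_commit_counter; }

// On the replica: two transactions may be replayed in parallel
// if they recorded the same commit-parent.
bool can_run_in_parallel(const Trx &a, const Trx &b) {
  return a.commit_parent == b.commit_parent;
}
```

In the seven‑transaction timeline, Trx1–Trx3 all prepare before any commit happens, so they share commit‑parent 0 and form one parallel group; Trx4 prepares after Trx1's commit, so it gets a new commit‑parent and runs alone.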
3.2 Lock‑Based Scheme
This scheme introduces the concept of a locking interval, defined from the moment the last DML statement of a transaction acquires its lock in the Prepare phase to the moment the locks are released at storage‑engine commit.
If two transactions have overlapping locking intervals, they have no lock conflict and can be replayed in parallel.
<code>Trx1 -----L---------C------------></code>
<code>Trx2 ----------L---------C-------></code>
Conversely, transactions whose locking intervals do not overlap cannot be replayed in parallel:
<code>Trx1 -----L----C-----------------></code>
<code>Trx2 ---------------L----C-------></code>
To implement this, the master tracks four variables:
global.transaction_counter – transaction counter.
transaction.sequence_number – per‑transaction sequence number.
global.max_committed_transaction – maximum committed sequence number.
transaction.last_committed – maximum committed sequence number before the transaction enters Prepare.
These values are written to the binlog (as GTID_LOG_EVENT for GTID‑based replication, or ANONYMOUS_GTID_LOG_EVENT otherwise).
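A simplified model of how the master maintains these values (illustrative names and call points; the real code assigns sequence_number during binlog flush inside the group‑commit machinery):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Illustrative stand-ins for the global variables named above.
static int64_t g_transaction_counter = 0;        // global.transaction_counter
static int64_t g_max_committed_transaction = 0;  // global.max_committed_transaction

struct Trx {
  int64_t sequence_number = 0;  // assigned when the transaction is binlogged
  int64_t last_committed = 0;   // max committed sequence number seen before Prepare
};

// Entering Prepare: snapshot the highest committed sequence number so far.
void on_prepare(Trx &t) { t.last_committed = g_max_committed_transaction; }

// Writing the binlog: hand out the next sequence number in order.
void on_binlog(Trx &t) { t.sequence_number = ++g_transaction_counter; }

// Storage-engine commit: advance the committed high-water mark.
void on_commit(const Trx &t) {
  g_max_committed_transaction =
      std::max(g_max_committed_transaction, t.sequence_number);
}
```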
3.3 Parallel Replay Logic on the Replica
The replica maintains a transaction_sequence queue ordered by sequence_number. A new transaction can start executing only if its last_committed is smaller than the sequence_number of the first (oldest) transaction still in the queue:
<code>transaction.last_committed < transaction_sequence[0].sequence_number</code>
Applying this rule to the earlier seven‑transaction example reproduces the groups found by the Commit‑Parent scheme; in general, because overlapping locking intervals capture more concurrency than identical commit parents, the Lock‑Based scheme achieves higher overall parallelism.
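The admission rule can be expressed directly. This is a sketch under simplified assumptions; the real applier also assigns transactions to worker threads and enforces commit ordering:

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

struct Trx {
  int64_t sequence_number;
  int64_t last_committed;
};

// Transactions currently executing on worker threads, kept sorted by
// sequence_number (events arrive from the relay log in binlog order).
static std::deque<Trx> transaction_sequence;

// A new transaction may start while the current batch is still running
// only if everything it depends on has already committed, i.e. its
// last_committed precedes the oldest still-running transaction.
bool can_schedule(const Trx &t) {
  return transaction_sequence.empty() ||
         t.last_committed < transaction_sequence.front().sequence_number;
}

void schedule(const Trx &t) { transaction_sequence.push_back(t); }
```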
4. WRITESET Scheme
Introduced in MySQL 8.0, the WRITESET scheme is primarily used by Group Replication for conflict detection during the certification phase. Two concurrent transactions from different nodes are considered non‑conflicting if they do not modify the same row.
4.1 Generating the Writeset
Extract primary‑key, unique‑index, and foreign‑key information for each modified row and concatenate them into a string.
Hash the string using the algorithm selected by transaction_write_set_extraction (default XXHASH64).
Insert the hash value into the transaction’s writeset.
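The three steps can be sketched like this, using std::hash as an illustrative stand‑in for XXHASH64; the row_identifier layout is a simplification of the real encoding:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Build the identifier string for one index entry of a modified row:
// index, table, and database names plus the index values, concatenated.
std::string row_identifier(const std::string &db, const std::string &table,
                           const std::string &index,
                           const std::vector<std::string> &values) {
  std::string s = index + table + db;
  for (const auto &v : values) s += v;
  return s;
}

// Hash the identifier. MySQL uses the algorithm selected by
// transaction_write_set_extraction (XXHASH64 by default);
// std::hash here is only an illustrative substitute.
uint64_t hash_identifier(const std::string &s) {
  return std::hash<std::string>{}(s);
}

// A transaction's writeset: hashes of every row it touched via
// primary keys, unique indexes, or foreign keys.
struct Writeset {
  std::set<uint64_t> hashes;
  void add(const std::string &db, const std::string &table,
           const std::string &index, const std::vector<std::string> &values) {
    hashes.insert(hash_identifier(row_identifier(db, table, index, values)));
  }
};

// Two transactions conflict if their writesets intersect.
bool conflicts(const Writeset &a, const Writeset &b) {
  for (uint64_t h : a.hashes)
    if (b.hashes.count(h)) return true;
  return false;
}
```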
4.2 Implementation Details
<code>void Writeset_trx_dependency_tracker::get_dependency(THD *thd,
                                                     int64 &sequence_number,
                                                     int64 &commit_parent) {
  Rpl_transaction_write_set_ctx *write_set_ctx =
      thd->get_transaction()->get_transaction_write_set_ctx();
  std::vector<uint64> *writeset = write_set_ctx->get_write_set();
  // ... (logic to decide whether WRITESET can be used, update m_writeset_history, etc.)
}
</code>
The function determines whether a transaction can use WRITESET based on factors such as the writeset size, a matching transaction_write_set_extraction setting, foreign‑key relationships, and the history‑size limit.
If WRITESET cannot be used, the transaction falls back to the Lock‑Based scheme.
4.3 Relevant Parameters
binlog_transaction_dependency_tracking – selects the dependency‑tracking scheme (COMMIT_ORDER, WRITESET, WRITESET_SESSION).
transaction_write_set_extraction – hash algorithm for writeset (OFF, MURMUR32, XXHASH64).
binlog_transaction_dependency_history_size – maximum number of entries stored in the writeset history (default 25000).
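Tying these parameters together, the master’s writeset‑based dependency computation can be approximated as follows: look each row hash up in a bounded history map, take the largest sequence number found as the commit parent, and record the current transaction as the latest writer of those rows. This is a simplified model of the idea behind WL#9556, not the actual implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Maps row hash -> sequence number of the last transaction that wrote it,
// bounded by binlog_transaction_dependency_history_size (default 25000).
static std::unordered_map<uint64_t, int64_t> m_writeset_history;
static const size_t kHistorySize = 25000;

// Commit parent for a transaction with the given writeset: it must run
// after the newest transaction that touched any of the same rows;
// rows never seen before impose no dependency.
int64_t get_commit_parent(const std::vector<uint64_t> &writeset,
                          int64_t sequence_number) {
  int64_t commit_parent = 0;
  for (uint64_t h : writeset) {
    auto it = m_writeset_history.find(h);
    if (it != m_writeset_history.end())
      commit_parent = std::max(commit_parent, it->second);
    // Record this transaction as the latest writer, respecting the bound.
    if (it != m_writeset_history.end() ||
        m_writeset_history.size() < kHistorySize)
      m_writeset_history[h] = sequence_number;
  }
  return commit_parent;
}
```

Transactions whose writesets miss the history entirely get commit parent 0 and can run in parallel with everything currently in flight, which is why WRITESET parallelism does not depend on master concurrency.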
5. Benchmark Results
MySQL’s official benchmarks compare COMMIT_ORDER, WRITESET_SESSION, and WRITESET under three workloads (OLTP read/write, indexed‑column update, write‑only) on a 16‑core SSD master with 8 M rows across 16 tables.
Key findings:
COMMIT_ORDER benefits from higher master concurrency; replication speed increases with more threads.
WRITESET’s replica throughput is largely independent of master concurrency; even with a single‑client workload on the master it outperforms COMMIT_ORDER running under 256 client threads.
WRITESET_SESSION behaves like COMMIT_ORDER but still achieves good throughput at lower thread counts (4–8).
6. Enabling Parallel Replication
On the replica, set the following parameters (requires a replication restart):
<code>slave_parallel_type = LOGICAL_CLOCK</code>
<code>slave_parallel_workers = 16</code>
<code>slave_preserve_commit_order = ON</code>
To use the WRITESET scheme, additionally configure the following on the master:
<code>binlog_transaction_dependency_tracking = WRITESET_SESSION</code>
<code>transaction_write_set_extraction = XXHASH64</code>
<code>binlog_transaction_dependency_history_size = 25000</code>
<code>binlog_format = ROW</code>
Note that WRITESET works only when the binlog format is ROW.
7. References
WL#6314: MTS – Prepared transactions slave parallel applier
WL#6813: MTS – ordered commits (sequential consistency)
WL#7165: MTS – Optimizing MTS scheduling by increasing the parallelization window on master
WL#8440: Group Replication – Parallel applier support
WL#9556: Writeset‑based MTS dependency tracking on master
WriteSet parallel replication (Chinese article)
Improving the Parallel Applier with Writeset‑based Dependency Tracking
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles, with a focus on operations transformation.