
Vitess Two-Phase Commit Implementation and Distributed Transaction Management

This article explains the fundamentals of database transactions, introduces Vitess's architecture, and details how Vitess implements a two‑phase commit protocol for distributed transactions. It covers transaction IDs, metadata management, the prepare, commit, and rollback steps, error handling, and performance considerations, with illustrative Go code snippets.

JD Retail Technology

Before diving into Vitess, the article reviews the four ACID properties of transactions—Atomicity, Consistency, Isolation, and Durability—using a bank withdrawal example to illustrate each concept.

It then describes a typical distributed transaction scenario in which a single purchase must deduct a payment from the user's Baitiao credit account (Baitiao is JD's consumer credit service) and create an order, with the two writes landing on two different sharded MySQL databases. The risk is a partial commit: the second shard fails after the first has already committed.
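
The failure mode can be reproduced with a small in-memory sketch. The `shard` type, the `placeOrderNaive` function, and the write strings are hypothetical stand-ins invented for illustration; they are not part of Vitess or of the original article's code.

```go
package main

import (
	"errors"
	"fmt"
)

// shard is a hypothetical stand-in for one sharded MySQL database.
type shard struct {
	name      string
	committed []string // writes that have been committed
	failNext  bool     // when true, the next commit fails (simulated crash)
}

// commit applies a write to the shard, or fails if a crash is simulated.
func (s *shard) commit(write string) error {
	if s.failNext {
		return errors.New(s.name + ": commit failed")
	}
	s.committed = append(s.committed, write)
	return nil
}

// placeOrderNaive commits on each shard independently, with no coordination.
// If the second commit fails, the first one has already taken effect.
func placeOrderNaive(payment, orders *shard) error {
	if err := payment.commit("deduct 100"); err != nil {
		return err
	}
	return orders.commit("create order #1")
}

func main() {
	payment := &shard{name: "payment-shard"}
	orders := &shard{name: "order-shard", failNext: true}

	err := placeOrderNaive(payment, orders)
	fmt.Println("error:", err)
	fmt.Println("payment writes committed:", len(payment.committed))
	fmt.Println("order writes committed:", len(orders.committed))
}
```

After the call, the payment shard holds one committed write and the order shard holds none, which is exactly the inconsistency that a coordinated commit protocol is meant to rule out.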

The core of the article focuses on Vitess’s architecture: vtgate acts as the transaction manager (coordinator) and vttablet together with MySQL forms each shard (resource manager). The diagram shows vtgate routing requests to one or multiple shards.

Vitess implements a two‑phase commit (2PC) with the following steps:

1. Generate a distributed transaction ID (DTID) based on the first participant shard (e.g., Shard1:TX1).
2. Record transaction metadata (state, participants, start time) in the metadata shard (mmShard).
3. Prepare phase: each participant stores its redo log and moves its local transaction connection to a special preparedPool keyed by the DTID.
4. StartCommit phase: the coordinator updates the metadata state to Committed.
5. CommitPrepared phase: participants retrieve their connections from preparedPool, delete redo logs, and commit the local transaction.
6. ConcludeTransaction phase: the metadata entry is removed.
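
The steps above can be traced in a simplified in-memory model. This is a sketch under stated assumptions, not Vitess code: the `participant` type, the plain maps standing in for preparedPool and the redo log, the `"Prepare"` state name, and the fixed `TX1` suffix are all invented for illustration. The metadata is held in an ordinary map rather than on the mmShard, and every participant is prepared uniformly.

```go
package main

import (
	"errors"
	"fmt"
)

// participant is a simplified, hypothetical model of one shard.
type participant struct {
	name        string
	pending     []string            // writes of the open local transaction
	prepared    map[string][]string // stand-in for preparedPool, keyed by DTID
	redoLog     map[string][]string // statements saved during Prepare
	data        []string            // committed writes
	failPrepare bool                // simulate a Prepare failure
}

func newParticipant(name string, writes ...string) *participant {
	return &participant{
		name:     name,
		pending:  writes,
		prepared: map[string][]string{},
		redoLog:  map[string][]string{},
	}
}

// prepare saves the redo log and parks the transaction under its DTID.
func (p *participant) prepare(dtid string) error {
	if p.failPrepare {
		return errors.New(p.name + ": prepare failed")
	}
	p.redoLog[dtid] = p.pending
	p.prepared[dtid] = p.pending
	p.pending = nil
	return nil
}

// commitPrepared commits a parked transaction and deletes its redo log.
// Calling it again for the same DTID is a no-op.
func (p *participant) commitPrepared(dtid string) {
	writes, ok := p.prepared[dtid]
	if !ok {
		return
	}
	p.data = append(p.data, writes...)
	delete(p.prepared, dtid)
	delete(p.redoLog, dtid)
}

// rollback discards any open or prepared work for the DTID.
func (p *participant) rollback(dtid string) {
	p.pending = nil
	delete(p.prepared, dtid)
	delete(p.redoLog, dtid)
}

// commit2PC walks through the six steps; metadata maps DTID to state.
func commit2PC(metadata map[string]string, parts []*participant) error {
	dtid := parts[0].name + ":TX1" // step 1: DTID from the first participant
	metadata[dtid] = "Prepare"     // step 2: record transaction metadata
	for _, p := range parts {      // step 3: Prepare
		if err := p.prepare(dtid); err != nil {
			for _, q := range parts {
				q.rollback(dtid)
			}
			delete(metadata, dtid)
			return err
		}
	}
	metadata[dtid] = "Committed" // step 4: StartCommit
	for _, p := range parts {    // step 5: CommitPrepared
		p.commitPrepared(dtid)
	}
	delete(metadata, dtid) // step 6: ConcludeTransaction
	return nil
}

func main() {
	a := newParticipant("Shard1", "deduct 100")
	b := newParticipant("Shard2", "create order #1")
	err := commit2PC(map[string]string{}, []*participant{a, b})
	fmt.Println(err, a.data, b.data)
}
```

The point of the ordering is that nothing becomes visible in `data` until the metadata state has been flipped in step 4. A failure before that point leads to rollback everywhere, and a failure after it can always be completed from the prepared state.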

Key Go code snippets illustrate the 2PC flow. For example, the commit function:

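func (txc *TxConn) commit2PC(ctx context.Context, session *SafeSession) error {
    // A transaction that touches at most one shard needs no coordination.
    if len(session.ShardSessions) <= 1 {
        return txc.commitNormal(ctx, session)
    }
    // Every shard session after the first is a participant.
    participants := make([]*querypb.Target, 0, len(session.ShardSessions)-1)
    for _, s := range session.ShardSessions[1:] {
        participants = append(participants, s.Target)
    }
    // The first shard session serves as the metadata shard (mmShard),
    // and the DTID is derived from it.
    mmShard := session.ShardSessions[0]
    dtid := dtids.New(mmShard)
    // Record the transaction metadata on the mmShard before preparing anything.
    err := txc.gateway.CreateTransaction(ctx, mmShard.Target, dtid, participants)
    if err != nil {
        // Nothing has been prepared yet, so a normal rollback is enough.
        txc.Rollback(ctx, session)
        return err
    }
    // Prepare, StartCommit, CommitPrepared, ConcludeTransaction steps follow...
}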
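
The `dtids.New(mmShard)` call above builds the DTID from the metadata shard, so anyone holding a DTID can tell which shard stores the transaction's metadata. A minimal sketch of that idea follows, assuming the `Shard1:TX1`-style layout from the example earlier, reduced to `shard:txid`. The helpers `newDTID` and `parseDTID` are hypothetical, and the real `dtids` package may encode additional fields.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// newDTID joins the metadata shard name and its local transaction ID.
// The "shard:txid" layout is an assumption made for this sketch.
func newDTID(shard string, txID int64) string {
	return shard + ":" + strconv.FormatInt(txID, 10)
}

// parseDTID recovers the metadata shard and local transaction ID.
// It splits on the last ':' so shard names may themselves contain colons.
func parseDTID(dtid string) (shard string, txID int64, err error) {
	i := strings.LastIndex(dtid, ":")
	if i < 0 {
		return "", 0, fmt.Errorf("invalid dtid %q: missing ':'", dtid)
	}
	txID, err = strconv.ParseInt(dtid[i+1:], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("invalid dtid %q: %v", dtid, err)
	}
	return dtid[:i], txID, nil
}

func main() {
	dtid := newDTID("Shard1", 1)
	fmt.Println(dtid) // prints "Shard1:1"

	shard, txID, err := parseDTID(dtid)
	fmt.Println(shard, txID, err) // prints "Shard1 1 <nil>"
}
```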

The article also covers error handling for each phase, including how Vitess uses a watchdog on each vttablet to detect long‑running transactions and trigger a Resolve operation, ensuring idempotency and preventing partial commits even if vtgate or a shard crashes.

Finally, it answers common questions about the necessity of the preparedPool, concurrency of Resolve calls, the meaning of commit errors, the completeness of ACID guarantees in Vitess, alternative transaction models, performance impact (typically <5 ms overhead), and real‑world applicability to e‑commerce order systems.
