
Understanding Distributed Transactions, Consistency Models, Sharding, and Commit Protocols

This article explains the fundamentals of distributed transactions, including ACID properties, consistency models, sharding strategies, and the two‑phase, three‑phase, and TCC protocols, while discussing CAP and BASE theories and the challenges of implementing reliable distributed databases.

Architect's Guide

What Is a Distributed Transaction

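A transaction is an all-or-nothing unit of work, such as transferring money between two accounts. When the data involved is stored across multiple databases, guaranteeing that every step either completes or rolls back becomes difficult, because network delays, node failures, and power loss can interrupt the work at any point. Like local transactions, distributed transactions are described in terms of the four ACID properties below.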

Atomicity

Atomicity means a transaction is indivisible: either all of its operations take effect or none do. If any step fails, the steps already performed are rolled back, so no partial result is left behind.
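
As a minimal sketch of atomicity on a single database, the following uses Python's built-in sqlite3 module. The accounts table, the account names, and the transfer helper are assumptions made for illustration; they do not come from any particular system.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; undo the debit if anything fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

# Hypothetical in-memory data set for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # the debit is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer the balances are the same as after the first one, because the debit that had already run was rolled back with the rest of the transaction.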

Consistency

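Consistency means a transaction moves the database from one valid state to another and preserves its invariants. In a transfer, for example, the combined balance of the two accounts is the same before and after the transaction, and no intermediate state that breaks this rule is exposed. In a distributed system the term is also applied to replicas: all copies of the data must eventually reflect the same result.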

Isolation

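Isolation guarantees that the intermediate state of an in-progress transaction is not visible to other concurrent transactions, so they cannot interfere with one another, for example by spending the same balance twice. How strictly this is enforced depends on the isolation level in use.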

Durability

Durability means that once a transaction commits, its effects are persisted to non-volatile storage and survive crashes and restarts.

Consistency Discussion

Three consistency levels are commonly distinguished:

Strong consistency: every read sees the most recent write, whichever node serves it.

Weak consistency: reads may return stale data, and the system makes no promise about when a write becomes visible everywhere.

Eventual consistency: a form of weak consistency in which replicas may diverge temporarily but converge to the same state once updates stop arriving.

Practical systems choose the level based on business needs. For example, payment processing requires strong consistency, while product inventory can tolerate weaker guarantees.
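
The difference between these levels can be sketched with a toy primary/follower pair. The Cluster class, its replication queue, and the synchronous flag are hypothetical and exist only for this illustration; real databases replicate over a network with far more machinery.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    """Toy primary/follower pair with an explicit replication queue."""

    def __init__(self):
        self.primary = Replica()
        self.follower = Replica()
        self.pending = []  # writes not yet applied on the follower

    def write(self, key, value, synchronous=False):
        self.primary.data[key] = value
        self.pending.append((key, value))
        if synchronous:
            # Strong consistency: replicate before acknowledging the write.
            self.replicate()

    def replicate(self):
        # Apply queued writes in order, then empty the queue.
        for key, value in self.pending:
            self.follower.data[key] = value
        self.pending.clear()

    def read_follower(self, key):
        return self.follower.data.get(key)

cluster = Cluster()
cluster.write("stock", 10, synchronous=True)
cluster.write("stock", 9)               # asynchronous write
stale = cluster.read_follower("stock")  # follower has not caught up yet
cluster.replicate()                     # background replication runs
fresh = cluster.read_follower("stock")  # replicas have converged
```

Here the read taken before replication returns the old value and the read taken afterwards returns the new one, which is the temporary divergence that weak and eventual consistency allow.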

Sharding (Database Partitioning)

Sharding splits data across multiple databases to alleviate single‑node bottlenecks.

Vertical Sharding

Vertical sharding separates tables based on business domains, storing loosely coupled tables in different databases. This resembles micro‑service data isolation.

Horizontal Sharding

Horizontal sharding distributes rows of a single table across databases, often by ID range or hash.
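
Both routing styles can be sketched in a few lines. The shard names, the ID ranges, and the function names below are assumptions chosen for the example.

```python
import hashlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical database names

def shard_by_hash(user_id: str) -> str:
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

# Hypothetical ranges: (inclusive low, exclusive high, shard name).
RANGES = [(0, 1_000_000, "db0"), (1_000_000, 2_000_000, "db1")]

def shard_by_range(order_id: int) -> str:
    """Pick the shard whose ID range contains the key."""
    for low, high, shard in RANGES:
        if low <= order_id < high:
            return shard
    raise KeyError(f"no shard covers id {order_id}")
```

The sketch uses hashlib rather than Python's built-in hash(), because string hashes from hash() are randomized per process and would route the same key to different shards after a restart. Plain modulo hashing also remaps most keys when the number of shards changes, which is one reason range-based or consistent-hashing schemes are sometimes preferred.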

Problems Introduced by Sharding

When a transaction spans multiple shards, it becomes a distributed transaction, requiring coordination across nodes and exposing the system to higher latency, increased conflict probability, and potential deadlocks.

CAP Principle

CAP states that a distributed system cannot guarantee all three of Consistency, Availability, and Partition tolerance at the same time. Because network partitions cannot be ruled out in a distributed system, partition tolerance is effectively mandatory, so the real trade-off is between consistency and availability while a partition lasts.
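
The choice between consistency and availability can be illustrated with a toy node that has lost contact with its peers. The Node class and its "CP" and "AP" modes are hypothetical simplifications, not a model of any real database.

```python
class Node:
    """Toy node that favours consistency ("CP") or availability ("AP")."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value this node knows about
        self.partitioned = False  # True when the node cannot reach its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Refuse to answer rather than risk returning stale data.
            raise RuntimeError("unavailable: cannot confirm the latest value")
        # AP: always answer, even though the value may be stale.
        return self.value

cp_node, ap_node = Node("CP"), Node("AP")
cp_node.partitioned = ap_node.partitioned = True

ap_result = ap_node.read()  # still answers during the partition
try:
    cp_node.read()
    cp_available = True
except RuntimeError:
    cp_available = False    # gives up availability to stay consistent
```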

BASE Theory

BASE (Basically Available, Soft state, Eventually consistent) relaxes ACID constraints for large‑scale internet systems, accepting temporary inconsistency in exchange for higher availability.

Two‑Phase Commit (2PC)

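2PC splits a distributed transaction into a voting (prepare) phase and a commit phase, driven by a single coordinator and multiple participants. In the first phase every participant prepares the work and votes; in the second the coordinator tells everyone to commit if all votes were yes, or to roll back otherwise. 2PC provides atomicity but has well-known weaknesses: the coordinator is a single point of failure, participants that have voted yes hold their locks and block until the decision arrives, and data can become inconsistent if the coordinator crashes after delivering the commit decision to only some participants.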
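
The two phases can be sketched in memory as follows. The Participant class and the two_phase_commit function are illustrative assumptions; the sketch leaves out durable logging, networking, and timeouts, which is where the real difficulty of 2PC lies.

```python
class Participant:
    """Hypothetical resource manager taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote. A real participant would first write a durable
        # prepare record and keep its locks until the decision arrives.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"

ok = two_phase_commit([Participant("orders"), Participant("payments")])
bad_group = [Participant("orders"), Participant("payments", can_commit=False)]
bad = two_phase_commit(bad_group)
```

A single no vote in the second group causes every participant to roll back, including the one that had already prepared successfully.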

Three‑Phase Commit (3PC)

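3PC inserts a pre-commit phase between voting and commit, giving three phases: CanCommit (the coordinator asks whether each participant is able to commit), PreCommit (participants prepare and acknowledge), and DoCommit (the final commit). It also adds timeouts on the participant side, so a participant is not left blocked indefinitely when the coordinator fails. This reduces blocking compared with 2PC, but inconsistency is still possible under a network partition, and the extra round trip and added complexity are among the reasons 3PC is not widely adopted.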
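
The participant-side timeout rule usually described for 3PC can be sketched as a small state machine: a participant that times out before receiving PreCommit aborts, while one that times out after PreCommit goes ahead and commits. The class, state, and method names below are assumptions for illustration, and coordinator logic and networking are omitted.

```python
from enum import Enum

class State(Enum):
    INIT = "init"
    READY = "ready"                # voted yes in CanCommit
    PRECOMMITTED = "precommitted"  # acknowledged PreCommit
    COMMITTED = "committed"
    ABORTED = "aborted"

class ThreePhaseParticipant:
    def __init__(self):
        self.state = State.INIT

    def on_can_commit(self, able=True):
        self.state = State.READY if able else State.ABORTED
        return able

    def on_pre_commit(self):
        if self.state is State.READY:
            self.state = State.PRECOMMITTED

    def on_do_commit(self):
        if self.state is State.PRECOMMITTED:
            self.state = State.COMMITTED

    def on_timeout(self):
        # Called when no message arrives from the coordinator in time.
        if self.state is State.PRECOMMITTED:
            # Reaching PreCommit implies every participant voted yes.
            self.state = State.COMMITTED
        elif self.state in (State.INIT, State.READY):
            self.state = State.ABORTED

early = ThreePhaseParticipant()
early.on_can_commit()
early.on_timeout()   # coordinator lost before PreCommit: abort

late = ThreePhaseParticipant()
late.on_can_commit()
late.on_pre_commit()
late.on_timeout()    # coordinator lost after PreCommit: commit
```

The commit-on-timeout rule is also where the remaining risk sits: if the coordinator had decided to abort and that message was lost in a partition, participants that time out in the pre-committed state will commit anyway.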

TCC (Try‑Confirm‑Cancel) Pattern

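TCC moves transaction coordination from the database to the service layer. The Try step checks and reserves resources, Confirm finalizes the reservation, and Cancel releases it. Each step runs as its own local transaction, so no database locks are held for the duration of the whole distributed transaction. Because Confirm and Cancel may be retried after a failure, they need to be idempotent.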
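
A minimal sketch of the pattern, assuming a hypothetical InventoryService and a run_tcc coordinator function; neither name comes from a real framework, and retries and persistence of the transaction log are left out.

```python
class InventoryService:
    """Hypothetical service exposing Try, Confirm, and Cancel operations."""

    def __init__(self, stock):
        self.available = stock
        self.reserved = {}  # tx_id -> quantity held by a successful Try

    def try_reserve(self, tx_id, qty):
        if tx_id in self.reserved:   # repeated Try for the same transaction
            return True
        if qty > self.available:
            return False
        self.available -= qty
        self.reserved[tx_id] = qty
        return True

    def confirm(self, tx_id):
        # Idempotent: drop the reservation, the stock is now consumed.
        self.reserved.pop(tx_id, None)

    def cancel(self, tx_id):
        # Idempotent: give back whatever Try reserved, if anything.
        self.available += self.reserved.pop(tx_id, 0)

def run_tcc(tx_id, steps):
    """steps is a list of (service, quantity) pairs."""
    tried = []
    for service, qty in steps:
        if service.try_reserve(tx_id, qty):
            tried.append(service)
        else:
            for s in tried:          # undo every reservation made so far
                s.cancel(tx_id)
            return False
    for service in tried:
        service.confirm(tx_id)
    return True

a, b = InventoryService(5), InventoryService(1)
first = run_tcc("tx1", [(a, 2), (b, 1)])   # both Try steps succeed
second = run_tcc("tx2", [(a, 2), (b, 1)])  # b is out of stock, a is cancelled
```

In the second call the reservation already made on the first service is released by Cancel, so its available stock returns to the value it had after the first transaction.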

Conclusion

This article has covered the core concepts of distributed transactions: the ACID properties, consistency models, sharding techniques, the CAP and BASE theories, and the main coordination protocols (2PC, 3PC, and TCC). Together they give architects a basis for choosing a strategy that fits their system's requirements and trade-offs.

Tags: Sharding, TCC, ACID, BASE, distributed transactions, CAP, two-phase commit, three-phase commit