Understanding Transactions, ACID Properties, CAP Theorem, and BASE Theory in Distributed Systems
This article explains the fundamentals of transactions, the ACID properties, isolation levels, distributed transactions, the CAP theorem, and the BASE model, illustrating how they shape consistency, availability, and reliability in modern database and distributed system design.
Transaction
Definition of transaction:
A transaction is a unit of program execution composed of a series of operations that access and update data in the system; in the narrow sense, it specifically refers to a database transaction.
Purpose of transaction:
When multiple applications access the database concurrently, a transaction provides isolation between them, preventing operations from interfering with each other.
A transaction offers a way to recover from failures while maintaining data consistency even in abnormal states.
Transactions have four properties—Atomicity, Consistency, Isolation, and Durability—collectively known as the ACID properties.
ACID
Atomicity
Atomicity means a transaction must be an indivisible sequence of operations: either all operations succeed or none do. Any failure causes the entire transaction to roll back.
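The all-or-nothing behavior can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production pattern: the accounts table, the transfer helper, and the fail_midway flag are hypothetical names introduced here to simulate a failure between the two updates.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure after the debit but before the credit.
                raise RuntimeError("simulated crash between debit and credit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 100)])
conn.commit()

failed = transfer(conn, "A", "B", 50, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
succeeded = transfer(conn, "A", "B", 50)
after_success = balances(conn)   # both updates were applied together
```

The failed attempt leaves both balances untouched, because the debit is undone when the transaction rolls back.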
Consistency
Consistency requires that a transaction never violates the integrity of the database; the database must be in a consistent state before and after the transaction.
Example: In a bank transfer, if accounts A and B hold 200 units in total and A transfers 50 units to B, the combined balance must still be 200 units after the transaction.
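The transfer example can be sketched with sqlite3 and a CHECK constraint acting as the integrity rule. The starting balances of 100 units each are an assumption chosen to match the 200-unit total, and the table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity rule: no overdraft
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 100)])
conn.commit()

def total(conn):
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; a constraint violation undoes both."""
    try:
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

total_before = total(conn)
ok = transfer(conn, "A", "B", 50)          # valid transfer
total_after_ok = total(conn)
rejected = transfer(conn, "A", "B", 500)   # would overdraw A, so it is refused
total_after_rejected = total(conn)
```

In both cases the database moves from one valid state to another: the total stays at 200 units and no balance goes negative.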
Isolation
Isolation ensures that concurrent transactions do not affect each other. The SQL standard defines four isolation levels: Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
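The table below shows which read anomalies can occur at each isolation level:

| Isolation Level | Dirty Read | Non-repeatable Read | Phantom Read |
| --- | --- | --- | --- |
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible |
| Serializable | Prevented | Prevented | Prevented |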
Higher isolation levels provide stronger data integrity but reduce concurrency, and therefore throughput.
For most applications, setting the isolation level to Read Committed offers a good balance: it prevents dirty reads while maintaining acceptable performance. When issues such as non‑repeatable reads or phantom reads arise, developers can use pessimistic or optimistic locking as needed.
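Optimistic locking is commonly implemented with a version column: a write succeeds only if the row still carries the version the writer originally read. The following is a minimal sketch using sqlite3; the stock table, its columns, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER, version INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10, 1)")
conn.commit()

def read_stock(conn, item):
    return conn.execute(
        "SELECT qty, version FROM stock WHERE item = ?", (item,)
    ).fetchone()

def write_stock(conn, item, new_qty, expected_version):
    """Write only if nobody has changed the row since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE stock SET qty = ?, version = version + 1 "
            "WHERE item = ? AND version = ?",
            (new_qty, item, expected_version),
        )
        return cur.rowcount == 1  # 0 rows updated means the version was stale

# Two clients read the same row before either one writes.
qty_a, ver_a = read_stock(conn, "widget")
qty_b, ver_b = read_stock(conn, "widget")

first = write_stock(conn, "widget", qty_a - 1, ver_a)   # version matches
second = write_stock(conn, "widget", qty_b - 1, ver_b)  # version is now stale
final_qty, final_version = read_stock(conn, "widget")
```

The second writer detects the conflict and would normally re-read the row and retry, so the lost update is avoided without holding a lock between the read and the write.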
Durability
Durability guarantees that once a transaction commits, its changes persist permanently, even if the system crashes or machines fail; the database must be able to recover to the committed state after a restart.
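One common way to obtain this guarantee is to append each committed change to a log and force it to disk before acknowledging the commit, then replay the log on restart. The TinyStore class below is a hypothetical toy that illustrates the idea; real databases use far more elaborate logging and checkpointing.

```python
import json
import os
import tempfile

class TinyStore:
    """Toy key-value store that replays an append-only log on startup."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record: it was never acknowledged
                self.data.update(record)

    def commit(self, changes):
        # Persist first, then apply in memory and acknowledge.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(changes) + "\n")
            log.flush()
            os.fsync(log.fileno())  # ask the OS to write the data to disk
        self.data.update(changes)

path = os.path.join(tempfile.mkdtemp(), "store.log")
store = TinyStore(path)
store.commit({"A": 50, "B": 150})
del store                      # simulate losing all in-memory state

recovered = TinyStore(path)    # a "restart" rebuilds state from the log
```

After the simulated restart, the recovered store holds exactly the committed values.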
Distributed Transactions
Transactions are also widely used in distributed computing. While a single‑node database can easily satisfy ACID, achieving ACID across multiple nodes is challenging.
A distributed transaction involves participants, transaction managers, and resource servers located on different nodes, often operating on multiple data sources or business systems.
Example: A cross‑bank transfer involves a withdrawal service on one bank and a deposit service on another. Both steps must either succeed together or be rolled back together to avoid inconsistency.
Such a transaction consists of multiple sub‑transactions (e.g., the withdrawal and deposit operations). Coordinating these sub‑transactions while preserving ACID makes distributed transaction systems complex.
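One widely used way to coordinate such sub-transactions is two-phase commit. The text above does not name a specific protocol, so the sketch below is only an illustration of the idea: the Participant class and the two_phase_commit function are hypothetical, everything runs in one process, and network failures, timeouts, and durable logging are deliberately ignored.

```python
class Participant:
    """A bank service that can tentatively accept a balance change."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.pending = None

    def prepare(self, delta):
        # Phase 1: vote yes only if the change could be applied.
        if self.balance + delta < 0:
            return False
        self.pending = delta
        return True

    def commit(self):
        # Phase 2 (success): make the prepared change permanent.
        self.balance += self.pending
        self.pending = None

    def rollback(self):
        # Phase 2 (failure): discard the prepared change.
        self.pending = None

def two_phase_commit(work):
    """work is a list of (participant, delta) sub-transactions."""
    prepared = []
    for participant, delta in work:
        if participant.prepare(delta):
            prepared.append(participant)
        else:
            for p in prepared:
                p.rollback()
            return False
    for participant in prepared:
        participant.commit()
    return True

bank_a = Participant("bank_a", 100)
bank_b = Participant("bank_b", 100)

ok = two_phase_commit([(bank_a, -50), (bank_b, +50)])
rejected = two_phase_commit([(bank_b, +500), (bank_a, -500)])  # bank_a votes no
```

The second transfer is abandoned as a whole: bank_b had already prepared its deposit, but it is discarded once bank_a refuses the withdrawal.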
CAP Theorem
CAP theorem:
A distributed system cannot simultaneously guarantee Consistency (C), Availability (A), and Partition Tolerance (P); it can satisfy at most two of these properties.
Consistency
In a distributed context, consistency means that all replicas hold the same data. Strong consistency ensures that a read after a successful write returns the latest value.
Availability
Availability requires that every request receives a response within a bounded time. If the response time exceeds the bound, the system is considered unavailable.
For example, an online search engine must return results within 0.5 seconds, whereas a data‑warehouse query might tolerate 20–30 seconds.
Partition Tolerance
Partition tolerance means the system continues to operate correctly despite network partitions that isolate subsets of nodes.
When a network partition occurs, the system is split into isolated segments, each still functioning internally.
Because a distributed system must choose which property to sacrifice, the following table lists typical scenarios for dropping one of the CAP properties:
| CAP Trade-off | Explanation |
| --- | --- |
| Give up P | Place all data on a single node to avoid partition issues, sacrificing scalability. |
| Give up A | During a partition the system becomes unavailable to preserve consistency. |
| Give up C | Accept eventual (weak) consistency while maintaining availability and partition tolerance. |
Architects must balance consistency and availability based on business requirements.
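The choice between giving up A and giving up C during a partition can be illustrated with a toy two-replica cluster. The Cluster class and its "CP" and "AP" modes are hypothetical names used only for this sketch; real systems detect partitions and replicate data in far more involved ways.

```python
class Replica:
    def __init__(self):
        self.value = None

class Cluster:
    """Two replicas that either refuse writes (CP) or diverge (AP) when partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, replica_index, value):
        if not self.partitioned:
            for replica in self.replicas:   # normal case: update every replica
                replica.value = value
            return True
        if self.mode == "CP":
            return False                    # give up availability
        self.replicas[replica_index].value = value
        return True                         # give up consistency

    def read(self, replica_index):
        return self.replicas[replica_index].value

cp = Cluster("CP")
cp.write(0, "v1")
cp.partitioned = True
cp_accepted = cp.write(0, "v2")

ap = Cluster("AP")
ap.write(0, "v1")
ap.partitioned = True
ap_accepted = ap.write(0, "v2")
```

The CP cluster rejects the write and both replicas still agree on "v1", while the AP cluster accepts it and its two replicas temporarily return different values.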
BASE Theory
BASE stands for Basically Available, Soft state, and Eventually consistent. It grew out of the consistency and availability trade-off described by CAP, acknowledging that many large-scale internet systems relax strong consistency in favor of availability and eventual consistency.
Basically Available
Basic availability means the system continues to operate despite failures, though performance or functionality may degrade.
Response‑time loss: An online search engine normally returns results within 0.5 seconds, but during a failure it may take 1–2 seconds.
Feature loss: During peak shopping events, some users may be redirected to a degraded page to protect system stability.
Soft state
Soft state means the system may hold intermediate data states, for example while replicas are still synchronizing, and treats such states as not harming overall availability.
Eventually consistent
Eventual consistency guarantees that, after updates stop, all replicas will converge to the same state, even though they may be temporarily inconsistent.
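Convergence can be sketched with two replicas that exchange their contents and keep the entry with the newest timestamp. Last-write-wins is only one possible conflict-resolution rule and is an assumption made for this example; the Replica class and sync function are hypothetical.

```python
class Replica:
    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Keep the entry with the larger timestamp (last write wins).
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

def sync(a, b):
    """Exchange entries in both directions so the replicas converge."""
    for src, dst in ((a, b), (b, a)):
        for key, (timestamp, value) in list(src.store.items()):
            dst.write(key, value, timestamp)

r1, r2 = Replica(), Replica()
r1.write("x", "old", timestamp=1)
r2.write("x", "new", timestamp=2)

before = (r1.read("x"), r2.read("x"))  # replicas disagree for a while
sync(r1, r2)
after = (r1.read("x"), r2.read("x"))   # both now hold the newest write
```

Between the writes and the sync, readers may see different values depending on which replica they reach; once updates stop and the replicas synchronize, they agree.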
Common variants of eventual consistency include:
Causal consistency
Read‑your‑writes
Session consistency
Monotonic reads
Monotonic writes
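Several of these guarantees can be approximated by having the client remember the newest version it has written or read, and refusing answers from replicas that are older. The sketch below shows read-your-writes and monotonic reads under that assumption; the VersionedReplica and Session classes are hypothetical.

```python
class VersionedReplica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        if version > self.version:
            self.version, self.value = version, value

class Session:
    """Client session that never reads data older than it has already seen."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = 0

    def write(self, primary, value):
        primary.apply(primary.version + 1, value)
        self.min_version = max(self.min_version, primary.version)

    def read(self):
        for replica in self.replicas:
            if replica.version >= self.min_version:
                self.min_version = replica.version  # reads never go backwards
                return replica.value
        raise RuntimeError("no replica is fresh enough")

stale = VersionedReplica()      # has not received the update yet
primary = VersionedReplica()
session = Session([stale, primary])

session.write(primary, "v1")
first_read = session.read()     # skips the stale replica

stale.apply(1, "v1")            # replication catches up
second_read = session.read()    # the formerly stale replica is now acceptable
```

The session skips the lagging replica until it has caught up, so the client always observes its own write.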
Modern relational databases (e.g., MySQL, PostgreSQL) often implement primary‑secondary replication using synchronous or asynchronous methods, achieving either strong or eventual consistency depending on the replication strategy.
References
From Paxos to ZooKeeper: Distributed Consistency Principles and Practice (《从Paxos到ZooKeeper——分布式一致性原理与实践》)