Why Distributed Systems Can’t Have It All: Unpacking the CAP and BASE Theories
The article explains the CAP theorem, detailing how distributed systems must trade off consistency, availability, and partition tolerance, and explores CP vs AP designs, then introduces the BASE model—Basically Available, Soft State, Eventual Consistency—as a practical complement for real‑world architectures.
CAP Theory
Definition:
In a distributed system, when read/write operations are involved, only two of the three properties—Consistency, Availability, Partition Tolerance—can be guaranteed; the third must be sacrificed.
Here, a distributed system means a set of interconnected nodes that share data; both interconnection and data sharing are essential. A memcache cluster, whose nodes neither connect to one another nor share data, is therefore not the kind of distributed system CAP is concerned with.
Consistency
For a given client, a read must return the result of the most recent write.
Availability
A non-faulty node must return a reasonable response (not an error or a timeout) within a reasonable time.
Partition Tolerance
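When a network partition occurs, the system must continue to function.
In a distributed environment, P is not optional: networks are never fully reliable, so partitions will occur. A CA design would have to refuse writes during a partition to stay consistent, which gives up availability, so a CA architecture is impossible in practice and the real choice is between CP and AP.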
CP (Consistency + Partition Tolerance)
For example, take two nodes, N1 and N2. N1 updates a value from x to y, but a network partition occurs before the change is synchronized, leaving N2 with the old value x.
To guarantee consistency, N2 cannot return x when a client reads from it; it must return an error, which violates availability.
AP (Availability + Partition Tolerance)
In the same scenario, N2 preserves availability by returning x, which is stale and therefore violates consistency.
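The N1/N2 scenario can be sketched in a few lines of Python. This is only an illustrative model, not how any particular database implements it: the Replica class, its mode and cut_off fields, and the Unavailable exception are all hypothetical names introduced here.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm its data is current."""


@dataclass
class Replica:
    name: str
    value: str
    mode: str              # "CP" or "AP" (hypothetical policy flag)
    cut_off: bool = False  # True while a partition separates it from the writer

    def read(self) -> str:
        # CP: refuse to answer rather than risk returning stale data.
        if self.cut_off and self.mode == "CP":
            raise Unavailable(f"{self.name} cannot confirm the latest write")
        # AP (or no partition): always answer with the local copy.
        return self.value


if __name__ == "__main__":
    # N1 already holds y; the partition hit before N2 was synchronized.
    n2_cp = Replica("N2", "x", "CP", cut_off=True)
    n2_ap = Replica("N2", "x", "AP", cut_off=True)
    try:
        n2_cp.read()
    except Unavailable as err:
        print("CP:", err)
    print("AP:", n2_ap.read())
```

The only difference between the two designs in this sketch is the single branch in read: the CP replica trades an answer for correctness, and the AP replica trades correctness for an answer.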
When applying CAP to architecture design, keep the following points in mind:
CAP applies at the granularity of data, not of the entire system.
CAP states that the three properties cannot all be satisfied at once, but this does not mean a whole system must be exclusively CP or exclusively AP. A system holds many kinds of data, and each kind may call for a different trade-off: some need CP, others AP.
For example, in a user management system, account data (ID, password) needs CP, while user profile data (nickname, bio) can tolerate AP.
CAP ignores network latency.
Data replication between nodes always incurs delay—milliseconds within a data center, tens to hundreds of milliseconds across distant sites.
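So even without a partition, replicas are briefly inconsistent while a write propagates. Where requirements are extremely strict (e.g., financial transactions), that window is unacceptable, so strong consistency cannot be achieved across nodes; such data is instead written on a single node, with the other nodes serving only as backups.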
Note: The inability to distribute certain data types does not preclude the system from being distributed overall.
Giving up one property does not mean doing nothing; prepare for recovery after partitions.
CAP says that only two of the three properties can be kept and the third must be sacrificed, but that choice only applies while a partition is in effect. Partitions are inevitable yet infrequent; when the network is healthy, both C and A can be provided. During a partition the system sacrifices consistency or availability, and it must be prepared to restore both once the partition heals.
For instance, in a user system, account data uses CP; during a partition, node 1 can register new users while node 2 cannot. Node 1 should log unsynchronized registrations and sync them to node 2 after the partition ends.
Conversely, user profile data uses AP; during a partition both nodes can modify profiles, leading to inconsistencies that must be merged after recovery, e.g., using “last write wins”.
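Both recovery strategies described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Versioned record, the replay_log and merge_last_write_wins helpers, and the use of a float timestamp are hypothetical, and the timestamps are assumed to come from clocks that are comparable across nodes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Versioned:
    value: str
    written_at: float  # write timestamp; assumes comparable clocks on both nodes


def replay_log(target: Dict[str, str], log: List[Tuple[str, str]]) -> None:
    """CP recovery: apply the registrations node 1 logged during the partition."""
    for user_id, record in log:
        # setdefault leaves any record node 2 already has untouched.
        target.setdefault(user_id, record)


def merge_last_write_wins(
    a: Dict[str, Versioned], b: Dict[str, Versioned]
) -> Dict[str, Versioned]:
    """AP recovery: for every key, keep the copy with the newest timestamp."""
    merged = dict(a)
    for key, candidate in b.items():
        current = merged.get(key)
        if current is None or candidate.written_at > current.written_at:
            merged[key] = candidate
    return merged


if __name__ == "__main__":
    node2_accounts = {"u1": "alice"}
    replay_log(node2_accounts, [("u2", "bob")])
    print(node2_accounts)

    node1_profiles = {"u1": Versioned("nick-a", 10.0)}
    node2_profiles = {"u1": Versioned("nick-b", 12.0)}
    print(merge_last_write_wins(node1_profiles, node2_profiles))
```

Note that last write wins silently discards the older of two conflicting edits and depends on timestamp quality, which is why it suits low-stakes data such as nicknames better than account data.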
BASE Theory
BASE stands for:
Basically Available – the system remains partially available during failures, ensuring core functionality stays up.
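For example, in a system with tens of millions of daily active users, login is core and registration is non-core, so during a failure registration can be degraded or disabled in order to keep login working.
Soft State – the system may hold intermediate states, such as replicas that are temporarily out of sync, as long as they do not affect overall availability.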
Eventual Consistency – all data replicas will converge to a consistent state after some time.
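The login versus registration example under Basically Available can be sketched as a simple feature gate. The FeatureGate class, the handle function, and the response strings are hypothetical names chosen for illustration; a real system would more likely drive this from a circuit breaker or a configuration service.

```python
CORE_FEATURES = {"login"}  # core functionality from the example above


class FeatureGate:
    """Decides which features stay up while the system is degraded."""

    def __init__(self) -> None:
        self.degraded = False  # flipped to True when a failure is detected

    def allows(self, feature: str) -> bool:
        # Core features are always served; everything else is shed under failure.
        if feature in CORE_FEATURES:
            return True
        return not self.degraded


def handle(gate: FeatureGate, feature: str) -> str:
    if not gate.allows(feature):
        return "503 temporarily unavailable"
    return "200 ok"


if __name__ == "__main__":
    gate = FeatureGate()
    gate.degraded = True
    print("login:", handle(gate, "login"))
    print("registration:", handle(gate, "registration"))
```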
The core idea of BASE is that even if strong consistency cannot be achieved, appropriate techniques can ensure eventual consistency.
BASE extends and complements CAP; for AP solutions that sacrifice consistency during partitions, the system should achieve eventual consistency once the partition is resolved.
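One common way to obtain eventual consistency is asynchronous replication: writes are acknowledged locally and shipped to other replicas later. The sketch below assumes a single writer and an in-memory queue; Primary, ReplicaStore, and replicate are hypothetical names, not the API of any real product.

```python
from collections import deque


class Primary:
    def __init__(self) -> None:
        self.data = {}
        self.pending = deque()  # updates accepted but not yet shipped

    def write(self, key: str, value: str) -> None:
        # Acknowledge locally first; replication happens later.
        self.data[key] = value
        self.pending.append((key, value))


class ReplicaStore:
    def __init__(self) -> None:
        self.data = {}

    def read(self, key: str):
        # May return stale or missing data until replication catches up.
        return self.data.get(key)


def replicate(primary: Primary, replica: ReplicaStore) -> int:
    """Ship queued updates in write order; returns how many were applied."""
    applied = 0
    while primary.pending:
        key, value = primary.pending.popleft()
        replica.data[key] = value
        applied += 1
    return applied


if __name__ == "__main__":
    primary, replica = Primary(), ReplicaStore()
    primary.write("nickname", "new-nick")
    print("before sync:", replica.read("nickname"))
    replicate(primary, replica)
    print("after sync:", replica.read("nickname"))
```

Between the write and the call to replicate, the replica is in the soft state that BASE tolerates; once the queue is drained, both copies agree.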
Content compiled from “Learning Architecture from Scratch”.
Java High-Performance Architecture