Quorum in Distributed Systems: Concepts, Variants, and Impact on Availability and Latency
Quorum, the core principle behind majority read/write and Paxos, can be defined in various ways—including weighted, hierarchical, and non‑majority quorums—to trade off system availability, latency, and fault tolerance, with examples illustrating how different quorum designs affect performance in distributed storage and coordination services.
Paxos can be viewed as two rounds of majority read/write to achieve strong consistency; the majority requirement can be relaxed by reducing the number of participants, leading to the broader concept of quorum‑rw.
In distributed systems, building reliable services over unreliable hardware relies on replication, and the consistency problem of replication reduces to Paxos. Quorum‑rw (also called quorum‑r) requires that a write be acknowledged by more than half of the nodes and a read checks that the same majority holds the latest value.
Wikipedia defines quorum‑rw as a system where each data copy gets a vote; a read succeeds if it gathers more than the read‑quorum votes (Vr) and a write succeeds if it gathers more than the write‑quorum votes (Vw), with the constraints Vr + Vw > V and Vw > V/2.
Example: with five nodes {a, b, c, d, e}, writing x=3 to nodes a, b, c forms a quorum; any read contacting any three nodes will see the value.
Quorum does not have to be a strict majority. Different quorum families are explored:
Weighted quorum : nodes {a, b, c, d, e} where {a, b, c} are grouped as a single weight of three votes. Any set containing this group plus optional d or e forms a quorum, while {d, e} alone does not.
Quorum containing a specific element : for three nodes {a, b, c}, requiring every quorum to include node a yields the set { {a}, {a,b}, {a,c}, {a,b,c}.
4/5 + 2/5 quorum : with five nodes, any four‑node set or the pair {a,b} is a quorum. The full list is shown in a {a, b, c, d} {a, b, c, e} {a, b, d, e} {a, c, d, e} {b, c, d, e} {a, b}
Hierarchical quorum (3×3 matrix) : nodes arranged in a 3×3 grid; a quorum must contain at least two rows with at least two nodes per row. Example quorum {a1, a2, b1, b2} intersects with {b2, b3, c2, c3} at node b2. The layout is illustrated with ASCII art inside a .------. |a1 a2| a3 | .--|---. |b1 |b2| b3| '------' | c1 |c2 c3| '------'
Hierarchical quorum (2×n) : with two rows {x, y}, the quorum set is { {x}, {x, y} } (or the symmetric version). For a 3×5 example, a quorum could be {a1, a3} ∪ {a1, a2, b1, b2, b3} etc., always guaranteeing intersection.
Choosing a quorum directly influences system performance: fewer required nodes reduce message count and latency, especially when nodes are geographically dispersed.
Availability analysis shows that any quorum set can be transformed into a majority quorum without decreasing the probability of being available; therefore, majority quorums provide the highest reliability, while non‑majority quorums trade some availability for lower latency or higher fault tolerance.
Application scenarios:
Zookeeper hierarchical quorum : a three‑data‑center deployment (DC‑1, DC‑2, DC‑3) with three nodes per center uses hierarchical quorums, allowing the system to tolerate up to five of nine nodes failing while often needing only two data centers for a read/write.
Edge storage with weighted quorum : a central data center c (weight 2) and three edge centers e₁‑e₃ (weight 1 each). A write at e₁ can succeed via quorum {e₁, c} if the link to c is healthy, otherwise via {e₁, e₂, e₃}.
In summary, quorum is the foundational concept for achieving strong consistency in distributed systems; by selecting appropriate quorum definitions—majority, weighted, or hierarchical—engineers can balance availability, latency, and fault tolerance to suit complex, heterogeneous deployments.
High Availability Architecture
Official account for High Availability Architecture.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.