Key Distributed System Concepts: Bloom Filter, Consistent Hashing, Quorum, Leader/Follower, and More
This article introduces essential distributed‑system concepts—including Bloom filters, consistent hashing, quorum, leader/follower roles, heartbeats, fencing, WAL, segment logs, high‑water marks, leases, gossip protocol, Phi failure detection, CAP and PACELC theorems, hinted handoff, read repair, and Merkle trees—explaining their purpose and how they are applied in systems such as BigTable, Cassandra, Dynamo, and Kafka.
1. Bloom Filter : A space‑efficient probabilistic data structure used to test whether an element is a member of a set; a negative answer is definitive, while a positive answer may be a false positive. It is commonly employed in storage systems like BigTable and Cassandra to skip disk reads for keys that are definitely absent.
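To make the membership test concrete, here is a minimal Python sketch of a Bloom filter. The bit‑array size, hash count, and SHA‑256‑based hashing scheme are illustrative choices, not what BigTable or Cassandra actually use:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions over an m-bit array."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # m-bit array packed into one integer

    def _positions(self, item):
        # Derive k positions by salting one hash function (illustrative scheme).
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits >> p & 1 for p in self._positions(item))
```

A store can call `might_contain` before touching disk: a `False` answer safely skips the read, while a `True` answer still requires checking the actual data.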
2. Consistent Hashing : Enables scalable data distribution by hashing keys onto a ring and assigning each key to the next clockwise node, providing stable incremental changes when nodes join or leave.
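The ring can be sketched in a few lines of Python. The use of MD5 and 100 virtual nodes per server are illustrative parameters; real systems tune both:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes per physical node."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node)
        for n in nodes:
            self.add(n)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def get(self, key):
        if not self.ring:
            return None
        i = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[i % len(self.ring)][1]  # next clockwise node
```

The key property: removing a node only remaps the keys that node owned; every other key keeps its current owner.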
3. Quorum : Defines the minimum number of replicas that must acknowledge an operation for it to be considered successful, used in Cassandra writes and leader election protocols.
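The two standard quorum rules can be written down directly; the function names here are illustrative:

```python
def majority(n: int) -> int:
    """Smallest majority quorum: more than half of n replicas."""
    return n // 2 + 1

def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Strict-quorum condition R + W > N: every read quorum intersects
    every write quorum, so a read contacts at least one replica that
    acknowledged the latest successful write."""
    return r + w > n
```

For example, with N=3 the common Cassandra setting W=2, R=2 satisfies the overlap condition, while W=1, R=1 does not and can return stale data.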
4. Leader and Follower : In replicated clusters, a designated leader makes decisions and propagates them to followers; leader election ensures fault tolerance and high availability.
5. Heartbeat : Periodic checks that detect leader failure, triggering a new election when necessary.
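A timeout‑based heartbeat monitor is only a few lines; the injectable clock below is a testing convenience, and the 3‑second timeout is an illustrative default:

```python
import time

class HeartbeatMonitor:
    """Marks a peer as failed when no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout, self.clock = timeout, clock
        self.last_seen = {}

    def beat(self, node):
        # Called whenever a heartbeat message arrives from `node`.
        self.last_seen[node] = self.clock()

    def alive(self, node):
        last = self.last_seen.get(node)
        return last is not None and self.clock() - last < self.timeout
```

When `alive(leader)` turns false, followers would start a new election.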
6. Fencing : Prevents a failed leader from accessing cluster resources by isolating it, using resource or node fencing techniques.
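One common fencing technique is a fencing token: each new leader gets a higher token, and the resource rejects requests carrying an older one. A sketch, with illustrative names:

```python
class FencedStore:
    """Storage that rejects writes carrying a stale fencing token, so a
    paused or deposed ex-leader cannot corrupt state after a new leader
    takes over with a higher token."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        if token < self.highest_token:
            raise PermissionError(f"stale fencing token {token}")
        self.highest_token = token
        self.data[key] = value
```

The old leader's writes fail at the resource itself, even if the leader never noticed it was deposed.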
7. Write‑Ahead Log (WAL) : Records intended operations before they are applied to ensure durability and enable recovery after crashes.
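The essential discipline is: append and flush the log record before applying the change, and replay the log on startup. A minimal sketch using one JSON record per line (the file format is an assumption for illustration):

```python
import json
import os

class WalStore:
    """Key-value store that logs each write durably before applying it;
    on restart, the log is replayed to rebuild in-memory state."""

    def __init__(self, path):
        self.path, self.state = path, {}
        if os.path.exists(path):
            with open(path) as f:
                for line in f:  # crash recovery: replay the log
                    op = json.loads(line)
                    self.state[op["key"]] = op["value"]
        self.log = open(path, "a")

    def put(self, key, value):
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # durable before we apply
        self.state[key] = value
```

If the process crashes between the `fsync` and the in-memory update, replay restores the write, so an acknowledged operation is never lost.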
8. Segment Log : Splits a large log file into smaller, manageable segments to avoid performance bottlenecks and simplify cleanup.
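The payoff of segmentation is that cleanup becomes "drop whole old segments" rather than rewriting one huge file. An in-memory sketch (real implementations roll files on disk by size or age):

```python
class SegmentedLog:
    """Append-only log rolled into fixed-size segments."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self.segments = [[]]  # list of segments, each a list of entries

    def append(self, entry):
        if len(self.segments[-1]) >= self.max_entries:
            self.segments.append([])  # roll a new active segment
        self.segments[-1].append(entry)

    def truncate_oldest(self, n_segments):
        """Cleanup: drop the n oldest segments wholesale."""
        self.segments = self.segments[n_segments:] or [[]]
```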
9. High‑Water Mark : The index of the last log entry replicated to a quorum of followers; leaders expose only entries up to this point.
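Computing the high-water mark reduces to finding the highest index present on at least a quorum of members. A sketch (counting the leader itself as one member, which is an assumption of this illustration):

```python
def high_water_mark(leader_end: int, follower_ends: list[int], quorum: int) -> int:
    """Highest log index replicated to at least `quorum` members,
    i.e. the quorum-th largest end offset across leader and followers."""
    ends = sorted(follower_ends + [leader_end], reverse=True)
    return ends[quorum - 1]
```

Entries above this index may still be lost if the leader fails, which is why they are not served to readers.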
10. Lease : A time‑bounded lock that expires automatically unless the holder renews it before the deadline, so a crashed or disconnected client cannot hold the lock forever.
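The acquire/renew/expire cycle can be sketched as follows; the 10‑second TTL and injectable clock are illustrative:

```python
import time

class Lease:
    """A lock that expires after `ttl` seconds unless the holder renews it."""

    def __init__(self, ttl=10.0, clock=time.monotonic):
        self.ttl, self.clock = ttl, clock
        self.holder, self.expires = None, 0.0

    def acquire(self, client):
        now = self.clock()
        if self.holder is None or now >= self.expires:
            self.holder, self.expires = client, now + self.ttl
            return True
        return False

    def renew(self, client):
        # Only the current holder may extend, and only before expiry.
        if client == self.holder and self.clock() < self.expires:
            self.expires = self.clock() + self.ttl
            return True
        return False
```

If the holder stops renewing, the lease lapses on its own and another client can take over.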
11. Gossip Protocol : A peer‑to‑peer communication mechanism where nodes periodically exchange state information to disseminate updates quickly.
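A push-style gossip round can be sketched in a few lines. Merging entries by a per-key version number is an assumption of this illustration (real systems such as Cassandra use richer versioned state):

```python
import random

def gossip_round(states, fanout=2, rng=random):
    """One round: every node pushes its known entries to `fanout` random
    peers; a peer keeps whichever copy has the higher version number."""
    nodes = list(states)
    for node in nodes:
        peers = [n for n in nodes if n != node]
        for peer in rng.sample(peers, fanout):
            for key, (ver, val) in states[node].items():
                if key not in states[peer] or states[peer][key][0] < ver:
                    states[peer][key] = (ver, val)
```

With a small constant fanout, an update reaches all N nodes in roughly O(log N) rounds, without any central coordinator.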
12. Phi Accrual Failure Detection : An adaptive algorithm that outputs a suspicion level for a node rather than a binary up/down status, used by Cassandra.
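The idea is to learn the distribution of heartbeat inter-arrival times and report phi = -log10 of the probability that a heartbeat is merely late. The sketch below uses a normal-distribution tail as a simplification (the original paper and Cassandra use an exponential tail approximation):

```python
import math
import statistics
import time

class PhiAccrualDetector:
    """Outputs a suspicion level phi instead of a binary up/down verdict."""

    def __init__(self, clock=time.monotonic, window=100):
        self.clock, self.window = clock, window
        self.intervals, self.last = [], None

    def heartbeat(self):
        now = self.clock()
        if self.last is not None:
            self.intervals.append(now - self.last)
            self.intervals = self.intervals[-self.window:]
        self.last = now

    def phi(self):
        if len(self.intervals) < 2 or self.last is None:
            return 0.0
        mean = statistics.mean(self.intervals)
        std = statistics.stdev(self.intervals) or 1e-6
        elapsed = self.clock() - self.last
        # P(next heartbeat arrives later than `elapsed`) under a normal model
        p_later = 0.5 * math.erfc((elapsed - mean) / (std * math.sqrt(2)))
        return -math.log10(max(p_later, 1e-300))
```

Callers pick a threshold (Cassandra's default `phi_convict_threshold` is 8): low phi means "probably just late", high phi means "almost certainly down".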
13. Split‑Brain : A scenario where a partition or pause leaves multiple nodes each believing they are the leader; resolved by attaching a monotonically increasing generation number (epoch) to the leader's messages so that other nodes can reject the stale leader.
14. Checksum : A hash of the data (e.g., CRC32, MD5, SHA‑256) stored alongside it to detect corruption during transfer or storage.
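A store-and-verify round trip is straightforward; the record layout below is an illustrative choice:

```python
import hashlib

def with_checksum(payload: bytes) -> dict:
    """Attach a SHA-256 checksum to a payload before storing/sending it."""
    return {"data": payload.hex(), "sha256": hashlib.sha256(payload).hexdigest()}

def verify(record: dict) -> bytes:
    """Recompute the checksum on read; mismatch means corruption."""
    data = bytes.fromhex(record["data"])
    if hashlib.sha256(data).hexdigest() != record["sha256"]:
        raise ValueError("checksum mismatch: data corrupted")
    return data
```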
15. CAP Theorem : States that during a network partition a distributed system must choose between Consistency and Availability (partition tolerance itself is not optional in practice); systems like Dynamo (AP) and BigTable (CP) make different trade‑offs.
16. PACELC Theorem : Extends CAP: if a Partition occurs, trade off Availability vs. Consistency (PAC); Else, during normal operation, trade off Latency vs. Consistency (ELC).
17. Hinted Handoff : Temporarily stores writes for unavailable nodes and forwards them when the nodes recover.
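The coordinator-side logic can be sketched as follows; the class and method names are illustrative, not an actual Cassandra or Dynamo API:

```python
class HintedCoordinator:
    """Coordinator that stores a 'hint' for each write missed by a down
    replica and replays the hints when the replica recovers."""

    def __init__(self, replicas):
        self.replicas = replicas  # name -> dict (up) or None (down)
        self.hints = {}           # name -> list of (key, value)

    def write(self, key, value):
        for name, store in self.replicas.items():
            if store is None:  # replica is down: park the write as a hint
                self.hints.setdefault(name, []).append((key, value))
            else:
                store[key] = value

    def node_recovered(self, name, store):
        self.replicas[name] = store
        for key, value in self.hints.pop(name, []):
            store[key] = value  # replay missed writes
```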
18. Read Repair : During reads, detects stale replicas and pushes the latest version to them, used by Cassandra and Dynamo.
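A sketch of the read path, modeling each replica as a dict mapping key to a (version, value) pair; using a single version counter instead of vector clocks is a simplification:

```python
def read_with_repair(replicas, key):
    """Read `key` from all replicas, return the newest (version, value),
    and push that version back to any replica holding a stale copy."""
    observed = [(r.get(key, (0, None)), r) for r in replicas]
    latest = max(ver_val for ver_val, _ in observed)
    for (ver, _), replica in observed:
        if ver < latest[0]:
            replica[key] = latest  # repair the stale replica in-line
    return latest
```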
19. Merkle Trees : Binary hash trees that enable efficient comparison of large data sets by comparing root hashes and recursively checking mismatched sub‑trees, employed by Dynamo for anti‑entropy.
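Computing the root hash is the core operation: two replicas first compare roots, and only on a mismatch recurse into sub-trees. A sketch of the root computation (duplicating the last node on odd-sized levels is one common convention, used here for illustration):

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves) -> bytes:
    """Root hash over a list of byte-string leaves, pairing and hashing
    nodes level by level until one hash remains."""
    level = [_h(leaf) for leaf in leaves]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Equal roots prove two key ranges are identical without shipping any data; differing roots narrow the search to the mismatched sub-trees, which is what makes anti-entropy cheap.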