Explaining the Raft Consensus Algorithm with Animated Diagrams
This article uses animated diagrams to explain the Raft consensus algorithm. It covers an overview of Raft, its node roles, single‑node and multi‑node scenarios, the leader election process, term handling, election rules, heartbeat timeouts, and leader failure recovery, helping readers grasp the fundamentals of distributed consistency.
1. Raft Overview
The Raft algorithm is the consensus algorithm of choice for distributed systems, used in projects such as Etcd and Consul. Mastering it makes it easier to handle fault‑tolerance and consistency requirements in distributed configuration stores and NoSQL storage.
Raft achieves consensus by electing a leader and replicating logs across nodes.
2. Raft Roles
2.1 Roles
Follower: ordinary node that receives messages from the leader and becomes a candidate when the leader's heartbeat times out.
Candidate: requests votes from other nodes; if it gains a majority, it becomes the leader.
Leader: handles write requests, replicates logs, and sends periodic heartbeats to assert its authority.
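The three roles above can be sketched as a small state type. This is an illustrative sketch, not code from any real Raft implementation; the type and names are assumptions made for the example.

```go
package main

import "fmt"

// State models the three Raft roles described above.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

func (s State) String() string {
	return [...]string{"Follower", "Candidate", "Leader"}[s]
}

func main() {
	s := Follower
	s = Candidate // heartbeat timeout: follower becomes candidate
	s = Leader    // majority of votes: candidate becomes leader
	fmt.Println(s) // prints "Leader"
}
```

Every node is in exactly one of these states at any time, and all transitions between them are driven by timeouts, votes, or observed terms, as the following sections show.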
3. Single‑Node System
3.1 Database Server
Imagine a single node acting as a database server storing a value X.
3.2 Client
The client (green circle) interacts with node a (blue circle); “Term” denotes the election term.
3.3 Client Sends Data
The client updates the value to 8; in a single‑node setup consistency is trivial.
3.4 Multi‑Node Consistency
When multiple servers (a, b, c) form a cluster, Raft ensures all nodes store the same value after client updates, solving distributed consistency problems.
4. Leader Election Process
4.1 Initial State
All nodes start as followers; each node’s term is 0.
4.2 Becoming a Candidate
Nodes use random election timeouts. When node A’s timeout expires first, it becomes a candidate, increments its term to 1, and votes for itself.
Node A: Term = 1, Vote Count = 1
Node B: Term = 0
Node C: Term = 0
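The transition above can be sketched in a few lines. The `Node` struct and its field names are illustrative assumptions, not drawn from a real implementation:

```go
package main

import "fmt"

// Node is a minimal sketch of the per-node election state
// used in the walkthrough above.
type Node struct {
	ID        string
	Term      int
	VotedFor  string
	VoteCount int
}

// startElection models step 4.2: on election timeout the node
// increments its term, votes for itself, and becomes a candidate.
func (n *Node) startElection() {
	n.Term++
	n.VotedFor = n.ID
	n.VoteCount = 1 // its own vote
}

func main() {
	a := &Node{ID: "A", Term: 0}
	a.startElection()
	fmt.Println(a.Term, a.VoteCount) // prints "1 1"
}
```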
4.3 Voting
The candidate requests votes via RPC. Nodes B and C, having not voted in term 1, grant their votes to A, making A the new leader.
Step 1: Candidate A sends vote requests.
Step 2: Nodes B and C vote for A and update their terms.
Step 3: A receives a majority and becomes leader.
Step 4: A sends periodic heartbeats to B and C.
Step 5: B and C acknowledge the heartbeat.
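The vote‑granting rule in steps 1-3 can be sketched as follows. This is a simplified model (it omits Raft's log up‑to‑date check); the struct and function names are assumptions for illustration:

```go
package main

import "fmt"

// follower holds the state relevant to voting.
type follower struct {
	currentTerm int
	votedFor    string // "" means no vote cast in currentTerm
}

// handleRequestVote grants a vote only if the candidate's term is
// at least ours and we have not already voted for someone else
// in that term.
func (f *follower) handleRequestVote(candidateID string, term int) bool {
	if term < f.currentTerm {
		return false // stale candidate
	}
	if term > f.currentTerm {
		f.currentTerm = term // adopt the newer term
		f.votedFor = ""
	}
	if f.votedFor == "" || f.votedFor == candidateID {
		f.votedFor = candidateID
		return true
	}
	return false
}

func main() {
	b := &follower{currentTerm: 0}
	fmt.Println(b.handleRequestVote("A", 1)) // true: first vote in term 1
	fmt.Println(b.handleRequestVote("C", 1)) // false: already voted for A
}
```

This is exactly why B and C both vote for A in the walkthrough: neither has voted yet in term 1, and the first request to arrive wins their vote.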
4.4 Terms
A node's term increases when its election timeout expires and it becomes a candidate; a node that receives a message carrying a higher term updates its own term to match; and a leader or candidate that discovers a higher term reverts to follower.
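The "step down on a higher term" rule can be sketched in one function. The state is a plain string here to keep the sketch self‑contained; the function name is an assumption:

```go
package main

import "fmt"

// observeTerm sketches rule 4.4: a node that sees a higher term in
// any message adopts that term and steps down to follower;
// otherwise its term and state are unchanged.
func observeTerm(currentTerm int, state string, seenTerm int) (int, string) {
	if seenTerm > currentTerm {
		return seenTerm, "follower"
	}
	return currentTerm, state
}

func main() {
	term, state := observeTerm(1, "leader", 2)
	fmt.Println(term, state) // prints "2 follower"
}
```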
4.5 Election Rules
Only one leader per term; a new election starts if the leader fails or network issues occur.
Each server may cast at most one vote per term.
4.6 Majority
In a cluster of N nodes, a majority is ⌊N/2⌋ + 1 (e.g., 2 out of 3).
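The formula is one line of integer arithmetic:

```go
package main

import "fmt"

// majority returns the minimum number of votes needed in a cluster
// of n nodes: floor(n/2) + 1, as stated above.
func majority(n int) int {
	return n/2 + 1
}

func main() {
	fmt.Println(majority(3)) // prints "2"
	fmt.Println(majority(5)) // prints "3"
}
```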
4.7 Heartbeat Timeout
Randomized election timeouts ensure that typically only one node initiates an election, reducing split‑vote scenarios.
5. Leader Failure
If the leader crashes, remaining nodes trigger a new election. For example, when leader A fails, node C times out first, becomes candidate, gathers votes from B (and itself), and assumes leadership.
Step 1: Leader A fails; B and C miss heartbeats.
Step 2: C times out and becomes candidate.
Step 3: C requests votes.
Step 4: C receives votes, becomes leader.
Step 5: C sends heartbeats; B responds, A does not.
Summary
Raft ensures a single leader per term through term management, leader heartbeats, randomized election timeouts, first‑come‑first‑served voting, and majority vote rules, thereby minimizing election failures and maintaining distributed consistency.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.