Fundamentals 21 min read

How the Byzantine Generals Problem Shapes Modern Distributed Consensus

This article explains the Byzantine Generals Problem, maps its concepts to distributed consensus, distinguishes consensus from consistency, outlines oral‑message and signed‑message solutions, and analyzes their applicability and limitations in fault‑tolerant distributed systems.

Xiaokun's Architecture Exploration Notes
Xiaokun's Architecture Exploration Notes
Xiaokun's Architecture Exploration Notes
How the Byzantine Generals Problem Shapes Modern Distributed Consensus

Byzantine Generals Problem

The problem describes a scenario where each division in a Byzantine army has its own general, some of whom may be traitors, and communication occurs via messengers.

Every loyal general must receive the same plan.

A minority of traitors must not cause loyal generals to adopt a wrong plan.

Problem Restatement

All loyal generals receive identical information and reach the same decision.

A message from a loyal general must be adopted by all other loyal generals.

What Is the Byzantine Generals Problem?

With three divisions, each general proposes an attack or retreat plan and communicates through messengers; the final decision follows a “minority obeys majority” rule.

Minority‑Obeys‑Majority Voting Reliability

Each general can propose a plan and notify others via messengers.

Upon receiving proposals, generals adopt the majority decision.

Mapping to Distributed Consensus

In distributed systems, the concepts translate as:

Byzantine generals → service nodes.

Loyal generals → correctly functioning nodes.

Traitorous generals → faulty or malicious nodes.

Messengers killed → communication failures.

Messengers replaced by spies → message tampering or forgery.

Distributed consensus ensures that all non‑faulty nodes agree on the same value despite possible faults.

Consensus vs. Consistency

Consensus deals with agreeing on a value (e.g., transaction commit), while consistency ensures that all replicas return the same data for read operations.

Properties of Distributed Consensus

Termination: Every non‑faulty process eventually decides. Agreement: All non‑faulty processes decide on the same value. Validity: If all start with the same initial value, the final decision equals that value.

Application Scenarios

Leader election in a cluster.

Mutual exclusion for shared resources.

Fault detection of a node.

Clock phase synchronization.

Air Traffic Control systems requiring a consistent view.

Oral‑Message Solution

Requirements

Every message sent by a node must be correctly received.

Nodes must know the sender of each message without additional network exchange.

Nodes must detect heartbeats of other nodes (requires timeout).

Algorithm

A function major(V1,V2,…,Vn‑1) returns the value that obtains a majority vote, assuming the number of unavailable nodes m satisfies m < n/3 (i.e., n = 3m+1 ).

If no unavailable nodes (m=0): the leader broadcasts its proposal v to all replicas, which execute it; missing messages cause a default “retreat”.

If m > 0 : replicas that miss the proposal also default to “retreat” and may become new leaders to propagate their own proposals.

Analysis

With only three nodes, oral messages cannot solve the Byzantine problem because a single faulty node can prevent consensus.

Signed‑Message Solution

Requirements

Messages cannot be forged; forged messages are detectable.

Each node can verify the authenticity of a message’s signature.

Algorithm

Define choice(V) where V is an ordered set of signed messages:

If V contains one element v , then choice(V)=v .

If V is empty, choice(V)=RETREAT .

The leader signs a message v:0:i and sends it to all nodes. Upon receiving a signed message, a node appends its identifier and forwards it, building a chain of signatures. When no further signatures arrive, the node selects the proposal from the ordered set and executes it.

Analysis

Signed messages allow nodes to detect and discard forged messages, enabling consensus even with malicious nodes, suitable for decentralized public‑network environments.

Conclusion

Oral‑message algorithms work for crash‑fault tolerant (CFT) scenarios, while signed‑message algorithms address Byzantine fault tolerance, providing robust solutions for leader election, mutual exclusion, and distributed transactions.

fault tolerancedistributed consensusconsensus algorithmsByzantine Generalsoral messagessigned messages
Xiaokun's Architecture Exploration Notes
Written by

Xiaokun's Architecture Exploration Notes

10 years of backend architecture design | AI engineering infrastructure, storage architecture design, and performance optimization | Former senior developer at NetEase, Douyu, Inke, etc.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.