Why Kafka Dropped Zookeeper in Version 2.8: Design Philosophy and Alternatives
The article explains the design philosophy behind Kafka 2.8’s removal of Zookeeper, reviews Zookeeper’s classic leader‑election use cases, highlights its limitations, and shows how the Raft protocol provides a decentralized alternative for high‑availability leader selection in distributed messaging systems.
Kafka 2.8 shipped the first version of Kafka able to run without Zookeeper, prompting the question of whether this change is merely about shedding an external component or reflects a deeper design philosophy.
1. Classic Zookeeper Use Cases
Zookeeper emerged alongside the rise of big‑data and distributed systems to provide reliable storage on cheap, failure‑prone machines. By forming a cluster of replicas, applications achieve high availability through automatic leader election and failover.
Its core function here is leader election: selecting a primary node that handles reads and writes while the other nodes replicate its data, ensuring high availability.
Zookeeper’s ephemeral sequential nodes and watch mechanism make implementing leader election straightforward.
In a typical election, multiple members (t1, t2, …) compete to become the leader; only one serves clients at a time, and if it fails, the remaining members elect a new leader.
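The election rule itself is simple enough to model in a few lines. The sketch below is a toy, in-memory simulation of the ephemeral-sequential-node pattern (the class and method names are illustrative, not real ZooKeeper API calls): the member holding the lowest sequence number is the leader, and each member watches only the node immediately before its own, so a single failure wakes one successor rather than the whole group.

```java
import java.util.OptionalLong;
import java.util.TreeMap;

// Toy model of Zookeeper-style leader election with ephemeral sequential
// nodes. join() mimics create("/election/member-", EPHEMERAL_SEQUENTIAL);
// leave() mimics the ephemeral node vanishing when a session expires.
public class ElectionModel {
    private final TreeMap<Long, String> nodes = new TreeMap<>();
    private long nextSeq = 0;

    public long join(String member) {
        long seq = nextSeq++;
        nodes.put(seq, member);
        return seq;                       // the member's sequence number
    }

    public void leave(long seq) {
        nodes.remove(seq);                // ephemeral node deleted
    }

    public String leader() {
        return nodes.firstEntry().getValue();   // lowest sequence wins
    }

    // The node this member watches: the one with the next-lower sequence.
    public OptionalLong watchTarget(long seq) {
        Long prev = nodes.lowerKey(seq);
        return prev == null ? OptionalLong.empty() : OptionalLong.of(prev);
    }
}
```

Watching only the predecessor (rather than the leader node itself) is what prevents the "thundering herd" of watch notifications when the leader dies.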
Zookeeper clusters are deployed for strong consistency (the CP side of CAP) and tolerate the failure of a minority of nodes (fewer than half), but they can suffer availability gaps during leader election, and long full-GC pauses can expire sessions, deleting ephemeral nodes and triggering spurious watch notifications.
2. Kafka’s Need for Zookeeper
Kafka relies heavily on leader election for each topic partition’s replicas. One replica is elected leader to handle client I/O, while followers replicate from it, and the leader’s write success determines commit acknowledgment.
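The commit-acknowledgment behavior described above is tunable. A hedged illustration of the relevant settings (values are examples, not recommendations; `min.insync.replicas` is a broker/topic-level setting shown here only for context):

```properties
# Producer configuration
bootstrap.servers=localhost:9092
# acks=all: the partition leader acknowledges a write only after
# every in-sync replica has persisted it
acks=all

# Broker/topic configuration (server.properties or per-topic override):
# a write fails unless at least this many replicas are in sync
min.insync.replicas=2
```

Together these mean a produced record is considered committed only once it survives on multiple replicas, which is exactly why the partition leader election that Zookeeper coordinated matters so much.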
Thus, Zookeeper’s leader-election capabilities fit Kafka’s requirements perfectly, and the two entered a “honeymoon” period of tight integration.
3. Zookeeper’s Critical Weaknesses
Although Zookeeper provides strong consistency, its CP nature sacrifices availability: while the ensemble is electing its own leader it cannot serve requests, and frequent full GC can expire sessions, deleting all ephemeral nodes and breaking the very election service it provides to applications.
From a high‑availability perspective, relying on an external component like Zookeeper is not an elegant long‑term solution.
With the rise of decentralized designs, the Raft consensus algorithm has become a compelling alternative. Raft combines leader election and log replication to achieve strong consistency without external dependencies, embedding the protocol directly into the application.
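Raft’s election half can be illustrated by its vote-granting rule: a node grants at most one vote per term, and seeing a higher term always resets its voting state. A minimal sketch follows (simplified for illustration; the log up-to-date check from the Raft paper is omitted, and the names are ours, not from any library):

```java
// Simplified Raft RequestVote handler for a single node.
public class RaftVoter {
    long currentTerm = 0;
    Integer votedFor = null;   // candidate id voted for in currentTerm

    // Returns true if this node grants its vote to the candidate.
    public boolean requestVote(long candidateTerm, int candidateId) {
        if (candidateTerm < currentTerm) {
            return false;                  // stale candidate: reject
        }
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm;   // newer term observed:
            votedFor = null;               // forget any earlier vote
        }
        if (votedFor == null || votedFor == candidateId) {
            votedFor = candidateId;        // at most one vote per term
            return true;
        }
        return false;
    }
}
```

Because each node votes at most once per term, two candidates can never both collect a majority in the same term, which is what makes the elected leader unique without any external coordinator.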
Consequently, Kafka 2.8 introduced an internal Raft-based quorum controller (KRaft) in place of Zookeeper, eliminating the need for an external coordination service.
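In KRaft mode the controller quorum is configured directly in each broker’s `server.properties`; a minimal example is sketched below (node IDs, host names, and ports are placeholders):

```properties
# This node acts as both a regular broker and a quorum controller
process.roles=broker,controller
node.id=1
# The Raft voters: id@host:port for every controller in the quorum
controller.quorum.voters=1@host1:9093,2@host2:9093,3@host3:9093
```

With this in place, cluster metadata is replicated through the controllers’ own Raft log instead of being stored in an external Zookeeper ensemble.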
For readers interested in Raft, the author recommends a series of articles on the protocol.
Finally, the author encourages readers to follow, like, and comment as a form of support.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java