Basic Introduction to Apache ZooKeeper: Architecture, Data Model, Sessions, ACLs, and Cluster Mechanisms

This article provides a comprehensive overview of Apache ZooKeeper, covering its ZAB protocol, data model hierarchy, node types, storage mechanisms, watch system, session lifecycle, ACL configurations, serialization with Jute, cluster roles, leader election, log management, and practical use cases such as distributed locks, ID generation, load balancing, and integration with frameworks like Dubbo and Kafka.

Wukong Talks Architecture

ZooKeeper Basic Introduction

Apache ZooKeeper, originally a sub‑project of Apache Hadoop, offers efficient and reliable distributed coordination services for distributed applications.

Key Features

Sequential Consistency: Updates from a client are applied in the order in which that client sent them.

Atomicity: An update either succeeds completely or fails completely; there are no partial results.

Single System Image: A client sees the same view of the service regardless of the server it connects to.

Reliability: Once an update has been applied, it persists until another update overwrites it.

Timeliness: A client's view is guaranteed to be up to date within a bounded time. A read served by a lagging server may briefly miss the latest write unless the client calls sync first.

Data Model

ZooKeeper stores data in a hierarchical tree similar to a file system, rooted at a fixed node (/). Each node (znode) is addressed by an absolute, slash-separated path; for example, the CLI command get /work/task reads the node /work/task. Each node holds a byte array, ACL information, optional children, and a stat structure.
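The path-addressed tree can be pictured with a small in-memory model. This is only a sketch: the class ZnodeTreeSketch and its methods are hypothetical, it stores nothing but the byte array, and it does not use the ZooKeeper client library.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative in-memory model of a path-addressed tree; not the ZooKeeper API.
final class ZnodeTreeSketch {
    private final Map<String, byte[]> nodes = new TreeMap<>();

    ZnodeTreeSketch() {
        nodes.put("/", new byte[0]); // the fixed root node always exists
    }

    // Returns the parent path, e.g. "/work/task" -> "/work" and "/work" -> "/".
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    // Creates a node only if the path is absolute, unused, and its parent exists.
    void create(String path, byte[] data) {
        if (!path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("invalid path: " + path);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("no parent for: " + path);
        }
        nodes.put(path, data.clone());
    }

    // Returns the stored bytes, or null if the node does not exist.
    byte[] get(String path) {
        return nodes.get(path);
    }
}
```

In this model, creating /work/task fails until /work has been created, which mirrors the requirement that a node's parent must exist first.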

Node Types

Persistent: Remains after the client session ends; must be deleted explicitly.

Ephemeral: Automatically removed when the session that created it closes or expires; useful for tracking live servers (e.g., /servers/host). Ephemeral nodes cannot have children.

Sequential: A flag that can be combined with either type above. ZooKeeper appends a monotonically increasing, zero-padded 10-digit counter to the node name, enabling ordered creation (e.g., /works/task-0000000001).
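The naming of sequential nodes can be sketched in plain Java. ZooKeeper formats the counter as a zero-padded 10-digit decimal appended to the requested path; the class and method names below are hypothetical and only imitate that formatting, without calling ZooKeeper.

```java
import java.util.Locale;

// Illustrative only: imitates how a sequence suffix is appended to a path.
final class SequentialNameSketch {
    // Appends the counter as a zero-padded 10-digit decimal suffix.
    static String sequentialName(String prefix, int counter) {
        return prefix + String.format(Locale.ENGLISH, "%010d", counter);
    }

    public static void main(String[] args) {
        System.out.println(sequentialName("/works/task-", 1)); // /works/task-0000000001
    }
}
```

Because the suffix has a fixed width, sorting the names as plain strings yields the creation order.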

Node State Attributes

| Attribute | Description |
| --- | --- |
| czxid | Creation transaction ID |
| ctime | Creation timestamp |
| mzxid | Last modification transaction ID |
| mtime | Last modification timestamp |
| pzxid | Last child-modification transaction ID |
| cversion | Child version |
| version | Data version |
| aversion | ACL version |
| ephemeralOwner | Session ID of the creator (0 for persistent nodes) |
| dataLength | Length of the data byte array |
| numChildren | Number of child nodes |

Storage

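The data tree is kept in memory for fast access, while transaction logs and snapshots are persisted on local disk. Each server appends every transactional (write) operation to its transaction log before applying it, so that state can be recovered after a restart and synchronized between servers; snapshots periodically dump the in-memory tree to disk.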

Watch Mechanism

Clients can register a default Watcher when creating a ZooKeeper instance, or set watches on individual nodes through getData, exists, and getChildren. A watch set this way is a one-time trigger: it fires once when the watched node changes and must be re-registered to receive further events.

new ZooKeeper(String connectString, int sessionTimeout, Watcher watcher)
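The one-time trigger behavior can be illustrated with a small in-memory model. This is a sketch under stated assumptions: OneTimeWatchSketch, watch, and change are hypothetical names, and the class only imitates the fire-once semantics rather than calling the ZooKeeper client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative in-memory model of one-time watches; not the ZooKeeper client API.
final class OneTimeWatchSketch {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Registers a watcher for a path; it fires at most once.
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    // Simulates a change: removes the registered watchers first, then notifies them.
    void change(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(w -> w.accept(path));
        }
    }

    public static void main(String[] args) {
        OneTimeWatchSketch zk = new OneTimeWatchSketch();
        List<String> events = new ArrayList<>();
        zk.watch("/work/task", events::add);
        zk.change("/work/task");             // fires the watcher
        zk.change("/work/task");             // no watcher left, nothing fires
        zk.watch("/work/task", events::add); // re-register to keep observing
        zk.change("/work/task");             // fires again
        System.out.println(events.size());   // prints 2
    }
}
```

A change that happens between the event and the re-registration is not reported in this model, which is why real clients usually re-read the node when they re-register.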

Session Management

A session is described by a session ID, a timeout, and a closing flag. Sessions move through states such as CONNECTING, CONNECTED, RECONNECTING, and CLOSED. Each heartbeat (an explicit ping or any regular request) pushes the session's expiration time forward. The server groups sessions into buckets keyed by their rounded expiration time, so it can check one bucket per interval instead of scanning every session.
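The bucketing idea can be sketched as follows, assuming the expiration time is rounded up to the next multiple of a fixed interval (commonly the server tick time). The class and method names are hypothetical, and the numbers are made up for illustration.

```java
// Illustrative sketch of session-expiry bucketing; names are hypothetical.
final class SessionBucketSketch {
    // Rounds a timestamp up to the next multiple of the bucket interval.
    static long roundToNextInterval(long timeMs, long intervalMs) {
        return (timeMs / intervalMs + 1) * intervalMs;
    }

    // Bucket in which a session touched at nowMs with the given timeout is checked.
    static long expirationBucket(long nowMs, long timeoutMs, long intervalMs) {
        return roundToNextInterval(nowMs + timeoutMs, intervalMs);
    }

    public static void main(String[] args) {
        // Two sessions touched at slightly different times land in the same bucket.
        System.out.println(expirationBucket(10_000, 5_000, 2_000)); // 16000
        System.out.println(expirationBucket(10_900, 5_000, 2_000)); // 16000
    }
}
```

Refreshing a session on a heartbeat then amounts to moving it from its current bucket to the newly computed one.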

ACL (Access Control List)

Three commonly used ACL schemes are ip, which matches a client address or address range (e.g., ip:192.168.0.11/22); digest, which identifies a user as username:BASE64(SHA-1(username:password)); and world, which is open to all clients. Permissions are CREATE, READ, WRITE, DELETE, and ADMIN.
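The digest form can be reproduced with the Java standard library alone. The sketch below assumes the ID is the username followed by the Base64-encoded SHA-1 hash of "username:password"; DigestAclSketch and digestId are hypothetical names, and the credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative sketch of a digest-scheme ID; uses only the JDK.
final class DigestAclSketch {
    // Builds "username:BASE64(SHA-1(username:password))".
    static String digestId(String username, String password) {
        try {
            String idPassword = username + ":" + password;
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(idPassword.getBytes(StandardCharsets.UTF_8));
            return username + ":" + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        System.out.println(digestId("alice", "secret"));
    }
}
```

Only the hashed form is stored in the ACL; a client still authenticates with the plain username:password pair, which the server hashes the same way before comparing.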

Serialization (Jute)

ZooKeeper uses the Jute framework for serialization. Classes implement the Record interface, whose serialize and deserialize methods write and read each field through archive calls such as writeLong and writeString.

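import java.io.IOException;
import org.apache.jute.*;

// Example record with two fields, serialized through Jute archives.
class TestJute implements Record {
  private long ids;
  private String name;
  // Write each field to the archive, e.g. with writeLong and writeString.
  public void serialize(OutputArchive a, String tag) throws IOException { ... }
  // Read the fields back in the same order in which they were written.
  public void deserialize(InputArchive a, String tag) throws IOException { ... }
}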
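To make the write-then-read symmetry concrete without depending on the Jute library, the sketch below serializes the same two fields with java.io streams. RecordSketch is a hypothetical stand-in, and the byte layout (an 8-byte long followed by a length-prefixed UTF-8 string) is an assumption chosen for illustration, not a statement about Jute's wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Stand-in for a serializable record, using only java.io; not the org.apache.jute API.
final class RecordSketch {
    final long ids;
    final String name;

    RecordSketch(long ids, String name) {
        this.ids = ids;
        this.name = name;
    }

    // Writes the fields in a fixed order: the long, then the length-prefixed string.
    byte[] serialize() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(ids);
            byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
            out.writeInt(utf8.length);
            out.write(utf8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static RecordSketch deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long ids = in.readLong();
            byte[] utf8 = new byte[in.readInt()];
            in.readFully(utf8);
            return new RecordSketch(ids, new String(utf8, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key design point is that the format carries no field names, so serialize and deserialize must agree on field order.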

Cluster Architecture

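ZooKeeper clusters consist of Leader, Follower, and Observer nodes. Transactional (write) requests are forwarded to the Leader, which broadcasts each change as a proposal through the ZAB (ZooKeeper Atomic Broadcast) protocol and commits it once a quorum of voting servers has acknowledged it. Observers receive committed changes but do not vote on proposals or in elections. Leader election uses the FastLeaderElection algorithm.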

Leader Election Process

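Servers exchange votes containing logicClock, state, self_id, self_zxid, vote_id, and vote_zxid. Within the same election round, a vote with a higher zxid beats one with a lower zxid, and ties are broken in favor of the higher server ID. A candidate becomes Leader once more than half of the voting servers have voted for it.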
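The vote comparison and the majority check can be sketched as below. This is a simplified illustration that covers only the zxid and server ID comparison described above; it omits the election-round (logicClock) handling, and VoteSketch, supersedes, and hasQuorum are hypothetical names rather than ZooKeeper classes.

```java
// Simplified sketch of vote ordering; omits election-round handling.
final class VoteSketch {
    final long voteId;   // server ID proposed as leader (vote_id)
    final long voteZxid; // latest zxid of the proposed leader (vote_zxid)

    VoteSketch(long voteId, long voteZxid) {
        this.voteId = voteId;
        this.voteZxid = voteZxid;
    }

    // True if the candidate vote should replace the current one:
    // higher zxid wins, and the higher server ID breaks ties.
    static boolean supersedes(VoteSketch candidate, VoteSketch current) {
        if (candidate.voteZxid != current.voteZxid) {
            return candidate.voteZxid > current.voteZxid;
        }
        return candidate.voteId > current.voteId;
    }

    // A quorum is a strict majority of the voting servers.
    static boolean hasQuorum(int votesForLeader, int votingServers) {
        return votesForLeader > votingServers / 2;
    }
}
```

Preferring the highest zxid means the elected Leader has seen the most recent transactions, so it does not need to fetch missing history from its followers.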

Log Management

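ZooKeeper continuously generates transaction logs and snapshots and does not remove them by default. Old files can be cleaned with the bundled PurgeTxnLog tool, a scheduled Linux crontab script, or the built-in autopurge settings (autopurge.snapRetainCount and autopurge.purgeInterval).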
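The retention idea behind such cleanup can be sketched as a pure function: keep the newest few snapshots and mark the rest for deletion. The sketch assumes snapshot files are named snapshot.<zxid in hexadecimal>; SnapshotRetentionSketch and its methods are hypothetical, and the code only selects names without touching the file system or handling transaction logs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative retention logic; selects file names only, deletes nothing.
final class SnapshotRetentionSketch {
    // Parses the hexadecimal zxid suffix of a name such as "snapshot.1a".
    static long zxidOf(String fileName) {
        return Long.parseLong(fileName.substring(fileName.lastIndexOf('.') + 1), 16);
    }

    // Returns the snapshots that fall outside the newest `retain` files.
    static List<String> toDelete(List<String> snapshots, int retain) {
        List<String> sorted = new ArrayList<>(snapshots);
        sorted.sort(Comparator.comparingLong(SnapshotRetentionSketch::zxidOf)); // oldest first
        int surplus = Math.max(0, sorted.size() - retain);
        return new ArrayList<>(sorted.subList(0, surplus));
    }
}
```

Sorting by the parsed zxid rather than by the raw string matters because the hexadecimal suffix is not zero-padded, so "snapshot.a" is newer than "snapshot.3".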

Practical Use Cases

Distributed Lock: Each client creates an ephemeral sequential node under /lock; the client whose node has the smallest sequence number holds the lock. Every other client watches only its immediate predecessor, so a release wakes one waiter instead of all of them (avoiding the thundering herd problem).

Distributed ID Generation: Use the sequence numbers of sequential nodes as unique, ordered IDs.

Load Balancing: Store each server's connection count under /servers and apply a minimum-connection algorithm to select the target server.

Framework Integration: ZooKeeper serves as the registry for Dubbo, stores broker metadata for ZooKeeper-based Kafka deployments, and supports many other distributed systems.
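The ordering step of the distributed lock recipe can be sketched without a running ensemble. Given the children of /lock and the client's own node name, the hypothetical helper below decides whether the client holds the lock or which predecessor it should watch. It assumes all children share one prefix and carry zero-padded sequence suffixes, so that string order equals creation order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Illustrative ordering logic for the lock recipe; does not talk to ZooKeeper.
final class LockOrderSketch {
    // Returns empty if `self` holds the lock, otherwise the predecessor node to watch.
    static Optional<String> predecessorToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // zero-padded suffixes sort in creation order
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("unknown node: " + self);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

When the watched predecessor disappears, the client would list the children again and repeat the check, since the predecessor may have failed rather than released the lock.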

References

From Paxos to ZooKeeper: Distributed Consistency Principles and Practice (《从Paxos到Zookeeper 分布式一致性原理与实践》)

Written by

Wukong Talks Architecture

Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.
