Understanding Zookeeper: Cluster Architecture, Data Model, Watcher Mechanism, and Common Use Cases
This article explains Zookeeper, a high-performance distributed coordination service. It covers the cluster roles, the ZNode data model, and the watcher mechanism, provides Java code demos for creating and monitoring nodes, and outlines eight typical application scenarios, including configuration management, load balancing, naming services, master election, and distributed locks.
1. Zookeeper Cluster Roles
Zookeeper clusters consist of three types of nodes: the Leader, which handles all write requests and internal scheduling; Followers, which serve read requests, forward writes to the Leader, and vote in elections; and Observers, introduced in version 3.3.0, which only process reads and participate in neither elections nor write quorums.
Configuring an Observer is simple: add peerType=observer to the Observer's own configuration file, and append :observer to that server's entry in every node's config, e.g., server.1:localhost:2181:3181:observer.
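For instance, a three-server ensemble with one Observer might use a zoo.cfg along these lines (hostnames and ports here are illustrative, using the standard key=value server syntax):

```
# zoo.cfg -- server list is identical on every node
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181

server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888:observer

# only on server 3 (the Observer itself):
peerType=observer
```

Because Observers do not vote, adding more of them scales read throughput without slowing down the write quorum.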
2. Zookeeper Data Model
Zookeeper stores data in a hierarchical ZNode tree, similar to a Unix file system but without separate file/directory concepts. Each ZNode can hold data and have child nodes. There are four ZNode types:
Persistent (PERSISTENT): remains after client disconnects until explicitly deleted.
Persistent Sequential (PERSISTENT_SEQUENTIAL): like persistent but with an automatically appended sequence number.
Ephemeral (EPHEMERAL): tied to the client session and removed when the session ends.
Ephemeral Sequential (EPHEMERAL_SEQUENTIAL): combines EPHEMERAL with a sequence number.
ZNodes also store metadata such as timestamps, transaction IDs, ACLs, and version numbers, and all operations are ordered and atomic.
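The version number in a ZNode's metadata is what makes conditional updates possible: setData and delete accept an expected version and fail with a BadVersion error if it no longer matches, while -1 skips the check. The idea can be sketched in plain Java (this is a toy model, not the Zookeeper API):

```java
// A toy znode illustrating version-checked (optimistic) updates,
// mirroring the semantics of ZooKeeper's setData(path, data, expectedVersion).
class VersionedNode {
    private byte[] data;
    private int version = 0;

    synchronized int setData(byte[] newData, int expectedVersion) {
        // -1 means "any version", like ZooKeeper's unconditional update.
        if (expectedVersion != -1 && expectedVersion != version) {
            throw new IllegalStateException("BadVersion: expected "
                    + expectedVersion + " but was " + version);
        }
        data = newData;
        return ++version;  // each successful write bumps the data version
    }

    synchronized int getVersion() {
        return version;
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        int v = node.setData("config-v1".getBytes(), 0);  // succeeds: version 0 -> 1
        System.out.println("version after write: " + v);
    }
}
```

A client that reads a node at version N and writes back with expectedVersion = N will be rejected if anyone else modified the node in between, which is the basis of safe read-modify-write cycles in Zookeeper.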
3. Watcher Mechanism
The Watcher (listener) mechanism allows clients to register for notifications of ZNode events such as creation, deletion, or data changes. When an event occurs, Zookeeper asynchronously notifies the registered watchers; note that a watch is a one-time trigger and must be re-registered after it fires. This enables use cases like distributed publish/subscribe, locks, and configuration updates.
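To make the one-time-trigger semantics concrete, here is a small, self-contained sketch (a simulation, not the Zookeeper API) of a watch registry in which each registered callback fires at most once per registration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// A toy registry mimicking ZooKeeper's one-time-trigger watches:
// a callback registered for a path fires once, then is discarded.
class WatchRegistry {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    // Called when an event (create/delete/dataChanged) happens on path.
    void fire(String path, String event) {
        List<Consumer<String>> pending = watches.remove(path);  // one-shot
        if (pending != null) {
            pending.forEach(cb -> cb.accept(event));
        }
    }

    public static void main(String[] args) {
        WatchRegistry registry = new WatchRegistry();
        registry.watch("/app/config", event -> System.out.println("got: " + event));
        registry.fire("/app/config", "NodeDataChanged");  // callback fires once
        registry.fire("/app/config", "NodeDataChanged");  // nothing: watch consumed
    }
}
```

A real Zookeeper client that wants continuous notifications must re-register the watch inside its process() callback, typically by calling exists/getData/getChildren again with the watch flag set.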
4. Code Demonstration
Dependency (Maven):
<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.4.5</version>
</dependency>

Creating a persistent ZNode:
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;
import org.junit.Test;
import java.io.IOException;

private final String ZK_ADDRS = "server01:2181,server02:2181,server03:2181";
private final int SESSION_TIMEOUT = 5000;
private String znodePath = "/my_node";

@Test
public void createZNode() throws IOException, KeeperException, InterruptedException {
    // Connect with an empty watcher; production code should wait for SyncConnected.
    ZooKeeper zkClient = new ZooKeeper(ZK_ADDRS, SESSION_TIMEOUT, watchedEvent -> {});
    // Create the node only if it does not already exist.
    Stat exist = zkClient.exists(znodePath, false);
    if (exist == null) {
        zkClient.create(znodePath, "123".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
    zkClient.close();
}

Registering a watcher and deleting the node:
@Test
public void testWatcher() throws IOException, KeeperException, InterruptedException {
    ZooKeeper zkClient = new ZooKeeper(ZK_ADDRS, SESSION_TIMEOUT, new Watcher() {
        @Override
        public void process(WatchedEvent watchedEvent) {
            if (watchedEvent.getType() == Event.EventType.NodeDeleted
                    && znodePath.equals(watchedEvent.getPath())) {
                log.info(String.format("Notice: ZNode '%s' is deleted!", znodePath));
            }
        }
    });
    // Passing true registers the default watcher on this path (one-time trigger).
    Stat exist = zkClient.exists(znodePath, true);
    if (exist != null) {
        zkClient.delete(znodePath, -1);  // -1 skips the version check
    }
    zkClient.close();
}

5. Typical Application Scenarios
Configuration Management / Pub-Sub: Store configuration in a ZNode; clients read it at startup and watch it for changes so they can update dynamically.
Load Balancing: Services register their IP:Port under a root ZNode; clients retrieve the list and apply a load-balancing algorithm.
Naming Service: Similar to JNDI; store service names and addresses, or generate globally unique IDs using sequential nodes.
Distributed Coordination / Notification: Multiple clients watch a node; any change triggers notifications (e.g., master-slave heartbeats, message broadcasting).
Cluster Management: Track active nodes and their status, and handle nodes going online or offline.
Master Election: Leverage Zookeeper's strong consistency to ensure only one client can create a designated election node.
Distributed Lock: Create a lock ZNode; only the client that successfully creates it holds the lock, while the others wait.
Distributed Queue: Use sequential or ephemeral nodes to implement FIFO queues or barrier synchronization.
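The lock and queue recipes both rest on the same step: each client creates an ephemeral sequential child under a parent node, then holds the lock (or is at the head of the queue) iff its sequence number is the smallest; otherwise it watches the next-lower child, which avoids a "herd effect" where every waiter wakes on each release. That decision step can be sketched in plain Java, independent of a live ensemble (node names here are illustrative):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class LockDecision {
    // Extract the 10-digit sequence suffix ZooKeeper appends to
    // *_SEQUENTIAL node names, e.g. "lock-0000000007" -> 7.
    static int seq(String name) {
        return Integer.parseInt(name.substring(name.length() - 10));
    }

    // Given all children of the lock node and our own child name,
    // return empty if we hold the lock, otherwise the name of the
    // next-lower child we should set a watch on.
    static Optional<String> nodeToWatch(List<String> children, String mine) {
        List<String> sorted = children.stream()
                .sorted(Comparator.comparingInt(LockDecision::seq))
                .collect(Collectors.toList());
        int idx = sorted.indexOf(mine);
        return idx == 0 ? Optional.empty() : Optional.of(sorted.get(idx - 1));
    }

    public static void main(String[] args) {
        // getChildren() gives no ordering guarantee, so we sort by sequence.
        List<String> children = List.of(
                "lock-0000000002", "lock-0000000001", "lock-0000000003");
        System.out.println(nodeToWatch(children, "lock-0000000001"));  // holds the lock
        System.out.println(nodeToWatch(children, "lock-0000000003"));  // waits on ...0002
    }
}
```

In a real implementation, when the watched node disappears (its owner released the lock or its session died, deleting the ephemeral node), the client re-reads the children and repeats this decision.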
6. Summary
The article introduced Zookeeper’s cluster architecture, ZNode data model, watcher mechanism, provided Java code examples, and described eight common use cases. Zookeeper is widely adopted in big‑data components such as HDFS, HBase, and Kafka, making it an essential tool for building reliable distributed systems.
Big Data Technology Architecture
Exploring Open Source Big Data and AI Technologies