
Understanding Zookeeper: Cluster Architecture, Data Model, Watcher Mechanism, and Common Use Cases

This article explains Zookeeper, a high-performance distributed coordination system. It covers the cluster roles, the ZNode data model, and the watcher mechanism, provides Java code demos for creating and monitoring nodes, and outlines eight typical application scenarios, including configuration management, load balancing, naming services, master election, and distributed locks.

Big Data Technology Architecture

1. Zookeeper Cluster Roles

A Zookeeper cluster consists of three types of nodes. The Leader handles all write requests and internal scheduling. A Follower serves read requests, forwards writes to the Leader, and votes in elections. An Observer, introduced in version 3.3.0, also serves reads and forwards writes to the Leader but does not vote, so Observers add read capacity without enlarging the voting quorum.

Configuring an Observer takes two steps: add peerType=observer to the Observer's own configuration file, and append :observer to that server's definition line in every node's configuration file, e.g., server.1=localhost:2181:3181:observer.

2. Zookeeper Data Model

Zookeeper stores data in a hierarchical ZNode tree, similar to a Unix file system but without separate file/directory concepts. Each ZNode can hold data and have child nodes. There are four ZNode types:

Persistent (PERSISTENT): remains after client disconnects until explicitly deleted.

Persistent Sequential (PERSISTENT_SEQUENTIAL): like persistent but with an automatically appended sequence number.

Ephemeral (EPHEMERAL): tied to the client session and removed when the session ends.

Ephemeral Sequential (EPHEMERAL_SEQUENTIAL): combines EPHEMERAL with a sequence number.

ZNodes also store metadata such as timestamps, transaction IDs, ACLs, and version numbers, and all operations are ordered and atomic.
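
To make the sequence suffix of the two sequential node types concrete: the server appends a 10-digit, zero-padded counter to the path the client requested. The following is a minimal sketch in plain Java that only mimics this naming scheme. The class SequentialNameSketch, its methods, and the /tasks/task- path are hypothetical names chosen for illustration and are not part of the Zookeeper client API.

```java
import java.util.Locale;

// Illustrative helper only; not part of the Zookeeper client API.
final class SequentialNameSketch {

    // Mimics how a sequential znode is named: the requested path
    // followed by a 10-digit, zero-padded counter.
    static String sequentialName(String requestedPath, int counter) {
        return requestedPath + String.format(Locale.ROOT, "%010d", counter);
    }

    // Recovers the counter from a name produced by sequentialName().
    static int sequenceOf(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.length() - 10));
    }

    public static void main(String[] args) {
        String name = sequentialName("/tasks/task-", 7);
        System.out.println(name + " -> " + sequenceOf(name));
    }
}
```

Because the suffix has a fixed width, names created under the same parent with the same prefix can be ordered by creation simply by comparing their suffixes.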

3. Watcher Mechanism

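The Watcher (listener) mechanism allows clients to register for notifications on ZNode events such as creation, deletion, data changes, or changes to a node's children. When an event occurs, Zookeeper asynchronously notifies the clients that registered a watch on that node. A standard watch fires only once, so a client that wants further notifications must register it again. This mechanism enables use cases like distributed publish/subscribe, locks, and configuration updates.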
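
In the 3.4.x client line used later in this article, a standard watch is a one-time trigger: after it fires, the client has to register again to see the next change. The sketch below is a small in-memory model of that behavior in plain Java, intended only to illustrate the semantics. It does not talk to a Zookeeper server, and the class OneShotWatchSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory model of one-shot watches; all names here are hypothetical.
final class OneShotWatchSketch {

    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Registers a callback that fires at most once for the given path.
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    // Delivers an event to the watchers of a path and clears them, so each
    // callback must be registered again to see the next change.
    // Returns the number of callbacks that were notified.
    int fire(String path, String eventType) {
        List<Consumer<String>> callbacks = watches.remove(path);
        if (callbacks == null) {
            return 0;
        }
        callbacks.forEach(cb -> cb.accept(eventType));
        return callbacks.size();
    }

    public static void main(String[] args) {
        OneShotWatchSketch registry = new OneShotWatchSketch();
        registry.watch("/my_node", type -> System.out.println("event: " + type));
        System.out.println(registry.fire("/my_node", "NodeDataChanged")); // one watcher notified
        System.out.println(registry.fire("/my_node", "NodeDataChanged")); // no watcher left
    }
}
```

With the real client, the usual equivalent is to call exists, getData, or getChildren with a watch again from inside the watcher callback.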

4. Code Demonstration

Dependency (Maven):

<dependency>
   <groupId>org.apache.zookeeper</groupId>
   <artifactId>zookeeper</artifactId>
   <version>3.4.5</version>
</dependency>

Creating a persistent ZNode:

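import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;
import org.junit.Test;

private final String ZK_ADDRS = "server01:2181,server02:2181,server03:2181";
private final int SESSION_TIMEOUT = 5000; // milliseconds
private String znodePath = "/my_node";

@Test
public void createZNode() throws IOException, KeeperException, InterruptedException {
    // The default watcher passed here ignores all events.
    ZooKeeper zkClient = new ZooKeeper(ZK_ADDRS, SESSION_TIMEOUT, watchedEvent -> {});
    // exists() returns null when the node is absent; false means no watch is set.
    Stat exist = zkClient.exists(znodePath, false);
    if (exist == null) {
        zkClient.create(znodePath, "123".getBytes(StandardCharsets.UTF_8), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
    zkClient.close();
}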

Registering a watcher and deleting the node:

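import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

@Test
public void testWatcher() throws IOException, KeeperException, InterruptedException {
    // The watcher passed to the constructor becomes the session's default watcher.
    ZooKeeper zkClient = new ZooKeeper(ZK_ADDRS, SESSION_TIMEOUT, new Watcher() {
        @Override
        public void process(WatchedEvent watchedEvent) {
            // Connection events carry a null path, so compare from the non-null side.
            if (watchedEvent.getType() == Event.EventType.NodeDeleted && znodePath.equals(watchedEvent.getPath())) {
                // log is assumed to be a logger field of the test class.
                log.info(String.format("Note: ZNode '%s' was deleted!", znodePath));
            }
        }
    });
    // Passing true registers the default watcher on this node.
    Stat exist = zkClient.exists(znodePath, true);
    if (exist != null) {
        // Version -1 matches any version of the node.
        zkClient.delete(znodePath, -1);
    }
    zkClient.close();
}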

5. Typical Application Scenarios

Configuration Management / Pub‑Sub : Store configuration in a ZNode; clients read it at startup and watch for changes to update dynamically.

Load Balancing : Services register their IP:Port under a root ZNode; clients retrieve the list and apply a load‑balancing algorithm.

Naming Service : Similar to JNDI; store service names and addresses, or generate globally unique IDs using sequential nodes.

Distributed Coordination / Notification : Multiple clients watch a node; any change triggers notifications (e.g., master‑slave heartbeat, message broadcasting).

Cluster Management : Track active nodes, their status, and perform online/offline operations.

Master Election : Leverage Zookeeper’s strong consistency to ensure only one client can create a designated election node.

Distributed Lock : Create a lock ZNode; only the client that successfully creates it holds the lock, others wait.

Distributed Queue : Use sequential or ephemeral nodes to implement FIFO queues or barrier synchronization.
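
As an illustration of the FIFO queue scenario: consumers typically list the children of the queue node and process the one with the smallest sequence number first. The list of children is not guaranteed to come back sorted, so the client has to order it. The following plain-Java sketch shows only that ordering step, assuming every child was created as a sequential node and therefore ends in a 10-digit counter. The class FifoQueueSketch, its methods, and the item- prefix are hypothetical names, and fetching the children from a server is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative helper; the class name and the "item-" prefix are assumptions.
final class FifoQueueSketch {

    // The sequence number is the last 10 characters of a sequential znode name.
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.length() - 10));
    }

    // Returns the child names ordered oldest-first, i.e. in FIFO order.
    static List<String> fifoOrder(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(FifoQueueSketch::sequenceOf));
        return sorted;
    }

    // The head of the queue is the child with the smallest sequence number.
    static Optional<String> head(List<String> children) {
        return children.stream().min(Comparator.comparingInt(FifoQueueSketch::sequenceOf));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "item-0000000012", "item-0000000003", "item-0000000007");
        System.out.println(fifoOrder(children));
        System.out.println(head(children).orElse("queue is empty"));
    }
}
```

A real consumer would then read and delete the head node, and retry with the next child if another consumer removed it first.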

6. Summary

This article introduced Zookeeper's cluster architecture, ZNode data model, and watcher mechanism, provided Java code examples, and described eight common use cases. Zookeeper is widely adopted in big-data components such as HDFS, HBase, and Kafka, making it an essential tool for building reliable distributed systems.
