Kafka Components and Architecture Overview
This article gives an overview of Kafka's core components: brokers, topics, partitions, producers, and consumers. It explains the role of each, how they relate to one another, and how partitions are replicated between leaders and followers, with notes on the configuration and operational points that matter when building reliable distributed streaming systems.
1. Broker
Each Kafka server is called a Broker; multiple Brokers form a Kafka Cluster. A machine can host one or more Brokers. In ZooKeeper-based deployments, all Brokers that connect to the same ZooKeeper ensemble form one cluster.
2. Topic
Kafka is a publish‑subscribe messaging system. A Topic represents a category of messages; each Topic can have multiple subscribers (consumers). Producers push messages to a Topic, and consumers pull them from the Topic.
3. Topic and Broker Relationship
A Broker can host one or more Topics, and the same Topic can be distributed across multiple Brokers within a cluster.
4. Partition Log
Each Topic is divided into one or more partitions, and each partition corresponds to a logical log that is stored on disk as a sequence of segment files. When a message is published to a partition, the Broker appends it to the end of the last (active) segment.
Segments are flushed to disk after a configurable number of messages has been written or a configurable amount of time has elapsed.
Each partition is an ordered, immutable sequence of records. Each record is assigned a sequential offset that identifies it uniquely within its partition.
Brokers retain all published records, whether or not they have been consumed, for a configurable retention period (e.g., 2 days), after which they are discarded to free space.
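The log behavior described in this section can be modeled in a few lines. The following is a minimal sketch, not Kafka's storage engine: the `PartitionLog` class and its method names are hypothetical, timestamps are plain numbers in seconds, and retention is applied per record, whereas a real broker deletes whole segment files.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy append-only log for one partition (hypothetical, for illustration only)."""

    retention_seconds: float
    next_offset: int = 0
    records: list = field(default_factory=list)  # tuples of (offset, timestamp, value)

    def append(self, value, now):
        """Append a record at the end of the log and return its offset."""
        offset = self.next_offset
        self.records.append((offset, now, value))
        self.next_offset += 1
        return offset

    def read(self, from_offset):
        """Return the values of all retained records at or after from_offset."""
        return [value for offset, _, value in self.records if offset >= from_offset]

    def enforce_retention(self, now):
        """Drop records older than the retention period; offsets are never reused."""
        cutoff = now - self.retention_seconds
        self.records = [record for record in self.records if record[1] >= cutoff]


log = PartitionLog(retention_seconds=2 * 24 * 3600)  # 2 days, as in the example above
log.append("a", now=0)
log.append("b", now=100_000)
log.append("c", now=200_000)
log.enforce_retention(now=250_000)  # "a" is now older than 2 days and is discarded
```

Note that discarding old records does not renumber the remaining ones: a consumer that stored offset 1 still finds the same record there.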
5. Partition Distribution
Partitions are distributed across multiple Brokers and replicated for fault tolerance. Each partition has one leader, which handles read and write requests, and zero or more followers, which replicate the leader's log by fetching from it. If the leader fails, one of the followers takes over as the new leader.
Example: the leader of Partition 1 is broker1, with broker2 and broker3 as followers; the leader of Partition 2 is broker2, with broker1 and broker4 as followers; and so on.
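To make the leader/follower layout concrete, here is a minimal sketch of one way to place replicas. It is a simplified round-robin scheme assumed for illustration, not Kafka's actual assignment algorithm; the function name `assign_replicas` is hypothetical, and partition ids start at 0 as they do in Kafka.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's replicas on consecutive brokers (simplified scheme).

    The first replica of a partition is treated as its leader and the
    remaining replicas as its followers.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        replicas = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        assignment[partition] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


layout = assign_replicas(
    num_partitions=4,
    brokers=["broker1", "broker2", "broker3", "broker4"],
    replication_factor=3,
)
for partition, roles in layout.items():
    print(partition, roles["leader"], roles["followers"])
```

Rotating the starting broker per partition spreads leadership evenly, so no single broker serves all reads and writes.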
6. Producer
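Producers create messages and send them to a Topic. The producer decides which partition each message goes to: it can name the partition explicitly, derive it from the message key (typically by hashing, so that messages with the same key land in the same partition), spread keyless messages across partitions, or apply a custom partitioning algorithm.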
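A partitioner is the piece of producer logic that maps a message to a partition. The sketch below is hypothetical and only illustrates the idea: it hashes the key when one is present and rotates through partitions otherwise. `SimplePartitioner` is an invented name, and `zlib.crc32` is a stand-in hash chosen for this example, not the hash function Kafka's own client uses.

```python
import itertools
import zlib


class SimplePartitioner:
    """Toy partitioner: key hash when a key is given, round-robin otherwise."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.count()

    def partition(self, key=None):
        """Return the partition index for a message with the given key."""
        if key is None:
            # No key: rotate through partitions to spread the load.
            return next(self._round_robin) % self.num_partitions
        # Same key always hashes to the same partition, preserving per-key order.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SimplePartitioner(num_partitions=3)
first = partitioner.partition("user-42")
second = partitioner.partition("user-42")
print(first == second)  # messages with the same key share a partition
```

Because ordering is only guaranteed within a partition, keying messages by an entity id is the usual way to keep all events for that entity in order.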
7. Consumer
Consumers belong to consumer groups; each group can span multiple processes on different machines.
Each message can be consumed by multiple consumer groups, but only one consumer within a group processes it.
Consumers can subscribe to multiple topics.
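To track consumption progress, each group commits the offsets it has consumed. Before version 0.9, consumers stored them in ZooKeeper; from 0.9 on, they are stored in Kafka itself, in the internal `__consumer_offsets` topic.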
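The group semantics above come down to two rules: within a group, each partition is read by exactly one consumer, and each group tracks its own offsets. The following minimal sketch models both. The names `assign_partitions`, `commit`, and `committed`, the topic name `orders`, and the group names are all hypothetical, and the round-robin split is a simplification; Kafka ships several assignment strategies that behave differently.

```python
def assign_partitions(partitions, consumers):
    """Split partitions among the consumers of ONE group, one owner per partition."""
    if not consumers:
        raise ValueError("a group needs at least one consumer")
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


# Committed offsets keyed by (group, topic, partition); the value is the
# next offset that group will read from that partition.
committed = {}


def commit(group, topic, partition, offset):
    """Record a group's progress on one partition."""
    committed[(group, topic, partition)] = offset


# Two groups read the same four partitions independently.
billing = assign_partitions([0, 1, 2, 3], ["c1", "c2"])
audit = assign_partitions([0, 1, 2, 3], ["a1"])

commit("billing", "orders", 0, 120)
commit("audit", "orders", 0, 45)  # does not affect the billing group's position
```

Adding consumers to a group beyond the number of partitions leaves the extra consumers idle in this model, which is why the partition count bounds a group's parallelism.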
Architecture Diagram
The diagram illustrates the roles of brokers, topics, partitions, producers, and consumers, showing leader/follower relationships and replication.
Since version 0.9, consumers no longer communicate directly with ZooKeeper; they coordinate through the Brokers instead, and the architecture diagram has been updated accordingly.