Understanding Kafka: Core Concepts, Architecture, and Reliability Explained
This article provides a comprehensive overview of Kafka. It covers the overall architecture and key components (brokers, producers, consumers, topics, partitions, replicas, and ZooKeeper), the logical and physical storage models, producer and consumer workflows with their main configuration parameters, partition assignment strategies and rebalancing, and the replication model that underpins Kafka's reliability guarantees.
Kafka Overall Architecture
Kafka decouples systems, smooths traffic spikes, and enables asynchronous communication, making it ideal for activity tracking, messaging, metrics, logging, and stream processing.
Key Components
Broker: A Kafka instance; multiple brokers form a cluster.
Producer: Writes messages to brokers.
Consumer: Reads messages from brokers.
Consumer Group: One or more consumers that jointly consume a topic; each partition is read by at most one consumer in the group.
ZooKeeper: Manages cluster metadata and controller election.
Topic: Logical categorization of messages.
Partition: Subdivision of a topic for scalability and fault tolerance.
Replica: Copy of a partition for durability.
Leader and Follower: The leader replica handles reads and writes; followers replicate from the leader.
Offset: Unique position of a message within a partition.
Logical Storage Model
Kafka stores data as an append‑only log, improving write performance. Each partition consists of multiple log segments, enabling efficient cleanup. Offsets guarantee ordering within a partition but not across partitions.
Writing Data (Producer Flow)
The producer workflow includes creating a record, applying interceptors, serialization, partition selection, batching in the RecordAccumulator, sending requests, handling back-pressure, and cleaning up resources. Important parameters include buffer.memory, batch.size, linger.ms, and max.block.ms.
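A minimal sketch of how these four knobs might be set. The property keys are the standard Kafka client configuration names; the broker address and the values themselves are illustrative placeholders, not recommendations.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        // Total memory the producer may use to buffer unsent records (bytes).
        props.put("buffer.memory", "33554432");           // 32 MB
        // Upper bound on one batch per partition (bytes); full batches are sent early.
        props.put("batch.size", "16384");                 // 16 KB
        // How long to wait for more records before sending a partial batch (ms).
        props.put("linger.ms", "5");
        // How long send() may block on a full buffer or missing metadata (ms).
        props.put("max.block.ms", "60000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("batch.size"));
    }
}
```

Larger batch.size and linger.ms trade latency for throughput; buffer.memory bounds how far the producer can run ahead of the broker before max.block.ms back-pressure kicks in.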
Send Modes
Fire‑and‑forget (lowest latency, lowest reliability).
Sync (wait for broker acknowledgment, highest reliability).
Async (callback after acknowledgment).
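The three modes differ only in what the caller does with the acknowledgment. The sketch below uses a CompletableFuture as a stand-in for the broker ack (in the real client, KafkaProducer.send() returns a Future and accepts a callback); the send method and the "offset" it returns are fabricated for illustration.

```java
import java.util.concurrent.CompletableFuture;

public class SendModes {
    // Stand-in for a broker acknowledgment carrying the assigned offset.
    static CompletableFuture<Long> send(String value) {
        return CompletableFuture.supplyAsync(() -> (long) value.length()); // fake offset
    }

    public static void main(String[] args) {
        // 1. Fire-and-forget: send and ignore the result; errors go unnoticed.
        send("a");

        // 2. Sync: block until the ack arrives; errors surface immediately.
        long offset = send("bb").join();

        // 3. Async: register a callback invoked once the ack (or error) arrives.
        send("ccc").whenComplete((off, err) -> {
            if (err != null) { /* handle failure */ }
        });
        System.out.println(offset);
    }
}
```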
Acknowledgment Settings (acks)
acks=1 (default): Leader acknowledgment.
acks=0 : No acknowledgment, possible data loss.
acks=all or -1 : All in‑sync replicas must acknowledge.
Reading Data (Consumer Flow)
Consumers use a pull‑based model, repeatedly invoking poll() to fetch records.
```java
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        // process record
    }
}
```
Offset Commit
After processing, consumers commit the offset of the next record to read (e.g., commit 9528 after processing up to 9527). With automatic commits, a crash between processing and committing causes those records to be reprocessed (duplicates), while committing before processing finishes can cause data loss.
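The off-by-one convention is easy to get wrong, so here it is as a one-liner (a sketch of the convention, not the client API):

```java
public class OffsetCommit {
    // Kafka convention: the committed offset is the offset of the NEXT record
    // to read, i.e. lastProcessedOffset + 1 -- not the last processed offset.
    static long offsetToCommit(long lastProcessedOffset) {
        return lastProcessedOffset + 1;
    }

    public static void main(String[] args) {
        // After processing records up to offset 9527, commit 9528.
        System.out.println(offsetToCommit(9527)); // prints 9528
    }
}
```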
Partition Assignment Strategies
Range: Assigns contiguous partitions per consumer.
RoundRobin: Distributes partitions evenly in a round‑robin fashion.
Sticky: Tries to keep previous assignments while balancing load.
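The Range strategy is the easiest to reason about: sort the consumers, then hand each one a contiguous block of partitions, with the first few consumers absorbing the remainder. The sketch below is a simplified single-topic version of that idea, not the actual RangeAssignor implementation.

```java
import java.util.*;

public class RangeAssignSketch {
    // Range strategy sketch: sorted consumers each get a contiguous block;
    // the first (numPartitions % numConsumers) consumers get one extra partition.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int per = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = per + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) parts.add(next++);
            result.put(sorted.get(i), parts);
        }
        return result;
    }

    public static void main(String[] args) {
        // 7 partitions over 3 consumers -> c1:[0,1,2], c2:[3,4], c3:[5,6]
        System.out.println(assign(List.of("c1", "c2", "c3"), 7));
    }
}
```

Note the skew: with many topics, the same first consumers keep getting the extra partition per topic, which is the imbalance RoundRobin and Sticky address.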
Rebalancing
Triggered by consumer joins/leaves, group coordinator changes, or topic/partition count changes. Steps: FindCoordinator → JoinGroup → SyncGroup → Heartbeat.
Physical Storage
Log Files and Segments
Data is stored in append‑only log files split into segments (segments roll at a configurable size, 1 GB by default). Retention policies include time‑based deletion (default 7 days) and size‑based deletion (disabled by default). Compaction retains only the latest value for each key.
Indexes
Kafka maintains a sparse offset index and a timestamp index to locate messages quickly without scanning entire logs.
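Because the index is sparse (only every Nth message gets an entry), a lookup finds the greatest indexed offset at or below the target via binary search, then scans the log forward from that byte position. A sketch of the search step, with a fabricated in-memory index:

```java
import java.util.Arrays;

public class SparseIndexLookup {
    // Sketch: indexedOffsets holds the message offsets that have index entries
    // (each entry also stores a byte position in the segment file, omitted here).
    // Returns the entry whose offset is the greatest one <= target, or -1 if
    // the target precedes the first entry.
    static int floorEntry(long[] indexedOffsets, long target) {
        int pos = Arrays.binarySearch(indexedOffsets, target);
        if (pos >= 0) return pos;   // exact hit
        return -pos - 2;            // (insertion point) - 1 = floor entry
    }

    public static void main(String[] args) {
        long[] indexed = {0, 100, 200, 300};          // one entry per ~100 messages
        System.out.println(floorEntry(indexed, 250)); // prints 2 (entry for offset 200)
    }
}
```

The sparseness is the trade-off: a few kilobytes of index per segment in exchange for a short linear scan after each lookup.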
Zero‑Copy Transfer
Kafka uses zero‑copy to move data from disk to network directly in kernel space, reducing CPU overhead and latency.
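On the JVM this is exposed as FileChannel.transferTo, which can delegate to the kernel's sendfile() so bytes never pass through a user-space buffer. The demo below transfers between two temp files to stay self-contained; Kafka would transfer from a segment file to a socket channel instead.

```java
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copy a file via transferTo: no intermediate byte[] in user space.
    static String transferToCopy(String content) throws Exception {
        Path src = Files.createTempFile("segment", ".log");
        Path dst = Files.createTempFile("out", ".bin");
        Files.writeString(src, content);
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            in.transferTo(0, in.size(), out);
        }
        return Files.readString(dst);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transferToCopy("kafka zero-copy")); // prints "kafka zero-copy"
    }
}
```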
Reliability and Replication
Each partition has a set of replicas (AR). The in‑sync replica set (ISR) contains replicas that have caught up to the leader. Out‑of‑sync replicas (OSR) are lagging. The leader’s Log End Offset (LEO) marks the next write position; the High Watermark (HW) is the smallest LEO among ISR replicas, indicating the offset up to which all ISR replicas have persisted data and can be safely consumed.
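Since the HW is the minimum LEO across the ISR, it can be sketched as a one-line reduction (the method and the sample LEO values are illustrative, not broker internals):

```java
import java.util.List;

public class HighWatermark {
    // HW = min LEO over the in-sync replicas: every offset strictly below it
    // is persisted on the entire ISR and therefore safe to expose to consumers.
    static long highWatermark(List<Long> isrLogEndOffsets) {
        return isrLogEndOffsets.stream().mapToLong(Long::longValue).min().orElse(0L);
    }

    public static void main(String[] args) {
        // Leader LEO = 12, followers at 10 and 11 -> HW = 10,
        // so consumers may read offsets 0..9 only.
        System.out.println(highWatermark(List.of(12L, 10L, 11L))); // prints 10
    }
}
```

This is why a slow ISR follower holds back consumers: the HW cannot advance past the laggard's LEO until it catches up or drops out of the ISR.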
Leader Epoch
Leader epoch is a monotonically increasing version number for the leader. Followers include the epoch in sync requests, preventing log truncation after leader changes and avoiding data loss.
This concise guide introduces Kafka’s essential concepts, architecture, storage layers, producer and consumer mechanics, configuration knobs, and reliability guarantees, providing a solid foundation for deeper exploration.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!