Comprehensive Overview of Apache Kafka Architecture and Core Concepts
This article provides an in‑depth introduction to Apache Kafka, covering its distributed streaming platform fundamentals, message‑queue models, topic and partition design, broker and cluster roles, producer partitioning logic, reliability guarantees, consumer group assignors, offset management, and performance optimizations such as sequential disk writes and zero‑copy techniques.
1. Introduction
Apache Kafka is a distributed streaming platform built on a publish/subscribe message‑queue model. It offers three key capabilities: publish/subscribe of stream records, durable storage with fault tolerance, and real‑time processing of records as they are produced.
1.1 Message‑Queue Modes
Point‑to‑Point: Producers send messages to a queue; a single consumer reads and removes each message, so each message is consumed exactly once.
Publish/Subscribe: Producers publish to a topic; multiple consumers can subscribe, and each subscriber receives a copy of every message.
1.2 Suitable Scenarios
Building real‑time data pipelines (functionally similar to a message queue).
Developing real‑time stream processing applications that transform or act on incoming data.
1.3 Topics and Partitions
Topics categorize streams like database tables; each topic is split into partitions, which are ordered logs. Messages are appended to partitions, guaranteeing order only within a single partition.
1.4 Brokers and Clusters
A broker is an individual Kafka server that receives messages from producers, assigns offsets to them, and persists them to disk. Multiple brokers coordinated through a ZooKeeper ensemble form a Kafka cluster; for each partition, one broker holds the leader replica and handles all reads and writes for that partition.
1.5 Broker Request Handling
Each broker runs an Acceptor thread per listening port; the Acceptor hands new connections to a configurable pool of Processor threads, which place incoming requests on a shared request queue and send completed responses back to clients.
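This acceptor/processor split can be sketched with standard Java concurrency primitives. The sketch below is illustrative only: `RequestPipeline`, its methods, and the string "requests" stand in for Kafka's internal network classes and request objects.

```java
import java.util.List;
import java.util.concurrent.*;

// Illustrative sketch of the acceptor/processor pattern: one entry point
// enqueues requests, a configurable pool of processor threads drains the
// queue and produces responses. Names do not match Kafka's internals.
public class RequestPipeline {
    private final BlockingQueue<String> requestQueue = new LinkedBlockingQueue<>();
    private final List<String> responses = new CopyOnWriteArrayList<>();
    private final ExecutorService processors;

    public RequestPipeline(int numProcessors) {
        processors = Executors.newFixedThreadPool(numProcessors);
        for (int i = 0; i < numProcessors; i++) {
            processors.submit(() -> {
                try {
                    while (true) {
                        String req = requestQueue.take();   // pull a queued request
                        responses.add("handled:" + req);    // produce a response
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();     // shutdown signal
                }
            });
        }
    }

    // The "acceptor" hands incoming work to the processor pool via the queue.
    public void accept(String request) { requestQueue.add(request); }

    // Poll until n responses are ready or the timeout elapses.
    public boolean awaitResponses(int n, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (responses.size() < n && System.currentTimeMillis() < deadline) {
            try { Thread.sleep(10); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); return false; }
        }
        return responses.size() >= n;
    }

    public List<String> responses() { return responses; }

    public void shutdown() { processors.shutdownNow(); }
}
```

The real broker also separates network threads from I/O (request handler) threads; this sketch collapses them into one pool to keep the queue hand-off visible.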
2. Producer Details
2.1 Partition Strategy
The producer decides which partition each record is written to. The default partitioner's Java implementation is:

```java
public int partition(String topic, Object key, byte[] keyBytes,
                     Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        // No key: rotate across the available partitions (round-robin)
        int nextValue = nextValue(topic);
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            return Utils.toPositive(nextValue) % numPartitions;
        }
    } else {
        // Key present: hash it so the same key always maps to the same partition
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}

private int nextValue(String topic) {
    AtomicInteger counter = topicCounterMap.get(topic);
    if (counter == null) {
        counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());
        AtomicInteger current = topicCounterMap.putIfAbsent(topic, counter);
        if (current != null) {
            counter = current;
        }
    }
    return counter.getAndIncrement();
}
```

This logic ensures deterministic, hash‑based partitioning when a key is present and round‑robin distribution otherwise.
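To see the effect of this rule in isolation, here is a standalone sketch. `PartitionChooser` is an illustrative class, not Kafka's API, and it uses `String.hashCode` in place of Kafka's murmur2 hash for brevity; the keyed-vs-keyless branching mirrors the logic above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch of the partitioning rule: keyed records hash to a
// stable partition; keyless records rotate round-robin.
public class PartitionChooser {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger(0);

    public PartitionChooser(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    public int choose(String key) {
        if (key == null) {
            // Round-robin for records without a key
            return toPositive(counter.getAndIncrement()) % numPartitions;
        }
        // Deterministic: the same key always maps to the same partition
        return toPositive(key.hashCode()) % numPartitions;
    }

    // Mask off the sign bit, as Kafka's Utils.toPositive does
    private static int toPositive(int n) { return n & 0x7fffffff; }
}
```

Stable keyed partitioning is what makes per-key ordering possible: all records for one key land in one partition, and order within a partition is guaranteed.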
2.2 Reliability Guarantees
Message order is preserved within a partition.
A record is considered committed only after all in‑sync replicas have written it.
As long as at least one in‑sync replica remains alive, committed data is not lost.
Consumers read only committed records.
Replication is managed by an in‑sync replica set (ISR); leaders acknowledge writes only after ISR members have replicated the data.
2.3 Acknowledgment (acks) Levels
0: The producer does not wait for any broker response; lowest latency, but messages can be lost.
1: The producer waits for the leader's acknowledgment only; data can be lost if the leader fails before followers replicate it.
-1 (all): The producer waits for acknowledgment from all in‑sync replicas; strongest durability, though retries can introduce duplicates.
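In the Java client, the level is chosen through producer configuration. A minimal sketch building such a configuration follows; `acks`, `retries`, and `bootstrap.servers` are standard producer config keys, while the broker addresses and the `AcksConfig` helper are placeholders for illustration.

```java
import java.util.Properties;

// Builds a producer configuration with the strongest acknowledgment level.
public class AcksConfig {
    public static Properties strongDurability() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // placeholder addresses
        props.put("acks", "all");    // wait for every in-sync replica to acknowledge
        props.put("retries", "3");   // retry transient send failures
        return props;
    }
}
```

This `Properties` object would be passed to a `KafkaProducer` constructor; trading `"all"` for `"1"` or `"0"` lowers latency at the cost of the durability guarantees described above.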
2.4 Message Sending Flow
Producers use an asynchronous model with a main thread feeding a RecordAccumulator and a sender thread pulling batches to the broker. Batching improves throughput, and optional compression reduces network and storage overhead.
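The accumulate-then-drain idea can be sketched as follows. This toy `Accumulator` is illustrative only: it batches by count and ignores linger time, per-partition queues, and compression, all of which the real RecordAccumulator handles.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the accumulator idea: the main thread appends records, and a
// batch becomes ready for the sender once it reaches batchSize records.
public class Accumulator {
    private final int batchSize;
    private final List<String> current = new ArrayList<>();
    private final List<List<String>> readyBatches = new ArrayList<>();

    public Accumulator(int batchSize) {
        this.batchSize = batchSize;
    }

    // Called by the "main" thread for each record
    public void append(String record) {
        current.add(record);
        if (current.size() >= batchSize) {
            readyBatches.add(new ArrayList<>(current)); // seal the batch
            current.clear();
        }
    }

    // Called by the "sender" thread to pull all full batches at once
    public List<List<String>> drain() {
        List<List<String>> out = new ArrayList<>(readyBatches);
        readyBatches.clear();
        return out;
    }
}
```

Sending whole batches amortizes per-request overhead across many records, which is where the throughput gain comes from.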
3. Consumer Details
Consumers pull data (pull model) from brokers, which allows them to match consumption speed to processing capacity. Consumer groups enable multiple consumers to share a topic’s partitions.
3.1 Partition Assignment Strategies
RangeAssignor: Assigns contiguous partition ranges per consumer, topic by topic.
RoundRobinAssignor: Distributes partitions evenly across consumers after sorting all partitions and consumers.
StickyAssignor: Tries to keep previous assignments stable across rebalances while still balancing load.
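The arithmetic behind RangeAssignor is simple to state: with P partitions and C consumers, each consumer gets ⌊P/C⌋ partitions and the first P mod C consumers (in sorted order) get one extra. A small helper illustrating this (`RangeMath` is not part of Kafka):

```java
// Computes how many partitions each of numConsumers receives under
// range assignment of numPartitions partitions for a single topic.
public class RangeMath {
    public static int[] partitionCounts(int numPartitions, int numConsumers) {
        int base = numPartitions / numConsumers;   // everyone gets at least this many
        int extra = numPartitions % numConsumers;  // leftover partitions
        int[] counts = new int[numConsumers];
        for (int i = 0; i < numConsumers; i++) {
            // The first `extra` consumers each absorb one leftover partition
            counts[i] = base + (i < extra ? 1 : 0);
        }
        return counts;
    }
}
```

For example, 7 partitions over 3 consumers yields 3, 2, 2. Because this runs per topic, the same front-of-the-list consumers collect the extras for every topic, which is the well-known skew of range assignment.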
Sample Java code for RangeAssignor:
```java
@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                Map<String, List<String>> subscriptions) {
    Map<String, List<String>> consumersPerTopic = consumersPerTopic(subscriptions);
    Map<String, List<TopicPartition>> assignment = new HashMap<>();
    for (String memberId : subscriptions.keySet()) {
        assignment.put(memberId, new ArrayList<>());
    }
    for (Map.Entry<String, List<String>> topicEntry : consumersPerTopic.entrySet()) {
        String topic = topicEntry.getKey();
        List<String> consumersForTopic = topicEntry.getValue();
        Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
        if (numPartitionsForTopic == null)
            continue;
        Collections.sort(consumersForTopic);
        int numPartitionsPerConsumer = numPartitionsForTopic / consumersForTopic.size();
        int consumersWithExtraPartition = numPartitionsForTopic % consumersForTopic.size();
        List<TopicPartition> partitions = AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic);
        for (int i = 0, n = consumersForTopic.size(); i < n; i++) {
            int start = numPartitionsPerConsumer * i + Math.min(i, consumersWithExtraPartition);
            int length = numPartitionsPerConsumer + (i + 1 > consumersWithExtraPartition ? 0 : 1);
            assignment.get(consumersForTopic.get(i)).addAll(partitions.subList(start, start + length));
        }
    }
    return assignment;
}
```

3.2 Offset Management
Consumers track the offset of the last record they have consumed so that they can resume after a failure or rebalance. Since Kafka 0.9, committed offsets are stored in the internal __consumer_offsets topic (earlier versions kept them in ZooKeeper), enabling reliable recovery.
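The bookkeeping itself is simple: commit the offset of the last processed record per partition, and on restart resume from the committed offset plus one. A minimal in-memory sketch, standing in for what __consumer_offsets provides (`OffsetStore` is illustrative, not Kafka's API):

```java
import java.util.HashMap;
import java.util.Map;

// Per-partition offset bookkeeping: commit the last processed offset,
// resume from committed + 1 after a restart.
public class OffsetStore {
    private final Map<Integer, Long> committed = new HashMap<>();

    // Record that everything up to and including `offset` was processed.
    public void commit(int partition, long offset) {
        committed.put(partition, offset);
    }

    // Position to resume from (0 if nothing was ever committed).
    public long resumeFrom(int partition) {
        Long off = committed.get(partition);
        return off == null ? 0L : off + 1;
    }
}
```

Committing after processing, as sketched here, gives at-least-once semantics: a crash between processing and commit means the record is redelivered, never silently skipped.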
4. Performance Optimizations
Sequential Disk Writes: Kafka only appends records to the end of its log segment files; sequential disk writes can reach hundreds of MB/s (≈600 MB/s is commonly cited) on ordinary disks, orders of magnitude faster than random writes.
Zero‑Copy: Kafka uses OS‑level zero‑copy (e.g. sendfile) to move data from the page cache to the network socket without copying it through user space, reducing CPU usage and latency.
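On the JVM, the zero-copy path is exposed through `FileChannel.transferTo`, which Kafka uses to move log-segment bytes to socket channels. The self-contained sketch below transfers to a file channel instead of a socket so it can run anywhere; the loop is needed because `transferTo` may move fewer bytes than requested.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Copies a file using FileChannel.transferTo, the JVM entry point for
// OS-level zero-copy transfers (sendfile on Linux).
public class ZeroCopyDemo {
    public static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                                                StandardOpenOption.WRITE)) {
            long transferred = 0;
            long size = in.size();
            while (transferred < size) {
                // transferTo may transfer fewer bytes than requested, so loop
                transferred += in.transferTo(transferred, size - transferred, out);
            }
            return transferred;
        }
    }
}
```

When the target is a socket channel, the kernel can move data directly from the page cache to the network interface, which is the case Kafka exploits when serving consumers.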