An Introduction to Apache Kafka: Architecture, Concepts, and Operations
Apache Kafka is a distributed, fault‑tolerant streaming platform. It uses broker clusters, topic partitions, and replicated logs to provide publish/subscribe and queue messaging, configurable retention, strong ordering within each partition, producer acknowledgments, and consumer groups, along with performance optimizations such as batching and zero‑copy that make it well suited to real‑time data pipelines.
Apache Kafka is a distributed streaming platform that implements a publish/subscribe message queue.
It has three main characteristics:
Publish and subscribe to streaming records, similar to traditional message queues.
Store streaming records with strong fault‑tolerance.
Process records as soon as they are produced.
Kafka supports two messaging patterns:
Point‑to‑point (queue) mode: Producers send messages to a queue; a single consumer retrieves and consumes each message. Once consumed, the message is removed from the queue.
Publish/subscribe (topic) mode: Producers publish messages to a topic; every subscriber receives a copy of the message.
Typical Kafka use cases include:
Building real‑time data pipelines that reliably move data between systems (similar to a message queue).
Creating real‑time stream processing applications that transform or react to the data (stream processing).
Key concepts:
A Kafka cluster consists of one or more broker servers.
Data is organized into topics, which are further divided into partitions. Each partition is an append‑only log.
Each record contains a key, a value, and a timestamp.
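The concepts above can be sketched as a toy in-memory model: a topic holds several append-only partition logs, and a record's offset is simply its position in its log. All names here (ToyTopic, append) are illustrative and not part of the Kafka API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a topic: each partition is an append-only log,
// and a record's offset is its position in that log. Illustrative only.
public class ToyTopic {
    static class Record {
        final String key, value;
        final long timestamp, offset;
        Record(String key, String value, long timestamp, long offset) {
            this.key = key; this.value = value;
            this.timestamp = timestamp; this.offset = offset;
        }
    }

    final List<List<Record>> partitions = new ArrayList<>();

    ToyTopic(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) partitions.add(new ArrayList<>());
    }

    // Append a record and return its offset within the chosen partition.
    long append(int partition, String key, String value) {
        List<Record> log = partitions.get(partition);
        long offset = log.size();  // next offset = current log length
        log.add(new Record(key, value, System.currentTimeMillis(), offset));
        return offset;
    }
}
```

Note that offsets are per partition: two partitions of the same topic each start at offset 0.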
Topic and partition details:
Messages are written to partitions in order; ordering is guaranteed only within a single partition. A topic spans multiple brokers, providing scalability and fault tolerance.
Kafka retains all published records for a configurable retention period (time‑based or size‑based). For example, with a 2‑day retention, records older than two days are deleted to free disk space.
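Time-based retention can be illustrated with a small sketch that, under the simplifying assumption that each segment is summarized by its newest record's timestamp, selects the segments eligible for deletion. The names here are illustrative, not Kafka code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of time-based retention: a segment whose newest record is older
// than the retention window is eligible for deletion. Illustrative only.
public class RetentionSketch {
    static List<Long> expiredSegments(List<Long> segmentLastTimestamps,
                                      long nowMs, long retentionMs) {
        List<Long> expired = new ArrayList<>();
        for (long ts : segmentLastTimestamps)
            if (nowMs - ts > retentionMs)  // older than the window -> eligible
                expired.add(ts);
        return expired;
    }
}
```

With a 2-day window, a segment last written 3 days ago is deleted while one written yesterday is kept.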
Broker and cluster:
A broker receives messages from producers, assigns offsets, and writes them to disk. Brokers serve consumer read requests and participate in replication. A set of brokers managed by the same ZooKeeper ensemble forms a Kafka cluster.
Each partition has a leader replica and zero or more follower replicas. The leader handles all read/write requests; followers replicate the leader’s log.
Broker request handling:
Each broker runs an Acceptor thread per listening port, which creates connections and hands them to Processor (network) threads. Processor threads read client requests, place them in a request queue, and send responses from a response queue.
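The hand-off between network threads and I/O threads can be sketched with plain queues. This toy model (hypothetical names, no sockets or selectors) only illustrates the request/response queue pattern described above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy sketch of the broker request path: a network thread enqueues a request,
// an I/O handler thread dequeues it, processes it, and enqueues a response.
// Illustrative only; real brokers use sockets, selectors, and thread pools.
public class RequestPathSketch {
    static final BlockingQueue<String> requestQueue = new ArrayBlockingQueue<>(16);
    static final BlockingQueue<String> responseQueue = new ArrayBlockingQueue<>(16);

    public static void main(String[] args) throws InterruptedException {
        Thread handler = new Thread(() -> {
            try {
                String req = requestQueue.take();     // I/O thread picks up the request
                responseQueue.put("handled:" + req);  // response goes back via a queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        handler.start();
        requestQueue.put("produce-request");          // network thread enqueues
        System.out.println(responseQueue.take());     // prints handled:produce-request
        handler.join();
    }
}
```

Decoupling network I/O from request processing via queues is what lets a broker keep accepting connections while slow requests are still being handled.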
Producers:
Producers distribute records across partitions. If a key is provided, the key’s hash determines the partition; otherwise a round‑robin or custom strategy is used. The following Java code shows a typical custom partitioner implementation:
    // Reconstructed from Kafka's DefaultPartitioner; topicCounterMap is a
    // ConcurrentMap<String, AtomicInteger> field of the partitioner class.
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        if (keyBytes == null) {
            // No key: spread records round-robin over the available partitions.
            int nextValue = nextValue(topic);
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.size() > 0) {
                int part = Utils.toPositive(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // No partition currently available: fall back to any partition.
                return Utils.toPositive(nextValue) % numPartitions;
            }
        } else {
            // Keyed record: hash the serialized key so equal keys always
            // map to the same partition.
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }
    }

    // Maintains one AtomicInteger counter per topic for round-robin distribution.
    private int nextValue(String topic) {
        AtomicInteger counter = topicCounterMap.get(topic);
        if (null == counter) {
            counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());
            AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
            if (currentCounter != null) {
                counter = currentCounter;
            }
        }
        return counter.getAndIncrement();
    }

Reliability guarantees:
Message order is preserved within a partition.
A record is considered committed only after all in‑sync replicas (ISR) have written it.
As long as at least one replica remains alive, committed data is not lost.
Consumers read only committed records.
Acknowledgment (acks) levels:
acks=0: The producer does not wait for any broker response (lowest latency, possible data loss).
acks=1: The producer waits only for the leader's acknowledgment (data can be lost if the leader fails before followers replicate).
acks=-1 (all): The producer waits until all in-sync replicas acknowledge (highest durability; duplicates are possible if the leader fails after the followers ack but before the producer receives the response, triggering a retry).
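As a sketch, the three levels map onto the standard producer property names as follows. This only builds the configuration object; it does not create a producer or contact a broker, and the bootstrap address is a placeholder.

```java
import java.util.Properties;

// Sketch of producer configuration for the three acks levels. The property
// keys are the standard Kafka producer config names; no producer is created.
public class AcksConfigSketch {
    static Properties producerProps(String acks) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("acks", acks);                          // "0", "1", or "all"
        // With acks=all, retries can still duplicate messages unless the
        // idempotent producer is enabled as well.
        if ("all".equals(acks))
            props.put("enable.idempotence", "true");
        return props;
    }
}
```

These Properties would normally be passed to a KafkaProducer constructor; here they simply document the trade-off each level makes.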
Consumers:
Consumers use a pull model to fetch data from brokers. A consumer group consists of one or more consumers; each partition is consumed by only one member of the group. Offsets are stored in an internal __consumer_offsets topic (since Kafka 0.9).
Consumer group partition assignors:
RangeAssignor assigns partitions of each topic independently, sorting partitions and consumers and distributing them as evenly as possible.
RoundRobinAssignor sorts all partitions across all subscribed topics and all consumers, then distributes them round‑robin to achieve better balance when subscription sets are identical.
StickyAssignor tries to keep the new assignment as close as possible to the previous one while still balancing load.
Example of a custom assignor implementation (RangeAssignor style):
    @Override
    public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                    Map<String, Subscription> subscriptions) {
        // Group the member IDs by the topics they subscribe to.
        Map<String, List<String>> consumersPerTopic = consumersPerTopic(subscriptions);
        Map<String, List<TopicPartition>> assignment = new HashMap<>();
        for (String memberId : subscriptions.keySet())
            assignment.put(memberId, new ArrayList<>());

        for (Map.Entry<String, List<String>> topicEntry : consumersPerTopic.entrySet()) {
            String topic = topicEntry.getKey();
            List<String> consumersForTopic = topicEntry.getValue();

            Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
            if (numPartitionsForTopic == null)
                continue;

            // Sort consumers so every member computes the same assignment.
            Collections.sort(consumersForTopic);

            int numPartitionsPerConsumer = numPartitionsForTopic / consumersForTopic.size();
            int consumersWithExtraPartition = numPartitionsForTopic % consumersForTopic.size();

            List<TopicPartition> partitions = AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic);
            // The first consumersWithExtraPartition consumers get one extra partition.
            for (int i = 0, n = consumersForTopic.size(); i < n; i++) {
                int start = numPartitionsPerConsumer * i + Math.min(i, consumersWithExtraPartition);
                int length = numPartitionsPerConsumer + (i + 1 > consumersWithExtraPartition ? 0 : 1);
                assignment.get(consumersForTopic.get(i)).addAll(partitions.subList(start, start + length));
            }
        }
        return assignment;
    }

File storage mechanism:
Each partition is stored as a series of log segments. Every segment consists of a .log file (the actual data) and a .index file (a sparse offset-to-position mapping). Segment files are named after the base offset of the first record they contain, e.g., 00000000000000170410.log.
To locate a message with offset 170417, Kafka:
Uses binary search over the segment file names to find the segment with the largest base offset ≤ 170417 (here, 170410).
Computes the relative offset within that segment (170417 − 170410 = 7) and looks up the nearest preceding entry in the .index file.
Scans the .log file from that position until the desired offset is reached.
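The first step, finding the right segment, amounts to a binary search for the largest base offset less than or equal to the target. Below is a minimal, simulated sketch (no real segment files are read; names are illustrative).

```java
import java.util.List;

// Sketch of segment lookup: binary-search a sorted list of segment base
// offsets for the largest base offset <= target. Simulated, illustrative only.
public class OffsetLookupSketch {
    // Returns the base offset of the segment that should contain `target`.
    // Assumes sortedBaseOffsets is non-empty, ascending, and starts <= target.
    static long findSegment(List<Long> sortedBaseOffsets, long target) {
        int lo = 0, hi = sortedBaseOffsets.size() - 1, ans = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedBaseOffsets.get(mid) <= target) {
                ans = mid;          // candidate segment; try to find a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return sortedBaseOffsets.get(ans);
    }
}
```

For the article's example, searching for offset 170417 among base offsets {0, 170410, ...} lands on segment 170410, and the remaining work uses the relative offset 7 inside that segment.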
Data expiration:
When a segment reaches the size limit (log.segment.bytes, default 1 GB) or the time limit (log.roll.ms / log.roll.hours), it is closed and a new active segment is rolled. Closed segments become eligible for deletion based on the configured retention policy (time or size); the active segment is never deleted.
Performance optimizations:
Batching: Producers accumulate records into batches (controlled by batch.size and linger.ms) before sending, reducing per-request network overhead.
Zero‑copy: Kafka uses OS‑level zero‑copy techniques to transfer data directly from disk to network sockets, minimizing CPU copying.
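Zero-copy is exposed in Java as FileChannel.transferTo, which on Linux maps to the sendfile syscall Kafka relies on when serving log data. The sketch below transfers a file into another file rather than a socket so it stays self-contained; the call works the same with a socket channel as the target.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of zero-copy with FileChannel.transferTo: bytes move from the source
// channel to the destination without passing through a user-space buffer.
public class ZeroCopySketch {
    static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                                                StandardOpenOption.CREATE)) {
            long transferred = 0, size = in.size();
            // transferTo may move fewer bytes than requested, so loop.
            while (transferred < size)
                transferred += in.transferTo(transferred, size - transferred, out);
            return transferred;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("segment", ".log");
        Path dst = Files.createTempFile("copy", ".log");
        Files.write(src, "hello kafka".getBytes());
        System.out.println("bytes transferred: " + transfer(src, dst));
    }
}
```

Because the data never enters the application's heap, the broker avoids two copies and two context switches per send compared with a read-then-write loop.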
Overall, Kafka provides a highly scalable, fault‑tolerant platform for real‑time data ingestion, storage, and processing.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.