
Comprehensive Overview of Apache Kafka Architecture and Core Concepts

This article provides an in‑depth technical guide to Apache Kafka, covering its distributed streaming architecture, core concepts such as topics, partitions, brokers, producers and consumers, reliability guarantees, storage mechanisms, configuration parameters, and consumer assignment strategies, supplemented with Java code examples.

IT Architects Alliance

Apache Kafka is a distributed streaming platform that functions as a publish/subscribe‑based message queue, enabling reliable, real‑time data transfer between systems.

The platform offers three key characteristics: it allows publishing and subscribing to streaming records, stores records with strong fault tolerance, and processes records as they are generated.

Kafka supports two messaging modes: point‑to‑point (queues) where each message is consumed by a single consumer, and publish/subscribe (topics) where multiple consumers can receive the same message.
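Both modes fall out of a single mechanism, consumer groups: every group receives each record, but only one member within a group consumes it. A plain-Java simulation of that delivery rule (not Kafka client code; the group names and member-selection rule are illustrative):

```java
import java.util.*;

public class DeliveryModes {
    // Each group receives every record (publish/subscribe across groups),
    // but inside a group a record goes to exactly one member (queue semantics).
    static Map<String, String> deliver(String record, Map<String, List<String>> groups) {
        Map<String, String> delivered = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> g : groups.entrySet()) {
            List<String> members = g.getValue();
            // pick one member deterministically, e.g. by record hash
            String member = members.get(Math.abs(record.hashCode()) % members.size());
            delivered.put(g.getKey(), member);
        }
        return delivered;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("billing", Arrays.asList("b1", "b2"));
        groups.put("audit", Arrays.asList("a1"));
        Map<String, String> d = deliver("order-1", groups);
        System.out.println(d.size());       // 2 -- both groups saw the record
        System.out.println(d.get("audit")); // a1 -- the group's single consumer
    }
}
```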

Typical use cases include building real‑time data pipelines (similar to traditional message queues) and developing stream‑processing applications that transform or react to continuous data flows.

Key concepts include clusters composed of multiple brokers, topics that logically group messages, partitions that act as ordered log files, and records consisting of a key, value, and timestamp.

Topics are divided into partitions; each partition is an ordered, append-only log, so ordering is guaranteed within a partition but not across partitions. Partitions can be replicated for fault tolerance, and increasing the partition count improves scalability but can remap keyed records, breaking key-based ordering across the change.
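The trade-off between partition count and key-based ordering can be shown with a toy partitioner (a simplified stand-in for Kafka's murmur2-based logic, using plain modulo on an integer key hash):

```java
public class KeyPartitioning {
    // Simplified stand-in for Kafka's keyed partitioning: the partition is a
    // deterministic function of the key hash and the current partition count.
    static int partitionFor(int keyHash, int numPartitions) {
        return (keyHash & 0x7fffffff) % numPartitions; // keep non-negative, like Utils.toPositive
    }

    public static void main(String[] args) {
        // Same key, same count -> same partition, so per-key ordering holds.
        System.out.println(partitionFor(5, 4)); // 1
        System.out.println(partitionFor(5, 4)); // 1
        // Growing the topic from 4 to 8 partitions remaps the key, so newer
        // records for it no longer share a partition with older ones.
        System.out.println(partitionFor(5, 8)); // 5
    }
}
```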

Producers publish messages to topics, distributing them across partitions round-robin (when the record has no key), by hashing the key, or via a custom partitioner. The default partitioner logic is shown below:

public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        int nextValue = nextValue(topic);
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            return Utils.toPositive(nextValue) % numPartitions;
        }
    } else {
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}

private final ConcurrentMap<String, AtomicInteger> topicCounterMap = new ConcurrentHashMap<>();

private int nextValue(String topic) {
    AtomicInteger counter = topicCounterMap.get(topic);
    if (null == counter) {
        counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());
        AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
        if (currentCounter != null) {
            counter = currentCounter;
        }
    }
    return counter.getAndIncrement();
}
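The null-key branch above round-robins across partitions, while keyed records stick to one partition. A pared-down, dependency-free rendition of that behavior (counter seeded at 0 for readability where the real code seeds it randomly, and a plain hash instead of murmur2; note that newer clients replace round-robin with a sticky partitioner):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SimplePartitioner {
    private final AtomicInteger counter = new AtomicInteger(0); // real code seeds this randomly
    private final int numPartitions;

    SimplePartitioner(int numPartitions) { this.numPartitions = numPartitions; }

    int partition(Integer key) {
        if (key == null) {
            // null key: spread records round-robin across partitions
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // non-null key: deterministic hash, all records for one key share a partition
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        SimplePartitioner p = new SimplePartitioner(3);
        System.out.println(p.partition(null)); // 0
        System.out.println(p.partition(null)); // 1
        System.out.println(p.partition(null)); // 2
        System.out.println(p.partition(7));    // 1  (7 % 3)
        System.out.println(p.partition(7));    // 1  again -- stable for the key
    }
}
```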

Consumers belong to consumer groups, track offsets so they can resume after failures, and read messages using a pull model. Offsets are stored in the internal __consumer_offsets topic (earlier versions kept them in ZooKeeper).
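Offset-based resumption can be sketched without the client library: a consumer restarting after a crash replays the partition log from its last committed offset. A hypothetical in-memory log stands in for the partition here:

```java
import java.util.*;

public class OffsetResume {
    // Resume from the last committed offset; everything before it
    // was already processed before the failure.
    static List<String> consumeFrom(List<String> partitionLog, long committedOffset) {
        return partitionLog.subList((int) committedOffset, partitionLog.size());
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("m0", "m1", "m2", "m3");
        long committed = 2; // consumer crashed after committing offset 2
        System.out.println(consumeFrom(log, committed)); // [m2, m3]
    }
}
```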

Brokers handle client requests via acceptor and processor threads, maintain leader/follower replicas, and coordinate metadata to route requests to the correct partition leader.

Kafka stores data in log segments, each accompanied by an index file. Segment naming uses the base offset, enabling efficient binary search for a given offset. The lookup process involves locating the correct segment, calculating the relative offset, and scanning the log file.
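Because segment files are named by their base offset, the lookup amounts to a binary search over those names followed by a relative-offset read. A minimal sketch of that search:

```java
import java.util.Arrays;

public class SegmentLookup {
    // Given sorted segment base offsets, find the segment holding `offset`
    // (the greatest base offset <= offset) and the offset relative to it.
    static long[] locate(long[] baseOffsets, long offset) {
        int idx = Arrays.binarySearch(baseOffsets, offset);
        if (idx < 0) idx = -idx - 2; // not an exact hit: insertion point - 1
        return new long[]{baseOffsets[idx], offset - baseOffsets[idx]};
    }

    public static void main(String[] args) {
        long[] bases = {0L, 100L, 200L}; // e.g. 00000000000000000000.log, ...100.log, ...200.log
        long[] hit = locate(bases, 150L);
        System.out.println(hit[0]); // 100 -- segment 00000000000000000100.log
        System.out.println(hit[1]); // 50  -- relative offset within that segment
    }
}
```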

Data retention is governed by configurable time‑based or size‑based policies; segments are closed when they reach size or age limits, after which they become eligible for deletion.
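The retention policies map onto broker settings such as the following (the property names are real Kafka broker configuration keys; the values are illustrative examples, not recommendations):

```properties
# Time-based retention: delete closed segments older than 7 days
log.retention.hours=168
# Size-based retention: cap each partition at ~1 GiB (-1 disables the size check)
log.retention.bytes=1073741824
# Roll (close) the active segment once it reaches 1 GiB ...
log.segment.bytes=1073741824
# ... or once it is 7 days old, whichever comes first
log.roll.hours=168
```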

Reliability guarantees include per‑partition ordering, configurable acknowledgment levels (0, 1, -1/all), and ISR‑based replication ensuring that committed messages survive broker failures.
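The acknowledgment levels correspond to producer and broker settings; an illustrative durability-oriented combination (the property names are real Kafka settings, the values are examples):

```properties
# Producer side: wait for all in-sync replicas before treating a send as successful
acks=all
# Topic/broker side: a write with acks=all needs at least 2 in-sync replicas,
# otherwise the producer receives NotEnoughReplicas and can retry
min.insync.replicas=2
# Broker side: never elect an out-of-sync replica as leader (durability over availability)
unclean.leader.election.enable=false
```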

Producer performance can be tuned with parameters such as batch.size, linger.ms, client.id, and max.in.flight.requests.per.connection, balancing latency and throughput.
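An illustrative producer configuration fragment showing that trade-off (values are examples for discussion, not recommendations):

```properties
# Accumulate up to 64 KiB per partition before sending (default is 16384)
batch.size=65536
# Wait up to 10 ms for a batch to fill; higher values trade latency for throughput
linger.ms=10
# Logical name that shows up in broker logs and metrics for this client
client.id=order-ingest-producer
# With retries enabled, keep this at 1 (or enable idempotence)
# to preserve per-partition ordering
max.in.flight.requests.per.connection=1
```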

Consumer partition assignment strategies include RangeAssignor, RoundRobinAssignor, and StickyAssignor. The default RangeAssignor assigns partitions per topic, while RoundRobin balances across all topics, and StickyAssignor minimizes rebalancing changes. Example assignment code:

@Override
public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                Map<String, List<String>> subscriptions) {
    Map<String, List<String>> consumersPerTopic = consumersPerTopic(subscriptions);
    Map<String, List<TopicPartition>> assignment = new HashMap<>();
    for (String memberId : subscriptions.keySet())
        assignment.put(memberId, new ArrayList<>());
    for (Map.Entry<String, List<String>> topicEntry : consumersPerTopic.entrySet()) {
        String topic = topicEntry.getKey();
        List<String> consumersForTopic = topicEntry.getValue();
        Integer numPartitionsForTopic = partitionsPerTopic.get(topic);
        if (numPartitionsForTopic == null) continue;
        Collections.sort(consumersForTopic);
        int numPartitionsPerConsumer = numPartitionsForTopic / consumersForTopic.size();
        int consumersWithExtraPartition = numPartitionsForTopic % consumersForTopic.size();
        List<TopicPartition> partitions = AbstractPartitionAssignor.partitions(topic, numPartitionsForTopic);
        for (int i = 0, n = consumersForTopic.size(); i < n; i++) {
            int start = numPartitionsPerConsumer * i + Math.min(i, consumersWithExtraPartition);
            int length = numPartitionsPerConsumer + (i + 1 > consumersWithExtraPartition ? 0 : 1);
            assignment.get(consumersForTopic.get(i)).addAll(partitions.subList(start, start + length));
        }
    }
    return assignment;
}
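The start/length arithmetic in the loop above is easiest to see with concrete numbers. The helper below reproduces just that arithmetic for one topic, e.g. 7 partitions over 3 consumers, where the consumers earliest in sort order absorb the remainder:

```java
public class RangeMath {
    // Reproduces the per-consumer start/length arithmetic from RangeAssignor.assign.
    static int[][] ranges(int numPartitions, int numConsumers) {
        int per = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;
        int[][] out = new int[numConsumers][2];
        for (int i = 0; i < numConsumers; i++) {
            int start = per * i + Math.min(i, extra);
            int length = per + (i + 1 > extra ? 0 : 1);
            out[i] = new int[]{start, length};
        }
        return out;
    }

    public static void main(String[] args) {
        // 7 partitions over 3 consumers: the first consumer gets the extra partition
        for (int[] r : ranges(7, 3))
            System.out.println("start=" + r[0] + " length=" + r[1]);
        // start=0 length=3, start=3 length=2, start=5 length=2
    }
}
```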

Kafka achieves high throughput through sequential disk writes and zero‑copy techniques, reducing CPU overhead and maximizing I/O performance.
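The zero-copy path corresponds to FileChannel.transferTo in Java (sendfile on Linux), which Kafka uses to move bytes from segment files to sockets without copying through user space. A self-contained file-to-file sketch of the same call:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class ZeroCopyDemo {
    // transferTo lets the kernel move bytes directly between descriptors;
    // Kafka serves fetch responses this way, from segment file to socket.
    static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            return in.transferTo(0, in.size(), out);
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("segment", ".log");
        Path dst = Files.createTempFile("out", ".log");
        Files.write(src, "kafka log bytes".getBytes());
        System.out.println(transfer(src, dst)); // 15 bytes moved without a user-space copy
        System.out.println(new String(Files.readAllBytes(dst))); // kafka log bytes
    }
}
```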

The article concludes with references to official documentation, tutorials, and books such as "Kafka: The Definitive Guide".

Tags: Message Queues, Reliability, Consumer, Producer, Partitioning, Apache Kafka, Distributed Streaming
Written by IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
