
Kafka Overview: Architecture, Core Features, and Operational Details

This article provides a technical overview of Apache Kafka: its distributed messaging architecture, key features such as high-throughput reads and writes, replication, partitioning, consumer group mechanics, offset management, and the rebalance process, along with practical code examples for synchronous and asynchronous offset commits.


1. Kafka

Kafka is a distributed messaging system used for asynchronous processing, application decoupling, traffic shaping, and message communication.

2. Problems Solved

Message systems are typically applied in scenarios such as asynchronous processing, decoupling of services, handling traffic spikes, and inter‑service communication.

Asynchronous Processing

Producers write messages to a queue, while consumers pull messages asynchronously, improving processing capacity.

Application Decoupling

Kafka acts as a messaging medium, allowing subsystems to focus on their own responsibilities; the producer‑consumer model treats Kafka as a message queue.

Traffic Shaping

When upstream services generate high traffic, downstream services can consume at their own pace by pulling messages, thus smoothing spikes.

3. Key Features

Read/Write Efficiency

Kafka stores and retrieves large volumes of data efficiently, relying on sequential disk access rather than random I/O to avoid hardware bottlenecks.

Network Transmission

Batch reads and compression improve network utilization.

Concurrency

Message partitions guarantee order within a partition while allowing concurrent processing across partitions.

Persistence

Messages are persisted to disk using zero‑copy, sequential I/O, and page caching for high throughput.

Reliability

Replication with leader and follower replicas provides redundancy and fault tolerance.

Horizontal Scaling

Adding producers, brokers, and consumers, as well as increasing partition counts, enables linear scaling; consumer groups rebalance partitions when members change.

4. Fundamental Concepts

Message & Batch

A Kafka message consists of a key and a value (both byte arrays). Batching groups messages belonging to the same topic and partition to improve efficiency.
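Batching is driven by producer configuration. As a sketch, the snippet below builds the relevant settings with plain `java.util.Properties`; the property names (`batch.size`, `linger.ms`, `compression.type`) are standard kafka-clients configs, while the values are purely illustrative:

```java
import java.util.Properties;

public class ProducerBatchingConfig {
    // Producer settings that control batching; values are illustrative.
    public static Properties batchingProps() {
        Properties props = new Properties();
        props.put("batch.size", "32768");     // up to 32 KB accumulated per partition batch
        props.put("linger.ms", "10");         // wait up to 10 ms for more records to fill a batch
        props.put("compression.type", "lz4"); // compress whole batches before sending
        return props;
    }

    public static void main(String[] args) {
        System.out.println(batchingProps().getProperty("linger.ms")); // prints 10
    }
}
```

Larger batches and a small linger delay trade a few milliseconds of latency for markedly better throughput and compression ratios.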

Topic & Partition

Topics are logical containers for messages; each topic is split into partitions, which are the basic storage units distributed across brokers, enabling horizontal scalability.

Log

Log Basics

Each partition maps to a log composed of multiple segments; a new segment is created when the current one exceeds a size limit. Kafka writes sequentially to the latest segment.

Log Retention & Compression

Retention can be time‑based or size‑based; older messages are deleted accordingly. Log compression retains only the latest value for each key, similar to a compacted database table.
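The keep-latest-value-per-key semantics of log compaction can be sketched by replaying records into a map. This is an illustration of the outcome only, not Kafka's actual log cleaner implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactionSketch {
    // Replaying a log of (key, value) records into a map models what
    // compaction retains: only the most recent value for each key.
    public static Map<String, String> compact(String[][] log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            latest.put(record[0], record[1]); // newer value overwrites the older one
        }
        return latest;
    }

    public static void main(String[] args) {
        String[][] log = {
            {"user-1", "v1"}, {"user-2", "v1"}, {"user-1", "v2"}
        };
        System.out.println(compact(log)); // user-1 keeps only v2
    }
}
```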

Broker

A broker receives messages from producers, assigns offsets, stores them on disk, and serves consumer and inter‑broker requests.

Replica

Each partition can have multiple replicas for redundancy; all replicas store identical message sequences.

Producer

Producers generate messages and assign them to partitions based on key hashing, round‑robin, or custom logic.
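The two built-in strategies above can be sketched as follows. Kafka's default partitioner actually uses murmur2 hashing (and, in recent versions, sticky batching for keyless records), so this simplified version only illustrates the idea that keyed records always land on the same partition:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class PartitionChooser {
    private final AtomicInteger roundRobin = new AtomicInteger();

    // Keyed records: hash the key so the same key always maps to the
    // same partition (ordering per key is preserved within that partition).
    // Keyless records: cycle round-robin across all partitions.
    public int choose(byte[] key, int numPartitions) {
        if (key == null) {
            return Math.floorMod(roundRobin.getAndIncrement(), numPartitions);
        }
        return Math.floorMod(Arrays.hashCode(key), numPartitions);
    }

    public static void main(String[] args) {
        PartitionChooser chooser = new PartitionChooser();
        byte[] key = "user-42".getBytes();
        // Same key, same partition, every time
        System.out.println(chooser.choose(key, 6) == chooser.choose(key, 6)); // prints true
    }
}
```

Note the consequence: increasing the partition count changes `hash mod numPartitions`, so key-to-partition mappings shift after expansion.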

Consumer

Consumers pull messages from assigned partitions and track their position using offsets.

Consumer Group

Multiple consumers sharing a group ID form a consumer group; each partition is consumed by only one member of the group, while the group as a whole receives all messages.
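The one-partition-one-owner invariant can be sketched with a simplified assignment function. Kafka's real assignors (range, round-robin, sticky, cooperative-sticky) are more involved; this only shows that every partition ends up with exactly one owner within the group:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupAssignmentSketch {
    // Simplified round-robin assignment: partition p goes to member p mod groupSize,
    // so each partition has exactly one owner and members split the load.
    public static Map<Integer, String> assign(int numPartitions, List<String> members) {
        Map<Integer, String> owner = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            owner.put(p, members.get(p % members.size()));
        }
        return owner;
    }

    public static void main(String[] args) {
        // Partitions 0-3 split between two consumers
        System.out.println(assign(4, Arrays.asList("c1", "c2"))); // {0=c1, 1=c2, 2=c1, 3=c2}
    }
}
```

A corollary: with more consumers than partitions, the extra consumers sit idle, since a partition is never shared within a group.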

Consumer Details

Offsets are stored in the internal topic __consumer_offsets. Consumers can commit offsets automatically (enable.auto.commit=true) or manually, either synchronously or asynchronously.

Heartbeat Mechanism

Consumers send periodic heartbeats to the group coordinator (a designated broker); if no heartbeat arrives within session.timeout.ms, the consumer is considered dead and its partitions are reassigned.

Rebalance Mechanism

Rebalancing occurs when consumer group membership changes, partition counts change, or subscription patterns change, triggering a redistribution of partitions among active consumers.

Avoiding Rebalance Issues

Set session.timeout.ms ≥ 3 × heartbeat.interval.ms and keep max.poll.interval.ms sufficiently large to prevent unnecessary rebalances.
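A consumer configuration respecting that ratio might look like the following. The property names are standard Kafka consumer configs; the values are illustrative:

```java
import java.util.Properties;

public class RebalanceTuning {
    // Timeout settings chosen so that session.timeout.ms >= 3 x heartbeat.interval.ms,
    // and max.poll.interval.ms is generous enough for slow batch processing.
    public static Properties consumerTimeouts() {
        Properties props = new Properties();
        props.put("session.timeout.ms", "30000");    // 3x the heartbeat interval
        props.put("heartbeat.interval.ms", "10000"); // frequent liveness pings
        props.put("max.poll.interval.ms", "300000"); // max gap allowed between poll() calls
        return props;
    }

    public static void main(String[] args) {
        Properties p = consumerTimeouts();
        int session = Integer.parseInt(p.getProperty("session.timeout.ms"));
        int heartbeat = Integer.parseInt(p.getProperty("heartbeat.interval.ms"));
        System.out.println(session >= 3 * heartbeat); // sanity-check the rule; prints true
    }
}
```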

Offset Management

Offsets can be stored in Zookeeper (legacy) or, since Kafka 0.8.2, in the __consumer_offsets topic. The offset key contains group ID, topic, and partition; the value holds the offset number.
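Conceptually, an entry in __consumer_offsets maps a (group ID, topic, partition) key to a committed offset. A hypothetical in-memory sketch of that shape (not Kafka's actual storage format, which is a compacted binary log):

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetStoreSketch {
    // Illustrative model of __consumer_offsets entries:
    // key = group/topic/partition, value = last committed offset.
    private final Map<String, Long> committed = new HashMap<>();

    public void commit(String groupId, String topic, int partition, long offset) {
        committed.put(groupId + "/" + topic + "/" + partition, offset);
    }

    public long fetch(String groupId, String topic, int partition) {
        // -1 signals "no committed offset yet", as with a fresh group
        return committed.getOrDefault(groupId + "/" + topic + "/" + partition, -1L);
    }

    public static void main(String[] args) {
        OffsetStoreSketch store = new OffsetStoreSketch();
        store.commit("billing", "orders", 0, 42L);
        store.commit("billing", "orders", 0, 57L); // a later commit overwrites the earlier one
        System.out.println(store.fetch("billing", "orders", 0)); // prints 57
    }
}
```

The overwrite behavior is why the real topic uses log compaction: only the latest committed offset per key needs to survive.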

Automatic Commit

Enabled via enable.auto.commit=true with the interval auto.commit.interval.ms (default 5 s). On each poll, the consumer commits the offsets returned by the previous poll once the interval has elapsed, which may cause duplicate consumption after a rebalance.
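Expressed as consumer configuration (standard config names; the interval value simply restates the default explicitly):

```java
import java.util.Properties;

public class AutoCommitConfig {
    public static Properties autoCommitProps() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "true");      // commit from within poll()
        props.put("auto.commit.interval.ms", "5000"); // the 5 s default, set explicitly
        return props;
    }

    public static void main(String[] args) {
        System.out.println(autoCommitProps().getProperty("auto.commit.interval.ms")); // prints 5000
    }
}
```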

Manual Commit

Synchronous commit using consumer.commitSync() blocks until the broker acknowledges the offset. Asynchronous commit using consumer.commitAsync() avoids blocking but requires explicit error handling.

while (true) {
    ConsumerRecords<String, String> records =
        consumer.poll(Duration.ofSeconds(1));
    process(records); // handle messages
    try {
        consumer.commitSync(); // block until the broker acknowledges
    } catch (CommitFailedException e) {
        handle(e); // unrecoverable commit failure
    }
}
try {
    while (true) {
        ConsumerRecords<String, String> records =
            consumer.poll(Duration.ofSeconds(1));
        process(records);
        consumer.commitAsync(); // non-blocking commit
    }
} catch (Exception e) {
    handle(e);
} finally {
    try {
        consumer.commitSync(); // final synchronous commit before shutdown
    } finally {
        consumer.close();
    }
}

Partitioning

Partitions are the basic unit of storage; increasing partition count enables horizontal scaling and parallel consumption.

Replication Mechanism

Each partition has a leader replica that handles all read/write requests; follower replicas asynchronously replicate the leader’s log. The In‑Sync Replica (ISR) set contains replicas that are up‑to‑date within replica.lag.time.max.ms (default 10 s).

Leader Election

If a leader fails, a follower is elected as the new leader. The unclean.leader.election.enable flag controls whether a non‑ISR replica may become leader (risking data loss).

High Watermark (HW) & Log End Offset (LEO)

HW marks the highest offset that all ISR replicas have replicated; consumers can only read up to HW. LEO is the offset of the next message to be written.
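Since every ISR replica has replicated everything below its own LEO, the HW is simply the minimum LEO across the ISR. A minimal sketch of that computation:

```java
import java.util.Arrays;

public class WatermarkSketch {
    // HW = smallest LEO among the in-sync replicas: every offset below it
    // exists on all ISR members and is therefore safe to expose to consumers.
    public static long highWatermark(long[] isrLeos) {
        return Arrays.stream(isrLeos).min().orElse(0L);
    }

    public static void main(String[] args) {
        long[] leos = {105, 103, 104}; // leader LEO and two follower LEOs
        System.out.println(highWatermark(leos)); // prints 103: consumers may read up to here
    }
}
```

The gap between the leader's LEO (105) and the HW (103) is data that is written but not yet fully replicated, hence not yet visible to consumers.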

5. Summary

The article explains Kafka’s use cases, core features, fundamental concepts, and detailed operational mechanisms such as consumer groups, offset management, rebalancing, replication, and storage architecture, providing a solid foundation for further deep‑dive studies.

Tags: Kafka, Log Storage, Replication, Message Queue, Distributed Messaging, Partitioning, Consumer Offsets, Rebalance
Written by Java Architect Essentials