Key Concepts and Internal Mechanisms of Apache Kafka
This article surveys Kafka's internals: internal topics, preferred replicas, the three places partition assignment appears, the log directory layout and index files, offset and timestamp lookup, log retention and compaction, the storage layer (page cache and zero-copy), delayed operations, the controller, the design flaws of the legacy consumer, the rebalance protocol, and producer idempotence.
Internal Kafka Topics and Their Functions
__consumer_offsets stores committed consumer offsets and group metadata; __transaction_state holds the transaction log used by transactional producers.
Preferred Replica
The preferred replica is the first replica in the assigned-replicas (AR) list and is ideally the partition leader; Kafka strives to distribute preferred replicas evenly across brokers so that leadership is balanced cluster-wide.
Where Partition Assignment Appears
Producer partition assignment determines the target partition for each message and can be customized by implementing the org.apache.kafka.clients.producer.Partitioner interface.
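To make the producer side concrete, here is a minimal sketch of hash-based partition selection. Kafka's DefaultPartitioner uses a murmur2 hash; the `zlib.crc32` call and the `next_rr` round-robin parameter below are simplifications for illustration, not Kafka's actual code.

```python
import zlib

def choose_partition(key, num_partitions, next_rr=0):
    # Sketch of hash-based partition selection. Kafka's DefaultPartitioner
    # hashes the key with murmur2; zlib.crc32 is a stand-in here.
    if key is None:
        # Keyless records are spread round-robin (newer clients use a
        # "sticky" variant that batches keyless records per partition).
        return next_rr % num_partitions
    # Same key -> same partition, which preserves per-key ordering.
    return zlib.crc32(key) % num_partitions
```

A custom Partitioner implementation plugs the same kind of logic into the producer via the partitioner.class property.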
Consumer partition assignment defines which partitions a consumer can read; the strategy is set via the partition.assignment.strategy client property.
Partition replica assignment specifies the broker placement of replicas when creating a topic; it can be manually defined with the --replica-assignment option of kafka-topics.sh.
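The broker-side placement can be sketched as a round-robin walk over the broker list. This is a simplification: Kafka's real algorithm starts from a random broker and adds a shifting offset between a partition's replicas so that followers are spread more evenly, which is omitted here.

```python
def assign_replicas(num_partitions, replication_factor, brokers, start=0):
    # Simplified round-robin placement: partition p's preferred replica
    # (first in the list) is brokers[(start + p) % n], with followers on
    # the next brokers in order.
    n = len(brokers)
    assert replication_factor <= n, "replication factor cannot exceed broker count"
    assignment = {}
    for p in range(num_partitions):
        assignment[p] = [brokers[(start + p + r) % n]
                         for r in range(replication_factor)]
    return assignment
```

Note how the preferred replicas (index 0 of each list) rotate across all brokers, which is what makes leader balance possible.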
Kafka Log Directory Structure
Each partition corresponds to a log, which is divided into LogSegments to avoid oversized files. Every LogSegment consists of a log file and two index files (offset index and timestamp index), plus optional files such as the transaction index (.txnindex).
Kafka Index Files
Each LogSegment has an offset index mapping message offsets to physical file positions and a timestamp index mapping timestamps to offsets, enabling fast lookups.
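The offset index is sparse (one entry per indexed batch, not per message), so a lookup finds the last index entry at or before the target offset and scans forward from that file position. A minimal sketch of that search, with a hypothetical list-of-pairs index layout:

```python
import bisect

def locate(index_entries, target_offset):
    # index_entries: sorted (offset, file_position) pairs -- a sparse index,
    # so target_offset may fall between two entries.
    # Returns the file position where a forward scan for target_offset
    # should begin: the entry with the largest offset <= target_offset.
    offsets = [o for o, _ in index_entries]
    i = bisect.bisect_right(offsets, target_offset) - 1
    if i < 0:
        return 0  # target precedes the first indexed entry; scan from the start
    return index_entries[i][1]
```

The timestamp index works the same way, except its entries map timestamps to offsets, so a timestamp lookup is two hops: timestamp index to offset, then offset index to file position.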
Offset and Timestamp Lookup
Consumers call seek() to start reading from a specific offset, but only after partition assignment has completed (hence the initial poll()); if the sought offset is out of range, behavior falls back to the auto.offset.reset setting.
For timestamp‑based lookup, the offsetsForTimes() method returns the earliest offset whose timestamp is greater than or equal to the requested timestamp.
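The offsetsForTimes() contract can be stated as a few lines of logic. This sketch operates on a plain list of (offset, timestamp) pairs rather than a real consumer, and deliberately scans linearly because with CreateTime timestamps the log need not be timestamp-ordered:

```python
def offset_for_time(records, target_ts):
    # records: (offset, timestamp) pairs in offset order.
    # Mirrors offsetsForTimes() semantics: return the earliest offset whose
    # timestamp is >= target_ts, or None if no such record exists.
    for offset, ts in records:
        if ts >= target_ts:
            return offset
    return None
```

A typical use is "replay everything from yesterday": resolve the timestamp to an offset with this rule, then seek() to it.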
Log Retention
Log retention deletes LogSegments that no longer satisfy the configured policies. Retention can be based on time (log.retention.ms, log.retention.minutes, log.retention.hours), size (log.retention.bytes), or the log start offset (logStartOffset). Deletable segments are first removed from the in-memory skip list and physically deleted after a configurable delay (file.delete.delay.ms).
Log Compaction
When log.cleanup.policy=compact and log.cleaner.enable=true, Kafka periodically rewrites segments so that only the latest record for each key is retained; a record with a null value acts as a tombstone marking the key for deletion.
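The keep-latest-per-key rule is easy to state as code. This sketch drops tombstones immediately; the real cleaner retains them for a configurable period (delete.retention.ms) so that consumers can observe the deletion:

```python
def compact(records):
    # records: (offset, key, value) tuples in log order. Keep only the
    # latest value per key; a None value is a tombstone that deletes the
    # key (dropped immediately here for simplicity).
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, value)
    # Surviving records keep their original offsets and relative order.
    return sorted((off, k, v) for k, (off, v) in latest.items()
                  if v is not None)
```

Compaction thus guarantees at least the final state per key, which is why __consumer_offsets itself is a compacted topic.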
Underlying Storage Mechanisms
Kafka relies on the operating system's page cache to keep frequently accessed log data in memory, reducing disk I/O. Writes modify pages in the cache, which are flushed to disk as dirty pages later.
Zero-copy transfer (sendfile() on Linux, exposed in Java as FileChannel.transferTo()) moves data from the page cache directly to the network interface, avoiding copies through user space and improving throughput.
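As a small demonstration of the same system call, the sketch below copies a file with os.sendfile(), so the bytes never pass through a user-space buffer. Kafka does the equivalent between a log segment and a network socket via transferTo(); this file-to-file form is Linux-specific (some platforms require the destination to be a socket):

```python
import os

def copy_without_user_space(src_path, dst_path):
    # Illustrative sendfile() use: the kernel moves the bytes directly,
    # with no read()/write() round trip through user-space buffers.
    size = os.path.getsize(src_path)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        sent = 0
        while sent < size:
            sent += os.sendfile(dst.fileno(), src.fileno(), sent, size - sent)
    return sent
```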
Delayed Operations
Delayed operations such as delayed produce, delayed fetch, and delayed delete are managed by the DelayedOperationPurgatory and expired by a SystemTimer built on a hierarchical timing wheel.
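To illustrate the timer structure, here is a minimal single-level timing wheel. Kafka's SystemTimer is hierarchical (overflow wheels handle delays longer than one rotation); this sketch instead lets long-delay tasks wrap around and checks expiration on each visit:

```python
class TimingWheel:
    # Minimal single-level timing wheel sketch. Tasks land in the bucket
    # for their expiration tick; advancing the clock fires due buckets.
    def __init__(self, tick_ms=1, wheel_size=20, start_ms=0):
        self.tick_ms = tick_ms
        self.wheel_size = wheel_size
        self.current_ms = start_ms
        self.buckets = [[] for _ in range(wheel_size)]

    def add(self, expiration_ms, task):
        delay_ticks = (expiration_ms - self.current_ms) // self.tick_ms
        if delay_ticks < 1:
            task()  # already expired: run immediately
            return
        slot = (self.current_ms // self.tick_ms + delay_ticks) % self.wheel_size
        self.buckets[slot].append((expiration_ms, task))

    def advance(self, to_ms):
        # Advance one tick at a time, firing tasks whose expiration passed.
        while self.current_ms < to_ms:
            self.current_ms += self.tick_ms
            slot = (self.current_ms // self.tick_ms) % self.wheel_size
            due = [t for exp, t in self.buckets[slot] if exp <= self.current_ms]
            self.buckets[slot] = [(e, t) for e, t in self.buckets[slot]
                                  if e > self.current_ms]
            for t in due:
                t()
```

The appeal over a priority-queue timer is O(1) insertion and expiry per tick, which matters when a broker holds millions of pending delayed operations.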
Kafka Controller
The controller is the broker responsible for cluster-wide state: it elects partition leaders, propagates ISR changes, detects broker joins and failures, and coordinates partition reassignments when topics are created, deleted, or altered.
Legacy Scala Consumer Design Flaws
The old Scala consumer stored group membership and offset metadata in ZooKeeper and coordinated rebalances through ZooKeeper watches, leading to herd effects (every consumer reacting to every membership change) and split-brain inconsistencies (members acting on divergent views of the group) during rebalances.
Consumer Rebalance Process
Rebalances are triggered by consumer joins, leaves, failures, group coordinator changes, or topic/partition count changes. The process includes FIND_COORDINATOR, JOIN_GROUP (leader election and strategy selection), SYNC_GROUP (distribution of assignments), and HEARTBEAT phases.
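During SYNC_GROUP, the leader computes the assignment with the configured strategy. Here is a sketch of the range strategy's arithmetic (partitions split into contiguous ranges per topic, with the first n_partitions % n_consumers members, in sorted order, taking one extra); the data shapes are assumptions for illustration:

```python
def range_assign(consumers, partitions_per_topic):
    # Sketch of the "range" assignment strategy, computed per topic.
    assignment = {c: [] for c in consumers}
    members = sorted(consumers)
    for topic, n_parts in sorted(partitions_per_topic.items()):
        per = n_parts // len(members)
        extra = n_parts % len(members)
        start = 0
        for i, c in enumerate(members):
            count = per + (1 if i < extra else 0)
            assignment[c].extend((topic, p) for p in range(start, start + count))
            start += count
    return assignment
```

Because the extras always go to the first members in sort order, range assignment can skew load across many topics, which is one motivation for the round-robin and sticky strategies.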
Producer Idempotence
Idempotent producers use a producer ID (PID) and per‑partition sequence numbers. Brokers accept a record only if its sequence number is exactly one greater than the last stored value, discarding duplicates and detecting out‑of‑order sequences.
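The broker-side sequence check can be sketched in a few lines. The class below models one partition's state for illustration; the real broker also caches metadata for the last few batches per PID so it can answer duplicates with the original result:

```python
class PartitionState:
    # Sketch of broker-side idempotent-producer dedup: per (PID, partition),
    # a batch is accepted only if its sequence number is last_seq + 1.
    def __init__(self):
        self.last_seq = {}  # pid -> last accepted sequence number

    def try_append(self, pid, seq):
        last = self.last_seq.get(pid, -1)
        if seq == last + 1:
            self.last_seq[pid] = seq
            return "APPENDED"
        if seq <= last:
            return "DUPLICATE"       # a retry of an already-written batch
        return "OUT_OF_SEQUENCE"     # a gap: the broker rejects the batch
```

A producer retry after a lost acknowledgment resends the same (PID, sequence) pair, lands in the DUPLICATE branch, and is discarded without being written twice.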