
Kafka Core Concepts, Architecture, Performance Optimization, and Operational Practices

This comprehensive guide explains Kafka's core value for decoupling and asynchronous processing, details its producer‑consumer model, cluster architecture, log segmentation, zero‑copy I/O, hardware sizing, network planning, operational commands, throughput tuning, exception handling, consumer group mechanics, offset management, rebalance strategies, and internal mechanisms such as LEO/HW, controllers, delayed tasks, and time‑wheel scheduling.

IT Architects Alliance

Kafka is a high‑throughput distributed message queue whose core value lies in decoupling services, enabling asynchronous processing, and absorbing traffic spikes, as in e‑commerce flash‑sale scenarios.

Key concepts include producers that write to the Kafka cluster, consumers that pull data, topics, partitions, and consumer groups. Each partition is stored as a directory on a broker, and replicas form leader‑follower pairs managed by an ISR (in‑sync replica) list.

In a Kafka cluster each server node is called a broker. A topic is a logical concept, while a partition corresponds to a physical directory on disk. Consumer groups require a unique group.id; members of the same group share the consumption of partitions, while different groups each receive the full stream, achieving broadcast semantics.

Data performance relies on sequential disk writes (append‑only) and OS page cache. Kafka first writes to the OS cache, then flushes to disk, achieving write speeds comparable to memory. Zero‑copy techniques (sendfile) move data from OS cache directly to the network socket, reducing CPU and context‑switch overhead.
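As a sketch of the zero‑copy path, Java's FileChannel.transferTo delegates to sendfile(2) where the OS supports it, so file bytes move from the page cache to the socket without entering user space. The class and method names below are illustrative, not Kafka source code:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copies a file to the destination channel without pulling bytes into
    // user space: transferTo maps to sendfile on supporting platforms.
    static long transfer(Path log, WritableByteChannel dest) throws IOException {
        try (FileChannel src = FileChannel.open(log, StandardOpenOption.READ)) {
            long size = src.size(), sent = 0;
            while (sent < size) {
                sent += src.transferTo(sent, size - sent, dest);
            }
            return sent;
        }
    }
}
```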

Log files are split into 1 GB segments; each segment is a *.log file paired with a sparse *.index file. An index entry is written for roughly every 4 KB of log data, enabling fast binary‑search lookups for offsets.
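The lookup works by binary‑searching the sparse index for the greatest indexed offset at or below the target, then scanning the log forward from that byte position. A simplified model of the idea (not Kafka's actual on‑disk format):

```java
import java.util.Arrays;

public class SparseIndex {
    // One entry per ~4 KB of appended data:
    // (relative message offset, byte position in the .log file).
    private final long[] offsets;
    private final long[] positions;

    SparseIndex(long[] offsets, long[] positions) {
        this.offsets = offsets;
        this.positions = positions;
    }

    // Binary-search the greatest indexed offset <= target; the reader then
    // scans the .log file forward from the returned byte position.
    long lookupPosition(long targetOffset) {
        int idx = Arrays.binarySearch(offsets, targetOffset);
        if (idx < 0) idx = -idx - 2;   // insertion point - 1 = floor entry
        if (idx < 0) return 0;         // target precedes first entry: scan from start
        return positions[idx];
    }
}
```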

High‑concurrency network design uses a three‑layer NIO architecture (multiple selectors → multiple threads → internal queues). This design, together with proper hardware sizing, yields Kafka’s reputation for high throughput and availability.

Hardware evaluation: For a scenario of 1 billion daily requests (≈5.5 × 10⁴ QPS at peak) and 276 TB of retained data, the guide recommends five physical servers, each with 11 × 7 TB SAS disks (≈77 TB per node), 64 GB RAM (128 GB preferred), and 16–32 CPU cores. Network interfaces should be 10 GbE where possible.
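The per‑node figures follow from simple arithmetic; the sketch below reproduces it (the ~5x burst factor and the ceiling rounding are assumptions for illustration, not stated in the article):

```java
public class ClusterSizing {
    // 1 billion requests/day over 86,400 seconds gives the average QPS.
    static long avgQps(long requestsPerDay) {
        return requestsPerDay / 86_400;
    }

    // Peak load is modeled as a multiple of average (burst factor assumed).
    static long peakQps(long requestsPerDay, int burstFactor) {
        return avgQps(requestsPerDay) * burstFactor;
    }

    // Retained data divided across nodes, rounded up; 11 x 7 TB ~ 77 TB raw
    // per node leaves headroom for replica imbalance, indexes, and growth.
    static long tbPerNode(long retainedTb, int nodes) {
        return (retainedTb + nodes - 1) / nodes;
    }
}
```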

Cluster planning includes choosing physical machines, sizing disks, memory allocation for OS cache and JVM (≈10 GB for Kafka broker JVM, plus additional memory for OS cache), and CPU core count based on thread count (≥16 cores recommended).

Operational tools and commands:

kafka-topics.sh --create --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --replication-factor 1 --partitions 1 --topic test6
kafka-topics.sh --alter --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 ...

JSON for reassigning partitions:

{"version":1,"partitions":[{"topic":"test6","partition":0,"replicas":[0,1,2]},{"topic":"test6","partition":1,"replicas":[0,1,2]},{"topic":"test6","partition":2,"replicas":[0,1,2]}]}

Custom partitioner example (Java):

import java.util.List;
import java.util.Map;
import java.util.Random;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class HotDataPartitioner implements Partitioner {
    private Random random;

    @Override
    public void configure(Map<String, ?> configs) { random = new Random(); }

    @Override
    public int partition(String topic, Object keyObj, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        String key = (String) keyObj;
        // Reserve the last partition for hot-key traffic and spread everything
        // else randomly over the remaining partitions (assumes >= 2 partitions).
        List<PartitionInfo> partitionInfoList = cluster.availablePartitionsForTopic(topic);
        int partitionCount = partitionInfoList.size();
        int hotDataPartition = partitionCount - 1;
        return !key.contains("hot_data") ? random.nextInt(partitionCount - 1) : hotDataPartition;
    }

    @Override
    public void close() { }
}
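A custom partitioner like this is registered on the producer through the partitioner.class property. A minimal sketch using plain Properties (the package name com.example is an assumption; use wherever HotDataPartitioner actually lives):

```java
import java.util.Properties;

public class ProducerConfigSketch {
    static Properties partitionerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop1:9092");
        // Fully qualified class name of the custom partitioner;
        // adjust the package to match your project.
        props.put("partitioner.class", "com.example.HotDataPartitioner");
        return props;
    }
}
```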

Producer throughput can be tuned via buffer.memory, compression.type, batch.size, and linger.ms. Larger batches increase throughput but add latency; compression (e.g., LZ4) reduces network load at the cost of CPU.
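A throughput‑oriented starting point might look like the sketch below; the specific values are illustrative assumptions, and the right numbers depend on message size and latency budget:

```java
import java.util.Properties;

public class ThroughputTuning {
    static Properties props() {
        Properties p = new Properties();
        p.put("buffer.memory", "67108864");   // 64 MB accumulator for unsent records
        p.put("batch.size", "131072");        // 128 KB batches: fewer, larger requests
        p.put("linger.ms", "10");             // wait up to 10 ms to fill a batch
        p.put("compression.type", "lz4");     // cheaper network at some CPU cost
        return p;
    }
}
```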

Exception handling covers LeaderNotAvailableException, NotControllerException, and NetworkException. Retries can be configured with retries and back‑off settings; note that retried batches can arrive out of order unless max.in.flight.requests.per.connection=1 (or the idempotent producer is enabled).

ACK configuration determines durability: acks=0 (fire‑and‑forget), acks=1 (leader only), acks=-1 (all ISR replicas). To guarantee no data loss, set acks=-1 and min.insync.replicas>=2, and use at least two replicas per partition.
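Putting the retry and durability settings together, a no‑data‑loss producer sketch could look like this (values are illustrative; min.insync.replicas is set broker‑side or per topic, not here):

```java
import java.util.Properties;

public class DurabilityConfig {
    static Properties props() {
        Properties p = new Properties();
        p.put("acks", "all");                                // equivalent to acks=-1
        p.put("retries", "2147483647");                      // retry transient failures, e.g. NetworkException
        p.put("retry.backoff.ms", "100");                    // pause between attempts
        p.put("max.in.flight.requests.per.connection", "1"); // keep ordering across retries
        return p;
    }
}
```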

Consumer group rebalance is coordinated by a designated broker (the coordinator). Strategies include range , round‑robin , and sticky , with sticky preserving existing assignments as much as possible.
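The assignment strategy is chosen on the consumer via partition.assignment.strategy; a minimal sketch selecting the sticky assignor (the group.id value is an arbitrary example):

```java
import java.util.Properties;

public class RebalanceStrategy {
    static Properties props() {
        Properties p = new Properties();
        p.put("group.id", "order-consumers");   // any unique group id
        // Sticky assignment keeps existing partition ownership where possible,
        // so a rebalance moves the minimum number of partitions.
        p.put("partition.assignment.strategy",
              "org.apache.kafka.clients.consumer.StickyAssignor");
        return p;
    }
}
```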

Offset management moved from ZooKeeper to the internal __consumer_offsets topic, which is compacted and defaults to 50 partitions. Consumers periodically commit offsets; each commit is a message whose key is (group.id, topic, partition) and whose value is the latest consumed offset.
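Because the topic is log‑compacted, only the latest value per key survives. A simplified in‑memory model of these semantics (not Kafka's actual storage code):

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetStore {
    // Model of __consumer_offsets: compaction keeps only the newest value
    // for each (group.id, topic, partition) key.
    private final Map<String, Long> committed = new HashMap<>();

    void commit(String groupId, String topic, int partition, long offset) {
        committed.put(groupId + ":" + topic + ":" + partition, offset);
    }

    Long fetch(String groupId, String topic, int partition) {
        return committed.get(groupId + ":" + topic + ":" + partition);
    }

    // Kafka routes a group's commits to one of the topic's 50 partitions by
    // hashing the group id modulo the partition count.
    static int offsetsPartition(String groupId, int numPartitions) {
        return Math.floorMod(groupId.hashCode(), numPartitions);
    }
}
```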

Broker internals: LEO (Log End Offset) tracks the next offset to be written; HW (High Watermark) marks the highest offset replicated to all in‑sync replicas and is the upper bound of what consumers can read. The controller manages cluster metadata, leader election, and partition reassignment via ZooKeeper paths such as /controller and /brokers/ids.
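The relationship between the two offsets is simple: the leader advances HW to the minimum LEO across the in‑sync replica set. A one‑function sketch of that rule:

```java
import java.util.Collection;

public class Watermark {
    // HW = min(LEO over ISR): consumers only see offsets below this value,
    // which guarantees every exposed message exists on all in-sync replicas.
    static long highWatermark(Collection<Long> isrLeos) {
        return isrLeos.stream().mapToLong(Long::longValue).min().orElse(0L);
    }
}
```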

Kafka uses delayed operations (e.g., request timeout, follower fetch wait) managed by a custom time‑wheel implementation, providing O(1) insertion and removal. Multiple wheel levels handle tasks ranging from milliseconds to seconds.
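To make the O(1) claim concrete, here is a minimal single‑level wheel; Kafka's real implementation layers several wheels (milliseconds up to seconds) and overflows far‑future tasks to a coarser wheel, which this sketch omits:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SimpleTimeWheel {
    // Single-level wheel: O(1) add, O(1) bucket expiry per tick.
    private final Deque<Runnable>[] buckets;
    private final long tickMs;
    private int currentSlot = 0;

    @SuppressWarnings("unchecked")
    SimpleTimeWheel(int slots, long tickMs) {
        this.tickMs = tickMs;
        buckets = new Deque[slots];
        for (int i = 0; i < slots; i++) buckets[i] = new ArrayDeque<>();
    }

    void schedule(Runnable task, long delayMs) {
        // O(1): drop the task into the bucket that expires after delayMs.
        int slot = (int) ((currentSlot + delayMs / tickMs) % buckets.length);
        buckets[slot].add(task);
    }

    void tick() {
        // Called every tickMs by a driver thread; runs everything now due.
        currentSlot = (currentSlot + 1) % buckets.length;
        Deque<Runnable> due = buckets[currentSlot];
        while (!due.isEmpty()) due.poll().run();
    }
}
```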

Overall, the article provides a step‑by‑step methodology for evaluating resources, planning hardware, configuring Kafka parameters, and operating the cluster in production environments.

Tags: distributed systems, java, performance optimization, Big Data, Kafka, Message Queue, Cluster Deployment
Written by IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
