Common Kafka Interview Questions: Delay Queues, Idempotence, ISR/AR, HW/LEO/LSO, Message Ordering, and Duplicate Consumption
This article reviews typical Kafka interview topics, explaining the implementation of delay queues with hierarchical time wheels, how idempotence is achieved via producer IDs, the meanings of ISR, OSR, AR, HW, LEO, LSO, strategies for guaranteeing message order, and practical solutions for handling duplicate consumption.
Before diving into the technical content, the author thanks supporters and mentions a brief personal update, then introduces the focus on common Kafka interview questions.
Kafka Delay Queue
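Kafka implements its internal delay queue (used for delayed operations such as delayed produce and fetch requests) with a hierarchical timing wheel. Each wheel is a circular array of slots; a slot spans tickMs milliseconds and holds its tasks in a doubly linked list, so a wheel with wheelSize slots covers tickMs × wheelSize milliseconds. To avoid allocating an enormous array for long delays, multi-level wheels (second level, third level, and so on) are used, where the tick of each higher wheel equals the full span of the wheel below it. As time progresses, a task waiting in a higher-level wheel is downgraded to a lower-level wheel until it finally expires in the first-level wheel.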
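The downgrade behaviour is easier to follow in code. The following is a minimal sketch of a hierarchical timing wheel, not Kafka's actual implementation: the class name TimingWheelSketch, the Task type, and the tick-by-tick advanceTo method are assumptions made for illustration (Kafka drives its wheel from a DelayQueue of buckets instead of ticking through empty slots).

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified hierarchical timing wheel (illustrative only, not Kafka's code). */
class TimingWheelSketch {
    /** Hypothetical task type: a name plus an absolute expiration time in ms. */
    static final class Task {
        final String name;
        final long expirationMs;
        Task(String name, long expirationMs) {
            this.name = name;
            this.expirationMs = expirationMs;
        }
    }

    private final long tickMs;      // time span of one slot
    private final int wheelSize;    // number of slots in this wheel
    private final long interval;    // tickMs * wheelSize, span of the whole wheel
    private final List<List<Task>> buckets = new ArrayList<>();
    private long currentTime;       // always a multiple of tickMs
    private TimingWheelSketch overflowWheel; // next level, created lazily

    TimingWheelSketch(long tickMs, int wheelSize, long startMs) {
        this.tickMs = tickMs;
        this.wheelSize = wheelSize;
        this.interval = tickMs * wheelSize;
        this.currentTime = startMs - (startMs % tickMs);
        for (int i = 0; i < wheelSize; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Returns false if the task is already due and should run immediately. */
    boolean add(Task task) {
        if (task.expirationMs < currentTime + tickMs) {
            return false; // falls into the current slot: already expired
        }
        if (task.expirationMs < currentTime + interval) {
            int slot = (int) ((task.expirationMs / tickMs) % wheelSize);
            buckets.get(slot).add(task);
            return true;
        }
        // Too far in the future for this wheel: hand over to the next level,
        // whose tick equals this wheel's whole interval.
        if (overflowWheel == null) {
            overflowWheel = new TimingWheelSketch(interval, wheelSize, currentTime);
        }
        return overflowWheel.add(task);
    }

    /** Advances the clock one tick at a time and returns the tasks that became due. */
    List<Task> advanceTo(long timeMs) {
        List<Task> due = new ArrayList<>();
        while (currentTime + tickMs <= timeMs) {
            tick(due);
        }
        return due;
    }

    private void tick(List<Task> expired) {
        currentTime += tickMs;
        // Once this wheel completes a full turn, the next level moves one slot
        // and its tasks are downgraded into this wheel (or fire right away).
        if (overflowWheel != null && currentTime % interval == 0) {
            List<Task> fromAbove = new ArrayList<>();
            overflowWheel.tick(fromAbove);
            for (Task task : fromAbove) {
                if (!add(task)) {
                    expired.add(task);
                }
            }
        }
        List<Task> bucket = buckets.get((int) ((currentTime / tickMs) % wheelSize));
        expired.addAll(bucket);
        bucket.clear();
    }
}
```

With tickMs = 1 and wheelSize = 4, a task expiring at 20 ms starts in the third-level wheel (tick 16 ms), is downgraded to the second-level wheel (tick 4 ms) when the clock reaches 16 ms, and is returned by advanceTo(20). As in the real design, precision is limited to one first-level tick.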
Kafka Idempotence
Principle
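Since version 0.11, Kafka provides idempotent writes within a single partition and a single producer session. Each producer is assigned a Producer ID (PID), and every message it sends to a partition carries a monotonically increasing sequence number. The broker caches the last sequence number it accepted for each PID and partition and compares it with the incoming one: if the new number is exactly one higher, the message is accepted; if it is not higher than the cached number, the message is a duplicate and is discarded; if it is more than one higher, a gap has been detected (earlier messages are missing) and the broker rejects the write with an out-of-order sequence error.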
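The broker-side decision can be sketched as a small state machine. This is a simplified illustration rather than the broker's real code: the class SequenceCheckSketch, the Result enum, and the "pid-partition" map key are hypothetical, and the real broker additionally tracks producer epochs and works on batches.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the broker's per-producer sequence check (illustrative only). */
class SequenceCheckSketch {
    enum Result { ACCEPT, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number, keyed by "pid-partition" (hypothetical layout).
    private final Map<String, Integer> lastSequence = new HashMap<>();

    Result onMessage(long pid, int partition, int sequence) {
        String key = pid + "-" + partition;
        int last = lastSequence.getOrDefault(key, -1); // -1 means nothing accepted yet
        if (sequence <= last) {
            return Result.DUPLICATE;      // already written: discard the retry
        }
        if (sequence > last + 1) {
            return Result.OUT_OF_ORDER;   // gap: an earlier message is missing
        }
        lastSequence.put(key, sequence);  // exactly last + 1: accept and remember
        return Result.ACCEPT;
    }
}
```

Because the state is kept per PID and partition, the guarantee covers retries inside one producer session only; a restarted producer receives a new PID and starts again from sequence 0.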
Enabling Idempotence
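import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Idempotence requires acks=all
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Enable idempotence
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props);
// send() is asynchronous; get() blocks until the broker acknowledges the record
kafkaProducer.send(new ProducerRecord<>("truman_kafka_center", "1", "hello world.")).get();
kafkaProducer.close();
ISR, OSR, AR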
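ISR (In‑Sync‑Replicas): the leader plus the followers that have kept up with the leader's log within the configured lag limit (replica.lag.time.max.ms).
OSR (Out‑of‑Sync‑Replicas): replicas that have fallen behind the leader beyond that limit, for example newly added replicas that are still catching up or replicas on failed brokers.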
AR (Assigned‑Replicas): the complete set of replicas for a partition, equal to ISR + OSR.
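The relationship between the three sets can be written down directly. The sketch below is a simplified model under stated assumptions: the class ReplicaSetsSketch and its method names are hypothetical, replicas are identified by integer broker IDs, and "in sync" is reduced to a single check of how long ago a follower last caught up with the leader.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Simplified view of how AR splits into ISR and OSR (illustrative only). */
class ReplicaSetsSketch {
    /** ISR = the leader plus every follower that caught up within maxLagMs. */
    static Set<Integer> inSyncReplicas(int leaderId, Map<Integer, Long> followerLastCaughtUpMs,
                                       long nowMs, long maxLagMs) {
        Set<Integer> isr = new TreeSet<>();
        isr.add(leaderId); // the leader is always in sync with itself
        for (Map.Entry<Integer, Long> follower : followerLastCaughtUpMs.entrySet()) {
            if (nowMs - follower.getValue() <= maxLagMs) {
                isr.add(follower.getKey());
            }
        }
        return isr;
    }

    /** OSR = AR minus ISR. */
    static Set<Integer> outOfSyncReplicas(Set<Integer> assignedReplicas, Set<Integer> isr) {
        Set<Integer> osr = new TreeSet<>(assignedReplicas);
        osr.removeAll(isr);
        return osr;
    }
}
```

For example, with leader 1, follower 2 last caught up 500 ms ago, follower 3 last caught up 9000 ms ago, and a 5000 ms limit, the ISR is {1, 2} and the OSR is {3}.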
HW, LEO, LSO
HW (High‑Watermark): the smallest LEO among the replicas in the ISR, that is, the offset up to which every in-sync replica has copied the log; consumers can read only messages with offsets below the HW.
LEO (Log‑End‑Offset): the offset where the next message will be written.
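LSO (Last‑Stable‑Offset): used mainly for transactions; it is the offset of the first message that belongs to a still-open transaction, or the HW if there is no open transaction. Consumers running with isolation.level=read_committed can read only messages before the LSO.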
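A small worked example shows how these offsets relate. It is a sketch under simplifying assumptions: the class WatermarkSketch and its methods are hypothetical, the LEO of every replica is assumed to be known in one place, and the LSO is modelled as the smaller of the HW and the first offset of the earliest open transaction.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

/** Simplified calculation of HW and LSO from replica state (illustrative only). */
class WatermarkSketch {
    /** HW = the minimum LEO across the in-sync replicas. */
    static long highWatermark(Map<Integer, Long> leoByReplica, Set<Integer> isr) {
        if (isr.isEmpty()) {
            throw new IllegalArgumentException("ISR must contain at least the leader");
        }
        long hw = Long.MAX_VALUE;
        for (int replicaId : isr) {
            hw = Math.min(hw, leoByReplica.get(replicaId));
        }
        return hw;
    }

    /** LSO = first offset of the earliest open transaction, capped at the HW. */
    static long lastStableOffset(long highWatermark, OptionalLong firstOpenTxnOffset) {
        if (firstOpenTxnOffset.isPresent()) {
            return Math.min(highWatermark, firstOpenTxnOffset.getAsLong());
        }
        return highWatermark;
    }
}
```

If the replicas have LEOs of 10, 8, and 5 and only the first two are in the ISR, the HW is 8: the lagging replica does not hold it back. With an open transaction that started at offset 6, the LSO is 6, so a read_committed consumer stops there even though the HW is 8.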
Message Ordering
Kafka guarantees ordering only within a single partition. To achieve ordered consumption, either use a single partition with a single consumer (sacrificing throughput), or use multiple partitions and give every message that must stay in sequence the same key, so that key-based hashing always routes it to the same partition. Be aware that changing the partition count changes the key-to-partition mapping and can break ordering for existing keys.
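The effect of the partition count on key-based routing can be illustrated with a few lines. This is a sketch, not Kafka's partitioner: the class KeyPartitionSketch is hypothetical, and String.hashCode() stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key.

```java
/** Simplified key-to-partition mapping (illustrative only). */
class KeyPartitionSketch {
    static int partitionFor(String key, int numPartitions) {
        // Stand-in hash: Kafka's default partitioner uses murmur2 over the key bytes.
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

The same key always maps to the same partition as long as numPartitions stays fixed, which is what preserves per-key ordering. Calling partitionFor with the same key but a different partition count can return a different partition, so messages produced after the topic is expanded may no longer follow the earlier ones for that key.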
Duplicate Consumption
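Duplicate consumption can occur when a consumer has processed messages but fails to commit their offsets before a rebalance, so the consumer that takes over the partition reads the same messages again. A common cause is processing that takes too long, which gets the consumer removed from the group. Solutions include shortening the processing time per poll (for example by lowering max.poll.records), raising timeouts such as max.poll.interval.ms and session.timeout.ms, or implementing a deduplication layer (e.g., Redis) that tracks processed keys for a configurable time window.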
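The deduplication layer can be sketched without an external store. The example below uses an in-memory map as a stand-in for Redis; the class DedupWindowSketch and its firstSeen method are hypothetical, and the caller is assumed to pass a unique business key for each message. With Redis, the same check is typically a single atomic SET with the NX and PX options.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a Redis-backed deduplication window (illustrative only). */
class DedupWindowSketch {
    private final long ttlMs;
    // Message key -> time in ms until which the key counts as already processed.
    private final Map<String, Long> seenUntil = new HashMap<>();

    DedupWindowSketch(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    /** Returns true if the key was not seen within the window, false for a duplicate. */
    boolean firstSeen(String messageKey, long nowMs) {
        Long expiry = seenUntil.get(messageKey);
        if (expiry != null && expiry > nowMs) {
            return false; // still inside the window: skip processing
        }
        seenUntil.put(messageKey, nowMs + ttlMs);
        return true;
    }
}
```

A consumer would call firstSeen before processing each record and skip the record when it returns false. The window only needs to cover the longest gap in which a redelivery can realistically happen. This sketch never evicts expired keys and is not thread-safe, which a real deployment would have to address.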
Conclusion
The article summarizes these common Kafka interview questions and invites readers to share additional questions for future updates.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java