Common Kafka Interview Questions and Answers
This article reviews common Kafka interview questions: delay queues, idempotence, replica states, offsets, message ordering, and duplicate consumption. It includes example code for enabling an idempotent producer, an explanation of the time-wheel mechanism, and practical solutions to consumer rebalance issues.
Kafka Delay Queue
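Kafka implements its delay queue with a "time wheel" (hierarchical timing wheel) mechanism. Each slot of a wheel covers one time interval (tickMs), and the tasks that fall into a slot are kept in a doubly linked list, so adding or removing a task is O(1). A task whose delay exceeds the span of the current wheel (tickMs × wheelSize) goes into a higher-level wheel (second-level, third-level, and so on) whose tick equals the whole span of the wheel below it; as its expiration approaches, the task is moved down to the finer-grained wheels. This lets a small, fixed number of slots cover very long delays.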
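To make the cascading between wheel levels concrete, here is a minimal sketch of a hierarchical time wheel. It is not Kafka's implementation: the class name TimingWheelSketch, the Task type, and the tick-by-tick advanceTo method are assumptions made for illustration, and an ArrayDeque stands in for the per-slot task list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Simplified hierarchical time wheel. Illustrative only, not Kafka's actual classes. */
final class TimingWheelSketch {
    static final class Task {
        final String name;
        final long expirationMs;
        Task(String name, long expirationMs) {
            this.name = name;
            this.expirationMs = expirationMs;
        }
    }

    private final long tickMs;      // time span covered by one slot
    private final int wheelSize;    // number of slots in this wheel
    private final long interval;    // tickMs * wheelSize, the span of the whole wheel
    private final List<Deque<Task>> slots = new ArrayList<>();
    private long currentTime;       // always a multiple of tickMs
    private TimingWheelSketch overflowWheel; // coarser wheel, created on demand

    TimingWheelSketch(long tickMs, int wheelSize, long startMs) {
        this.tickMs = tickMs;
        this.wheelSize = wheelSize;
        this.interval = tickMs * wheelSize;
        this.currentTime = startMs - (startMs % tickMs);
        for (int i = 0; i < wheelSize; i++) {
            slots.add(new ArrayDeque<>());
        }
    }

    /** Schedules a task; returns false if it is already due and was not stored. */
    boolean add(Task task) {
        if (task.expirationMs < currentTime + tickMs) {
            return false; // falls in the current tick: the caller should run it now
        }
        if (task.expirationMs < currentTime + interval) {
            int slot = (int) ((task.expirationMs / tickMs) % wheelSize);
            slots.get(slot).add(task);
            return true;
        }
        // Too far away for this wheel: hand it to a wheel whose tick is our whole span.
        if (overflowWheel == null) {
            overflowWheel = new TimingWheelSketch(interval, wheelSize, currentTime);
        }
        return overflowWheel.add(task);
    }

    /** Advances the clock to timeMs and returns the tasks that became due on the way. */
    List<Task> advanceTo(long timeMs) {
        List<Task> due = new ArrayList<>();
        while (currentTime + tickMs <= timeMs) {
            due.addAll(advanceOneTick());
        }
        return due;
    }

    /** Moves one tick forward and returns the tasks whose slot has been reached. */
    private List<Task> advanceOneTick() {
        currentTime += tickMs;
        List<Task> reached = new ArrayList<>();
        if (overflowWheel != null && currentTime % interval == 0) {
            // The coarser wheel ticks once per full turn; re-insert its tasks here.
            for (Task task : overflowWheel.advanceOneTick()) {
                if (!add(task)) {
                    reached.add(task);
                }
            }
        }
        Deque<Task> slot = slots.get((int) ((currentTime / tickMs) % wheelSize));
        reached.addAll(slot);
        slot.clear();
        return reached;
    }
}
```

With tickMs = 10 and wheelSize = 4, a task due at 95 ms first lands in the overflow wheel, is moved into the base wheel when the clock reaches 80 ms, and is returned once the clock reaches 90 ms. As in any time wheel, a task fires with tick granularity rather than at its exact millisecond.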
Kafka Idempotence
Idempotence Principle
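Since version 0.11, Kafka provides idempotent producers. The guarantee holds per partition and per producer session: the broker assigns each producer a Producer ID (PID), and the producer attaches the PID and a monotonically increasing sequence number to the messages it sends to each partition. The broker compares the incoming sequence number with the last one it has recorded for that PID and partition: the next number in sequence is accepted, a number it has already seen is discarded as a duplicate, and a gap is rejected as out of order.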
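The broker-side check can be sketched as follows. This is a simplified model under stated assumptions, not broker code: the class SequenceCheckSketch, the Result enum, and the per-message granularity are hypothetical (a real broker tracks sequence numbers per batch and also considers the producer epoch).

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the per-partition sequence check; names are illustrative. */
final class SequenceCheckSketch {
    enum Result { ACCEPT, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number, keyed by "pid-partition".
    private final Map<String, Integer> lastSequence = new HashMap<>();

    Result onMessage(long pid, int partition, int sequence) {
        String key = pid + "-" + partition;
        int last = lastSequence.getOrDefault(key, -1); // -1 means nothing accepted yet
        if (sequence == last + 1) {
            lastSequence.put(key, sequence);
            return Result.ACCEPT;       // exactly the next expected message
        }
        if (sequence <= last) {
            return Result.DUPLICATE;    // a retry of something already written
        }
        return Result.OUT_OF_ORDER;     // a gap: at least one message is missing
    }
}
```

Because the state is keyed by PID and partition, a retry sent to the same partition is detected, while the same payload sent by a restarted producer (which gets a new PID) is not.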
Enabling Idempotence
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Note: when idempotence is enabled, acks must be set to "all"
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Enable idempotence
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props);
// Send one record and block until the broker acknowledges it
kafkaProducer.send(new ProducerRecord<>("truman_kafka_center", "1", "hello world.")).get();
kafkaProducer.close();
ISR, OSR, AR
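AR (Assigned Replicas) is the full set of replicas assigned to a partition. ISR (In-Sync Replicas) is the subset, including the leader, that is currently caught up with the leader. OSR (Out-of-Sync Replicas) is the remainder: followers that have fallen too far behind, or newly added followers that are still catching up. AR therefore equals ISR plus OSR.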
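The relationship between the three sets can be illustrated with a small sketch. It assumes the lag-time rule that brokers configure through replica.lag.time.max.ms; the class ReplicaSetsSketch, its method names, and the input map are hypothetical and only model the bookkeeping.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative split of AR into ISR and OSR; not broker code. */
final class ReplicaSetsSketch {
    /** Replicas whose time since last being caught up is within the allowed lag. */
    static Set<Integer> inSync(Map<Integer, Long> msSinceCaughtUp, long maxLagMs) {
        Set<Integer> isr = new TreeSet<>();
        for (Map.Entry<Integer, Long> entry : msSinceCaughtUp.entrySet()) {
            if (entry.getValue() <= maxLagMs) {
                isr.add(entry.getKey());
            }
        }
        return isr;
    }

    /** OSR is whatever remains of the assigned replicas once ISR is removed. */
    static Set<Integer> outOfSync(Set<Integer> assigned, Set<Integer> inSync) {
        Set<Integer> osr = new TreeSet<>(assigned);
        osr.removeAll(inSync);
        return osr;
    }
}
```

The leader is always caught up with itself (lag 0), so it is always part of the ISR.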
HW, LEO, LSO
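HW (High Watermark) is the offset below which messages have been replicated to every in-sync replica; consumers can only read messages below the HW. LEO (Log End Offset) is the offset at which the next message will be written, that is, the last offset in the log plus one. LSO (Last Stable Offset) is used with transactions: it is the offset below which every transaction has been decided (committed or aborted), and consumers using isolation.level=read_committed can only read up to it.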
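How the three offsets relate can be shown with a little arithmetic. The sketch below assumes the usual rules that the HW is the smallest LEO among the in-sync replicas and that the LSO is the first offset of the earliest still-open transaction (or the HW when none is open); the class OffsetsSketch and its method names are hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.OptionalLong;

/** Illustrative offset arithmetic; not broker code. */
final class OffsetsSketch {
    /** HW: the smallest log end offset among the in-sync replicas. */
    static long highWatermark(Collection<Long> isrLogEndOffsets) {
        return Collections.min(isrLogEndOffsets);
    }

    /** LSO: start of the earliest open transaction, capped at the HW. */
    static long lastStableOffset(long highWatermark, OptionalLong earliestOpenTxnOffset) {
        if (earliestOpenTxnOffset.isPresent()) {
            return Math.min(highWatermark, earliestOpenTxnOffset.getAsLong());
        }
        return highWatermark; // no open transaction: everything below HW is stable
    }
}
```

For example, if the ISR has LEOs of 10, 8, and 9, the HW is 8; if a transaction that began at offset 5 is still open, a read_committed consumer can read only offsets below 5.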
Message Ordering
Kafka guarantees ordering only within a single partition. To achieve ordered consumption, either use a single partition with a single consumer (global ordering, at the cost of throughput) or use multiple partitions with a key-based hash strategy, so that messages with the same key land in the same partition and are ordered relative to each other.
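The key-based strategy boils down to hashing the key and taking the result modulo the partition count. The sketch below is an assumption-laden illustration: the class KeyPartitionSketch is hypothetical, and Arrays.hashCode stands in for whatever hash function the producer client actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative key-to-partition mapping; the hash is a stand-in, not the client's own. */
final class KeyPartitionSketch {
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the result is non-negative, then map to a partition.
        return (hash & 0x7fffffff) % numPartitions;
    }
}
```

The mapping is stable only while the partition count stays the same: adding partitions changes the modulo, so messages for an existing key may start landing in a different partition.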
Duplicate Consumption
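Duplicate consumption typically follows a consumer rebalance. If a consumer misses heartbeats, or spends so long processing a batch that it exceeds the poll interval, the group coordinator treats it as dead and reassigns its partitions before the offsets of the in-flight messages are committed, so the new owner reads those messages again. Solutions include speeding up processing or fetching fewer records per poll (max.poll.records), raising the timeouts (session.timeout.ms, max.poll.interval.ms), or implementing a Redis-based deduplication layer that discards messages with the same key within a short time window.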
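The deduplication layer can be sketched with an in-memory map standing in for Redis. In a real deployment the same check would typically be a single atomic Redis command such as SET key value NX EX ttl; the class DedupWindowSketch and its method names are hypothetical, and this version neither evicts expired entries nor shares state across consumers.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a Redis-based deduplication window; illustrative only. */
final class DedupWindowSketch {
    private final long windowMs;
    // Message key -> time (ms) until which the key counts as already seen.
    private final Map<String, Long> seenUntil = new HashMap<>();

    DedupWindowSketch(long windowMs) {
        this.windowMs = windowMs;
    }

    /** Returns true if the message should be processed, false if it is a duplicate. */
    boolean firstSeen(String messageKey, long nowMs) {
        Long expiresAt = seenUntil.get(messageKey);
        if (expiresAt != null && nowMs < expiresAt) {
            return false; // same key seen inside the window: skip it
        }
        seenUntil.put(messageKey, nowMs + windowMs);
        return true;
    }
}
```

A consumer would call firstSeen with a business-level message ID before processing each record and skip the record when it returns false. The window should be longer than the longest expected redelivery delay after a rebalance.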
Conclusion
The article summarizes these common Kafka interview topics and invites readers to share additional questions for future updates.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java