
Understanding Kafka Message Loss: Causes and Mitigation in Broker, Producer, and Consumer

This article explains why Kafka can lose messages at the broker, producer, and consumer levels, describes the underlying mechanisms such as page cache flushing and acknowledgment settings, and provides practical code examples and mitigation strategies to improve reliability.

Architecture Digest

Kafka may lose messages in three components—Broker, Producer, and Consumer—due to its design for high throughput and asynchronous disk writes.

Broker side: Messages are first written to the Linux page cache and flushed to disk later, triggered by one of three conditions: an explicit sync/fsync call, free memory dropping below a threshold, or dirty pages exceeding an age limit. If the machine crashes before the flush, whatever is still in the page cache is lost. Kafka does not offer a per-message synchronous flush; it relies on configurable flush intervals, which can be tuned to narrow the window of loss. For contrast, RocketMQ's CommitLog does implement a synchronous group-commit flush:

GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
service.putRequest(request);
// block until the flush completes or the sync-flush timeout expires
boolean flushOK = request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
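On the Kafka side, the flush behavior described above is governed by broker settings. A sketch of the relevant server.properties entries (the values shown are illustrative, not recommendations; by default Kafka leaves flushing entirely to the OS):

```properties
# Force a flush after this many messages or this many milliseconds
# (illustrative values; defaults leave flushing to the OS page cache)
log.flush.interval.messages=10000
log.flush.interval.ms=1000

# Replication is the primary defence against losing page-cache data:
default.replication.factor=3
min.insync.replicas=2
unclean.leader.election.enable=false
```

Tightening the flush intervals trades throughput for a smaller loss window on a single machine; replication settings protect against that same loss without paying the per-message fsync cost.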

Producer side: To improve performance, the producer batches requests in a local buffer and sends them asynchronously. Message durability depends on the acks setting:

acks=0 – no acknowledgment, highest throughput, highest loss risk.

acks=1 – leader acknowledges after writing to its local log; loss can occur if the leader fails before followers replicate.

acks=all (or -1) – leader waits for all in‑sync replicas to acknowledge, providing the strongest durability guarantee.

Loss can also happen if the producer process stops abruptly, the buffer overflows, or memory pressure forces message dropping. Mitigation approaches include switching to synchronous sends, limiting the producer thread pool, enlarging the buffer, or persisting messages to disk before sending.
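The acks trade-off above is set in producer configuration. A minimal durability-first sketch (the broker address and class name here are illustrative assumptions, not part of the article's code):

```java
import java.util.Properties;

// Durability-first producer settings (a sketch). Pass these to
// new KafkaProducer<>(props) and prefer the send(record, callback) form,
// so delivery failures surface in the callback instead of being dropped silently.
public class DurableProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("acks", "all");                // wait for all in-sync replicas
        props.put("retries", Integer.toString(Integer.MAX_VALUE)); // retry transient failures
        props.put("enable.idempotence", "true"); // broker dedupes retried sends
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```

With acks=all plus idempotence enabled, retries after a transient broker failure neither lose nor duplicate messages, at the cost of higher send latency.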

Consumer side: Consumption involves receiving, processing, and committing offsets. With automatic offset commits, the consumer may commit an offset before processing succeeds, leading to lost messages if processing fails. Example configuration for auto‑commit:

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "true"); // enable auto commit
props.put("auto.commit.interval.ms", "1000"); // commit every 1 s
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        insertIntoDB(record); // may take longer than the 1 s commit interval
    }
}

Switching to manual offset commits ensures at‑least‑once delivery but may cause duplicate processing. Example of manual commit with batch buffering:

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "false"); // disable auto commit
// ... other deserializer configs
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
final int minBatchSize = 200;
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        buffer.add(record);
    }
    if (buffer.size() >= minBatchSize) {
        insertIntoDb(buffer);
        consumer.commitSync(); // commit only after the batch is processed
        buffer.clear();
    }
}

Manual commits guarantee that messages are not lost after processing, though they require careful handling of offsets and may involve more complex low‑level APIs.
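Because manual commits give at-least-once delivery, the processing step must tolerate redelivered records. A minimal sketch of offset-keyed deduplication on the consumer side (the class and method names are illustrative, not part of the Kafka API):

```java
import java.util.HashSet;
import java.util.Set;

// At-least-once delivery can replay records after a crash or rebalance, so the
// sink should be idempotent. One simple scheme: key each record by
// (partition, offset) and skip keys that were already handled. In production
// this set would live in the same store as the data, updated in the same
// transaction as the insert.
public class IdempotentSink {
    private final Set<String> processed = new HashSet<>();

    /** Returns true if the record was handled now, false if it was a duplicate. */
    public boolean process(int partition, long offset, String value) {
        String key = partition + ":" + offset;
        if (!processed.add(key)) {
            return false; // redelivered record: skip the duplicate insert
        }
        // insertIntoDb(value) would run here
        return true;
    }
}
```

The (partition, offset) pair uniquely identifies a record within a topic, which makes it a natural deduplication key when the same consumer group reprocesses a range after a rebalance.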

Tags: Java, Kafka, Consumer, Broker, Producer, Ack, Message Loss
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
