How to Quickly Resolve Massive Kafka Message Backlog in Production
This guide explains why Kafka message backlogs occur, how to diagnose bugs, optimize consumer logic, and use temporary topics for emergency scaling, while emphasizing monitoring, alerts, and proper offset handling to keep your streaming system healthy.
In day-to-day operations, many teams run into massive Kafka message backlogs. The usual causes are consumer bugs (e.g., never committing offsets) or producers writing faster than consumers can process.
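Step 1: Diagnose and fix bugs. Ensure consumers commit offsets after processing. The first snippet below is faulty: it never commits offsets, which matters when enable.auto.commit is set to false, because the committed position never advances and the same messages are read again after a restart or rebalance. The second snippet is the corrected version, which calls consumer.commitSync() after each batch.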
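<code>// Faulty: offsets are never committed (assumes enable.auto.commit=false)
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record);
        // missing commit: the group's committed offset never advances
    }
}
</code> <code>// Fixed: commit once the whole polled batch has been processed
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record);
    }
    // commit the offsets returned by the last poll()
    consumer.commitSync();
}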
</code>
Step 2: Optimize consumer logic. Use multithreading or cut unnecessary computation to raise throughput. For example, if two consumer machines each process 100 messages/s and tuning lifts each to 500 messages/s, the pair handles 1,000 messages/s, or 3.6 million messages per hour.
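One way to apply the multithreading advice is to fan each polled batch out to a worker pool and return only once every message has finished, so that the commit that follows still covers fully processed work. The sketch below is a minimal illustration using only the JDK. `ParallelBatchProcessor`, `processBatch`, and the string messages are hypothetical stand-ins for the consumer loop and `process(record)` above; they are not Kafka APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class ParallelBatchProcessor {
    private final ExecutorService pool;
    private final AtomicInteger processed = new AtomicInteger();

    ParallelBatchProcessor(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Hypothetical stand-in for the article's process(record).
    private void process(String message) {
        processed.incrementAndGet();
    }

    /** Processes one batch in parallel and returns only when every message is done. */
    int processBatch(List<String> batch) {
        List<Future<?>> futures = new ArrayList<>();
        for (String message : batch) {
            futures.add(pool.submit(() -> process(message)));
        }
        for (Future<?> future : futures) {
            try {
                future.get(); // wait for the worker; surfaces any failure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while processing batch", e);
            } catch (ExecutionException e) {
                // Propagate so the caller skips the commit for this batch.
                throw new IllegalStateException("message processing failed", e.getCause());
            }
        }
        return batch.size();
    }

    int processedCount() {
        return processed.get();
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

In the consumer loop, `processBatch` would run between `poll()` and `commitSync()`. Parallel processing gives up ordering within a batch, so this approach fits only workloads that tolerate out-of-order handling.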
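Step 3: Emergency scaling with a temporary topic. When the backlog is urgent, create a temporary topic with many more partitions (e.g., 10×) and forward the backlogged messages there. Because the partition count caps how many consumers in a group can work in parallel, the extra partitions let a larger set of consumers drain the backlog quickly. After the backlog is cleared, revert to the original topology.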
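To judge whether a temporary topic is worth the effort, it helps to estimate how long the backlog takes to drain at the current and the scaled-up rate. The sketch below is a rough planning aid, not a measurement: `BacklogPlanner` and `drainSeconds` are hypothetical names, and the idea that 10× partitions yields 10× throughput is an assumption that holds only if consumers, brokers, and downstream systems scale linearly.

```java
final class BacklogPlanner {
    /**
     * Estimated seconds to clear the backlog, rounded up.
     * Returns -1 when consumers do not outpace producers,
     * because the backlog would then never shrink.
     */
    static long drainSeconds(long backlog, long consumePerSec, long producePerSec) {
        long netRate = consumePerSec - producePerSec;
        if (netRate <= 0) {
            return -1;
        }
        return (backlog + netRate - 1) / netRate; // ceiling division
    }
}
```

With a hypothetical backlog of 3.6 million messages and producers paused, `BacklogPlanner.drainSeconds(3_600_000, 1_000, 0)` returns 3600 (one hour at the tuned rate from Step 2), while `BacklogPlanner.drainSeconds(3_600_000, 10_000, 0)` returns 360 under the assumed 10× speedup.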
Key takeaways:
Set up monitoring and alerts to detect backlog early.
Always check for bugs and optimize consumer code before creating temporary topics.
If messages can expire, schedule a retry job to reprocess the ones that timed out.
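The monitoring takeaway comes down to tracking consumer lag, which per partition is the log end offset minus the committed offset. The sketch below shows that arithmetic using only the JDK. `LagMonitor`, `totalLag`, and `shouldAlert` are hypothetical names, the offset maps would in practice be filled from your Kafka client or monitoring tool, and treating a partition with no committed offset as starting from 0 is a simplifying assumption.

```java
import java.util.Map;

final class LagMonitor {
    /** Total lag: sum over partitions of (log end offset - committed offset). */
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0L;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            // Assumption: a partition without a committed offset counts from 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += Math.max(0L, entry.getValue() - committed);
        }
        return total;
    }

    /** True when the lag exceeds the alert threshold. */
    static boolean shouldAlert(long lag, long threshold) {
        return lag > threshold;
    }
}
```

A scheduled job could call `totalLag` every minute and page the on-call engineer when `shouldAlert` returns true. The threshold is workload-specific and is best derived from normal traffic levels.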
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.