Message Queue Interview Guide: Why Use MQ, Its Pros & Cons, and Comparison of Kafka, ActiveMQ, RabbitMQ, and RocketMQ
This article explains why message queues are used, outlines their advantages and disadvantages, compares Kafka, ActiveMQ, RabbitMQ, and RocketMQ in terms of throughput, latency, reliability and features, and provides interview‑style guidance on high availability, idempotency, data loss prevention, ordering, and scaling strategies.
1. Why Use Message Queues?
Interviewers expect you to describe business scenarios that require decoupling, asynchronous processing, or traffic‑shaping (peak‑shaving), and to explain how MQ solves the technical challenges of tightly coupled synchronous calls.
Decoupling
When system A needs to send data to systems B, C, D and possibly E, direct synchronous calls become fragile; MQ allows A to publish once and let each consumer handle the data independently, reducing inter‑service dependencies.
After introducing MQ, the architecture becomes loosely coupled.
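The publish-once idea can be illustrated with a minimal in-memory sketch. `InMemoryBroker`, the topic name `order.created`, and the inbox lists are hypothetical names invented for this illustration; a real broker would also persist messages and deliver them asynchronously over the network, which this toy version does not attempt.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy stand-in for an MQ: producers publish to a topic, not to consumers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Hand the message to every subscriber; return how many accepted it."""
        delivered = 0
        for handler in self._subscribers[topic]:
            try:
                handler(message)
                delivered += 1
            except Exception:
                # One failing consumer affects neither the producer
                # nor the other consumers.
                continue
        return delivered


broker = InMemoryBroker()
inbox_b: List[dict] = []
inbox_c: List[dict] = []
broker.subscribe("order.created", inbox_b.append)
broker.subscribe("order.created", inbox_c.append)

# System A publishes once and does not know who is listening.
broker.publish("order.created", {"order_id": 42})
```

Adding a system E later only requires one more `subscribe` call; system A's code does not change.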
Asynchronous Processing
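System A writes to its own database in 3 ms, while the calls to downstream systems B, C, and D take 300 ms, 450 ms, and 200 ms respectively. Made synchronously, these add up to 3 + 300 + 450 + 200 = 953 ms, close to 1 s per request. If A instead publishes a message for the downstream writes and returns immediately, the user waits only for A's local write plus the time to enqueue, and B, C, and D process the message in the background.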
Peak‑Shaving
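Suppose traffic is about 100 TPS from midnight to 11 am but spikes to 10,000 TPS between 11 am and 1 pm, while the system can handle only 1,000 TPS. Without a buffer, the spike overwhelms it. With MQ in front, requests accumulate in the queue during the spike and the downstream services pull them at the rate they can sustain, working through the backlog once the peak has passed.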
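A back-of-the-envelope sketch of these numbers shows how large the backlog gets during the two-hour peak and how long it takes to drain afterward. The off-peak inflow of 100 TPS after 1 pm is an assumption (the text gives that rate only for the hours before 11 am), and the function names are illustrative.

```python
def backlog_after_peak(peak_tps: int, capacity_tps: int, peak_seconds: int) -> int:
    """Messages left in the queue when the peak ends."""
    surplus_per_second = max(0, peak_tps - capacity_tps)
    return surplus_per_second * peak_seconds


def drain_seconds(backlog: int, capacity_tps: int, offpeak_tps: int) -> float:
    """Time to empty the backlog while off-peak traffic keeps arriving."""
    spare_capacity = capacity_tps - offpeak_tps
    if spare_capacity <= 0:
        raise ValueError("no spare capacity: the backlog would never drain")
    return backlog / spare_capacity


# 11 am to 1 pm at 10,000 TPS against a 1,000 TPS ceiling.
backlog = backlog_after_peak(peak_tps=10_000, capacity_tps=1_000, peak_seconds=2 * 3600)

# Assumed: traffic falls back to 100 TPS once the peak is over.
hours_to_drain = drain_seconds(backlog, capacity_tps=1_000, offpeak_tps=100) / 3600
```

Under these assumptions the queue holds 64.8 million messages at 1 pm and needs about 20 hours to drain, which is why peak-shaving trades immediate failure for delayed processing rather than removing the load.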
2. Advantages and Disadvantages of Message Queues
Advantages: decoupling, asynchronous processing, and peak-shaving, each of which pays off in the scenarios described above.
Disadvantages:
Reduced system availability – adding an external dependency introduces a new failure point.
Increased complexity – you must handle duplicate consumption, message loss, and ordering.
Consistency challenges – if some downstream services succeed while others fail, data may become inconsistent.
3. Comparison of Kafka, ActiveMQ, RabbitMQ, and RocketMQ
| Feature | ActiveMQ | RabbitMQ | RocketMQ | Kafka |
| --- | --- | --- | --- | --- |
| Single-node throughput | Tens of thousands of messages per second, one order of magnitude lower than RocketMQ/Kafka | Tens of thousands, same as ActiveMQ | Hundreds of thousands, high throughput | Hundreds of thousands, highest throughput; often paired with big-data workloads |
| Impact of topic count on throughput | - | - | Hundreds to thousands of topics cause only a slight throughput drop; supports many topics | Throughput drops sharply when topics exceed a few hundred; keep topic count modest |
| Latency | ms level | µs level (lowest latency) | ms level | ms level |
| Availability | High (master-slave) | High (master-slave) | Very high (distributed) | Very high (distributed, multiple replicas) |
| Message reliability | Low probability of loss | - | Zero loss with proper configuration | Zero loss with proper configuration |
| Feature support | Very complete MQ feature set | Built on Erlang, strong concurrency, low latency | Complete MQ features, good extensibility | Core features only, but excels in high throughput and is the de-facto standard for real-time big-data pipelines |
| Pros & cons summary | Mature and widely used, but community activity is declining; not ideal for massive throughput. | Excellent performance, active community, but lower throughput and complex cluster expansion. | High throughput, Alibaba backing, good for large-scale scenarios, but Java-centric and may require custom development. | Very high throughput, ms latency, high availability; downside is possible duplicate consumption and limited topic count. |
4. Ensuring High Availability of MQ
RabbitMQ
Three deployment modes: single node (demo only), classic cluster (queues reside on a single node, metadata replicated), and mirrored cluster (queues and messages replicated to multiple nodes). Mirrored clusters provide HA but increase network and storage overhead.
Kafka
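Kafka runs as a cluster of brokers. Each topic is divided into partitions, and each partition has multiple replicas spread across brokers. One replica is the leader and handles reads and writes; the followers replicate its data, and if the leader's broker fails, a follower is elected as the new leader. To make this effective, set the replication factor > 1 and min.insync.replicas > 1, and have producers use acks=all with unlimited retries.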
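The four settings can be turned into a small checklist. This is a sketch, not part of any Kafka client: `ha_problems` is a hypothetical helper that only inspects numbers you pass in, and the check that min.insync.replicas must not exceed the replication factor is an extra sanity rule added here, since such a partition could never accept an acks=all write.

```python
from typing import List


def ha_problems(replication_factor: int, min_insync_replicas: int,
                acks: str, retries: int) -> List[str]:
    """Return a list of reasons the settings fall short of the HA guidance."""
    problems: List[str] = []
    if replication_factor <= 1:
        problems.append("replication factor must be > 1, or one broker failure loses the partition")
    if min_insync_replicas <= 1:
        problems.append("min.insync.replicas must be > 1, or a write may exist only on the leader")
    if min_insync_replicas > replication_factor:
        problems.append("min.insync.replicas exceeds the replication factor; acks=all writes cannot succeed")
    if acks != "all":
        problems.append("producer acks must be 'all' so every in-sync replica confirms the write")
    if retries <= 0:
        problems.append("producer retries must be enabled so transient failures are retried")
    return problems


# Example values only: 3 replicas, 2 must be in sync, retries effectively unlimited.
good = ha_problems(replication_factor=3, min_insync_replicas=2, acks="all", retries=2**31 - 1)
weak = ha_problems(replication_factor=1, min_insync_replicas=1, acks="1", retries=0)
```

With a replication factor of 3 and min.insync.replicas of 2, one broker can be down and the partition still accepts writes; setting both to the same value would block writes as soon as any replica falls behind.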
5. Idempotency and Duplicate Consumption
Both RabbitMQ and Kafka can deliver duplicate messages. To achieve idempotency, use business keys, database unique constraints, or check‑before‑write logic (e.g., query by primary key before insert, or store a UUID in Redis).
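A minimal sketch of the unique-constraint approach, using Python's built-in `sqlite3` so it runs without a server. The `payments` table, the `order_id` business key, and the function names are assumptions made up for this example; the point is that the database, not the consumer, rejects the second copy of a message.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The primary key on the business key is what makes redelivery harmless.
    conn.execute(
        "CREATE TABLE payments (order_id TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    return conn


def handle_payment(conn: sqlite3.Connection, message: dict) -> bool:
    """Apply the message once. Return False if it was already applied."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO payments (order_id, amount) VALUES (?, ?)",
                (message["order_id"], message["amount"]),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate delivery: the row exists, so skip it and still acknowledge.
        return False


store = make_store()
msg = {"order_id": "A-1001", "amount": 250}
first = handle_payment(store, msg)    # applied
second = handle_payment(store, msg)   # duplicate, ignored
```

Relying on the constraint avoids the race in a separate query-then-insert, where two consumers could both see "not found" and both insert.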
6. Preventing Message Loss
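On the producer side, RabbitMQ offers transactions or confirm mode so the producer learns whether the broker accepted a message; messages should be marked persistent with deliveryMode=2 and sent to a durable queue. Kafka producers must set acks=all and the topic must have sufficient replication. On the consumer side, acknowledge (RabbitMQ) or commit offsets (Kafka) only after processing succeeds, which means turning off automatic acknowledgment or automatic offset commits.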
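The consumer-side rule can be shown with a toy queue that tracks unacknowledged deliveries. `AckQueue` and `consume_all` are hypothetical names for this sketch and do not correspond to any client library; the raised exception stands in for a consumer crash, after which the broker puts unacknowledged messages back in the queue.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Optional, Tuple


class AckQueue:
    """Toy broker queue: a message is removed only when it is acknowledged."""

    def __init__(self, messages: Iterable[str]) -> None:
        self._ready = deque(messages)
        self._unacked: Dict[int, str] = {}
        self._next_tag = 0

    def get(self) -> Optional[Tuple[int, str]]:
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        body = self._ready.popleft()
        self._unacked[tag] = body
        return tag, body

    def ack(self, tag: int) -> None:
        del self._unacked[tag]

    def requeue_unacked(self) -> None:
        """Return unacknowledged messages to the front, oldest first."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))


def consume_all(queue: AckQueue, process: Callable[[str], None]) -> int:
    """Process until the queue is empty or processing fails; ack after success."""
    processed = 0
    while True:
        item = queue.get()
        if item is None:
            return processed
        tag, body = item
        try:
            process(body)
        except Exception:
            queue.requeue_unacked()  # simulated crash: nothing was acked
            return processed
        queue.ack(tag)  # only now may the broker forget the message
        processed += 1
```

Acknowledging after processing gives at-least-once delivery: a crash between `process` and `ack` causes a redelivery, which is why the idempotency measures from the previous section are still needed.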
7. Maintaining Message Order
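For RabbitMQ, use a separate queue per logical stream with one consumer each, or a single consumer that buffers messages internally and dispatches them in order. For Kafka, ordering is guaranteed only within a partition, so send all messages of one ordered stream to the same partition (for example, by giving them the same message key) and process each partition with a single thread.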
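The following sketch shows why routing by key preserves per-stream order. It uses `zlib.crc32` as a stand-in hash purely for illustration (Kafka's default partitioner uses a different hash function), and `partition_for` and `route` are hypothetical helper names.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # (key, payload)


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash of the key: the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events: Iterable[Event], num_partitions: int) -> Dict[int, List[Event]]:
    """Append each event to its partition in arrival order."""
    partitions: Dict[int, List[Event]] = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "paid"),
    ("order-1", "shipped"),
]
partitions = route(events, num_partitions=4)

# Every event for order-1 sits in one partition, in the order it was produced.
order_1_events = [
    payload
    for key, payload in partitions[partition_for("order-1", 4)]
    if key == "order-1"
]
```

Different keys may share a partition, which is harmless: order is kept within each key, and no ordering is promised across keys.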
8. Handling Massive Backlog and TTL
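When messages have been accumulating for hours, first fix the consumer problem, then scale out: create more partitions or queues and deploy additional consumers, or temporarily divert the backlog to a new topic with higher capacity and drain it with extra consumer instances. If messages expired under a TTL and were dropped, rebuild the missing data from the source system and re-publish it once the load has subsided.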
9. Designing Your Own Message Queue
Key considerations: scalability (partitioned storage like Kafka), persistence (sequential disk writes), high availability (replication and leader election), zero‑loss guarantees (acks, retries), and operational monitoring.
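As a starting point for discussing such a design, here is a minimal sketch of one partition as an append-only log with per-group committed offsets. `PartitionLog` and its methods are hypothetical, and the log lives in memory; a real design would append sequentially to disk and replicate the log to followers, which this sketch leaves out.

```python
from typing import Dict, List


class PartitionLog:
    """One partition: records are only appended, never modified in place."""

    def __init__(self) -> None:
        self._records: List[bytes] = []
        self._committed: Dict[str, int] = {}  # consumer group -> next offset

    def append(self, record: bytes) -> int:
        """Store a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> List[bytes]:
        """Read from the group's committed offset without advancing it."""
        start = self._committed.get(group, 0)
        return self._records[start:start + max_records]

    def commit(self, group: str, count: int) -> None:
        """Advance the group's offset after `count` records were processed."""
        new_offset = self._committed.get(group, 0) + count
        self._committed[group] = min(new_offset, len(self._records))


log = PartitionLog()
for payload in (b"a", b"b", b"c"):
    log.append(payload)

batch = log.poll("billing", max_records=2)  # first two records
log.commit("billing", len(batch))           # commit only after processing
```

Because reading does not delete anything, each consumer group keeps its own position and a group that crashes before committing simply re-reads from its last committed offset.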