
Understanding Message Queues: From Basic Queues to Redis, Kafka, and Pulsar

The article compares basic in‑memory queues, Redis lists and streams, Kafka’s partitioned log architecture, and Pulsar’s compute‑storage separation, explaining each system’s core mechanisms, strengths, and limitations so readers can choose the most suitable message‑queue solution for their workloads.

Tencent Cloud Developer

The article introduces the landscape of message middleware, comparing popular solutions such as RabbitMQ, Kafka, RocketMQ, Pulsar, and Redis. It aims to help readers understand the architectures and principles of these systems by examining Redis, Kafka, and Pulsar in depth.

1. The Most Basic Queue

A simple queue can be implemented as a double‑ended queue (deque) using a doubly linked list. The basic operations are push_front (add to the head) and pop_tail (remove from the tail). Producers add messages to the queue, while consumers retrieve them.

Although this in‑memory structure is easy to build, achieving high performance under massive concurrent reads and writes requires careful engineering.
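Such a queue can be sketched in a few lines of Python using `collections.deque` (the class name and method names below simply mirror the operations described above):

```python
from collections import deque

class SimpleQueue:
    """In-memory FIFO queue: producers push at the head, consumers pop at the tail."""
    def __init__(self):
        self._items = deque()

    def push_front(self, message):
        self._items.appendleft(message)  # producer adds at the head

    def pop_tail(self):
        if not self._items:
            return None  # empty queue; a real system would block or poll
        return self._items.pop()  # consumer removes the oldest message from the tail

q = SimpleQueue()
q.push_front("m1")
q.push_front("m2")
print(q.pop_tail())  # "m1": first in, first out
```

This is a sketch of the abstraction only; everything lives in one process's memory, which is exactly the limitation the rest of the article addresses.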

2. Redis Queue

Redis provides a list data type that directly maps to the deque abstraction. The relevant commands are:

lpush: insert an element at the head of the list;

rpop: remove an element from the tail.
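The semantics can be shown without a running server with a minimal in-memory stand-in (with the real redis-py client the calls would be `r.lpush(...)` / `r.rpop(...)` against a Redis instance):

```python
from collections import deque

class FakeRedisList:
    """Minimal in-memory stand-in for Redis lists, supporting LPUSH and RPOP."""
    def __init__(self):
        self._lists = {}

    def lpush(self, key, value):
        self._lists.setdefault(key, deque()).appendleft(value)
        return len(self._lists[key])  # Redis returns the new list length

    def rpop(self, key):
        lst = self._lists.get(key)
        return lst.pop() if lst else None  # nil when the list is empty or missing

r = FakeRedisList()
r.lpush("jobs", "job-1")
r.lpush("jobs", "job-2")
print(r.rpop("jobs"))  # "job-1": LPUSH + RPOP together behave as a FIFO queue
```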

Using Redis lists as a message queue offers high throughput, well-optimized memory handling, and built-in persistence (AOF/RDB). However, Redis queues have several drawbacks:

Message persistence: AOF/RDB persistence is best-effort; depending on the fsync policy and snapshot interval, a crash can lose recent writes.

Hot-key performance: A single list can become a bottleneck because all operations target the same key on the same Redis instance.

No acknowledgment mechanism: Once rpop succeeds, the message is gone from Redis; if the consumer crashes before processing it, the message is lost.

Single consumer: Each message can be read by only one consumer; multiple subscribers (e.g., monitoring, BI, tracing) cannot share the same stream.

No support for re-consumption: Lists keep no per-consumer offset, so there is nothing to rewind after a failure; already-popped messages cannot be read again.
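One common workaround for the missing acknowledgment is Redis's RPOPLPUSH pattern (LMOVE since Redis 6.2): atomically move each message into a per-consumer "processing" list and delete it only after successful handling. A toy in-memory sketch of the idea (plain Python, no server):

```python
from collections import deque

# Two lists modeled as deques: the pending queue (as left after LPUSH m1,
# then LPUSH m2) and a per-consumer "processing" list. In real Redis the
# pop-and-push is a single atomic command, so a crashed consumer's
# in-flight messages can be recovered from its processing list.
lists = {"pending": deque(["m2", "m1"]), "processing": deque()}

def rpoplpush(src, dst):
    if not lists[src]:
        return None
    msg = lists[src].pop()        # take the oldest message...
    lists[dst].appendleft(msg)    # ...and park it in the processing list
    return msg

def ack(dst, msg):
    lists[dst].remove(msg)        # delete only after successful handling

msg = rpoplpush("pending", "processing")
# ... handle msg; if the consumer crashes here, msg is still in "processing"
ack("processing", msg)
```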

Some of these issues can be mitigated by using RocksDB/LevelDB‑based KV stores that speak the Redis protocol, but many limitations remain.

Redis 5.0 introduced a stream type that adopts Kafka‑like concepts, yet it still falls short of a full‑featured message system.
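Streams add message IDs, consumer groups, and a pending-entries list (PEL), so delivered-but-unacknowledged messages survive consumer crashes. The toy model below sketches the XADD / XREADGROUP / XACK flow; real code would issue those commands via a client such as redis-py against a server:

```python
class TinyStream:
    """Toy model of a Redis stream with one consumer group and a PEL."""
    def __init__(self):
        self.entries = []        # (id, payload), append-only
        self.next_id = 1
        self.group_cursor = 0    # index of the next undelivered entry
        self.pel = {}            # id -> payload: delivered but not yet acked

    def xadd(self, payload):
        entry_id = f"{self.next_id}-0"   # Redis IDs look like "<ms>-<seq>"
        self.next_id += 1
        self.entries.append((entry_id, payload))
        return entry_id

    def xreadgroup(self):
        if self.group_cursor >= len(self.entries):
            return None                   # group has caught up
        entry_id, payload = self.entries[self.group_cursor]
        self.group_cursor += 1
        self.pel[entry_id] = payload      # tracked until XACK
        return entry_id, payload

    def xack(self, entry_id):
        self.pel.pop(entry_id, None)      # done: drop from the pending list

s = TinyStream()
s.xadd("event-1")
eid, _ = s.xreadgroup()
# if the consumer crashed here, "event-1" would still sit in s.pel, recoverable
s.xack(eid)
```

The key contrast with lists: reading does not delete, and the PEL gives the acknowledgment step that rpop lacks.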

3. Kafka

Kafka was designed to address the shortcomings of simple list‑based queues. Its core concepts include:

Partition: A topic is split into multiple partitions, each stored on a different broker, allowing horizontal scaling.

Cursor (offset): Consumers maintain a cursor indicating the next message to read; the log itself is append-only, and consuming a message does not delete it.

Consumer groups: Each group has its own cursor per partition, enabling independent consumption, broadcasting (1-N), and replay.
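The three concepts compose naturally. The sketch below (not Kafka code) models one partition as an append-only log with a per-group cursor, so groups consume independently and can replay by rewinding:

```python
class Partition:
    """One Kafka-style partition: an append-only log plus per-group cursors."""
    def __init__(self):
        self.log = []        # messages are never deleted on consume
        self.cursors = {}    # group name -> offset of the next message to read

    def produce(self, message):
        self.log.append(message)
        return len(self.log) - 1             # offset of the new message

    def consume(self, group):
        offset = self.cursors.get(group, 0)  # new groups start at offset 0
        if offset >= len(self.log):
            return None                      # group has caught up
        self.cursors[group] = offset + 1
        return self.log[offset]

    def seek(self, group, offset):
        self.cursors[group] = offset         # rewind to replay older messages

p = Partition()
p.produce("m0")
p.produce("m1")
print(p.consume("billing"))     # "m0"
print(p.consume("monitoring"))  # also "m0": groups have independent cursors
p.seek("billing", 0)            # billing can replay from the start
```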

Kafka stores each partition as a series of segment files. Segment filenames are the offset of the first message in the segment (e.g., 0.log, 18234.log), which enables binary search to locate a target offset. An accompanying index file (*.index) maps offsets to byte positions, allowing fast random access.
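Because the segment base offsets are sorted, locating the segment that holds a given offset is a binary search; a sketch using Python's bisect, with made-up base offsets for illustration:

```python
import bisect

# Base offsets of a partition's segments, i.e. the numeric part of
# 0.log, 18234.log, 39712.log (illustrative values).
segment_bases = [0, 18234, 39712]

def find_segment(target_offset):
    """Return the base offset of the segment containing target_offset."""
    # bisect_right finds the first base greater than the target;
    # the containing segment is the one just before it.
    i = bisect.bisect_right(segment_bases, target_offset) - 1
    return segment_bases[i]

print(find_segment(20000))  # 18234: offset 20000 lives in 18234.log
```

Within the chosen segment, the *.index file then narrows the search to a byte position.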

Retention is handled by deleting whole expired segments, avoiding costly per‑message deletions. Kafka also supports sparse indexing to reduce index size.

High availability is achieved via replication: each partition has a leader and one or more followers. Producers write to the leader; followers replicate the log. Acknowledgment (ack) policies can be tuned for speed (leader‑only ack) or durability (wait for all in‑sync replicas).
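The durability trade-off can be made concrete with a toy simulation (not Kafka code): with a leader-only policy the producer is acknowledged as soon as the leader has the message, so a leader crash before replication loses it; waiting for all in-sync replicas closes that window at the cost of latency:

```python
def write(message, replicas, acks, replicated):
    """Toy ack model. replicas is a list of replica logs (index 0 = leader);
    replicated=False simulates a leader crash before follower replication."""
    replicas[0].append(message)                  # leader always writes first
    if replicated:
        for follower in replicas[1:]:
            follower.append(message)
    if acks == "leader":
        return True                              # acked regardless of followers
    return all(message in r for r in replicas)   # "all": wait for the full ISR

leader, follower = [], []
acked = write("m1", [leader, follower], acks="leader", replicated=False)
leader.clear()   # leader crashes, losing its unreplicated log
# acked is True, yet "m1" survives nowhere: leader-only acks trade safety for speed
```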

4. Pulsar

Pulsar separates compute and storage. The stateless broker handles client requests, while Apache BookKeeper provides durable, replicated segment storage (ledgers). This design offers:

Easy broker scaling (stateless services can be added or removed without data migration).

Segment‑level distribution across BookKeeper nodes, simplifying storage expansion.

High availability through BookKeeper’s replication parameters (ensemble size, write quorum, ack quorum).
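BookKeeper stripes each entry across a write quorum chosen round-robin from the ledger's ensemble, which spreads load evenly across bookies. A simplified model of that placement (illustrative, not BookKeeper's actual code):

```python
def write_quorum(entry_id, ensemble, write_quorum_size):
    """Bookies that store a given entry: a write quorum of nodes chosen
    round-robin from the ensemble, starting at entry_id % len(ensemble).
    (Simplified model of BookKeeper's striping.)"""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum_size)]

bookies = ["bk1", "bk2", "bk3", "bk4", "bk5"]   # ensemble of 5
print(write_quorum(0, bookies, 3))  # ['bk1', 'bk2', 'bk3']
print(write_quorum(1, bookies, 3))  # ['bk2', 'bk3', 'bk4']: load rotates
```

The ack quorum is then the subset of the write quorum that must confirm before the client considers the entry durable.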

Like Kafka, Pulsar uses partitions and segments, but the actual data resides in BookKeeper ledgers rather than broker disks. This enables rapid rebalancing: if a broker fails, another broker can take over the partition without moving data, because the data is already stored in the distributed ledger.

Pulsar also introduces a richer subscription model (exclusive, failover, shared, key‑shared), providing flexible consumption patterns beyond Kafka’s consumer‑group model.
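These modes differ mainly in how messages are dispatched to the consumers attached to one subscription. A toy dispatcher sketches three of them (failover behaves like exclusive with automatic takeover); the names and hashing are illustrative, not Pulsar's implementation:

```python
def dispatch(messages, consumers, mode):
    """Assign each (key, payload) message to a consumer per subscription mode."""
    assignments = {c: [] for c in consumers}
    for i, (key, payload) in enumerate(messages):
        if mode == "exclusive":        # the single active consumer gets everything
            target = consumers[0]
        elif mode == "shared":         # round-robin across all consumers
            target = consumers[i % len(consumers)]
        elif mode == "key_shared":     # same key always maps to the same consumer
            target = consumers[hash(key) % len(consumers)]
        assignments[target].append(payload)
    return assignments

msgs = [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]
print(dispatch(msgs, ["c1", "c2"], "shared"))
# {'c1': ['a', 'c'], 'c2': ['b']}: work is spread, per-key ordering is not kept
```

Key-shared recovers per-key ordering while still fanning work out, which is why it is often the practical middle ground.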

5. Summary of Architectural Trade‑offs

Both Kafka and Pulsar achieve high throughput, low latency, and strong durability, but they differ in storage handling and scalability. Kafka’s monolithic broker‑local storage makes scaling storage harder, while Pulsar’s compute‑storage separation makes storage expansion straightforward. Understanding these design choices helps engineers select the right platform for their workload.

Tags: architecture, Redis, streaming, Kafka, message queue, Pulsar, distributed systems
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
