Backend Development 15 min read

Design and Implementation of Delayed Message Queues in Distributed Systems

This article examines various delayed‑message implementations in distributed message‑queue systems, comparing external‑storage approaches using databases, RocksDB and Redis, and reviewing built‑in solutions in open‑source MQs such as RocketMQ, Pulsar and QMQ, while discussing their advantages, drawbacks and design considerations.

Code Ape Tech Column
Code Ape Tech Column
Code Ape Tech Column
Design and Implementation of Delayed Message Queues in Distributed Systems

Delayed (scheduled) messages refer to messages sent by a producer in a distributed asynchronous messaging scenario that are intended to be consumed at a specified future time or after a certain delay, rather than being consumed immediately.

These messages are widely applicable in many business scenarios, and in distributed system environments the delayed‑message functionality is typically provided at the middleware layer, either built into the MQ itself or offered as a shared foundational service.

This article explores common implementation schemes for delayed messages and analyzes the pros and cons of each design.

Implementation Schemes

Solutions Based on External Storage

The "external storage" discussed here refers to storage systems introduced in addition to the MQ's native storage.

External‑storage solutions generally separate the MQ from a dedicated delayed‑message module, which stores messages in another medium until they expire, then delivers them to the MQ. Some edge‑case logic, such as immediate delivery if a message is already expired when entering the module, is omitted for brevity.

Based on a Database (e.g., MySQL)

Using a relational database table to hold delayed messages.

CREATE TABLE `delay_msg` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `delivery_time` DATETIME NOT NULL COMMENT '投递时间',
  `payloads` blob COMMENT '消息内容',
  PRIMARY KEY (`id`),
  KEY `time_index` (`delivery_time`)
);

A scheduled thread periodically scans for expired messages and delivers them; the scan interval defines the minimum time granularity of the delayed messages.

Advantages:

Simple to implement.

Disadvantages:

B‑Tree indexes are not optimal for the high‑write workload typical of messaging.

Based on RocksDB

This approach selects a storage medium more suitable for heavy writes. RocksDB uses an LSM‑Tree, which fits the message‑heavy scenario. The open‑source DDMQ project’s Chronos module adopts this design.

In DDMQ, a proxy layer sits in front of RocketMQ; delayed messages are first placed into a dedicated topic, then Chronos consumes them, stores them in RocksDB, and later re‑injects them into RocketMQ when they expire.

Advantages:

RocksDB’s LSM‑Tree handles massive writes efficiently.

Disadvantages:

The solution is heavyweight; developers must implement their own data‑replication and disaster‑recovery logic for RocksDB.

Based on Redis

A more complete Redis‑based design includes:

Messages Pool: a KV store where the key is the message ID and the value is the full message (stored as a Redis hash for O(1) HSET/HGET operations).

Delayed Queue: 16 horizontally‑scalable ordered queues implemented with ZSETs; the score is the expiration timestamp, and the value is the message ID from the pool. Multiple queues improve scan speed.

Worker: a processing thread that periodically scans the Delayed Queues for expired messages.

Redis ZSETs are well‑suited for delayed queues, offering O(log n) insertion while benefiting from Redis’s in‑memory performance optimizations.

Potential issues include handling concurrent processing across nodes and ensuring that duplicate deliveries do not occur, which may require distributed locks or a master‑slave arrangement for low‑traffic scenarios.

Defects of Timer‑Thread Scanning and Improvements

All the above schemes rely on a timer thread to scan for expired messages. When the message volume is low, this wastes resources; when the volume is high, an inappropriate scan interval can cause inaccurate delays. A more efficient approach borrows from JDK’s Timer class, using wait‑notify to sleep until the next message’s delivery time, waking early only when a newer earlier message arrives.

Implementation in Open‑Source MQs

RocketMQ

RocketMQ supports delayed messages via 18 predefined levels (e.g., 1 s, 5 s, 10 s, …, 2 h). Messages are stored in a special topic SCHEDULE_TOPIC_XXXX and routed to a specific queue based on the level (queueId = delayTimeLevel – 1). The broker later delivers them to the real topic.

Advantages:

Fixed levels keep overhead low.

Messages with the same level share a queue, preserving order.

Scheduling is reduced to simple appends to level‑specific topics.

Disadvantages:

Changing level configurations is costly and inflexible.

Delayed messages increase the size of the CommitLog.

Pulsar

Pulsar allows arbitrary‑time delayed messages. The client sends the message to the target topic; the broker creates an off‑heap priority queue (per subscription group) that indexes delayed messages by their delivery time. The consumer checks this queue and pulls messages when they become due.

Drawbacks:

High memory consumption: each subscription group creates its own priority queue, and large time spans increase memory usage.

Rebuilding the index after a broker failure can take hours for massive delayed‑message workloads.

Storage overhead: delayed messages keep the underlying topic data for the full delay period, preventing space reclamation.

Community solutions propose time‑partitioned queues that load only near‑term partitions into memory, persisting older partitions to disk, though an official implementation is not yet available.

QMQ

QMQ offers arbitrary‑time delayed/scheduled messages (configurable up to two years). Its design combines a multi‑level time wheel, delayed loading, and separate disk storage for delayed messages.

The first level resides on disk with an hourly granularity, generating a schedule log file per hour. The second level lives in memory with a 500 ms granularity; when a message’s delivery time approaches, its index is loaded from disk into the in‑memory wheel for fast dispatch.

Design highlights:

Time‑wheel algorithm provides O(1) insertion and deletion, ideal for delayed‑message workloads.

Multi‑level wheels support very large time spans.

Delayed loading keeps only near‑term messages in memory, reducing memory pressure.

Separate storage (schedule log) prevents delayed messages from affecting normal message space reclamation.

In summary, the article surveys common delayed‑message designs, evaluates their strengths and weaknesses, and aims to inspire readers when choosing an appropriate solution for their distributed messaging needs.

distributed systemsbackend developmentRedismessage queueRocketMQPulsarRocksDBDelayed Messages
Code Ape Tech Column
Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.