
Delayed Queue Technology Research and Implementation Overview

This article surveys various delayed queue implementations—including Kafka, RocketMQ, Redis (Redisson), and Netty’s HashedWheelTimer—examining their design principles, advantages, drawbacks, and integration strategies, and proposes a unified micro‑service architecture leveraging Kafka topics, Redis ZSETs, and thread‑pool optimizations for reliable message scheduling.

Top Architect

Project Background

A delayed queue is a message queue with built‑in delay functionality, needed for several business scenarios in the author's current work.

Kafka

Consideration: The project already uses Kafka for business communication, so integrating a native Kafka delayed queue is desirable.

Idea: Borrow RocketMQ’s design by creating one topic per delay interval (e.g., delay‑minutes‑1), sending delayed messages to these topics, and later moving them to the real target topic.

When sending a delayed message, publish to a dedicated delay topic instead of the target topic.

Periodically poll the delay topic, and forward messages whose delay has expired to the actual target topic.

Solution: Use KafkaConsumer’s pause/resume API to hold consumption until the delay condition is met, then resume and forward the message.

Drawbacks: High internal complexity, the need for health checks on the forwarding consumers, and limited flexibility, because only the predefined delay intervals are available rather than arbitrary delay times.
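The fixed-interval idea can be sketched without any Kafka dependency. The class below is only an illustration: the name DelayTopics, the level list (1, 5, 10, 30 minutes) and the ASCII topic names are assumptions, not part of the original design. It shows how a requested delay is rounded up to the nearest predefined topic, and how a forwarding consumer could compute how long a record still has to wait (for example, how long the partition could stay paused).

```java
/** Hypothetical helper: maps a requested delay onto a fixed set of delay topics. */
class DelayTopics {
    // Assumed delay levels in minutes; one Kafka topic per level.
    static final long[] LEVEL_MINUTES = {1, 5, 10, 30};
    static final String PREFIX = "delay-minutes-";

    /** Returns the topic of the smallest level that covers the requested delay. */
    static String topicFor(long delayMs) {
        for (long minutes : LEVEL_MINUTES) {
            if (delayMs <= minutes * 60_000L) {
                return PREFIX + minutes;
            }
        }
        throw new IllegalArgumentException("delay exceeds the largest level: " + delayMs);
    }

    /** Delay in milliseconds represented by a topic name produced by topicFor. */
    static long levelMs(String topic) {
        return Long.parseLong(topic.substring(PREFIX.length())) * 60_000L;
    }

    /**
     * Time a record still has to wait before it may be forwarded.
     * 0 means the record is due; a positive value is the remaining wait.
     */
    static long remainingMs(long recordTimestampMs, String topic, long nowMs) {
        return Math.max(0L, recordTimestampMs + levelMs(topic) - nowMs);
    }

    public static void main(String[] args) {
        String topic = topicFor(90_000L); // 90 s is rounded up to the 5-minute level
        System.out.println(topic);
        System.out.println(remainingMs(0L, topic, 120_000L));
    }
}
```

Because every record in one delay topic carries the same delay, records in a partition become due roughly in arrival order, so checking the head record is usually enough. The rounding in topicFor is also where the flexibility limitation shows up.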

RocketMQ

Consideration: RocketMQ’s delay queue is fully encapsulated and easy to use.

Principle: Messages are assigned a delayLevel (18 levels) and placed into corresponding queues; a timer periodically scans these queues and delivers expired messages.

Drawbacks: Requires a deep understanding of the source code, and the single‑threaded timer may become a bottleneck under high load.
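As a rough illustration of the level-based principle, the sketch below resolves a delayLevel to a delay in milliseconds. The level string is RocketMQ's default messageDelayLevel setting; the class name DelayLevels and its parsing code are hypothetical and are not taken from RocketMQ's source.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: resolve a 1-based delayLevel to a delay in milliseconds. */
class DelayLevels {
    // Default value of the broker's messageDelayLevel setting (18 levels).
    static final String DEFAULT_LEVELS =
            "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    /** Parses a level string into a map from level number to milliseconds. */
    static Map<Integer, Long> parse(String levels) {
        Map<Character, Long> unitMs = new HashMap<>();
        unitMs.put('s', 1_000L);
        unitMs.put('m', 60_000L);
        unitMs.put('h', 3_600_000L);
        unitMs.put('d', 86_400_000L);

        Map<Integer, Long> table = new HashMap<>();
        String[] parts = levels.trim().split("\\s+");
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            char unit = part.charAt(part.length() - 1);
            Long factor = unitMs.get(unit);
            if (factor == null) {
                throw new IllegalArgumentException("unknown time unit in: " + part);
            }
            long amount = Long.parseLong(part.substring(0, part.length() - 1));
            table.put(i + 1, amount * factor); // levels are numbered from 1
        }
        return table;
    }

    public static void main(String[] args) {
        Map<Integer, Long> table = parse(DEFAULT_LEVELS);
        System.out.println("levels: " + table.size());
        System.out.println("level 3 delay in ms: " + table.get(3));
    }
}
```

Each level corresponds to one internal queue, so all messages in a queue share the same delay and the timer only has to look at the head of each queue.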

Redis (Redisson)

Consideration: Redisson provides a ready‑made delayed queue: obtain a blocking queue with getBlockingQueue() and wrap it with getDelayedQueue().

Core Structures:

Delay Queue – stores incoming delayed items.

Blocking Queue – holds items whose delay has expired, awaiting consumption.

Timeout ZSET – a sorted set where the score is the expiration timestamp, used to detect expired items.

Timer Implementation: When a new item is added, Redis Pub/Sub publishes its timeout to the subscribed clients. Each client runs a HashedWheelTimer task that fires at that time, checks the ZSET, and moves expired items to the blocking queue.

Drawbacks: Potential duplicate timer tasks across clients, concurrency safety concerns, and the need for a Redis cluster when data volume grows.
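The interplay of the three structures can be modelled in memory. This is not Redisson code: the class InMemoryDelayedQueue is a hypothetical stand-in in which a TreeMap plays the role of the timeout ZSET and a LinkedBlockingQueue plays the role of the blocking queue. Time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** In-memory model of a delayed queue: timeout index plus ready queue. */
class InMemoryDelayedQueue {
    // Stand-in for the timeout ZSET: expiration timestamp -> items due then.
    private final TreeMap<Long, List<String>> timeouts = new TreeMap<>();
    // Stand-in for the blocking queue that consumers read from.
    private final BlockingQueue<String> ready = new LinkedBlockingQueue<>();

    /** Registers an item that becomes visible to consumers after delayMs. */
    synchronized void offer(String item, long delayMs, long nowMs) {
        timeouts.computeIfAbsent(nowMs + delayMs, k -> new ArrayList<>()).add(item);
    }

    /** Moves every item whose expiration has passed; returns how many moved. */
    synchronized int transferExpired(long nowMs) {
        int moved = 0;
        Iterator<Map.Entry<Long, List<String>>> it =
                timeouts.headMap(nowMs, true).entrySet().iterator();
        while (it.hasNext()) {
            List<String> due = it.next().getValue();
            ready.addAll(due);
            moved += due.size();
            it.remove(); // removing from the view also removes from the TreeMap
        }
        return moved;
    }

    /** Earliest pending expiration, or -1 if nothing is pending. */
    synchronized long nextExpiration() {
        return timeouts.isEmpty() ? -1L : timeouts.firstKey();
    }

    BlockingQueue<String> readyQueue() {
        return ready;
    }
}
```

In this model a timer would be armed for nextExpiration() and would call transferExpired when it fires. With several clients doing the same against a shared Redis, the duplicate-timer concern listed above becomes visible.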

Youzan Delayed Queue

Reference: Youzan delayed queue design.

Components:

Job: Basic unit containing jobId, topic, delayTime, ttr (time to run, i.e. the execution timeout), and message payload.

Job Pool: Map storing original job information.

Delay Bucket: Redis ZSET grouping jobs by execution time.

Timer: Periodically scans buckets and moves due jobs to a Ready Queue (implemented here as a common Kafka topic).

Modifications:

Replace Redisson’s Pub/Sub timer with ZSET head polling.

Introduce a thread‑pool to accelerate message handling.

Use Redis SETNX for a simple distributed lock in cluster mode, ensuring only one timer thread runs.
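The locking idea can be illustrated with an in-memory stand-in. The class TimerLock below is hypothetical and does not talk to Redis; it mimics set-if-absent semantics with an expiry, which in Redis roughly corresponds to a SET with the NX and PX options. The expiry is an assumption added here so that a crashed lock holder does not block the timer forever.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a set-if-absent lock with a time to live. */
class TimerLock {
    private static final class Holder {
        final String owner;
        final long expiresAtMs;

        Holder(String owner, long expiresAtMs) {
            this.owner = owner;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Holder> store = new HashMap<>();

    /** Succeeds only if the key is absent or the previous holder has expired. */
    synchronized boolean tryAcquire(String key, String owner, long ttlMs, long nowMs) {
        Holder current = store.get(key);
        if (current != null && current.expiresAtMs > nowMs) {
            return false; // another node currently runs the timer
        }
        store.put(key, new Holder(owner, nowMs + ttlMs));
        return true;
    }

    /** Releases the lock only if the caller still owns it. */
    synchronized boolean release(String key, String owner) {
        Holder current = store.get(key);
        if (current != null && current.owner.equals(owner)) {
            store.remove(key);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TimerLock lock = new TimerLock();
        System.out.println(lock.tryAcquire("delay-timer", "node-a", 3_000L, 0L));
        System.out.println(lock.tryAcquire("delay-timer", "node-b", 3_000L, 1_000L));
    }
}
```

Only the node that acquires the lock would run the scan. The other nodes retry later and take over once the holder stops renewing the lock.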

Overall Execution Flow

Business services publish tasks to an entrance Kafka topic, creating delayed jobs stored in a bucket.

A timer periodically scans buckets; when a job’s delay expires, it is sent to a common Kafka topic.

Consumers read from the Kafka topic and execute the business logic.

The Kafka producer requires broker acknowledgment for each message it sends, to guarantee at‑least‑once delivery.
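The flow above can be sketched in memory as follows. Everything here is a simplification under stated assumptions: DelayFlowSketch is a hypothetical class, a PriorityQueue stands in for the Redis ZSET bucket, a list stands in for the common Kafka topic, the Job carries only some of the fields, and the current time is passed in by the caller.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** In-memory sketch of: accept job -> scan bucket head -> dispatch via thread pool. */
class DelayFlowSketch {
    static final class Job {
        final long jobId;
        final String topic;
        final long dueAtMs;
        final String message;

        Job(long jobId, String topic, long dueAtMs, String message) {
            this.jobId = jobId;
            this.topic = topic;
            this.dueAtMs = dueAtMs;
            this.message = message;
        }
    }

    // Job pool: original job data by id.
    final Map<Long, Job> jobPool = new ConcurrentHashMap<>();
    // Stand-in for the common Kafka topic that consumers read from.
    final List<String> sent = new CopyOnWriteArrayList<>();
    // Stand-in for the ZSET bucket, ordered by due time.
    private final PriorityQueue<Job> bucket =
            new PriorityQueue<>(Comparator.comparingLong((Job j) -> j.dueAtMs));
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    synchronized void accept(Job job) {
        jobPool.put(job.jobId, job);
        bucket.add(job);
    }

    /** One timer pass: polls the head while it is due and hands jobs to the pool. */
    synchronized int scanOnce(long nowMs) {
        int dispatched = 0;
        while (!bucket.isEmpty() && bucket.peek().dueAtMs <= nowMs) {
            Job job = bucket.poll();
            workers.submit(() -> {
                sent.add(job.topic + ":" + job.message); // "send" first
                jobPool.remove(job.jobId);               // then forget the job
            });
            dispatched++;
        }
        return dispatched;
    }

    /** Stops the pool and waits for in-flight sends; true if all finished. */
    boolean shutdown() {
        workers.shutdown();
        try {
            return workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Handing the send to a pool keeps the scan loop short, so a slow send does not postpone the next pass. Removing the job from the pool only after the send is one way to lean toward at-least-once behaviour in this sketch.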

Microservice Architecture Diagram

Message Formats

Entrance Topic (delay_entrance_topic) Fields:

Property       Type    Required  Description
realTopicName  string  Yes       Business target topic
delayTime      long    Yes       Delay duration
message        string  Yes       JSON payload

Exit Topic (delay_exit_topic) Fields:

Property       Type    Required  Description
delayJobId     long    Yes       Internal job identifier
realTopicName  string  Yes       Business target topic
message        string  Yes       JSON payload

Potential Issues

Message persistence: Redis loss leads to delayed‑message loss; consider backup or persisting to MongoDB.

Time‑wheel accuracy: Hand due jobs to a thread pool so that slow message handling does not hold up the timer and increase the timing error.

Cluster reliability: Distributed lock needed to avoid multiple timers running simultaneously.

Message reliability: Ensure at‑least‑once delivery with proper acknowledgment.

Netty HashedWheelTimer

The worker's core loop computes the deadline of the next tick, sleeps until it arrives, and then processes the expired bucket. The excerpt below is the waiting step:

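// Waits until the next tick is due. Returns the elapsed time in nanoseconds
// since the timer started, or Long.MIN_VALUE if the timer was shut down.
private long waitForNextTick() {
    // Time (relative to startTime, in nanoseconds) at which the next tick is due.
    long deadline = tickDuration * (tick + 1);
    for (;;) {
        final long currentTime = System.nanoTime() - startTime;
        // Remaining time converted to milliseconds, rounded up.
        long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;
        if (sleepTimeMs <= 0) {
            // The tick is due. Long.MIN_VALUE is reserved as the shutdown signal,
            // so it is never returned as a regular timestamp.
            if (currentTime == Long.MIN_VALUE) {
                return -Long.MAX_VALUE;
            } else {
                return currentTime;
            }
        }
        // Workaround for a JVM timing issue on Windows:
        // round the sleep time down to a multiple of 10 ms.
        if (PlatformDependent.isWindows()) {
            sleepTimeMs = sleepTimeMs / 10 * 10;
        }
        try {
            Thread.sleep(sleepTimeMs);
        } catch (InterruptedException ignored) {
            // An interrupt during shutdown tells the worker to stop.
            if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                return Long.MIN_VALUE;
            }
        }
    }
}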
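To make the wheel itself concrete, here is a deliberately simplified, single-threaded sketch of a hashed wheel. It is not Netty's implementation: the class SimpleWheel and its methods are hypothetical, delays are expressed in ticks instead of nanoseconds, and the caller drives the clock by calling advance() instead of a worker thread sleeping between ticks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified hashed wheel: an array of buckets plus a "remaining rounds" counter. */
class SimpleWheel {
    private static final class Timeout {
        final Runnable task;
        long remainingRounds; // full wheel turns left before the task fires

        Timeout(Runnable task, long remainingRounds) {
            this.task = task;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Timeout>> wheel = new ArrayList<>();
    private long tick = 0; // index of the next tick to process

    SimpleWheel(int size) {
        for (int i = 0; i < size; i++) {
            wheel.add(new ArrayList<>());
        }
    }

    /** Schedules a task to fire on the delayTicks-th call to advance() from now. */
    void schedule(Runnable task, long delayTicks) {
        long target = tick + Math.max(1, delayTicks) - 1; // absolute tick of expiry
        long rounds = (target - tick) / wheel.size();
        int index = (int) (target % wheel.size());
        wheel.get(index).add(new Timeout(task, rounds));
    }

    /** Processes one tick and returns the number of tasks that fired. */
    int advance() {
        List<Timeout> bucket = wheel.get((int) (tick % wheel.size()));
        List<Runnable> due = new ArrayList<>();
        Iterator<Timeout> it = bucket.iterator();
        while (it.hasNext()) {
            Timeout timeout = it.next();
            if (timeout.remainingRounds <= 0) {
                it.remove();
                due.add(timeout.task);
            } else {
                timeout.remainingRounds--; // not this turn; wait another round
            }
        }
        tick++;
        // Run after the scan so tasks may schedule new timeouts safely.
        for (Runnable task : due) {
            task.run();
        }
        return due.size();
    }
}
```

Insertion is O(1) because the bucket index is computed directly from the expiry tick. The price is that each tick has to walk its whole bucket, including entries that still have rounds left.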

Kafka Time‑Wheel (DelayQueue + PriorityQueue)

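Kafka places the wheel's buckets in a java.util.concurrent.DelayQueue, which is backed by a PriorityQueue and orders the buckets by expiration time. The timer blocks on the queue until the next bucket expires, advances the wheel's clock to that bucket's expiration, and flushes the bucket, so idle ticks cost nothing.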

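// Re-adds an entry to the wheel; an entry that has already expired is executed instead.
private[this] val reinsert = (timerTaskEntry: TimerTaskEntry) => addTimerTaskEntry(timerTaskEntry)

def advanceClock(timeoutMs: Long): Boolean = {
    // Block for at most timeoutMs until the earliest bucket expires.
    var bucket = delayQueue.poll(timeoutMs, TimeUnit.MILLISECONDS)
    if (bucket != null) {
        writeLock.lock()
        try {
            while (bucket != null) {
                // Move the wheel's clock forward to the bucket's expiration time.
                timingWheel.advanceClock(bucket.getExpiration())
                // Re-process every entry in the bucket.
                bucket.flush(reinsert)
                // Drain any further buckets that have expired, without blocking.
                bucket = delayQueue.poll()
            }
        } finally {
            writeLock.unlock()
        }
        true
    } else {
        false
    }
}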
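The blocking behaviour can be tried out with the JDK alone. The Java sketch below is an assumption-laden miniature, not Kafka's code: BucketTimer and Bucket are hypothetical names, there is a single flat set of buckets instead of hierarchical wheels, and an expired bucket simply runs its tasks rather than reinserting them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Miniature timer: buckets wait in a DelayQueue until their expiration. */
class BucketTimer {
    static final class Bucket implements Delayed {
        final long expirationMs;
        // Not thread-safe: fill the bucket before adding it to the timer.
        final List<Runnable> tasks = new ArrayList<>();

        Bucket(long expirationMs) {
            this.expirationMs = expirationMs;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            long remaining = expirationMs - System.currentTimeMillis();
            return unit.convert(remaining, TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(expirationMs, ((Bucket) other).expirationMs);
        }
    }

    private final DelayQueue<Bucket> delayQueue = new DelayQueue<>();

    void add(Bucket bucket) {
        delayQueue.offer(bucket);
    }

    /** Waits up to timeoutMs for an expired bucket; true if any bucket ran. */
    boolean advanceClock(long timeoutMs) {
        Bucket bucket;
        try {
            bucket = delayQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (bucket == null) {
            return false;
        }
        while (bucket != null) {
            for (Runnable task : bucket.tasks) {
                task.run();
            }
            bucket = delayQueue.poll(); // other expired buckets, without blocking
        }
        return true;
    }
}
```

The thread sleeps inside poll() until the earliest bucket is due, so it wakes once per non-empty bucket instead of once per tick.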

XXL-JOB

XXL-JOB uses two threads: scheduleThread scans the database for tasks due within the next 5 seconds and puts them into the time wheel; ringThread processes the task lists of expired slots and dispatches them via RPC.
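The division of labour between the two threads can be sketched as follows. This is an illustration under assumptions, not XXL-JOB's source: the class SecondRing is hypothetical, the ring is assumed to have one slot per second of a minute, overdue jobs are ignored, and the database scan and the RPC dispatch are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a per-second ring shared by a scheduling thread and a ring thread. */
class SecondRing {
    static final long PRE_READ_MS = 5_000L; // look-ahead window of the scan
    private final Map<Integer, List<Integer>> ring = new HashMap<>();

    /** Slot index for a timestamp: the second within the minute. */
    static int slotOf(long timeMs) {
        return (int) ((timeMs / 1000) % 60);
    }

    /** Role of scheduleThread: enqueue a job that is due within the window. */
    synchronized boolean offerIfDueSoon(int jobId, long triggerTimeMs, long nowMs) {
        if (triggerTimeMs < nowMs || triggerTimeMs > nowMs + PRE_READ_MS) {
            return false; // overdue or too far ahead: not handled in this sketch
        }
        ring.computeIfAbsent(slotOf(triggerTimeMs), k -> new ArrayList<>()).add(jobId);
        return true;
    }

    /** Role of ringThread: remove and return the jobs of the current second. */
    synchronized List<Integer> takeDue(long nowMs) {
        List<Integer> due = ring.remove(slotOf(nowMs));
        return due == null ? new ArrayList<>() : due;
    }

    public static void main(String[] args) {
        SecondRing ring = new SecondRing();
        ring.offerIfDueSoon(7, 63_000L, 60_000L);
        System.out.println(ring.takeDue(63_000L));
    }
}
```

Because only jobs due within the next few seconds enter the ring, a 60-slot ring never needs a rounds counter. The database remains the long-term store for everything further in the future.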

Conclusion

The key to designing a delayed queue lies in two core problems: (1) sorting delayed tasks, and (2) detecting when a task’s execution time arrives. Various implementations—RocketMQ’s level‑based buckets, Netty’s array‑based wheel, Kafka’s priority‑queue wheel, and Redis’s ZSET‑based approach—solve these problems differently. Choose the solution that best fits your business requirements and scale.
