
Implementing a Reliable Delay Queue with Redis and Go

This article explains how to build a reliable delayed message queue on Redis. It covers the motivating business scenarios, requirements such as persistence and retry, the Redis data structures involved, the Lua scripts that make state transitions atomic, and a Go implementation with example code for producing and consuming delayed tasks.

Architecture Digest

We first examine two typical business scenarios: closing orders that remain unpaid after a deadline, and sending an activation SMS to newly created stores that have not yet uploaded any products.

The simplest solution is periodic table scanning, but it introduces unacceptable latency and high database load.

Alternative approaches such as Redis key-expiration notifications, time wheels, or Java's DelayQueue each have drawbacks: imprecise timing, lack of persistence, or possible message loss.

To meet the requirements of persistence, a retry mechanism, and precise scheduling, a message queue with native delayed delivery would work, but to avoid introducing extra middleware we implement the delay queue directly on Redis.

Redis data structures used include:

msgKey: a string key, one per message, named with the message's UUID and storing the payload; a separate key per message makes it possible to set a TTL.

pendingKey: a sorted set whose members are message IDs and whose scores are delivery timestamps.

readyKey: a list of message IDs that are ready for consumption.

unAckKey: a sorted set of messages currently being processed, with scores as retry timestamps.

retryKey: a list of message IDs whose retry time has arrived.

garbageKey: a set of message IDs that have exhausted their retries and await deletion.

retryCountKey: a hash mapping each message ID to its remaining retry count.
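Before looking at the scripts, it helps to see the whole key layout in one place. The sketch below (plain Go, no Redis required) models one plausible naming scheme; the `dq:` prefix and exact suffixes are assumptions for illustration, not necessarily the library's actual names.

```go
package main

import "fmt"

// delayQueue holds the Redis key names derived from a queue name.
// One key of each kind exists per queue, plus one msgKey per message.
type delayQueue struct {
	name          string
	pendingKey    string // sorted set: msg id -> delivery timestamp
	readyKey      string // list: msg ids ready for consumption
	unAckKey      string // sorted set: msg id -> retry timestamp
	retryKey      string // list: msg ids whose retry time arrived
	retryCountKey string // hash: msg id -> remaining retries
	garbageKey    string // set: msg ids awaiting deletion
}

// newDelayQueue derives every per-queue key from the queue name,
// so multiple queues can share one Redis instance without collisions.
func newDelayQueue(name string) *delayQueue {
	return &delayQueue{
		name:          name,
		pendingKey:    "dq:" + name + ":pending",
		readyKey:      "dq:" + name + ":ready",
		unAckKey:      "dq:" + name + ":unack",
		retryKey:      "dq:" + name + ":retry",
		retryCountKey: "dq:" + name + ":retry:cnt",
		garbageKey:    "dq:" + name + ":garbage",
	}
}

// genMsgKey returns the per-message string key that stores the payload
// and carries the TTL.
func (q *delayQueue) genMsgKey(id string) string {
	return "dq:" + q.name + ":msg:" + id
}

func main() {
	q := newDelayQueue("example-queue")
	fmt.Println(q.pendingKey)            // dq:example-queue:pending
	fmt.Println(q.genMsgKey("uuid-1234")) // dq:example-queue:msg:uuid-1234
}
```

Because every state lives under a distinct key, each transition below is simply a move of a message ID from one key to another.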

Four Lua scripts guarantee atomic state transitions:

pending2ReadyScript moves messages whose delivery time has arrived from pendingKey to readyKey using ZRANGEBYSCORE and LPUSH.

-- keys: pendingKey, readyKey
-- argv: currentTime
local msgs = redis.call('ZRangeByScore', KEYS[1], '0', ARGV[1])
if (#msgs == 0) then return end
local args2 = {'LPush', KEYS[2]}
for _,v in ipairs(msgs) do
    table.insert(args2, v)
end
redis.call(unpack(args2))
redis.call('ZRemRangeByScore', KEYS[1], '0', ARGV[1])
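To make the script's effect concrete, here is an in-memory model of the same transition in plain Go (no Redis involved): every message whose score is at or below the current time moves from the pending set to the ready list, mirroring the ZRangeByScore / LPush / ZRemRangeByScore sequence. The function name is illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// pending2Ready models what pending2ReadyScript does atomically in Redis:
// pending maps message ID -> delivery timestamp (the sorted-set score),
// ready is the FIFO list of IDs ready for consumption.
func pending2Ready(pending map[string]int64, ready []string, now int64) (map[string]int64, []string) {
	var due []string
	for id, deliverAt := range pending {
		if deliverAt <= now { // ZRangeByScore pendingKey 0 now
			due = append(due, id)
		}
	}
	sort.Strings(due) // deterministic order for the example; Redis orders by score
	for _, id := range due {
		delete(pending, id)      // ZRemRangeByScore pendingKey 0 now
		ready = append(ready, id) // LPush readyKey id
	}
	return pending, ready
}

func main() {
	pending := map[string]int64{"a": 100, "b": 200, "c": 150}
	pending, ready := pending2Ready(pending, nil, 150)
	fmt.Println(ready)        // [a c]
	fmt.Println(len(pending)) // 1 (only "b" remains)
}
```

The crucial property the Lua script adds over this model is atomicity: no other client can observe a message that is in both pendingKey and readyKey, or in neither.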

ready2UnackScript pops a message from readyKey (or retryKey) and adds it to unAckKey with a retry timestamp.

-- keys: readyKey/retryKey, unackKey
-- argv: retryTime
local msg = redis.call('RPop', KEYS[1])
if (not msg) then return end
redis.call('ZAdd', KEYS[2], ARGV[1], msg)
return msg

unack2RetryScript scans unackKey for messages whose retry time has arrived, decrements their retry count, and moves them either back to retryKey or to garbageKey when retries are exhausted.

-- keys: unackKey, retryCountKey, retryKey, garbageKey
-- argv: currentTime
local msgs = redis.call('ZRangeByScore', KEYS[1], '0', ARGV[1])
if (#msgs == 0) then return end
local retryCounts = redis.call('HMGet', KEYS[2], unpack(msgs))
for i,v in ipairs(retryCounts) do
    local k = msgs[i]
    if v and tonumber(v) > 0 then
        redis.call('HIncrBy', KEYS[2], k, -1)
        redis.call('LPush', KEYS[3], k)
    else
        redis.call('HDel', KEYS[2], k)
        redis.call('SAdd', KEYS[4], k)
    end
end
redis.call('ZRemRangeByScore', KEYS[1], '0', ARGV[1])
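The per-message branch of this script can also be modeled in plain Go. The sketch below mirrors the decrement-or-discard decision; the name retryOrGarbage is illustrative, not part of the library.

```go
package main

import "fmt"

// retryOrGarbage models the per-message branch in unack2RetryScript:
// a message with retries remaining is re-queued and its count decremented,
// otherwise its count entry is removed and it is marked as garbage.
func retryOrGarbage(retryCount map[string]int, retry, garbage []string, id string) ([]string, []string) {
	if retryCount[id] > 0 {
		retryCount[id]--              // HIncrBy retryCountKey id -1
		retry = append(retry, id)     // LPush retryKey id
	} else {
		delete(retryCount, id)        // HDel retryCountKey id
		garbage = append(garbage, id) // SAdd garbageKey id
	}
	return retry, garbage
}

func main() {
	counts := map[string]int{"m1": 2, "m2": 0}
	var retry, garbage []string
	retry, garbage = retryOrGarbage(counts, retry, garbage, "m1")
	retry, garbage = retryOrGarbage(counts, retry, garbage, "m2")
	fmt.Println(retry, garbage, counts["m1"]) // [m1] [m2] 1
}
```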

The ack operation simply removes the message from unAckKey, deletes the message's payload key (the one carrying the TTL), and clears its retry count.

func (q *DelayQueue) ack(idStr string) error {
    ctx := context.Background()
    if err := q.redisCli.ZRem(ctx, q.unAckKey, idStr).Err(); err != nil {
        return fmt.Errorf("remove from unack failed: %v", err)
    }
    // best-effort cleanup: the payload key would also expire via its TTL
    _ = q.redisCli.Del(ctx, q.genMsgKey(idStr)).Err()
    _ = q.redisCli.HDel(ctx, q.retryCountKey, idStr).Err()
    return nil
}

The nack operation updates the retry timestamp of the message in unAckKey , causing the subsequent unack2RetryScript to move it to the retry queue immediately.

func (q *DelayQueue) nack(idStr string) error {
    ctx := context.Background()
    err := q.redisCli.ZAdd(ctx, q.unAckKey, &redis.Z{
        Member: idStr,
        Score:  float64(time.Now().Unix()),
    }).Err()
    if err != nil {
        return fmt.Errorf("negative ack failed: %v", err)
    }
    return nil
}

The core consume function runs every second, invoking the Lua scripts to transition messages between states, delivering them to a user‑provided callback, and handling acknowledgments, retries, and garbage collection.

func (q *DelayQueue) consume() error {
    // pending -> ready
    if err := q.pending2Ready(); err != nil { return err }
    // fetch from ready, process, ack/nack
    var fetchCount uint
    for {
        idStr, err := q.ready2Unack()
        if err == redis.Nil { break }
        if err != nil { return err }
        fetchCount++
        ack, err := q.callback(idStr)
        if err != nil { return err }
        if ack { err = q.ack(idStr) } else { err = q.nack(idStr) }
        if err != nil { return err }
        if fetchCount >= q.fetchLimit { break }
    }
    // move nack/timeout to retry
    if err := q.unack2Retry(); err != nil { return err }
    // delete exhausted retries
    if err := q.garbageCollect(); err != nil { return err }
    // consume from retry queue (similar loop)
    // ...
    return nil
}

A Go example shows how to install the library with go get github.com/hdt3213/delayqueue , create a Redis client, register a message‑handling callback, send delayed messages, and start the consumer.

package main

import (
    "strconv"
    "time"

    "github.com/go-redis/redis/v8"
    "github.com/hdt3213/delayqueue"
)

func main() {
    redisCli := redis.NewClient(&redis.Options{Addr: "127.0.0.1:6379"})
    queue := delayqueue.NewQueue("example-queue", redisCli, func(payload string) bool {
        // process message
        return true
    })
    for i := 0; i < 10; i++ {
        err := queue.SendDelayMsg(strconv.Itoa(i), time.Hour, delayqueue.WithRetryCount(3))
        if err != nil { panic(err) }
    }
    done := queue.StartConsume()
    <-done
}

Because all message state resides in Redis, no message is lost as long as Redis itself remains operational and the queue's keys are not modified by other clients.
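That guarantee ultimately rests on Redis's own durability settings. A minimal redis.conf fragment enabling append-only-file persistence (standard Redis directives, shown as a reminder rather than a tuned configuration) would be:

```
# Persist every write to the AOF so queued messages survive a restart
appendonly yes
# fsync at most once per second: a bounded loss window with modest overhead
appendfsync everysec
```

With RDB snapshots alone, messages produced since the last snapshot could be lost on a crash, so AOF is the safer default for queue workloads.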

In summary, the article provides a complete, reliable delay-queue solution built on Redis, combining Lua scripts for atomic state transitions with a Go client for easy integration.

Tags: distributed systems, Redis, Go, message queue, delay queue, Lua, retry mechanism
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
