How Redis Stream Replaced MQ in High‑Throughput Traffic Processing

This article explains why the traffic team switched from traditional MQ to Redis Stream, covering the underlying concepts, design choices, implementation details, load‑balancing strategies, cross‑datacenter handling, monitoring metrics, performance benchmarks, and practical lessons learned.

dbaplus Community
dbaplus Community
dbaplus Community
How Redis Stream Replaced MQ in High‑Throughput Traffic Processing

Background

The traffic‑condition team processes massive AMAP trajectory data in real time. Rapid growth in data volume and processing frequency caused the existing MQ middleware to become cost‑prohibitive, prompting a migration to Redis Stream as a lower‑cost, high‑throughput alternative.

Redis Stream Concept

Redis Stream, introduced in Redis 5.0, implements a FIFO log where each entry consists of an id (default UNIX‑timestamp_sequence) and a content payload. It supports multiple consumer groups, each tracking a last_delivered_id, and provides an ACK mechanism for confirming consumption. Streams can be trimmed lazily to avoid immediate deletions.

Redis Stream structure
Redis Stream structure

Design and Implementation

Load Balancing

Redis Stream does not have a native tag concept and may be deployed across sharded instances. To retain the original MQ semantics, the logical topic used by producers/consumers is treated as a prefix, while the actual Redis key is formed as topic_tag.

Topic Splitting

Producers send messages with a topic and a tag. The SDK concatenates them into a Redis key topic_tag. Consumers specify the same topic and tag; the SDK resolves the full key and reads from the corresponding stream, thereby emulating tag‑based routing.

Topic splitting example
Topic splitting example

Sharding Hash

When N total shards are available, the hash is calculated as:

globalIdx = tag % N

instanceIdx = globalIdx / shardsPerInstance

localIdx = globalIdx % shardsPerInstance

This mapping aligns with Redis Cluster’s 16384‑slot mechanism and guarantees an even distribution of keys across instances.

Hash calculation diagram
Hash calculation diagram

Cross‑Datacenter Read/Write

Two deployment models were compared:

Asynchronous hiredis client – producer operates asynchronously, consumer synchronously. Average end‑to‑end latency: 22‑23 ms.

Global‑active‑active Redis – both producer and consumer use synchronous connections across data centers. Average latency: 51‑57 ms and requires additional Redis instances.

Cross‑datacenter deployment
Cross‑datacenter deployment

Engineering Implementation (C++ SDK)

The SDK supports multiple Redis instances, configurable producer/consumer thread pools, synchronous or asynchronous modes, consumer‑group offset reset, load‑balancing, real‑time monitoring, and automatic reconnection.

Producer configuration

Instance list – one or more Redis endpoints.

Thread count – multiple threads round‑robin messages to balance load.

Consumer configuration

Instance list – same as producer.

Tag list – tags to subscribe.

Max tags per thread – limits the number of tags fetched in a single request to avoid backlog.

Processing thread count – number of threads that invoke the user‑defined callback.

Redis Stream flow diagram
Redis Stream flow diagram

Real‑time Monitoring

Redis itself does not expose queue lag metrics, so the following indicators are collected:

Production‑consumption volume – should match when there is no backlog.

Batch fetch size – maximum messages per pull; approaches the limit under backlog.

Write/read latency – spikes indicate possible network‑induced backlog.

Performance Test

Single‑threaded producer/consumer throughput drops as message size increases:

10 KB messages → >3000 TPS.

100 KB messages → ~1500 TPS.

Single‑thread producer TPS
Single‑thread producer TPS
Single‑thread consumer TPS
Single‑thread consumer TPS

Practical Experience

Online Performance

After full migration, cost and latency decreased by >90 %. A peak link processes ~20 million messages per minute (≈1 KB each) using four 64 GB, 64‑shard Redis instances.

Redis cluster water level
Redis cluster water level

Applicable Scenarios

Redis Stream is well‑suited for workloads with high message volume and strong cost constraints. It provides basic MQ features (consumer groups, offset reset, ACK) with high availability and throughput. Limitations include potential data loss on server failure, lack of a dedicated operations platform, and limited C++ client ecosystem.

Pitfalls & Tips

Use hiredis ≥ 1.2.0 for async connections; it supports connection‑timeout configuration.

Keep individual message size below 100 KB; larger payloads degrade single‑thread TPS.

Ensure a sufficient number of distinct tags to avoid shard skew.

CPU capacity, not memory or bandwidth, is the primary scaling factor for Redis instances.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Backend Developmentload balancingCRedis Stream
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.