How RocketMQ Achieves High Performance and Scalability with Queues, Brokers, and mmap
This article explains how RocketMQ tackles synchronous registration bottlenecks, tight coupling, and traffic‑burst risks by introducing an intermediate queue layer, designing a durable high‑availability broker, leveraging page cache and mmap for zero‑copy I/O, and using a nameserver for automatic routing, ultimately delivering a high‑throughput, low‑latency messaging system.
Introduction
Rapid business growth caused the user‑registration flow to become a performance bottleneck: each registration now requires multiple service calls (SMS, push, coupons, etc.), turning a simple synchronous operation into a 200 ms latency problem and creating tight coupling and traffic‑burst risks.
Problems Identified by the CTO
Synchronous execution: The registration process waits for several downstream services, which is the root cause of high latency.
Tight coupling: Registration code is tightly coupled with other modules; any failure in a secondary service causes the whole transaction to fail.
Traffic-spike risk: Sudden traffic spikes (e.g., promotional red-packet events) can overwhelm the long registration pipeline, leading to system collapse.
Solution: Adding an Intermediate Queue Layer
The CTO proposes inserting a producer‑consumer queue between registration and downstream services. The registration request is placed into the queue, allowing the API to return immediately while consumers process the event asynchronously, achieving decoupling and peak‑shaving.
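The pattern can be sketched in a few lines of Java. This is a minimal illustration of the producer-consumer idea, not RocketMQ's API; the class and method names (`AsyncRegistration`, `notifyDownstream`) are hypothetical:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of queue-based decoupling: the registration handler enqueues an event
// and returns at once, while a background consumer does the slow downstream work.
public class AsyncRegistration {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger processed = new AtomicInteger();

    public AsyncRegistration() {
        Thread consumer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String userId = queue.take();
                    notifyDownstream(userId); // SMS, push, coupons, etc.
                    processed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    // API path: enqueue and return immediately (~queue-insert cost only).
    public void register(String userId) {
        queue.offer(userId);
    }

    private void notifyDownstream(String userId) {
        // In the real system these would be calls to SMS/push/coupon services.
    }

    public int processedCount() {
        return processed.get();
    }
}
```

The caller's latency is now just the enqueue cost; the slow service calls happen off the request path.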
This design reduces total latency from ~200 ms to about 55 ms (50 ms service time + 5 ms queue enqueue), improving throughput by nearly four times.
Choosing the Right Queue Implementation
Using a simple in‑memory JDK Queue leads to tight producer‑consumer coupling, message loss on crash, and scalability issues. Therefore a dedicated broker component is introduced.
Broker Design
The broker must satisfy:
Message persistence : Store messages on disk (commitlog) to survive crashes.
High availability : Ensure the broker remains reachable even if a node fails.
High performance : Achieve 100 k TPS by fast producer writes, fast disk persistence, and fast consumer reads.
Messages are appended sequentially to a commitlog file; by always writing at the tail, the broker avoids the seek overhead of random disk writes.
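A sequential append can be sketched with a `FileChannel` opened in append mode. The file name and entry layout (a 4-byte length prefix) are illustrative, not RocketMQ's actual on-disk format:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of sequential commitlog appends: every message goes at the end of the
// file, so writes never seek. Each entry is a 4-byte length prefix plus the body.
public class CommitLog {
    private final FileChannel channel;

    public CommitLog(Path path) throws IOException {
        channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Appends one message and returns the physical offset it was written at.
    public long append(byte[] message) throws IOException {
        long offset = channel.position();
        ByteBuffer buf = ByteBuffer.allocate(4 + message.length);
        buf.putInt(message.length).put(message).flip();
        channel.write(buf);   // sequential write lands in the OS page cache
        return offset;
    }

    public void flush() throws IOException {
        channel.force(false); // explicit flush of cached pages to disk
    }
}
```

The returned offset is exactly what an index file can later store to locate the message.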
Page Cache and mmap
To eliminate costly kernel‑to‑user copies, the broker relies on the OS page cache and memory‑mapped files (mmap). Reads first check the page cache; if missing, a page fault loads the block. Writes go to the page cache and are later flushed to disk.
Using mmap removes the extra copy step and halves memory usage because user and kernel share the same page cache.
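In Java, mmap is exposed through `FileChannel.map`, which returns a `MappedByteBuffer` backed directly by the page cache. A minimal sketch (the file size here is arbitrary):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of memory-mapped file I/O: puts on the buffer write straight into the
// OS page cache with no intermediate user-space copy.
public class MmapDemo {
    public static MappedByteBuffer map(Path path, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the first `size` bytes; the mapping remains valid after close.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("mmap", ".bin");
        MappedByteBuffer buf = map(p, 4096);
        buf.put("msg".getBytes()); // write goes directly to the mapped pages
        buf.force();               // flush dirty pages to disk (like msync)
    }
}
```

Because the buffer is the page cache, there is no separate user-space copy of the data, which is where the halved memory usage comes from.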
ConsumeQueue Index File
Since message sizes vary, a separate ConsumeQueue index file stores fixed‑size entries (commitlog offset, size, tag hashcode). This enables O(1) location of a message given a consumer offset.
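With a fixed 20-byte entry (8-byte commitlog offset + 4-byte size + 8-byte tag hashcode, which is RocketMQ's actual ConsumeQueue layout), entry N sits at byte position N × 20, so lookup is a direct computation rather than a scan. A sketch backed by an in-memory buffer standing in for the mmap'ed index file:

```java
import java.nio.ByteBuffer;

// Sketch of a fixed-width ConsumeQueue: 20 bytes per entry means the byte
// position of any logical index is index * 20 — an O(1) computation.
public class ConsumeQueue {
    static final int ENTRY_SIZE = 8 + 4 + 8; // offset + size + tag hashcode = 20

    private final ByteBuffer store; // stands in for the memory-mapped index file

    public ConsumeQueue(int capacity) {
        store = ByteBuffer.allocate(capacity * ENTRY_SIZE);
    }

    public void putEntry(long index, long commitLogOffset, int size, long tagHash) {
        int pos = (int) (index * ENTRY_SIZE); // direct position from logical index
        store.putLong(pos, commitLogOffset);
        store.putInt(pos + 8, size);
        store.putLong(pos + 12, tagHash);
    }

    public long commitLogOffsetAt(long index) {
        return store.getLong((int) (index * ENTRY_SIZE));
    }
}
```

A consumer holding logical offset N reads entry N here, then jumps to the returned commitlog offset to fetch the variable-length message body.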
Topic and Tag Filtering
Messages are grouped into topics , and further classified by tags . The broker writes the tag hashcode into the ConsumeQueue, allowing fast integer comparison during consumption.
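The two-stage filter can be sketched as follows: the broker compares only the stored hashcode (a cheap integer comparison), and the consumer re-checks the full tag string to rule out hash collisions. Class and method names are illustrative:

```java
// Sketch of RocketMQ-style tag filtering: the ConsumeQueue stores only the
// tag's hashcode, so the broker filters with an integer compare; the consumer
// does the exact string check, since two tags can share a hashcode.
public class TagFilter {
    // Broker side: compare the hashcode stored in the ConsumeQueue entry.
    public static boolean brokerMatch(long storedTagHash, String subscribedTag) {
        return storedTagHash == subscribedTag.hashCode();
    }

    // Consumer side: exact re-check against the tag carried in the message.
    public static boolean consumerMatch(String messageTag, String subscribedTag) {
        return messageTag.equals(subscribedTag);
    }
}
```

Storing the hashcode instead of the string also keeps ConsumeQueue entries fixed-size, which the index layout depends on.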
High Availability of Brokers
Single‑broker setups are vulnerable, so a master‑slave (or multi‑master) architecture is used. Since RocketMQ 4.5, DLedger mode, which requires at least three nodes, provides Raft‑based leader election.
Nameserver for Service Discovery
To avoid hard‑coding broker addresses, a nameserver cluster stores routing information (topic‑to‑broker mapping, queue counts). Brokers register themselves; producers and consumers periodically pull the latest routes, achieving automatic configuration and failover.
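A route table of this kind reduces to a map from topic to broker addresses. The sketch below is a toy model of the idea, not the nameserver's real protocol; all names are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a nameserver route table: brokers register the topics they serve,
// and clients pull the current broker list for a topic instead of hard-coding
// addresses.
public class NameServer {
    private final Map<String, Set<String>> topicToBrokers = new ConcurrentHashMap<>();

    // Called by a broker on startup and on its periodic heartbeat.
    public void registerBroker(String brokerAddr, List<String> topics) {
        for (String t : topics) {
            topicToBrokers.computeIfAbsent(t, k -> ConcurrentHashMap.newKeySet())
                          .add(brokerAddr);
        }
    }

    // Called periodically by producers and consumers to refresh their routes.
    public Set<String> fetchRoute(String topic) {
        return topicToBrokers.getOrDefault(topic, Set.of());
    }
}
```

Because clients re-pull routes on a timer, a newly registered broker (or a removed dead one) propagates without any client redeployment, which is the failover property the article describes.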
Summary
RocketMQ’s design meets three core goals—message persistence, high performance, and high availability—by using a sequential commitlog, page‑cache‑backed mmap I/O, a broker that decouples producers and consumers, sharded ConsumeQueues for parallel consumption, topic/tag filtering, and a nameserver for dynamic routing.
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!