
RocketMQ Cluster Migration: Issues, Preparation Steps, and Recommended Migration Plans

This article analyzes the problems caused by multiple independent RocketMQ clusters, outlines the current cluster architecture, details pre‑migration preparations, compares two migration schemes (one not recommended and one recommended), and summarizes the benefits of consolidating clusters into a single, well‑managed deployment.

YunZhu Net Technology Team

1. Background

As the business grew rapidly, each department built its own RocketMQ cluster (mro, mdm, rot‑sc) without unified planning, so there are now several clusters, each deployed in multi‑master, multi‑slave mode. This isolates traffic between departments, but it makes cross‑department consumption difficult, leaves resources underutilized, and drives up maintenance cost.

2. Current Cluster Status

Each cluster uses a 3‑master‑3‑slave architecture and runs RocketMQ 4.4.0. The Namesrv service is accessed via a domain name that maps to a single Namesrv node IP.

3. Pre‑migration Preparation

Prepare six new machines (3 masters, 3 slaves) and ensure network connectivity between business machines and the new cluster.

Verify that topic names and consumer group names do not collide across the three existing clusters (mro, mdm, rot‑sc); resolve any collisions with the business owners before migrating.
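The conflict check can be automated once the topic and group names of each cluster have been exported. The following is a minimal sketch, assuming the names are already available as plain sets per cluster; the function name `find_name_conflicts` and the sample topic names are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def find_name_conflicts(names_by_cluster):
    """Return {name: [clusters]} for every name used in more than one cluster."""
    owners = defaultdict(set)
    for cluster, names in names_by_cluster.items():
        for name in names:
            owners[name].add(cluster)
    # Keep only names that appear in at least two clusters.
    return {
        name: sorted(clusters)
        for name, clusters in owners.items()
        if len(clusters) > 1
    }


# Hypothetical topic inventory for the three clusters (names are made up).
topics = {
    "mro": {"order_created", "stock_changed"},
    "mdm": {"material_synced", "stock_changed"},
    "rot-sc": {"contract_signed"},
}

for name, clusters in find_name_conflicts(topics).items():
    print(f"conflict: {name} exists in {', '.join(clusters)}")
```

The same function can be run a second time on the consumer group names. Any name it reports needs a decision from the owning teams (rename, merge, or retire) before topics are created in the new cluster.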

Prepare a Docker image of RocketMQ 4.4.0 for deployment.

4. Migration Plan 1 (Not Recommended)

Deployment:

1. Deploy a new cluster with its own Namesrv address.

2. Migrate topics and groups from the old clusters to the new cluster.

3. New producers and consumers use the new Namesrv address; old services keep the old Namesrv address unchanged.

4. Switch DNS for the old Namesrv domain to the new IP (only half of the domain is switched at this point). While the switch is partial, a producer can connect to either the new or the old cluster. A consumer can also connect to either cluster, but this may cause consumption problems.

5. Switch the old cluster broker to read‑only mode so that no new messages are written.

6. Change the old cluster Namesrv domain IP to the new cluster Namesrv IP.

7. Decommission old cluster nodes after ensuring all messages are consumed.
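Step 7 depends on knowing that nothing is left to consume on the old cluster. A minimal sketch of that check follows, assuming the per‑queue broker offset and consumer offset of every consumer group have already been collected (for example from the output of `mqadmin consumerProgress`); the dictionary layout, the function names, and the group names are hypothetical.

```python
def pending_messages(queues):
    """Total backlog: broker offset minus consumer offset, summed over queues."""
    return sum(max(q["broker_offset"] - q["consumer_offset"], 0) for q in queues)


def drain_report(progress_by_group):
    """Return (is_drained, backlog_per_group) for the old cluster."""
    backlog = {
        group: pending_messages(queues)
        for group, queues in progress_by_group.items()
    }
    return all(count == 0 for count in backlog.values()), backlog


# Hypothetical snapshot of consumer progress on the old cluster.
snapshot = {
    "order_consumer": [
        {"queue_id": 0, "broker_offset": 1200, "consumer_offset": 1200},
        {"queue_id": 1, "broker_offset": 980, "consumer_offset": 975},
    ],
    "stock_consumer": [
        {"queue_id": 0, "broker_offset": 40, "consumer_offset": 40},
    ],
}

drained, backlog = drain_report(snapshot)
print("safe to decommission:", drained)
print("backlog per group:", backlog)
```

In this snapshot `order_consumer` still has 5 messages pending, so the old nodes should stay online. The check is only meaningful after the old brokers are read‑only; otherwise new writes can arrive between the snapshot and the shutdown.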

Advantages:

Minimal business changes: new business uses the new Namesrv address, old services keep their address.

Problems:

After the DNS switch, producers and consumers may connect only to the new cluster, leaving some messages in the old cluster unconsumed.

A running client obtains the new IP address only when it restarts or when the old Namesrv node goes offline.

5. Migration Plan 2 (Recommended)

Deployment:

1. Deploy a new cluster whose brokers are configured with both the old Namesrv IP and the new Namesrv IP as their Namesrv address, and start the service in read‑only mode.

2. Migrate topics and groups from the old clusters to the new cluster.

3. Business units continue to use the existing Namesrv address (no change).

4. Enable read‑write permission on the new cluster; part of the traffic then flows to the new cluster automatically for both production and consumption.

5. Switch the old cluster brokers to read‑only and wait for all messages to be consumed.

6. Update DNS so that the old Namesrv domain resolves to the new Namesrv IP (an operations step).

7. Decommission the old cluster nodes after confirming that all messages are consumed and no new writes occur.
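The deployment step and the later permission switches come down to two broker settings: `namesrvAddr`, which lists both Namesrv addresses separated by semicolons, and `brokerPermission`, where 4 means read‑only and 6 means read‑write in RocketMQ's permission bits. The sketch below renders such a broker configuration. The cluster name, broker name, IP addresses, and the choice of `ASYNC_MASTER` as the master role are assumptions for illustration and are not taken from the original deployment.

```python
# Permission bits as defined by RocketMQ (write = 2, read = 4).
PERM_WRITE = 2
PERM_READ = 4


def render_broker_conf(cluster, broker_name, broker_id, namesrv_addrs, permission):
    """Render broker properties for one node of the new cluster."""
    # brokerId 0 is a master; any other id is a slave.
    # ASYNC_MASTER is an assumption; the source does not state the role.
    role = "ASYNC_MASTER" if broker_id == 0 else "SLAVE"
    lines = [
        f"brokerClusterName={cluster}",
        f"brokerName={broker_name}",
        f"brokerId={broker_id}",
        f"brokerRole={role}",
        # Register with the old AND the new Namesrv so existing clients see us.
        f"namesrvAddr={';'.join(namesrv_addrs)}",
        f"brokerPermission={permission}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical addresses: old Namesrv first, new Namesrv second.
namesrv = ["10.0.0.10:9876", "10.0.1.10:9876"]

# Initial deployment: read-only, so no producer writes to the new cluster yet.
print(render_broker_conf("unified", "broker-a", 0, namesrv, PERM_READ))

# After topics and groups are migrated: open the new cluster for read and write.
print(render_broker_conf("unified", "broker-a", 0, namesrv, PERM_READ | PERM_WRITE))
```

Switching the old brokers to read‑only later is the same change in the other direction: their `brokerPermission` goes from 6 to 4 while the new cluster stays at 6.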

Advantages:

No code changes required; business continues to use the same Namesrv address.

The new cluster brokers register with the old Namesrv, allowing clients to discover the new cluster seamlessly.

Problems:

The DNS switch takes effect only when a client reconnects to the Namesrv address; clients holding long‑lived connections pick up the new IP only after the old Namesrv is restarted or taken offline.

During migration, any newly requested topics and groups must be created in both the old and the new clusters.
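The double creation is easy to forget, so it helps to generate the commands for both clusters from one place. The sketch below only builds `mqadmin updateTopic` and `mqadmin updateSubGroup` argument lists and does not run them. The cluster names, Namesrv addresses, queue count, and the helper name `create_everywhere_commands` are hypothetical, and the flags should be checked against the `mqadmin` help output of the installed version.

```python
def create_everywhere_commands(topic, group, targets, queues=8):
    """Build mqadmin commands that create a topic and a group on every target.

    targets is a list of (namesrv_address, cluster_name) pairs.
    """
    commands = []
    for namesrv, cluster in targets:
        commands.append([
            "mqadmin", "updateTopic",
            "-n", namesrv, "-c", cluster, "-t", topic,
            "-r", str(queues), "-w", str(queues),
        ])
        commands.append([
            "mqadmin", "updateSubGroup",
            "-n", namesrv, "-c", cluster, "-g", group,
        ])
    return commands


# Hypothetical targets: one old cluster and the new consolidated cluster.
targets = [
    ("10.0.0.10:9876", "mro"),
    ("10.0.1.10:9876", "unified"),
]

for command in create_everywhere_commands("invoice_issued", "invoice_consumer", targets):
    print(" ".join(command))
```

Printing the commands first gives operators a chance to review them; they can then be executed by hand or passed to `subprocess.run` once the output looks right.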

6. Summary

Plan 2 avoids the unconsumed‑message problem of Plan 1: by controlling broker permissions, traffic shifts smoothly to the new cluster without business impact.

The final DNS switch of the old Namesrv domain IP forces clients to obtain the new IP, ensuring normal production and consumption.
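Before the old Namesrv is taken offline, it is worth confirming from a business machine that a fresh lookup of the Namesrv domain already returns only the new IP. A minimal sketch using the standard library resolver follows; the domain name and IP are hypothetical, and the example injects a fake resolver so that it runs without network access.

```python
import socket


def resolve_ipv4(hostname, port=9876, resolver=socket.getaddrinfo):
    """Return the sorted IPv4 addresses that a fresh lookup yields."""
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def dns_points_only_to(hostname, expected_ips, resolver=socket.getaddrinfo):
    """True when the domain resolves to exactly the expected addresses."""
    return set(resolve_ipv4(hostname, resolver=resolver)) == set(expected_ips)


def fake_resolver(host, port, family, socktype):
    """Stand-in for socket.getaddrinfo; pretends DNS was already switched."""
    return [(family, socktype, 6, "", ("10.0.1.10", port))]


# Hypothetical Namesrv domain and new Namesrv IP.
domain = "namesrv.example.internal"
print(resolve_ipv4(domain, resolver=fake_resolver))
print(dns_points_only_to(domain, ["10.0.1.10"], resolver=fake_resolver))
```

In real use the `resolver` argument is simply left out. The check only shows what a new lookup returns on that machine; a client that still holds a connection to the old Namesrv is not affected until it reconnects.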

After consolidation, all departments share a single cluster, facilitating inter‑department collaboration, releasing redundant machines, improving resource utilization, reducing operation costs, and simplifying future upgrades.

PS: Other solutions are welcome for discussion.
