Refactoring the Game Business Product Update MQ Consumer: Architecture, Design, and Implementation
This article details the background, problem analysis, and comprehensive refactoring plan for a game business's product update MQ consumer, outlining architectural redesign using Flyweight and Strategy patterns, phased implementation, testing strategies, idempotent handling, monitoring, and the resulting 50‑80% reduction in downstream interface calls.
1 Background
The game business launched in 2017 and has accumulated a heavy historical burden over seven years, akin to a large tree with both thriving branches and many dead ones. This article focuses on the product update MQ consumer module, describing how the existing code was refactored to revitalize the system.
1.1 Origin of the Issue
An alert fired when downstream RPC calls hit the rate limit of 600K/min. External update operations were generating only about 3K/min, which implies that calls were being amplified on the order of 200 times inside the system. That gap prompted a deeper investigation.
1.2 Pre‑Refactor Situation
We discovered 19 consumers across different clusters handling product update MQ messages, each containing internal query and update logic. Some consumers generated new messages, causing call volume to explode. Additionally, abandoned consumers remained online and duplicate consumption logic existed.
Key problems identified:
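a. Logic is scattered and hard to maintain.
b. Service call volume is amplified many times over.
c. Concurrent updates can overwrite one another.
d. Abandoned or duplicate consumers are still running.
1.3 Problem Analysis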
Rapid early‑stage requirement iteration led to many quickly added consumers, making the system increasingly fragmented and hard to maintain. To reduce MQ‑related interface calls, two core points are needed: reduce queries (data reuse) and reduce update calls (prevent new messages). A new, consolidated architecture is required.
2 Refactoring
2.1 Goals
Before refactoring, clear goals are essential to guide the solution.
a. A sound structure
b. Removal of duplicate and ineffective consumption logic
c. Higher consumption capacity
d. Optimized logic
e. A new, unified system
The aim is highly cohesive, loosely coupled MQ consumption logic with clear responsibilities, while also eliminating obsolete code and preventing interface call explosion.
2.2 Solution Design
2.2.1 Architecture Design
The overall architecture employs the Flyweight and Strategy design patterns and consists of three layers:
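a. Data preprocessing
b. Category-based dispatch to Handlers for consumption
c. Consolidated calls to the update interface
Data preprocessing consumes MQ messages in batches, filters out non‑game messages, and calls batch query interfaces up front, so that later steps reuse the pre‑queried data instead of querying it again.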
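The preprocessing step could look roughly like the sketch below. The message and product shapes, the "GAME" business type value, and the batchQuery function (standing in for the real batch query RPC) are all assumptions for illustration; the article does not show the actual types.

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical message and product shapes; the real ones are not shown in the article.
record ProductUpdateMsg(String msgId, long productId, String bizType) {}
record ProductInfo(long productId, String category) {}

class Preprocessor {
    // Stands in for the downstream batch query interface (assumption).
    private final Function<Set<Long>, Map<Long, ProductInfo>> batchQuery;

    Preprocessor(Function<Set<Long>, Map<Long, ProductInfo>> batchQuery) {
        this.batchQuery = batchQuery;
    }

    /** Drops non-game messages, de-duplicates product ids, and loads them in one call. */
    Map<Long, ProductInfo> preprocess(List<ProductUpdateMsg> batch) {
        Set<Long> ids = batch.stream()
                .filter(m -> "GAME".equals(m.bizType()))   // assumed marker for game messages
                .map(ProductUpdateMsg::productId)
                .collect(Collectors.toCollection(LinkedHashSet::new));
        if (ids.isEmpty()) {
            return Map.of();                               // nothing to query
        }
        return batchQuery.apply(ids);                      // one query for the whole batch
    }
}
```

Several messages for the same product in one batch then cost a single query, which is where the reduction in query calls would come from.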
Handler layer extracts category‑specific and common handlers, separating responsibilities. A Manage layer beneath handlers implements concrete consumption logic and reusable components.
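As a minimal sketch of the Handler layer, assuming a Strategy-style interface: each handler declares which products it applies to, and a dispatcher runs every matching handler against the same pre-queried snapshot. All class names, the "account" category, and the string-based pending updates are hypothetical placeholders, not the team's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical read-only snapshot, loaded once in preprocessing and shared by all handlers.
record ProductSnapshot(long productId, String category, long price) {}

// Strategy interface: one implementation per category, plus common handlers.
interface ProductUpdateHandler {
    boolean supports(ProductSnapshot product);
    void handle(ProductSnapshot product, List<String> pendingUpdates);
}

class AccountHandler implements ProductUpdateHandler {
    public boolean supports(ProductSnapshot product) {
        return "account".equals(product.category());      // category-specific handler
    }
    public void handle(ProductSnapshot product, List<String> pendingUpdates) {
        pendingUpdates.add("account-check:" + product.productId());
    }
}

class CommonHandler implements ProductUpdateHandler {
    public boolean supports(ProductSnapshot product) {
        return true;                                      // applies to every category
    }
    public void handle(ProductSnapshot product, List<String> pendingUpdates) {
        pendingUpdates.add("index-refresh:" + product.productId());
    }
}

class HandlerDispatcher {
    private final List<ProductUpdateHandler> handlers;

    HandlerDispatcher(List<ProductUpdateHandler> handlers) {
        this.handlers = List.copyOf(handlers);
    }

    /** Runs every matching handler; handlers record intended updates instead of calling out. */
    List<String> dispatch(ProductSnapshot product) {
        List<String> pending = new ArrayList<>();
        for (ProductUpdateHandler handler : handlers) {
            if (handler.supports(product)) {
                handler.handle(product, pending);
            }
        }
        return pending;
    }
}
```

Having handlers append to a pending list rather than call the update interface directly keeps the handlers free of RPC concerns and leaves the actual update call to a later, consolidated step.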
Update aggregation consolidates middle‑platform product update calls to reduce the number of update interfaces, thereby limiting the generation of new messages.
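One way to picture update aggregation is a collector that merges field changes per product and sends a single update call for each product at the end. The UpdateAggregator class and the updateApi callback below are illustrative assumptions standing in for the middle-platform update interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

class UpdateAggregator {
    // productId -> merged field changes collected from all handlers
    private final Map<Long, Map<String, Object>> changes = new LinkedHashMap<>();

    /** Records a field change; a later value for the same field replaces the earlier one. */
    void put(long productId, String field, Object value) {
        changes.computeIfAbsent(productId, id -> new LinkedHashMap<>()).put(field, value);
    }

    /** Sends one update per product and returns how many calls were made. */
    int flush(BiConsumer<Long, Map<String, Object>> updateApi) {
        int calls = 0;
        for (Map.Entry<Long, Map<String, Object>> entry : changes.entrySet()) {
            updateApi.accept(entry.getKey(), Map.copyOf(entry.getValue()));
            calls++;
        }
        changes.clear();
        return calls;
    }
}
```

With this shape, three handlers touching the same product produce one update call, and therefore at most one follow-up product update message, rather than three.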
2.2.2 Implementation Plan
The refactor is divided into three phases.
Phase 1: Migrate and refactor non‑core business MQ logic, gray‑release to limit impact and validate stability.
Phase 2: Migrate and refactor core business MQ logic, gray‑release while monitoring impact. Completion of this phase essentially finishes the migration.
Phase 3: Fine‑tune the structure, further decompose and refactor functions to improve cohesion and reduce coupling, achieving the original design goals.
Multi‑step refactoring controls impact scope, enables quick results, and simplifies issue localization.
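The article does not say how traffic is split during gray release. As one possible sketch, a router could send a fixed percentage of products to the new flow based on the product id, so that a given product always takes the same path. GrayRouter and its percentage parameter are assumptions, not the team's actual mechanism.

```java
class GrayRouter {
    private final int percent; // share of products (0..100) routed to the new flow

    GrayRouter(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.percent = percent;
    }

    /** Stable per product: the same id always gets the same answer at a given percentage. */
    boolean useNewFlow(long productId) {
        // floorMod keeps the bucket in 0..99 even for negative ids
        return Math.floorMod(productId, 100L) < percent;
    }
}
```

Raising the percentage step by step widens the rollout, and setting it to 0 acts as a quick rollback switch.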
2.2.3 Test Plan
Before each release, three test types are performed: black‑box testing, white‑box testing, and log comparison.
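a. Black‑box testing: verify that the data produced by the old and new flows is identical.
b. White‑box testing: check line‑level code coverage and observe whether the old and new flows produce consistent data.
c. Pre‑call data comparison: log the parameters at each point where the update interface is called, and compare the parameters passed by the old and new flows.
Post‑deployment, system health is monitored, alerts are set up, and frontline support collects user feedback for gray‑release adjustments.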
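The pre-call comparison in item c could be automated along these lines, assuming the logged update parameters of each flow have already been parsed into a map keyed by message id. The class name and the map-based representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

class UpdateParamDiff {
    /**
     * Returns the message ids whose update parameters differ between the two flows,
     * including ids that appear in only one of them.
     */
    static List<String> diff(Map<String, Map<String, Object>> oldFlow,
                             Map<String, Map<String, Object>> newFlow) {
        Set<String> ids = new TreeSet<>(oldFlow.keySet());
        ids.addAll(newFlow.keySet());
        List<String> mismatches = new ArrayList<>();
        for (String id : ids) {
            if (!Objects.equals(oldFlow.get(id), newFlow.get(id))) {
                mismatches.add(id);
            }
        }
        return mismatches;
    }
}
```

An empty result for a sample of traffic gives some confidence that the new flow sends the same updates as the old one before the gray percentage is increased.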
2.3 Detailed Design Highlights
Unified Idempotent Gray‑Scale Aspect
Using Spring AOP, an idempotent annotation and handler are defined. The handler checks for annotation presence, stores successful results in a cache keyed by msgId, and skips re‑processing on repeat consumption.
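The idea can be sketched without Spring by swapping in a JDK dynamic proxy for the Spring AOP aspect. The @Idempotent annotation name, the ProductConsumer interface, the convention that msgId is the first argument, and the in-memory set standing in for the cache are all assumptions made for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Idempotent {}

interface ProductConsumer {
    @Idempotent
    void consume(String msgId, String body);
}

class IdempotentHandler implements InvocationHandler {
    private final Object target;
    // In-memory stand-in for the shared cache keyed by msgId (assumption).
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    IdempotentHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (!method.isAnnotationPresent(Idempotent.class)) {
            return invokeTarget(method, args);             // not annotated: pass through
        }
        String msgId = (String) args[0];                   // assumed: msgId is the first argument
        if (processed.contains(msgId)) {
            return null;                                   // repeat delivery: skip
        }
        Object result = invokeTarget(method, args);
        processed.add(msgId);                              // recorded only after success
        return result;
    }

    private Object invokeTarget(Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();                            // surface the consumer's own exception
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, new IdempotentHandler(target));
    }
}
```

Because the id is stored only after the call succeeds, a failed message can still be reprocessed on redelivery. The check and the store are not atomic here, so two concurrent deliveries of the same message could both run; a real cache would need an atomic set-if-absent to close that gap.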
Exception Failure Handling
Failed downstream updates are retried using a RocketMQ‑based retry component, converting synchronous failures into asynchronous retry messages.
Update failure alarms are sent via enterprise WeChat with product data for manual intervention.
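The failure path described above might be sketched as follows. An in-memory queue stands in for the RocketMQ retry topic, a list stands in for the Enterprise WeChat alert channel, and the RetryMessage shape and maxAttempts limit are assumptions; the article does not describe the retry component's actual API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical retry message carrying the product data needed to repeat the update.
record RetryMessage(long productId, String payload, int attempt) {}

class RetryingUpdater {
    private final Predicate<String> updateApi;                        // true on success; stands in for the update RPC
    private final Queue<RetryMessage> retryTopic = new ArrayDeque<>(); // stands in for the MQ retry topic
    private final List<RetryMessage> alerts = new ArrayList<>();      // stands in for the alert channel
    private final int maxAttempts;

    RetryingUpdater(Predicate<String> updateApi, int maxAttempts) {
        this.updateApi = updateApi;
        this.maxAttempts = maxAttempts;
    }

    /** Synchronous entry point: a failure becomes a retry message instead of an exception. */
    void update(long productId, String payload) {
        attempt(new RetryMessage(productId, payload, 1));
    }

    /** Processes the retry messages that are pending right now, once each. */
    void drainRetries() {
        int pending = retryTopic.size();
        for (int i = 0; i < pending; i++) {
            attempt(retryTopic.poll());
        }
    }

    List<RetryMessage> alerts() {
        return alerts;
    }

    private void attempt(RetryMessage message) {
        boolean ok;
        try {
            ok = updateApi.test(message.payload());
        } catch (RuntimeException e) {
            ok = false;                                               // treat exceptions as failures
        }
        if (ok) {
            return;
        }
        if (message.attempt() < maxAttempts) {
            retryTopic.add(new RetryMessage(message.productId(), message.payload(), message.attempt() + 1));
        } else {
            alerts.add(message);                                      // give up and ask for manual intervention
        }
    }
}
```

The consumer thread is never blocked on a failing downstream call; it hands the work to the retry queue and moves on, and only messages that exhaust their attempts reach a human.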
Data Isolation
New consumers run in dedicated thread pools for better monitoring and increased concurrency.
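A dedicated pool per consumer could be created roughly as below, using the standard java.util.concurrent API. The pool name, sizes, and the caller-runs rejection policy are illustrative choices, not values taken from the article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConsumerPools {
    /** Builds a fixed-size pool whose threads carry the consumer's name for logs and dashboards. */
    static ThreadPoolExecutor newPool(String name, int threads, int queueCapacity) {
        AtomicInteger sequence = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable, name + "-" + sequence.incrementAndGet());
            thread.setDaemon(true);
            return thread;
        };
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),        // bounded queue: backlog stays visible
                factory,
                new ThreadPoolExecutor.CallerRunsPolicy());     // when full, the submitting thread runs the task
    }
}
```

Named threads make it easy to attribute CPU time and stack traces to a specific consumer, and metrics such as getActiveCount() and getQueue().size() can be exported per pool.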
Monitoring & Alerting
Rich monitoring metrics and alert notifications are established via log platforms, dashboards, and enterprise WeChat alerts to observe the new flow in real time.
3 Summary
Data Impact
After launch, downstream core interface call volume dropped dramatically, with reductions ranging from 50% to 80%; update‑type interfaces fell by 80% and query‑type interfaces by 50%.
Reflections
Clearly define the reasons for system refactoring, addressing existing problems and future business constraints.
Thoroughly understand the current system before redesign; assess business logic and impact scope.
Define good standards and architecture to guide future development and collaboration.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.