How a Monolith Redesign Boosted Content Ingestion Performance 13‑Fold
The article details how QQ Browser's content architecture was transformed from a fragmented micro‑service system into a single monolithic service with a plugin framework, dramatically improving processing speed, fault tolerance, and development efficiency while handling thousands of content types.
Content architecture is the content ingestion and computation layer of QQ Browser search, handling thousands of content types from many partners.
The old system had several problems: low development efficiency (adding a data type required changes in 3‑4 services), poor performance (CPU utilization never exceeded 40%, and the same payload was parsed as JSON many times), complex fault tolerance, slow iteration, excessive serialization between services, and difficulty scaling.
To address these, the team rebuilt the system from scratch (a "zero‑based" design): it moved from many micro‑services to a single monolithic service, introduced a plugin framework for flexible processing, and separated consumption threads from computation threads.
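The article does not show how the consumption and computation threads are separated. A minimal sketch, assuming a bounded in-process queue as the hand-off, might look like the following. `BoundedQueue`, `run_pipeline`, the queue capacity of 64, and the round-robin slicing that stands in for message partitions are all hypothetical, not the team's implementation.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded queue that hands messages from consumers to workers.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the queue is full, which applies back-pressure to consumers.
    // Returns false if the queue was closed before the item could be enqueued.
    bool push(T item) {
        std::unique_lock<std::mutex> lock(mu_);
        not_full_.wait(lock, [&] { return closed_ || items_.size() < capacity_; });
        if (closed_) return false;
        items_.push(std::move(item));
        not_empty_.notify_one();
        return true;
    }

    // Blocks until an item is available; returns nullopt once closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mu_);
        not_empty_.wait(lock, [&] { return closed_ || !items_.empty(); });
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return item;
    }

    void close() {
        std::lock_guard<std::mutex> lock(mu_);
        closed_ = true;
        not_empty_.notify_all();
        not_full_.notify_all();
    }

private:
    std::size_t capacity_;
    std::mutex mu_;
    std::condition_variable not_full_, not_empty_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Runs `consumers` fetch threads and `workers` processing threads (assumes
// workers >= 1). `process` is called concurrently and must be thread-safe.
// Returns the number of messages processed.
std::size_t run_pipeline(const std::vector<std::string>& messages,
                         unsigned consumers, unsigned workers,
                         const std::function<void(const std::string&)>& process) {
    BoundedQueue<std::string> queue(64);
    std::atomic<std::size_t> processed{0};

    std::vector<std::thread> worker_threads;
    for (unsigned w = 0; w < workers; ++w) {
        worker_threads.emplace_back([&] {
            while (auto msg = queue.pop()) {
                process(*msg);
                processed.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    std::vector<std::thread> consumer_threads;
    for (unsigned c = 0; c < consumers; ++c) {
        consumer_threads.emplace_back([&, c] {
            // Each consumer takes every `consumers`-th message, like a partition.
            for (std::size_t i = c; i < messages.size(); i += consumers) {
                queue.push(messages[i]);
            }
        });
    }

    for (auto& t : consumer_threads) t.join();
    queue.close();  // lets workers drain the queue and exit
    for (auto& t : worker_threads) t.join();
    return processed.load();
}
```

Because the two pools only share the queue, the worker count can be raised to use more cores without changing how many threads read from the message source, and a full queue slows the consumers down instead of exhausting memory.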
Key redesign points
Monolithic service to reduce RPC overhead and keep data in memory.
Plugin system for extensible handling of diverse content types.
Support for both incremental updates and batch reloads ("刷库", bulk re-processing of the stored data), each with a dedicated processing flow.
Fault‑tolerant design using Kafka for message buffering and peak‑shaving.
Horizontal scaling by decoupling consumption threads from processing threads.
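The plugin system in the list above is not specified in the article. One possible shape, assuming each plugin is a single processing step and each content type declares an ordered list of plugins, is sketched below. Every name here (`Document`, `Plugin`, `PluginRegistry`, `TrimTitle`, `RequireUrl`, the `"news"` type) is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory document passed between plugins without re-serialization.
struct Document {
    std::string content_type;
    std::map<std::string, std::string> fields;
};

// Hypothetical plugin interface: each plugin performs one processing step.
class Plugin {
public:
    virtual ~Plugin() = default;
    virtual std::string name() const = 0;
    // Returns false to stop the pipeline for this document.
    virtual bool process(Document& doc) = 0;
};

class PluginRegistry {
public:
    using Factory = std::function<std::unique_ptr<Plugin>()>;

    void register_plugin(const std::string& name, Factory factory) {
        factories_[name] = std::move(factory);
    }

    // Declares which plugins run, in order, for a content type.
    void set_pipeline(const std::string& content_type,
                      std::vector<std::string> plugin_names) {
        pipelines_[content_type] = std::move(plugin_names);
    }

    // Runs the configured pipeline and returns the names of the plugins that ran.
    std::vector<std::string> run(Document& doc) const {
        std::vector<std::string> executed;
        auto pipeline = pipelines_.find(doc.content_type);
        if (pipeline == pipelines_.end()) return executed;
        for (const auto& plugin_name : pipeline->second) {
            auto factory = factories_.find(plugin_name);
            if (factory == factories_.end()) break;  // unknown plugin: stop here
            std::unique_ptr<Plugin> plugin = factory->second();
            executed.push_back(plugin->name());
            if (!plugin->process(doc)) break;
        }
        return executed;
    }

private:
    std::map<std::string, Factory> factories_;
    std::map<std::string, std::vector<std::string>> pipelines_;
};

// Example plugin: strips leading and trailing spaces/tabs from the "title" field.
class TrimTitle : public Plugin {
public:
    std::string name() const override { return "trim_title"; }
    bool process(Document& doc) override {
        auto it = doc.fields.find("title");
        if (it == doc.fields.end()) return true;
        std::string& title = it->second;
        const auto first = title.find_first_not_of(" \t");
        if (first == std::string::npos) { title.clear(); return true; }
        const auto last = title.find_last_not_of(" \t");
        title = title.substr(first, last - first + 1);
        return true;
    }
};

// Example plugin: rejects documents that have no "url" field.
class RequireUrl : public Plugin {
public:
    std::string name() const override { return "require_url"; }
    bool process(Document& doc) override { return doc.fields.count("url") > 0; }
};

PluginRegistry make_demo_registry() {
    PluginRegistry registry;
    registry.register_plugin("trim_title", [] { return std::make_unique<TrimTitle>(); });
    registry.register_plugin("require_url", [] { return std::make_unique<RequireUrl>(); });
    registry.set_pipeline("news", {"require_url", "trim_title"});
    return registry;
}
```

With this shape, supporting a new content type means registering any new plugins and one `set_pipeline` call inside the same process, rather than touching several services.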
The new architecture replaces numerous if‑else branches with table‑driven logic, uses modern C++20 features (e.g., std::atomic<std::shared_ptr<T>>), adopts a faster JSON library (Sonic‑JSON), and integrates jemalloc for better memory handling.
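To illustrate how table‑driven logic and std::atomic<std::shared_ptr<T>> can fit together, here is a small sketch in which a handler table replaces an if‑else chain and is swapped atomically on reload. The `Dispatcher` class, the handler functions, and the content type names are hypothetical, and the article does not say this is how the feature is used. The C++20 specialization needs a standard library that implements it (libstdc++ 12 or newer, for example), so the sketch falls back to the older free‑function atomics when the feature‑test macro is absent.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical handler signature: maps a raw payload to a processed string.
using Handler = std::string (*)(const std::string&);
using HandlerTable = std::unordered_map<std::string, Handler>;

std::string handle_article(const std::string& p) { return "article:" + p; }
std::string handle_video(const std::string& p) { return "video:" + p; }
std::string handle_unknown(const std::string& p) { return "unknown:" + p; }

class Dispatcher {
public:
    explicit Dispatcher(HandlerTable table) : table_(freeze(std::move(table))) {}

    // One table lookup replaces a chain of `if (type == ...) else if ...`.
    std::string dispatch(const std::string& type, const std::string& payload) const {
        // Readers take an immutable snapshot; a concurrent reload cannot change it.
        std::shared_ptr<const HandlerTable> snapshot = load();
        auto it = snapshot->find(type);
        Handler handler = (it != snapshot->end()) ? it->second : &handle_unknown;
        return handler(payload);
    }

    // Publishes a new table; readers holding the old snapshot keep it alive.
    void reload(HandlerTable table) { store(freeze(std::move(table))); }

private:
    static std::shared_ptr<const HandlerTable> freeze(HandlerTable table) {
        return std::make_shared<HandlerTable>(std::move(table));
    }

#if defined(__cpp_lib_atomic_shared_ptr)
    // C++20 path: std::atomic<std::shared_ptr<T>>.
    std::shared_ptr<const HandlerTable> load() const {
        return table_.load(std::memory_order_acquire);
    }
    void store(std::shared_ptr<const HandlerTable> p) {
        table_.store(std::move(p), std::memory_order_release);
    }
    std::atomic<std::shared_ptr<const HandlerTable>> table_;
#else
    // Fallback for older toolchains: free-function atomic access to shared_ptr.
    std::shared_ptr<const HandlerTable> load() const {
        return std::atomic_load_explicit(&table_, std::memory_order_acquire);
    }
    void store(std::shared_ptr<const HandlerTable> p) {
        std::atomic_store_explicit(&table_, std::move(p), std::memory_order_release);
    }
    std::shared_ptr<const HandlerTable> table_;
#endif
};
```

Adding a content type then becomes one more table row, and the routing table can be replaced at runtime without a lock on the read path.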
CI/CD improvements include stricter code review, unified coding standards, automated pipelines, and dependency mirroring to speed up builds.
Performance results
| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Single‑core processing QPS | 13 | 172 | 13× |
| Batch QPS | 13 | 230 | 17× |
| Cluster batch QPS | 500‑1000 | 10000 | 10× |
| Average latency | 2.7 s | 0.8 s | -71% |
| p99 latency | 17 s | 1.9 s | -88% |
| CPU utilization | ≤40% | ≈100% | 2.5× |
R&D efficiency also improved: lead time for business requirements dropped from 5.72 days to ≤1 day (-82%), code issues were eliminated, unit‑test coverage rose to 77%, and the codebase shrank from 113k to 28k lines (-75%).
Sanyou's Java Diary