Backend Development 20 min read

How a Monolith Redesign Boosted Content Ingestion Performance 13‑Fold

The article details how QQ Browser's content architecture was transformed from a fragmented micro‑service system into a single monolithic service with a plugin framework, dramatically improving processing speed, fault tolerance, and development efficiency while handling thousands of content types.

Sanyou's Java Diary

Content architecture is the content ingestion and computation layer of QQ Browser search, handling thousands of content types from many partners.

Problems of the old system: low development efficiency (adding a new data type required changes across 3‑4 services), poor performance (CPU utilization capped at roughly 40%, with many redundant JSON parses), complex fault‑tolerance logic, slow iteration, excessive serialization between services, and difficulty scaling horizontally.

To address these, the team rebuilt the system from scratch, moving from many microservices to a single monolithic service, introducing a plugin framework for flexible processing, and separating consumption threads from computation threads.

Key redesign points

Monolithic service to reduce RPC overhead and keep data in memory.

Plugin system for extensible handling of diverse content types.

Support both incremental updates and batch bulk loads ("刷库") with dedicated processing flows.

Fault‑tolerant design using Kafka for message buffering and peak‑shaving.

Horizontal scaling by decoupling consumption threads from processing threads.

The new architecture replaces numerous if‑else branches with table‑driven logic, uses modern C++20 features (e.g., std::atomic<std::shared_ptr<T>>), adopts a faster JSON library (Sonic‑JSON), and integrates jemalloc for better memory allocation.

CI/CD improvements include stricter code review, unified coding standards, automated pipelines, and dependency mirroring to speed up builds.

Performance results

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Single‑core processing QPS | 13 | 172 | 13× |
| Batch QPS | 13 | 230 | 17× |
| Cluster batch QPS | 500–1000 | 10,000 | 10× |
| Average latency | 2.7 s | 0.8 s | −71% |
| p99 latency | 17 s | 1.9 s | −88% |
| CPU utilization | ≤40% | ≈100% | 2.5× |

R&D efficiency also improved: lead‑time for business requirements dropped from 5.72 days to at most 1 day (−82%), known code issues were eliminated, unit‑test coverage rose to 77%, and the codebase shrank from 113k lines to 28k (−75%).

Tags: backend, performance optimization, microservices, plugin architecture, C++, system redesign
Written by Sanyou's Java Diary

Passionate about technology, though not great at solving problems; eager to share, never tiring of learning!