
How We Revamped QQ Browser’s Content Engine: From Micro‑services Chaos to High‑Performance Monolith

This article details the complete redesign of QQ Browser Search's content ingestion system: why the original micro‑service architecture led to low development efficiency and poor performance, and how a from‑scratch redesign built on a monolithic service, a plugin framework, fault‑tolerant pipelines, and thread separation dramatically improved throughput, latency, and developer productivity.


1. Project Background

The content architecture of QQ Browser Search handles content ingestion and computation across thousands of content types from many partner sources. The existing micro‑service system suffered from low development efficiency and poor performance due to excessive RPC calls, redundant JSON parsing, and string copying.

Low R&D efficiency: Adding a new data type required changes in 3‑4 services, making development cumbersome.

Poor system performance: Data traversed many small services; CPU utilization of core services capped at 40%, and a single message required over 20 JSON parses.

Business teams complained about low throughput; for example, processing 600 million documents took 12 days.

2. Overall Design

The new design focuses on five key points:

Monolithic service: Replaces fragmented micro‑services with an in‑memory data flow, reducing RPC overhead.

Plugin system: Introduces a flexible plugin architecture to replace hard‑coded if‑else logic.

Support for incremental and batch refresh (刷库, full‑database reprocessing): separate stream configurations improve batch performance.

Fault tolerance: Uses Kafka for message buffering and peak‑shaving, ensuring no data loss during failures.

Horizontal scalability: Separates consumption and processing threads, enabling scaling beyond Kafka partition limits.

3. Detailed Design

3.1 From Micro‑services to Monolith

The original system consisted of many tiny services, each handling a specific ingestion path (HTTP, Kafka, DB pull, etc.), resulting in 6 RPC hops per content item. The new monolithic design keeps data in memory, eliminating most RPC calls and simplifying the processing pipeline.

3.2 Plugin‑based Ingestion Flow

Three layers are defined: ingestion, processing, and distribution. Each layer’s functions are implemented as plugins, allowing new content types to be added by configuring plugins rather than writing code.
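The mechanism can be sketched as a registry of named plugin functions, where a content type is just an ordered list of plugin names loaded from configuration. All names below are illustrative, not the actual plugin API:

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// A document is modeled here as a simple field map for brevity.
using Document = std::map<std::string, std::string>;
using Plugin = std::function<void(Document&)>;

// Global registry of named processing steps.
std::map<std::string, Plugin>& registry() {
    static std::map<std::string, Plugin> r;
    return r;
}

void register_plugin(const std::string& name, Plugin p) {
    registry()[name] = std::move(p);
}

// Run the plugin chain configured for one content type.
// Adding a new content type means adding a configuration entry
// (a list of plugin names), not writing new dispatch code.
void run_pipeline(const std::vector<std::string>& chain, Document& doc) {
    for (const auto& name : chain)
        registry().at(name)(doc);
}
```

With this shape, a new partner feed is onboarded by registering any genuinely new steps once and then listing an ordered chain of existing plugin names in configuration.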

Examples include batch ingestion tasks and document processing pipelines, both visualized with diagrams.

3.3 Incremental Updates vs. Batch Refresh

Four processing streams are configured: source update, feature update, source batch refresh, and feature batch refresh. This separation removes unnecessary computation during batch jobs, achieving a 10× QPS increase for refresh operations.
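A minimal sketch of how the four streams might map to separate plugin chains; the stage names are hypothetical, but the point is that refresh streams run trimmed pipelines:

```cpp
#include <map>
#include <string>
#include <vector>

enum class Stream { SourceUpdate, FeatureUpdate, SourceRefresh, FeatureRefresh };

// Each stream gets its own chain, so batch refresh skips steps that
// only matter for fresh incremental data. Reusing stored intermediates
// instead of recomputing them is where the refresh QPS gain comes from.
const std::map<Stream, std::vector<std::string>>& stream_chains() {
    static const std::map<Stream, std::vector<std::string>> chains = {
        {Stream::SourceUpdate,   {"parse", "dedup", "feature_compute", "index_write"}},
        {Stream::FeatureUpdate,  {"feature_compute", "index_write"}},
        {Stream::SourceRefresh,  {"parse", "index_write"}},
        {Stream::FeatureRefresh, {"index_write"}},
    };
    return chains;
}
```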

3.4 Fault‑tolerant Data Ingestion

All ingestion paths now funnel through Kafka, which buffers messages until they are successfully processed, guaranteeing no data loss even if a node crashes.
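The no‑data‑loss guarantee boils down to the at‑least‑once commit pattern: an offset is acknowledged only after processing succeeds, so a crash replays unacknowledged messages instead of dropping them. A self‑contained toy sketch follows; a real deployment would of course use a Kafka client rather than this in‑memory log:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a Kafka partition: a durable message buffer plus
// the last committed (acknowledged) offset.
struct Log {
    std::vector<std::string> messages;
    std::size_t committed = 0;
};

// Consume from the last committed offset. `handler` returns false to
// simulate a crash mid-processing; the offset for that message is then
// NOT committed, so a restarted consumer sees it again.
template <typename Handler>
void consume(Log& log, Handler handler) {
    for (std::size_t off = log.committed; off < log.messages.size(); ++off) {
        if (!handler(log.messages[off]))
            return;                      // crash: offset stays uncommitted
        log.committed = off + 1;         // commit only after success
    }
}
```

After a restart, calling consume again resumes from the committed offset, redelivering the message that was in flight when the node died.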

3.5 Consumer‑Processor Thread Separation

A lock‑free queue decouples Kafka consumption from document processing, allowing multiple processing threads per partition and improving CPU utilization and horizontal scalability.
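The decoupling can be sketched with a single‑producer/single‑consumer lock‑free ring buffer. The production system runs several processor threads per partition; one lane is enough to show the idea:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// Minimal SPSC lock-free ring buffer: the Kafka consumer thread pushes
// documents, a processing thread pops them, and neither ever blocks on
// a mutex. Capacity is N-1 (one slot distinguishes full from empty).
template <typename T, std::size_t N>
class SpscQueue {
    std::array<T, N> buf_;
    std::atomic<std::size_t> head_{0};  // next slot to pop
    std::atomic<std::size_t> tail_{0};  // next slot to push
public:
    bool push(T v) {
        std::size_t t = tail_.load(std::memory_order_relaxed);
        std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire))
            return false;                               // queue full
        buf_[t] = std::move(v);
        tail_.store(next, std::memory_order_release);   // publish the slot
        return true;
    }
    std::optional<T> pop() {
        std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return std::nullopt;                        // queue empty
        T v = std::move(buf_[h]);
        head_.store((h + 1) % N, std::memory_order_release);
        return v;
    }
};
```

Because processing throughput now scales with the number of processor threads rather than the number of Kafka partitions, the service can be sized to CPU capacity instead of topic layout.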

4. Diff Verification

A diff verification service aggregates logs from all 15 distribution endpoints, providing unified diff analysis and a recursive JSON comparison tool to handle complex data structures.
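The recursive comparison can be illustrated over a toy JSON‑like tree: flatten both documents to dotted paths, then report changed paths and paths present on only one side. The real tool operates on full JSON; this sketch only handles objects with string leaves:

```cpp
#include <map>
#include <string>
#include <vector>

// Toy JSON node: a string leaf, or an object of named children.
struct Node {
    std::string value;                    // used when `children` is empty
    std::map<std::string, Node> children; // object members
};

// Recursively flatten a tree into "a.b.c" -> leaf-value pairs.
void flatten(const Node& n, const std::string& path,
             std::map<std::string, std::string>& out) {
    if (n.children.empty()) { out[path] = n.value; return; }
    for (const auto& [key, child] : n.children)
        flatten(child, path.empty() ? key : path + "." + key, out);
}

// Diff two trees by comparing their flattened path maps.
std::vector<std::string> diff(const Node& a, const Node& b) {
    std::map<std::string, std::string> fa, fb;
    flatten(a, "", fa);
    flatten(b, "", fb);
    std::vector<std::string> out;
    for (const auto& [path, val] : fa) {
        auto it = fb.find(path);
        if (it == fb.end())          out.push_back("missing in B: " + path);
        else if (it->second != val)  out.push_back("changed: " + path);
    }
    for (const auto& [path, val] : fb)
        if (!fa.count(path))         out.push_back("missing in A: " + path);
    return out;
}
```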

5. Code Optimizations

5.1 Less Code

Adopted table‑driven programming to replace verbose if‑else chains, and used C++20's std::atomic<std::shared_ptr<T>> instead of hand‑rolled double‑buffer designs.

5.2 Higher Performance

Replaced repeated RapidJSON lookups with iterators, eliminated redundant JSON serialization, and introduced Sonic‑JSON, which is 40% faster than RapidJSON.
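The iterator change is the generic "look up once, reuse the handle" pattern. It is shown here with std::map so the sketch stays self‑contained; the same reasoning applies to caching RapidJSON member iterators instead of re‑searching the object on every access:

```cpp
#include <map>
#include <string>

// Before: every access repeats the key lookup (two searches per iteration).
int hits_slow(const std::map<std::string, int>& m, const std::string& k) {
    int total = 0;
    for (int i = 0; i < 3; ++i)
        if (m.count(k)) total += m.at(k);
    return total;
}

// After: one find(), then the iterator is reused as a cheap handle.
int hits_fast(const std::map<std::string, int>& m, const std::string& k) {
    auto it = m.find(k);
    if (it == m.end()) return 0;
    int total = 0;
    for (int i = 0; i < 3; ++i) total += it->second;
    return total;
}
```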

5.3 Better Foundations

Resolved apparent memory leaks by switching to jemalloc and refining memory‑pool usage, reducing OOM incidents.

6. R&D Process

6.1 Overall Workflow

Standardized requirement gathering, code review, coding standards, static analysis, CI/CD pipelines, and versioning (MAJOR.MINOR.PATCH).

6.2 Code Review

Engineers must pass mandatory security and coding‑style exams, and all changes undergo rigorous code‑review checks.

6.3 Documentation

Comprehensive documentation of architecture, operational procedures, and module READMEs ensures knowledge transfer.

6.4 Pipeline Acceleration

Implemented stage‑level locking in BlueShield pipelines and used GitHub mirrors to speed up dependency fetching.

7. Business Impact

7.1 Performance Gains

Processing performance: Single‑core QPS increased from 13 to 172 (13× improvement).

Batch refresh: QPS rose from 1,000 to 10,000 (10×), now limited only by storage.

Latency reduction: Average processing time dropped from 2.7 s to 0.8 s (70%+ reduction).

7.2 R&D Efficiency Gains

Lead‑time reduction: Feature development time fell from 5.72 days to 1 day (82% decrease).

Code size reduction: Total lines of code shrank from 113 k to 28 k (75% reduction) due to monolith consolidation, plugin design, and modern C++ usage.

Overall, the redesign delivered a more reliable, scalable, and maintainable content ingestion platform for QQ Browser Search.

Tags: backend, performance optimization, microservices, system design, plugin architecture, C++
Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on the transformation of operations work and aim to accompany you throughout your operations career.
