How Intelligent Traffic Distribution Boosts New Book Exposure in Reading Apps
This article describes the design and implementation of an intelligent traffic distribution system for a reading platform: its background, overall architecture, and sub-modules such as the small‑traffic experiment platform, near‑line computation, retrieval strategies, and pacing algorithms, as well as how the system balances user personalization with content‑ecosystem growth.
Background
Traditional intelligent distribution focuses on user‑centric metrics such as click‑through, read‑through, and payment rates. For the reading group’s apps, distribution must also serve content‑side ecosystem goals: promoting new books and fostering their growth. An intelligent traffic distribution system was therefore built to provide fine‑grained, per‑request control (by user, slot, and time) while balancing personalization with platform content growth.
Overall Architecture
The near‑line component aggregates real‑time exposure counts per book and recommendation slot, supporting flexible slot combinations with second‑level feedback. The retrieval side uses these UV (unique‑visitor) statistics together with user profiles, recall, and personalized ranking to output content under traffic constraints. The offline side syncs content pools, slot info, and exposure controls, and generates reports for analysis and content promotion.
Sub‑module Overview
Small‑Traffic Experiment Platform
The platform has a three‑layer structure (platform / scenario / experiment domain). Features include:
Platform + scenario + user/device granularity for experiment plans
Multi‑dimensional traffic slicing (e.g., tail numbers, random ratios, whitelist)
Config‑driven, flexible and extensible
Hot‑load support for experiments with second‑level activation
Node‑level rollout, rollback, and full‑experiment propagation
Fault tolerance and robustness checks (traffic overlap, baseline/experiment ID conflicts, parameter validation, traffic hit detection)
Mutual exclusion within the same layer and orthogonal experiments across layers, with traffic reuse mechanisms
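A minimal sketch of how the slicing and layering described above might work (function names and the bucket count are illustrative, not the platform's actual API): each layer salts the hash so splits are orthogonal across layers, and slices within a layer are non‑overlapping bucket ranges.

```python
import hashlib

def bucket(user_id: str, layer: str, num_buckets: int = 100) -> int:
    """Hash user_id together with a layer-specific salt so that bucket
    assignments are independent (orthogonal) across layers."""
    digest = hashlib.md5(f"{layer}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % num_buckets

def assign(user_id, layer, experiments):
    """experiments: list of (name, start_bucket, end_bucket) slices.
    Slices within one layer must not overlap (mutual exclusion)."""
    b = bucket(user_id, layer)
    for name, start, end in experiments:
        if start <= b < end:
            return name
    return None  # user falls outside all experiment slices
```

Because the salt differs per layer, a user's bucket in the recall layer says nothing about their bucket in the ranking layer, which is what allows full traffic reuse across layers.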
Two experiment dimensions are supported:
Global granularity (user profile, recall, ranking model, etc.)
Single‑book granularity (traffic speed, exposure caps)
Near‑line Computation
Real‑time flow built with Kafka + Flink provides:
Roughly 40k QPS of real‑time behavior events
Real‑time UV and conversion statistics per hour/day
Shared exposure across multiple slots
Tumbling‑window aggregation
Probabilistic counting via HyperLogLog
Token‑bucket based flow‑rate control
Connection pooling and pipelined storage writes
Real‑time exposure/conversion reporting
Hourly experiment feedback and book‑pool updates
Zookeeper‑driven configuration updates and hot‑load triggers
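The token‑bucket rate control listed above can be sketched as follows. This is a minimal single‑process version for illustration; the production control presumably runs distributed inside the Flink job.

```python
import time

class TokenBucket:
    """Minimal token bucket: admits at most `rate` events per second
    on average, allowing bursts of up to `capacity` events."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket created with `TokenBucket(rate=10, capacity=5)` would pass a burst of five events immediately, then settle to ten events per second.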
Retrieval Strategy
Exposure pacing follows a budget‑pacing approach similar to ad‑delivery, aiming for uniform exposure over the target period. It combines offline traffic estimation with per‑time‑slice proportional control, adjusting each book’s competition probability based on real‑time UV and expected exposure.
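The per‑time‑slice proportional control can be sketched like this (a simplified model, not the system's actual formula): a book's probability of entering the competition is throttled when its delivered exposure runs ahead of the uniform plan, and opened up when it falls behind.

```python
def pacing_probability(target_total: int, actual_so_far: int,
                       elapsed_fraction: float) -> float:
    """Probability that a book competes in the current request.

    target_total:     planned exposures for the whole period
    actual_so_far:    exposures already delivered
    elapsed_fraction: fraction of the period elapsed (0.0..1.0)
    """
    planned_so_far = target_total * elapsed_fraction
    if actual_so_far <= 0:
        return 1.0  # nothing delivered yet: always compete
    # Ahead of plan -> probability < 1 (throttle); behind -> clamp to 1.
    return max(0.0, min(1.0, planned_so_far / actual_so_far))
```

For example, a book planned for 1,000 exposures that has already received 800 at the halfway mark competes with probability 500/800 = 0.625.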
Personalized Recall and Ranking
Multiple recall algorithms are employed: dual‑tower vector recall, inverted‑index collaborative filtering, tag‑based recall, popularity‑based recall, and diversity‑exploration recall. Configurable factors include behavior decay, trigger thresholds, gender, and short/long‑term behavior ratios. A DeepFM‑based ranking model incorporates user historical behavior and item attention, with additional weighting based on historical exposure metrics.
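As a rough sketch of the dual‑tower vector recall mentioned above (a brute‑force version for clarity; a real deployment would serve precomputed item embeddings from an approximate‑nearest‑neighbor index):

```python
import numpy as np

def dual_tower_recall(user_vec, item_vecs, item_ids, k=3):
    """Score candidate items by the inner product between the
    user-tower embedding and precomputed item-tower embeddings,
    then return the top-k item ids."""
    scores = item_vecs @ user_vec       # one dot product per item
    top = np.argsort(-scores)[:k]       # indices of highest scores
    return [item_ids[i] for i in top]
```

The two towers are trained jointly but served separately: item embeddings are computed offline, so online retrieval reduces to a nearest‑neighbor lookup against the user embedding.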
Traffic Promotion Strategy
The recommendation pool supports dynamic promotion: based on exposure conversion performance, the system adjusts future exposure quotas, slot assignments, and weights for each book.
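One plausible shape for this quota adjustment (the target rate, step bound, and function are illustrative assumptions, not the system's disclosed logic): scale each book's next‑period exposure quota by how its observed conversion rate compares to a target, with the adjustment bounded per period.

```python
def next_quota(current_quota: int, conversions: int, exposures: int,
               target_cvr: float = 0.05, step: float = 0.2) -> int:
    """Books converting above target_cvr earn more exposure next
    period; underperformers get less. `step` bounds how far the
    quota can move in a single period."""
    if exposures == 0:
        return current_quota  # no data yet: leave the quota unchanged
    cvr = conversions / exposures
    factor = max(1.0 - step, min(1.0 + step, cvr / target_cvr))
    return int(current_quota * factor)
```

Bounding the per‑period change keeps a single noisy window from swinging a book's traffic too sharply in either direction.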
Yuewen Technology
The Yuewen Group tech team supports and powers services like QQ Reading, Qidian Books, and Hongxiu Reading. This account targets internet developers, sharing high‑quality original technical content. Follow us for the latest Yuewen tech updates.