Vivo Hawking A/B Experiment Platform: Architecture, Practices, and Solutions
The Vivo Hawking platform provides a company‑wide, one‑stop A/B testing solution. Built on a layered experiment architecture, it offers covariate‑balanced split algorithms, real‑time split monitoring, and unified SDKs for Android, Java, and H5, supporting nearly a thousand experiments per day, automated analysis, and rapid product iteration across more than twenty departments.
1. Introduction
The article introduces the Vivo Hawking experiment platform, describing its system architecture, the problems encountered during business development, and the corresponding solutions.
2. Project Overview
Vivo’s internet products have shifted from a growth‑driven phase to a data‑driven, scientific development model. A/B testing has become a core tool for improving conversion efficiency and accelerating product‑research iteration. Hawking has evolved from a single system into a company‑wide, one‑stop platform that supports large‑scale A/B experiments.
2.1 A/B Experiment
An A/B experiment randomly splits traffic into two groups (A and B) to compare a new version of a page or feature with the old one. The article gives a concrete example where version A achieves a 70% conversion rate versus 50% for version B.
The experiment lifecycle is divided into three stages: pre‑experiment (define goals and metrics), during experiment (allocate traffic), and post‑experiment (evaluate results).
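The post‑experiment stage comes down to deciding whether an observed difference like 70% vs. 50% is statistically significant. A minimal sketch of that evaluation with a two‑proportion z‑test (the sample sizes of 1,000 users per group are assumed for illustration; the article does not state them):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test for a conversion-rate difference."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 700/1000 vs 500/1000 conversions: the article's 70% vs 50% example
z, p = two_proportion_z(700, 1000, 500, 1000)
```

At these assumed sample sizes the p‑value is far below 0.05, so the platform would report the uplift of version A as significant.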
2.2 Layered Experiment Model
Hawking’s layered model follows Google’s “Overlapping Experiment Infrastructure” paper: traffic is re‑randomized independently in each layer, so every layer sees the full traffic and experiments running in different layers do not bias one another, while keeping consistent statistical guarantees.
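The per‑layer re‑randomization described in the paper is typically achieved by hashing the user ID together with a layer‑specific salt. A minimal sketch (function and salt names are illustrative, not Hawking’s actual implementation):

```python
import hashlib

def bucket(uid: str, layer_salt: str, n_buckets: int = 100) -> int:
    """Hash the uid together with a per-layer salt, so each layer
    re-randomizes traffic independently of every other layer."""
    digest = hashlib.md5(f"{layer_salt}:{uid}".encode()).hexdigest()
    return int(digest, 16) % n_buckets

# The same user falls into statistically independent buckets on different
# layers, so a UI-layer experiment does not bias a ranking-layer one.
ui_bucket = bucket("user-42", "ui-layer")
ranking_bucket = bucket("user-42", "ranking-layer")
```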
2.3 Platform Development and Business Value
Supports more than 900 daily experiments (peak >1000) across 20+ departments.
Standardized workflow lowers experiment entry barriers and improves efficiency.
Automated data‑analysis tools accelerate decision‑making and product iteration.
Reusable platform components avoid duplicated effort across teams.
3. Hawking System Architecture
The platform consists of several modules:
Experiment Personnel: roles for managing experiments, metrics, and analysis.
Experiment Portal: experiment management and result analysis.
Metric Management: built‑in and custom metrics integrated with the company’s big‑data metric system.
Comparison & Significance: visual components showing uplift, confidence intervals, and significance.
AA Analysis: validates that experiment groups are balanced on core metrics before the experiment starts.
Real‑time Split Monitoring: monitors traffic distribution and allows manual intervention.
Experiment Split Service: provides SDKs for Android, Java, and H5 (NGINX), exposed over Dubbo/HTTP, with a C++ SDK planned.
Split Methods: random split, targeted‑audience split, and covariate‑balanced split.
3.4–3.6 Data Services
Split data collection uses a unified data‑capture component and stores processed data in HDFS.
Metric calculation runs in an independent service with retry and alert mechanisms.
Data storage relies on MySQL for business data, Ehcache for configuration cache, Redis for auxiliary cache, and HDFS for experiment data.
4. Hawking Practices
4.1 Covariate Balancing Algorithm
Problem: Simple hash‑mod‑100 splitting can produce groups with uneven covariate distributions, harming statistical validity.
Solution: A three‑part covariate‑balanced algorithm consisting of offline stratified sampling, real‑time uniform grouping, and offline verification.
(1) Offline Stratified Sampling
Define core metrics with business owners.
Apply proportional stratification + K‑means clustering to obtain stratified samples.
Write sampled data into Hive tables.
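The proportional‑stratification step above can be sketched as follows, a simplified in‑memory version in which the K‑means cluster labels are assumed to have been computed already (function and parameter names are illustrative):

```python
import random
from collections import defaultdict

def stratified_sample(uids, stratum_of, sample_size, seed=7):
    """Draw a sample whose stratum proportions mirror the population's.
    stratum_of maps uid -> stratum label (e.g. a K-means cluster id)."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for uid in uids:
        by_stratum[stratum_of[uid]].append(uid)
    sample = []
    for members in by_stratum.values():
        # each stratum's quota is proportional to its population share
        quota = round(sample_size * len(members) / len(uids))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

In production the input would come from user‑behavior tables and the output would be written back to Hive, as the article describes.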
(2) Real‑time Uniform Grouping
Synchronize stratified sample tables from Hive to Redis (uid → layer mapping, layer‑wise ratios).
Create experiments by linking experiment IDs, group IDs, and sample sizes to the latest layer data.
During split, look up a user’s layer and assign the user uniformly to a group within that layer.
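The lookup‑then‑assign step can be sketched with a least‑filled‑group rule, which keeps per‑layer group sizes uniform (a plain dict stands in for the Redis uid → layer mapping; class and method names are illustrative):

```python
from collections import defaultdict

class LayeredUniformSplitter:
    """Assign each user to the currently least-filled group within that
    user's stratification layer, so every layer is evenly represented in
    every experiment group."""

    def __init__(self, uid_to_layer, groups):
        self.uid_to_layer = uid_to_layer          # Redis hash in production
        self.groups = list(groups)
        self.counts = defaultdict(int)            # (layer, group) -> assigned

    def assign(self, uid):
        layer = self.uid_to_layer.get(uid, "default")
        group = min(self.groups, key=lambda g: self.counts[(layer, g)])
        self.counts[(layer, group)] += 1
        return group
```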
(3) High‑Performance Split Schemes
Three Redis‑based designs were evaluated:
Scheme 1 — HASH storing per‑bucket sample counts: fastest with only 2 buckets, but latency grows linearly with the number of buckets.
Scheme 2 — SORTED SET with bucket scores: stable performance regardless of bucket count.
Scheme 3 — HASH taking the layer sample count modulo the bucket size: stable, 1.12× faster than scheme 2, at 58% of single‑GET latency.
Scheme 3 was chosen for production.
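One reading of scheme 3 (this interpretation is mine, not spelled out in the article) is a single atomic per‑layer counter increment, with the group derived by taking the running count modulo the number of buckets, so one Redis HASH operation suffices per split:

```python
from collections import defaultdict

class ModuloSplitter:
    """Sketch of scheme 3: one atomic per-layer counter increment
    (an HINCRBY on a Redis HASH in production; a dict here), then
    group = running sample count modulo the number of groups."""

    def __init__(self, n_groups):
        self.n_groups = n_groups
        self.layer_counts = defaultdict(int)   # layer -> samples seen so far

    def assign(self, layer):
        count = self.layer_counts[layer]       # HINCRBY returns the new count
        self.layer_counts[layer] = count + 1
        return count % self.n_groups           # round-robin within the layer
```

Because the counter is atomic, concurrent splits within a layer always land in evenly rotating groups, which matches the scheme’s stable latency profile.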
Memory‑Efficient User‑Info Storage
Three storage designs were compared; the third — sharding user info across 10,000 primary hashes, each containing 125 secondary hashes — offered the best memory footprint and was selected.
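The point of sharding into many small hashes is that Redis stores small hashes in a compact encoding (ziplist/listpack), so memory drops sharply versus one giant hash. A sketch of the key routing (the key format and hash function are assumptions; the 10,000 × 125 shard counts come from the article):

```python
import hashlib

NUM_PRIMARY = 10000    # primary hashes (figure from the article)
NUM_SECONDARY = 125    # secondary hashes per primary (figure from the article)

def shard_key(uid: str) -> str:
    """Route a uid to one of 10000 * 125 small Redis hashes; each stays
    small enough for Redis's compact small-hash encoding."""
    h = int(hashlib.md5(uid.encode()).hexdigest(), 16)
    primary = h % NUM_PRIMARY
    secondary = (h // NUM_PRIMARY) % NUM_SECONDARY
    return f"user_info:{primary}:{secondary}"   # HSET <key> <uid> <info>
```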
4.2 Java SDK
The early Java SDK only provided split routing and required clients to report split results themselves, leading to high integration cost and performance bottlenecks (Dubbo thread‑pool exhaustion, network failures). Subsequent upgrades added built‑in split‑result reporting, real‑time configuration updates, self‑monitoring, and fallback mechanisms, substantially improving stability.
4.3 H5 Experiments
Traditional H5 A/B SDKs have several problems: they require client code changes, mask the page while deciding which variant to show, and impose long integration cycles. Hawking’s solution uses an APISIX‑based VUA (Unified Access) layer that automatically injects routing rules via a visual configuration platform, eliminating code changes on the client side.
Multi‑version and multi‑page H5 experiments are supported through APISIX plugins that rewrite upstream paths based on experiment configuration.
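The gateway‑level split can be pictured with APISIX’s stock `traffic-split` plugin, which weights requests between upstreams on a route. This is only an illustrative config fragment — the hostnames and weights are made up, and Hawking uses its own experiment‑aware plugin rather than this exact one:

```json
{
  "uri": "/h5/activity/*",
  "plugins": {
    "traffic-split": {
      "rules": [
        {
          "weighted_upstreams": [
            {
              "upstream": {
                "name": "variant-b",
                "type": "roundrobin",
                "nodes": { "h5-variant-b.internal:80": 1 }
              },
              "weight": 50
            },
            { "weight": 50 }
          ]
        }
      ]
    }
  },
  "upstream": {
    "type": "roundrobin",
    "nodes": { "h5-variant-a.internal:80": 1 }
  }
}
```

Half the traffic is routed to the variant‑B upstream and the rest falls through to the default (variant‑A) upstream, all without any change to the page’s own code.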
5. Experiment Effect Analysis
The platform provides metric services, near‑real‑time metric calculation, AA analysis, and visual dashboards for experiment evaluation.
6. Summary and Outlook
The Hawking platform enables a closed loop of experiment creation → data analysis → decision → iteration, offering:
Simple, flexible experiment workflow.
Scientific multi‑layer split algorithms without code releases.
Real‑time split monitoring and hourly metric dashboards.
Custom metric support without waiting for analyst‑built reports.
Future work will focus on improving user experience, simplifying metric configuration, and enhancing interactive data analysis (multi‑dimensional, attribution analysis).
vivo Internet Technology