Fundamentals and Implementation of A/B Testing at Qunar
This article explains the fundamentals of A/B testing as practiced at Qunar: the basic principles, a practical demo, the platform architecture, statistical validation, sample‑size estimation, and the reporting workflow, illustrating how data‑driven experiments are designed, executed, and analyzed to evaluate advertising strategies and product features.
Author Background: Wang Lei graduated from China Agricultural University in 2006, worked at UFIDA on financial products, then at Sohu on SNS and big data, and joined Qunar in August 2015 to develop the A/B Test platform and log collection system.
A/B Test Basic Principles: A/B testing is a single‑variable controlled experiment that changes one factor (e.g., ad background color) while keeping others constant, comparing a control group (old version) with an experimental group (new version) using metrics such as click‑through rate.
Demo Example: An advertising redesign experiment split users 55/45 between old and new pages, recorded clicks over 30 days, and showed a modest increase in click‑through rate for the new version (6.23% vs 5.26%). However, statistical testing (Z‑test) revealed no significant difference.
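The Z‑test behind that conclusion can be sketched as a pooled two‑proportion z statistic. The per‑group sample counts in the usage comment are illustrative assumptions, since the article does not report the actual traffic volumes:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Pooled two-proportion z statistic for comparing two click-through rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)          # pooled CTR under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Identical rates give z = 0; with the article's threshold, the null is
# rejected only when |z| > 1.65. With hypothetical 1000 users per group,
# rates near 5.3% vs 6.2% fall well short of that threshold.
z = two_proportion_z(53, 1000, 62, 1000)
```

With small samples, even a visible lift in raw CTR can be indistinguishable from noise, which is exactly the demo's outcome.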
Effective A/B Test Design: Experiments can be placed in exclusive or parallel traffic zones, with hierarchical layers allowing multiple independent experiments to run simultaneously without interference.
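One common way to implement such layered, non‑interfering splits is to salt a hash of the user ID with the layer name, so a user's bucket in one layer is independent of their bucket in any other. A minimal sketch, where the function names, bucket count, and 55/45 split are assumptions for illustration rather than the platform's actual design:

```python
import hashlib

def bucket(user_id: str, layer: str, buckets: int = 100) -> int:
    """Deterministically map a user into one of `buckets` slots for a layer.
    Salting the hash with the layer name decorrelates assignments across
    layers, so parallel experiments do not interfere with each other."""
    digest = hashlib.md5(f"{layer}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % buckets

def assign(user_id: str, layer: str, split: int = 55) -> str:
    """A 55/45 split like the demo: slots below `split` get the old page."""
    return "old" if bucket(user_id, layer) < split else "new"
```

Because the assignment is a pure function of user ID and layer, a returning user always sees the same variant without any server-side state.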
Qunar A/B Test Platform Overview: The platform has three parts: experiment management, SDK integration, and reporting. An experiment creator registers an experiment and obtains an experiment code; the application embeds the SDK, which fetches the experiment configuration, performs traffic splitting, logs events, and ships the data to Hadoop via Flume. The reporting system aggregates the logs, computes common metrics, and provides dashboards and APIs for custom metrics.
SDK Usage: The service layer calls the SDK’s traffic‑splitting API, receives the assigned variant, and returns it to the front‑end for rendering.
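That service‑layer flow might look like the sketch below. The class and method names are illustrative stand‑ins, not Qunar's real SDK API, and the in‑memory log list stands in for the Flume‑to‑Hadoop pipeline:

```python
import hashlib

class ABTestSDK:
    """Illustrative stand-in for the platform SDK described above."""

    def __init__(self, experiments):
        # Config is normally fetched from the experiment-management
        # service; here it is passed in directly for simplicity.
        self.experiments = experiments  # {exp_id: [("old", 55), ("new", 45)]}
        self.log = []                   # stand-in for Flume -> Hadoop logging

    def get_variant(self, exp_id, user_id):
        # Hash the user into a slot 0-99, then walk the cumulative weights.
        slot = int(hashlib.md5(f"{exp_id}:{user_id}".encode()).hexdigest(), 16) % 100
        threshold = 0
        for variant, weight in self.experiments[exp_id]:
            threshold += weight
            if slot < threshold:
                self.log.append((exp_id, user_id, variant))
                return variant

def handle_request(sdk, user_id):
    # Service layer calls the traffic-splitting API and hands the
    # assigned variant back to the front end for rendering.
    return {"user": user_id, "variant": sdk.get_variant("ad_redesign", user_id)}
```

Every call both resolves the variant and emits an exposure log, which is what the reporting system later joins against click events.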
Statistical Validation & Minimum Sample Size: Results are evaluated using a Z‑test (|Z| > 1.65 indicates significance). Sample‑size calculations depend on the significance level (α) and the statistical power (1 − β), use normal‑distribution assumptions, and account for the traffic ratio (e.g., 40%/60%). Formulas for the total required sample (n₁ + n₂) are provided.
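A common form of the two‑proportion sample‑size formula, adjusted for an unequal traffic ratio, can be sketched as follows. The default z values assume a one‑sided α = 0.05 test (matching the article's 1.65 threshold) and 80% power; the article does not state which exact values Qunar uses:

```python
import math

def sample_size(p1, p2, alpha_z=1.65, beta_z=0.84, ratio=1.5):
    """Required samples (n1, n2) to detect a shift from rate p1 to p2.

    ratio = n2 / n1, e.g. 1.5 for a 40%/60% traffic split.
    alpha_z ~ 1.65 for one-sided alpha = 0.05; beta_z ~ 0.84 for 80% power.
    """
    delta = p2 - p1
    # Variance of the rate difference, with group 2 getting `ratio` times
    # as much traffic as group 1.
    var = p1 * (1 - p1) + p2 * (1 - p2) / ratio
    n1 = (alpha_z + beta_z) ** 2 * var / delta ** 2
    return math.ceil(n1), math.ceil(n1 * ratio)
```

Plugging in the demo's rates (5.26% vs 6.23%) yields several thousand users per group, which shows why 30 days of traffic can still be too little to call a ~1‑point lift significant.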
Confidence Interval for Difference: The delta series follows a normal distribution; a 95% confidence interval for the difference between versions is computed, showing the range of possible improvement. If the interval includes zero, the new version is not considered better.
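That interval can be computed with the unpooled normal approximation for a difference of proportions. The click counts in the usage comment are hypothetical, chosen only to illustrate an interval that straddles zero:

```python
import math

def diff_ci(clicks_a, n_a, clicks_b, n_b, z=1.96):
    """95% confidence interval for the CTR difference (new - old),
    using the unpooled normal approximation."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Hypothetical counts: 53/1000 vs 62/1000 clicks. The resulting interval
# contains zero, so the new version cannot be declared better.
low, high = diff_ci(53, 1000, 62, 1000)
```

Reporting the interval, not just the point estimate, tells stakeholders both the plausible upside and the plausible downside of shipping the change.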
Conclusion: A single well‑designed A/B test, combined with proper statistical analysis, can determine whether a product change yields a significant effect.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.