Big Data · 22 min read

A/B Testing Framework for Online Experiments: Design, Implementation, Analysis, and Decision Making

The article presents a comprehensive A/B testing framework for online experiments that guides practitioners through four stages: designing objectives and metrics, implementing random traffic allocation with robustness checks, evaluating effects using descriptive statistics and hypothesis testing, and making rollout decisions based on multidimensional significance and attribution analyses.

Alimama Tech

The article introduces A/B testing within an online traffic-splitting framework, whose purpose is to compare two strategies under otherwise consistent conditions.

It details a four-step experimental process: design, implementation, effect evaluation, and decision making.

In the design phase, it covers defining objectives (e.g., comparing ROI or long-term effects) and structuring the experiment through metric design (invariant vs. variable metrics) and traffic acquisition and allocation, including choosing the randomization unit, estimating sample size from formulas involving Type I/II error rates, and handling multiple metrics.
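The sample-size formula alluded to here is the standard two-sample calculation driven by the Type I error rate α and the Type II error rate β (power = 1 − β). A minimal sketch for a mean metric, assuming a two-sided test and equal variance σ² in both groups:

```python
import math
from scipy.stats import norm

def sample_size_per_group(sigma, delta, alpha=0.05, power=0.8):
    """Minimum samples per group to detect a mean difference `delta`
    between treatment and control, given standard deviation `sigma`,
    two-sided significance level `alpha`, and desired power."""
    z_alpha = norm.ppf(1 - alpha / 2)  # quantile for the Type I error rate
    z_beta = norm.ppf(power)           # quantile for the Type II error rate
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)

# Detecting a 0.1 lift on a metric with sigma = 1.0 at alpha=0.05, power=0.8
n = sample_size_per_group(sigma=1.0, delta=0.1)  # ~1570 per group
```

The formula shows the practical trade-off the article points at: halving the detectable effect `delta` quadruples the required traffic.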

The implementation section discusses random unit granularity (page, session, user levels), orthogonal vs. mutually exclusive experiment designs, and robustness checks using AA tests and overlapping stratified bucketing to mitigate bucket bias.
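Orthogonal layering is typically implemented by salting a hash of the randomization unit with a per-layer name, so assignments across layers are statistically independent while staying deterministic for each user. A minimal sketch (layer names here are hypothetical, not from the article):

```python
import hashlib

def bucket(user_id: str, layer: str, num_buckets: int = 100) -> int:
    """Deterministically assign a randomization unit to a bucket.
    Salting the hash with the layer name decorrelates assignments
    across layers (orthogonal design); reusing one layer's buckets
    for several experiments gives a mutually exclusive design."""
    digest = hashlib.md5(f"{layer}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % num_buckets

# The same user lands in independent buckets in different layers,
# but always in the same bucket within one layer.
b_rank = bucket("user_42", "ranking_layer")
b_ui = bucket("user_42", "ui_layer")
```

An AA test in this setup simply assigns both arms the same strategy and checks that metric differences between buckets are statistically indistinguishable, which is how bucket bias is detected before the real experiment runs.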

Effect evaluation covers descriptive statistics (central tendency, dispersion, distribution shape, frequency and trend analysis, correlation) and statistical inference including hypothesis testing (bootstrap + t-test), p-value interpretation, statistical power, and confidence intervals.
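The bootstrap-plus-t-test combination can be sketched as follows: a percentile bootstrap gives a distribution-free confidence interval for the mean, while Welch's t-test supplies the p-value. This is an illustrative sketch on synthetic data, not the article's implementation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bootstrap_ci(sample, n_resamples=10_000, level=0.95):
    """Percentile bootstrap confidence interval for the sample mean."""
    means = np.array([
        rng.choice(sample, size=len(sample), replace=True).mean()
        for _ in range(n_resamples)
    ])
    lo, hi = np.percentile(means, [(1 - level) / 2 * 100,
                                   (1 + level) / 2 * 100])
    return lo, hi

# Synthetic control/treatment samples with a small true lift
control = rng.normal(10.0, 2.0, size=1000)
treatment = rng.normal(10.2, 2.0, size=1000)

# Welch's t-test (no equal-variance assumption) for the p-value
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
ci_lo, ci_hi = bootstrap_ci(treatment - control.mean())
```

In practice the bootstrap is what rescues inference for heavy-tailed business metrics (e.g., revenue per user) where normality assumptions behind the plain t-test are shaky.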

Depth analysis includes dimensional subdivision (e.g., user segmentation, position-based analysis) using methods like stratified chi‑square (CMH test) for rate metrics and ANOVA/LSD for mean metrics, and attribution analysis via process metrics, dropout data, and post‑conversion retention, employing models such as funnel, user path, session analysis, and regression/ML techniques.
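The stratified chi-square (Cochran-Mantel-Haenszel) test mentioned for rate metrics pools evidence across strata (e.g., user segments) while controlling for the stratification variable. A minimal sketch of the CMH statistic for 2×2 tables, written from the textbook formula rather than the article:

```python
from scipy.stats import chi2

def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over 2x2 strata.
    Each table is ((a, b), (c, d)): rows = treatment/control,
    columns = converted/not converted."""
    num, var = 0.0, 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        # Observed minus expected conversions in the treatment row
        num += a - (a + b) * (a + c) / n
        # Hypergeometric variance of cell `a` within the stratum
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    stat = num ** 2 / var
    return stat, chi2.sf(stat, df=1)  # 1 degree of freedom

# Two strata with identical rates in both arms: no pooled effect
stat, p = cmh_test([((10, 90), (10, 90)), ((20, 80), (20, 80))])
```

Stratifying this way guards against Simpson's-paradox reversals that a single pooled chi-square on the merged table could produce.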

Finally, the decision‑making step outlines how to interpret results: assessing core metric significance, examining growth across dimensions, and deciding on full‑scale rollout, partial rollout, or rollback based on multidimensional evidence.
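The three-way outcome described above can be condensed into a toy decision rule. The function and its inputs are illustrative assumptions, not the article's procedure:

```python
def rollout_decision(core_significant: bool, core_lift: float,
                     harmed_segments: list) -> str:
    """Toy rule combining core-metric significance with per-dimension
    results. `harmed_segments` lists segments (e.g., user cohorts or
    positions) showing statistically significant losses."""
    if core_significant and core_lift > 0 and not harmed_segments:
        return "full rollout"
    if core_significant and core_lift > 0:
        return "partial rollout"  # exclude or fix the harmed segments
    return "rollback"

decision = rollout_decision(True, 0.03, harmed_segments=["new_users"])
```

The point of the multidimensional check is exactly this middle branch: a significant aggregate win can mask losses in specific segments, which argues for a partial rollout rather than a blanket one.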

data analysis · A/B Testing · statistical inference · experimental design · online experiments
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
