Understanding Online Experiments: Origins, Development, Types, and Applications
Online experiments are rooted in the biomedical randomized controlled trial. They have become essential for internet businesses pursuing data-driven growth: they support causal inference, quantify the value of a change, and control risk, using designs such as AB, ABn, AA, multivariate, and quasi-experimental tests.
In internet business, "growth" is an eternal theme. As the era of abundant traffic fades, achieving effective growth when the effect of each strategy is hard to observe directly has become a major challenge for online companies. Online experiments have been a crucial measurement tool since Google first applied experimental techniques to its products in 2000, and they are now indispensable for strategy validation, product iteration, algorithm optimization, and risk control.
"One accurate measurement is worth a thousand expert opinions." – Admiral Grace Hopper
Alibaba’s advertising division has accumulated extensive practice and technical expertise in online experiments. The following series will share this knowledge, covering topics such as "Understanding Online Experiments", "AB Testing under Traffic Splitting Framework", and "AB Testing under Offline Sampling Framework".
1. What Is an Online Experiment
1.1 Origin
The concept of AB testing originates from the randomized controlled trial (RCT) used in biomedical research to evaluate drug efficacy. RCTs randomize subjects into groups, apply different treatments, and compare outcomes, thereby minimizing bias and confounding factors. This statistical foundation makes RCTs the gold standard for causal inference.
1.2 Development
With the rise of the internet, the RCT methodology has been widely adopted for product optimization. After the traffic‑growth era, user and value growth slowed, making data‑driven growth essential. The internet makes experiments easier because:
Sample size: Online platforms can collect millions of samples, eliminating the small-sample limitations of traditional experiments.
Experiment cost: Computing resources bring the marginal cost per sample close to zero, enabling large-scale experiments.
Control of confounding factors: Precise version control ensures that all users within a group experience the same environment, preserving causal validity.
2. Why Conduct Experiments
2.1 Causal Inference
AB testing serves as the gold standard for causal inference, providing direct evidence of whether a variable causes an effect, unlike many theoretical causal methods that rely on untestable assumptions.
2.2 Value Evaluation
Beyond qualitative conclusions, AB tests quantify the impact of a change (e.g., revenue increase, user growth, efficiency gains), enabling data‑driven decision making.
2.3 Risk Control
Every optimization carries risk. Small‑traffic AB tests allow teams to estimate the risk of large‑scale rollouts, offering a safe trial‑and‑error mechanism.
3. Classification of Experiments
3.1 AB Test
3.1.1 AB Experiment
An AB experiment is a controlled randomized test that compares two versions of a single variable using a two‑sample hypothesis test. For example, a button’s color (A vs. B) is randomly shown to users, and click‑through rate (CTR) is measured.
Typical AB experiment workflow:
Define the objective (e.g., increase button clicks, metric = CTR).
Choose the randomization unit (user ID, page view, etc.).
Determine sample size via power analysis, balancing exposure proportion and experiment duration.
Collect data and perform hypothesis testing (usually a two‑sample t‑test or z‑test for large samples).
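The workflow above can be sketched in code. The following is a minimal, self-contained illustration (the functions and the 10%-to-11% CTR figures are hypothetical, not from the original article): an approximate per-group sample-size formula at 5% significance and 80% power, and a pooled two-proportion z-test on the collected click counts.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sample_size_per_group(p_base, p_new, z_alpha=1.96, z_beta=0.8416):
    """Approximate users needed per arm to detect a CTR shift from
    p_base to p_new (two-sided 5% significance, 80% power)."""
    var = p_base * (1 - p_base) + p_new * (1 - p_new)
    return math.ceil((z_alpha + z_beta) ** 2 * var / (p_base - p_new) ** 2)

def two_prop_z_test(clicks_a, n_a, clicks_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return z, 2.0 * (1.0 - norm_cdf(abs(z)))

# Detecting a CTR lift from 10% to 11% needs roughly 15k users per arm.
n = sample_size_per_group(0.10, 0.11)
# Example read-out after collection: 1100/10000 clicks in B vs 1000/10000 in A.
z, p = two_prop_z_test(1100, 10000, 1000, 10000)
```

For small samples a two-sample t-test would replace the z-test; at online scale the normal approximation is standard.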
3.1.2 ABn Experiment
ABn experiments compare multiple versions of a single variable (e.g., several button colors). The hypothesis testing can involve:
Two‑sample tests for pairwise comparisons.
Multi‑sample tests (e.g., chi‑square or ANOVA) when more than two groups are involved.
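For the multi-sample case, a Pearson chi-square test on the click/non-click counts of all groups checks whether any version differs. A minimal sketch (click counts are invented for illustration):

```python
def chi_square_stat(clicks, impressions):
    """Pearson chi-square statistic for k groups of (clicks, non-clicks).
    Compare against the chi-square critical value with df = k - 1."""
    total_clicks = sum(clicks)
    total = sum(impressions)
    overall_ctr = total_clicks / total
    stat = 0.0
    for c, n in zip(clicks, impressions):
        exp_click = n * overall_ctr          # expected clicks under H0
        exp_nonclick = n * (1 - overall_ctr) # expected non-clicks under H0
        stat += (c - exp_click) ** 2 / exp_click
        stat += ((n - c) - exp_nonclick) ** 2 / exp_nonclick
    return stat

# Three button colors, 10k impressions each; df = 2, 5% critical value = 5.991.
stat = chi_square_stat([300, 320, 380], [10000, 10000, 10000])
```

If the statistic exceeds the critical value, pairwise tests (with multiple-comparison correction) identify which versions differ.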
3.1.3 AA Experiment
An AA experiment compares two identical versions to validate the experimental setup. At the 5% significance level, roughly 5% of AA tests will show a "significant" difference by chance alone (Type I error); a materially higher rate indicates flaws in randomization or measurement.
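This expected false-positive rate is easy to verify by simulation. The sketch below (parameters are illustrative) runs many AA comparisons where both arms draw from the same 10% CTR and counts how often a pooled z-test falsely flags significance:

```python
import math, random

def z_stat(c_a, n_a, c_b, n_b):
    """Pooled two-proportion z statistic."""
    p = (c_a + c_b) / (n_a + n_b)
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (c_a / n_a - c_b / n_b) / se

random.seed(42)
runs, n, ctr = 1000, 1000, 0.1
false_positives = 0
for _ in range(runs):
    # Both "arms" draw from the identical CTR: any significance is noise.
    c_a = sum(random.random() < ctr for _ in range(n))
    c_b = sum(random.random() < ctr for _ in range(n))
    if abs(z_stat(c_a, n, c_b, n)) > 1.96:
        false_positives += 1
rate = false_positives / runs  # lands near 0.05
```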
3.1.4 Multivariate Test (MVT)
MVT evaluates multiple variables simultaneously (e.g., button color and text). It can test interaction hypotheses such as whether color and text effects are independent.
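One simple way to test the independence (no-interaction) hypothesis in a 2x2 MVT is an interaction contrast: the color effect under one text minus the color effect under the other, divided by its standard error. A minimal sketch, with invented cell counts:

```python
import math

def interaction_z(cells):
    """cells: dict {(color, text): (clicks, impressions)} for a 2x2 MVT.
    Returns a z statistic for the no-interaction (additive effects) null."""
    p, var = {}, {}
    for key, (c, n) in cells.items():
        p[key] = c / n
        var[key] = p[key] * (1 - p[key]) / n
    # Interaction contrast: color effect under text B minus under text A.
    contrast = (p[("B", "B")] - p[("A", "B")]) - (p[("B", "A")] - p[("A", "A")])
    se = math.sqrt(sum(var.values()))
    return contrast / se

z = interaction_z({
    ("A", "A"): (500, 10000), ("B", "A"): (550, 10000),
    ("A", "B"): (520, 10000), ("B", "B"): (700, 10000),
})
```

|z| > 1.96 suggests the color and text effects are not additive, so the winning combination must be chosen jointly rather than variable by variable.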
3.2 Quasi‑Experiments (Class Experiments)
3.2.1 Definition
When randomization or parallel control groups are infeasible, quasi‑experiments (or “class experiments”) are used. They still rely on controlled comparisons but may lack full random assignment.
3.2.2 Characteristics
Key traits include non‑random grouping, large sample sizes, and often the use of internal or self‑controls.
3.2.3 Common Designs
Self‑pre/post control : Compare the same subjects before and after an intervention.
Pre/post with groups : Observe treatment and comparison groups (not necessarily randomly assigned) both before and after the intervention, then compare the changes (often using Difference-in-Differences).
Post‑only group control : Compare treated subjects with a contemporaneous control when pre‑intervention data are unavailable.
Solomon four‑group design : Combines pre/post and group controls, though rarely used due to complexity.
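The Difference-in-Differences estimate mentioned above is simple arithmetic: the change in the treated group minus the change in the comparison group, which nets out the shared time trend. A minimal sketch with made-up revenue figures:

```python
def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """DiD treatment-effect estimate: the treated group's change
    minus the control group's change over the same window."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Treated revenue rose 120 -> 150; control rose 100 -> 110 in the same period.
effect = diff_in_diff(120, 150, 100, 110)  # 30 - 10 = 20
```

The estimate is only valid under the parallel-trends assumption: absent treatment, both groups would have moved in step.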
4. Summary
Online experimentation has evolved into an independent discipline that bridges statistical theory and internet technology. AB testing exemplifies how large-scale data and computing power enable rigorous causal inference for product decisions. Future articles will delve into specific Alimama scenarios, covering standard AB tests and specialized experimental designs.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.