Next‑Generation A/B Experiment Analysis Engine for Multi‑Sided Scenarios

The article presents a next‑generation experiment analysis engine that standardizes the core A/B testing framework, integrates advanced statistical solutions to tackle small‑sample and overflow challenges, and offers precise variance and P‑value calculations, thereby improving reliability and efficiency for experiments on multi‑side fulfillment platforms.

Meituan Technology Team

Sep 5, 2024

Introduction

Since Google introduced A/B testing in 2000, the method has become essential for data‑driven product optimization. The rise of online‑to‑offline (O2O) platforms, which serve multi‑side markets and depend heavily on location‑based services (LBS), exposes the limitations of traditional experiment engines, which assume a single experimental unit, ordinary random grouping, large samples, and independent individuals.

Challenges of Traditional Engines

In multi‑side scenarios, small samples and overflow (spillover) effects are common. Small samples undermine the normality assumption, and overflow between units undermines the independence assumption, both of which underpin standard P‑value calculations. Traditional engines also support only a narrow set of experimental designs, which limits their robustness and extensibility.

Our Approach: A New Generation Experiment Analysis Engine

The new engine addresses these challenges through three principles: standardized and automated analysis pipelines, a central method library covering both single‑side and multi‑side domains, and decoupling of experiment infrastructure.

Standardized Analysis Workflow

The engine automatically selects analysis methods based on experimental context and follows a fixed pipeline:

Data reliability validation: homogeneity tests, sample ratio mismatch (SRM) checks for systematic bias, and sample‑size verification.

Data preprocessing: outlier detection and removal without altering the underlying distribution.

Strategy effect estimation: choose an appropriate estimator, such as difference‑in‑differences or regression.

Variance estimation: apply variance‑reduction techniques, or compute the variance with a method that matches whether the samples are independent.

P‑value calculation: select a parametric or non‑parametric test as appropriate.

Report generation: consolidate all stages into a detailed experiment report.
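
As an illustration of the first pipeline stage, the SRM check can be sketched as a chi‑square goodness‑of‑fit test on the observed group sizes. This is a minimal sketch rather than the engine's actual code: the function names, the two‑arm restriction, and the 0.001 alert threshold are assumptions made for this example.

```python
import math


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """P-value of a chi-square goodness-of-fit test for a two-arm split.

    Hypothetical helper: compares observed group sizes with the sizes
    implied by the configured traffic share (1 degree of freedom).
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)), so no external library is needed.
    return math.erfc(math.sqrt(chi2 / 2.0))


def has_srm(n_control, n_treatment, expected_treatment_share=0.5, alpha=0.001):
    """Flag a sample ratio mismatch when the p-value falls below alpha.

    The strict default alpha is an assumption for this sketch.
    """
    return srm_p_value(n_control, n_treatment, expected_treatment_share) < alpha


if __name__ == "__main__":
    print(has_srm(5000, 5000))  # perfectly balanced split
    print(has_srm(5000, 5400))  # noticeably unbalanced split
```

A flagged mismatch suggests that the traffic split or the logging is biased, so the later stages would be run only after the cause is understood.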

Central Method Library

The library integrates 7 grouping methods and 16 analysis techniques, including:

2 variance‑reduction methods (e.g., binary CUPED).

5 effect‑estimation methods.

3 P‑value calculation approaches.

4 variance‑calculation strategies.

2 integrated‑analysis techniques.

It incorporates industry‑leading solutions such as covariate‑adaptive grouping and binary CUPED, which dramatically improve precision for small‑sample experiments.
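
To make the variance‑reduction idea concrete, here is a minimal sketch of the standard single‑covariate CUPED adjustment, which subtracts the part of the metric that a pre‑experiment covariate explains. The source does not spell out how the binary variant differs, so this shows only the basic form. The function name and the sample data are hypothetical.

```python
from statistics import mean, variance


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y_i - theta * (x_i - mean(x)).

    y: in-experiment metric values, pooled across groups.
    x: pre-experiment covariate values for the same units.
    theta = cov(x, y) / var(x) minimizes the variance of the adjusted metric.
    """
    x_bar, y_bar = mean(x), mean(y)
    n = len(y)
    # Sample covariance, computed by hand to avoid version-specific helpers.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    theta = cov_xy / variance(x)
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    pre = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical pre-period values
    post = [2.1, 3.9, 6.2, 7.8, 10.1]    # hypothetical in-experiment values
    adjusted = cuped_adjust(post, pre)
    print(variance(post), variance(adjusted))
```

Because the covariate is centered, the adjustment leaves the overall mean unchanged and only shrinks the variance. The stronger the correlation between the pre‑period covariate and the metric, the larger the reduction.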

Decoupled Infrastructure

By separating the analysis engine from experiment configuration, traffic allocation, and data pipelines, teams can evolve components independently. Engineers build the platform pipeline, data scientists contribute statistical methods, and data‑warehouse experts ensure metric consistency. This modularity enables zero‑cost integration with any experiment platform.

Key Techniques

Covariate‑Adaptive Grouping

Originating from medical research, this technique ensures homogeneous groups even with as few as 10 samples. Simulations show that when sample size drops below 40, covariate‑adaptive grouping keeps imbalance under 5% whereas completely random grouping exceeds 6%.
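
One well‑known covariate‑adaptive scheme from clinical trials is Pocock‑Simon minimization, sketched below for two groups and categorical covariates. Whether the engine uses this exact scheme is not stated in the source. The function name, the `p_best` biased‑coin probability, and the sample units are assumptions for illustration.

```python
import random


def minimization_assign(units, p_best=0.8, seed=0):
    """Sequentially assign units to arm 'A' or 'B' by minimization.

    units: list of dicts mapping covariate name -> categorical level.
    p_best: probability of choosing the arm that minimizes imbalance
            (a biased coin keeps the assignment partly random).
    """
    rng = random.Random(seed)
    # counts[arm][(covariate, level)] = units with that level already in the arm
    counts = {"A": {}, "B": {}}
    arms = []
    for unit in units:
        keys = list(unit.items())

        def imbalance(candidate):
            # Total marginal imbalance if this unit joined `candidate`.
            total = 0
            for key in keys:
                a = counts["A"].get(key, 0) + (candidate == "A")
                b = counts["B"].get(key, 0) + (candidate == "B")
                total += abs(a - b)
            return total

        score_a, score_b = imbalance("A"), imbalance("B")
        if score_a == score_b:
            arm = rng.choice("AB")  # tie: plain coin flip
        else:
            best = "A" if score_a < score_b else "B"
            other = "B" if best == "A" else "A"
            arm = best if rng.random() < p_best else other
        for key in keys:
            counts[arm][key] = counts[arm].get(key, 0) + 1
        arms.append(arm)
    return arms


if __name__ == "__main__":
    sample = [{"city": "X", "size": "large"}, {"city": "Y", "size": "small"},
              {"city": "X", "size": "small"}, {"city": "Y", "size": "large"}]
    print(minimization_assign(sample))
```

Each new unit is steered toward the group that would leave the covariate margins most balanced, which is why this family of methods holds up at sample sizes where complete randomization often produces uneven groups.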

Integrated (Meta) Analysis

Aggregating results from multiple experiments boosts statistical power in low‑traffic scenarios, allowing reliable conclusions without large individual sample sizes.
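
The gain in power can be illustrated with a fixed‑effect, inverse‑variance pooling of per‑experiment estimates. This is a minimal sketch under the assumption that all experiments share one true effect. The source does not say which meta‑analysis model the engine uses, and the function name and inputs are hypothetical.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Pool effect estimates with inverse-variance weights.

    Returns (pooled estimate, pooled standard error, two-sided p-value).
    Assumes the experiments are independent and share a common true effect.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    weight_sum = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    z = pooled / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


if __name__ == "__main__":
    # Three hypothetical low-traffic experiments on the same strategy.
    print(fixed_effect_pool([0.8, 1.2, 1.0], [0.6, 0.7, 0.5]))
```

The pooled standard error is smaller than that of any single experiment, which is the source of the extra power. If effects are expected to differ across experiments, a random‑effects model would be the more cautious choice.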

Zero‑Cost Integration & Self‑Service

The modular pipeline automatically detects experiment units, types, grouping methods, and required metrics, then runs distributed analysis workers to produce reports, eliminating the need for custom code.

Conclusion and Outlook

The engine standardizes the experimental framework, merges advanced solutions such as covariate‑adaptive grouping, binary CUPED, and integrated analysis, and provides a range of strategies for handling overflow effects (e.g., rotation experiments and difference‑in‑differences designs). It delivers precise variance and P‑value calculations tailored to various business scenarios, fostering knowledge sharing and rapid iteration. Following Fabijan et al. (2017), the decoupled architecture supports continuous evolution of metrics and methods while keeping the underlying infrastructure stable. Future work will open advanced observational‑study techniques and geographic overflow‑effect models to the broader community.
