
Designing Experiments for Bilateral Markets in Advertising Platforms

This article explains how to design and evaluate experiments for bilateral markets in advertising platforms, covering the limitations of traditional randomization, the four‑cell traffic‑advertisement experiment, various mitigation strategies such as counterfactual interleaving and joint sampling, and the use of a simulation system to validate methods.

DataFunSummit

In a bilateral (two‑sided) market, a platform connects two groups, the supply side and the demand side, whose actions affect each other through network effects. Because treating one participant changes the outcomes of others, the independence assumption of classic A/B testing is difficult to satisfy.

Traditional randomization methods such as geographic, category, or time splitting are often infeasible for advertising platforms: ads are typically delivered without geographic limits, lack clear categories, and exhibit strong temporal spill‑over (e.g., the Matthew effect, where ads that perform well early go on to attract disproportionately more delivery).

The "four‑cell traffic‑advertisement" experiment attempts to isolate the effect of a strategy by dividing both traffic and ads into two groups each, which yields four traffic‑ad cells. Interference undermines the comparison in two ways. Stealing: when the strategy improves ad A in traffic U₁, A can take impressions away from ad B. Spill‑over: the change can propagate to the other traffic‑ad combinations. Either effect makes a direct comparison between cells misleading.
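The stealing effect can be made concrete with a toy calculation. The sketch below is not from the talk: it assumes a hypothetical proportional allocation rule (each ad group receives impressions in proportion to a rank weight), a fixed pool of 1,000 impressions per traffic group, and a strategy that raises the weight of ad group A by 20% in traffic U1 only.

```python
def serve(boost_a, impressions=1000.0):
    """Split a fixed impression pool between ad groups A and B.

    Assumption for illustration: impressions are proportional to rank weight.
    """
    weights = {"A": 1.0 * boost_a, "B": 1.0}
    total = sum(weights.values())
    return {ad: impressions * w / total for ad, w in weights.items()}

# Four cells: the (hypothetical) strategy boosts A only in traffic U1.
cells = {"U1": serve(boost_a=1.2), "U2": serve(boost_a=1.0)}

# Comparing A against B inside U1 counts B's loss as part of A's gain.
within_u1_lift = cells["U1"]["A"] / cells["U1"]["B"] - 1.0

# Comparing A in U1 against A in U2 gives a smaller number.
a_vs_own_baseline = cells["U1"]["A"] / cells["U2"]["A"] - 1.0

# The pool is fixed, so the platform-level change in impressions is zero.
total_change = sum(cells["U1"].values()) - sum(cells["U2"].values())

print(f"A vs B within U1:  {within_u1_lift:+.1%}")
print(f"A in U1 vs A in U2: {a_vs_own_baseline:+.1%}")
print(f"total impressions:  {total_change:+.1f}")
```

In this toy setting every impression A gains is one B loses, so neither cell comparison reflects what a full rollout would do to the platform total.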

A naïve mitigation is to split both traffic and ads 50%/50%, so that the strategy runs only on half of the ads and half of the traffic. This eliminates stealing and spill‑over but incurs a large revenue loss, because only half of the inventory is available to each variant.

An improved approach introduces a third “blank” segment (p% experiment, p% control, the remainder blank), reducing revenue loss but still allowing some stealing and spill‑over in the blank traffic.
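A minimal sketch of how such a three‑way split might be assigned, assuming stable hash‑based bucketing on both sides. The function names, the salts, and the rule that the strategy is active only when both the user and the ad fall in the experiment segment are illustrative assumptions, not details given in the talk.

```python
import hashlib

def bucket(key: str, salt: str) -> float:
    """Map a key to a stable pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 2**32

def assign(key: str, p: float, salt: str) -> str:
    """Assign a key to 'experiment' (share p), 'control' (share p), or 'blank'."""
    if not 0.0 < p <= 0.5:
        raise ValueError("p must be in (0, 0.5]")
    u = bucket(key, salt)
    if u < p:
        return "experiment"
    if u < 2 * p:
        return "control"
    return "blank"

def strategy_active(user_id: str, ad_id: str, p: float) -> bool:
    """Assumed rule: apply the strategy only to experiment ads on experiment traffic."""
    # Different salts keep the traffic split independent of the ad split.
    return (assign(user_id, p, salt="traffic") == "experiment"
            and assign(ad_id, p, salt="ads") == "experiment")

if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    groups = [assign(u, 0.1, salt="traffic") for u in users]
    for name in ("experiment", "control", "blank"):
        print(name, groups.count(name) / len(groups))
```

Setting `p = 0.5` leaves no blank segment and reproduces the 50%/50% split; smaller values of `p` shrink the share of inventory that is restricted.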

From an engineering perspective, one can duplicate each ad and route the original and copy to isolated traffic streams, applying different strategies to each. This avoids revenue loss and user impact but raises challenges such as ensuring the duplicated ads behave identically, increased ad‑volume pressure on the system, and difficulty guaranteeing true independence between groups.

Counterfactual interleaving, a framework popularized by Facebook, implements a within‑subject design where a single request is ranked by both the experimental and control algorithms and the results are interleaved. It suffers from three major drawbacks: the Condorcet paradox when merging rankings, loss of optimal ads due to the interleaving process, and state‑dependency pollution because ad performance is influenced by feedback loops and competition.
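To illustrate the merging step, the sketch below uses team‑draft interleaving, a merging rule from the search‑ranking evaluation literature. It is swapped in here purely for illustration: the talk does not say which merging rule the counterfactual interleaving framework uses, and the function name and ad identifiers are hypothetical.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, k, rng=None):
    """Merge two rankings into one list of up to k ads.

    Returns the merged list and a dict mapping each ad to the team that placed it.
    """
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    merged, team_of = [], {}
    while len(merged) < k:
        # The team with fewer picks goes first; ties are broken by a coin flip.
        if picks["A"] != picks["B"]:
            order = ["A", "B"] if picks["A"] < picks["B"] else ["B", "A"]
        else:
            order = ["A", "B"] if rng.random() < 0.5 else ["B", "A"]
        placed = False
        for team in order:
            # Each team contributes its highest-ranked ad not yet in the list.
            candidate = next((ad for ad in rankings[team] if ad not in team_of), None)
            if candidate is not None:
                merged.append(candidate)
                team_of[candidate] = team
                picks[team] += 1
                placed = True
                break
        if not placed:  # both rankings are exhausted
            break
    return merged, team_of

if __name__ == "__main__":
    control = ["ad1", "ad2", "ad3"]      # ranking from the control algorithm
    experiment = ["ad3", "ad1", "ad4"]   # ranking from the experimental algorithm
    merged, team_of = team_draft_interleave(control, experiment, k=4,
                                            rng=random.Random(0))
    print(merged, team_of)
```

Even this small example hints at the second drawback listed above: an ad that one algorithm ranks first can be pushed down or out of the merged list, depending on the draw and on `k`.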

The "contingency‑table joint sampling" method generalizes the four‑cell design to an m × n layout using upper‑triangular sampling, providing more data points to estimate stealing and spill‑over effects. It captures both ad‑level metrics (e.g., bids, ROI) and traffic‑level core indicators (e.g., spend), allowing a unified view of strategy impact.

Advantages of joint sampling include solving the stealing problem, observing advertiser behavior, and unifying supply‑ and demand‑side effects. Disadvantages are the assumption of linear additive effects, the need for randomization on both sides (which can be hard with limited samples), and higher model complexity.
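The linear additive assumption can be checked on a small table. The sketch below is a generic two‑way additive decomposition, not the speaker's model: it assumes a complete, balanced table with one metric value per cell (rows as ad groups, columns as traffic groups), whereas the upper‑triangular sampling described above would leave some cells empty. Large residuals indicate that row and column effects do not simply add up.

```python
def additive_fit(table):
    """Fit y[i][j] ~ grand + row_eff[i] + col_eff[j] on a complete table.

    Returns (grand, row_eff, col_eff, residuals).
    """
    m, n = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (m * n)
    row_eff = [sum(row) / n - grand for row in table]
    col_eff = [sum(table[i][j] for i in range(m)) / m - grand for j in range(n)]
    residuals = [
        [table[i][j] - (grand + row_eff[i] + col_eff[j]) for j in range(n)]
        for i in range(m)
    ]
    return grand, row_eff, col_eff, residuals

if __name__ == "__main__":
    # Hypothetical spend per cell; the values are made up for illustration.
    additive = [[10.0, 12.0], [13.0, 15.0]]      # effects add up exactly
    interacting = [[10.0, 12.0], [13.0, 20.0]]   # last cell breaks additivity
    for name, table in (("additive", additive), ("interacting", interacting)):
        *_, residuals = additive_fit(table)
        worst = max(abs(r) for row in residuals for r in row)
        print(name, "max |residual| =", worst)
```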

To validate experiment designs before online rollout, Tencent built a bilateral‑market simulation system that abstracts the ad pipeline to core components (recall, ranking, model estimation, feedback). The simulator can run full‑traffic A and B strategies, then apply experimental designs on a subset of traffic to quantify the gap between full‑scale results and experimental estimates, providing an objective, low‑risk evaluation method.
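The evaluation idea (compare the full‑traffic ground truth with what an experiment design would have estimated) can be sketched with a toy market. Everything below is an assumption for illustration and far simpler than the system described: ads have a fixed value per impression, a fixed impression pool is shared in proportion to rank weights, strategy A weights all ads equally, strategy B weights ads by value, and the design under test is a simple ad‑side split.

```python
def revenue_by_ad(values, weights, impressions=1000.0):
    """Revenue per ad when impressions are shared in proportion to weight."""
    total = sum(weights)
    return [impressions * w / total * v for v, w in zip(values, weights)]

def strategy_a(value):
    return 1.0      # toy control: every ad gets the same rank weight

def strategy_b(value):
    return value    # toy treatment: rank weight equals value per impression

def full_traffic_lift(values):
    """Ground truth: run A on everything, then B on everything."""
    rev_a = sum(revenue_by_ad(values, [strategy_a(v) for v in values]))
    rev_b = sum(revenue_by_ad(values, [strategy_b(v) for v in values]))
    return rev_b / rev_a - 1.0

def ad_split_estimate(values, treated):
    """Estimate from one market where only the treated ads use strategy B."""
    weights = [strategy_b(v) if t else strategy_a(v)
               for v, t in zip(values, treated)]
    rev = revenue_by_ad(values, weights)
    treated_rev = sum(r for r, t in zip(rev, treated) if t)
    control_rev = sum(r for r, t in zip(rev, treated) if not t)
    return treated_rev / control_rev - 1.0

# Two ad groups of equal size with identical value distributions.
values = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
treated = [True] * 4 + [False] * 4

truth = full_traffic_lift(values)
estimate = ad_split_estimate(values, treated)
print(f"full-traffic lift: {truth:+.0%}")
print(f"ad-split estimate: {estimate:+.0%}")
print(f"gap:               {estimate - truth:+.0%}")
```

In this toy market the treated ads compete against control ads for the same impressions, so the split overstates the lift by a wide margin. The gap between the two numbers is the quantity such a simulator reports for each candidate design.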

The presentation concludes with a summary of the discussed methods and an invitation for further discussion.

Tags: A/B testing, Data Science, counterfactual interleaving, advertising experiment, bilateral market, joint sampling, simulation system
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
