
Tubi’s Switchback Experiment Platform: Design, Challenges, and Solutions

This article describes Tubi's internal experimentation platform: how traditional user-group A/B tests can suffer from network interference, how Switchback experiments (time-window-based designs) address it, and the implementation details, statistical methods, and practical challenges involved.

Bitu Technology

At Tubi, learning from data is a top priority, and experiments play a crucial role. Using the internal Popper platform, all experiments can be run as user-group A/B tests, with a custom sample-size calculator powered by Tubi Data Runtime.
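The sample-size calculator itself is internal, but the underlying calculation is standard. A minimal sketch of a two-sample z-test sample-size formula (the function name and defaults are illustrative, not Tubi's actual API):

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Users needed in each arm of a two-sided, two-sample z-test to
    detect a mean shift of `delta` on a metric with std `sigma`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * (sigma / delta) ** 2)

# Detecting a 0.1 absolute lift on a metric with std 1.0
print(sample_size_per_group(delta=0.1, sigma=1.0))  # → 1570
```

Note how the required sample size grows with the inverse square of the effect size, which is one reason variance reduction (next) matters so much.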

The system implements the CUPED variance‑reduction method to boost statistical power and displays average effects and test results on a central dashboard.
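CUPED adjusts each user's in-experiment metric with a correlated pre-experiment covariate, shrinking variance without shifting the mean. A minimal sketch with simulated data (the data and coefficient are illustrative, not Tubi's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
pre = rng.normal(10, 2, size=10_000)               # pre-experiment metric X
post = 0.8 * pre + rng.normal(0, 1, size=10_000)   # in-experiment metric Y

# theta = Cov(Y, X) / Var(X); subtracting theta * (X - E[X]) leaves the
# mean of Y unchanged but removes the variance explained by X
theta = np.cov(post, pre)[0, 1] / np.var(pre, ddof=1)
post_cuped = post - theta * (pre - pre.mean())

print(np.var(post), np.var(post_cuped))  # adjusted variance is much smaller
```

The stronger the correlation between the pre- and in-experiment metrics, the larger the variance reduction, and hence the boost in statistical power.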

In user‑group experiments, random assignment enables causal inference via the Rubin causal model, but shared ad inventory can cause interference between treatment and control users, violating the model's no‑interference (SUTVA) assumption.

To handle such network interference, Tubi adopts Switchback experiments, which assign entire time windows to treatment or control rather than individual users, eliminating cross‑group interference and allowing evaluation of non‑user‑facing system changes.

Switchback experiments shift the focus of metrics from user‑level to time‑window level, enabling analysis of effects such as ad‑budget allocation or DSP integration.

Key challenges of Switchback experiments include:

Time dependence: Seasonal patterns can bias randomization; Tubi mitigates this by imposing a weekly mirror constraint on window allocation.
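One way to realize a weekly mirror constraint (a sketch; the article does not detail Tubi's actual allocation logic): randomize the first week's windows, then give every second-week window the opposite arm of its same-time-of-week counterpart, so each time-of-week slot sees both arms:

```python
import random

def mirrored_allocation(windows_per_week, seed=42):
    """Randomize week 1, then assign each week-2 window the opposite
    arm of its same-time-of-week counterpart (the 'mirror' constraint),
    so weekly seasonality affects both arms symmetrically."""
    rng = random.Random(seed)
    week1 = [rng.choice(["treatment", "control"]) for _ in range(windows_per_week)]
    week2 = ["control" if arm == "treatment" else "treatment" for arm in week1]
    return week1 + week2
```

With 20-minute windows there are 504 windows per week (7 × 24 × 3), so a two-week experiment yields 1,008 windows, exactly half of them treatment.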

Carryover effects: Residual effects from previous windows are reduced by discarding data during window burn‑in and burn‑out periods, using 20‑minute aggregation windows.
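A sketch of the burn-in/burn-out filter, assuming one 20-minute aggregation bucket is dropped at each edge of a switchback window (the exact trim lengths are an assumption):

```python
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=20)

def keep_bucket(ts, window_start, window_end, burn_in=BUCKET, burn_out=BUCKET):
    """True if an event at `ts` lies inside the switchback window but
    outside the burn-in period after the switch and the burn-out period
    before the next switch; excluded events are dropped from analysis."""
    return window_start + burn_in <= ts < window_end - burn_out
```

Discarding the edges trades a little data for cleaner windows, since events right after a switch may still reflect the previous arm's treatment.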

Reduced sample size and power: Fewer experimental units lead to lower power; Tubi applies a permutation test (per Bojinov, Simchi‑Levi, and Zhao) and other techniques such as t‑tests and hierarchical models to improve sensitivity.
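A minimal unconstrained permutation test on window-level metrics (a sketch; the Bojinov, Simchi-Levi, and Zhao procedure restricts permutations to valid switchback designs, which is omitted here):

```python
import random
import statistics

def permutation_test(treat, control, n_perm=10_000, seed=7):
    """Two-sided permutation p-value for the difference in mean
    window-level metrics between treatment and control windows."""
    rng = random.Random(seed)
    observed = statistics.fmean(treat) - statistics.fmean(control)
    pooled = list(treat) + list(control)
    n_t = len(treat)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n_t]) - statistics.fmean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            hits += 1
    # add-one smoothing keeps the p-value valid and strictly positive
    return (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from the data itself, the test stays valid even with few windows and non-normal metrics, which is exactly the low-power regime switchbacks create.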

The Switchback experiment system runs on Popper, with experiment parameters (window length, duration, device platforms) specified similarly to user‑group tests.

When a Switchback experiment goes live, a nightly data‑processing pipeline aggregates user behavior and ad‑exposure events into 20‑minute windows, writes daily files to S3, cleans up completed experiments, aggregates metrics at the window level, performs statistical testing (using the Benjamini‑Hochberg procedure to control false discovery rate), and stores result reports.
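The Benjamini-Hochberg step-up procedure used for the final testing step can be sketched in a few lines (an illustrative implementation, not Tubi's pipeline code):

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Step-up procedure controlling the false discovery rate at level q:
    find the largest rank k with p_(k) <= k/m * q and reject the k
    smallest p-values. Returns a per-hypothesis reject flag."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k = rank
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject
```

Controlling the false discovery rate matters here because each experiment report tests many metrics at once, and per-test significance alone would inflate false positives.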

The system has been operational for several months, with ongoing work to further mitigate carryover effects, optimize window allocation, and expand the applicability of Switchback experiments within Tubi.

Interested data scientists are invited to join the team; additional resources and related materials are listed at the end of the article.

Tags: A/B testing, Data Science, causal inference, experiment design, Variance Reduction, Switchback experiments
Written by

Bitu Technology

Bitu Technology is the registered company of Tubi's China team. We are engineers passionate about leveraging advanced technology to improve lives, and we hope to use this channel to connect and advance together.
