Edge AI Video Preloading: Case Study and Implementation with ByteDance's Client AI Platform
This article presents a comprehensive case study of applying edge AI to video preloading on the Xigua Video platform, detailing scenario analysis, predictive modeling of user behavior, feature engineering, on‑device model inference, dynamic algorithm package deployment, experimental evaluation, and the resulting performance and cost improvements.
Introduction
Edge intelligence, i.e. running AI models directly on the client device, has become a major direction in the industry. Companies such as Alibaba, Google, and Kuaishou are actively deploying edge AI to optimize various business scenarios. ByteDance's Client AI team collaborated with Xigua Video to implement an edge‑intelligent video preloading solution, demonstrating how on‑device AI can improve business outcomes.
1. Scenario
1.1 Scenario Description
In the Xigua Video preloading scenario, the client preloads a fixed 800 KB buffer for the next three videos while the current video is playing, aiming to provide a smooth playback experience for subsequent videos.
However, this static strategy suffers from two major problems: (1) most users do not watch the entire preloaded buffer, leading to bandwidth waste; (2) when users carefully browse a video, insufficient buffer can cause start‑up failures or stuttering, degrading the experience.
The ideal strategy is to match the preloaded size with the actual playback size, loading only what the user is likely to watch.
1.2 In‑Depth Analysis
Because user behavior is highly variable, predicting exactly how much of a video a user will watch is infeasible in practice. Instead, if we can forecast the user's upcoming behavior pattern—whether they will quickly swipe to the next video or watch slowly—we can adjust the preloading strategy accordingly.
User behavior exhibits regularities such as swipe speed, interaction tendency, time of day, and whether the user is in a fragmented‑time scenario. By predicting these patterns, we can, for example, reduce preloaded buffer size while increasing the number of preloaded videos for fast‑swipe users, and do the opposite for slow‑watch users.
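As a rough sketch of this adjustment, the decision described above could look like the following. The pattern labels, buffer sizes, and video counts are illustrative assumptions, not Xigua Video's actual parameters or API:

```python
# Hypothetical sketch: map a predicted behavior pattern to a preload plan.
# A fixed strategy always buffers 800 KB for the next 3 videos; an adaptive
# one trades buffer size against the number of videos covered.

FIXED_PRELOAD_KB = 800
FIXED_PRELOAD_COUNT = 3

def adaptive_preload_plan(predicted_pattern: str) -> list[int]:
    """Return a list of per-video preload sizes (KB) for upcoming videos."""
    if predicted_pattern == "fast_swipe":
        # Smaller buffers, but more upcoming videos covered.
        return [400] * 5
    if predicted_pattern == "slow_watch":
        # Larger buffers for fewer videos, to avoid start-up stutter.
        return [1200] * 2
    # Fall back to the fixed strategy when the prediction is uncertain.
    return [FIXED_PRELOAD_KB] * FIXED_PRELOAD_COUNT
```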
1.3 Breakthrough Direction
The core problem becomes: How to predict user behavior patterns on the client?
We consider two approaches: rule‑based and model‑based. Rules are simple and low‑cost but struggle with complex scenarios. Models can handle complex, fine‑grained strategies but require higher development cost and longer cycles. A hybrid approach uses rules for quick early‑stage validation and models for the final production deployment.
1.4 Where to Run the Prediction
The prediction can be performed either in the cloud or on the client. For this scenario, the high real‑time requirement and short‑term feature data favor on‑device inference.
2. Edge‑Intelligent Preloading Solution
2.1 Overall Flow
The solution consists of three workstreams that proceed in parallel: on‑device AI development, client development, and algorithm‑package development.
2.2 On‑Device AI Development
2.2.1 Feature Mining
We first identify user‑behavior features that influence video consumption, such as swipe speed, interest level, and content type.
2.2.2 Feature Processing
Raw data from logs and device sensors are transformed into historical and real‑time features (e.g., last x samples, last x videos, last x hours).
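The "last x samples / last x videos" style of real-time feature can be sketched as a set of sliding windows over the event stream. The window size and feature names below are assumptions for illustration:

```python
from collections import deque
from statistics import mean

# Illustrative sketch: turn raw playback events into windowed real-time
# features ("last x videos"). Window sizes and field names are hypothetical.

class RealtimeFeatures:
    def __init__(self, window: int = 5):
        self.watch_ratios = deque(maxlen=window)   # last x videos
        self.swipe_gaps_s = deque(maxlen=window)   # seconds between swipes

    def on_video_end(self, watch_ratio: float, swipe_gap_s: float) -> None:
        self.watch_ratios.append(watch_ratio)
        self.swipe_gaps_s.append(swipe_gap_s)

    def snapshot(self) -> dict:
        """Aggregate the windows into model-ready features."""
        return {
            "mean_watch_ratio": mean(self.watch_ratios) if self.watch_ratios else 0.0,
            "mean_swipe_gap_s": mean(self.swipe_gaps_s) if self.swipe_gaps_s else 0.0,
        }
```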
2.2.3 Feature Analysis
Statistical methods (Pearson, Spearman, Cohen’s kappa), regularization (Lasso, Ridge), distance correlation, and decision‑tree feature ranking are used to evaluate each feature’s contribution and cost.
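For concreteness, the simplest of these scores, the Pearson correlation between a candidate feature and the target label, can be computed as below; this is a standard formula, not ByteDance-specific code:

```python
import math

# Pearson correlation coefficient: one of the statistical scores used to
# weigh a feature's predictive contribution against its on-device cost.

def pearson(xs: list[float], ys: list[float]) -> float:
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```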
2.3 Algorithm Package Development
2.3.1 On‑Device Feature Engineering
ByteDance’s Pitaya framework provides on‑device feature extraction capabilities, supporting various trigger mechanisms, data sources, and hierarchical management.
2.3.2 On‑Device Model Inference
The inference pipeline includes environment deployment, a real‑time inference engine, and dynamic update capabilities that allow algorithm packages to be refreshed without a client version release.
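The dynamic-update idea can be sketched as a version check against the server's published package; the class and method names below are illustrative assumptions, not Pitaya's real API:

```python
# Illustrative sketch of dynamic algorithm-package updates: the client
# compares its local package version with the one published remotely and
# swaps in the new package without an app release.

class PackageManager:
    def __init__(self, local_version: int = 0):
        self.local_version = local_version

    def maybe_update(self, remote_version: int, fetch) -> bool:
        """Fetch and activate a newer package if one is available."""
        if remote_version > self.local_version:
            fetch(remote_version)          # download and verify the package
            self.local_version = remote_version
            return True
        return False
```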
2.3.3 Real‑Time Effect Monitoring
A monitoring system tracks execution success rate, latency, PV/UV, as well as model metrics such as accuracy, precision, recall, TP/FP/TN/FN.
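The model-quality side of that monitoring reduces to deriving the standard metrics from running confusion-matrix counters, as in this sketch:

```python
# Derive accuracy, precision, and recall from running TP/FP/TN/FN counters,
# as a monitoring dashboard for the on-device model might.

def model_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```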
2.4 Client Development
The client triggers the algorithm package after the first frame renders in landscape mode, parses the returned result, and adjusts the preloading task accordingly.
3. Evaluation
3.1 Rapid Iteration of Algorithm Packages
Using Pitaya’s sub‑release and traffic‑splitting features, multiple strategy groups (e.g., varying preload count, size, or scheduling mode) can be tested simultaneously in A/B experiments.
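Traffic splitting of this kind is commonly implemented by hashing a stable user or device identifier into a bucket, so each user consistently sees the same strategy group. The sketch below shows the general technique, not Pitaya's actual mechanism:

```python
import hashlib

# Deterministic traffic splitting for A/B experiments: hash the device id
# (plus an experiment salt) so each user lands in a stable group.
# Names and the salt are illustrative assumptions.

def assign_group(device_id: str, groups: list[str], salt: str = "preload_exp") -> str:
    """Map a device id to one of the experiment groups, deterministically."""
    digest = hashlib.md5(f"{salt}:{device_id}".encode()).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```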
3.2 Algorithm Package Monitoring
Both business KPIs and algorithm‑package metrics are monitored to ensure model performance and stability.
3.3 Optimization Loop
Experimental data revealed that false negatives had a larger impact than false positives, leading to threshold adjustments that reduced start‑up failures and stuttering. Further analysis of short‑term user behavior patterns enabled time‑segmented model tuning, yielding additional gains.
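The threshold adjustment described above can be framed as minimizing a weighted error in which a false negative (under-preloading for a user who keeps watching) costs more than a false positive. The cost weights and scores below are illustrative:

```python
# Pick the score threshold that minimizes a weighted error, with false
# negatives penalized more heavily than false positives. Costs are assumed.

def best_threshold(scores, labels, fn_cost=3.0, fp_cost=1.0):
    """labels: 1 = user kept watching (needs buffer), 0 = user swiped away."""
    def cost(t):
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        return fn * fn_cost + fp * fp_cost
    return min(sorted(set(scores)), key=cost)
```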
3.4 Results
Compared with the fixed preloading strategy, the failure rate decreased by 3.372%, the start‑up failure rate by 3.892%, the stutter rate by 2.031%, and the number of stutters per 100 seconds by 1.536%.
Total bandwidth cost was reduced by 1.11%, saving tens of millions of dollars.
Conclusion
The end‑to‑end edge‑AI workflow—scenario analysis, feature engineering, on‑device modeling, dynamic deployment, and continuous A/B testing—demonstrates how ByteDance’s Pitaya platform enables rapid, data‑driven optimization of video preloading, delivering measurable performance and cost benefits.
ByteDance’s Pitaya platform and the MARS development suite provide a comprehensive infrastructure for edge AI across multiple products such as Douyin, Xigua Video, and Toutiao.
ByteDance Terminal Technology