Edge AI Video Preloading: Case Study and Implementation with ByteDance's Client AI Platform
This article presents a comprehensive case study of applying edge AI to video preloading on the Xigua Video platform, detailing scenario analysis, predictive modeling of user behavior, feature engineering, on‑device model inference, dynamic algorithm package deployment, experimental evaluation, and the resulting performance and cost improvements.
Introduction
Edge intelligence, i.e. running AI models directly on the client device, has become a major direction in the industry. Companies such as Alibaba, Google, and Kuaishou are actively deploying edge AI to optimize various business scenarios. ByteDance's Client AI team collaborated with Xigua Video to implement an edge‑intelligent video preloading solution, demonstrating how on‑device AI can improve business outcomes.
1. Scenario
1.1 Scenario Description
In the Xigua Video preloading scenario, the client preloads a fixed 800 KB buffer for the next three videos while the current video is playing, aiming to provide a smooth playback experience for subsequent videos.
However, this static strategy suffers from two major problems: (1) most users do not watch the entire preloaded buffer, leading to bandwidth waste; (2) when users carefully browse a video, insufficient buffer can cause start‑up failures or stuttering, degrading the experience.
The ideal strategy is to match the preloaded size with the actual playback size, loading only what the user is likely to watch.
1.2 In‑Depth Analysis
Because user behavior is highly variable, predicting exactly how much of a video a user will watch is infeasible in practice. Instead, if we can forecast the user's upcoming behavior pattern—whether they will quickly swipe to the next video or watch slowly—we can adjust the preloading strategy accordingly.
User behavior exhibits regularities such as swipe speed, interaction tendency, time of day, and whether the user is in a fragmented‑time scenario. By predicting these patterns, we can, for example, reduce preloaded buffer size while increasing the number of preloaded videos for fast‑swipe users, and do the opposite for slow‑watch users.
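As a rough sketch of this adjustment, the decision described above could look like the following. The pattern labels, buffer sizes, and video counts are illustrative assumptions, not Xigua Video's actual parameters or API:

```python
# Hypothetical sketch: map a predicted behavior pattern to a preload plan.
# A fixed strategy always buffers 800 KB for the next 3 videos; an adaptive
# one trades buffer size against the number of videos covered.

FIXED_PRELOAD_KB = 800
FIXED_PRELOAD_COUNT = 3

def adaptive_preload_plan(predicted_pattern: str) -> list[int]:
    """Return a list of per-video preload sizes (KB) for upcoming videos."""
    if predicted_pattern == "fast_swipe":
        # Smaller buffers, but more upcoming videos covered.
        return [400] * 5
    if predicted_pattern == "slow_watch":
        # Larger buffers for fewer videos, to avoid start-up stutter.
        return [1200] * 2
    # Fall back to the fixed strategy when the prediction is uncertain.
    return [FIXED_PRELOAD_KB] * FIXED_PRELOAD_COUNT
```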
1.3 Breakthrough Direction
The core problem becomes: How to predict user behavior patterns on the client?
We consider two approaches: rule‑based and model‑based. Rules are simple and low‑cost but struggle with complex scenarios. Models can handle complex, fine‑grained strategies but require higher development cost and longer cycles. A hybrid approach uses rules for quick early‑stage validation and models for the final production deployment.
1.4 Where to Run the Prediction
The prediction can be performed either in the cloud or on the client. For this scenario, the high real‑time requirement and short‑term feature data favor on‑device inference.
2. Edge‑Intelligent Preloading Solution
2.1 Overall Flow
The solution consists of three workstreams that proceed in parallel: on‑device AI development, client development, and algorithm‑package development.
2.2 On‑Device AI Development
2.2.1 Feature Mining
We first identify user‑behavior features that influence video consumption, such as swipe speed, interest level, and content type.
2.2.2 Feature Processing
Raw data from logs and device sensors are transformed into historical and real‑time features (e.g., last x samples, last x videos, last x hours).
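The "last x samples / last x videos" style of real-time feature can be sketched as a set of sliding windows over the event stream. The window size and feature names below are assumptions for illustration:

```python
from collections import deque
from statistics import mean

# Illustrative sketch: turn raw playback events into windowed real-time
# features ("last x videos"). Window sizes and field names are hypothetical.

class RealtimeFeatures:
    def __init__(self, window: int = 5):
        self.watch_ratios = deque(maxlen=window)   # last x videos
        self.swipe_gaps_s = deque(maxlen=window)   # seconds between swipes

    def on_video_end(self, watch_ratio: float, swipe_gap_s: float) -> None:
        self.watch_ratios.append(watch_ratio)
        self.swipe_gaps_s.append(swipe_gap_s)

    def snapshot(self) -> dict:
        """Aggregate the windows into model-ready features."""
        return {
            "mean_watch_ratio": mean(self.watch_ratios) if self.watch_ratios else 0.0,
            "mean_swipe_gap_s": mean(self.swipe_gaps_s) if self.swipe_gaps_s else 0.0,
        }
```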
2.2.3 Feature Analysis
Statistical methods (Pearson, Spearman, Cohen’s kappa), regularization (Lasso, Ridge), distance correlation, and decision‑tree feature ranking are used to evaluate each feature’s contribution and cost.
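For concreteness, the simplest of these scores, the Pearson correlation between a candidate feature and the target label, can be computed as below; this is a standard formula, not ByteDance-specific code:

```python
import math

# Pearson correlation coefficient: one of the statistical scores used to
# weigh a feature's predictive contribution against its on-device cost.

def pearson(xs: list[float], ys: list[float]) -> float:
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```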
2.3 Algorithm Package Development
2.3.1 On‑Device Feature Engineering
ByteDance’s Pitaya framework provides on‑device feature extraction capabilities, supporting various trigger mechanisms, data sources, and hierarchical management.
2.3.2 On‑Device Model Inference
The inference pipeline includes environment deployment, a real‑time inference engine, and dynamic update capabilities that allow algorithm packages to be refreshed without a client version release.
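The dynamic-update idea can be sketched as a version check against the server's published package; the class and method names below are illustrative assumptions, not Pitaya's real API:

```python
# Illustrative sketch of dynamic algorithm-package updates: the client
# compares its local package version with the one published remotely and
# swaps in the new package without an app release.

class PackageManager:
    def __init__(self, local_version: int = 0):
        self.local_version = local_version

    def maybe_update(self, remote_version: int, fetch) -> bool:
        """Fetch and activate a newer package if one is available."""
        if remote_version > self.local_version:
            fetch(remote_version)          # download and verify the package
            self.local_version = remote_version
            return True
        return False
```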
2.3.3 Real‑Time Effect Monitoring
A monitoring system tracks execution success rate, latency, PV/UV, as well as model metrics such as accuracy, precision, recall, TP/FP/TN/FN.
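The model-quality side of that monitoring reduces to deriving the standard metrics from running confusion-matrix counters, as in this sketch:

```python
# Derive accuracy, precision, and recall from running TP/FP/TN/FN counters,
# as a monitoring dashboard for the on-device model might.

def model_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```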
2.4 Client Development
The client triggers the algorithm package after the first frame renders in landscape mode, parses the returned result, and adjusts the preloading task accordingly.
3. Evaluation
3.1 Rapid Iteration of Algorithm Packages
Using Pitaya’s sub‑release and traffic‑splitting features, multiple strategy groups (e.g., varying preload count, size, or scheduling mode) can be tested simultaneously in A/B experiments.
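Traffic splitting of this kind is commonly implemented by hashing a stable user or device identifier into a bucket, so each user consistently sees the same strategy group. The sketch below shows the general technique, not Pitaya's actual mechanism:

```python
import hashlib

# Deterministic traffic splitting for A/B experiments: hash the device id
# (plus an experiment salt) so each user lands in a stable group.
# Names and the salt are illustrative assumptions.

def assign_group(device_id: str, groups: list[str], salt: str = "preload_exp") -> str:
    """Map a device id to one of the experiment groups, deterministically."""
    digest = hashlib.md5(f"{salt}:{device_id}".encode()).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```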
3.2 Algorithm Package Monitoring
Both business KPIs and algorithm‑package metrics are monitored to ensure model performance and stability.
3.3 Optimization Loop
Experimental data revealed that false negatives had a larger impact than false positives, leading to threshold adjustments that reduced start‑up failures and stuttering. Further analysis of short‑term user behavior patterns enabled time‑segmented model tuning, yielding additional gains.
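The threshold adjustment described above can be framed as minimizing a weighted error in which a false negative (under-preloading for a user who keeps watching) costs more than a false positive. The cost weights and scores below are illustrative:

```python
# Pick the score threshold that minimizes a weighted error, with false
# negatives penalized more heavily than false positives. Costs are assumed.

def best_threshold(scores, labels, fn_cost=3.0, fp_cost=1.0):
    """labels: 1 = user kept watching (needs buffer), 0 = user swiped away."""
    def cost(t):
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        return fn * fn_cost + fp * fp_cost
    return min(sorted(set(scores)), key=cost)
```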
3.4 Results
Compared with the fixed preloading strategy, the failure rate decreased by 3.372%, the start‑up failure rate by 3.892%, the stutter rate by 2.031%, and the number of stutters per 100 seconds by 1.536%.
Total bandwidth cost was reduced by 1.11%, saving tens of millions of dollars.
Conclusion
The end‑to‑end edge‑AI workflow—scenario analysis, feature engineering, on‑device modeling, dynamic deployment, and continuous A/B testing—demonstrates how ByteDance’s Pitaya platform enables rapid, data‑driven optimization of video preloading, delivering measurable performance and cost benefits.
ByteDance’s Pitaya platform and the MARS development suite provide a comprehensive infrastructure for edge AI across multiple products such as Douyin, Xigua Video, and Toutiao.
ByteDance Terminal Technology