
Applying Causal Inference Techniques to Short‑Video Recommendation at Kuaishou

This article presents how causal inference methods are applied to Kuaishou’s single‑column short‑video recommendation, covering the platform’s recommendation scenario, model representations, duration bias mitigation, viewing‑time prediction techniques such as D2Q and TPM, experimental results, and future research directions.

DataFunTalk

Kuaishou single‑column short‑video recommendation scenario: Kuaishou’s short‑video feed primarily uses a single‑column, immersive vertical scrolling interface. User interactions (likes, follows, comments, drag‑to‑seek) generate rich feedback logs that drive the core recommendation algorithm, which faces challenges such as self‑reinforcing loops, popularity bias, and data sparsity.

Causal inference technique and model representation: The work disentangles user interest representations from conformity (social‑influence) representations via a causal graph. An interest loss supervises the interest encoder, while a conformity loss supervises the conformity encoder. Contrastive learning with sample augmentation addresses data sparsity, and normalized popularity ratios are incorporated into the losses.
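As a rough illustration of the disentangled scoring idea, the sketch below combines a hypothetical interest embedding with a conformity embedding whose contribution is scaled by a normalized popularity ratio. The embeddings, dimensionality, click counts, and weighting scheme are all invented for illustration and are not the production model.

```python
import random

random.seed(0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def popularity_weight(clicks, total_clicks):
    """Normalized popularity ratio weighting the conformity signal
    (illustrative; the talk does not spell out the exact normalization)."""
    return clicks / total_clicks

# Hypothetical 4-dim disentangled embeddings for one user and two items.
user_interest   = [random.gauss(0, 1) for _ in range(4)]
user_conformity = [random.gauss(0, 1) for _ in range(4)]
items = {"a": 900, "b": 100}  # click counts: "a" is far more popular
item_interest   = {k: [random.gauss(0, 1) for _ in range(4)] for k in items}
item_conformity = {k: [random.gauss(0, 1) for _ in range(4)] for k in items}

def score(item):
    """Ranking score = interest match + popularity-weighted conformity match."""
    w = popularity_weight(items[item], sum(items.values()))
    return (dot(user_interest, item_interest[item])
            + w * dot(user_conformity, item_conformity[item]))

ranked = sorted(items, key=score, reverse=True)
```

Because the conformity term is down-weighted for unpopular items, two items with identical interest match are no longer ranked purely by popularity, which is the point of separating the two signals.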

Related work and model details: Prior research (e.g., WebConf2021) modeled user‑item interactions with separate interest and conformity embeddings and employed pairwise and conformity losses. The current approach extends this by handling static popularity bias and introducing contrastive sample augmentation to improve learning stability.

Viewing‑time estimation and causal inference (D2Q): Watch time is a key long‑term engagement metric. The authors model duration bias with a causal graph linking user features (U), video features (V), video duration (D), and watch time (W). Applying do‑calculus, they perform a back‑door adjustment over D: samples are split into duration groups, each group is trained separately, and quantile regression predicts watch time within the group, preventing the duration bias from being amplified.
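The grouping-and-quantile step can be sketched as follows: duration is binned into equal-frequency groups (the back-door adjustment over D), watch time is relabeled as its within-group quantile (the regression target), and a predicted quantile is mapped back to seconds via the group's empirical distribution at serving time. The synthetic data, group count, and helper names are assumptions, not the production pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic log: video duration (s) and observed watch time (s).
duration   = rng.uniform(5, 300, size=1000)
watch_time = np.minimum(duration, rng.exponential(60, size=1000))

K = 10  # number of duration groups (back-door adjustment over D)
edges = np.quantile(duration, np.linspace(0, 1, K + 1))
group = np.clip(np.searchsorted(edges, duration, side="right") - 1, 0, K - 1)

# Within each duration group, relabel watch time as its in-group quantile.
# A regressor would be trained on these labels; here we show only the
# transform and the inverse mapping used at serving time.
label = np.empty_like(watch_time)
for g in range(K):
    mask = group == g
    label[mask] = watch_time[mask].argsort().argsort() / (mask.sum() - 1)

def quantile_to_watch_time(pred_quantile, g):
    """Map a predicted quantile back to seconds using group g's
    empirical watch-time distribution (the serving-time step)."""
    return float(np.quantile(watch_time[group == g], pred_quantile))
```

Training on quantiles rather than raw seconds makes the label distribution comparable across duration groups, which is what stops long videos from dominating the regression target.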

TPM method (Tree‑based Prediction Model): TPM transforms the continuous duration prediction into a hierarchical binary classification problem using a full binary tree. Conditional dependence between tree nodes is exploited, and parameter sharing (shared embeddings and hidden layers) keeps model complexity low. During inference, the model outputs probabilities for each non‑leaf node, which are traversed to compute a weighted expected watch time.
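A minimal sketch of the inference step described above, assuming a depth‑2 full binary tree over four invented watch‑time buckets: the model's node‑level probabilities are chained along each root‑to‑leaf path (the conditional dependence between levels), and the resulting leaf distribution is reduced to a weighted expected watch time. Bucket boundaries and probabilities are made up for illustration.

```python
# Leaf buckets: watch-time intervals (s) at the bottom of a depth-2 full
# binary tree; midpoints serve as each leaf's representative value.
buckets   = [(0, 5), (5, 15), (15, 40), (40, 120)]
midpoints = [(lo + hi) / 2 for lo, hi in buckets]

# Hypothetical model outputs: P(go right) at each non-leaf node.
# Full-binary-tree layout: index 0 is the root; 1 and 2 are its children.
p_right = [0.7, 0.4, 0.6]

def leaf_probability(leaf, p_right):
    """Chain node-level probabilities along the root-to-leaf path
    (each level's decision is conditioned on the path so far)."""
    first, second = leaf >> 1, leaf & 1   # binary digits = left/right choices
    p = p_right[0] if first else 1 - p_right[0]
    node = 2 if first else 1
    p *= p_right[node] if second else 1 - p_right[node]
    return p

probs = [leaf_probability(i, p_right) for i in range(4)]
expected_watch_time = sum(p * m for p, m in zip(probs, midpoints))
```

Because every non-leaf node shares the same bottom-layer embeddings and hidden layers, scoring all nodes adds little over a single forward pass, which is why the inference cost stays close to a standard model.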

Experimental results: Offline experiments on Kuaishou’s public dataset and the CIKM16 stay‑time dataset show TPM outperforming baselines such as WLR, D2Q, and OR. Online A/B tests on Kuaishou’s feed demonstrate significant improvements in average watch time with stable or improved secondary metrics, confirming the practical impact of the proposed methods.

Future outlook: The authors emphasize the need for systematic, automated causal‑adjustment pipelines to handle increasing system complexity, bias mitigation, and scalability while maintaining cost‑effectiveness.

Q&A highlights: TPM’s conditional dependence is explained as a sequential decision process akin to a Markov decision process, and deployment considerations are addressed by sharing bottom‑layer parameters, resulting in inference costs comparable to standard models.

machine learning, causal inference, Kuaishou, short video recommendation, duration bias
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
