Kuaishou Short‑Video Recommendation System: Fine‑Grained Ranking Model Practices
This article details Kuaishou's practical advances in short‑video recommendation, covering the CTR PPNet model, multi‑domain multi‑task learning, short‑ and long‑term user behavior sequence modeling, and the resulting performance gains in user engagement and app usage.
With global data volume projected to reach 175 ZB by 2025, recommendation systems have become essential for helping users quickly find the content they want. Kuaishou's short‑video platform is a concrete example.
The recommendation pipeline consists of three core modules—user modeling, item modeling, and the recommendation algorithm—with the algorithm being the most critical component that determines system type and performance.
Kuaishou's fine‑grained ranking (精排) practice begins with PPNet, a CTR model introduced in 2019 that predicts click‑through rate, a signal closely tied to user experience. Initial attempts to personalize the model by stacking user‑specific networks or adding bias vectors yielded limited gains. The final design keeps a globally shared base network (LTE) and adapts it per user through a personalized gate network.
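The gating idea can be illustrated with a minimal sketch. It is not Kuaishou's implementation: the layer sizes, the random weights, the helper names (`linear`, `gated_layer`, `init`) and in particular the `2 * sigmoid` scaling of the gate are assumptions chosen for illustration. The sketch only shows how one shared hidden layer can produce different activations for different users once a gate computed from user features rescales each unit.

```python
import math
import random

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def linear(x, weights, bias):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def gated_layer(hidden, user_features, gate_w, gate_b):
    """Rescale each hidden unit of the shared network with a per-user gate.

    Assumption: the gate is 2 * sigmoid(logit), so it lies in (0, 2) and
    equals 1 (no change) when the gate logit is 0.
    """
    gate = [2.0 * sigmoid(z) for z in linear(user_features, gate_w, gate_b)]
    return [h * g for h, g in zip(hidden, gate)]

def init(rows, cols, rng):
    """Small random weight matrix (illustrative values only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
shared_w, shared_b = init(4, 3, rng), [0.0] * 4  # shared layer: 3 inputs -> 4 hidden units
gate_w, gate_b = init(4, 2, rng), [0.0] * 4      # gate: 2 user features -> 4 scales

item_features = [0.2, -0.1, 0.7]
hidden = relu(linear(item_features, shared_w, shared_b))

# The same shared activations are rescaled differently for two users.
for user in ([1.0, 0.0], [0.0, 1.0]):
    print(gated_layer(hidden, user, gate_w, gate_b))
```

With all gate weights at zero the gate is exactly 1 and the layer falls back to the shared network, which is one reason a centered gate is a convenient starting point.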
To address the diversity of product scenarios and tasks, Kuaishou adopted a multi‑domain multi‑task learning framework that aligns feature semantics, embedding space, and feature importance across domains using techniques such as feature pruning, embedding transform gates, and slot gates. The framework ultimately integrates a multi‑objective MMoE with a personalized bias for each task tower.
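A rough sketch of the MMoE part follows. It assumes the standard mixture-of-experts layout (shared experts, one softmax gate per task, one tower per task) and reduces each tower to a single linear unit. The `user_bias` argument is a hypothetical stand-in for the personalized bias, since the article does not say how that bias is computed, and all weights and task names are made up.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mmoe_forward(x, expert_ws, gate_ws, tower_ws, user_bias):
    """Return one probability per task.

    expert_ws: one weight matrix (list of rows) per shared expert
    gate_ws:   per task, one weight row per expert (softmax over experts)
    tower_ws:  per task, a weight vector over the mixed expert output
    user_bias: per task, a scalar bias for the current user (hypothetical)
    """
    expert_outs = [[max(0.0, dot(row, x)) for row in w] for w in expert_ws]
    dim = len(expert_outs[0])
    preds = []
    for gate_w, tower_w, bias in zip(gate_ws, tower_ws, user_bias):
        mix = softmax([dot(row, x) for row in gate_w])
        mixed = [sum(m * out[d] for m, out in zip(mix, expert_outs)) for d in range(dim)]
        logit = dot(tower_w, mixed) + bias
        preds.append(1.0 / (1.0 + math.exp(-logit)))
    return preds

x = [1.0, 0.5]
expert_ws = [[[0.4, -0.2], [0.1, 0.3]], [[-0.3, 0.6], [0.5, 0.2]]]  # 2 experts, 2 units each
gate_ws = [[[0.2, 0.1], [-0.1, 0.4]], [[0.3, -0.2], [0.0, 0.1]]]    # 2 tasks x 2 experts
tower_ws = [[0.7, -0.4], [0.2, 0.9]]                                # 2 task towers
click_p, like_p = mmoe_forward(x, expert_ws, gate_ws, tower_ws, user_bias=[0.3, -0.5])
print(click_p, like_p)
```

Because the bias enters at the tower logit, it shifts one task's prediction for a given user without touching the shared experts.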
For short‑term behavior sequence modeling, four improvements were introduced: (1) an encoder to represent historical sequences, (2) incorporation of user video‑watch histories, (3) replacement of Transformer self‑attention with target attention, and (4) use of log‑scaled time differences instead of position embeddings.
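Improvements (3) and (4) can be sketched together. In this illustrative version the candidate video is the only query, so attention costs one score per history item rather than one per pair of items, and each history item is tagged with an embedding looked up from the log of the elapsed time instead of its position. The bucketing formula, the table size, and the choice to add the time embedding to the item embedding are assumptions, not details taken from the article.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_time_bucket(now_ts, event_ts, num_buckets):
    """Map elapsed seconds to a bucket index via log2 (assumed formula).

    Recent events get fine-grained buckets; old events share coarse ones.
    """
    delta = max(0.0, now_ts - event_ts)
    return min(num_buckets - 1, int(math.log2(1.0 + delta)))

def target_attention(target, history, time_table, now_ts):
    """Attend from a single candidate (query) over the user's history.

    history:    list of (item_embedding, event_timestamp)
    time_table: one embedding per log-time bucket, replacing position embeddings
    """
    keys = []
    for item_emb, event_ts in history:
        t_emb = time_table[log_time_bucket(now_ts, event_ts, len(time_table))]
        keys.append([e + t for e, t in zip(item_emb, t_emb)])
    scale = math.sqrt(len(target))
    weights = softmax([dot(target, k) / scale for k in keys])
    pooled = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(target))]
    return pooled, weights

time_table = [[0.05 * b, -0.05 * b] for b in range(8)]  # 8 buckets, dim 2 (made-up values)
history = [([0.9, 0.1], 990.0), ([0.1, 0.8], 400.0), ([0.7, 0.3], 10.0)]
pooled, weights = target_attention([1.0, 0.0], history, time_table, now_ts=1000.0)
print(pooled, weights)
```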
Long‑term behavior modeling faced two challenges: Transformers are hard to converge on very long behavior lists, and SIM (Search‑based Interest Model) style per‑user indexing is inefficient. Kuaishou iterated through two versions: V1.0, based on tag retrieval with high‑density storage, and V2.0, based on embedding‑distance retrieval using clustering and cosine similarity. Both delivered noticeable gains in average app usage time.
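One plausible reading of the V2.0 approach is sketched below: each item in a user's long history is assigned to its nearest cluster centroid ahead of time, and at serving time only the history items in the candidate's cluster are fetched and ranked by cosine similarity. The centroids, embeddings, function names, and `top_k` parameter are hypothetical, and the article does not describe how the clusters are built or stored.

```python
import math
from collections import defaultdict

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def nearest_centroid(vec, centroids):
    """Index of the centroid with the highest cosine similarity to `vec`."""
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))

def build_index(history, centroids):
    """Group a user's long-term history by cluster (done offline in this sketch)."""
    index = defaultdict(list)
    for item_id, emb in history:
        index[nearest_centroid(emb, centroids)].append((item_id, emb))
    return index

def retrieve(candidate, index, centroids, top_k=2):
    """Fetch history items from the candidate's cluster, closest first."""
    bucket = index.get(nearest_centroid(candidate, centroids), [])
    ranked = sorted(bucket, key=lambda pair: cosine(candidate, pair[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:top_k]]

centroids = [[1.0, 0.0], [0.0, 1.0]]
history = [("a", [0.9, 0.1]), ("b", [0.1, 0.9]), ("c", [0.8, 0.3]), ("d", [1.0, 0.05])]
index = build_index(history, centroids)
print(retrieve([1.0, 0.0], index, centroids))  # prints ['d', 'a']
```

Restricting the search to one cluster keeps the per-request cost proportional to the cluster size rather than to the full history length, at the price of missing similar items that fall just across a cluster boundary.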
Overall, the combined innovations in CTR prediction, multi‑task learning, and behavior sequence modeling produced a nearly 10% increase in user interaction and significant improvements in app engagement.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.