
Evolution of Large‑Scale Recommendation Models at Weibo: Technical Roadmap and Recent Advances

This article reviews the evolution of Weibo's large‑scale recommendation technology, covering the system's business scenarios, technical roadmap, recent large model iterations, multi‑task and multi‑scenario modeling, feature engineering, consistency between recall and ranking, and emerging techniques such as causal inference and graph methods.

DataFunTalk

The presentation outlines Weibo's recommendation platform, describing the diverse business scenarios (home‑page tabs, hot‑search streams, immersive video) and the challenges of high‑traffic, multi‑modal content with varied user feedback.

A technical roadmap is shown, highlighting the transition from early FM‑based models to deep, real‑time architectures, the in‑house Weidl online‑learning platform, and the ability to switch back‑ends quickly.

Recent large‑model iterations focus on multi‑objective fusion (static, RL‑based, and model‑driven weighting), multi‑task learning (MMOE → SNR → DMT), and multi‑scenario techniques (slot‑gate layers) to handle heterogeneous goals such as click, dwell time, interaction, and completion.
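To make the multi-task structure concrete, below is a minimal NumPy sketch of an MMOE-style (Multi-gate Mixture-of-Experts) layer: shared experts, with a separate softmax gate per task (e.g. click, dwell time, interaction). All shapes, layer sizes, and the `tanh` expert activation are illustrative assumptions, not Weibo's production configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mmoe(x, expert_ws, gate_ws):
    """x: (batch, d_in); expert_ws: one (d_in, d_out) matrix per expert;
    gate_ws: one (d_in, n_experts) matrix per task."""
    # All tasks share the same pool of experts.
    experts = np.stack([np.tanh(x @ w) for w in expert_ws], axis=1)  # (batch, n_experts, d_out)
    outputs = []
    for gw in gate_ws:                       # one gate per task
        g = softmax(x @ gw)                  # (batch, n_experts), mixes experts per example
        outputs.append((g[:, :, None] * experts).sum(axis=1))  # (batch, d_out)
    return outputs

d_in, d_out, n_experts, n_tasks = 16, 8, 4, 3   # 3 tasks, e.g. click / dwell / interaction
x = rng.normal(size=(32, d_in))
expert_ws = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]
gate_ws = [rng.normal(size=(d_in, n_experts)) for _ in range(n_tasks)]
task_outputs = mmoe(x, expert_ws, gate_ws)      # one representation per task head
```

Each task head then consumes its own gated mixture, which is what lets heterogeneous objectives share parameters without forcing a single shared bottleneck.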

Interest representation advances from DIN to SIM/DMT, with longer behavior sequences and ultra‑long user histories improving personalization, especially for low‑exposure items.
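The core of DIN-style interest modeling is attention over the behavior sequence conditioned on the candidate item. The sketch below uses plain dot-product attention as a simplification (the original DIN paper uses a small MLP as the activation unit); dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def din_pool(behaviors, target):
    """behaviors: (seq_len, d) embeddings of past interactions;
    target: (d,) embedding of the candidate item.
    Returns a target-aware weighted sum of the behavior sequence."""
    scores = behaviors @ target              # relevance of each past behavior to the candidate
    w = np.exp(scores - scores.max())        # softmax weights (numerically stable)
    w /= w.sum()
    return w @ behaviors                     # (d,) user-interest vector w.r.t. this candidate

seq = rng.normal(size=(50, 16))   # 50 past interactions (SIM-style methods extend this to thousands)
tgt = rng.normal(size=16)         # candidate item embedding
user_interest = din_pool(seq, tgt)
```

Longer histories mostly change how `behaviors` is retrieved (SIM first searches the ultra-long history for candidate-relevant sub-sequences), not the pooling step itself.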

The feature-engineering section discusses the impact of massive ID features, matching statistics, and multimodal embeddings (via both direct fusion and clustering-based approaches) used to alleviate cold-start problems.
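One common form of the clustering-based approach is to quantize content-encoder embeddings into discrete cluster IDs, so that a brand-new item immediately shares a sparse feature with similar, already-exposed items. A toy NumPy k-means sketch (cluster count and embedding size are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans_ids(embs, k=4, iters=10):
    """Assign each content embedding a discrete cluster ID that can be
    fed to the ranking model as an ordinary sparse ID feature."""
    centers = embs[rng.choice(len(embs), size=k, replace=False)]  # random init
    for _ in range(iters):
        dists = ((embs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (n, k)
        ids = dists.argmin(1)
        for c in range(k):                    # recompute centroids
            members = embs[ids == c]
            if len(members):
                centers[c] = members.mean(0)
    return ids

embs = rng.normal(size=(200, 32))   # e.g. image/text encoder outputs for 200 posts
cluster_ids = kmeans_ids(embs)      # new items inherit statistics of their cluster
```

The cluster ID's embedding accumulates feedback across all members, which is what transfers signal to cold-start items that have no interaction history of their own.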

Consistency between recall, coarse‑ranking, and fine‑ranking is examined, introducing DNN‑based stacking and cascade models to align representations and reduce truncation loss.
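One simple way to align a lightweight coarse ranker with a heavier fine ranker is score distillation: fit the coarse model to reproduce the teacher's scores (and hence its ordering) on shared candidates, so fewer good items are lost at the truncation boundary. A toy sketch with a linear student and synthetic teacher scores (both stand-ins, not the article's actual architectures):

```python
import numpy as np

rng = np.random.default_rng(3)

d = 8
w_fine = rng.normal(size=d)        # stand-in for the fine-ranker's scoring function
X = rng.normal(size=(500, d))      # candidate features
y = X @ w_fine                     # "teacher" scores from the fine ranker

# Fit the coarse ranker by plain full-batch gradient descent on squared error.
w_coarse = np.zeros(d)
for _ in range(200):
    grad = X.T @ (X @ w_coarse - y) / len(X)
    w_coarse -= 0.1 * grad

coarse_scores = X @ w_coarse       # should now order candidates like the teacher
```

In practice the student would be a small DNN trained on the fine ranker's logits over recalled candidates, but the alignment objective is the same.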

Causal inference is applied in recall and coarse‑ranking by constructing pairwise samples of low‑popularity clicked items versus high‑popularity unclicked items, improving personalization for niche content.
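The pairwise sample construction described above can be sketched directly. The dictionary schema and popularity threshold here are hypothetical; the idea is simply to pair a low-popularity item the user clicked with a high-popularity item they skipped, so the model learns preference beyond popularity.

```python
def build_debias_pairs(impressions, pop_threshold):
    """Pair low-popularity clicked items (positives) against high-popularity
    unclicked items (negatives) from the same user's impressions."""
    pos = [i for i in impressions if i["clicked"] and i["pop"] < pop_threshold]
    neg = [i for i in impressions if not i["clicked"] and i["pop"] >= pop_threshold]
    return [(p["id"], n["id"]) for p in pos for n in neg]

# One user's impressions: a niche clicked item vs. two popular skipped items.
imps = [
    {"id": "a", "pop": 0.1, "clicked": True},
    {"id": "b", "pop": 0.9, "clicked": False},
    {"id": "c", "pop": 0.8, "clicked": False},
]
pairs = build_debias_pairs(imps, pop_threshold=0.5)  # [("a", "b"), ("a", "c")]
```

Each pair would then feed a pairwise loss (e.g. BPR-style) that pushes the clicked niche item's score above the skipped popular item's score.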

Additional techniques include beam‑search sequence re‑ranking, graph databases and embeddings for user‑author interactions, and exploratory GNN‑based recommendation models.
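A toy illustration of beam-search sequence re-ranking: instead of emitting items in point-wise score order, search over orderings whose cumulative reward is sequence-aware. The same-category adjacency penalty below is a hypothetical reward term chosen for illustration.

```python
def beam_rerank(items, scores, beam_width=2, out_len=3):
    """Beam search over orderings of ranked candidates. Sequence score =
    sum of item scores minus a penalty for same-category items back to back."""
    beams = [((), 0.0)]  # (partial sequence of indices, cumulative score)
    for _ in range(out_len):
        candidates = []
        for seq, total in beams:
            for i, item in enumerate(items):
                if i in seq:
                    continue
                penalty = 0.5 if seq and items[seq[-1]]["cat"] == item["cat"] else 0.0
                candidates.append((seq + (i,), total + scores[i] - penalty))
        candidates.sort(key=lambda c: -c[1])   # keep the best partial sequences
        beams = candidates[:beam_width]
    best_seq, _ = beams[0]
    return [items[i]["id"] for i in best_seq]

items = [{"id": "x", "cat": 1}, {"id": "y", "cat": 1}, {"id": "z", "cat": 2}]
reranked = beam_rerank(items, scores=[1.0, 0.9, 0.8])
# "z" is promoted above "y" because it breaks up the two category-1 items.
```

In a production re-ranker the per-step reward would come from a learned sequence model rather than a hand-written penalty, but the search procedure is the same.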

The Q&A section addresses the importance of dwell time as a signal, handling model consistency during fail‑over, and the relationship between recall and ranking pipelines.

Tags: multi-task learning · recommendation systems · causal inference · graph embeddings · large-scale models
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
