Evolution of Alibaba’s Advertising Prediction Models: From Linear Regression to Deep Interest Evolution Networks
This article reviews the characteristics of e‑commerce personalized prediction, traces Alibaba’s advertising CTR model evolution from large‑scale logistic regression through deep learning architectures such as DIN and CrossMedia, and discusses future research directions like representation learning and white‑box modeling.
In his talk at the AI Science Frontier Conference, Zhou Guorui covered three topics: the characteristics of personalized prediction on e‑commerce data, Alibaba's model iteration path, and future research directions.
1. Characteristics of personalized prediction on e‑commerce data – Alibaba’s display ads fall into banner ads and single‑item ads, each composed of an ID, an image, and text. Because these ads reach users outside of search, the system cannot rely on an explicit query and must infer user intent from historical behavior to predict clicks, favorites, or purchases. The problem can be described along three dimensions: explicit content (images, text, reviews), account‑system IDs (item, shop, category), and final feedback (purchase, favorite, add‑to‑cart).
2. Alibaba’s model iteration path
Since 2012 the prediction models have evolved as shown in the diagram below.
LR (Logistic Regression) – Traditional large‑scale sparse features plus logistic regression, relying heavily on manual feature engineering to compensate for the linear model’s inability to capture non‑linear relationships.
MLR (Mixed Logistic Regression, published as the Large Scale Piecewise Linear Model, LS‑PLM) – A non‑linear model that softly partitions the feature space into multiple regions and fits a logistic regression within each, comparable in spirit to GBDT+LR but trained end to end.
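As a minimal sketch of the piecewise‑linear idea (shapes and variable names here are illustrative, not Alibaba's production formulation): a softmax over region weights decides which linear region a sample falls into, and a per‑region logistic regression scores the click probability.

```python
import numpy as np

def mlr_predict(x, u, w):
    """MLR / LS-PLM style prediction (sketch): a softmax gate softly
    assigns the sample to one of m linear regions, and a logistic
    regression scores clicks within each region.
    x: (d,) feature vector; u, w: (m, d) region and scoring weights."""
    gates = np.exp(u @ x - (u @ x).max())
    gates /= gates.sum()                      # soft region assignment
    scores = 1.0 / (1.0 + np.exp(-(w @ x)))  # per-region logistic score
    return float(gates @ scores)              # mixture of logistic regressions
```

With all‑zero weights the gates are uniform and each region scores 0.5, so the prediction is 0.5; non‑linear behavior emerges once different regions learn different weights.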
First‑generation deep CTR model – With growing compute and data, a simple embedding + MLP network was introduced in 2016, achieving a significant lift over MLR because deep models increase capacity and fitting ability.
Deep learning decouples model design from optimization, enabling faster experimentation with more complex architectures.
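The embedding + MLP pattern can be sketched as follows (vocabulary sizes, embedding width, and layer sizes are all hypothetical): sparse ID features are looked up in dense embedding tables, concatenated, and fed through a small feed‑forward network ending in a sigmoid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: two sparse ID features and a two-layer MLP.
emb_tables = {"item": rng.normal(size=(1000, 8)),
              "shop": rng.normal(size=(500, 8))}
W1, b1 = rng.normal(size=(16, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)) * 0.1, np.zeros(1)

def ctr(item_id, shop_id):
    """Embedding + MLP forward pass: look up dense embeddings for the
    sparse IDs, concatenate, and score with a sigmoid-output MLP."""
    x = np.concatenate([emb_tables["item"][item_id],
                        emb_tables["shop"][shop_id]])
    h = np.maximum(x @ W1 + b1, 0.0)                     # ReLU hidden layer
    return float(1.0 / (1.0 + np.exp(-(h @ W2 + b2))))   # predicted CTR
```

The embedding tables hold most of the parameters; the MLP on top is what gives the model its non‑linear fitting ability over the raw sparse features.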
Deep Interest Network (DIN) – Users exhibit multiple diverse interests; DIN uses an attention‑based activation unit to select historical behaviors relevant to a target item, representing user interest with a fixed‑length vector. This design improved CTR by 10%, CVR by 3.3%, and GPM by 12.6%.
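The activation‑unit idea can be illustrated with a simplified dot‑product attention sketch (the paper's activation unit is a small MLP over the behavior and target embeddings and does not necessarily normalize the weights; softmax is used here only to keep the example compact):

```python
import numpy as np

def din_user_interest(behaviors, target):
    """DIN-style interest pooling (simplified): weight each historical
    behavior embedding by its relevance to the candidate ad, then
    sum-pool into a fixed-length user-interest vector.
    behaviors: (T, d) behavior embeddings; target: (d,) ad embedding."""
    logits = behaviors @ target        # relevance of each behavior
    w = np.exp(logits - logits.max())
    w /= w.sum()                       # attention weights over history
    return w @ behaviors               # (d,) pooled interest vector
```

The key point is that the pooled vector changes with the target ad: behaviors relevant to the candidate dominate the representation, while unrelated interests are suppressed.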
CrossMedia Network – Added image‑text features to the model, storing massive image data remotely and extracting low‑dimensional vectors via a neural network to reduce storage and I/O pressure.
Deep Interest Evolution Network (DIEN) – Recognized that vanilla RNNs struggle with e‑commerce sequences because user histories mix many interleaved interests. DIEN introduces an interest‑extraction layer (a GRU supervised by an auxiliary loss on subsequent behaviors) and an interest‑evolution layer (AUGRU, a GRU whose update gate is rescaled by attention to the target ad) to model how the interest relevant to the candidate evolves over time, improving both sequence modeling and gradient propagation.
Experimental results on public datasets, offline validation, and online A/B tests demonstrate consistent performance gains.
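A single AUGRU step from the interest‑evolution layer can be sketched like this (weight shapes and names are illustrative; biases are omitted for brevity): the attention score for the target ad rescales the update gate, so behaviors irrelevant to the candidate barely move the hidden state.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def augru_step(h_prev, x, a, Wu, Uu, Wr, Ur, Wh, Uh):
    """One AUGRU step (GRU with attentional update gate), as in DIEN's
    interest-evolution layer. `a` is the attention score of this
    behavior w.r.t. the target ad; a = 0 freezes the state entirely."""
    u = sigmoid(Wu @ x + Uu @ h_prev)           # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)           # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))
    u = a * u                                   # attention rescales the update
    return (1.0 - u) * h_prev + u * h_tilde
```

Setting a = 1 recovers an ordinary GRU step, while a = 0 passes the previous state through unchanged, which is exactly how the layer lets the relevant interest evolve while ignoring the rest.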
Rocket Launching (Model Distillation) – To meet online latency constraints, a teacher‑student framework compresses a complex offline model into a lightweight online model, using collaborative training, parameter sharing, and gradient blocking.
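The joint objective can be sketched as follows (a simplified single‑sample form; the coefficient and the logit‑matching hint are illustrative). Both networks train on the click label, and a hint term pulls the light student's logits toward the heavy teacher's; in a real framework the teacher logits in the hint term would be wrapped in a stop‑gradient so the hint only updates the student, which is the gradient‑blocking step.

```python
import numpy as np

def rocket_loss(y, p_student, p_teacher, z_student, z_teacher, lam=0.1):
    """Rocket Launching style joint loss (sketch): cross-entropy for both
    nets on the label, plus a hint term matching student logits to
    teacher logits. z_teacher should be treated as a constant
    (stop_gradient) so the hint term trains only the student."""
    eps = 1e-12
    ce = lambda p: -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    hint = np.sum((z_student - z_teacher) ** 2)   # logit-matching hint
    return float(ce(p_student) + ce(p_teacher) + lam * hint)
```

Because the two networks share lower‑layer parameters and train collaboratively, the student benefits from the teacher throughout training rather than only in a post‑hoc distillation pass.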
3. Future Directions
Research will focus on representation learning (e.g., disentangled representations for e‑commerce concepts) and building more interpretable, white‑box models that expose the concepts influencing user decisions, enabling better user‑product interaction and more precise marketing.
Author: Zhou Guorui, Alibaba algorithm expert, Ph.D. candidate at Beijing University of Posts and Telecommunications, research interests include large‑scale machine learning, NLP, computational advertising, and recommendation systems.