Full-Chain Linkage Techniques for Alibaba Display Advertising: From Deep Learning to Set Selection
Facing diminishing deep‑learning and compute gains in Alibaba's display‑ad pipeline, the speaker proposes a full‑chain linkage approach that combines vector‑based recall (PDM), entire‑space pre‑ranking (ESDM), and set‑selection learning‑to‑rank models (LDM, LBDM) to align upstream modules with downstream objectives, yielding over 10% revenue growth.
01 Background and Current Situation
Deep learning and growing compute have delivered large technical dividends to search and advertising, but these gains are fading as models saturate, creating bottlenecks across the recall, coarse‑ranking, fine‑ranking, and re‑ranking modules.
In 2019 the author upgraded Alibaba's display‑ad coarse‑ranking from a vector inner‑product model to a real‑time, fully‑connected deep model (COLD); this narrowed the PCTR gap between coarse and fine ranking, leaving little headroom for further online gains from single‑module model upgrades alone.
To break this deadlock, a full‑chain linkage direction was proposed, addressing inconsistencies between module goals and sample‑selection bias, which has been deployed across all core Alibaba display‑ad scenarios, delivering over 10% revenue uplift.
02 Problems and Challenges
The coarse‑ranking upgrade reduced the performance gap with fine‑ranking, but several issues remain:
Recall channels prioritize interest over RPM, causing high‑RPM ads to be filtered out.
Coarse‑ranking uses advertiser bids directly, while fine‑ranking applies multi‑objective scoring and bid adjustments, leading to mismatched scores.
Sample selection bias (SSB): each stage is trained on exposed samples but must score a far larger candidate set online, and this train‑serve mismatch grows more severe in the earlier stages.
03 Technical Solution Overview
Traditional approaches migrate fine‑ranking precision upstream, but this becomes costly as model complexity and inference scale grow. Instead, two new routes are explored:
Set‑selection learning‑to‑rank (LTR) that directly learns the downstream ranking order.
Vector‑based recall (PDM) that can optimize arbitrary targets such as RPM or GMV.
04 Precise Value Estimation Techniques
1. Point‑based Deep Match Model (PDM) for Full‑Library Recall
PDM handles both direct targets (e.g., CTR) and indirect targets (e.g., RPM = CTR × Bid). Because PCTR is a twin‑tower inner product, eCPM = PCTR × bid can be rewritten as a single inner product by folding the raw bid into the ad‑side vector, so ALSH‑based maximum inner‑product search efficiently retrieves the highest‑RPM ads from the full ad library.
The model also mitigates sample‑selection bias by training on the entire space — click, exposed (PV), and unexposed (un‑PV) samples — via joint training with the fine‑ranking model, random negative sampling, and knowledge distillation on un‑PV samples.
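The eCPM‑as‑inner‑product trick above can be sketched in a few lines. This is an illustrative reconstruction, not Alibaba's implementation: `ecpm_index` and `top_k_by_mips` are hypothetical names, and exact brute‑force search stands in for ALSH, which approximates the same maximum inner‑product search at full‑library scale.

```python
# Sketch: folding each ad's bid into its tower vector turns eCPM retrieval
# into a plain maximum inner-product search (MIPS).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ecpm_index(ad_vectors, bids):
    """Scale each ad-side tower vector by its bid: eCPM = dot(u, bid * v)."""
    return [[bid * x for x in vec] for vec, bid in zip(ad_vectors, bids)]

def top_k_by_mips(user_vec, index, k):
    """Exact brute-force MIPS; ALSH would approximate this sub-linearly."""
    scores = [(dot(user_vec, v), i) for i, v in enumerate(index)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

user = [0.2, 0.5, 0.1]                                   # user-tower vector
ads = [[0.9, 0.1, 0.3], [0.2, 0.8, 0.4], [0.5, 0.5, 0.5]]  # ad-tower vectors
bids = [1.0, 2.5, 0.8]
print(top_k_by_mips(user, ecpm_index(ads, bids), k=2))   # → [1, 2]
```

A mid‑bid ad with strong PCTR alignment (ad 1) outranks a low‑bid ad with a higher raw dot product, which is exactly the RPM ordering the recall stage wants.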
2. Entire‑Space Domain Adaptation Model (ESDM) for Coarse‑Ranking
ESDM aligns the score distributions of coarse‑ranking and fine‑ranking across the whole sample space (click, PV, un‑PV) using joint training, shared embeddings, auxiliary networks, random negative sampling, and distillation from the fine‑ranking model.
Online results: CTR + 3%, RPM + 1.5%.
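The distillation component of entire‑space training might look like the following hedged sketch: the coarse model's prediction is pulled toward the fine model's soft score on every sample, including un‑PV ones the fine model never serves online. Function names are illustrative, not from the talk.

```python
# Sketch: knowledge distillation from the fine-ranking model to the
# coarse-ranking model over the entire sample space (click / PV / un-PV).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def distill_loss(coarse_logits, fine_logits):
    """Cross-entropy of coarse predictions against fine-model soft labels."""
    total = 0.0
    for zc, zf in zip(coarse_logits, fine_logits):
        p, q = sigmoid(zc), sigmoid(zf)  # q acts as the soft target
        total += -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    return total / len(coarse_logits)
```

The loss is minimized when the coarse logits match the fine logits, which is what aligns the two stages' score distributions.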
05 Set‑Selection Techniques
1. Learning‑to‑Rank based Deep Match Model (LDM) for Recall
LDM treats the fine‑ranking output order as the learning target, using exposed‑and‑clicked samples as positives and randomly sampled ads as easy negatives, plus an auxiliary twin‑tower network that mines hard negatives.
Benefits: implicit learning of multi‑objective scoring and bid‑adjustment, low maintenance, and online gains of CTR + 3%, RPM + 4%.
2. Learning‑to‑Rank based and Bid‑Sensitive Deep Pre‑Ranking Model (LBDM) for Coarse‑Ranking
LBDM builds real‑time online‑deep‑learning (ODL) streams from fine‑ranking competition logs, forms pairwise samples across ranking buckets, and introduces a bid‑monotonic pairwise loss that preserves the monotonic relationship between bid and score.
Online gains: CTR + 8%, RPM + 5%.
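A bid‑monotonic pairwise loss of the kind described above might be sketched as a standard pairwise logistic term plus a monotonicity penalty. This is an illustrative guess at the loss shape, not the production objective; both function names are hypothetical.

```python
# Sketch: pairwise LTR loss on fine-ranking competition pairs, with an
# extra term that penalizes the model whenever raising only the bid
# would lower the predicted score.
import math

def pairwise_loss(score_win, score_lose):
    """RankNet-style pairwise logistic loss: winner should outscore loser."""
    return math.log(1.0 + math.exp(-(score_win - score_lose)))

def bid_monotonic_penalty(score_hi_bid, score_lo_bid):
    """Hinge penalty: same ad/user with a higher bid must not score lower."""
    return max(0.0, score_lo_bid - score_hi_bid)
```

The penalty is zero whenever the higher‑bid variant already scores at least as high, so the constraint only activates on violations of bid monotonicity.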
06 Exploit & Explore Full‑Chain Channels
The system separates exploitation (short‑term goal alignment using set‑selection) from exploration (long‑term ecosystem optimization using precise value estimation), allowing each channel to be optimized independently while sharing data via real‑time feedback loops.
07 Business Impact
Precise value estimation (PDM): CTR + 1.5%, RPM + 2%.
Entire‑space coarse‑ranking (ESDM): CTR + 3%, RPM + 1.5%.
Set‑selection recall (LDM): CTR + 3%, RPM + 4%.
Set‑selection coarse‑ranking (LBDM): CTR + 8%, RPM + 5%.
Overall, the full‑chain linkage technology has been rolled out to all core Alibaba display‑ad scenarios, contributing more than a 10% increase in platform revenue.
08 Summary and Outlook
The proposed full‑chain linkage breaks the traditional cascade bottleneck by jointly optimizing recall, coarse‑ranking, and fine‑ranking through both precise value estimation and set‑selection routes. Future work includes deeper integration of the two routes, exploring end‑to‑end multi‑module architectures, and extending the approach beyond advertising to search and recommendation systems.
09 Q&A
Audience questions and discussion.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.