Artificial Intelligence · 24 min read

OPPO Advertising Recall Algorithm: Architecture, Model Selection, Offline Evaluation, Sample Optimization, and Future Directions

This article presents OPPO's comprehensive advertising recall system, detailing the transition from the old to the new architecture with ANN support, the selection of main‑road recall models, the construction of offline evaluation metrics, sample optimization techniques, model enhancements, multi‑scenario training strategies, and outlook for future improvements.

DataFunSummit

Background

OPPO's previous recall pipeline performed directional filtering followed by truncation, with personalized recall running only after truncation; this placement constrained performance and limited personalization. The goal was to enable full-inventory personalized recall and improve overall platform metrics.

New Recall Architecture

The updated design introduces Approximate Nearest Neighbor (ANN) search to support full-inventory personalized recall, together with a multi-path recall mechanism: a primary "consistent" LTR-based main path plus several auxiliary paths (ECPM, cold-start, and others) that enhance diversity and fairness. The new architecture achieved a cumulative 15% ARPU lift.
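A multi-path mechanism like this needs a merge step before the results move downstream. The sketch below is an illustrative (not OPPO's actual) priority-ordered merge: the main path's candidates come first, and auxiliary paths contribute only ads not already recalled.

```python
# Illustrative multi-path recall merge: paths are listed in priority order
# (main LTR path first, then auxiliary paths such as ECPM and cold-start),
# and each ad id is kept only the first time it appears.
def merge_recall_paths(paths):
    """paths: list of candidate-id lists, ordered by path priority."""
    seen, merged = set(), []
    for candidates in paths:
        for ad_id in candidates:
            if ad_id not in seen:
                seen.add(ad_id)
                merged.append(ad_id)
    return merged

print(merge_recall_paths([[1, 2, 3], [3, 4], [5, 1]]))  # -> [1, 2, 3, 4, 5]
```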

Main-Road Recall Model Selection

Three model objectives guide the selection: consistency, generalization (both common and individual), and diversity. After comparing precise value estimation against set-selection approaches, the set-selection method was chosen for its higher consistency with downstream ranking.

LTR Prototype Model

A simple dual-tower architecture is trained with a ranking loss on pairwise samples: positives drawn from top-ranked ads and negatives from random exposures.
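A minimal sketch of such a pairwise ranking loss, assuming the user and ad towers have already produced embeddings (the BPR-style formulation here is one common choice, not necessarily OPPO's exact loss):

```python
# Pairwise (BPR-style) ranking loss for a dual-tower model: the score of a
# top-ranked positive ad should exceed the score of a randomly exposed negative.
import numpy as np

def pairwise_ranking_loss(user_emb, pos_ad_emb, neg_ad_emb):
    """All inputs: (batch, dim) embeddings from the two towers."""
    pos_score = np.sum(user_emb * pos_ad_emb, axis=1)  # inner product per pair
    neg_score = np.sum(user_emb * neg_ad_emb, axis=1)
    # -log(sigmoid(pos - neg)) = softplus(neg - pos), averaged over the batch
    return float(np.mean(np.log1p(np.exp(-(pos_score - neg_score)))))

rng = np.random.default_rng(0)
u = rng.normal(size=(4, 8))
loss_good = pairwise_ranking_loss(u, u, -u)   # positives aligned with the user
loss_bad = pairwise_ranking_loss(u, -u, u)    # positives anti-aligned
print(loss_good < loss_bad)  # -> True
```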

Offline Evaluation Metrics

Evaluation was built in three stages: an initial time-split AUC evaluation (found too optimistic), a full-library Faiss retrieval stage measuring GAUC and Recall (with K and N parameters), and a segmented sampling stage that splits negatives into Easy, Medium, and Hard to enable finer-grained analysis.
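The full-library Recall metric can be sketched as follows; exact NumPy search stands in for Faiss ANN retrieval here, and the interpretation of K (retrieved candidates) and per-user ground-truth sets is an illustrative assumption:

```python
# Full-library Recall@K sketch: for each user, retrieve the top-K ads by inner
# product over the whole ad library, then measure coverage of the ground truth.
import numpy as np

def recall_at_k(user_embs, ad_embs, truth, k):
    """truth: one set of ground-truth ad indices per user."""
    scores = user_embs @ ad_embs.T                 # (num_users, num_ads)
    topk = np.argsort(-scores, axis=1)[:, :k]      # top-K ad ids per user
    hits = [len(set(row) & t) / len(t) for row, t in zip(topk, truth)]
    return float(np.mean(hits))

ads = np.eye(4)                        # 4 toy ads as one-hot vectors
users = np.array([[1.0, 0.2, 0.0, 0.0],   # user 0 prefers ads 0, then 1
                  [0.0, 0.0, 0.2, 1.0]])  # user 1 prefers ads 3, then 2
print(recall_at_k(users, ads, [{0, 1}, {2, 3}], k=2))  # -> 1.0
```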

Sample Optimization

Techniques include adjusting bid-sensitivity modeling (raising bid logits, improving price sensitivity from 5% to 90%), incorporating Hard and Medium negatives via in-batch sampling, and using a Pointwise loss with temperature-scaled Sample Softmax, together yielding up to 2% ARPU gains.
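In-batch sampling with a temperature-scaled softmax can be sketched as below; the function name and the specific temperature values are illustrative, and real training would backpropagate through this loss rather than just evaluate it:

```python
# In-batch negatives with temperature-scaled softmax: every other ad in the
# batch acts as a negative for a given user, and temperature tau controls how
# sharply the loss concentrates on the hardest in-batch negatives.
import numpy as np

def in_batch_softmax_loss(user_embs, ad_embs, tau=0.05):
    logits = (user_embs @ ad_embs.T) / tau           # (B, B); diagonal = positives
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))       # NLL of the matching pairs

eye = np.eye(3)                                      # perfectly matched pairs
sharp = in_batch_softmax_loss(eye, eye, tau=0.05)    # low temperature
soft = in_batch_softmax_loss(eye, eye, tau=1.0)      # high temperature
print(sharp < soft)  # -> True: lower tau sharpens the score distribution
```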

Model Optimization Exploration

Improvements cover dual-tower interaction enhancements (SENet for feature weighting, DAT for early interaction, implicit feature sharing), large-scale multi-classification via Sample Softmax, cold-start optimization with CDN (memory vs. generalization experts), and multi-scenario personalization using PPNet with gated expert networks.
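The SENet-style feature weighting mentioned above can be sketched as a squeeze-and-excitation pass over a tower's feature-field embeddings; the bottleneck width and the random weights here are stand-ins, since trained parameters would be learned end to end:

```python
# SENet-style field reweighting: squeeze each feature-field embedding to a
# scalar, run a two-layer bottleneck over the field vector, and rescale the
# fields by the resulting (0, 1) importance weights.
import numpy as np

def senet_reweight(field_embs, w1, w2):
    """field_embs: (num_fields, dim); w1: (num_fields, r); w2: (r, num_fields)."""
    z = field_embs.mean(axis=1)              # squeeze: one scalar per field
    a = np.maximum(z @ w1, 0.0)              # excitation layer 1 (ReLU)
    s = 1.0 / (1.0 + np.exp(-(a @ w2)))      # per-field weights in (0, 1)
    return field_embs * s[:, None]           # reweighted field embeddings

rng = np.random.default_rng(1)
fields = rng.normal(size=(6, 16))            # 6 feature fields, embedding dim 16
w1 = rng.normal(size=(6, 3))                 # bottleneck reduction 6 -> 3
w2 = rng.normal(size=(3, 6))                 # expansion back to 6 fields
out = senet_reweight(fields, w1, w2)
print(out.shape)  # -> (6, 16)
```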

Multi-Scenario Joint Training

Three strategies balance commonality and specificity: fully independent models per scenario, a fully unified model, and a hybrid in which each media scenario has its own ad tower while sharing a single user tower. The hybrid approach showed promising gains.
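The hybrid strategy can be sketched with linear towers; the media names and the use of simple projection matrices are illustrative assumptions, not details from the talk:

```python
# Hybrid multi-scenario setup: one user tower shared by all media, plus a
# separate ad tower per media scenario (linear projections as stand-ins).
import numpy as np

rng = np.random.default_rng(2)
shared_user_tower = rng.normal(size=(32, 8))                # shared everywhere
ad_towers = {m: rng.normal(size=(32, 8))                    # one per media
             for m in ("feed", "store", "browser")}

def score(user_feat, ad_feat, media):
    u = user_feat @ shared_user_tower      # same user representation for all media
    a = ad_feat @ ad_towers[media]         # media-specific ad representation
    return float(u @ a)

user_feat = rng.normal(size=32)
ad_feat = rng.normal(size=32)
s_feed = score(user_feat, ad_feat, "feed")
s_store = score(user_feat, ad_feat, "store")
print(s_feed != s_store)  # -> True: same inputs, media-specific scores
```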

Outlook

Future work includes further development of the ECPM auxiliary path, adapting to the shift toward ad productization and creative intelligence, and continued exploration of sample and model innovations.

Q&A

The session addressed algorithm complexity, the impact of learning the ranking objective in recall, how the Recall parameters N and K are determined, sample difficulty ratios, and the relationship between offline AUC/Recall and online performance.

Tags: advertising, machine learning, dual-tower model, large-scale classification, offline evaluation, recall algorithm, sample optimization
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
