Serializing Advertising Placement with User Algorithms at Alibaba Health
Alibaba Health’s user algorithms serialize ad placement across multiple channels, using a vector‑based three‑tower model, knowledge distillation, and ROI‑oriented optimization to sequence user touchpoints, raise conversion rates, and improve model accuracy across diverse marketing channels.
In the era of rapid mobile internet growth, consumer decision paths have become non‑linear and dynamic, prompting Alibaba Health to apply data‑driven algorithms for intelligent handling of fragmented user journeys.
The presentation outlines five key topics: an overview of Alibaba Health’s user algorithm business, the foundational serialized placement model, vector model optimizations, rule‑based enhancements, and a concluding summary with future outlook.
1. User Algorithm Overview – Alibaba Health’s algorithm builds on the broader Tmall industry data assets, offering merchants tools for audience targeting across multiple channels (e.g., SMS, push, in‑site ads, external platforms like Douyin and Zhihu). It integrates user‑level behavior data to generate differentiated recommendations, product suggestions, and membership prompts, feeding back into a unified industry‑wide user asset repository.
The serialized placement solution addresses two main challenges: (1) merchants lack a holistic view to select the most efficient combination of channels, and (2) isolated channel goals cause valuable cross‑channel synergies to be overlooked. The proposed approach sequences audience flow, combines channel bundles, and controls budget, bidding, and frequency to achieve precise, ordered user reach.
2. Basic Serialized Placement Model – The model transforms single‑channel CTR/CVR estimation into a person‑product path prediction problem. Key difficulties include constructing sequential samples and handling path uncertainty due to incomplete exposure in real‑world campaigns.
To solve this, a “vector three‑tower” architecture is introduced, representing users, items, and paths as high‑dimensional vectors. This design enables fast serving and caching while preserving white‑box transparency for merchants.
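As a minimal sketch of how three such vectors could be combined at serving time: the talk does not specify the scoring function, so the element‑wise trilinear product, the sigmoid, the function names `three_tower_score` and `best_path`, and the toy vectors and channel names below are all assumptions for illustration.

```python
import math


def three_tower_score(user_vec, item_vec, path_vec):
    """Score a (user, item, path) triple from three same-sized vectors.

    Assumption: a trilinear interaction sum_k u_k * i_k * p_k followed by
    a sigmoid. The talk does not say how the towers are actually combined.
    """
    if not (len(user_vec) == len(item_vec) == len(path_vec)):
        raise ValueError("all three vectors must have the same dimension")
    logit = sum(u * i * p for u, i, p in zip(user_vec, item_vec, path_vec))
    return 1.0 / (1.0 + math.exp(-logit))


def best_path(user_vec, item_vec, path_vecs):
    """Return the name of the candidate path with the highest score."""
    return max(
        path_vecs,
        key=lambda name: three_tower_score(user_vec, item_vec, path_vecs[name]),
    )


# Toy, made-up vectors; real tower outputs would come from trained networks.
user = [0.5, -0.2, 0.8]
item = [1.0, 0.3, 0.4]
paths = {
    "sms>push": [0.2, 0.1, 0.9],
    "push>in_site": [0.9, -0.5, 0.1],
}
chosen = best_path(user, item, paths)
print(chosen)
```

Because each tower emits its vector independently, user and item vectors can be computed offline and cached, and every candidate path keeps its own inspectable score, which is consistent with the serving and transparency points above.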
3. Vector Model Optimizations – Two enhancements are applied:
• Generalization improvement: incorporating GraphSAGE‑based item embeddings to better learn from long‑tail and new products.
• ROI‑oriented prediction: shifting from pure CVR estimation to ROI estimation, with both direct and separate ROI prediction strategies.
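To illustrate why GraphSAGE‑style embeddings help with new and long‑tail products, here is a rough sketch of a single layer with a mean aggregator. The item graph, the feature values, and the untrained identity weight matrices are hypothetical; the talk does not describe how the graph is built or which aggregator is used.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def mean_vector(vectors, dim):
    """Element-wise mean; a zero vector if there are no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with a mean aggregator.

    h_v' = normalize(ReLU(W_self h_v + W_neigh mean(h_u for u in N(v))))
    """
    dim = len(next(iter(features.values())))
    out = {}
    for node, h in features.items():
        agg = mean_vector([features[u] for u in neighbors.get(node, [])], dim)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, agg))]
        z = [max(0.0, x) for x in z]  # ReLU
        norm = math.sqrt(sum(x * x for x in z))
        out[node] = [x / norm for x in z] if norm > 0 else z
    return out


# Hypothetical item graph: a new item with no signal of its own,
# linked to two established items (e.g., by shared category).
features = {"new_item": [0.0, 0.0], "item_a": [1.0, 0.0], "item_b": [0.0, 1.0]}
neighbors = {
    "new_item": ["item_a", "item_b"],
    "item_a": ["new_item"],
    "item_b": ["new_item"],
}
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights
embeddings = sage_mean_layer(features, neighbors, identity, identity)
print(embeddings["new_item"])
```

The new item starts with an all‑zero feature vector yet ends up with a non‑zero embedding borrowed from its neighbors, which is the generalization effect the first enhancement aims for.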
Knowledge distillation is employed to inject cross‑feature information into the online model, while a virtual kernel with topic embeddings enriches user vectors for “thousand‑faces‑one‑object” personalization.
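The distillation step can be sketched as a loss that mixes the true label with a teacher's prediction. This assumes a common formulation, not the one confirmed by the talk: a heavier teacher model that sees cross features produces a soft probability, and the lighter online (student) model is trained against both. The function names and the weight `alpha` are illustrative.

```python
import math


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy with a (possibly soft) target in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def distillation_loss(label, teacher_prob, student_prob, alpha=0.5):
    """Weighted sum of the hard-label loss and the teacher-matching loss.

    alpha = 1.0 ignores the teacher; alpha = 0.0 trains on the teacher only.
    """
    hard = bce(label, student_prob)          # fit the observed conversion
    soft = bce(teacher_prob, student_prob)   # imitate the teacher's output
    return alpha * hard + (1.0 - alpha) * soft


# A student that agrees with the teacher is penalized less than one that
# sits at an uninformative 0.5.
close = distillation_loss(label=1, teacher_prob=0.9, student_prob=0.9)
far = distillation_loss(label=1, teacher_prob=0.9, student_prob=0.5)
print(close < far)
```

In this setup the student never consumes cross features at serving time; it only inherits their signal through the teacher's soft targets during training.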
4. Rule Optimizations – Strict flow rules are relaxed so that broader exposure triggers (e.g., public‑domain searches) and similar‑category products also qualify, raising the overall flow rate by 23.6%. Frequency controls stop further exposure to users who have already converted, raising overall ROI by 4.46%.
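A frequency‑control rule of this kind might look like the sketch below. The `UserState` fields and the cap of three exposures are hypothetical; the talk only states that users who have already converted should not be over‑exposed.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    user_id: str
    exposures: int    # ad exposures so far in this campaign
    converted: bool   # True once the user has purchased


def eligible_for_exposure(user, max_exposures=3):
    """Skip converted users and anyone at or above the exposure cap."""
    return not user.converted and user.exposures < max_exposures


audience = [
    UserState("u1", exposures=0, converted=False),
    UserState("u2", exposures=5, converted=False),  # over the cap
    UserState("u3", exposures=1, converted=True),   # already converted
]
targets = [u.user_id for u in audience if eligible_for_exposure(u)]
print(targets)
```

Filtering before bidding keeps budget away from users who cannot add incremental conversions, which is one plausible mechanism behind the reported ROI gain.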
5. Summary & Outlook – Knowledge distillation and virtual kernel techniques improve both AUC and ROI. Future work includes tailoring flow strategies for health‑related repeat purchases, integrating channel‑aware features, and leveraging multi‑touch attribution to refine budgeting and path selection.
Q&A Highlights
• Model inference typically runs in minutes, varying with audience and product scale.
• User‑side features are channel‑agnostic; channels can be aligned through shared metrics such as click‑through rate.
• Budget is set externally; the algorithm assists in allocation and bidding suggestions.
• Frequency control and dynamic path adjustments are incorporated, though real‑time redirection remains a challenge.
• Personalized paths are essential, and the system protects user data against leakage.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.