Highlights of Alimama Team Papers Accepted at CIKM 2023
Eight papers from the Alimama team were accepted at CIKM 2023. They cover task-specific bottom-representation networks for multi-task recommendation, a unified GNN for multi-scenario e-commerce search, multi-slot bid shading, consistency-oriented pre-ranking, bias-mitigating CTR prediction, efficient progressive-sampling self-attention, delayed-feedback conversion modeling, and hybrid contrastive multi-scenario ad ranking.
The 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023) recently announced its accepted papers, and eight papers from the Alimama technical team are among them.
This article summarizes the accepted papers. Later, we will invite the authors to discuss their ideas and technical results in detail. Stay tuned!
Deep Task-specific Bottom Representation Network for Multi-Task Recommendation
Abstract: Neural-network-based multi-task learning (MTL) has greatly advanced recommendation systems. Existing deep MTL methods (e.g., MMoE, PLE) rely on soft gating over shared parameters, but they can suffer from negative transfer when tasks conflict, which degrades performance. The Deep Task-specific Bottom Representation Network (DTRN) explicitly learns task-specific bottom representations: each task gets its own representation-learning network, user interests are extracted from diverse behavior sequences via a parameter-saving hypernetwork, and each feature is refined with a SENet-like module. Together, these modules produce task-specific bottom representations that mitigate interference between tasks. DTRN can be combined with existing MTL methods. Experiments on public and industrial datasets, as well as deployment in a real system, demonstrate significant improvements.
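To make the SENet-like refinement step more concrete, here is a minimal sketch, assuming squeeze-and-excitation style gating over feature fields. It is not DTRN's actual module: the function name, the mean-pooling squeeze, and the weight matrices `w_reduce` and `w_expand` are illustrative assumptions. Giving each task its own pair of matrices is one way such a gate could yield task-specific refined features.

```python
import math


def senet_like_refine(field_embeddings, w_reduce, w_expand):
    """Reweight each feature field's embedding with a squeeze-and-excitation gate.

    field_embeddings: list of F embeddings (lists of floats).
    w_reduce: r x F matrix, w_expand: F x r matrix (hypothetical learned weights).
    """
    # Squeeze: one summary scalar per field (mean of its embedding).
    squeezed = [sum(e) / len(e) for e in field_embeddings]
    # Excitation, step 1: bottleneck projection followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, step 2: expand back to one sigmoid gate per field.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every field embedding by its gate.
    return [[g * x for x in e] for g, e in zip(gates, field_embeddings)]


# Toy example: 3 feature fields, embedding size 2, bottleneck size 2.
fields = [[0.2, 0.4], [1.0, -1.0], [0.5, 0.5]]
w_reduce = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]]
w_expand = [[1.0, -1.0], [0.3, 0.7], [-0.5, 0.2]]
refined = senet_like_refine(fields, w_reduce, w_expand)
```

Each gate lies strictly between 0 and 1, so the refinement can only shrink a field's embedding relative to the others; which fields are kept prominent depends entirely on the (here made-up) weights.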
BOMGraph: Boosting Multi‑scenario E‑commerce Search with a Unified Graph Neural Network
Abstract: Mobile Taobao supports text, image, and similar‑item search across multiple scenarios with both shared and distinct data distributions. BOMGraph proposes a unified GNN‑based recall method that propagates heterogeneous information across scenarios via intra‑ and inter‑scenario metapaths, uses a decoupling network to extract common and exclusive item representations, and applies cross‑scenario sample augmentation and contrastive learning to address long‑tail and sparsity issues. Offline evaluations and online A/B tests confirm its effectiveness; the method is now deployed in search advertising.
MEBS: Multi‑task End‑to‑end Bid Shading for Multi‑slot Display Advertising
Abstract: Traditional research focuses on bidding for a single ad slot, but multi‑slot display ads are increasingly common. Different slots have varying cost‑effectiveness, requiring fine‑grained bidding. This work introduces bid shading based on request‑level information, proves its optimality, and proposes a multi‑task end‑to‑end algorithm that integrates bid shading for multiple slots. Compared with two‑stage methods and traditional GSP bidding, the approach yields superior performance and significant online gains.
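As a rough illustration of bid shading in general (not the MEBS algorithm itself), the sketch below picks, for a single slot, the bid that maximizes expected surplus, (value − bid) × P(win | bid), in a setting where the winner pays its own bid. The logistic win-rate curve, its parameters, and all function and slot names are assumptions made for this example.

```python
import math


def win_probability(bid, midpoint, steepness):
    """Hypothetical logistic win-rate curve: P(win | bid)."""
    return 1.0 / (1.0 + math.exp(-steepness * (bid - midpoint)))


def shade_bid(value, midpoint, steepness, steps=1000):
    """Grid-search the bid in [0, value] that maximizes expected surplus."""
    best_bid, best_surplus = 0.0, 0.0
    for i in range(steps + 1):
        bid = value * i / steps
        surplus = (value - bid) * win_probability(bid, midpoint, steepness)
        if surplus > best_surplus:
            best_bid, best_surplus = bid, surplus
    return best_bid, best_surplus


# Two slots with different (assumed) competition levels: (midpoint, steepness).
slots = {"slot_top": (1.2, 4.0), "slot_bottom": (0.5, 4.0)}
shaded = {name: shade_bid(2.0, m, s) for name, (m, s) in slots.items()}
```

With the same value per impression, the less competitive slot ends up with a lower shaded bid, which is the kind of slot-level difference the abstract says fine-grained bidding must account for.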
COPR: Consistency‑Oriented Pre‑Ranking for Online Advertising
Abstract: Large-scale ad systems use multi-stage cascades, and inconsistency between the pre-ranking and ranking stages harms efficiency. Existing value-alignment methods leave residual errors. COPR is a consistency-oriented pre-ranking framework with plug-and-play pairwise alignment, chunk sampling, and ΔNDCG weighting, enabling end-to-end optimization of consistency. Deployed in Alimama display advertising, it achieved +12.3% CTR and +5.6% RPM.
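The idea of weighting pairwise alignment by ΔNDCG can be sketched in the style of LambdaRank. The code below is an assumption-laden illustration rather than COPR's implementation: it treats `relevance` as non-negative grades derived from the ranking stage's order, omits chunk sampling entirely, and uses hypothetical function names.

```python
import math


def dcg_gain(relevance, position):
    """DCG contribution of an item with `relevance` at 0-based `position`."""
    return (2 ** relevance - 1) / math.log2(position + 2)


def delta_ndcg_pairwise_loss(pre_scores, relevance):
    """Logistic pairwise loss, each pair weighted by |delta NDCG| of swapping it."""
    n = len(pre_scores)
    # Positions induced by the current pre-ranking scores (0 = top).
    order = sorted(range(n), key=lambda i: -pre_scores[i])
    pos = {item: rank for rank, item in enumerate(order)}
    ideal_dcg = sum(dcg_gain(r, p)
                    for p, r in enumerate(sorted(relevance, reverse=True)))
    loss = 0.0
    for i in range(n):
        for j in range(n):
            if relevance[i] <= relevance[j]:
                continue  # keep only pairs where i should rank above j
            before = dcg_gain(relevance[i], pos[i]) + dcg_gain(relevance[j], pos[j])
            after = dcg_gain(relevance[i], pos[j]) + dcg_gain(relevance[j], pos[i])
            weight = abs(after - before) / ideal_dcg
            # Penalize the pre-ranker when it scores i below j.
            loss += weight * math.log1p(math.exp(-(pre_scores[i] - pre_scores[j])))
    return loss


# Relevance grades stand in for the ranking stage's preferred order.
consistent = delta_ndcg_pairwise_loss([3.0, 2.0, 1.0], [2, 1, 0])
inverted = delta_ndcg_pairwise_loss([1.0, 2.0, 3.0], [2, 1, 0])
```

A pre-ranker that agrees with the ranking stage's order gets a small loss, while one that inverts it gets a much larger one, with mistakes near the top of the list weighted most heavily.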
Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao
Abstract: CTR prediction suffers from sample selection bias because training data consist of exposed ads, not the true candidate distribution. Existing re‑weighting methods have high variance; random sampling is costly. Rec4Ad leverages mixed ad‑recommendation lists, assuming similar user click logic, to introduce recommendation samples for bias mitigation. After fine‑grained data augmentation, Rec4Ad learns decoupled representations via alignment and de‑correlation modules, extracting bias factors and improving performance. Online deployment yields significant gains.
PS‑SA: An Efficient Self‑Attention via Progressive Sampling for User Behavior Sequence Modeling
Abstract: Self‑attention’s O(n²) complexity hinders deployment. Observing sparsity in attention matrices, PS‑SA employs a learnable progressive sampling strategy to select valuable items and compute attention only for them, reducing computation while preserving modeling quality. Experiments on academic and production datasets show cost reduction and strong performance; the method has been deployed in Alibaba’s display ad system, delivering +2.6% CTR and +1.3% RPM.
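A minimal sketch of computing attention only over selected items follows. It swaps in a simpler technique than the paper describes: a fixed top-k selection driven by a given `importance` score, standing in for the learnable progressive sampler. Queries, keys, and values are all the raw item vectors (no projection matrices), and every name here is an assumption for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sampled_self_attention(items, importance, k):
    """Self-attention restricted to the k items with the highest importance.

    `importance` stands in for a learned sampler's output (an assumption).
    Attention then needs k * k dot products instead of n * n.
    """
    top = sorted(range(len(items)), key=lambda i: -importance[i])[:k]
    keep = sorted(top)  # restore the original sequence order
    selected = [items[i] for i in keep]
    scale = math.sqrt(len(selected[0]))
    outputs = []
    for q in selected:
        weights = softmax([dot(q, key) / scale for key in selected])
        outputs.append([sum(w * v[d] for w, v in zip(weights, selected))
                        for d in range(len(q))])
    return keep, outputs


behaviors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
keep, outputs = sampled_self_attention(behaviors, [0.9, 0.1, 0.8, 0.3], k=2)
```

With a sequence of length n and k selected items, the pairwise score computation drops from n² to k² dot products, which is where the cost reduction in this simplified version comes from.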
Entire Space Cascade Delayed Feedback Modeling for Effective Conversion Rate Prediction
Abstract: Effective conversion rate (ECVR) excludes refunded purchases. Traditional CVR and refund prediction face data sparsity, sample selection bias, and cascade delayed feedback (CDF). ECAD models CVR and refund rate in the entire space to address sparsity and bias, and adds auxiliary tasks for conversion and refund time windows to handle CDF. Offline and online tests confirm its effectiveness; deployment in the Xianyu recommendation system significantly boosts ECVR.
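One way to read "entire space" modeling is as a chain of probabilities defined over all impressions, in the spirit of ESMM-style decompositions. The sketch below assumes that an effective conversion is a click followed by a purchase that is not refunded, and that the three probabilities multiply as shown; this factorization and the function name are assumptions for illustration, not the formulation confirmed by the paper.

```python
def effective_conversion_probability(p_ctr, p_cvr, p_refund):
    """P(click, purchase, no refund | impression) under a chain-rule assumption.

    p_ctr:    P(click | impression)
    p_cvr:    P(purchase | click)
    p_refund: P(refund | purchase)
    """
    for p in (p_ctr, p_cvr, p_refund):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    p_ctcvr = p_ctr * p_cvr            # click and purchase, over all impressions
    return p_ctcvr * (1.0 - p_refund)  # drop purchases that end in a refund


# Toy numbers: 5% CTR, 10% CVR, 20% refund rate.
p_effective = effective_conversion_probability(0.05, 0.10, 0.20)
```

Because every factor is defined relative to impressions or the preceding event, each can be supervised with labels available for all exposed samples rather than only for clicked or purchased ones.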
Hybrid Contrastive Constraints for Multi‑Scenario Ad Ranking
Abstract: Multi-scenario ad modeling trains a unified model across scenarios to alleviate data sparsity. Existing methods still struggle with limited network capacity and with modeling inter-scenario relations. HC² introduces hybrid contrastive learning with scenario-general and scenario-specific contrastive losses, along with sample-selection and loss-weighting strategies, to capture both what scenarios share and what is unique to each. The approach can be applied to a variety of multi-scenario ad models.
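For readers unfamiliar with contrastive losses, the sketch below shows a generic InfoNCE loss and one hypothetical way to combine a general and a specific term with weights. It is not the HC² objective: the choice of positives and negatives, the temperature, the weights, and all names are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE: negative log-softmax of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Hypothetical setup: the "general" term pulls together samples across
# scenarios, the "specific" term separates a sample from other scenarios.
anchor = [1.0, 0.0]
general = info_nce(anchor, positive=[1.0, 0.1], negatives=[[0.0, 1.0]])
specific = info_nce(anchor, positive=[0.9, 0.2], negatives=[[-1.0, 0.5]])
w_general, w_specific = 0.7, 0.3  # assumed loss weights
hybrid = w_general * general + w_specific * specific
```

The loss is small when the anchor is far more similar to its positive than to any negative, so how positives and negatives are selected largely determines what each term teaches the model.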
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.