Multi-Task Multi-Scenario Modeling: Challenges, Industry Solutions, and Zhuanzhuan's Practice
This article outlines the challenges of multi-task, multi-scenario modeling for large-scale consumer-facing (C-end) services, reviews key industry approaches such as Shared-Bottom, MMoE, PLE, ESMM, and LHUC, and details Zhuanzhuan's own EPNet-based solution, which improved click-through and conversion rates.
1 Overview of Multi-Task Multi-Scenario Problems
Large-scale consumer-facing (C-end) applications often have to handle multiple tasks (e.g., click-through rate and conversion rate) and multiple scenarios (e.g., feed recommendation and search) at the same time, which complicates optimization of the underlying algorithm systems.
1.1 Background
Multi‑task refers to optimizing several user‑experience metrics simultaneously, while multi‑scenario means users express interests across different contexts, leading to divergent behavior patterns and data distributions.
1.2 Multi-Task Solutions
Typical evolution paths include Shared-Bottom, Multi-gate Mixture-of-Experts (MMoE), and Progressive Layered Extraction (PLE). Shared-Bottom shares low-level parameters across tasks; MMoE replaces the single shared bottom with several experts and adds one gating network per task to weight the expert outputs; PLE separates shared experts from task-specific experts to capture both commonality and heterogeneity. The ESMM framework further addresses conditional relationships between tasks such as CTR and CVR: it treats conversion as conditional on a click, so that pCTCVR = pCTR × pCVR can be learned over the entire impression space.
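To make the gating idea concrete, here is a minimal NumPy sketch of an MMoE forward pass with two tasks, followed by the ESMM product of the two task outputs. It is an illustration under assumptions, not any production model: the layer sizes, the random weights, and the names (W_experts, W_gate, w_tower, task_prob) are hypothetical, and pairing MMoE with the ESMM product is only one way of combining the two ideas.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes, chosen only for illustration.
batch, d_in, d_expert, n_experts = 4, 8, 6, 3
tasks = ("ctr", "cvr")

x = rng.normal(size=(batch, d_in))

# Shared experts: one linear + ReLU layer each.
W_experts = rng.normal(scale=0.1, size=(n_experts, d_in, d_expert))
expert_out = np.maximum(0.0, np.einsum("bi,eio->beo", x, W_experts))

# One gating network and one (linear) tower per task.
W_gate = {t: rng.normal(scale=0.1, size=(d_in, n_experts)) for t in tasks}
w_tower = {t: rng.normal(scale=0.1, size=(d_expert,)) for t in tasks}

def task_prob(task):
    gate = softmax(x @ W_gate[task])                   # (batch, n_experts), rows sum to 1
    mixed = np.einsum("be,beo->bo", gate, expert_out)  # task-specific mix of expert outputs
    return sigmoid(mixed @ w_tower[task])              # (batch,)

p_ctr = task_prob("ctr")   # P(click | impression)
p_cvr = task_prob("cvr")   # P(conversion | click)
p_ctcvr = p_ctr * p_cvr    # ESMM-style product: P(click and conversion | impression)
```

Because each task has its own gate, two tasks can weight the same experts differently, which is what distinguishes MMoE from a plain Shared-Bottom network. In an ESMM-style setup the losses would be placed on p_ctr and p_ctcvr, both of which have labels for every impression.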
1.3 Multi-Scenario Solutions
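Learning Hidden Unit Contributions (LHUC), originally proposed for speaker adaptation in speech recognition, rescales the hidden units of a shared network per scenario, which helps avoid collapsing all scenarios into a single representation. Dynamic-weight mechanisms of this kind have inspired algorithms such as Kuaishou's PEPNet, Alibaba's M2M, AdaSparse, and STAR, which select or re-weight information for each scenario, typically through gating networks or scenario-specific weights.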
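The LHUC idea can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: the sizes, the weight matrix W, the per-scenario parameter table r, and the function name lhuc_layer are all hypothetical. The scaling 2 * sigmoid(r) follows the common LHUC formulation, in which r = 0 leaves the shared layer unchanged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_scenarios, d_in, d_hidden = 3, 5, 4              # hypothetical sizes

W = rng.normal(scale=0.1, size=(d_in, d_hidden))   # layer shared by all scenarios
r = np.zeros((n_scenarios, d_hidden))              # per-scenario LHUC parameters
r[1] = rng.normal(size=d_hidden)                   # pretend scenario 1 has been adapted

def lhuc_layer(x, scenario_ids):
    h = np.maximum(0.0, x @ W)                     # shared hidden units
    scale = 2.0 * sigmoid(r[scenario_ids])         # in (0, 2); equals 1 where r == 0
    return h * scale                               # per-unit, per-scenario re-weighting

x = rng.normal(size=d_in)
same_input = np.stack([x, x])                      # the same input seen in two scenarios
out = lhuc_layer(same_input, np.array([0, 1]))
```

The first row of out is the unmodified shared output (scenario 0 has r = 0), while the second row is the same hidden vector rescaled for scenario 1. Only the small table r differs between scenarios, so low-traffic scenarios still benefit from the shared weights.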
2 Industry Solutions Overview
Kuaishou's PEPNet uses a gating unit, Gate NU, to personalize embeddings and network parameters for each scenario and task. Baidu's MTMS follows a multi-tower approach, separating embeddings per scenario and task and employing a two-stage training pipeline (representation learning, then fine-tuning). Meituan's HiNet builds on MMoE with hierarchical information extraction, adding scenario-specific experts, shared experts, and a scenario-sensitive attention network, while re-using MMoE for task-level prediction.
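As a rough illustration of how a gating unit can personalize embeddings per scenario, the sketch below implements a small two-layer gate in NumPy and uses it to re-weight a shared embedding. It assumes the gate form commonly described for PEPNet (a ReLU layer followed by a sigmoid scaled by a factor gamma = 2); the class name GateNU, the sizes, and the random inputs are hypothetical, and this is not Kuaishou's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class GateNU:
    """Two-layer gate: ReLU hidden layer, then a sigmoid scaled by gamma."""

    def __init__(self, d_in, d_hidden, d_out, gamma=2.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(d_in, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.W2 = rng.normal(scale=0.1, size=(d_hidden, d_out))
        self.b2 = np.zeros(d_out)
        self.gamma = gamma

    def __call__(self, x):
        hidden = np.maximum(0.0, x @ self.W1 + self.b1)
        return self.gamma * sigmoid(hidden @ self.W2 + self.b2)  # each value in (0, gamma)

rng = np.random.default_rng(1)
batch, d_scene, d_emb = 4, 3, 8                  # hypothetical sizes
scene_emb = rng.normal(size=(batch, d_scene))    # embedding of scenario-side features
shared_emb = rng.normal(size=(batch, d_emb))     # concatenated general feature embeddings

gate = GateNU(d_in=d_scene + d_emb, d_hidden=16, d_out=d_emb)
# In a trainable framework the shared embedding fed into the gate would typically be
# detached (stop-gradient) so the gate does not update it; NumPy has no gradients.
delta = gate(np.concatenate([scene_emb, shared_emb], axis=1))
personalized_emb = delta * shared_emb            # element-wise re-weighting per sample
```

With gamma = 2 the gate can both suppress (values below 1) and amplify (values above 1) individual embedding dimensions, depending on the scenario features it receives.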
3 Zhuanzhuan's Multi-Business Multi-Scenario Practice
3.1 Problem and Solution
Zhuanzhuan expanded from mobile 3C (consumer electronics) products to a wide range of categories, creating diverse business lines and scenarios. To address data imbalance in low-traffic scenarios and heterogeneous item ("material") features across categories, Zhuanzhuan adopted an architecture that combines EPNet with dynamic weights.
The model consists of two modules. The representation generation module handles scenario descriptions and sparse/dense features, and includes a DomainNet that outputs feature weights. The task prediction module re-uses the Deep & Cross Network (DCN) for CTR (or other task) prediction. Training is end to end, which avoids the two-stage limitation of MTMS.
Deployed in Zhuanzhuan's search, the model lifted click-through rate by over 6% and conversion rate by more than 2%, with low-traffic categories benefiting in particular.
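The two-module flow can be sketched as follows: feature weights re-scale the feature representation, and the result is fed to DCN-style cross layers plus a small deep branch. This is a minimal sketch under assumptions and not the production model: the DomainNet output is replaced by random stand-in weights, and the sizes, the helper cross_layer, and the single-layer deep branch are hypothetical. The cross-layer formula itself is the one from the DCN paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_layer(x0, xl, w, b):
    """One DCN cross layer: x_{l+1} = x0 * (xl . w) + b + xl."""
    return x0 * (xl @ w)[:, None] + b + xl

rng = np.random.default_rng(0)
batch, d = 4, 6                                   # hypothetical sizes
features = rng.normal(size=(batch, d))            # concatenated sparse/dense representation
feature_weights = rng.uniform(0.0, 2.0, size=(batch, d))  # stand-in for a DomainNet output

x0 = feature_weights * features                   # scenario-aware representation

# Cross branch: explicit feature crosses, two layers.
xl = x0
for _ in range(2):
    xl = cross_layer(x0, xl, rng.normal(scale=0.1, size=d), np.zeros(d))

# Deep branch: a single ReLU layer here for brevity.
deep = np.maximum(0.0, x0 @ rng.normal(scale=0.1, size=(d, 8)))

# Combine both branches and predict a click probability.
combined = np.concatenate([xl, deep], axis=1)
p_ctr = sigmoid(combined @ rng.normal(scale=0.1, size=d + 8))
```

Since the weights and the predictor sit in one computation graph, a framework with automatic differentiation could train both modules jointly from the CTR loss, which is the end-to-end property described above.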
3.2 Future Plans
Future work includes extending the approach to other targets such as CVR and to recommendation scenarios, while addressing cold-start issues for new scenarios or item features.
About the author: Li Guangming, algorithm engineer at Zhuanzhuan, works on search, recommendation, user profiling, GNN, few-shot learning, contrastive learning, and NLP. WeChat ID: gmlldgm.
References
[1] MMoE: Modeling Task Relationships in Multi‑task Learning with Multi‑gate Mixture‑of‑Experts
[2] PLE: Progressive Layered Extraction (PLE): A Novel Multi‑task Learning (MTL) Model for Personalized Recommendations
[3] MoE: Adaptive Mixtures of Local Experts
[4] ESMM: Entire Space Multi‑Task Model: An Effective Approach for Estimating Post‑Click Conversion Rate
[5] LHUC: Learning Hidden Unit Contributions for Unsupervised Speaker Adaptation of Neural Network Acoustic Models
[6] PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information
[7] M2M: A Multi‑Scenario Multi‑Task Meta Learning Approach for Advertiser Modeling
[8] AdaSparse: Learning Adaptively Sparse Structures for Multi‑Domain Click‑Through Rate Prediction
[9] STAR: One Model to Serve All: Star Topology Adaptive Recommender for Multi‑Domain CTR Prediction
[10] MTMS: Multi‑Task and Multi‑Scene Unified Ranking Model for Online Advertising
[11] HiNet: Novel Multi‑Scenario & Multi‑Task Learning with Hierarchical Information Extraction
[12] DCN: Deep & Cross Network for Ad Click Predictions