Tagged articles
2 articles
NetEase Smart Enterprise Tech+
Feb 28, 2024 · Artificial Intelligence

Mastering Multi-Task Learning: Network Designs & Loss Balancing

This article reviews the challenges of multi‑task learning, compares various network architectures such as hard‑parameter sharing, MMoE, CGC, and PLE, and examines loss‑balancing techniques like GradNorm, Dynamic Weight Average and task‑prioritization, offering insights on how to mitigate the “seesaw” effect and improve overall performance.

AI research · dynamic weighting · gradient normalization
15 min read
DaTaobao Tech
Mar 29, 2022 · Artificial Intelligence

Dynamic Weight Averaging and Gradient Normalization for Multi‑Task Recommendation Models

To improve multi‑task recommendation in the “每平每屋” system, the team augments an MMoE ranking model with dynamic weight averaging, dynamic task prioritization, and GradNorm gradient normalization, stabilizing loss convergence across CTR, CVR, and fav tasks and delivering 3–4% online metric gains.
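The summary names dynamic weight averaging without giving its rule. As a rough illustration only, the sketch below follows the commonly published Dynamic Weight Average formulation: each task's weight is a temperature-scaled softmax over the ratio of its two most recent losses, rescaled so the weights sum to the number of tasks. The function name `dwa_weights`, the temperature of 2.0, and the loss values are assumptions made for this example and are not taken from the article.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Return one loss weight per task; the weights sum to the task count.

    Hypothetical helper for illustration, not the article's implementation.
    """
    # Relative descent rate per task: a ratio near or above 1 means the
    # task's loss has stalled or risen, so it should get more weight.
    rates = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]

    # Temperature-scaled softmax over the rates; subtracting the maximum
    # keeps the exponentials numerically stable without changing the result.
    top = max(rates)
    exps = [math.exp((r - top) / temperature) for r in rates]
    total = sum(exps)

    num_tasks = len(rates)
    return [num_tasks * e / total for e in exps]


# Made-up average losses for the CTR, CVR and fav tasks over two epochs.
losses_two_epochs_ago = [0.70, 0.50, 0.60]
losses_last_epoch = [0.35, 0.50, 0.54]

weights = dwa_weights(losses_last_epoch, losses_two_epochs_ago)
```

With these made-up numbers the CVR loss has not moved, so CVR receives the largest weight, while the fast-improving CTR task receives the smallest. In the usual formulation the weights are simply set to 1 for the first two epochs, before two loss histories exist.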

Dynamic Weight Averaging · MMoE · gradient normalization
10 min read