
Calibration-Based Multi-Task Learning for CVR: Model Design, Experiments, and Future Directions

This article reviews the evolution of CVR multi‑task learning, introduces a representation‑calibration architecture that shares embeddings between CTR and CVR, details four calibration network designs, reports offline and online AUC improvements, and outlines future research on embedding clustering and loss‑level triplet modeling.

DataFunSummit

The article revisits the progress of conversion‑rate (CVR) prediction from 2017 to 2022, tracing the shift from coarse‑grained sharing in ESMM and MMOE to fine‑grained neuron‑level sharing in Calibration4CVR and NCS4CVR, and from manually specified shared components to automatically learned shared representations.

1. Introduction – Presents multi‑task learning (MTL) as a response to data sparsity, covering hard parameter sharing, soft parameter sharing, and sample sharing, and stresses the need to separate shared from conflicting knowledge between the CTR and CVR tasks.

2. Model Design

2.1 Single‑Task Network – The baseline single‑task network for CVR consists of a bias sub‑network and a CVR sub‑network (see Figure 1).

2.2 Calibration‑Based CVR – The embedding layers for CTR and CVR are shared. Two calibration MLPs transfer CTR knowledge into the CVR task, one from CTR‑Pos‑Net to CVR‑Pos‑Net and one from CTR‑Net to CVR‑Net, avoiding the unstable entire‑space modeling of ESMM. The overall framework (Figure 2) can be extended to additional tasks such as add‑to‑cart or dwell time.
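As a rough illustration of the shared‑embedding‑plus‑calibration idea, here is a minimal NumPy sketch. All sizes, the tower depth, and the single linear calibration layer are assumptions for the example, not the configuration reported in the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: 1000-row vocab, 8-dim embeddings, 3 feature fields.
vocab, emb_dim, n_fields, hidden = 1000, 8, 3, 16
embedding = rng.normal(scale=0.1, size=(vocab, emb_dim))  # shared by CTR and CVR

def make_tower():
    """One hidden layer plus a scalar output head for a task tower."""
    return [(rng.normal(scale=0.1, size=(n_fields * emb_dim, hidden)), np.zeros(hidden)),
            (rng.normal(scale=0.1, size=(hidden, 1)), np.zeros(1))]

ctr_tower, cvr_tower = make_tower(), make_tower()
# Calibration layer: maps the CTR hidden state into a correction for the CVR hidden state.
W_cal, b_cal = rng.normal(scale=0.1, size=(hidden, hidden)), np.zeros(hidden)

def forward(feature_ids):
    x = embedding[feature_ids].reshape(-1)   # shared embedding lookup, flattened
    ctr_hidden = np.maximum(x @ ctr_tower[0][0] + ctr_tower[0][1], 0.0)
    cvr_hidden = np.maximum(x @ cvr_tower[0][0] + cvr_tower[0][1], 0.0)
    cvr_hidden = cvr_hidden + (ctr_hidden @ W_cal + b_cal)  # inject calibrated CTR knowledge
    p_ctr = sigmoid(ctr_hidden @ ctr_tower[1][0] + ctr_tower[1][1])
    p_cvr = sigmoid(cvr_hidden @ cvr_tower[1][0] + cvr_tower[1][1])
    return float(p_ctr), float(p_cvr)

p_ctr, p_cvr = forward(np.array([3, 42, 7]))
```

Unlike ESMM's product decomposition, the CVR head here predicts directly on the click space; the CTR tower contributes only through the calibration path.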

The authors propose four calibration structures:

(1) CTR Calibration – A two‑layer MLP transforms the CTR representation, which is then concatenated with layer i of the CVR network, providing a non‑linear transformation and implicit feature selection (Figure 3).

(2) CTR Calibration with Scaling – A single‑layer MLP whose output is multiplied by a learned scaling factor in (0, 1) before being concatenated with the CVR representation (Figure 4).

(3) CVR Calibration – A squeeze‑and‑excitation block on the CTR representation produces a sigmoid calibration vector that multiplies the CVR representation element‑wise (Figure 5).

(4) CVR Calibration Concat – Same as (3), but the calibrated CVR representation is concatenated with the original one (Figure 6).
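Structures (3) and (4) can be sketched in a few lines. The hidden width, bottleneck size, and random weights below are placeholders, assuming the standard squeeze‑and‑excitation shape (reduce, then expand, then sigmoid gate):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d, d_squeeze = 16, 4  # hidden width and SE bottleneck (hypothetical sizes)
W1 = rng.normal(scale=0.1, size=(d, d_squeeze))  # squeeze: d -> d_squeeze
W2 = rng.normal(scale=0.1, size=(d_squeeze, d))  # excite:  d_squeeze -> d

def se_calibrate(ctr_repr, cvr_repr):
    """Structure (3): a sigmoid gate derived from the CTR representation
    rescales each dimension of the CVR representation."""
    gate = sigmoid(np.maximum(ctr_repr @ W1, 0.0) @ W2)  # values in (0, 1)
    return gate * cvr_repr

def se_calibrate_concat(ctr_repr, cvr_repr):
    """Structure (4): keep the original CVR representation alongside the
    calibrated one, letting later layers choose between them."""
    return np.concatenate([se_calibrate(ctr_repr, cvr_repr), cvr_repr])

ctr_repr = rng.normal(size=d)
cvr_repr = rng.normal(size=d)
calibrated = se_calibrate(ctr_repr, cvr_repr)
combined = se_calibrate_concat(ctr_repr, cvr_repr)
```

Because the gate lies strictly in (0, 1), calibration can only attenuate CVR dimensions, never amplify them; the concat variant preserves the unattenuated signal as a fallback.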

3. Experiments – Models were trained on multiple days of online production data and tested on the following day, with the calibration layers added on top of the baseline CTR/CVR networks. The best calibration variant delivered an offline AUC gain of roughly 0.01 and an online CVR lift of about 2.8%. The CVR‑Calibration‑Concat variant (structure 4) yielded no further gain over plain CVR calibration, suggesting the calibrated representation already subsumes the original.

Additional findings:

A 1:1 CTR‑to‑CVR sample ratio performed best.

Weakening CVR's influence on the shared embeddings reduced CVR AUC, indicating that CTR and CVR have distinct embedding needs.
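The 1:1 sample‑ratio finding can be illustrated with a toy batch constructor. Pool sizes, batch size, and the downsampling strategy are invented for the example; the article only reports that the 1:1 ratio worked best:

```python
import random

random.seed(0)

# Hypothetical training pools: far more impression (CTR) samples than
# post-click conversion (CVR) samples, as is typical in practice.
ctr_pool = [("ctr", i) for i in range(10_000)]
cvr_pool = [("cvr", i) for i in range(500)]

def mixed_batches(ctr_pool, cvr_pool, batch_size=64):
    """Yield batches with a 1:1 CTR-to-CVR sample ratio by downsampling
    the (much larger) CTR pool to the size of the CVR pool."""
    half = batch_size // 2
    ctr = random.sample(ctr_pool, len(cvr_pool))  # downsample CTR to match CVR
    cvr = cvr_pool[:]
    random.shuffle(cvr)
    for i in range(0, len(cvr) - half + 1, half):
        yield ctr[i:i + half] + cvr[i:i + half]

batch = next(mixed_batches(ctr_pool, cvr_pool))
n_ctr = sum(1 for task, _ in batch if task == "ctr")
```

In production one would more likely upweight or resample per step rather than discard CTR data outright; the sketch only shows the ratio being enforced.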

4. Future Work – Plans include more thorough benchmarking, clustering analysis of CTR vs. CVR embeddings, a triplet‑wise loss to mitigate task conflict, and extending calibration to additional tasks for cumulative knowledge transfer.

5. References

[1] Xiao Ma et al., “Entire Space Multi‑Task Model: An Effective Approach for Estimating Post‑Click Conversion Rate,” SIGIR‑2018.

[2] Hong Wen et al., “Entire Space Multi‑Task Modeling via Post‑Click Behavior Decomposition for Conversion Rate Prediction,” SIGIR‑2020.

[3] Zhe Zhao et al., “Recommending What Video to Watch Next: A Multitask Ranking System,” RecSys‑2019.

[4] Hongyan Tang et al., “Progressive Layered Extraction (PLE): A Novel Multi‑Task Learning Model for Personalized Recommendations,” RecSys‑2020.

[5] Xuanji Xiao et al., “LT4REC: A Lottery Ticket Hypothesis Based Multi‑task Practice for Video Recommendation System,” arXiv 2020.

Deep Learning · A/B Testing · CVR · Multi-Task Learning · Recommendation Systems · Calibration
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
