Bidden-MarfNet: Feature Missing-aware Routing-and-Fusion Network for Customer Lifetime Value Prediction

This paper presents Bidden-MarfNet, a novel architecture that explicitly encodes feature‑missing information and dynamically re‑weights samples to address feature missingness and label sparsity in user‑level LTV prediction for advertising, demonstrating superior performance over existing methods through extensive experiments.

IEG Growth Platform Technology Team

Nov 28, 2022

The article introduces a practical innovation for user‑level LTV prediction in advertising, tackling two major challenges: frequent feature missingness and extreme label sparsity. Traditional approaches focus on imputing missing values, which can mislead models when missing patterns vary across samples.

To overcome these issues, the authors propose a Feature Missing‑aware Routing‑and‑Fusion Network (MarfNet) that explicitly informs the model of missing features. Samples are first grouped by feature source, and each group’s coverage rate is encoded as a missing‑status embedding.
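As a rough illustration of the grouping step, the sketch below computes a per-group coverage rate for a single sample. The group names, the feature names, and the use of `None` to mark a missing value are all assumptions made for this example; how the paper turns these rates into a learned missing-status embedding is not shown here.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from feature source (group) to its feature names.
FEATURE_GROUPS: Dict[str, List[str]] = {
    "profile": ["age", "region", "device"],
    "behavior": ["sessions_7d", "clicks_7d"],
    "payment": ["pay_count_30d", "pay_amount_30d"],
}


def coverage_rates(sample: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Return, for each group, the fraction of its features that are present."""
    rates = {}
    for group, names in FEATURE_GROUPS.items():
        # A feature counts as present only if the key exists and is not None.
        present = sum(1 for name in names if sample.get(name) is not None)
        rates[group] = present / len(names)
    return rates


# One user with a partly filled profile, full behavior data, no payment data.
sample = {
    "age": 31.0,
    "region": None,
    "device": 2.0,
    "sessions_7d": 5.0,
    "clicks_7d": 12.0,
}
rates = coverage_rates(sample)
```

Here the coverage vector describes which sources a sample can be trusted on, which is the signal the later routing layers consume.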

The architecture contains a one‑order missing‑aware layer that routes samples to specialized expert networks based on their missing‑status embeddings via a soft gating mechanism. This allows experts to focus on samples with similar missing patterns, reducing the influence of imputed features.
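The soft gating idea can be sketched as follows, assuming a mixture-of-experts-style softmax gate driven only by the missing-status embedding. The function names, the linear gate, and the toy numbers are illustrative assumptions; the expert outputs are passed in precomputed rather than produced by real expert networks.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_route(
    missing_emb: List[float],
    expert_outputs: List[List[float]],
    gate_weights: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse expert outputs using gate weights computed from the missing-status embedding.

    gate_weights holds one row per expert (a hypothetical linear gate).
    """
    gate_logits = [
        sum(w * m for w, m in zip(row, missing_emb)) for row in gate_weights
    ]
    gate = softmax(gate_logits)
    dim = len(expert_outputs[0])
    # Weighted sum of expert outputs, one output dimension at a time.
    fused = [
        sum(g * out[d] for g, out in zip(gate, expert_outputs))
        for d in range(dim)
    ]
    return gate, fused


missing_emb = [2 / 3, 1.0, 0.0]            # e.g. per-group coverage rates
gate_weights = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]
expert_outputs = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for two expert networks
gate, fused = soft_route(missing_emb, expert_outputs, gate_weights)
```

Because the gate depends only on the missing pattern, samples with similar coverage receive similar expert mixtures, which is what lets an expert specialize on one pattern.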

A two‑order missing‑aware layer further encodes missing information in feature interactions, concatenating expert outputs from the first layer with the one‑order missing embeddings and applying a similar gated fusion.

The final representation (concatenated one‑ and two‑order missing‑aware embeddings) passes through several fully‑connected layers to produce three outputs: payment probability (sigmoid), mean (identity), and variance (softplus). Training uses the zero‑inflated log‑normal (ZILN) loss, which combines a binary classification loss on whether a user pays with a log‑normal regression loss on the payment amount.
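A per-sample version of this loss can be sketched as below, following the commonly used zero-inflated log-normal formulation. Treating the softplus output as the log-normal scale parameter (standard deviation in log space) is an assumption of this sketch, as are the function names and the `eps` clipping constant.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def ziln_loss(y: float, pay_logit: float, mu: float, raw_scale: float,
              eps: float = 1e-6) -> float:
    """Classification loss plus, for paying users only, log-normal NLL."""
    p = min(max(sigmoid(pay_logit), eps), 1.0 - eps)
    sigma = max(softplus(raw_scale), eps)
    paid = 1.0 if y > 0 else 0.0
    bce = -(paid * math.log(p) + (1.0 - paid) * math.log(1.0 - p))
    if paid == 0.0:
        return bce  # the regression term is masked out for non-payers
    log_y = math.log(y)
    nll = (log_y + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)
           + (log_y - mu) ** 2 / (2.0 * sigma ** 2))
    return bce + nll


def ziln_expected_value(pay_logit: float, mu: float, raw_scale: float) -> float:
    """Predicted LTV: P(pay) times the mean of the log-normal."""
    sigma = softplus(raw_scale)
    return sigmoid(pay_logit) * math.exp(mu + 0.5 * sigma ** 2)


loss_non_payer = ziln_loss(0.0, pay_logit=0.0, mu=0.0, raw_scale=0.0)
loss_payer = ziln_loss(1.0, pay_logit=0.0, mu=0.0, raw_scale=0.0)
```

Masking the regression term for non-payers matters under label sparsity: most samples contribute only to the classification part of the loss.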

To mitigate label sparsity, a batch‑in dynamic weighting mechanism (Bidden) computes a discrimination gap for each sample based on its logit distance to positive/negative class centers within a batch. Samples with smaller gaps (harder to distinguish) receive higher loss weights, governed by a steepness‑bias mapping.
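One way to picture the weighting is the sketch below. The summary does not give the exact formulas, so the gap definition (distance to the other class center minus distance to the own class center) and the sigmoid mapping with `steepness` and `bias` parameters are assumptions chosen only to reproduce the described behavior: smaller gaps yield larger weights.

```python
import math
from typing import List


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bidden_weights(logits: List[float], labels: List[int],
                   steepness: float = 1.0, bias: float = 0.0) -> List[float]:
    """Per-sample loss weights from in-batch class centers (illustrative only)."""
    pos = [z for z, y in zip(logits, labels) if y == 1]
    neg = [z for z, y in zip(logits, labels) if y == 0]
    if not pos or not neg:
        # With sparse labels a batch may contain a single class: use uniform weights.
        return [1.0] * len(logits)
    c_pos = sum(pos) / len(pos)
    c_neg = sum(neg) / len(neg)
    weights = []
    for z, y in zip(logits, labels):
        own, other = (c_pos, c_neg) if y == 1 else (c_neg, c_pos)
        # Assumed gap: large when the logit sits near its own class center.
        gap = abs(z - other) - abs(z - own)
        # Assumed steepness-bias mapping: decreasing in the gap.
        weights.append(_sigmoid(-steepness * (gap - bias)))
    return weights


logits = [2.0, 0.1, -1.5, -0.2]
labels = [1, 1, 0, 0]
weights = bidden_weights(logits, labels)
```

In this toy batch the second sample (a payer whose logit lies midway between the two centers) gets a larger weight than the first, which is already well separated.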

The paper (WSDM 2023) reports experiments on the IEG advertising platform covering overall performance, ablation studies, hyper‑parameter analysis, and online A/B tests, in which Bidden‑MarfNet outperforms the baseline models. The gating network learns useful group‑wise missing patterns, and the dynamic weighting scheme proves robust across different model backbones.

In summary, the paper introduces a missing‑aware routing‑and‑fusion network and a batch‑in dynamic weighting mechanism that together improve LTV prediction under severe feature missingness and label sparsity, with potential applicability to other advertising tasks such as CTR and CVR prediction.


Written by

IEG Growth Platform Technology Team

Official account of Tencent IEG Growth Platform Technology Team, showcasing cutting‑edge achievements across front‑end, back‑end, client, algorithm, testing and other domains.
