Graph Neural Network Based Content Recall and Popularity Bias Mitigation for Alibaba's Home‑Decor Platform

This article presents the content recall solution used on Alibaba's home‑decor platform. It combines GNN‑based side‑information mining and a multi‑view GNN framework with the DICE causal embedding approach to counter sparse user behavior and popularity bias. Offline metrics and online A/B tests show gains in both recall accuracy and diversity.

DaTaobao Tech

Mar 10, 2022

The article describes the practical research conducted on the "Mei Ping Mei Wu" home‑decor platform, focusing on the challenges of sparse user behavior and strong popularity bias in content recommendation.

Two recall paradigms are introduced: a traditional U‑I‑I (user‑item‑item) recall that leverages user attributes, content style, space, and attached products, and a deep U‑I approach that uses vector similarity to increase diversity.
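As a rough illustration of the U‑I idea, the sketch below ranks content by cosine similarity between a user vector and content vectors and keeps the top k. The vectors, content IDs, and the `recall_top_k` helper are hypothetical; the article does not specify the similarity function or the retrieval index used in production.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recall_top_k(user_vec, content_vecs, k):
    """Return the IDs of the k contents most similar to the user vector."""
    scored = [(cid, cosine(user_vec, vec)) for cid, vec in content_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in scored[:k]]

# Toy embeddings (assumed values, for illustration only).
content_vecs = {
    "nordic_living_room": [0.9, 0.1, 0.0],
    "nordic_bedroom": [0.8, 0.3, 0.1],
    "industrial_kitchen": [0.0, 0.2, 0.9],
}
user_vec = [1.0, 0.2, 0.0]
top2 = recall_top_k(user_vec, content_vecs, k=2)
print(top2)
```

A brute-force scan like this only works for small catalogs; a real system would more likely use an approximate nearest-neighbor index.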

To address data sparsity, the authors propose deep side‑information mining with a Graph Neural Network (GNN). A content‑content homogeneous graph is built where each node represents a piece of content and its side information (items, categories, style, space) is used as node attributes. The pipeline includes GraphSAGE‑style sampling, an encoder that concatenates ID embeddings and aggregates neighbor vectors (mean aggregation), a decoder that reconstructs edge weights, and a binary cross‑entropy loss.
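The pipeline above can be sketched as a single forward pass. This is a simplified illustration, not the authors' implementation: it has no learned weight matrices, the node features stand in for the concatenated ID and side‑information embeddings, the decoder is assumed to be a sigmoid over a dot product, and all names and values are hypothetical.

```python
import math
import random

def sample_neighbors(adj, node, num_samples, rng):
    """GraphSAGE-style sampling: keep at most num_samples neighbors."""
    neighbors = adj.get(node, [])
    if len(neighbors) <= num_samples:
        return list(neighbors)
    return rng.sample(neighbors, num_samples)

def mean_vector(vectors, dim):
    """Element-wise mean; a zero vector if the node has no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def encode(node, features, adj, num_samples, rng):
    """Concatenate the node's own vector with the mean of its sampled neighbors."""
    dim = len(features[node])
    sampled = sample_neighbors(adj, node, num_samples, rng)
    neighbor_mean = mean_vector([features[n] for n in sampled], dim)
    return features[node] + neighbor_mean  # list concatenation

def decode(z_u, z_v):
    """Predict an edge weight in (0, 1) from two node encodings (assumed decoder)."""
    score = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-score))

def bce(pred, target, eps=1e-12):
    """Binary cross-entropy between a predicted and a target edge weight."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

# Toy content-content graph (assumed values).
features = {"c1": [1.0, 0.0], "c2": [0.9, 0.1], "c3": [0.0, 1.0], "c4": [-1.0, 0.0]}
adj = {"c1": ["c2", "c3"], "c2": ["c1"], "c3": ["c1"], "c4": []}
rng = random.Random(0)

z1 = encode("c1", features, adj, num_samples=2, rng=rng)
z2 = encode("c2", features, adj, num_samples=2, rng=rng)
pred = decode(z1, z2)
loss = bce(pred, target=1.0)  # c1-c2 is an observed edge
```

In training, the embeddings and aggregation weights would be updated to reduce this loss over observed edges and sampled negative pairs.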

For richer representations, a multi‑view GNN framework (M2GRL) is employed. Separate intra‑view graphs (e.g., category‑category, item‑item) are learned alongside an inter‑view graph that connects content with its side information. Multi‑task learning balances the intra‑view and inter‑view objectives.
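The summary does not say how the intra‑view and inter‑view objectives are balanced. One common option, assumed here purely for illustration, is uncertainty‑based weighting, where each task gets a learnable log‑variance `s` and contributes `exp(-s) * loss + s` to the total. The task names and loss values below are hypothetical.

```python
import math

def combine_losses(task_losses, log_vars):
    """Weight each task loss by exp(-s) and add s as a regularizer.

    task_losses: mapping of task name to its current scalar loss
    log_vars:    mapping of task name to its log-variance s (learned in practice)
    """
    total = 0.0
    for name, loss in task_losses.items():
        s = log_vars[name]
        total += math.exp(-s) * loss + s
    return total

# Hypothetical per-task losses for two intra-view graphs and the inter-view graph.
task_losses = {"intra_category": 0.6, "intra_item": 0.4, "inter_view": 1.0}

# With all log-variances at zero this reduces to a plain sum of the losses.
equal = combine_losses(task_losses, {name: 0.0 for name in task_losses})

# Raising a task's log-variance shrinks the weight on that task's loss.
down_weighted = combine_losses(
    task_losses, {"intra_category": 0.0, "intra_item": 0.0, "inter_view": 1.0}
)
```

The additive `s` term stops the optimizer from driving every weight to zero, which is why the weights can be learned jointly with the embeddings instead of being hand‑tuned.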

Popularity bias is mitigated using the DICE (Disentangling Interest and Conformity with Causal Embedding) framework. Each user and item is represented by two separate embeddings, one for interest and one for conformity, and the training data is split by item popularity so that each component receives its own supervision signal. Four tasks are jointly optimized: conformity modeling, interest modeling, click estimation, and discrepancy regularization, the last of which keeps the two components apart.
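A minimal sketch of how the four DICE objectives might be computed for one (user, positive item, negative item) triple follows. It assumes pairwise BPR‑style losses, a squared‑distance discrepancy term, and illustrative weights `alpha` and `beta`; the embeddings, popularity counts, and function names are hypothetical, and the platform's actual loss formulation may differ.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bpr(preferred, other):
    """Pairwise loss that is small when preferred scores above other."""
    return math.log1p(math.exp(-(preferred - other)))

def dice_losses(user, pos, neg, pos_pop, neg_pop):
    """Per-task losses for one (user, clicked item, unclicked item) triple."""
    int_pos = dot(user["interest"], pos["interest"])
    int_neg = dot(user["interest"], neg["interest"])
    con_pos = dot(user["conformity"], pos["conformity"])
    con_neg = dot(user["conformity"], neg["conformity"])

    # Click estimation uses both components together.
    click = bpr(int_pos + con_pos, int_neg + con_neg)
    if neg_pop > pos_pop:
        # The user skipped a more popular item: interest must explain the click.
        interest = bpr(int_pos, int_neg)
        conformity = bpr(con_neg, con_pos)
    else:
        # The clicked item is at least as popular: only conformity is constrained.
        interest = 0.0
        conformity = bpr(con_pos, con_neg)

    # Discrepancy: reward distance between interest and conformity embeddings.
    discrepancy = -sum(sq_dist(e["interest"], e["conformity"]) for e in (user, pos, neg))
    return {"click": click, "interest": interest,
            "conformity": conformity, "discrepancy": discrepancy}

def total_loss(parts, alpha=0.1, beta=0.01):
    return (parts["click"] + alpha * (parts["interest"] + parts["conformity"])
            + beta * parts["discrepancy"])

# Toy embeddings and popularity counts (assumed values).
user = {"interest": [1.0, 0.0], "conformity": [0.0, 1.0]}
pos = {"interest": [1.0, 0.0], "conformity": [0.0, 0.2]}
neg = {"interest": [0.0, 1.0], "conformity": [0.0, 0.9]}

parts_unpopular_click = dice_losses(user, pos, neg, pos_pop=10, neg_pop=500)
parts_popular_click = dice_losses(user, pos, neg, pos_pop=500, neg_pop=10)
loss = total_loss(parts_unpopular_click)
```

The popularity split is what makes the decoupling identifiable: only clicks on less popular items give a clean signal for interest, so the interest loss is applied to those triples alone.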

Offline, top‑k candidate items are pre‑computed from recent 15‑day click logs, and noisy edges are filtered by a relevance threshold. Online A/B tests show consistent gains: +0.75% to +0.80% in pCTR, +0.48% to +1.50% in ipv_pCTR, and modest improvements in diversity metrics.
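The offline pre‑computation step could look roughly like the sketch below. It assumes the click logs have already been restricted to the 15‑day window and grouped into sessions, and it uses a cosine‑normalized co‑click count as the relevance score; the article does not define the relevance measure, so that choice, the threshold value, and the `build_topk` helper are all assumptions.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def build_topk(sessions, k, min_relevance):
    """Build a top-k neighbor table from co-click sessions.

    sessions:      iterable of lists of content IDs clicked together
    k:             number of neighbors to keep per content
    min_relevance: edges scoring below this threshold are dropped as noise
    """
    clicks = Counter()
    co_clicks = Counter()
    for session in sessions:
        items = sorted(set(session))
        clicks.update(items)
        co_clicks.update(combinations(items, 2))

    neighbors = defaultdict(list)
    for (a, b), count in co_clicks.items():
        # Normalize by popularity so that frequent items do not dominate.
        relevance = count / math.sqrt(clicks[a] * clicks[b])
        if relevance >= min_relevance:
            neighbors[a].append((b, relevance))
            neighbors[b].append((a, relevance))

    # Sort by descending relevance, breaking ties by ID for determinism.
    return {
        node: [n for n, _ in sorted(cands, key=lambda p: (-p[1], p[0]))[:k]]
        for node, cands in neighbors.items()
    }

# Toy sessions (assumed data).
sessions = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
top1 = build_topk(sessions, k=1, min_relevance=0.5)
top2 = build_topk(sessions, k=2, min_relevance=0.5)
```

Here the a‑c edge scores about 0.41 and is filtered out, so "c" never appears among the candidates for "a" even when k is large enough to include it.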

The authors conclude that GNN‑based high‑order relation modeling and DICE‑based bias removal effectively improve recall accuracy and diversity for low‑activity users, and outline future directions such as long‑tail node handling, GNN‑ranking integration, and multi‑factor decoupling.
