Preference‑Oriented Diversity Model Based on Mutual Information for E‑commerce Search Re‑ranking (SIGIR 2024)
This paper, accepted at SIGIR 2024, introduces PODM‑MI, a preference‑oriented diversity re‑ranking model for e‑commerce search that jointly optimizes accuracy and diversity by modeling user intent with multivariate Gaussian distributions and maximizing mutual information between user preferences and candidate items.
Re‑ranking in e‑commerce search aims to reorder items by considering their relationships, but existing methods often improve scoring accuracy at the expense of result diversity, or vice versa. Users exhibit varying diversity needs across decision stages, requiring a balanced approach.
The authors propose PODM‑MI (Preference‑oriented Diversity Model based on Mutual Information), which simultaneously addresses accuracy and diversity. The framework consists of two main components: PON (Preference‑Oriented Modeling) for capturing user diversity preferences, and SAM (Similarity‑aware Mutual‑information) for aligning candidate items with those preferences.
PON models user intent using a multivariate Gaussian distribution derived from historical queries, session information, and behavior streams. The distribution’s mean and diagonal covariance capture the evolving uncertainty of user preferences, while a parallel Gaussian models item‑level diversity.
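The paper does not include code, but the PON idea — mapping a user-behavior representation to a diagonal-covariance Gaussian and sampling from it differentiably — can be sketched as follows. This is a minimal illustration with made-up dimensions and layer names, not the authors' implementation:

```python
import torch
import torch.nn as nn

class PreferenceGaussian(nn.Module):
    """Map a user-behavior summary vector to a multivariate Gaussian
    with diagonal covariance, via separate mean and log-variance heads."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.mu_head = nn.Linear(in_dim, latent_dim)
        self.logvar_head = nn.Linear(in_dim, latent_dim)

    def forward(self, h: torch.Tensor):
        mu = self.mu_head(h)
        logvar = self.logvar_head(h)
        # Reparameterization trick: z = mu + sigma * eps keeps sampling
        # differentiable, so the Gaussian parameters can be trained end-to-end.
        eps = torch.randn_like(mu)
        z = mu + torch.exp(0.5 * logvar) * eps
        return z, mu, logvar

# Toy usage: a batch of 4 users with 32-dim behavior summaries,
# projected into 8-dim intent samples.
encoder = PreferenceGaussian(32, 8)
z, mu, logvar = encoder(torch.randn(4, 32))
print(z.shape)  # torch.Size([4, 8])
```

The same construction, applied to candidate-item features instead of user history, yields the parallel item-level diversity Gaussian the paper describes.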
SAM quantifies the alignment between user preference and item diversity via mutual information. Because direct estimation is intractable, the authors derive a variational lower bound, enabling gradient‑based optimization. A learnable utility matrix further adjusts the influence of each item according to this enhanced consistency.
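The summary does not reproduce the paper's exact bound. One standard tractable variational lower bound on mutual information is the InfoNCE estimator, shown here as an illustrative stand-in (function name, temperature, and in-batch negative sampling are assumptions, not the paper's derivation):

```python
import torch
import torch.nn.functional as F

def infonce_mi_lower_bound(user_z: torch.Tensor,
                           item_z: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style lower bound on I(user preference; item diversity).

    user_z, item_z: [B, D] samples from the two Gaussians; matched pairs
    share a row index, and the other rows serve as in-batch negatives.
    """
    u = F.normalize(user_z, dim=-1)
    v = F.normalize(item_z, dim=-1)
    logits = u @ v.t() / temperature          # [B, B] pairwise similarities
    labels = torch.arange(u.size(0))          # diagonal entries are positives
    # Maximizing the bound is equivalent to minimizing cross-entropy
    # over the in-batch negatives, so we return its negation.
    return -F.cross_entropy(logits, labels)
```

Maximizing this quantity pushes each user's preference sample toward its own candidate set's diversity representation and away from other users', which is the alignment effect SAM targets.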
The overall loss combines a PRM‑style classification (ranking) loss with the mutual‑information loss, encouraging both accurate ranking and diverse outcomes.
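A minimal sketch of such a combined objective, assuming a binary click label per item and a scalar trade-off weight `beta` (both illustrative; the paper's exact weighting is not given in this summary):

```python
import torch
import torch.nn.functional as F

def podm_mi_loss(scores: torch.Tensor,
                 clicks: torch.Tensor,
                 mi_bound: torch.Tensor,
                 beta: float = 0.5) -> torch.Tensor:
    """Total loss = ranking loss - beta * MI lower bound.

    scores:   [B] predicted ranking logits
    clicks:   [B] binary click/conversion labels
    mi_bound: scalar lower bound on mutual information (to be maximized,
              hence subtracted from the loss)
    """
    rank_loss = F.binary_cross_entropy_with_logits(scores, clicks.float())
    return rank_loss - beta * mi_bound

# Toy usage: zero logits give BCE = ln(2) ~ 0.693 regardless of labels.
loss = podm_mi_loss(torch.zeros(4),
                    torch.tensor([1, 0, 1, 0]),
                    torch.tensor(-0.1))
```

Subtracting the MI term means gradient descent simultaneously lowers the ranking loss and raises the mutual-information bound, which is the joint accuracy/diversity optimization the paper describes.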
Extensive online A/B testing on JD.com's main search engine demonstrates that PODM‑MI significantly improves user conversion rate (UCVR) and result diversity. Visual analysis using entropy measures and t‑SNE clustering shows clear separation of user intent groups, confirming the model's ability to adaptively balance relevance and variety.
Case studies illustrate how the model yields more diverse results for users with broad query histories (e.g., "Switch, Zelda, phone case…") and more focused results when intent becomes specific (e.g., repeated searches for "dress").
Future work includes incorporating finer‑grained features for better intent modeling, refining the update mechanism for user preference representations, and making intent modeling explicitly influence ranking decisions.
The research team also announces open recruitment for positions related to large‑model‑driven retrieval and ranking, inviting interested candidates to apply.