Generative Re‑ranking for Diverse and Context‑Aware Recommendation
The paper presents a generative re‑ranking framework for Taobao’s home‑decor channel. It pairs heuristic sequence generators (MMR, DPP, beam search) with a context‑aware evaluator that scores each candidate list as a whole, producing recommendation lists that balance relevance and diversity. Compared with traditional point‑wise ranking, the framework reports gains in PV, IPV, CTR, and click diversity.
This article is the fourth in a series that shares practical research on recall, ranking, and cold‑start modules of the Meipingmeiwu (Every Square Every House) channel on Taobao.
Problem background : The channel presents scene‑based home‑decor content with multiple space (living room, bedroom, etc.) and style (Japanese, Nordic, etc.) attributes. Traditional point‑wise ranking optimizes click‑through or conversion rates but leads to homogeneous recommendation lists, causing user fatigue, reduced exposure for long‑tail items, and ecosystem imbalance.
Related work : Existing context‑aware re‑ranking methods such as DLCM, PRM, miRNN, Seq2Slate, EG‑Rerank, and GRN either break the consistency between scoring context and display context or fail to fully model the influence of later items on earlier ones.
Technical solution :
1. Sequence generation : Parallel heuristic strategies (MMR, DPP, beam search) generate candidate lists that balance efficiency and diversity.
2. MMR : Maximal marginal relevance. Greedy selection that, at each step, picks the item maximizing a weighted trade‑off: relevance minus similarity to the items already selected.
3. DPP : A determinantal point process selects the subset that maximizes the determinant of the corresponding kernel submatrix, jointly accounting for relevance and pairwise similarity.
4. Beam search : A middle ground between greedy and exhaustive search; maintains the top‑k partial sequences at each step to approximate a globally optimal list.
5. Sequence evaluation : A context‑aware encoder (LSTM, Bi‑LSTM, or Multi‑Head Self‑Attention) processes the whole candidate list, producing a value score for each position. A shared task network (MMoE‑style) combines user features, item features, and encoder outputs to predict efficiency scores. The final list value is the sum of position scores; the list with the highest value is shown.
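The generate‑then‑evaluate flow described in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (mmr, greedy_dpp, beam_search, list_value, rerank), the weight lam, the kernel rel[i] * sim[i][j] * rel[j], the beam objective, and the toy data are all hypothetical. In particular, list_value is a hand‑written stand‑in for the learned context‑aware encoder and task network; it only mimics the property that each position's score depends on the whole list.

```python
def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n, d = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return d

def mmr(rel, sim, k, lam=0.7):
    """Greedy MMR: relevance minus similarity to already selected items."""
    chosen, rest = [], set(range(len(rel)))
    while rest and len(chosen) < k:
        def score(i):
            penalty = max((sim[i][j] for j in chosen), default=0.0)
            return lam * rel[i] - (1 - lam) * penalty
        best = max(sorted(rest), key=score)
        chosen.append(best)
        rest.remove(best)
    return chosen

def greedy_dpp(rel, sim, k):
    """Greedy approximation: add the item that maximizes det of the kernel submatrix."""
    def kernel_det(items):
        # Assumed kernel: L[i][j] = rel[i] * sim[i][j] * rel[j]
        return det([[rel[i] * sim[i][j] * rel[j] for j in items] for i in items])
    chosen, rest = [], set(range(len(rel)))
    while rest and len(chosen) < k:
        best = max(sorted(rest), key=lambda i: kernel_det(chosen + [i]))
        chosen.append(best)
        rest.remove(best)
    return chosen

def beam_search(rel, sim, k, width=2, lam=0.7):
    """Keep the `width` best partial lists per step (assumed MMR-style objective)."""
    beams = [(0.0, [])]
    for _ in range(k):
        expanded = []
        for total, seq in beams:
            for i in range(len(rel)):
                if i in seq:
                    continue
                penalty = max((sim[i][j] for j in seq), default=0.0)
                gain = lam * rel[i] - (1 - lam) * penalty
                expanded.append((total + gain, seq + [i]))
        if not expanded:
            break
        expanded.sort(key=lambda b: -b[0])
        beams = expanded[:width]
    return [seq for _, seq in beams]

def list_value(seq, rel, sim):
    """Toy evaluator: sum of position scores, each depending on the full list."""
    total = 0.0
    for pos, i in enumerate(seq):
        # Crowding looks at earlier AND later items, like a bidirectional encoder.
        crowd = max((sim[i][j] for j in seq if j != i), default=0.0)
        total += rel[i] * (1.0 - 0.5 * crowd) / (1 + pos) ** 0.5
    return total

def rerank(rel, sim, k):
    """Generate candidate lists with each heuristic, then show the highest-value one."""
    candidates = [mmr(rel, sim, k), greedy_dpp(rel, sim, k)] + beam_search(rel, sim, k)
    return max(candidates, key=lambda seq: list_value(seq, rel, sim))

# Toy data: items 0 and 1 are near-duplicates with the highest relevance.
rel = [0.9, 0.85, 0.6, 0.5]
sim = [[1.0, 0.95, 0.1, 0.2],
       [0.95, 1.0, 0.1, 0.2],
       [0.1, 0.1, 1.0, 0.3],
       [0.2, 0.2, 0.3, 1.0]]
print(rerank(rel, sim, 2))  # [0, 2]
```

On this toy input, sorting purely by relevance would show the near‑duplicate pair [0, 1], while every generator and the evaluator prefer [0, 2]. The split matters because the generators only need to be cheap and varied; the decision about which list to display rests with the evaluator, which scores each item in the same context in which it would be shown.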
Experimental results :
• Offline PV and IPV changes by generator: DPP (+4.33% PV, +1.06% IPV), MMR (+9.72% PV, –15.18% IPV), beam search (+4.28% PV, –2.45% IPV).
• Context‑aware evaluators outperform the baseline rank‑score model in PV‑AUC: unidirectional LSTM +2.3 pt, bidirectional LSTM +3.0 pt, MSA with position embedding +4.1 pt.
• Online A/B test (7 days): +2.17% pCTR, +0.25% depth, +2.43% IPV, with a slight drop in exposure diversity but a +1.60% to +1.74% increase in click diversity.
The generative re‑ranking framework eliminates the mismatch between scoring and display contexts, improves overall list utility, and boosts both efficiency metrics and user click diversity.
Conclusion : Generative re‑ranking, consisting of diverse sequence generation and context‑aware evaluation, effectively balances relevance and diversity, delivering higher CTR, IPV, and click‑diversity compared with traditional DPP‑based methods. Future work includes direct modeling of user scrolling behavior, integrating generative models guided by the evaluator, and adding exploration mechanisms to mitigate sample bias.
DaTaobao Tech
Official account of DaTaobao Technology