Generative Recommendation Systems for JD Alliance Advertising: Design, Implementation, and Evaluation
This article surveys how large language models reshape recommendation systems, details a generative recommender framework for JD Alliance ads—including item representation, model input, training, and inference—presents extensive offline and online experiments, and discusses future optimization directions.
Large language models (LLMs) are profoundly influencing natural language processing and present new opportunities for recommendation systems (RS). The article introduces the motivation for integrating LLMs with RS, particularly within the JD Alliance advertising platform, and outlines the overall workflow.
Generative Recommendation Systems directly generate recommended items without scoring each candidate, simplifying the traditional multi‑stage pipeline (recall, coarse ranking, fine ranking). Advantages include streamlined processes, better generalization for cold‑start users, and increased stability.
JD Alliance Advertising is a CPS‑based marketing platform that targets low‑activity users across multiple scenarios. Challenges include data sparsity, cold‑start, scenario understanding, and maintaining diversity and novelty.
The integration of LLMs offers high‑quality textual representations and world knowledge, enabling more accurate context‑aware recommendations and mitigating sparsity and cold‑start issues.
Four Core Stages of Generative RS are described: (1) Item representation – using numeric IDs, textual metadata, or semantic IDs (SID); (2) Model input – task description, user profile, and context encoded as text; (3) Model training – next‑item prediction and alignment tasks; (4) Model inference – free or constrained generation with beam search or Trie‑based decoding.
Item Representation details three methods: numeric ID (split into token sequences), textual metadata (titles, descriptions), and semantic‑based ID obtained by quantizing embeddings via RQ‑VAE. The paper discusses trade‑offs such as length, semantic richness, and uniqueness.
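To make the semantic-ID idea concrete, here is a minimal sketch of residual quantization, the core operation inside RQ-VAE's quantizer. The codebooks here are random rather than learned, and the function and token names are illustrative, not from the article:

```python
import numpy as np

def residual_quantize(embedding, codebooks):
    """Map one item embedding to a semantic ID: one code index per level.
    At each level, pick the nearest codeword and quantize the residual."""
    residual = embedding.astype(np.float64).copy()
    sid = []
    for codebook in codebooks:                       # codebook shape: (K, d)
        dists = np.linalg.norm(codebook - residual, axis=1)
        idx = int(np.argmin(dists))                  # nearest codeword index
        sid.append(idx)
        residual = residual - codebook[idx]          # pass residual to next level
    return sid

# Toy setup: 3 quantization levels, 16 codewords each, 8-dim embeddings.
rng = np.random.default_rng(0)
d, K, levels = 8, 16, 3
codebooks = [rng.normal(size=(K, d)) for _ in range(levels)]
item_embedding = rng.normal(size=d)

sid = residual_quantize(item_embedding, codebooks)
# Render the 3 code indices as new vocabulary tokens, e.g. "<a_5><b_11><c_2>"
tokens = "".join(f"<{chr(97 + i)}_{c}>" for i, c in enumerate(sid))
```

A short SID like this trades some semantic richness for a compact, fixed-length identifier; collisions between items must still be resolved (e.g. by an extra disambiguation token), which is part of the uniqueness trade-off the article mentions.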
Model Input Representation combines a task prompt, user interaction history, and contextual information into a textual sequence, enabling LLMs to predict the next item token sequence.
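The input-assembly step can be sketched as plain string construction; the field labels and example values below are assumptions for illustration, not the article's actual prompt template:

```python
def build_model_input(task_prompt, history_sids, context):
    """Flatten the task description, the user's interaction history
    (items rendered as SID token strings), and contextual information
    into a single text sequence for the LLM."""
    history = " ".join(history_sids)
    return (f"{task_prompt}\n"
            f"User history: {history}\n"
            f"Context: {context}\n"
            f"Next item:")

prompt = build_model_input(
    "Recommend the next item the user will click.",
    ["<a_5><b_11><c_2>", "<a_1><b_7><c_9>"],
    "scene=JD Alliance feed, hour=21",
)
```

The model is then asked to continue the sequence, emitting the target item's SID tokens one by one.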
Model Training involves constructing text pairs {"instruction": "...", "response": "..."} for next‑item prediction and alignment between SIDs and item titles. Example training instances are provided in JSON format.
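A sketch of how such training instances might be serialized, assuming hypothetical SID tokens and item titles (the instruction wording is illustrative, not quoted from the article):

```python
import json

def next_item_instance(history_sids, target_sid):
    """Next-item prediction pair: history in, target SID out."""
    return {
        "instruction": "The user has interacted with: "
                       + " ".join(history_sids)
                       + ". Predict the next item.",
        "response": target_sid,
    }

def alignment_instance(sid, title):
    """Alignment pair tying a semantic ID to its textual title."""
    return {
        "instruction": f"What is the title of item {sid}?",
        "response": title,
    }

samples = [
    next_item_instance(["<a_5><b_11><c_2>"], "<a_1><b_7><c_9>"),
    alignment_instance("<a_1><b_7><c_9>", "Wireless Mouse"),
]
# One JSON object per line, the usual fine-tuning data layout.
jsonl = "\n".join(json.dumps(s, ensure_ascii=False) for s in samples)
```

Mixing both task types in one fine-tuning corpus is what grounds the newly added SID tokens in the model's existing textual knowledge.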
Model Inference generates item identifiers token‑by‑token, using either unconstrained generation (with post‑hoc filtering) or constrained decoding (Trie, FM‑index) to ensure validity.
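The Trie-constrained variant can be sketched with a plain dict-based prefix tree; at each decoding step, the beam search would mask the LLM's logits so that only the tokens returned here are allowed (a minimal illustration, not the article's implementation):

```python
def build_trie(valid_sids):
    """Prefix tree over the token sequences of all catalog items."""
    root = {}
    for seq in valid_sids:
        node = root
        for tok in seq:
            node = node.setdefault(tok, {})
        node["<eos>"] = {}          # mark a complete, valid item
    return root

def allowed_next_tokens(trie, prefix):
    """Tokens that keep the partial generation a valid item prefix."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return []               # prefix already invalid
        node = node[tok]
    return list(node.keys())

# Toy catalog of three items, each a short token sequence.
trie = build_trie([["<a_5>", "<b_11>"], ["<a_5>", "<b_2>"], ["<a_1>"]])
```

Because every generated sequence is forced onto a path in the trie, each beam terminates at a real catalog item and no post-hoc validity filtering is needed.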
The article surveys representative generative RS works (e.g., RecSysLLM, P5, TIGER, LC‑Rec, BIGRec) and summarizes their contributions.
Practical Solution proposes a framework that uses semantic IDs for items and aligns collaborative and textual signals through multi‑task training. The base models (Qwen1.5‑0.5B/1.8B/4B, Yi‑6B) are fine‑tuned with added SID tokens, and beam‑size‑20 constrained decoding is employed.
Experiments include offline evaluations (HR@1/5/10, NDCG@1/5/10) across model sizes and base models, as well as online A/B tests measuring user click‑through rate (UCTR). Results show that larger models perform better, Yi‑6B excels without decoding constraints, and generative models achieve comparable or superior online metrics, especially in sparse‑data scenarios.
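For reference, the two offline metrics have simple definitions in the common next-item setting where each test case has a single ground-truth item (a standard formulation, assumed rather than quoted from the article):

```python
import math

def hit_rate_at_k(ranked, target, k):
    """HR@K: 1 if the ground-truth item appears in the top-K list."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """NDCG@K with one relevant item: DCG = 1/log2(rank+1), ideal DCG = 1,
    so the score rewards ranking the target higher, not just retrieving it."""
    if target in ranked[:k]:
        rank = ranked.index(target) + 1
        return 1.0 / math.log2(rank + 1)
    return 0.0
```

Both are averaged over all test users; with beam size 20, the top-K list is simply the K highest-scoring completed beams.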
Optimization Directions suggest improving data quality, extending SID training for incremental updates, employing LoRA, multi‑task mixing, model distillation, pruning, quantization, and exploring multimodal queries and recommendation‑reason generation.
The article concludes with a call for collaboration to advance generative recommendation technology.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.