How JD’s Advertising Lab Leverages Large‑Scale AI to Transform E‑Commerce Ads
JD's advertising research team combines deep learning, multimodal modeling, reinforcement‑learning auctions, and generative recommendation to boost ad relevance, improve long‑tail product exposure, and overcome large‑model inference challenges in a high‑traffic e‑commerce environment.
The JD Retail Advertising Department drives site‑wide traffic monetization and marketing effectiveness; its R&D team applies cutting‑edge AI algorithms to massive user and merchant data, empowering millions of merchants and billions of consumers.
Flow Value Estimation – Better Understanding of User‑Item‑Context
Query Intent Understanding
Query intent recognition parses user search queries into categories, correcting errors, extracting entities, and rewriting queries to provide accurate signals for downstream recall, relevance, and ranking. Key challenges include:
Generic terms that span many categories (e.g., "fruit", "birthday gift")
Ambiguous terms with multiple possible meanings (e.g., "Xiaomi" could refer to millet or the phone brand)
Cold‑start long‑tail categories lacking exposure
Long‑tail queries with diverse expressions
Generation‑Matching Model for Long‑Tail Training Data
A generative‑matching pipeline pre‑trains a query generator from SKU titles/attributes and a matching model that scores generated queries against original titles, filtering low‑quality queries. The generated queries, labeled with the SKU’s category, augment training data to balance long‑tail categories.
These synthetic queries also support query suggestion and rewriting tasks.
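The generate-then-filter loop can be sketched as follows. The generator and matching model here are toy stand-ins (template expansion and token overlap) for the trained models the article describes; the function and field names are illustrative, not JD's actual API.

```python
def generate_queries(sku_title, sku_attrs):
    """Stand-in for the trained query generator: emit candidate
    queries from the SKU title and attribute values."""
    tokens = sku_title.split()
    candidates = [" ".join(tokens[:2]), tokens[0]]
    candidates += [f"{v} {tokens[0]}" for v in sku_attrs.values()]
    return candidates

def match_score(query, sku_title):
    """Stand-in for the matching model: fraction of query tokens
    that also appear in the original title."""
    q, t = set(query.lower().split()), set(sku_title.lower().split())
    return len(q & t) / max(len(q), 1)

def augment_training_data(sku, threshold=0.5):
    """Keep only high-scoring generated queries, labeled with the
    SKU's category, as extra intent-classification examples."""
    kept = []
    for q in generate_queries(sku["title"], sku["attrs"]):
        if match_score(q, sku["title"]) >= threshold:
            kept.append({"query": q, "category": sku["category"]})
    return kept

sku = {"title": "wireless noise cancelling headphones",
       "attrs": {"color": "black"},
       "category": "audio"}
examples = augment_training_data(sku)
```

The threshold plays the role of the quality filter: low-overlap (in reality, low-matching-score) queries never reach the training set, so noise from the generator does not pollute the long-tail categories it is meant to balance.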
Prior Knowledge Injection for Medium‑Tail Recall
To break the feedback‑loop bias toward high‑click categories, the system injects prior knowledge such as category semantics, co‑occurrence graphs, and GCN‑based encoders, training a BERT‑GCN hybrid with a semi‑supervised loss that matches query‑category semantics.
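A minimal sketch of the two ingredients, assuming a pre-computed query embedding (standing in for BERT output) and a one-layer GCN over the category co-occurrence graph; the graph, dimensions, and seed are invented for illustration.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: symmetrically normalize the
    category co-occurrence graph, then propagate and transform."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ feats @ weight, 0.0)

def query_category_scores(query_vec, category_vecs):
    """Cosine similarity between a (BERT-style) query embedding and
    GCN-refined category embeddings, softmaxed into match targets."""
    q = query_vec / np.linalg.norm(query_vec)
    c = category_vecs / (np.linalg.norm(category_vecs, axis=1,
                                        keepdims=True) + 1e-8)
    logits = c @ q
    return np.exp(logits) / np.exp(logits).sum()

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3 categories
cat_vecs = gcn_layer(adj, rng.normal(size=(3, 4)), rng.normal(size=(4, 4)))
probs = query_category_scores(rng.normal(size=4), cat_vecs)
```

In the semi-supervised setting, these softmax scores serve as soft targets on unlabeled queries while labeled queries use the standard classification loss, which is how the prior (graph) knowledge reaches medium-tail categories that clicks alone under-represent.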
Multimodal Content Understanding
Multimodal Representation in Recall
A dual‑stream pipeline extracts text embeddings (BGE‑large‑zh1.5) from titles, brands, and categories, and visual embeddings (ViT‑CLIP‑base) from product images. Contrastive learning aligns the modalities, and a Gate‑GNN aggregates item‑item graphs to produce a unified multimodal product vector for recall.
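The contrastive alignment step can be sketched with a symmetric InfoNCE loss, where title and image embeddings of the same product form the positive pairs; the dimensions, batch size, and temperature here are illustrative, not the production values.

```python
import numpy as np

def info_nce(text_emb, image_emb, temperature=0.07):
    """Symmetric contrastive loss: diagonal (same-product) pairs are
    positives; every other item in the batch is a negative."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature                # (batch, batch) similarities
    idx = np.arange(len(t))
    lp_t2v = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    lp_v2t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return (-lp_t2v[idx, idx].mean() - lp_v2t[idx, idx].mean()) / 2

rng = np.random.default_rng(1)
batch = rng.normal(size=(8, 16))
aligned = info_nce(batch, batch)                  # perfectly aligned modalities
random_pairs = info_nce(batch, rng.normal(size=(8, 16)))
```

Lowering this loss pulls each product's text and image vectors together, which is what lets the downstream Gate-GNN aggregate them into one multimodal recall vector.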
Multimodal Representation in Creative Selection
Self‑supervised vision models (DINO) generate robust image embeddings without requiring object masks, enabling fine‑grained creative ranking that captures visual appeal and high‑order structured information.
Flow Selling Mechanism – ListVCG Reinforcement‑Learning Auction
ListVCG reformulates the combinatorial slate auction (selecting and ordering 4 ads from roughly 700 candidates) as a reinforcement‑learning problem, using an Actor‑Critic architecture in which the Actor samples candidate permutations and the Critic evaluates them with real‑world feedback, iteratively improving the policy.
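The sample-evaluate-improve loop can be sketched as below. Everything here is a toy stand-in: a small candidate pool, Gumbel-top-k slate sampling as the Actor, a fixed position-weighted value function as the Critic (real feedback would come from logged auction outcomes), and a crude score-bumping update in place of a proper policy gradient.

```python
import numpy as np

rng = np.random.default_rng(2)
N_CANDIDATES, SLOTS = 20, 4        # the real problem is ~700-choose-4

def sample_slate(logits, rng):
    """Actor: sample an ordered slate without replacement
    (Gumbel-top-k is exact Plackett-Luce sampling)."""
    gumbel = rng.gumbel(size=logits.shape)
    return np.argsort(-(logits + gumbel))[:SLOTS]

def critic_value(slate, quality):
    """Stand-in Critic: position-weighted quality of the slate."""
    pos_weight = 1.0 / np.log2(np.arange(SLOTS) + 2)
    return float((quality[slate] * pos_weight).sum())

quality = rng.uniform(size=N_CANDIDATES)   # hidden per-ad value
logits = np.zeros(N_CANDIDATES)            # uniform initial policy

for _ in range(200):                       # iterate: sample, evaluate, reinforce
    slate = sample_slate(logits, rng)
    baseline = critic_value(sample_slate(np.zeros(N_CANDIDATES), rng), quality)
    advantage = critic_value(slate, quality) - baseline
    logits[slate] += 0.1 * advantage       # push policy toward better slates

greedy = np.argsort(-logits)[:SLOTS]       # best slate under the learned policy
```

The point of the reformulation is visible even in the toy: the policy never enumerates all ordered 4-subsets; it only samples, scores, and nudges, which is what makes the 700-candidate case tractable.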
Multi‑Agent RL for Bidding and Mechanism
Separate bidding and mechanism agents are co‑trained; the bidding agent predicts optimal bid ratios, while the mechanism agent learns allocation and pricing policies. Offline simulation, reward shaping, and curriculum RL address sparse reward and environment mismatch.
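The co-training idea can be illustrated with a toy simulated auction in which the two agents alternate updates, which is a deliberately crude stand-in for the offline-simulation plus RL setup described above; the second-price environment, single competitor, and coordinate-ascent updates are all invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(3)

def run_auction(bid_ratio, reserve, values, rng):
    """Toy second-price auction: the agent bids ratio*value against one
    random competitor; returns (advertiser surplus, platform revenue)."""
    surplus = revenue = 0.0
    for v in values:
        competitor = rng.uniform(0, 1)
        if bid_ratio * v >= max(competitor, reserve):
            price = max(competitor, reserve)
            surplus += v - price
            revenue += price
    return surplus, revenue

values = rng.uniform(0, 1, size=500)
bid_ratio, reserve = 0.5, 0.1

for _ in range(50):
    # Bidding agent: probe nearby bid ratios, keep whichever raises surplus.
    for cand in (bid_ratio * 1.05, bid_ratio * 0.95):
        if run_auction(cand, reserve, values, rng)[0] > \
           run_auction(bid_ratio, reserve, values, rng)[0]:
            bid_ratio = cand
    # Mechanism agent: adjust the reserve price toward higher revenue.
    for cand in (reserve * 1.05, reserve * 0.95):
        if run_auction(bid_ratio, cand, values, rng)[1] > \
           run_auction(bid_ratio, reserve, values, rng)[1]:
            reserve = cand
```

Even in this toy, each agent's best response depends on the other's current policy, which is exactly why the article's setup needs offline simulation and reward shaping rather than independent training.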
Generative Recommendation for Advertising
The pipeline quantizes high‑click product titles into semantic IDs using RQ‑VAE, expands the token vocabulary of a large language model, and fine‑tunes it on bidirectional translation tasks between semantic IDs and product text.
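The semantic-ID step can be sketched as plain residual quantization, a simplified stand-in for RQ-VAE (which learns its codebooks jointly with an encoder/decoder rather than using fixed random ones); the codebook sizes and embedding dimension below are illustrative.

```python
import numpy as np

def residual_quantize(vec, codebooks):
    """RQ-style quantization: at each stage pick the nearest codeword
    for the current residual, then subtract it and recurse."""
    residual, ids = vec.astype(float), []
    for book in codebooks:                 # one codebook per tuple slot
        idx = int(np.argmin(np.linalg.norm(book - residual, axis=1)))
        ids.append(idx)
        residual = residual - book[idx]
    return ids, residual

rng = np.random.default_rng(4)
codebooks = [rng.normal(size=(256, 8)) for _ in range(4)]  # 4 stages x 256 codes
title_emb = rng.normal(size=8)             # stand-in for a title embedding
semantic_id, err = residual_quantize(title_emb, codebooks)
```

Each stage's index becomes one token of the four-tuple (e.g., `<a_12><b_7><c_201><d_54>`), which is what gets added to the language model's vocabulary for the translation tasks below.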
Prompt: Please tell me the title of the product whose four‑tuple representation is {input_tuple}?
Input: <a_1><b_2><c_3><d_4>
Output: Huawei Mate60 Pro 16G+512GB White

Subsequent asymmetric prediction tasks generate the next product a user is likely to browse, using either semantic IDs or raw text as input.
Prompt: User’s historical product tuple sequence is {input_tuple1, …, input_tupleN}, predict the next product.
Input: <a_1><b_2><c_3><d_4>, …
Output: Huawei Mate60 Pro 16G+512GB White

Advertising Creative Generation
To improve AI‑generated ad creatives, the team proposes a Multimodal Reliable Feedback Network (RFNet) that automatically evaluates generated images, feeding back into a recursive generation loop. Consistent‑Condition regularization fine‑tunes diffusion models using RFNet scores, dramatically raising usable image rates. A 1M‑image labeled dataset (RF1M) supports training; the work appears at ECCV 2024.
Large‑Model Engineering for Advertising
Challenges include sub‑100 ms latency for models ranging from 0.5 B to 72 B parameters, high inference cost, and complex business pipelines. JD advertising has deployed 1.5 B‑parameter models with optimized per‑token cost, tuned hardware topology, chip‑specific adaptations, distributed training, caching, and load balancing to support million‑QPS workloads.
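Of the levers listed, caching is the simplest to illustrate. The sketch below is a toy LRU response cache keyed on a prompt hash, an assumed simplification of production-grade caching (which would typically also cover KV/prefix state inside the model, not just whole responses).

```python
from collections import OrderedDict
import hashlib

class InferenceCache:
    """Minimal LRU cache keyed on a hash of the prompt, a stand-in
    for the caching used to cut repeated-inference cost."""
    def __init__(self, capacity=10000):
        self.capacity, self.store = capacity, OrderedDict()

    def _key(self, prompt):
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get_or_compute(self, prompt, model_fn):
        k = self._key(prompt)
        if k in self.store:
            self.store.move_to_end(k)          # mark as recently used
            return self.store[k]
        result = self.store[k] = model_fn(prompt)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)     # evict least recently used
        return result

calls = []
def fake_model(prompt):
    """Stand-in for an expensive LLM call; records each invocation."""
    calls.append(prompt)
    return prompt.upper()

cache = InferenceCache(capacity=2)
first = cache.get_or_compute("hello", fake_model)
second = cache.get_or_compute("hello", fake_model)   # served from cache
```

At million-QPS scale, hit rate on repeated queries translates directly into saved GPU time, which is why caching sits alongside hardware and distribution work in the list above.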
JD Cloud Developers
JD Cloud Developers (the developer account of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud‑computing, IoT, and related developers. It publishes JD product technical information, industry content, and tech‑event news. Embrace technology and partner with developers to envision the future.