
Solving Technical Challenges with Large AI Models at JD Retail: Reward Modeling, Query Expansion, and Model Pruning

JD Retail's engineering team describes three applications of large AI models: replacing a monolithic reward model with specialized small models for ad-image generation, deploying an LLM-driven query-expansion pipeline that lifts conversion rates, and pruning text-to-image transformers with FFT and RDP to raise throughput by 40% without degrading performance. The team also built an agent evaluation system and a semantic smart assistant.

JD Retail Technology

May 7, 2025


JD Retail's technology team showcases how young algorithm engineers solve hard technical problems using large‑scale AI models.

Reward modeling for advertising image generation: The team replaced a single large reward model with multiple specialized small reward models, each evaluating one aspect of an image such as product shape, placement, or color. They built a training-inference framework in which the generator model is fine-tuned by reinforcement learning on these multi-dimensional quality signals. The resulting system achieves a 98% usable-image rate and a 30% recall improvement.
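To make the idea of multi-dimensional quality signals concrete, here is a minimal sketch of how several small scorers could be folded into one scalar reward for reinforcement learning. The article does not describe the aggregation rule, so the weighted mean, the per-dimension floor, and the names `combined_reward`, `shape`, `placement`, and `color` are all assumptions made for illustration. The scorers are stubs that read precomputed values instead of running a model.

```python
from typing import Callable, Dict, Tuple

# Hypothetical scorer type: maps an image record to a quality score in [0, 1].
Scorer = Callable[[dict], float]


def combined_reward(
    image: dict,
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
    floor: float = 0.5,
) -> Tuple[float, Dict[str, float]]:
    """Fold per-dimension scores into one scalar reward (assumed rule)."""
    scores = {name: fn(image) for name, fn in scorers.items()}
    total_weight = sum(weights[name] for name in scores)
    weighted_mean = sum(weights[name] * s for name, s in scores.items()) / total_weight
    # Assumption: one badly failing dimension makes the image unusable,
    # so the reward is zeroed instead of being averaged away.
    usable = all(s >= floor for s in scores.values())
    return (weighted_mean if usable else 0.0), scores


# Stub scorers standing in for the specialized small reward models.
scorers = {
    "shape": lambda img: img["shape"],
    "placement": lambda img: img["placement"],
    "color": lambda img: img["color"],
}
weights = {"shape": 1.0, "placement": 1.0, "color": 1.0}

good_reward, good_scores = combined_reward(
    {"shape": 0.9, "placement": 0.8, "color": 0.7}, scorers, weights
)
bad_reward, bad_scores = combined_reward(
    {"shape": 0.9, "placement": 0.8, "color": 0.2}, scorers, weights
)
```

Returning the per-dimension scores alongside the scalar keeps the failing dimension visible, which is one practical advantage of specialized scorers over a single opaque reward.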

Query expansion for e-commerce search: To bridge the gap between user queries and product descriptions, they designed a large-language-model (LLM) based query-expansion framework. The pipeline consists of (1) pre-training on consumer behavior and product data, (2) task-driven fine-tuning with high-quality query-expansion pairs, and (3) reinforcement-learning alignment with a simulated search engine. Offline experiments and online A/B tests show a noticeable conversion-rate boost.
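Stage (3) can be illustrated with a toy reward function: score an expansion by how much it improves retrieval in a simulated search engine. This is only a sketch under stated assumptions. The term-overlap `search` function, the catalog, the relevance labels, and the reward formula (recall gain minus a small penalty for extra irrelevant results) are hypothetical and are not taken from the article.

```python
def tokenize(text):
    """Lowercase whitespace tokenization; a stand-in for a real analyzer."""
    return set(text.lower().split())


def search(query_terms, catalog):
    """Toy search engine: ids of products sharing at least one term with the query."""
    return {pid for pid, desc in catalog.items() if query_terms & tokenize(desc)}


def expansion_reward(query, expansion, catalog, relevant, noise_penalty=0.1):
    """Assumed reward: recall gained on relevant items minus a penalty for added noise."""
    base = search(tokenize(query), catalog)
    expanded = search(tokenize(query) | tokenize(expansion), catalog)
    recall_gain = (len(expanded & relevant) - len(base & relevant)) / len(relevant)
    added_noise = len(expanded - relevant) - len(base - relevant)
    return recall_gain - noise_penalty * added_noise


# Hypothetical catalog: the user's word "earphones" appears in no description.
catalog = {
    1: "wireless bluetooth earbuds",
    2: "noise cancelling headphones",
    3: "usb charging cable",
}
relevant = {1, 2}

helpful = expansion_reward("earphones", "earbuds headphones", catalog, relevant)
harmful = expansion_reward("earphones", "cable", catalog, relevant)
```

In this toy setup the helpful expansion recovers both relevant products and the harmful one only adds an irrelevant result, so a policy trained against such a signal is pushed toward expansions that match the vocabulary of product descriptions.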

Model pruning with FFT and RDP: To reduce the inference cost of text-to-image models, they applied Fast Fourier Transform (FFT) for frequency-domain analysis and the Ramer-Douglas-Peucker (RDP) algorithm to locate redundant transformer blocks. The combined method increased training throughput by 40% without degrading performance. The core RDP implementation is shown below:

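import numpy as np


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker algorithm for simplifying a 2-D curve."""
    points = np.asarray(points, dtype=float)

    def perpendicular_distance(pt, line_start, line_end):
        # Degenerate chord: fall back to the point-to-point distance.
        if np.array_equal(line_start, line_end):
            return np.linalg.norm(pt - line_start)
        # |cross(chord, start - pt)| / |chord|. The 2-D cross product is written
        # out because np.cross on 2-element vectors is deprecated in NumPy 2.0.
        seg = line_end - line_start
        rel = line_start - pt
        return abs(seg[0] * rel[1] - seg[1] * rel[0]) / np.linalg.norm(seg)

    def rdp_recursion(points, epsilon):
        # Find the interior point farthest from the chord between the endpoints.
        dmax = 0.0
        index = 0
        for i in range(1, len(points) - 1):
            d = perpendicular_distance(points[i], points[0], points[-1])
            if d > dmax:
                index = i
                dmax = d
        if dmax > epsilon:
            # Keep that point and simplify each half; the split point appears
            # in both halves, so drop it once when joining.
            results1 = rdp_recursion(points[:index + 1], epsilon)
            results2 = rdp_recursion(points[index:], epsilon)
            return results1[:-1] + results2
        # Every interior point lies within epsilon: keep only the endpoints.
        return [points[0], points[-1]]

    return rdp_recursion(points, epsilon)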
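The article does not say which signal the team analyzed or how the RDP output was turned into a pruning decision, so the following is only a sketch of one plausible way to chain the two techniques. It assumes a hypothetical per-block curve `scores` (for example, a cumulative measure of how much each transformer block changes its input), smooths it with an FFT low-pass filter, simplifies it with RDP, and flags the blocks on nearly flat segments as pruning candidates. The function names, the slope threshold, and the meaning of the curve are all assumptions.

```python
import numpy as np


def lowpass_fft(signal, keep):
    """Keep only the `keep` lowest-frequency components of a real 1-D signal."""
    spectrum = np.fft.rfft(signal)
    spectrum[keep:] = 0
    return np.fft.irfft(spectrum, n=len(signal))


def rdp_indices(xs, ys, epsilon):
    """Indices of the points that RDP keeps on the curve (xs, ys); xs must increase."""
    def recurse(lo, hi):
        if hi - lo < 2:
            return [lo, hi]
        dx, dy = xs[hi] - xs[lo], ys[hi] - ys[lo]
        inner = np.arange(lo + 1, hi)
        # Perpendicular distance of each interior point to the chord lo -> hi.
        dist = np.abs(dx * (ys[lo] - ys[inner]) - dy * (xs[lo] - xs[inner])) / np.hypot(dx, dy)
        k = int(np.argmax(dist))
        if dist[k] <= epsilon:
            return [lo, hi]
        mid = lo + 1 + k
        return recurse(lo, mid)[:-1] + recurse(mid, hi)
    return recurse(0, len(xs) - 1)


def flat_segments(xs, ys, kept, slope_tol):
    """Pairs (start, end) of consecutive kept indices whose segment is nearly flat."""
    return [
        (a, b)
        for a, b in zip(kept[:-1], kept[1:])
        if abs((ys[b] - ys[a]) / (xs[b] - xs[a])) < slope_tol
    ]


def candidate_blocks(scores, keep, epsilon, slope_tol):
    """Assumed pipeline: smooth the per-block curve, simplify it, report flat runs."""
    scores = np.asarray(scores, dtype=float)
    xs = np.arange(len(scores), dtype=float)
    smooth = lowpass_fft(scores, keep)
    kept = rdp_indices(xs, smooth, epsilon)
    return flat_segments(xs, smooth, kept, slope_tol)


# Hypothetical 16-block curve: it rises, plateaus over blocks 4..11, then rises again.
scores = np.concatenate([np.arange(5) / 4, np.ones(6), 1 + np.arange(5) / 4])
segments = candidate_blocks(scores, keep=9, epsilon=0.05, slope_tol=0.05)
```

With `keep=9` all frequency components of a 16-point signal are retained, so the demo isolates the RDP step. A smaller `keep` removes high-frequency noise, but the FFT treats the curve as periodic, so the first and last blocks can be distorted when the endpoints differ.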

They also built an end-to-end ("full-link") evaluation system for agents. It reports both local scores for individual stages and end-to-end scores, broken down by question type, enabling precise diagnosis of model failures.
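A minimal sketch of what such a report could look like is given below. The trace format, the stage names (`planning`, `retrieval`), and the question types are hypothetical. The article only states that local and end-to-end scores are produced per question type.

```python
from collections import defaultdict


def evaluate(traces):
    """Aggregate per-stage (local) and end-to-end accuracy by question type.

    Each trace is assumed to be a dict with keys:
      'question_type' (str), 'stages' (dict of stage name -> bool),
      'final_correct' (bool).
    """
    local = defaultdict(lambda: defaultdict(list))
    end_to_end = defaultdict(list)
    for trace in traces:
        qtype = trace["question_type"]
        end_to_end[qtype].append(trace["final_correct"])
        for stage, ok in trace["stages"].items():
            local[qtype][stage].append(ok)

    report = {}
    for qtype, outcomes in end_to_end.items():
        report[qtype] = {
            "end_to_end": sum(outcomes) / len(outcomes),
            "local": {s: sum(v) / len(v) for s, v in local[qtype].items()},
        }
    return report


# Hypothetical traces: both plans were fine, but one retrieval step failed.
traces = [
    {"question_type": "lookup", "stages": {"planning": True, "retrieval": True}, "final_correct": True},
    {"question_type": "lookup", "stages": {"planning": True, "retrieval": False}, "final_correct": False},
    {"question_type": "compare", "stages": {"planning": False, "retrieval": True}, "final_correct": False},
]
report = evaluate(traces)
```

Reading the local scores next to the end-to-end score shows where a failure originates: in this toy data the `lookup` failures trace back to retrieval, while the `compare` failure traces back to planning.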

Another contribution is a semantic‑driven smart‑assistant that maps user‑specified dimensions to structured product parameters using multi‑turn LLM reasoning and reinforcement learning, leading to higher user engagement and conversion.

Throughout these projects, the engineers emphasize iterative experimentation, cross‑domain knowledge transfer, and continuous learning from papers and open‑source communities.

The article concludes with a recruitment call for JD Retail’s algorithm team, inviting candidates to join a technically deep and commercially impactful environment.

