
Solving Real-World AI Challenges at JD Retail: Reward Model Ensembles, Query Expansion, and Model Pruning

This article recounts how JD Retail's young algorithm engineers tackled diverse AI problems—optimizing reward‑model ensembles for ad image generation, building large‑language‑model‑based query expansion, and pruning diffusion models with FFT and RDP—while sharing their technical approaches, code snippets, and growth reflections.

JD Tech Talk

JD Retail's technology team, made up largely of algorithm engineers born after 1995, has grown quickly by taking on hard AI problems: evaluating advertising images, expanding user search queries, and reducing the size of large diffusion models.

Technical challenge 1: Whether an ad image meets quality standards is highly subjective, and existing reward models cannot guide the generator toward precise improvements. The proposed solution replaces a single large reward model with several small, specialized ones, each focusing on one aspect such as shape, placement, or color. This improves the granularity of the feedback and allows business rules to be integrated flexibly.

The team built a training‑inference framework where the generator creates ad images, the ensemble of reward models provides multidimensional signals, and reinforcement learning fine‑tunes the generator. This pipeline achieved a 98% usable image rate and a 30% recall increase.
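
As a rough illustration of how several specialized reward signals might be combined, the sketch below averages per-dimension scores with weights and lets a business rule act as a veto. The article does not describe the actual aggregation, so the `RewardModel` class, the weights, the feature names, and the 20% area rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An "image" here is a dict of precomputed features; real reward models would
# score pixels. Every name in this sketch is hypothetical.
Image = Dict[str, float]


@dataclass
class RewardModel:
    name: str
    score: Callable[[Image], float]  # expected to return a value in [0, 1]
    weight: float


def ensemble_reward(image: Image,
                    models: List[RewardModel],
                    hard_rules: List[Callable[[Image], bool]]) -> Dict[str, float]:
    """Return per-dimension scores plus a weighted total.

    A failed hard rule zeroes the total, so a business constraint acts as a
    veto instead of being averaged away.
    """
    scores = {m.name: m.score(image) for m in models}
    total_weight = sum(m.weight for m in models)
    total = sum(m.weight * scores[m.name] for m in models) / total_weight
    if not all(rule(image) for rule in hard_rules):
        total = 0.0
    scores["total"] = total
    return scores


models = [
    RewardModel("shape", lambda img: img["shape_ok"], weight=2.0),
    RewardModel("placement", lambda img: img["placement_ok"], weight=1.0),
    RewardModel("color", lambda img: img["color_ok"], weight=1.0),
]
# Assumed business rule: the product must cover at least 20% of the canvas.
rules = [lambda img: img["product_area"] >= 0.2]

good = {"shape_ok": 1.0, "placement_ok": 0.5, "color_ok": 0.5, "product_area": 0.4}
tiny = dict(good, product_area=0.1)

result_good = ensemble_reward(good, models, rules)
result_tiny = ensemble_reward(tiny, models, rules)
print(result_good["total"])  # 0.75
print(result_tiny["total"])  # 0.0, vetoed by the area rule
```

Keeping the per-dimension scores in the result, not just the total, is what would let a fine-tuning loop see which aspect of an image fell short.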

Technical challenge 2: Traditional query‑expansion models struggle with novel user intents, leading to poor product recall. The engineers adopted a large‑language‑model (LLM) approach enhanced with reinforcement learning from human feedback (RLHF) to create a three‑stage training pipeline: e‑commerce pre‑training, task‑specific fine‑tuning, and RL‑based alignment with a search‑engine simulator. The result was a significant boost in conversion rates.
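
To make the idea of aligning against a search-engine simulator more concrete, here is a minimal sketch of a reward function for a rewritten query. The article does not specify the simulator or the reward, so the toy catalog, the token-overlap matching in `simulate_search`, the relevance labels, and the "gained minus noise" reward are assumptions for illustration only.

```python
from typing import Dict, Set

# Hypothetical toy catalog: product id -> title tokens.
CATALOG: Dict[str, Set[str]] = {
    "p1": {"wireless", "earbuds", "bluetooth"},
    "p2": {"bluetooth", "headphones", "noise", "cancelling"},
    "p3": {"wired", "earphones"},
    "p4": {"phone", "case"},
}


def simulate_search(query: str) -> Set[str]:
    """Stand-in for a search-engine simulator: match on any shared token."""
    tokens = set(query.lower().split())
    return {pid for pid, title in CATALOG.items() if tokens & title}


def rewrite_reward(original: str, rewrite: str, relevant: Set[str]) -> float:
    """Newly recalled relevant items minus newly recalled irrelevant ones."""
    base = simulate_search(original)
    new_hits = simulate_search(rewrite) - base
    gained = len(new_hits & relevant)
    noise = len(new_hits - relevant)
    return float(gained - noise)


original = "airpods"     # matches nothing in the toy catalog
relevant = {"p1", "p2"}  # assumed relevance labels for this query
candidates = ["wireless earbuds", "bluetooth headphones", "phone case"]

rewards = {c: rewrite_reward(original, c, relevant) for c in candidates}
best = max(rewards, key=rewards.get)
print(rewards)
print(best)
```

In an RL setup, a scalar like this would be the signal used to update the rewriting model; here it only ranks three hand-written candidates.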

Technical challenge 3: High‑capacity text‑to‑image models consume excessive compute in e‑commerce settings. By applying frequency‑domain analysis (FFT) to detect redundant components and the Ramer‑Douglas‑Peucker (RDP) algorithm to locate critical points in the spectrum, the team pruned unnecessary blocks, increasing training throughput by 40% without harming performance.
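
The two building blocks named above can be sketched in plain Python: a radix-2 FFT to obtain a magnitude spectrum, and RDP to keep only the points where that spectrum bends sharply. The input signal, the `epsilon` value, and the helper names are made up for illustration; how the team derives the signal from a diffusion model and maps critical points to prunable blocks is not stated in the article and is not attempted here.

```python
import cmath
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fft(x: List[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def point_line_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (2-D cross product form)."""
    if a == b:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    cross = (b[0] - a[0]) * (a[1] - p[1]) - (a[0] - p[0]) * (b[1] - a[1])
    return abs(cross) / math.hypot(b[0] - a[0], b[1] - a[1])


def rdp(points: List[Point], epsilon: float) -> List[Point]:
    """Ramer-Douglas-Peucker: keep points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax > epsilon:
        left = rdp(points[: idx + 1], epsilon)
        right = rdp(points[idx:], epsilon)
        return left[:-1] + right  # the split point appears once
    return [points[0], points[-1]]


# Synthetic signal: two sinusoids at frequency bins 2 and 5, length 32.
n = 32
signal = [math.sin(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 5 * t / n)
          for t in range(n)]
spectrum = fft(signal)
magnitudes = [abs(spectrum[k]) for k in range(n // 2)]

points = [(float(k), magnitudes[k]) for k in range(n // 2)]
kept_bins = [int(k) for k, _ in rdp(points, epsilon=0.5)]
print(kept_bins)
```

On this input the flat tail of the spectrum collapses to its endpoints, while the two peaks and their immediate neighbours survive, which is the sense in which RDP locates critical points on a curve.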

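Two code snippets from this work are shown below: an RDP curve-simplification routine and a helper that reads a token's next-token probability from a language model.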

    import numpy as np

    def rdp(points, epsilon):
        """
        Ramer-Douglas-Peucker algorithm for curve simplification.
        points: sequence of points on the curve (NumPy arrays)
        epsilon: tolerance, larger values simplify more
        """
        def perpendicular_distance(pt, line_start, line_end):
            # distance from pt to the line through line_start and line_end
            if np.array_equal(line_start, line_end):
                return np.linalg.norm(pt - line_start)
            else:
                return np.abs(np.cross(line_end - line_start, line_start - pt)) / np.linalg.norm(line_end - line_start)

        def rdp_recursion(points, epsilon):
            # recursive RDP, find farthest point
            ...

        return rdp_recursion(points, epsilon)

    import torch
    import torch.nn.functional as F

    # `tokenizer` and `model` are assumed to be loaded elsewhere.
    def get_token_prob(prompt, target_token):
        # encode the prompt and the target
        inputs = tokenizer(prompt, return_tensors="pt")
        target_ids = tokenizer.encode(target_token, add_special_tokens=False)
        # obtain model logits without tracking gradients
        with torch.no_grad():
            outputs = model(**inputs)
        # logits at the last position predict the next token
        next_token_logits = outputs.logits[:, -1, :]
        # convert to probability distribution
        probs = F.softmax(next_token_logits, dim=-1)
        # probability of the target; only its first sub-token is scored
        return probs[0, target_ids[0]].item()
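
Note that get_token_prob looks up only the first sub-token of target_token. If a target splits into several tokens, one possible extension is to multiply the per-step probabilities (the chain rule). The sketch below shows that arithmetic on made-up logits with no real model involved; `sequence_prob` and the logit values are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_prob(step_logits: List[List[float]], target_ids: List[int]) -> float:
    """P(target) as the product of P(target_ids[i] | earlier tokens).

    step_logits[i] is assumed to be the logit vector at the position that
    predicts target_ids[i], i.e. after the prompt plus target_ids[:i].
    """
    if len(step_logits) != len(target_ids):
        raise ValueError("need one logit vector per target token")
    log_prob = 0.0
    for logits, token_id in zip(step_logits, target_ids):
        log_prob += math.log(softmax(logits)[token_id])
    return math.exp(log_prob)


# Made-up logits over a 3-token vocabulary, two prediction steps.
step_logits = [[0.0, 0.0, 0.0], [math.log(3.0), 0.0, 0.0]]
first_only = softmax(step_logits[0])[1]     # what a first-token lookup sees: 1/3
whole = sequence_prob(step_logits, [1, 0])  # 1/3 * 3/5 = 0.2
print(first_only, whole)
```

Summing log-probabilities instead of multiplying raw probabilities keeps the computation stable when targets are long.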

Across all projects, the engineers emphasize systematic problem framing, iterative experimentation, continuous learning from top‑conference papers, and building reusable methodologies that accelerate both personal and team growth.
