
Two‑Stage Ranking Optimization in E‑commerce Search: From Coarse to Fine Ranking

The paper presents a two‑stage e‑commerce search framework where the coarse‑ranking stage is redesigned with multi‑objective optimization, expanded negative sampling, and listwise distillation—guided by a new global transaction hitrate metric—enabling it to surpass fine‑ranking on large candidate sets and boost overall GMV by about one percent.

DaTaobao Tech

The document describes a two‑stage ranking framework (coarse ranking → fine ranking) used in a large‑scale e‑commerce search system. It explains how the coarse stage, originally a performance‑driven fallback for fine ranking, can be enhanced to outperform fine ranking on large candidate sets by revisiting its objectives and introducing a new evaluation metric called global transaction hitrate.

In the typical multi‑stage pipeline, recall generates ~10^5 candidates, coarse ranking filters them down to ~10^3, and fine ranking selects the final top‑10 items for exposure. The coarse stage focuses on the “mid‑tail” items, while fine ranking emphasizes the head items. The mismatch between their goals leads to two main problems: (1) the optimization targets of coarse ranking differ from those of fine ranking, and (2) the scoring space of coarse ranking (the full ~10^5 recall set) does not align with the much smaller exposed space that fine ranking is trained on, so their scores are not directly comparable.
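The funnel described above can be sketched as follows. This is a minimal illustration using the candidate counts from the text; the function names and scoring callables are placeholders, not the production system:

```python
def top_k(items, score_fn, k):
    """Keep the k highest-scoring items under the given scoring function."""
    return sorted(items, key=score_fn, reverse=True)[:k]

def search_pipeline(recalled, coarse_score, fine_score):
    """Two-stage ranking over an already-recalled candidate set (~10^5 items):
    coarse ranking cuts to ~10^3, fine ranking picks the top-10 for exposure."""
    coarse_top = top_k(recalled, coarse_score, k=1000)
    return top_k(coarse_top, fine_score, k=10)
```

In practice each stage would use a learned model rather than a plain score function, but the shape of the cascade is the same.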

To address these issues, the team introduced multi‑objective optimization, negative‑sample expansion, and listwise distillation, achieving roughly a 1.0% increase in overall GMV. A new metric, global transaction hitrate, measures both in‑search (scene‑internal) and out‑of‑search (scene‑external) transactions, allowing a less biased assessment of coarse‑ranking performance.

The model architecture follows the industry‑standard inner‑product two‑tower design: user‑query and item vectors are computed separately and their dot product yields the relevance score. Training samples consist of three parts—exposed items, unexposed items, and random negatives—organized in a listwise fashion. The loss combines three objectives (exposure, click, transaction) and a distillation term that forces the coarse model to mimic fine‑ranking scores on exposed samples.
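The architecture and loss described above can be sketched as below. This is a minimal NumPy sketch, not the production implementation: the function names, objective weights, and the exact form of the distillation term (KL divergence toward the fine‑ranking teacher, restricted to exposed items) are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def two_tower_score(user_query_vec, item_vecs):
    """Inner-product two-tower scoring: one dot product per candidate item."""
    return item_vecs @ user_query_vec

def listwise_loss(scores, exposure, click, txn, teacher_scores, exposed_mask,
                  weights=(1.0, 1.0, 1.0), alpha=1.0):
    """Toy listwise objective over one training list: three softmax
    cross-entropy terms (exposure / click / transaction) plus a distillation
    term that pulls the coarse model's score distribution toward the
    fine-ranking teacher on exposed items only."""
    p = softmax(scores)
    loss = 0.0
    for w, label in zip(weights, (exposure, click, txn)):
        if label.sum() > 0:  # skip objectives with no positives in this list
            target = label / label.sum()
            loss += w * -(target * np.log(p + 1e-12)).sum()
    if exposed_mask.any():  # KL(teacher || student) over exposed items
        t = softmax(teacher_scores[exposed_mask])
        s = softmax(scores[exposed_mask])
        loss += alpha * (t * np.log((t + 1e-12) / (s + 1e-12))).sum()
    return loss
```

A real system would compute the two towers with neural encoders and batch the lists; here the vectors stand in for their outputs.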

Several optimization directions are explored:

Extending distillation to include unexposed samples, with careful label‑smoothing to avoid zero‑gradient issues.

Aligning coarse‑ranking features with fine‑ranking features by adding user profile and long‑term transaction sequence features, yielding a +0.4 pt lift in offline hitrate.

Introducing cross‑tower features to capture user‑item interactions; offline experiments show +0.2 pt hitrate, though online gains are limited due to latency.

Replacing the inner‑product with shallow MLPs; experiments indicate no significant improvement and increased latency.

Revising the listwise softmax loss to focus on separating each positive sample from all negatives, which improves NDCG by +0.63 pt and out‑of‑search hitrate by ~0.2%.

Adjusting random‑negative sampling using a Word2Vec‑style frequency‑based distribution, which raises low‑exposure item hitrate by +1.04 pt while slightly harming high‑exposure items.

Extensive offline experiments and online A/B tests confirm that these enhancements collectively raise coarse ranking's contribution to overall transaction volume by about 1.0%. The analysis also highlights that reducing the misalignment between coarse ranking and fine ranking, as well as between recall and coarse ranking, is essential for consistent online impact.

In summary, by redefining evaluation metrics, refining sample construction, improving loss functions, and aligning feature spaces, the coarse‑ranking stage can be transformed from a simple speed‑optimizing filter into a powerful component that meaningfully boosts e‑commerce search performance.

machine learning, metrics, e-commerce, coarse ranking, fine ranking, search ranking
Written by DaTaobao Tech, the official account of DaTaobao Technology.