iQIYI Search Ranking Algorithm Practice – NLP and Search Integration
At iQIYI’s iTech Conference, Zhang Zhigang detailed a full‑stack search ranking system that combines NLP‑driven query analysis, hierarchical indexing, multi‑stage coarse‑to‑fine ranking, Transformer‑based re‑ranking, sparse‑feature DNN enhancements, and LIME/SE‑Block explainability, delivering measurable gains in CTR and NDCG for the platform’s video search.
On the afternoon of July 3, iQIYI's technology product team hosted an offline technical salon titled “iTech Conference” with the theme “NLP and Search”. Experts from ByteDance, Qunar, and Tencent were invited to discuss the synergy between NLP and search, and iQIYI’s own expert Zhang Zhigang presented the “iQIYI Search Ranking Algorithm Practice”.
Bonus: Follow the public account and reply with the keyword “NLP” to receive the full PPT and video recordings of the guest talks.
The presentation covered the following main sections:
1. Background – iQIYI Search Scenario
iQIYI’s search covers multiple product lines (main app, Quick Play, Lite version, TV, etc.), business forms (comprehensive search and vertical searches), and data types (albums, short videos, iQIYI accounts). The result page displays a mixture of long videos, short videos, novels, etc., and the top tab bar separates comprehensive, film, short‑video, and other vertical searches.
2. Business Optimization Goals
Four major objectives: improve search efficiency (Session CTR, UCTR, second‑search rate), promote user consumption (play duration, click count, interaction count), enrich content ecosystem, and address new‑hot diversity (cold start, timeliness, diversity).
3. Overall Architecture
The architecture consists of a user‑side request interface, a comprehensive scheduling module (handling URL reception and business logic), query analysis (segmentation, correction, intent recognition), an operation‑configuration module for manual overrides, a merge‑re‑rank module (deduplication, re‑ranking, policy adjustment), and a prediction service built on a Lambda architecture with both full‑index and real‑time index pipelines.
4. Algorithm Strategy Framework
The strategy is divided into three parts: indexing (hierarchical and shard processing), base retrieval (inverted index, vector retrieval such as DSSM and BERT), and upper‑layer ranking (coarse ranking, fine ranking, re‑ranking). Ranking models include Triplet Loss, contrastive loss, and channel‑wise classification loss, combined linearly.
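The linear combination of ranking losses mentioned above can be sketched as follows. This is a minimal illustration, not iQIYI’s implementation: the margin, weights, and Euclidean distance are all assumptions, and the channel‑wise classification term is omitted for brevity.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)

def contrastive_loss(a, b, same, margin=1.0):
    """Pull matched pairs together; push mismatched pairs beyond the margin."""
    d = np.linalg.norm(a - b)
    return d ** 2 if same else max(0.0, margin - d) ** 2

def combined_loss(anchor, positive, negative, w_triplet=1.0, w_contrastive=0.5):
    """Linear combination of the two losses; weights are illustrative."""
    return (w_triplet * triplet_loss(anchor, positive, negative)
            + w_contrastive * contrastive_loss(anchor, positive, same=True))
```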
5. Ranking Process
Search corpus reaches billions; recall yields ~100k candidates, coarse ranking reduces to thousands, fine ranking to hundreds, and final re‑ranking to tens before UI display.
STEP1 – Recall Trigger Selection
Queries are segmented; term translation models are built using high‑frequency Query‑Doc title pairs, followed by alignment, feature extraction (translation probability, segmentation features, character vectors), and a binary quality model to filter high‑quality term pairs.
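The final filtering step can be thought of as a binary classifier over per‑pair features. The sketch below uses a plain logistic scorer with hypothetical feature names and weights; the talk does not specify the model family for this quality filter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def term_pair_quality(features, weights, bias, threshold=0.5):
    """Binary quality model over a (source term, rewrite term) pair.

    `features` might hold translation probability, segmentation consistency,
    and character-vector similarity (hypothetical layout). The pair is kept
    only if the predicted probability of being high quality clears the threshold.
    """
    return sigmoid(np.dot(features, weights) + bias) > threshold
```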
STEP2 – Coarse Ranking
Tree‑based models select top‑K candidates and negative samples, emphasizing quality scores, relevance, and timeliness.
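Functionally, coarse ranking reduces the candidate set to a top‑K by a cheap score. The talk uses tree‑based models for this score; the weighted blend below is only a stand‑in, with illustrative weights and field names.

```python
import heapq

def coarse_rank(candidates, k, w_quality=0.4, w_rel=0.4, w_time=0.2):
    """Keep the top-k docs by a blend of quality, relevance, and timeliness.

    Each candidate is a dict with `quality`, `relevance`, `timeliness`,
    and `doc_id` keys (hypothetical schema). A heap avoids a full sort
    when the candidate pool is much larger than k.
    """
    scored = ((w_quality * c["quality"] + w_rel * c["relevance"]
               + w_time * c["timeliness"], c["doc_id"])
              for c in candidates)
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]
```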
STEP3 – Fine Ranking
Learning‑to‑Rank approaches (point‑wise, pair‑wise, list‑wise) are discussed. Pair‑wise models such as RankNet use a cross‑entropy loss over document pairs; list‑wise extensions incorporate NDCG gradients. LambdaMART is adopted for its efficiency, and it supports feature combination and warm‑starting.
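The pair‑wise and list‑wise losses above can be written down concretely. The RankNet loss is the standard cross‑entropy over a preferred pair (s_i, s_j); the lambda term scales its gradient by |ΔNDCG|, as in LambdaRank/LambdaMART. This is a textbook sketch, not iQIYI’s exact formulation.

```python
import math

def ranknet_loss(s_i, s_j):
    """Pairwise cross-entropy: doc i is labelled more relevant than doc j.
    Equals -log(sigmoid(s_i - s_j)); shrinks as the score gap widens."""
    return math.log(1.0 + math.exp(-(s_i - s_j)))

def lambda_ij(s_i, s_j, delta_ndcg):
    """LambdaRank gradient for the pair: the RankNet gradient scaled by
    |ΔNDCG|, the NDCG change from swapping docs i and j in the ranking."""
    return -abs(delta_ndcg) / (1.0 + math.exp(s_i - s_j))
```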
To address high‑dimensional sparse features and billion‑scale data, a DNN upgrade is introduced, adding sparse features (query segmentation, intent tags, actor, content tags, clicked‑query, etc.) and a DIN‑style user interest model. ID features (concatenated QueryID and DocID) improve memorization of Top‑K results for head queries.
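The DIN‑style interest model can be sketched as candidate‑conditioned attention over the user’s behaviour history. The bilinear interaction and softmax normalization below are simplifying assumptions (the original DIN uses a small MLP for the attention unit and unnormalized weights).

```python
import numpy as np

def din_interest(user_hist, candidate, W):
    """DIN-style pooling: weight each historical behaviour embedding by its
    learned affinity to the candidate doc, then sum-pool into one vector.

    user_hist: (n_behaviours, d) embeddings of past clicks/plays
    candidate: (d,) embedding of the doc being scored
    W:         (d, d) interaction matrix (hypothetical parameterization)
    """
    logits = user_hist @ W @ candidate          # affinity of each behaviour
    weights = np.exp(logits - logits.max())     # softmax (an assumption here)
    weights /= weights.sum()
    return weights @ user_hist                  # candidate-aware user vector
```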
STEP4 – Re‑ranking (SE‑Rank)
SE‑Rank adds context‑aware re‑ranking using a Transformer‑based module that processes all candidate Docs jointly, followed by pooling and a final fully‑connected layer. This improves discrimination among candidates and yields significant NDCG gains.
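The key idea — scoring each doc with attention over the whole candidate list — can be sketched as a single self‑attention layer followed by a per‑doc linear head. Single‑head attention and the weight shapes are simplifications; SE‑Rank’s actual Transformer stack, pooling, and FC layers are not specified at this level of detail in the talk.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def listwise_rerank(doc_feats, Wq, Wk, Wv, w_out):
    """Context-aware re-ranking sketch: self-attention over ALL candidates
    at once, so each doc's score can depend on the rest of the list.

    doc_feats: (n_docs, d) fine-ranking features per candidate
    Wq/Wk/Wv:  (d, d) projection matrices; w_out: (d,) scoring head
    """
    Q, K, V = doc_feats @ Wq, doc_feats @ Wk, doc_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]))   # (n_docs, n_docs)
    ctx = attn @ V                                  # mixes info across docs
    return ctx @ w_out                              # one score per candidate
```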
STEP5 – Explainability
LIME is applied to dense features at the Query‑Doc pair level, perturbing features to train a local linear model and obtain feature importance. For sparse features, a simplified SE‑Block injects learned weights into embeddings, providing clear importance scores. Both methods improve model interpretability and offline AUC.
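The dense‑feature LIME procedure reduces to: perturb around the instance, query the black‑box model, and fit a local linear surrogate. The Gaussian perturbation scale and the intercept‑free least‑squares fit below are simplifying assumptions of this sketch.

```python
import numpy as np

def lime_importance(model, x, n_samples=500, scale=0.1, seed=0):
    """Local feature importance for one Query-Doc pair's dense features.

    Perturbs the feature vector x, scores the perturbations with the
    black-box `model`, and fits a local linear surrogate; its coefficients
    serve as per-feature importances around x.
    """
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = np.array([model(row) for row in X])
    # least-squares fit on centred data (intercept dropped for brevity)
    coef, *_ = np.linalg.lstsq(X - x, y - model(x), rcond=None)
    return coef
```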
Overall, the talk presented a comprehensive pipeline from query processing, term translation, multi‑stage ranking, DNN upgrades, Top‑K optimizations for hot and long‑tail queries, position‑bias mitigation, and model explainability, demonstrating measurable improvements in click‑through rate and NDCG for iQIYI’s video search service.
iQIYI Technical Product Team