
Iterative Evolution of iQIYI Video Search Ranking Models

This article details iQIYI's practical experience in building and iterating its video search system, covering basic relevance, semantic matching via translation and click models, deep‑learning approaches, and ranking model evolution from heuristic rules to learning‑to‑rank, highlighting challenges, solutions, and performance gains.

DataFunTalk

Abstract

This talk shares real cases and specific problems encountered by iQIYI when building video search, along with the solutions applied. The presentation is aimed at front‑line developers and provides practical insights.

1. Introduction

iQIYI supports multiple search entry points in its app, including image search, subtitle search, and voice search. The core of the system is text query search, which relies heavily on natural language processing and semantic understanding.

At the video content level, the most important signals are the video's textual metadata (title, cast, etc.) and user behavior signals (searches, views, comments, and danmu bullet comments). Matching is performed mainly at the document and query levels.

2. System Constraints

The overall search system must satisfy five constraints:

Precise matching – return results that exactly match the query and rank them at the top.

Content ecosystem – cover all Chinese video resources, including non‑copyrighted videos, and also support literature and comics.

Intelligent distribution – incentivize original content and prevent low‑quality content from dominating.

Cold start – give new videos a chance to be discovered despite weaker features.

Diversity – avoid showing overly similar results at the top.

These constraints guide the design of the full‑stack search architecture shown in the diagram.

3. Recall Strategy Iteration

3.1 Basic Relevance

Traditional inverted‑index based exact matching is used to retrieve candidate videos. The process includes tokenizing the user query, building an inverted index on video metadata, and performing classic term‑matching retrieval.

Key issues include token granularity and term weighting, which affect recall quality.
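The retrieval step above can be sketched as a minimal inverted index with union-based term matching. The tokenizer, corpus, and scoring are toy stand-ins (a production system would use a Chinese word segmenter with multiple granularities and term weighting):

```python
from collections import defaultdict

def tokenize(text):
    # Toy whitespace tokenizer; real systems use a Chinese word
    # segmenter producing tokens at several granularities.
    return text.lower().split()

def build_index(docs):
    # Map each term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def retrieve(index, query):
    # Classic term matching: union of the postings for each query term.
    hits = set()
    for term in tokenize(query):
        hits |= index.get(term, set())
    return hits

docs = {1: "palace drama episode 1",
        2: "variety show highlights",
        3: "palace drama trailer"}
index = build_index(docs)
print(retrieve(index, "palace drama"))  # {1, 3}
```

Candidates retrieved this way are then scored and ranked by the downstream relevance and ranking models.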

3.2 Semantic Relevance

3.2.1 Translation Model

To expand user queries, a translation model is built from click logs (query‑document pairs). The pipeline consists of three steps: (1) generate parallel corpora from click data, (2) perform word/phrase alignment, (3) filter noisy pairs using statistical and manually labeled ground truth. The model’s translation probabilities are combined with language model scores and relevance features to select effective expansion terms.
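The final selection step can be illustrated with a log-linear combination of translation and language-model evidence. The probability tables and the mixing weight `alpha` below are invented for illustration; in practice the translation table comes from the aligned click-log corpus and the scores are combined with further relevance features:

```python
import math

# Hypothetical translation table learned from aligned click-log pairs:
# P(expansion_term | query_term). Values are illustrative only.
trans_prob = {
    ("film", "movie"): 0.42,
    ("film", "cinema"): 0.08,
}

# Toy unigram language-model probabilities over the document corpus.
lm_prob = {"movie": 0.01, "cinema": 0.002}

def expansion_score(query_term, candidate, alpha=0.7):
    # Log-linear mix of translation and LM evidence; alpha is an
    # assumed weight, with a small floor to avoid log(0).
    tp = trans_prob.get((query_term, candidate), 1e-9)
    lp = lm_prob.get(candidate, 1e-9)
    return alpha * math.log(tp) + (1 - alpha) * math.log(lp)

def best_expansion(query_term, candidates):
    return max(candidates, key=lambda c: expansion_score(query_term, c))

print(best_expansion("film", ["movie", "cinema"]))  # movie
```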

3.2.2 Click‑Based Relevance

A bipartite graph of queries and documents is constructed from click logs. By propagating relevance through the graph, additional query terms are discovered, helping to mitigate cold‑start and vocabulary gaps.
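A minimal sketch of this idea, assuming a toy click log: a two-hop walk on the query–document bipartite graph surfaces related queries, weighted by click counts (the data and scoring are illustrative, not iQIYI's actual propagation scheme):

```python
from collections import defaultdict

# Click log as (query, doc, clicks); toy data.
clicks = [
    ("palace drama", "doc_a", 10),
    ("story of yanxi", "doc_a", 8),
    ("story of yanxi", "doc_b", 5),
]

def related_queries(log, query):
    # Two-hop walk on the query-doc bipartite graph:
    # query -> clicked docs -> other queries that clicked those docs.
    q2d, d2q = defaultdict(dict), defaultdict(dict)
    for q, d, c in log:
        q2d[q][d] = c
        d2q[d][q] = c
    scores = defaultdict(float)
    for d, c1 in q2d[query].items():
        for q2, c2 in d2q[d].items():
            if q2 != query:
                scores[q2] += c1 * c2
    return dict(scores)

print(related_queries(clicks, "palace drama"))  # {'story of yanxi': 80.0}
```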

3.2.3 Deep Learning

Embedding‑based semantic matching is employed using both expression‑based (e.g., DSSM) and interaction‑based architectures. Multi‑granularity tokenization, weighted averaging of embeddings, and fully‑connected layers generate relevance scores that are optimized with a pairwise/listwise loss.
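The expression-based side can be sketched as follows: average token embeddings (weighted, standing in for term-importance weights such as IDF) and score with cosine similarity. The 3-dimensional embeddings are toy values; real models learn hundreds of dimensions and add fully-connected layers before scoring:

```python
import math

# Toy 3-d token embeddings; purely illustrative values.
emb = {
    "palace":  [0.9, 0.1, 0.0],
    "drama":   [0.7, 0.3, 0.1],
    "cooking": [0.0, 0.9, 0.4],
}

def encode(tokens, weights=None):
    # Weighted average of token embeddings; uniform weights by default.
    weights = weights or [1.0] * len(tokens)
    dim = len(next(iter(emb.values())))
    vec = [0.0] * dim
    for t, w in zip(tokens, weights):
        for i, v in enumerate(emb.get(t, [0.0] * dim)):
            vec[i] += w * v
    total = sum(weights)
    return [v / total for v in vec]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

q = encode(["palace", "drama"])
print(cosine(q, encode(["drama"])) > cosine(q, encode(["cooking"])))  # True
```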

4. Ranking Strategy Iteration

After recall, the system must rank the candidates. The evolution follows three stages:

Heuristic rule‑based ranking.

Learning‑to‑rank using point‑wise, pair‑wise, and list‑wise objectives that directly optimize metrics such as NDCG.

Deep neural network (DNN) ranking that fuses dense embedding features with high‑dimensional sparse features.
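The list-wise metric driving stages two and three is NDCG, which can be computed as below. The graded relevance labels are illustrative:

```python
import math

def dcg(gains):
    # Discounted cumulative gain: position i is discounted by log2(i + 2).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(relevances, k=None):
    # NDCG: DCG of the predicted order, normalised by the ideal order.
    r = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)[:len(r)]
    idcg = dcg(ideal)
    return dcg(r) / idcg if idcg > 0 else 0.0

# Graded relevance of results in the order the model ranked them.
print(round(ndcg([3, 2, 0, 1]), 3))  # 0.985
```

A perfectly ordered list scores 1.0; swapping relevant results down the list lowers the score, which is what the list-wise objective penalizes.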

Features include query intent, video quality metrics, relevance scores (BM25, click‑through rate), and post‑click signals (watch time, satisfaction level). The DNN model concatenates query and document embeddings, adds their dot‑product and the dense features, and passes the result through three fully‑connected layers with a sigmoid output. The list‑wise loss directly optimizes NDCG.
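The forward pass described above can be sketched with NumPy. The layer sizes, ReLU activations, and random initialization are assumptions for shape illustration only; the real model's dimensions and training procedure are not given in the talk:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def score(query_emb, doc_emb, dense_feats, params):
    # Concatenate the two embeddings, their dot-product, and the dense
    # features; apply fully-connected ReLU layers, then a sigmoid output.
    x = np.concatenate([query_emb, doc_emb,
                        [query_emb @ doc_emb], dense_feats])
    for W, b in params[:-1]:
        x = np.maximum(W @ x + b, 0.0)       # ReLU hidden layers (assumed)
    W, b = params[-1]
    z = (W @ x + b).item()                   # scalar logit
    return 1.0 / (1.0 + math.exp(-z))        # sigmoid score in (0, 1)

# Randomly initialised layers, purely to demonstrate the shapes:
# input = 2 embeddings (8-d each) + 1 dot-product + 4 dense features.
dims = [2 * 8 + 1 + 4, 16, 8, 1]
params = [(rng.normal(size=(o, i)) * 0.1, np.zeros(o))
          for i, o in zip(dims[:-1], dims[1:])]
s = score(rng.normal(size=8), rng.normal(size=8), rng.normal(size=4), params)
print(0.0 < s < 1.0)  # True
```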

Model fusion experiments (LR + GBDT, GBDT + LR) were conducted but showed limited gains. The final DNN ranking model achieved significant improvements, such as reduced second‑search rate.
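The GBDT + LR fusion referred to above is commonly implemented by encoding each sample's GBDT leaf indices as one-hot blocks and feeding the concatenated sparse vector to a logistic regression. A minimal sketch of that encoding step (tree shapes and leaf indices are invented for illustration):

```python
def leaf_one_hot(leaf_indices, n_leaves_per_tree):
    # Each tree's leaf index becomes a one-hot block; the concatenated
    # sparse vector is the feature input to the downstream LR model.
    vec = []
    for leaf, n in zip(leaf_indices, n_leaves_per_tree):
        block = [0] * n
        block[leaf] = 1
        vec.extend(block)
    return vec

# Two trees with 4 and 3 leaves; a sample lands in leaf 2 and leaf 0.
print(leaf_one_hot([2, 0], [4, 3]))  # [0, 0, 1, 0, 1, 0, 0]
```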

5. Summary

iQIYI’s search engine has progressed along two parallel tracks: improving relevance (basic, semantic, knowledge‑graph) and evolving ranking models (heuristic → learning‑to‑rank → DNN). Ongoing work includes handling sparse‑dense feature fusion and exploring reinforcement‑learning for cold‑start scenarios.

Tags: machine learning, deep learning, information retrieval, search ranking, semantic matching, video recommendation
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
