AI-Powered Search in iQIYI: Techniques, Architecture, and Implementation
iQIYI’s AI‑powered search goes beyond title‑only queries to handle fuzzy role, plot, star, award, and semantic searches. It combines Chain‑of‑Thought‑generated TIPS with Retrieval‑Augmented Generation — indexing, chunking, embedding, reranking, and prompt engineering — to deliver personalized, accurate video recommendations that boost user engagement.
iQIYI has fully deployed AI technology in its search engine, moving beyond traditional title‑only queries to support fuzzy searches across five scenarios: role search, plot search, star search, award search, and semantic search. By accurately interpreting vague queries, the system can recommend relevant movies or series through concise TIPS and personalized content suggestions.
The TIPS (top‑line information) displayed at the top of each result is generated using Chain‑of‑Thought (CoT) prompting. The process first extracts salient highlights from extensive film metadata (e.g., director, cast, adaptation, plot) and then composes an attractive tip. An example with the film "手机" shows how CoT transforms raw highlights into a succinct recommendation.
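The two‑step CoT flow (extract highlights, then compose the tip) can be sketched as a prompt builder. The function name, metadata fields, and prompt wording below are illustrative assumptions, not iQIYI’s actual templates:

```python
# Sketch of a two-step Chain-of-Thought prompt for TIPS generation.
# Field names and wording are hypothetical, not iQIYI's production templates.

def build_tips_prompt(metadata: dict) -> str:
    """Compose a CoT prompt: first extract highlights, then write the tip."""
    facts = "\n".join(f"- {k}: {v}" for k, v in metadata.items())
    return (
        "You are a video search assistant.\n"
        "Film metadata:\n"
        f"{facts}\n"
        "Step 1: List the most salient highlights (director, cast, adaptation, plot).\n"
        "Step 2: Using only those highlights, write one concise, attractive TIP "
        "to display above the search results."
    )

prompt = build_tips_prompt({
    "title": "手机",
    "director": "冯小刚",
    "cast": "葛优, 张国立",
    "plot": "A talk-show host's life unravels after his phone exposes his secrets.",
})
```

Forcing the model through the explicit highlight‑extraction step before composition is what distinguishes CoT prompting from asking for the tip directly.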
For complex semantic queries, iQIYI adopts Retrieval‑Augmented Generation (RAG). Unlike conventional search (query → index → ranked results) or direct LLM answering (query → model → answer), RAG first retrieves relevant documents, feeds them to a generative model, and then produces the final response, improving accuracy and relevance.
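The retrieve‑then‑generate flow can be sketched as a minimal pipeline. The toy word‑overlap retriever and the prompt‑formatting `generate` stand in for a real vector index and LLM call; both are assumptions for illustration:

```python
# Minimal retrieve-then-generate skeleton illustrating the RAG flow.
# `retrieve` and `generate` are hypothetical stand-ins for a vector index
# and an LLM; a production system would call real services here.

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:top_k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: formats the grounded prompt it would receive."""
    ctx = "\n".join(context)
    return f"Context:\n{ctx}\nQuestion: {query}\nAnswer:"

corpus = [
    "A comedy film about a mobile phone and the secrets it exposes.",
    "A historical drama set in the Tang dynasty.",
]
query = "film about a mobile phone"
answer_prompt = generate(query, retrieve(query, corpus))
```

The key difference from direct LLM answering is visible in `answer_prompt`: the model is conditioned on retrieved documents rather than answering from parametric memory alone.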
The RAG pipeline consists of three main steps: index construction, retrieval, and generation. Index construction involves data preprocessing, cleaning, and metadata extraction, followed by chunking and embedding. Chunk size is a trade‑off between precision and efficiency; iQIYI experiments with fixed, adaptive, and Late Chunking (embedding first, then chunking) to balance these factors. Embedding models evaluated include OpenAI (text‑embedding‑ada‑002, text‑embedding‑3‑small/large), Google (LaBSE, MURIL), Cohere (1024‑dim), and BGE (BAAI General Embedding), with domain‑fine‑tuned models showing superior performance on video‑specific tasks.
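The fixed‑versus‑Late‑Chunking trade‑off can be illustrated with a toy contextual embedder. Real Late Chunking [2] uses a long‑context embedding model over the whole document before pooling per chunk; the one‑dimensional token "embeddings" below are a simplifying assumption:

```python
# Sketch contrasting fixed chunking with Late Chunking (embed first, then chunk).
# Token "embeddings" are toy scalars whose value depends on surrounding context.

def embed_tokens(tokens: list[str]) -> list[float]:
    """Toy contextual embedding: each token's value depends on the full input."""
    context_signal = sum(len(t) for t in tokens) / len(tokens)
    return [len(t) + context_signal for t in tokens]

def fixed_chunking(tokens: list[str], size: int) -> list[float]:
    """Embed each chunk independently: no cross-chunk context is shared."""
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    return [sum(embed_tokens(c)) / len(c) for c in chunks]

def late_chunking(tokens: list[str], size: int) -> list[float]:
    """Embed the whole document first, then mean-pool token vectors per chunk."""
    vecs = embed_tokens(tokens)  # full-document context
    chunks = [vecs[i:i + size] for i in range(0, len(vecs), size)]
    return [sum(c) / len(c) for c in chunks]

tokens = "the detective follows a trail of phone records".split()
fixed = fixed_chunking(tokens, 4)
late = late_chunking(tokens, 4)
```

The chunk boundaries are identical in both cases; only Late Chunking lets document‑level context flow into every chunk vector, which is why the two results diverge.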
The rerank module further refines retrieved chunks. Methods explored include cross‑encoder (concatenating query and document), bi‑encoder (separate encoding), and traditional ranking models, supplemented with domain‑specific scoring (e.g., matching celebrity names, channels, tags) to prioritize the most relevant content.
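A bi‑encoder‑style rerank with domain bonuses can be sketched as follows. The bag‑of‑words cosine similarity and the 0.5 celebrity‑match bonus are illustrative assumptions; production systems would use learned encoders and tuned weights:

```python
# Toy rerank sketch: bi-encoder-style similarity (query and document encoded
# separately) plus a domain-specific bonus for matched celebrity names.
# Weights and features are illustrative, not iQIYI's actual scoring.

def bow(text: str) -> dict:
    """Encode text as a bag-of-words count vector."""
    v: dict = {}
    for w in text.lower().split():
        v[w] = v.get(w, 0) + 1
    return v

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sum(x * x for x in a.values()) ** 0.5
    nb = sum(x * x for x in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def rerank(query: str, docs: list[str], celebrities=()) -> list[str]:
    def score(doc: str) -> float:
        s = cosine(bow(query), bow(doc))
        # domain-specific bonus: boost documents mentioning a queried celebrity
        s += 0.5 * sum(name.lower() in doc.lower() for name in celebrities)
        return s
    return sorted(docs, key=score, reverse=True)

docs = ["a thriller starring Ge You", "a nature documentary"]
ranked = rerank("Ge You movie", docs, celebrities=["Ge You"])
```

A cross‑encoder would instead score the concatenated query–document pair with a single model, trading the bi‑encoder’s precomputable document vectors for higher accuracy.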
For the generative component, several large language models were benchmarked, and the best‑performing model was selected based on accuracy and recall. Prompt engineering techniques such as Chain‑of‑Thought prompting, automated prompt optimization, and Structured‑Aware Multi‑Task Meta‑Prompt Optimization (SAMMO) were applied to enhance reasoning and output quality.
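The model‑selection step reduces to scoring each candidate on a labelled set and keeping the best. The model names and binary relevance labels below are placeholders for illustration:

```python
# Sketch of benchmark-driven model selection on accuracy and recall.
# Model names, predictions, and gold labels are hypothetical.

def evaluate(preds: list[int], gold: list[int]) -> dict:
    """Accuracy and recall over binary relevance labels (1 = relevant)."""
    correct = sum(p == g for p, g in zip(preds, gold))
    true_pos = sum(p == 1 and g == 1 for p, g in zip(preds, gold))
    actual_pos = sum(gold)
    return {
        "accuracy": correct / len(gold),
        "recall": true_pos / actual_pos if actual_pos else 0.0,
    }

gold = [1, 1, 0, 1]
candidates = {"model_a": [1, 0, 0, 1], "model_b": [1, 1, 1, 0]}
scores = {name: evaluate(p, gold) for name, p in candidates.items()}
best = max(scores, key=lambda m: (scores[m]["accuracy"], scores[m]["recall"]))
```

Prompt‑engineering variants (CoT, automated optimization, SAMMO) can be compared with the same harness by holding the model fixed and swapping prompts.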
Query guidance assists users by expanding their input with related search terms. Corpus construction combines basic metadata (tags, stars, roles) with AI‑generated plot queries to cover diverse user needs. Retrieval combines similarity‑based and KV‑based recall, while ranking incorporates user demographics, context features, and historical behavior to predict interest and surface the most relevant items.
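The two‑channel recall and ranking described above can be sketched as follows. The item schema, the exact key‑value lookup, and the tag‑overlap interest score are illustrative assumptions standing in for real recall services and a learned ranking model:

```python
# Sketch of two-channel recall: similarity-based recall merged with exact
# key-value recall (e.g. star lookup), then ranked by a toy interest score.
# All item fields and scores are hypothetical.

def similarity_recall(query: str, items: list[dict]) -> list[dict]:
    """Toy similarity channel: keep items whose title shares a query word."""
    q = set(query.lower().split())
    return [i for i in items if q & set(i["title"].lower().split())]

def kv_recall(key: str, value: str, items: list[dict]) -> list[dict]:
    """Exact key-value channel, e.g. all items featuring a given star."""
    return [i for i in items if i.get(key) == value]

def merge_and_rank(candidate_lists: list[list[dict]], user_tags: set) -> list[dict]:
    """Deduplicate candidates, then rank by overlap with the user's history."""
    seen, merged = set(), []
    for cands in candidate_lists:
        for item in cands:
            if item["title"] not in seen:
                seen.add(item["title"])
                merged.append(item)
    return sorted(merged, key=lambda i: i["tag"] in user_tags, reverse=True)

items = [
    {"title": "palace drama", "star": "Zhang", "tag": "drama"},
    {"title": "space opera", "star": "Li", "tag": "scifi"},
]
results = merge_and_rank(
    [similarity_recall("space adventure", items), kv_recall("star", "Zhang", items)],
    user_tags={"scifi"},
)
```

In production, the final sort would be a model over user demographics, context features, and behavior history rather than a single tag‑match flag.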
Overall, the AI innovations in iQIYI’s search improve user experience, increase viewing time, and enable more personalized content discovery. Ongoing research and experimentation aim to further advance technology‑driven entertainment.
References:
[1] Chain‑of‑Thought Prompting Elicits Reasoning in Large Language Models
[2] Late Chunking: Contextual Chunk Embeddings Using Long‑Context Embedding Models
[3] Sentence‑BERT: Sentence Embeddings using Siamese BERT‑Networks
iQIYI Technical Product Team