Construction and Application of iQIYI's Qisou Knowledge Graph
iQIYI’s Qisou Knowledge Graph, built since 2015 through a five‑stage pipeline of schema modeling, multi‑source data acquisition, entity alignment fusion, JanusGraph‑HBase storage, and inference‑driven querying, now powers precise video search, intelligent Q&A, tag mining, and relationship‑based recommendation across its platform.
On May 16, 2012, Google introduced the concept of a Knowledge Graph, aiming to use structured knowledge to enhance search engines and improve the user experience. Since then, Knowledge Graphs have been tightly coupled with search engines and have become an important branch of artificial intelligence, playing key roles in search, natural language processing, and intelligent assistants.
iQIYI's search team began building its own Knowledge Graph, the Qisou Knowledge Graph, in 2015. This article describes how it was constructed and how it is applied in iQIYI's search and NLP services.
What is a Knowledge Graph? It is a graph‑based model that describes entities (nodes) and the relationships (edges) between them, forming a semantic network that formally represents real‑world objects and their connections.
In a Knowledge Graph, an entity represents a real‑world thing (person, place, etc.) and a relation expresses how two entities are linked (e.g., "person lives in city"). Many real‑world scenarios, such as social networks, naturally fit this representation.
2. Construction of the Qisou Knowledge Graph
The construction pipeline consists of five major steps: knowledge representation & modeling, knowledge acquisition, knowledge fusion, knowledge storage, and knowledge application (query & inference).
2.1 Knowledge Representation and Modeling
Qisou adopts a top‑down modeling approach. The schema is defined using RDF triples and RDFS rules. RDF (Resource Description Framework) represents data as [Subject, Predicate, Object] triples, and RDFS provides schema definitions such as subClassOf for hierarchical relationships.
The team also built an internal schema system to manage and parse schema definitions.
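To make the triple model concrete, here is a minimal Python sketch of schema and instance data stored as [Subject, Predicate, Object] triples, with subClassOf used to infer inherited types. The class names, the entity id, and the helper functions are hypothetical illustrations, not Qisou's actual schema or internal schema system.

```python
from collections import defaultdict

# Hypothetical schema and instance triples in (subject, predicate, object) form.
TRIPLES = [
    ("Actor", "rdfs:subClassOf", "Person"),
    ("Person", "rdfs:subClassOf", "Thing"),
    ("TVSeries", "rdfs:subClassOf", "CreativeWork"),
    ("CreativeWork", "rdfs:subClassOf", "Thing"),
    ("person_001", "rdf:type", "Actor"),
]


def superclasses(cls, triples):
    """Return every ancestor of cls reachable through rdfs:subClassOf."""
    parents = defaultdict(set)
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            parents[s].add(o)
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def types_of(entity, triples):
    """Return the asserted types of entity plus all types inherited via the hierarchy."""
    asserted = {o for s, p, o in triples if s == entity and p == "rdf:type"}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred
```

With a top‑down schema like this, an entity typed only as Actor can still be matched by queries over Person, because the hierarchy is resolved from the schema rather than repeated on every entity.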
2.2 Knowledge Acquisition
Data is the foundation of a Knowledge Graph. Qisou sources data from three main channels: internal data, vertical‑site data, and Baidu Baike. Each source has its own advantages and drawbacks, as summarized in the original table.
For Baidu Baike data, which lacks explicit type information, the team trains a separate classifier for each entity type using a self‑attention DNN model combined with rule‑based heuristics. Features include description text, infobox fields, hyperlinks, and tags.
Entity extraction is performed in three ways:
Structured data extraction: A unified framework with Groovy scripts implements extraction rules for various structured sources.
Semi‑structured data extraction: Supervised learning wrappers handle tables and lists from Baidu Baike.
Text data mining: NLP services (entity recognition, entity linking) extract entities and relations from free text.
2.3 Knowledge Fusion
The core of fusion is entity alignment, which merges duplicate entities from different sources into a single global identifier. The process involves candidate retrieval via name/alias indexing, a binary classification model for alignment decisions, and an attribute‑fusion model to combine properties.
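The three alignment steps can be sketched in Python as follows. This is an illustration under stated assumptions: the article does not describe the classifier's features or the fusion model, so `is_same_entity` is a simple attribute-agreement heuristic standing in for the binary classification model, and `fuse_attrs` is a "keep existing values, fill gaps" rule standing in for the attribute‑fusion model. All names and sample records are hypothetical.

```python
from collections import defaultdict

# Hypothetical entities already in the graph, keyed by global identifier.
ENTITIES = {
    "e1": {"name": "Zhang San", "aliases": ["San Zhang"], "attrs": {"birth_year": 1980}},
    "e2": {"name": "San Zhang", "aliases": [], "attrs": {"birth_year": 1995}},
}


def normalize(name):
    return name.strip().lower()


def build_alias_index(entities):
    """Step 1a: index every name and alias to the entity ids that use it."""
    index = defaultdict(set)
    for eid, ent in entities.items():
        for name in [ent["name"], *ent.get("aliases", [])]:
            index[normalize(name)].add(eid)
    return index


def candidates(record, index):
    """Step 1b: retrieve candidate entities that share a name or alias."""
    found = set()
    for name in [record["name"], *record.get("aliases", [])]:
        found |= index.get(normalize(name), set())
    return found


def is_same_entity(record, entity, threshold=0.5):
    """Step 2 (stand-in for the binary classifier): share of common attributes that agree."""
    shared = [k for k in record["attrs"] if k in entity["attrs"]]
    if not shared:
        return False
    agree = sum(record["attrs"][k] == entity["attrs"][k] for k in shared)
    return agree / len(shared) >= threshold


def fuse_attrs(entity, record):
    """Step 3 (stand-in for attribute fusion): keep existing values, fill gaps from the record."""
    merged = dict(record["attrs"])
    merged.update(entity["attrs"])
    return merged


def align(record, entities, index):
    """Return the global id the record was merged into, or None if no candidate matched."""
    for eid in sorted(candidates(record, index)):
        if is_same_entity(record, entities[eid]):
            entities[eid]["attrs"] = fuse_attrs(entities[eid], record)
            return eid
    return None
```

In this sketch, a record named "San Zhang" retrieves both e1 and e2 as candidates, and the decision step is what separates two different people who share a name.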
2.4 Knowledge Storage
Qisou uses JanusGraph as the graph database, backed by HBase for storage and Elasticsearch for indexing, enabling online traversal queries.
3. Applications
Based on the graph database and NLP intent understanding, Qisou provides various question‑answering services:
Attribute queries (e.g., "X's birthday", "release date of a drama").
Relationship queries (e.g., "Wang Fei's ex‑husband's daughter").
Series‑related information, relationship composition, and more.
Together, these services power the intelligent Q&A, relationship query, series‑related content, and relationship composition features in iQIYI search.
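Attribute and relationship queries of this kind reduce to a property lookup and a multi‑hop traversal. The Python sketch below uses a small in‑memory graph as a stand‑in for the graph database, so it shows the shape of the queries rather than JanusGraph's API. The `TinyGraph` class, the relation names, and the placeholder entities are all hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """In-memory stand-in for a graph store: labelled edges plus entity properties."""

    def __init__(self):
        self._edges = defaultdict(list)   # (source, relation) -> [targets]
        self._props = defaultdict(dict)   # entity -> {key: value}

    def add_edge(self, source, relation, target):
        self._edges[(source, relation)].append(target)

    def set_prop(self, entity, key, value):
        self._props[entity][key] = value

    def attribute(self, entity, key):
        """Attribute query, e.g. "X's birthday". Returns None if unknown."""
        return self._props.get(entity, {}).get(key)

    def follow(self, start, relations):
        """Relationship query: follow a chain of relations hop by hop."""
        frontier = [start]
        for relation in relations:
            next_frontier = []
            for node in frontier:
                for target in self._edges.get((node, relation), []):
                    if target not in next_frontier:
                        next_frontier.append(target)
            frontier = next_frontier
        return frontier


# Placeholder data: person_a's ex-husband is person_b, who has two daughters.
graph = TinyGraph()
graph.add_edge("person_a", "ex_husband", "person_b")
graph.add_edge("person_b", "daughter", "person_c")
graph.add_edge("person_b", "daughter", "person_d")
graph.set_prop("person_a", "birthday", "1980-01-01")
```

A question shaped like "A's ex‑husband's daughter" then becomes `graph.follow("person_a", ["ex_husband", "daughter"])`, once intent understanding has mapped the text to a start entity and a relation chain.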
3.2 Basic Data Services
The entity store supplies NLP modules with tokenization, entity recognition, and intent detection, and also powers celebrity graph displays.
3.3 Tag Mining
Knowledge Graph data helps build and refine tag systems for videos. Inference rules generate derived attributes (age, zodiac) and relationships (reverse spouse, grand‑child). Graph embedding techniques further expand related entities.
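As a rough illustration of such inference rules, the Python sketch below derives an age from a birth date and infers reverse spouse and grandchild relations from stored triples. The relation names (`spouse`, `child`, `grandchild`) and the triple layout are assumptions made for this example. The article does not specify how Qisou expresses its rules, and zodiac derivation is omitted here.

```python
from datetime import date


def derive_age(birth, today):
    """Derived attribute: full years elapsed between birth and today."""
    years = today.year - birth.year
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1  # birthday has not occurred yet this year
    return years


def infer_relations(triples):
    """Return new (subject, relation, object) triples implied by two simple rules."""
    facts = set(triples)
    inferred = set()

    # Rule 1: spouse is symmetric, so add the reverse edge.
    for s, p, o in facts:
        if p == "spouse":
            inferred.add((o, "spouse", s))

    # Rule 2: a child of a child is a grandchild.
    children = {}
    for s, p, o in facts:
        if p == "child":
            children.setdefault(s, set()).add(o)
    for grandparent, kids in children.items():
        for kid in kids:
            for grandkid in children.get(kid, ()):
                inferred.add((grandparent, "grandchild", grandkid))

    # Only report triples that were not already stored.
    return inferred - facts
```

Because derived facts are computed from stored ones, editors only need to maintain one direction of a relationship and a birth date, and the rest follows from the rules.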
Conclusion
After years of development, the Qisou Knowledge Graph has become a comprehensive entertainment‑industry knowledge base. It enhances video search by providing precise answers, understanding user intent, and enabling richer interactions. Ongoing AI advances continue to expand its applications in search, recommendation, and beyond.
iQIYI Technical Product Team