Comprehensive Overview of Knowledge Graphs: Construction, Storage, and Applications in Recommendation Systems
This article introduces knowledge graphs: what they are, why they are needed, and the four basic triple types. It covers construction pipelines (data sources, crowdsourced versus automated methods, and the schema and data layers) and storage and query techniques using graph and relational databases. It then turns to practical applications, in particular how models such as DKN, RippleNet, and graph neural networks use knowledge graphs to improve precision, diversity, and explainability in recommendation systems.
Knowledge graphs are large-scale semantic networks that consist of entities, concepts, attributes, and relationships, enabling richer representations than plain strings. They are essential for cognitive AI, providing the structured knowledge required for reasoning and understanding.
The article begins with a brief overview of knowledge graphs, then explains why computers need them. A simple example: to a machine, "110" is just a string, but once it is linked to the concept of an emergency phone number (110 is the police emergency number in China), the raw string becomes meaningful knowledge.
Artificial intelligence is divided into three levels: computational, perceptual, and cognitive intelligence. Cognitive intelligence relies on knowledge graphs to provide the world model needed for understanding.
Four basic triple patterns are described: entity‑relation‑entity (e.g., "GoodFuture" – "founder" – "Zhang Bangxin"), entity‑attribute‑value (e.g., "GoodFuture" – "founded" – "2003"), entity‑is‑a‑concept (e.g., "GoodFuture" – "is‑a" – "public company"), and concept‑subclass‑concept (e.g., "actress" – "subclass‑of" – "actor").
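The four patterns share one shape, (subject, predicate, object), so they can be held in a single structure. The following is a minimal sketch using the article's own examples; the `Triple` type and the `objects_of` helper are illustrative names, not part of any particular knowledge-graph library.

```python
from collections import namedtuple

# A triple is (subject, predicate, object); field names are illustrative.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

triples = [
    Triple("GoodFuture", "founder", "Zhang Bangxin"),  # entity-relation-entity
    Triple("GoodFuture", "founded", "2003"),           # entity-attribute-value
    Triple("GoodFuture", "is-a", "public company"),    # entity-is-a-concept
    Triple("actress", "subclass-of", "actor"),         # concept-subclass-concept
]


def objects_of(subject, predicate, store):
    """Return every object linked to `subject` by `predicate`."""
    return [t.object for t in store
            if t.subject == subject and t.predicate == predicate]


founders = objects_of("GoodFuture", "founder", triples)
print(founders)
```

Because every fact has the same shape, adding a new kind of relationship requires no change to the structure, only a new predicate value.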
Knowledge graphs consist of a schema layer (the ontology) and a data layer. The schema layer abstracts the data model, while the data layer holds the actual triples.
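One way to picture the two layers is a schema that constrains which triples the data layer may hold. The sketch below assumes a deliberately simple ontology in which each relation has one domain concept and one range concept; `SCHEMA`, `ENTITY_TYPES`, the concept names, and `conforms` are all hypothetical and chosen only for illustration.

```python
# Schema layer: a hypothetical ontology mapping each relation to
# (domain concept, range concept).
SCHEMA = {
    "founder": ("Company", "Person"),
}

# Data layer: the concept each entity belongs to (assumed for this example).
ENTITY_TYPES = {
    "GoodFuture": "Company",
    "Zhang Bangxin": "Person",
}


def conforms(triple, schema=SCHEMA, types=ENTITY_TYPES):
    """Check that a (head, relation, tail) triple respects the schema."""
    head, relation, tail = triple
    if relation not in schema:
        return False
    domain, range_ = schema[relation]
    return types.get(head) == domain and types.get(tail) == range_


valid = conforms(("GoodFuture", "founder", "Zhang Bangxin"))
invalid = conforms(("Zhang Bangxin", "founder", "GoodFuture"))
print(valid, invalid)
```

Real ontologies are richer than this (concept hierarchies, cardinalities, multiple domains), so the check above should be read as the idea rather than a usable validator.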
Construction starts from data sources (structured, semi‑structured, unstructured) and can be performed via crowdsourcing (e.g., Wikipedia) or automated pipelines. The pipeline includes entity extraction, relation extraction, attribute extraction, entity disambiguation, ontology building, reasoning, quality assessment, updating, storage, representation learning, and downstream applications.
For storage and query, two main approaches are discussed: graph databases (the property graph model, as used by Neo4j, with query languages such as Cypher and Gremlin) and relational databases (triple tables, horizontal tables, predicate‑wise tables). Each method has trade‑offs in query complexity, space efficiency, and support for multi‑value attributes.
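The triple-table approach can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumptions made for the example; the point is that every extra condition on the same entity turns into another self-join on the one table.

```python
import sqlite3

# In-memory database with a single three-column triple table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("GoodFuture", "founder", "Zhang Bangxin"),
        ("GoodFuture", "founded", "2003"),
        ("GoodFuture", "is-a", "public company"),
    ],
)

# Fetching two properties of one entity already needs one self-join.
row = conn.execute(
    """
    SELECT f.object, y.object
    FROM triples AS f
    JOIN triples AS y ON f.subject = y.subject
    WHERE f.subject = 'GoodFuture'
      AND f.predicate = 'founder'
      AND y.predicate = 'founded'
    """
).fetchone()
print(row)
```

A horizontal table (one column per predicate) would answer the same question without a join, but it tends to leave many empty cells and handles multi-valued attributes awkwardly, which is the trade-off the article refers to.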
The article then focuses on applications, especially in recommendation systems. Knowledge graphs improve precision by revealing hidden item relationships, increase diversity through multi‑hop entity expansion, and enhance explainability by providing interpretable paths (e.g., shared actors between movies).
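The explainability point can be illustrated by searching the graph for a path that connects a movie the user liked to a recommended one. This is a minimal sketch using breadth-first search over made-up triples; the entity names, the `^-1` suffix for inverse edges, and both function names are assumptions for the example, not something the article specifies.

```python
from collections import defaultdict, deque

# Hypothetical movie triples.
triples = [
    ("Movie A", "stars", "Actor X"),
    ("Movie B", "stars", "Actor X"),
    ("Movie B", "directed_by", "Director Y"),
    ("Movie C", "directed_by", "Director Y"),
]


def build_adjacency(triples):
    """Index triples in both directions so paths can traverse either way."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
        adj[tail].append((rel + "^-1", head))  # inverse edge
    return adj


def shortest_path(adj, start, goal, max_hops=4):
    """Breadth-first search returning [entity, relation, entity, ...] or None."""
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if (len(path) - 1) // 2 >= max_hops:  # hops used so far
            continue
        for rel, nxt in adj[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [rel, nxt]))
    return None


adj = build_adjacency(triples)
explanation = shortest_path(adj, "Movie A", "Movie B")
print(explanation)
```

Here the returned path reads as "Movie A and Movie B share Actor X". Raising `max_hops` reaches more distant items such as Movie C, which is the multi-hop expansion the article associates with diversity.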
Specific models are introduced. DKN builds a sub‑graph around the entities in a news article and learns embeddings for them; its KCNN component combines entity, context, and word embeddings via convolution. RippleNet propagates a user's interests outward through the graph to incorporate multi‑hop information. Graph neural networks (GNNs) encode entities and relations with user‑specific scoring functions.
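The multi-hop idea behind RippleNet can be sketched as the computation of "ripple sets": starting from the items a user has interacted with, hop k collects the triples whose head was reached at hop k-1. The code below covers only this set construction on made-up triples; the learned embeddings and attention that the model applies over these sets are omitted, and the function and variable names are illustrative.

```python
from collections import defaultdict


def ripple_sets(kg_triples, seed_entities, n_hops):
    """Return one list of triples per hop, expanding outward from the seeds."""
    by_head = defaultdict(list)
    for head, rel, tail in kg_triples:
        by_head[head].append((head, rel, tail))

    hops = []
    frontier = set(seed_entities)
    for _ in range(n_hops):
        # Triples whose head lies in the current frontier (sorted for determinism).
        hop = [tr for entity in sorted(frontier) for tr in by_head.get(entity, [])]
        hops.append(hop)
        # The tails of this hop become the frontier for the next hop.
        frontier = {tail for _head, _rel, tail in hop}
    return hops


# Hypothetical knowledge graph and click history.
kg = [
    ("Movie A", "stars", "Actor X"),
    ("Actor X", "acted_in", "Movie B"),
    ("Movie B", "genre", "Drama"),
]
user_ripples = ripple_sets(kg, {"Movie A"}, n_hops=2)
print(user_ripples)
```

With the user's history as the seed, the first hop reaches Actor X and the second reaches Movie B, so a candidate item can be scored against entities the user never interacted with directly.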
The article concludes with a summary diagram and provides resource links for further study, including a GitHub collection of knowledge‑graph learning materials and a graduate‑level course from Southeast University.
Author bio: Yue Xiang, senior NLP engineer at TAL Education Group.