Ant Group's Knowledge Graph: Overview, Construction, Applications, and Integration with Large Models
Ant Group shares its knowledge graph initiatives end to end: the fundamentals, the construction pipeline, fusion techniques, cognitive representations, and diverse business applications, as well as the emerging synergy between knowledge graphs and large language models. Together these illustrate how graph-based AI enhances accuracy, interpretability, and downstream services.
This article presents Ant Group's work on knowledge graphs, organized into four main parts: an overview, construction methods, applications, and the relationship with large models.
1. What is a Knowledge Graph? A knowledge graph models complex relationships and domain knowledge using graph structures, serving as a foundation for cognitive intelligence. It is widely used in search, QA, semantic understanding, and big‑data decision analysis, and benefits from deep‑learning‑based representation.
2. Why Build a Knowledge Graph? Ant's heterogeneous data lacks a unified knowledge understanding system. Building a graph standardizes entities, relations, and concepts, accumulates domain knowledge, enables knowledge reuse, and supports reasoning for risk control, credit, claims, merchant operations, and recommendation scenarios.
3. Construction Overview
The construction paradigm consists of five stages: (1) data sources for cold‑start, (2) cross‑domain graph fusion via entity alignment, (3) domain‑structured knowledge base fusion, (4) information extraction from unstructured/semi‑structured data, and (5) integration of domain concepts and expert rules.
From an algorithmic view, capabilities include knowledge inference and matching; from an implementation view, the stack comprises graph computation engines, a graph base (NLP & multimodal platform), graph‑building technologies, inference modules, algorithm services, and business applications.
4. Graph Construction Details
The six‑step pipeline includes: (1) data source acquisition, (2) knowledge modeling (concepts, entities, events), (3) knowledge acquisition via a processing platform, (4) storage (HA3 and graph stores), (5) knowledge operation (editing, online query, extraction), and (6) continuous learning for model iteration.
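The six steps above form a sequential loop, with continuous learning feeding corrections back into the next iteration. Below is a minimal sketch of such a pipeline driver; the stage names, record shapes, and the shared `ctx` dictionary are illustrative assumptions, not Ant's actual interfaces.

```python
# Toy six-stage graph-construction pipeline; each stage reads and
# extends a shared context dict. All names here are hypothetical.

def acquire(ctx):            # (1) pull raw records from data sources
    ctx["records"] = [{"text": "Alipay is a payment app"}]
    return ctx

def model_schema(ctx):       # (2) knowledge modeling: declare concepts/entities/events
    ctx["schema"] = {"entity_types": ["App", "Company"]}
    return ctx

def extract(ctx):            # (3) knowledge acquisition: text -> triples
    ctx["triples"] = [("Alipay", "instance_of", "App")]
    return ctx

def store(ctx):              # (4) persist triples into a graph store
    ctx["store"] = {"triples": list(ctx["triples"])}
    return ctx

def operate(ctx):            # (5) knowledge operation: online query
    ctx["query_hit"] = ("Alipay", "instance_of", "App") in ctx["store"]["triples"]
    return ctx

def relearn(ctx):            # (6) continuous learning: count model iterations
    ctx["iteration"] = ctx.get("iteration", 0) + 1
    return ctx

PIPELINE = [acquire, model_schema, extract, store, operate, relearn]

def run(ctx=None):
    ctx = ctx or {}
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx
```

In practice each stage would be a service backed by the processing platform and graph stores mentioned above; the point of the sketch is only the staged, re-runnable structure.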
Experience & Techniques
1) Entity Classification with Expert Knowledge
Enhancements: semantic label embeddings, contrastive learning with hierarchical label supervision, and logical rule constraints.
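One way to read "contrastive learning with hierarchical label supervision" is that anchors are pulled toward samples sharing a deeper prefix of the label path, with the pull weighted by shared depth. The following pure-Python sketch illustrates that idea; the loss form and weighting scheme are assumptions for illustration, not Ant's published formulation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def shared_depth(path_a, path_b):
    """Depth of the common prefix of two hierarchical label paths, e.g.
    ('org', 'company', 'fintech') vs ('org', 'company', 'bank') -> 2."""
    d = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        d += 1
    return d

def hier_contrastive_loss(embs, label_paths, tau=0.5):
    """InfoNCE-style loss where each positive pair is weighted by how
    deep in the label hierarchy the two samples agree (toy version)."""
    loss, n = 0.0, len(embs)
    for i in range(n):
        denom = sum(math.exp(dot(embs[i], embs[j]) / tau)
                    for j in range(n) if j != i)
        for j in range(n):
            if j == i:
                continue
            w = shared_depth(label_paths[i], label_paths[j])
            if w > 0:
                p = math.exp(dot(embs[i], embs[j]) / tau) / denom
                loss += -w * math.log(p)
    return loss / n
```

Samples that agree only at a coarse level (e.g. both "org") contribute a weaker pull than samples agreeing down to the fine-grained leaf, which is the intuition behind supervising with the full label hierarchy rather than leaf labels alone.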
2) Domain Vocabulary Injection for Entity Recognition
Uses a fully‑connected graph and GAT to learn token representations, applying boundary and semantic contrastive learning.
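To make the "fully-connected graph + GAT" step concrete, here is a single GAT attention head over tokens where every token attends to every other. The projection matrix is taken as the identity and the attention vector is an input, so this is a didactic sketch of the attention mechanics rather than the production model.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(h, a):
    """One GAT attention head over a fully-connected token graph.
    h: list of token vectors; a: attention vector of length 2*dim.
    Scores follow e_ij = LeakyReLU(a . [h_i || h_j]), softmax over j."""
    n, dim = len(h), len(h[0])
    out = []
    for i in range(n):
        scores = []
        for j in range(n):
            concat = h[i] + h[j]                      # [h_i || h_j]
            scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]      # numerically stable softmax
        z = sum(exps)
        alpha = [e / z for e in exps]
        out.append([sum(alpha[j] * h[j][d] for j in range(n)) for d in range(dim)])
    return out
```

Because the graph is fully connected, each token representation becomes an attention-weighted mixture of all tokens, which is what lets injected vocabulary entries influence boundary decisions anywhere in the sentence.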
3) Few‑Shot Relation Extraction with Logical Rules
Combines external knowledge bases, logical‑rule inference, and fine‑grained difference perception to handle few‑shot or zero‑shot scenarios.
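The logical-rule component can be pictured as forward chaining over composition rules: if two relations hold in sequence, a third is inferred, letting the extractor label relations it has few or no training examples for. This sketch uses hypothetical rule and relation names.

```python
def apply_rules(triples, rules):
    """Forward-chain composition rules of the form (r1, r2) -> r3:
    if (h, r1, x) and (x, r2, t) hold, infer (h, r3, t).
    Runs to a fixpoint; rules and relations are toy examples."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (r1, r2), r3 in rules.items():
            for (h, ra, x) in list(facts):
                if ra != r1:
                    continue
                for (x2, rb, t) in list(facts):
                    if x2 == x and rb == r2 and (h, r3, t) not in facts:
                        facts.add((h, r3, t))
                        changed = True
    return facts
```

In a few-shot setting, rules like these (sourced from external knowledge bases or experts) supply supervision signals where labeled relation instances are scarce.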
5. Graph Fusion
Fusion enables cross‑business knowledge reuse, eliminates redundant data copies, and accelerates value delivery. Entity alignment is the core technique, implemented with the BERT‑INT interaction model, which combines title (name/description) similarity, attribute and neighbor similarity, and a pairwise ranking module.
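A simplified view of how those similarity views could be combined into one alignment score is sketched below: a cosine over title embeddings plus set-to-set interactions over neighbor and attribute embeddings, mixed with fixed weights. In BERT-INT the interaction and weighting are learned; the fixed weights and dict layout here are assumptions for illustration.

```python
import math

def cos(u, v):
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def pairwise_max_sim(xs, ys):
    """Set-to-set interaction: each item on one side takes its best
    match on the other side, then scores are averaged (a simplified
    stand-in for BERT-INT's pairwise interaction module)."""
    if not xs or not ys:
        return 0.0
    return sum(max(cos(x, y) for y in ys) for x in xs) / len(xs)

def alignment_score(e1, e2, w_title=0.5, w_nbr=0.3, w_attr=0.2):
    """e1/e2: dicts with a 'title' embedding plus lists of 'neighbors'
    and 'attrs' embeddings. Weights are illustrative, not learned."""
    return (w_title * cos(e1["title"], e2["title"])
            + w_nbr * pairwise_max_sim(e1["neighbors"], e2["neighbors"])
            + w_attr * pairwise_max_sim(e1["attrs"], e2["attrs"]))
```

Candidate pairs across two graphs would be ranked by this score, with the top match (above a threshold) accepted as an alignment.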
6. Graph Cognition
Ant employs an encoder‑decoder framework where graph neural networks act as encoders and decoders perform tasks such as link prediction. The resulting low‑dimensional embeddings reduce storage, alleviate sparsity, unify heterogeneous data, and are reusable across downstream services.
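The simplest decoder in this encoder-decoder setup scores a candidate edge from the two node embeddings alone, e.g. a sigmoid over their dot product. The sketch below assumes precomputed embeddings standing in for the GNN encoder's output.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def link_prob(emb, u, v):
    """Dot-product decoder for link prediction: probability that edge
    (u, v) exists, given node embeddings from an upstream encoder
    (fixed toy vectors stand in for the GNN output here)."""
    return sigmoid(sum(a * b for a, b in zip(emb[u], emb[v])))
```

Because the decoder only needs the low-dimensional embeddings, the same encoder output can be reused by other downstream decoders (classification, recommendation) without touching the raw heterogeneous graph, which is the reuse benefit described above.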
7. Applications
Various business cases illustrate the impact of knowledge graphs:
• Structured matching recall for Alipay mini‑program search (merchant graph).
• Real‑time user intent prediction in recommendation (AlipayKG, published at WWW 2023).
• Dynamic graph‑based coupon recommendation addressing cold‑start and sparsity.
• Insurance claim expert‑rule reasoning using a medical knowledge graph.
8. Knowledge Graphs and Large Models
The article discusses the complementary strengths of knowledge graphs (accuracy, interpretability) and large language models (generalization). Three integration routes are identified: using graphs to enhance models, using models to enrich graphs, and co‑training both.
Examples include:
• Using large models for information extraction, knowledge modeling, and relation inference during graph construction.
• Two‑stage extraction pipelines (type detection → detailed extraction), as demonstrated by DAMO Academy.
• Injecting graph knowledge into model inputs, jointly training graph and language tasks, and using graphs as priors to reduce hallucinations and improve timeliness.
• Knowledge‑enhanced QA systems that combine retrieval (e.g., LangChain) with graph‑based reasoning.
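A minimal sketch of the last two patterns, under stated assumptions: a toy keyword retriever stands in for a LangChain-style vector store, graph facts come from a one-hop lookup in a dict-based triple store, and both are fused into the prompt so the generator is grounded in structured knowledge as well as text. All function names and data shapes here are hypothetical.

```python
def retrieve(docs, query, k=2):
    """Toy keyword retrieval standing in for a vector-store retriever:
    rank documents by query-term overlap and keep the top k."""
    terms = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(terms & set(d.lower().split())))[:k]

def graph_facts(graph, entity):
    """One-hop lookup in a toy triple store keyed by head entity."""
    return [f"{entity} {r.replace('_', ' ')} {t}"
            for r, t in graph.get(entity, [])]

def build_qa_prompt(docs, graph, query, entity):
    """Fuse retrieved passages with graph facts into one grounding
    context, so answers draw on the graph rather than only on the
    model's (possibly stale) parametric memory."""
    context = "\n".join(retrieve(docs, query) + graph_facts(graph, entity))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The graph side supplies precise, current facts (helping with hallucinations and timeliness), while retrieval supplies broader textual evidence; the generator sees both.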
9. Summary and Outlook
Future directions focus on deeper NLP and QA integration, using graphs for hallucination detection and detoxification, and developing domain‑specific large models powered by graph knowledge.
Overall, Ant Group’s knowledge‑graph ecosystem demonstrates how graph‑based AI can be systematically built, fused, and applied, and how it can synergize with emerging large models to deliver more accurate, explainable, and efficient intelligent services.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.