
JD Technology Financial Causal Knowledge Graph: Construction, Causal Extraction, and Alignment Techniques

This article presents JD Technology's recent research on financial causal knowledge graphs, detailing the overall knowledge‑graph architecture, data layers, causal relation extraction, argument extraction, and graph‑alignment methods, and discusses their applications in finance, intelligent research reports, and industry‑leader recommendation.

DataFunTalk

01 JD Technology Knowledge Graph Overview

JD Technology's knowledge‑graph capability consists of five parts: multi‑source heterogeneous data, foundational technologies, core capabilities, the graph platform, and graph applications. The data layer includes structured, semi‑structured, and unstructured data; the core capabilities cover information extraction, text parsing, graph storage, and visualization. The platform serves finance, e‑commerce, and healthcare, and is also applied in search, recommendation, asset management, and intelligent customer service.

02 Financial Causal Knowledge Graph Construction

The financial causal graph contains over 200 million nodes representing companies, people, products, and indicators, connected by eight relation types such as supply‑chain, subsidiary, and customer. Construction relies on named‑entity recognition (NER), relation extraction, entity extraction, and indicator extraction.

Event‑level causal graphs (called "event‑logic graphs") abstract events into nodes and directed causal edges. An example from the oil industry shows how a sentence about declining oil demand is transformed into two causal pairs, which are then parsed into argument structures and aligned to graph nodes.
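The node-and-edge structure described above can be sketched in a few lines of Python. This is a minimal illustration, not JD Technology's implementation: the class name `EventLogicGraph`, its methods, and the two oil-related causal pairs are hypothetical and chosen only to mirror the kind of example the talk describes.

```python
from collections import deque


class EventLogicGraph:
    """Directed graph: nodes are event phrases, an edge cause -> effect means 'causes'."""

    def __init__(self):
        self.effects = {}  # event -> list of events it directly causes

    def add_causal_pair(self, cause, effect):
        # Register both events as nodes, then add the directed causal edge once.
        self.effects.setdefault(cause, [])
        self.effects.setdefault(effect, [])
        if effect not in self.effects[cause]:
            self.effects[cause].append(effect)

    def downstream(self, event):
        """Return every event reachable from `event`, in breadth-first order."""
        seen, order, queue = {event}, [], deque([event])
        while queue:
            current = queue.popleft()
            for nxt in self.effects.get(current, []):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order


# Hypothetical causal pairs (illustrative only, not taken from the talk).
graph = EventLogicGraph()
graph.add_causal_pair("oil demand declines", "oil prices fall")
graph.add_causal_pair("oil prices fall", "producer revenue shrinks")
chain = graph.downstream("oil demand declines")
print(chain)
```

Storing each causal pair as a directed edge makes it possible to follow a chain of consequences from a single triggering event, which is what downstream applications such as research-report generation would query.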

03 Causal Relation Extraction Techniques

Challenges include distinguishing explicit from implicit causality, handling diverse causal cue words, and resolving nested causal pairs. The extraction model first predicts the causal connector and then the corresponding cause and effect spans; joint pre‑training tasks and a gated graph convolutional network (GCN) encoder are used to improve the representations.
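To make the connector-first ordering concrete, here is a deliberately simple rule-based stand-in in Python. It is not the neural model described in the talk: the cue lexicon `CONNECTORS`, the function names, and the English sentences are all assumptions for illustration. It handles only explicit causality with a single connector, so implicit and nested cases are out of scope.

```python
import re

# Hypothetical cue lexicon: connector -> True if the cause appears before it.
CONNECTORS = {
    "leads to": True,
    "results in": True,
    "because of": False,
    "due to": False,
}


def find_connector(sentence):
    """Stage 1: return (cue, start, end) for the earliest cue in the sentence, or None."""
    best = None
    for cue in CONNECTORS:
        match = re.search(r"\b" + re.escape(cue) + r"\b", sentence, re.IGNORECASE)
        if match and (best is None or match.start() < best[1]):
            best = (cue, match.start(), match.end())
    return best


def extract_causal_pair(sentence):
    """Stage 2: split the sentence around the connector into cause and effect spans."""
    hit = find_connector(sentence)
    if hit is None:
        return None
    cue, start, end = hit
    left = sentence[:start].strip(" ,.;")
    right = sentence[end:].strip(" ,.;")
    if not left or not right:
        return None
    cause, effect = (left, right) if CONNECTORS[cue] else (right, left)
    return {"connector": cue, "cause": cause, "effect": effect}


pair = extract_causal_pair("Falling oil demand leads to lower crude prices.")
reverse_pair = extract_causal_pair("Refiners cut output due to weak margins.")
no_pair = extract_causal_pair("Oil prices were flat.")
print(pair, reverse_pair, no_pair)
```

Predicting the connector first fixes the direction of the relation, so the second stage only has to decide which side of the connector is the cause and which is the effect.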

04 Argument Extraction and Graph Alignment

Argument extraction follows semantic‑role labeling (SRL) using the Chinese Proposition Bank (CPB). Nine argument types (e.g., ArgM‑Loc, ArgM‑Time, ArgM‑Tool) are retained to simplify modeling. Two pipelines are explored: independent extraction and joint multi‑task extraction with shared encoders.
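As a rough sketch of what restricting the label set means for the extracted structure, the Python below keeps only the spans whose role is in a retained set. Only the three role names mentioned above are listed, since the remaining six are not given here; `EventArguments`, `keep_retained`, the label `ArgM-Other`, and the sample spans are hypothetical.

```python
from dataclasses import dataclass, field

# Only the three roles named in the text; the other six retained types are not listed there.
RETAINED_ROLES = {"ArgM-Loc", "ArgM-Time", "ArgM-Tool"}


@dataclass
class EventArguments:
    """A predicate together with its role-labeled argument spans."""
    predicate: str
    roles: dict = field(default_factory=dict)  # role label -> list of text spans


def keep_retained(predicate, labeled_spans, retained=RETAINED_ROLES):
    """Build an EventArguments that keeps only spans whose label is in `retained`."""
    kept = {}
    for label, span in labeled_spans:
        if label in retained:
            kept.setdefault(label, []).append(span)
    return EventArguments(predicate=predicate, roles=kept)


# Hypothetical SRL output for one event; "ArgM-Other" stands for any dropped label.
spans = [("ArgM-Time", "in 2020"), ("ArgM-Loc", "in Europe"), ("ArgM-Other", "sharply")]
event = keep_retained("declines", spans)
print(event)
```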

Graph alignment converts short text phrases into sub‑graphs, then aligns them with existing graph nodes using rule‑based recall, text‑similarity coarse ranking, and a fine‑ranking GCN‑based model that encodes each node with BERT and aggregates neighbor information.
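The three-stage flow (recall, coarse ranking, fine ranking) can be sketched as follows. This is a toy under stated assumptions: the graph `GRAPH`, the token-overlap recall rule, and all function names are hypothetical, and the fine-ranking step uses neighbor overlap (Jaccard) as a simple stand-in for the BERT-plus-GCN model, which it does not reproduce.

```python
from difflib import SequenceMatcher

# Hypothetical toy graph: node name -> names of its neighbors.
GRAPH = {
    "crude oil price": {"oil demand", "OPEC output"},
    "crude oil inventory": {"refinery utilization"},
    "gold price": {"US dollar index"},
}


def recall(phrase, graph):
    """Rule-based recall: keep nodes that share at least one token with the phrase."""
    tokens = set(phrase.lower().split())
    return [node for node in graph if tokens & set(node.lower().split())]


def coarse_rank(phrase, candidates, top_k=2):
    """Coarse ranking: order candidates by surface string similarity, keep the top k."""
    def similarity(node):
        return SequenceMatcher(None, phrase.lower(), node.lower()).ratio()
    return sorted(candidates, key=similarity, reverse=True)[:top_k]


def fine_rank(context, candidates, graph):
    """Stand-in for the fine-ranking model: Jaccard overlap of context and neighbors."""
    def score(node):
        union = graph[node] | context
        return len(graph[node] & context) / len(union) if union else 0.0
    return max(candidates, key=score)


def align(phrase, context, graph=GRAPH):
    """Align a phrase (plus the names linked to it in its sub-graph) to one graph node."""
    candidates = recall(phrase, graph)
    if not candidates:
        return None
    return fine_rank(context, coarse_rank(phrase, candidates), graph)


best = align("oil price", {"oil demand"})
missing = align("copper", set())
print(best, missing)
```

The staged design keeps the expensive comparison at the end: cheap rules and string similarity shrink the candidate set, so the neighbor-aware scorer only has to compare a handful of nodes.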

05 Applications in Finance

The causal graph enriches financial AI by providing multi‑dimensional connections for sentiment analysis, investment research, and price prediction. It also supports intelligent research‑report generation and industry‑leader recommendation by weighting nodes based on in‑degree/out‑degree and policy impact.
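One way to read the degree-based weighting is sketched below. The exact formula is not given in the talk, so everything here is an assumption: the function `leader_scores`, the mixing parameter `alpha`, the multiplicative `policy_impact` factor, and the company names are hypothetical.

```python
def leader_scores(edges, policy_impact=None, alpha=0.5):
    """Score each node from its in-degree and out-degree, scaled by a policy factor.

    Assumed formula: (alpha * in_degree + (1 - alpha) * out_degree) * policy_impact,
    where policy_impact defaults to 1.0 for nodes without an entry.
    """
    policy_impact = policy_impact or {}
    in_deg, out_deg = {}, {}
    for src, dst in edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
    scores = {}
    for node in set(in_deg) | set(out_deg):
        degree_score = alpha * in_deg.get(node, 0) + (1 - alpha) * out_deg.get(node, 0)
        scores[node] = degree_score * policy_impact.get(node, 1.0)
    return scores


# Hypothetical supply-chain edges (source supplies target).
edges = [("SupplierA", "MakerB"), ("SupplierC", "MakerB"), ("MakerB", "RetailerD")]
scores = leader_scores(edges)
boosted = leader_scores(edges, policy_impact={"MakerB": 2.0})
leader = max(scores, key=scores.get)
print(leader, scores[leader], boosted[leader])
```

In this toy graph the most connected company ranks first, and a favorable policy factor raises its score further; a production system would likely use richer signals than raw degree counts.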

06 Summary and Outlook

Current work focuses on explicit causal extraction. Future research will address implicit causality, scale up the data volume (currently about 1 million events and more than 200 million nodes), and enrich argument representations to capture conditions and richer semantics.

07 Q&A

Q: How can conditional logic be expressed in event‑logic graphs? A: Model conditions as additional argument roles or as separate sub‑graphs, starting with simple dimensions such as time or location.

Q: How much manual verification is needed for argument extraction? A: It depends on the scenario; spot‑checking suffices for sentiment analysis, while full verification is required for high‑risk financial calculations.

Q: Can the financial causal graph be transferred to other domains? A: Yes, the methodology is domain‑agnostic; only data sources need adaptation.

Tags: NLP, Knowledge Graph, Financial AI, semantic role labeling, causal extraction, graph alignment
