
Applying Graph Neural Networks for Financial Risk Control: A Case Study by Shuhe Technology

This article describes how Shuhe Technology leveraged graph neural networks to improve financial risk assessment by preparing massive relational graph data, selecting DGL as the development framework, designing a GraphSage‑GAT model, addressing data sparsity and imbalance, and achieving notable AUC gains over traditional methods.

DataFunSummit

Business Background

Knowledge graphs and relational entities have long been used in finance, but traditional risk control relies on costly manual feature extraction. Shuhe Technology partnered with a fraud‑prevention team to explore graph neural networks (GNNs) for deeper, automated feature extraction.

The internal graph contains over 70 billion edges and more than 1 billion nodes, with only a tiny fraction of nodes having useful features, making manual analysis infeasible.

Data Preparation

The project selected DGL (Deep Graph Library) as the GNN framework because of its industry backing and active community. A time‑windowed sample (Sept–Dec 2020) reduced the graph to ~7 billion nodes and ~20 billion edges. Sparse, feature‑less nodes were then pruned, raising the proportion of feature‑rich nodes to >5 %.
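
The windowing and pruning steps can be illustrated with a minimal pure‑Python sketch. It is not the team's pipeline: the edge tuple layout, the `window_and_prune` helper, and the pruning rule (drop a node that has no features and fewer than `min_degree` edges) are all assumptions made for illustration, and a production job at this scale would run on a distributed engine rather than in‑memory lists.

```python
from collections import Counter
from datetime import date

def window_and_prune(edges, features, start, end, min_degree=2):
    """Keep edges dated within [start, end], then drop sparse, feature-less nodes.

    edges:    list of (src, dst, day) tuples (hypothetical layout)
    features: dict mapping node id -> feature vector; missing key = no features
    """
    # 1. Time window: keep only edges observed inside the sampling period.
    windowed = [(u, v, d) for (u, v, d) in edges if start <= d <= end]

    # 2. Degree of every node inside the window.
    degree = Counter()
    for u, v, _ in windowed:
        degree[u] += 1
        degree[v] += 1

    # 3. Assumed pruning rule: a node survives if it carries features
    #    or is connected well enough to relay information.
    keep = {n for n in degree if n in features or degree[n] >= min_degree}
    pruned = [(u, v, d) for (u, v, d) in windowed if u in keep and v in keep]

    nodes = {n for u, v, _ in pruned for n in (u, v)}
    featured_share = sum(n in features for n in nodes) / len(nodes) if nodes else 0.0
    return pruned, nodes, featured_share

# Toy data, invented for the example.
edges = [
    ("a", "b", date(2020, 9, 15)),
    ("b", "c", date(2020, 10, 1)),
    ("c", "d", date(2020, 11, 20)),
    ("a", "e", date(2020, 8, 1)),   # falls outside the window
    ("b", "x", date(2020, 12, 5)),
]
features = {"a": [0.1, 0.3], "c": [0.7, 0.2]}

pruned, nodes, featured_share = window_and_prune(
    edges, features, start=date(2020, 9, 1), end=date(2020, 12, 31)
)
```

In this toy graph the leaf nodes `d` and `x` have neither features nor enough edges, so they are removed together with their edges, and the share of feature‑rich nodes rises as a result.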

Over 80 node features covering user attributes, loan history, and repayment behavior were attached to the graph. To handle severe class imbalance (positive rate <0.5 %), loss weighting was applied instead of oversampling/undersampling.
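
Loss weighting can be sketched as a weighted binary cross‑entropy in which each positive sample counts `pos_weight` times as much as a negative one. The source does not state which weights were used; the inverse‑frequency rule below (negatives divided by positives) is a common default and is an assumption here, as are the function names. The `pos_weight` name mirrors the parameter of PyTorch's `BCEWithLogitsLoss`, but the sketch itself uses only the standard library.

```python
import math

def inverse_frequency_weight(labels):
    """Assumed heuristic: weight positives by (#negatives / #positives)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return negatives / positives

def weighted_bce(labels, probs, pos_weight):
    """Mean binary cross-entropy with the positive term scaled by pos_weight."""
    eps = 1e-12  # clamp probabilities so log() never sees 0
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Toy batch: 1 positive in 250 samples, i.e. a 0.4 % positive rate.
labels = [1] + [0] * 249
pos_weight = inverse_frequency_weight(labels)

# A model that outputs 0.5 everywhere: with this weight the single positive
# contributes as much loss as all 249 negatives combined.
uniform = [0.5] * len(labels)
loss_weighted = weighted_bce(labels, uniform, pos_weight)
loss_plain = weighted_bce(labels, uniform, pos_weight=1.0)
```

Unlike oversampling or undersampling, this leaves the graph and its neighborhoods untouched, which matters when samples are nodes whose neighbors also carry signal.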

Model Introduction

The architecture combines GraphSage‑style multi‑layer neighbor sampling with Graph Attention Network (GAT) aggregation, followed by a feed‑forward network for the final prediction. A sampling depth of 2–3 layers limits over‑smoothing, and multi‑head attention improves representation quality.
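
The sample‑then‑attend idea can be shown in a short, dependency‑free sketch. It is a simplification under stated assumptions, not the DGL model described in the talk: it uses a single attention head, omits the learned projection matrix and the final feed‑forward network, and takes fixed attention vectors `a_src` and `a_dst` instead of trained ones. All function names, the toy graph, and the `fanout` value are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def sample_neighbors(adj, node, fanout, rng):
    """GraphSAGE-style sampling: keep at most `fanout` neighbors per node."""
    neighbors = adj.get(node, [])
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)

def attention_weights(h_self, h_neighbors, a_src, a_dst):
    """GAT-style weights: softmax over LeakyReLU(a_src.h_i + a_dst.h_j)."""
    scores = [leaky_relu(dot(a_src, h_self) + dot(a_dst, h)) for h in h_neighbors]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def embed(node, depth, adj, feats, fanout, a_src, a_dst, rng):
    """Embedding of `node` after `depth` rounds of sampling plus attention."""
    if depth == 0:
        return feats[node]
    h_self = embed(node, depth - 1, adj, feats, fanout, a_src, a_dst, rng)
    sampled = sample_neighbors(adj, node, fanout, rng)
    h_neigh = [embed(n, depth - 1, adj, feats, fanout, a_src, a_dst, rng)
               for n in sampled]
    h_neigh.append(h_self)  # self-loop, so isolated nodes still get an embedding
    alphas = attention_weights(h_self, h_neigh, a_src, a_dst)
    # Attention-weighted sum of the sampled neighbor embeddings.
    return [sum(a * h[k] for a, h in zip(alphas, h_neigh))
            for k in range(len(h_self))]

# Toy star graph with 2-dimensional features, invented for the example.
adj = {"u": ["v", "w", "x"], "v": ["u"], "w": ["u"], "x": ["u"]}
feats = {"u": [1.0, 0.0], "v": [0.0, 1.0], "w": [0.5, 0.5], "x": [0.2, 0.8]}
a_src, a_dst = [0.3, -0.1], [0.2, 0.4]

rng = random.Random(0)
h_u = embed("u", 2, adj, feats, fanout=2, a_src=a_src, a_dst=a_dst, rng=rng)
```

The cost of the recursion grows roughly as `fanout ** depth`, and every extra round averages over a wider neighborhood, which is one way to see why a depth of 2–3 is a practical ceiling.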

Project Summary

The GNN model achieved stable risk discrimination on two out‑of‑time test sets, and stacking it with traditional models raised AUC by roughly 4 points. Deployment challenges such as online graph sampling remain under investigation.
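
For readers less familiar with the metric, AUC is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison and applies it to two evaluation sets. The set names, labels, and scores are synthetic placeholders, not the project's data, and the quadratic loop is only suitable for small examples.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Two hypothetical out-of-time sets: (labels, model scores).
oot_sets = {
    "oot_1": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "oot_2": ([0, 1, 0, 1, 0], [0.2, 0.9, 0.3, 0.6, 0.7]),
}

results = {name: auc(y, s) for name, (y, s) in oot_sets.items()}
```

Comparing the values across the two sets gives a rough read on stability: a model whose AUC drops sharply on the later set is likely fitting patterns that do not persist over time.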

Future work includes enriching the graph with device and location data, exploring further graph techniques (e.g., R‑GCN for heterogeneous graphs, HARP, SEAL), and extending applications to marketing, recommendation, and user re‑engagement.
