Graph Algorithms and Graph Neural Networks for Fraud Detection
The article explains how building account relationship graphs and applying both traditional graph algorithms and modern graph neural networks—through community detection, anomaly detection, semi‑supervised and unsupervised learning, and dynamic graph techniques—can effectively identify and dismantle fraud groups in online services.
During the operation of internet services, large‑scale attacks from black‑market fraud groups infiltrate many stages such as registration, login, marketing, and transactions; by mining the hidden associations among accounts and constructing a graph of account relationships, these fraud rings can be comprehensively identified.
Graphs model entities (nodes) like accounts, devices, or phone numbers and their connections (edges); homogeneous graphs contain a single node and edge type, while heterogeneous graphs involve multiple types, making them powerful tools for anti‑fraud analysis.
Traditional graph algorithms used in fraud detection include community‑detection methods (LPA, Louvain, Infomap) that split the graph into tightly connected sub‑graphs, allowing abnormal communities to be flagged as fraud rings, and anomaly‑detection algorithms such as Fraudar that locate dense sub‑graphs in bipartite graphs to uncover fake reviews, money‑laundering, etc.
LPA works by initially assigning each node a unique community label, then iteratively updating each node’s label to the majority label among its neighbors until convergence, resulting in nodes with strong connections sharing the same label.
Graph neural networks (GNNs) address the limitation of traditional methods by incorporating node features alongside topology. Techniques like DeepWalk generate node embeddings via random walks and skip‑gram, while Graph Convolutional Networks (GCN) aggregate neighbor features, apply nonlinear transformations, and produce richer node representations for downstream classification.
In practice, semi‑supervised learning (e.g., GCN with a few labeled nodes) and unsupervised learning (e.g., community detection, DGI) can both be applied. DGI creates positive samples from the original graph and negative samples by corrupting node features, training a binary classifier to obtain informative node embeddings.
Dynamic graphs extend static methods by handling time‑varying nodes and edges; streaming graph embedding techniques update representations locally as new nodes appear, enabling timely detection of evolving fraud rings.
Choosing between traditional algorithms and GNNs depends on the required interpretability and label availability; dynamic graph methods, though computationally demanding, better meet real‑time fraud detection needs, and their adoption is expected to grow as hardware and algorithms improve.
References:
[1] Raghavan, U. N., Albert, R., & Kumara, S. (2007). Physical Review E, 76, 036106. [2] Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Journal of Statistical Mechanics: Theory and Experiment, P10008. [3] Rosvall, M., & Bergstrom, C. T. (2008). Proceedings of the National Academy of Sciences, 105, 1118‑1123. [4] Hooi, B., Song, H. A., Beutel, A., Shah, N., Shin, K., & Faloutsos, C. (2016). FRAUDAR: Bounding Graph Fraud in the Face of Camouflage. KDD ’16. [5] Perozzi, B., Al‑Rfou, R., & Skiena, S. (2014). DeepWalk. [6] Kipf, T. N., & Welling, M. (2016). Semi‑Supervised Classification with Graph Convolutional Networks. [7] Velickovic, P., Fedus, W., Hamilton, W. L., Lio, P., Bengio, Y., & Hjelm, R. D. (2018). Deep Graph Infomax. [8] Liu, X., Hsieh, P.-C., Duffield, N., Chen, R., Xie, M., & Wen, X. (2019). Real‑Time Streaming Graph Embedding Through Local Actions. WWW ’19.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.