Real‑Time Graph Neural Network for Payment Fraud Detection at eBay

This article describes how eBay applies graph neural networks to real‑time payment fraud detection, covering the anti‑fraud scenario, limitations of traditional GBDT pipelines, challenges of constructing and serving dynamic heterogeneous graphs, the end‑to‑end solution with directed slice graphs and a Lambda‑style architecture, and experimental results comparing GNN with LightGBM.

DataFunTalk

The talk begins with an overview of eBay's payment fraud landscape, highlighting risk assessment points before, during, and after a transaction and explaining why real‑time detection is critical.

It then outlines the traditional end‑to‑end pipeline: feature engineering for account‑level variables, labeling based on unauthorized transactions, handling severe class imbalance, and training a GBDT model (e.g., LightGBM) that is later deployed for online scoring.
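One common way to handle the class imbalance mentioned above is to weight the rare fraud class by the negative-to-positive ratio. The sketch below is an illustration under assumptions, not the pipeline from the talk: the helper name `imbalance_params` and the toy labels are hypothetical, while `objective` and `scale_pos_weight` are standard LightGBM parameter names.

```python
from collections import Counter


def imbalance_params(labels):
    """Build a LightGBM-style parameter dict for imbalanced binary labels.

    Hypothetical helper: it only computes the weight; training is out of scope.
    """
    counts = Counter(labels)
    pos, neg = counts[1], counts[0]
    if pos == 0:
        raise ValueError("need at least one positive (fraud) label")
    return {
        "objective": "binary",
        # Up-weight the fraud class by the negative-to-positive ratio.
        "scale_pos_weight": neg / pos,
    }


# Toy data (assumed): 1% of transactions are labeled as fraud.
labels = [0] * 990 + [1] * 10
params = imbalance_params(labels)
```

The resulting dictionary could be passed to a GBDT trainer; whether the talk used re-weighting, down-sampling, or another technique is not stated in this summary.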

Next, the limitations of tabular models are discussed, emphasizing that relational features (shared addresses, IPs, emails) are naturally expressed as graph edges, which traditional pipelines struggle to capture efficiently.
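To make the point about relational features concrete, the following minimal sketch indexes orders by the entities they share (address, IP, email) and derives order-to-order links from that index. The field names, order records, and function names are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_entity_index(orders):
    """Map each (field, value) entity to the set of order ids that use it."""
    index = defaultdict(set)
    for order in orders:
        for field in ("address", "ip", "email"):
            value = order.get(field)
            if value is not None:
                index[(field, value)].add(order["order_id"])
    return index


def linked_orders(orders):
    """Return pairs of order ids that share at least one entity."""
    links = set()
    for ids in build_entity_index(orders).values():
        ordered = sorted(ids)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                links.add((a, b))
    return links


# Toy orders (assumed data).
orders = [
    {"order_id": "o1", "address": "A", "ip": "1.1.1.1", "email": "x@e.com"},
    {"order_id": "o2", "address": "B", "ip": "1.1.1.1", "email": "y@e.com"},
    {"order_id": "o3", "address": "B", "ip": "2.2.2.2", "email": "z@e.com"},
]
links = linked_orders(orders)
```

Here `o1` and `o2` are linked through a shared IP, and `o2` and `o3` through a shared address. A tabular model would need hand-built aggregate features to see the same relationship.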

The core of the presentation focuses on the challenges of deploying GNNs in a real‑time setting: temporal leakage when constructing a bipartite event‑entity graph, high latency of neighbor queries, and the computational cost of deep models.
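Temporal leakage arises when a node's neighborhood includes events that happened after the event being scored. A minimal sketch of the guard, assuming each event carries a timestamp `ts` and a set of entity keys (both hypothetical field names), is to filter neighbors strictly by time:

```python
def leak_free_neighbors(events, entity, as_of):
    """Return ids of events touching `entity` strictly before `as_of`.

    Events at or after `as_of` are excluded so that no future information
    reaches the order being scored.
    """
    return [
        e["id"]
        for e in events
        if entity in e["entities"] and e["ts"] < as_of
    ]


# Toy events (assumed data): three orders sharing one IP entity.
events = [
    {"id": "o1", "ts": 100, "entities": {"ip:1.1.1.1"}},
    {"id": "o2", "ts": 200, "entities": {"ip:1.1.1.1"}},
    {"id": "o3", "ts": 300, "entities": {"ip:1.1.1.1"}},
]
history = leak_free_neighbors(events, "ip:1.1.1.1", as_of=200)
```

When scoring `o2` at time 200, only `o1` is visible. A plain undirected bipartite graph would also expose `o3`, which is the leakage the talk warns about.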

To address these, a directed dynamic slice graph is introduced, where each time slice forms a sub‑graph and edges are categorized as (1) order‑to‑entity, (2) historical entity‑to‑entity within a time window, and (3) current‑order propagation edges, with “shadow” orders used to prevent future‑information leakage.
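The three edge categories can be sketched as follows. This is one possible reading of the description above, not the talk's implementation: the node naming, the `slice_seconds` and `window` parameters, and the choice to route historical entity information only into a label-free "shadow" copy of each order are all assumptions.

```python
from collections import defaultdict


def build_slice_graph(orders, slice_seconds, window):
    """Build directed edges (source, target, edge_type) for a sliced graph."""
    edges = []
    entity_slices = defaultdict(set)  # entity key -> slice indices where it appears

    # (1) order -> entity, within the order's own time slice.
    for o in orders:
        s = o["ts"] // slice_seconds
        for ent in o["entities"]:
            entity_slices[ent].add(s)
            edges.append((("order", o["id"]), ("entity", ent, s), "order_to_entity"))

    # (2) entity in a past slice -> same entity in a later slice, within the window.
    for ent, slices in entity_slices.items():
        for s in slices:
            for past in slices:
                if s - window <= past < s:
                    edges.append((("entity", ent, past), ("entity", ent, s), "entity_history"))

    # (3) entity in a past slice -> shadow copy of the current order.
    # The shadow node only receives messages, so nothing flows back in time.
    for o in orders:
        s = o["ts"] // slice_seconds
        for ent in o["entities"]:
            for past in entity_slices[ent]:
                if s - window <= past < s:
                    edges.append((("entity", ent, past), ("shadow", o["id"]), "entity_to_shadow"))
    return edges


# Toy orders (assumed data): two orders sharing an IP, one minute apart.
orders = [
    {"id": "o1", "ts": 10, "entities": ["ip:1"]},
    {"id": "o2", "ts": 70, "entities": ["ip:1"]},
]
edges = build_slice_graph(orders, slice_seconds=60, window=2)
```

Because every type (2) and type (3) edge points from an earlier slice to a later one, information from `o2` can never reach the representation used to score `o1`.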

A Lambda‑style architecture is then described: offline embedding of entities via GNNs stored in a key‑value store, and online inference that retrieves a small set of relevant embeddings, combines them with GBDT‑encoded features, and passes them through a final GNN layer for risk scoring.
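The split between offline and online work can be illustrated with a small sketch. It is a simplification under stated assumptions: a Python dict stands in for the key-value store, mean pooling plus a single logistic layer stands in for the final GNN layer, and all names, weights, and feature values are hypothetical.

```python
import math


def score_order(order_features, entity_keys, kv_store, weights, bias, dim):
    """Score one order online from precomputed entity embeddings.

    kv_store maps entity key -> embedding computed offline (assumed layout).
    """
    # Online step 1: fetch only the embeddings linked to this order.
    found = [kv_store[k] for k in entity_keys if k in kv_store]
    # Online step 2: mean-pool them; unseen entities fall back to zeros.
    if found:
        pooled = [sum(v[i] for v in found) / len(found) for i in range(dim)]
    else:
        pooled = [0.0] * dim
    # Online step 3: concatenate with the order's own features and apply
    # a logistic layer (a stand-in for the final GNN layer).
    x = list(order_features) + pooled
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Offline output (assumed): two entity embeddings of dimension 2.
kv_store = {"ip:1": [1.0, 0.0], "email:a": [0.0, 1.0]}
weights = [1.0, 2.0, 0.0]  # one order feature + two embedding dimensions

known = score_order([0.5], ["ip:1", "email:a"], kv_store, weights, 0.0, dim=2)
unknown = score_order([0.5], ["ip:9"], kv_store, weights, 0.0, dim=2)
```

The design point is that the expensive multi-hop aggregation happens offline, so the online path reduces to a few key lookups and one cheap layer.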

Experimental results compare the proposed GNN pipeline against LightGBM and MLP baselines on a large e-commerce fraud dataset. GCN-based models are reported to improve accuracy by roughly 25% over LightGBM. GAT does not outperform GCN, which the talk attributes to limited hyper-parameter tuning.

The talk concludes with a summary of the end-to-end solution (graph partitioning, dynamic slicing, and decoupled inference) and outlines future directions such as exploring temporal GNNs (e.g., TGN) and more sophisticated graph partitioning strategies.

Tags: e-commerce, machine learning, fraud detection, real-time analytics, graph neural networks, payment risk

Written by DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
