Heterogeneous Mini-Graph Neural Network for Fraud Invitation Detection
HMGNN introduces hyper‑nodes to connect many small heterogeneous mini‑graphs and combines attention‑weighted heterogeneous convolution with residual feature transmission. It outperforms linear models, tree‑based models, and a standard GCN on the iQIYI fraud invitation dataset, and it converges faster and reaches higher accuracy than a vanilla GCN on Cora.
The iQIYI risk control team, in collaboration with Nanjing University, presents a study on applying graph neural networks to fraud detection in new‑user acquisition campaigns that grow through viral referrals. The paper outlines the background of aggressive incentive activities, the threats posed by black‑ and gray‑market tools (emulators, multi‑instance apps, device farms, proxy IPs, etc.), and the limitations of existing detection methods such as frequent‑itemset mining, clustering, supervised models, community detection, and standard GNNs.
Challenges
Three main challenges are identified: (1) the prevalence of many small sub‑graphs (most contain fewer than 25 nodes), which hampers information propagation; (2) heterogeneous relationships (invitation, device sharing, network sharing) that cannot be treated uniformly; and (3) a scarcity of labeled data (≈5.7% of nodes). Traditional GCNs and heterogeneous GNNs perform poorly under these conditions.
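The first challenge can be checked on any edge list by measuring the size of each connected component. The following is a minimal sketch using union-find; the function name and the toy edge list are illustrative assumptions and do not come from the paper or its code.

```python
from collections import Counter


def component_sizes(num_nodes, edges):
    """Return the sorted sizes of the connected components (mini-graphs)."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    return sorted(Counter(find(i) for i in range(num_nodes)).values())


# Toy example (hypothetical data): six accounts, three relations.
sizes = component_sizes(6, [(0, 1), (1, 2), (3, 4)])
small_share = sum(s < 25 for s in sizes) / len(sizes)
```

On real data, `small_share` would give the fraction of mini-graphs with fewer than 25 nodes, the threshold mentioned above.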
Proposed Method: Heterogeneous Mini‑Graph Neural Network (HMGNN)
HMGNN introduces a “hyper‑node” concept to connect isolated mini‑graphs into a hyper‑graph, thereby improving connectivity. Each mini‑graph generates a hyper‑node whose feature is the average of its constituent nodes. Additional edges are added: hyper‑node ↔ ordinary node and hyper‑node ↔ hyper‑node (constructed via k‑NN). An attention mechanism is employed to weight different relationship types (invitation, device, network, etc.) during heterogeneous graph convolution. Original node features are concatenated at every convolution layer (a residual‑like design) to prevent gradient vanishing or explosion.
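The hyper‑node construction described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the released implementation: the function name `build_hyper_nodes`, the `graph_ids` input (one mini‑graph label per node), and the choice of Euclidean distance for k‑NN are all hypothetical.

```python
import numpy as np


def build_hyper_nodes(features, graph_ids, k=2):
    """Create one hyper-node per mini-graph and the extra edges around it.

    features:  (n, d) array of ordinary node features.
    graph_ids: (n,) integer array, the mini-graph each node belongs to.
    """
    ids = np.unique(graph_ids)
    # Hyper-node feature = average of the features of its constituent nodes.
    hyper_feats = np.stack([features[graph_ids == g].mean(axis=0) for g in ids])

    # Hyper-node <-> ordinary node edges, stored as (hyper index, node index).
    member_edges = [
        (h, int(n)) for h, g in enumerate(ids) for n in np.flatnonzero(graph_ids == g)
    ]

    # Hyper-node <-> hyper-node edges via k-NN (Euclidean distance is an assumption).
    dists = np.linalg.norm(hyper_feats[:, None, :] - hyper_feats[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a hyper-node is not its own neighbour
    k = min(k, len(ids) - 1)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    knn_edges = [(h, int(j)) for h in range(len(ids)) for j in neighbours[h]]
    return hyper_feats, member_edges, knn_edges
```

The k‑NN edges are what let information flow between mini‑graphs that share no ordinary edge.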
Model Architecture
The overall architecture consists of (1) hyper‑graph construction, (2) multiple relation‑specific graph convolutions, (3) attention‑based aggregation of the convolution results, and (4) residual feature transmission. This design enables the model to learn the importance of each edge type while preserving raw feature information.
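A single layer combining steps (2) to (4) might look like the sketch below. It assumes one learnable scalar logit per relation type, a standard GCN normalization, and a ReLU activation; the paper's exact attention formulation may differ, and the names `hetero_layer`, `attn_logits`, and `weights` are hypothetical.

```python
import numpy as np


def normalize_adj(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def hetero_layer(h, x0, adjs, weights, attn_logits):
    """One layer: relation-specific convolutions, attention fusion, residual concat.

    h:           (n, d_in) current node representations.
    x0:          (n, d0) original node features, re-injected at every layer.
    adjs:        list of (n, n) adjacency matrices, one per relation type.
    weights:     list of (d_in, d_out) matrices, one per relation type.
    attn_logits: (num_relations,) scores, softmaxed into relation weights.
    """
    alpha = softmax(np.asarray(attn_logits, dtype=float))
    # (2) one graph convolution per relation type, with ReLU.
    per_relation = [
        np.maximum(normalize_adj(a) @ h @ w, 0.0) for a, w in zip(adjs, weights)
    ]
    # (3) attention-weighted sum over relation types.
    fused = sum(al * z for al, z in zip(alpha, per_relation))
    # (4) residual feature transmission: concatenate the raw features.
    return np.concatenate([fused, x0], axis=1)
```

Because the output width is `d_out + d0`, the weight matrices of the next layer need `d_out + d0` input rows.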
Experiments
Two datasets are used for evaluation:
iQIYI business dataset: HMGNN outperforms linear models, tree‑based models, and a standard GCN across all metrics.
Cora citation dataset: HMGNN converges faster and achieves higher accuracy than a vanilla GCN.
Results demonstrate that HMGNN effectively addresses the small‑graph and heterogeneity challenges, achieving superior fraud‑invitation detection performance.
Conclusion
The authors describe HMGNN as the first method to apply graph neural networks to fraud invitation detection. By integrating hyper‑nodes, attention‑driven heterogeneous convolution, and residual feature transmission, it addresses the small‑graph and heterogeneity limitations of existing GNN approaches. The authors plan to extend the technique to broader anti‑fraud scenarios such as financial fraud and social‑media abuse. The core code (with iQIYI‑specific components removed) is open‑sourced at https://github.com/iqiyi/HMGNN.
iQIYI Technical Product Team