Artificial Intelligence 18 min read

How Graph Algorithms Power Anti‑Fraud in Marketing and E‑Commerce

This article explores how black‑market cheating in marketing campaigns and e‑commerce is detected using graph‑based techniques such as same‑person mining, label propagation, Fraudar, and GCN models, and discusses future directions like multimodal data fusion and real‑time dynamic graph computation.

Baidu Geek Talk

Jun 18, 2025

How Graph Algorithms Power Anti‑Fraud in Marketing and E‑Commerce

Business Background

In marketing campaigns, black‑market groups automate large‑scale reward harvesting, waste marketing funds, and pollute data metrics, leading to inaccurate operational decisions.

Typical cheating stages include resource preparation (mass‑purchasing accounts), task execution (automated scripts simulating user actions), and fund cash‑out (using dispersed real WeChat accounts).

Same‑Person Mining

We define a "same‑person" group as accounts and devices sharing the same underlying entity. By constructing a graph where user accounts are nodes and device IDs, phone numbers, withdrawal IDs, and encrypted ID cards are edges, we extract maximal connected subgraphs to identify same‑person relationships and apply stability processing across multiple days, achieving 96.8% overall ID stability and 99.3% after filtering unstable merges.

Label Propagation Algorithm and Application

The Label Propagation Algorithm (LPA) is a semi‑supervised graph‑based method for community detection and node classification. It iteratively spreads the most frequent neighbor label to each unlabeled node until convergence.

Advantages: computationally efficient (O(|E|) per iteration), no parameter tuning, naturally parallelizable.

Limitations: possible instability due to random initialization, ignores node features, performs poorly on sparse graphs.

Fraudar Algorithm and Application

Fraudar targets bipartite graphs (e.g., user‑shop). It defines a global suspiciousness score g(S)=f(S)/|S|, where f(S) sums node and edge suspiciousness. The algorithm iteratively removes the node with the lowest contribution, tracking g(S) to select the subgraph with maximal suspiciousness.

Key properties: higher node/edge suspiciousness increases g(S); larger dense subgraphs are more suspicious; smaller subgraphs with the same total suspiciousness are more suspicious.

Optimized with a priority heap, the total complexity becomes O(|E| log|V|). Multiple suspicious subgraphs are obtained by repeatedly removing the identified subgraph and re‑running Fraudar.

GCN Model and Application

GCN (Graph Convolutional Network) aggregates neighbor features to update node representations, enabling end‑to‑end learning on graph data. The core propagation formula is H^{(l+1)} = σ(Â H^{(l)} W^{(l)}), where Â is the normalized adjacency with self‑loops.

Advantages: integrates graph structure with node features, efficient sparse matrix computation, suitable for node classification, graph classification, and link prediction.

Limitations: over‑smoothing in deep layers, assumes static graphs.

Summary and Outlook

We have deployed LPA, Fraudar, and GCN for risk control and anti‑fraud, achieving solid results. Future work includes multimodal data fusion (graph, time‑series, text), real‑time dynamic graph computation to counter rapid black‑market evolution, and enhancing interpretability and adversarial robustness.

Graph algorithms are evolving from auxiliary tools to the core engine of risk control systems, with ongoing challenges in balancing technical performance, compliance, and user experience.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

anti-fraud GCN Risk Detection label propagation graph algorithms Fraudar

Written by

Baidu Geek Talk

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.