Graph Computing Algorithms for E‑commerce Anti‑Fraud and Reselling Bot Detection
The Xiaohongshu anti‑fraud team combats sophisticated same‑group and crowdsourced reselling bots by ingesting real‑time transaction streams into a Nebula Graph, using multi‑hop sub‑graph sampling, label propagation, and modularity‑based community detection to identify suspicious clusters, update risk pools, and enforce personalized purchase‑limit rules.
With the rapid development of Xiaohongshu's community e‑commerce, the variety of marketing activities and user scenarios has expanded, and reselling‑bot ("黄牛", i.e. scalper) tactics have grown increasingly sophisticated. In addition to traditional bulk‑purchase groups, a crowdsourced model has emerged in which real users are invited to purchase discounted items and later transfer the goods and profit.
These bot activities cause platform losses and harm ordinary users and merchants. The Xiaohongshu anti‑fraud team has therefore built efficient, executable graph‑computing models to combat such behavior.
Challenges
Bots often operate as organized groups, requiring detection beyond obvious purchase‑volume signals.
There are two main bot categories: same‑group bots, which register many accounts and switch between identities, and crowdsourced bots, which rely on real users and are therefore hard to distinguish from normal users.
The e‑commerce environment evolves quickly, so anti‑fraud methods must be continuously updated.
Why Graph Computing?
Graphs consist of nodes and edges that naturally model relationships such as user‑product interactions, device bindings, and phone numbers. This multi‑dimensional relational representation captures complex associations that traditional tabular storage cannot, enabling richer feature extraction, faster queries, and more intuitive modeling for fraud detection.
2.1 Same‑Group Bot Graph Algorithm
The algorithm ingests real‑time Kafka streams of transaction logs, deserializing fields such as user UID, device fingerprint, IP address, and product ID. Each entity becomes a node, and edges are created to represent:
Registration and login bindings (e.g., user → linked account).
Usage relationships (e.g., user → device, IP).
Purchase relationships (e.g., user → product, merchant).
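The entity-to-node mapping listed above can be sketched in a few lines. This is a minimal illustration, not the team's pipeline: the record layout, the field names (`uid`, `device`, `ip`, `product`) and the edge-type labels are assumptions, and an in-memory adjacency map stands in for both the Kafka consumer and the graph store.

```python
from collections import defaultdict

# Hypothetical, already-deserialized transaction records (field names are assumptions).
SAMPLE_EVENTS = [
    {"uid": "u1", "device": "d1", "ip": "10.0.0.1", "product": "p1"},
    {"uid": "u2", "device": "d1", "ip": "10.0.0.2", "product": "p1"},
    {"uid": "u3", "device": "d2", "ip": "10.0.0.3", "product": "p2"},
]

# Each non-user field maps to the (node type, edge type) it produces.
EDGE_SCHEMA = {
    "device": ("device", "uses"),
    "ip": ("ip", "uses"),
    "product": ("product", "purchases"),
}


def build_graph(events):
    """Return an undirected adjacency map: node -> set of (edge_type, neighbor)."""
    adj = defaultdict(set)
    for event in events:
        user = ("user", event["uid"])
        for field, (node_type, edge_type) in EDGE_SCHEMA.items():
            value = event.get(field)
            if value is None:
                continue  # skip fields missing from this record
            other = (node_type, value)
            adj[user].add((edge_type, other))
            adj[other].add((edge_type, user))
    return adj


graph = build_graph(SAMPLE_EVENTS)
# Users that share device d1: a typical same-group signal.
users_on_d1 = {node for _, node in graph[("device", "d1")]}
```

Typing nodes as `(type, id)` tuples keeps a device ID from colliding with a user ID that happens to have the same string value.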
These heterogeneous edges are stored in Nebula Graph. By performing multi‑hop sub‑graph sampling, strong‑entity mining, and weak‑label propagation, the system identifies suspicious clusters and updates a risk seed pool in real time.
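As a rough sketch of the sampling and propagation steps, the code below expands a bounded number of hops from known risky seeds and then spreads a decaying risk score to their neighbors. The source does not give the actual update rule, so the averaging scheme, the `decay` factor, and the function names are assumptions; a plain dictionary stands in for queries against the graph store.

```python
from collections import deque


def sample_subgraph(adj, seeds, max_hops=2):
    """Breadth-first expansion: return {node: hop distance} within max_hops of a seed."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def propagate_risk(adj, seeds, nodes, rounds=3, decay=0.5):
    """Seeds stay at 1.0; every other node takes decay * mean score of its sampled neighbors."""
    score = {n: (1.0 if n in seeds else 0.0) for n in nodes}
    for _ in range(rounds):
        updated = {}
        for n in nodes:
            if n in seeds:
                updated[n] = 1.0
                continue
            neighbors = [m for m in adj.get(n, ()) if m in score]
            updated[n] = decay * sum(score[m] for m in neighbors) / len(neighbors) if neighbors else 0.0
        score = updated
    return score


# Toy graph: u1 and u2 share device d1; u2 and u3 share ip1.
adj = {
    "u1": {"d1"},
    "d1": {"u1", "u2"},
    "u2": {"d1", "ip1"},
    "ip1": {"u2", "u3"},
    "u3": {"ip1"},
}
seeds = {"u1"}
subgraph = sample_subgraph(adj, seeds, max_hops=2)
scores = propagate_risk(adj, seeds, subgraph.keys())
```

Nodes whose score passes a chosen threshold could then be added to the seed pool, which is one way to read "updates a risk seed pool in real time".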
2.2 Crowdsourced Bot Community Discovery Algorithm
The method builds a bipartite graph between users and purchased products. Using a modularity‑based community detection algorithm, nodes are iteratively reassigned to the neighboring community that yields the greatest modularity gain until convergence.
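The reassignment loop described here matches the local-moving phase of Louvain-style modularity optimization. The sketch below implements only that phase in plain Python, under the assumption that the bipartite graph has already been projected onto a weighted user-to-user graph (for example, weighted by co-purchased items). The full Louvain method would also merge communities into super-nodes and repeat, which is omitted here.

```python
from collections import defaultdict


def local_moving(edges):
    """edges: {(u, v): weight}, undirected, no self-loops. Returns node -> community id."""
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    two_m = sum(degree.values())          # twice the total edge weight
    community = {n: n for n in adj}       # start with one community per node
    comm_degree = dict(degree)            # total degree per community
    improved = True
    while improved:
        improved = False
        for node in sorted(adj):
            current = community[node]
            links = defaultdict(float)    # weight from node to each neighboring community
            for neighbor, w in adj[node].items():
                links[community[neighbor]] += w
            comm_degree[current] -= degree[node]  # take the node out of its community
            # Modularity gain of joining community c, up to a constant factor:
            # links[c] - comm_degree[c] * degree[node] / two_m
            best = current
            best_gain = links.get(current, 0.0) - comm_degree[current] * degree[node] / two_m
            for comm in sorted(links):
                gain = links[comm] - comm_degree[comm] * degree[node] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = comm, gain
            comm_degree[best] += degree[node]
            if best != current:
                community[node] = best
                improved = True
    return community


# Two tightly co-purchasing triangles joined by one weak edge (toy data).
edges = {
    ("a", "b"): 3, ("a", "c"): 3, ("b", "c"): 3,
    ("d", "e"): 3, ("d", "f"): 3, ("e", "f"): 3,
    ("c", "d"): 1,
}
community = local_moving(edges)
```

On this toy graph the two triangles end up in separate communities. On sparser graphs the local-moving phase alone can stall in small groups, which is why production implementations add the aggregation step.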
Edge weights are computed with a custom similarity metric:
R(A,B) = f(k, CA, CB, Wpurchase, Wreceive, time_window)
where k is the number of co‑purchased items, CA and CB are the purchase counts of users A and B, and Wpurchase / Wreceive reflect purchase and receipt similarity, adapting to promotional periods.
After community detection, each community is profiled and automatically screened to surface high‑risk crowdsourced bots.
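The source leaves the form of f unspecified, so the following is only one plausible instantiation and should not be read as the team's formula. It assumes that purchase similarity is the co-purchase count k normalized by the geometric mean of CA and CB, that Wpurchase and Wreceive act as blending weights, that receipt similarity arrives as a precomputed value in [0, 1], and that time_window limits which purchases count. All parameter names and defaults are hypothetical.

```python
from math import sqrt


def edge_weight(purchases_a, purchases_b, receive_sim, now,
                w_purchase=0.7, w_receive=0.3, time_window=86400.0):
    """Hypothetical R(A, B).

    purchases_a / purchases_b: dict of product_id -> purchase timestamp (seconds).
    receive_sim: assumed precomputed receipt similarity in [0, 1].
    """
    recent_a = {p for p, t in purchases_a.items() if now - t <= time_window}
    recent_b = {p for p, t in purchases_b.items() if now - t <= time_window}
    if not recent_a or not recent_b:
        return 0.0  # no recent activity on one side, so no edge
    k = len(recent_a & recent_b)                 # co-purchased items
    c_a, c_b = len(recent_a), len(recent_b)      # purchase counts CA and CB
    purchase_sim = k / sqrt(c_a * c_b)
    return w_purchase * purchase_sim + w_receive * receive_sim


now = 1_000_000.0
user_a = {"p1": now - 60, "p2": now - 120}
user_b = {"p1": now - 30, "p2": now - 90}
user_c = {"p1": now - 10 * 86400}  # purchase falls outside the window

weight_ab = edge_weight(user_a, user_b, receive_sim=1.0, now=now)
weight_ac = edge_weight(user_a, user_c, receive_sim=1.0, now=now)
```

Shortening `time_window` or raising `w_purchase` during a promotion would be one way to "adapt to promotional periods", though the source does not say how the adaptation is done.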
Additional Anti‑Fraud Measures
The team also implements a closed‑loop workflow: user labeling → interception → product identification → data sinking → risk user re‑scan → label update. This enables real‑time monitoring, early warning based on interception rates, and the construction of comprehensive black‑industry portraits.
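Early warning on interception rates could be approximated with a sliding window over recent order outcomes. The sketch below is an assumption about how such a monitor might look; the window size, the threshold, and the class name are hypothetical and do not come from the source.

```python
from collections import deque


class InterceptionMonitor:
    """Track the share of intercepted orders over the last `window` orders."""

    def __init__(self, window=100, threshold=0.2):
        self.events = deque(maxlen=window)  # oldest outcomes drop off automatically
        self.threshold = threshold

    def rate(self):
        return sum(self.events) / len(self.events) if self.events else 0.0

    def record(self, intercepted):
        """Store one outcome; return True when a full window reaches the threshold."""
        self.events.append(1 if intercepted else 0)
        window_full = len(self.events) == self.events.maxlen
        return window_full and self.rate() >= self.threshold


monitor = InterceptionMonitor(window=4, threshold=0.5)
alerts = [monitor.record(x) for x in [False, True, False, True, True]]
```

Waiting for a full window before alerting avoids warnings triggered by the first handful of orders.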
On the business side, personalized purchase‑limit rules are devised by combining merchant logic, live‑stream discount scenarios, and the Hammurabi risk engine, creating vertical interception strategies that complement graph‑based models.
The presentation concludes with acknowledgments of the Xiaohongshu security technology team members.
Xiaohongshu Tech REDtech
Official account of the Xiaohongshu tech team, sharing tech innovations and problem insights, advancing together.