Beike's Risk Control System: Leveraging Knowledge Graphs and Graph Analytics
The article details how Beike's Agent Cooperation Network employs a multi‑layered risk control framework built on large‑scale knowledge graphs, graph mining, and machine‑learning techniques to detect fake listings, malicious competition, and other threats across both online and offline real‑estate scenarios.
Beike's ACN (Agent Cooperation Network) business model aims to break down silos between brands, listings, users, and agents, creating a cooperative ecosystem that requires a robust risk‑control system.
The primary business risks include fake listings and clients, crawler‑induced data leaks, and malicious competition such as hidden listings or low‑fee tactics, all of which can damage platform reputation and user experience.
Beike's risk‑control requirements are shaped by two factors: its operations span online and offline channels, and its transactions are low‑frequency, high‑value, and long‑cycle. The system must therefore not only identify risk but also produce explainable evidence chains.
Beike implements a hierarchical risk‑control architecture: a data capability layer (agent, property, store, and behavior data), a core technology layer (relationship graph mining, risk tagging, city risk compass), and an output layer covering three stages: before the event (agent and store admission), during the event (real‑time behavior monitoring), and after the event (reports, automated detection, evidence linking).
Relationship graphs are chosen over traditional blacklists and expert rules because they reduce manual effort, handle large‑scale data, and effectively capture both small‑B and large‑B risk patterns, enabling detection of high‑risk violations and subsequent monitoring of medium‑ and low‑risk activities.
The graph has evolved from a factual graph (1 billion nodes, 10 billion edges) to a reasoning graph (behavior, social, operation, and business graphs) and is moving toward a fusion stage that integrates multiple reasoning graphs for deeper risk insight.
The overall graph architecture consists of four layers: a foundational data layer, a knowledge‑construction layer (entity and relation extraction), a knowledge‑mining layer (shortest‑path, community detection, label propagation, graph embedding), and a business‑application layer (source tracing, risk quantification, proactive violation discovery).
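As a rough illustration of one knowledge‑mining primitive named above, the sketch below spreads risk scores from known‑bad seed nodes across an undirected relationship graph. It is a simple damped score propagation, a simplified relative of label propagation, and not Beike's actual rule, which the article does not describe. The function name, node names, damping factor, and iteration count are all hypothetical.

```python
from collections import defaultdict

def propagate_risk(edges, seeds, damping=0.5, iterations=10):
    """Spread risk from seed nodes to their neighbours.

    edges: iterable of (u, v) undirected pairs
    seeds: set of nodes already confirmed as risky (score pinned to 1.0)
    damping: assumed attenuation applied at every hop (hypothetical value)
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    scores = {node: (1.0 if node in seeds else 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = 1.0  # seeds keep their label
            else:
                mean = sum(scores[n] for n in adjacent) / len(adjacent)
                updated[node] = damping * mean
        scores = updated
    return scores

# Hypothetical chain: agent_a - store_1 - agent_b - agent_c
edges = [("agent_a", "store_1"), ("store_1", "agent_b"), ("agent_b", "agent_c")]
scores = propagate_risk(edges, seeds={"agent_a"})
```

In this toy chain the score decays with distance from the seed, which is the behaviour one would want before handing candidates to manual review.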
Technically, Beike uses Spark GraphX for graph analysis and JanusGraph for graph queries, citing Spark's community support and GraphX's efficiency on the analysis side and JanusGraph's support for visualization on the query side.
Application scenarios include admission control (path searching and risk‑path ranking), risk quantification, quality management, proactive risk discovery (seed‑based monitoring, behavior pattern mining, Louvain community detection, and graph‑embedding‑based ML), and case tracing through multi‑dimensional node attributes.
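The admission‑control scenario (path searching plus risk‑path ranking) can be sketched as follows. This is a minimal example under stated assumptions: the graph is a small in‑memory adjacency dict, the ranking rule is simply "shorter path to a known‑risky node ranks higher", and all node names and function names are hypothetical. The article does not specify Beike's ranking criteria.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    if source not in graph:
        return None
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

def rank_risk_paths(graph, applicant, risky_nodes):
    """Return paths from the applicant to known-risky nodes, shortest first."""
    paths = []
    for risky in risky_nodes:
        path = shortest_path(graph, applicant, risky)
        if path is not None:
            paths.append(path)
    return sorted(paths, key=lambda p: (len(p), p))

# Hypothetical relationship graph around a newly applying agent.
graph = {
    "new_agent": ["phone_1", "store_9"],
    "phone_1": ["new_agent", "banned_agent"],
    "store_9": ["new_agent", "agent_x"],
    "agent_x": ["store_9", "closed_store"],
    "banned_agent": ["phone_1"],
    "closed_store": ["agent_x"],
}
ranked = rank_risk_paths(
    graph, "new_agent", ["closed_store", "banned_agent", "unrelated"]
)
```

Each returned path doubles as an evidence chain: it lists the intermediate entities (here a shared phone number or store) that connect the applicant to a known violator.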
For automated, machine‑learning‑based discovery, Beike applies the Louvain algorithm, which detects communities by optimizing modularity, and graph‑embedding techniques such as Node2Vec, which convert graph structure into vector representations for downstream risk analysis.
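To make the modularity objective concrete, the sketch below computes the standard Newman modularity Q of a given partition of an unweighted, undirected graph. It only evaluates the quantity that Louvain tries to maximize; it does not implement Louvain's node‑moving and aggregation phases. The toy graph and community labels are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    Q = sum over communities c of  L_c / m  -  (d_c / (2 * m)) ** 2
    where m is the edge count, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal_edges = defaultdict(int)
    total_degree = defaultdict(int)
    for u, v in edges:
        total_degree[community_of[u]] += 1
        total_degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal_edges[community_of[u]] += 1
    return sum(
        internal_edges[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )

# Two triangles joined by a single bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)    # 5/14, about 0.357
q_merged = modularity(edges, merged)  # 0.0
```

The partition that separates the two triangles scores higher than lumping every node together, which is why a modularity‑maximizing method would report them as two groups.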
Future planning focuses on high‑density subgraph mining to uncover hidden groups, combined with graph fusion and embedding methods to enhance the graph's foundational capabilities for broader risk governance and user growth use cases.
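The article does not say which method Beike plans to use for high‑density subgraph mining. As one illustration, the sketch below uses greedy peeling, a well‑known approximation for the densest‑subgraph problem (density measured as edges divided by nodes), swapped in here purely as an example. The function name and toy graph are hypothetical.

```python
from collections import defaultdict

def densest_subgraph(edges):
    """Greedy peeling: repeatedly remove a minimum-degree node and keep the
    intermediate node set with the highest density (edges / nodes)."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)

    remaining = set(neighbours)
    edge_count = sum(len(adj) for adj in neighbours.values()) // 2
    best_nodes = set(remaining)
    best_density = edge_count / len(remaining) if remaining else 0.0

    while len(remaining) > 1:
        # Ties are broken by name so the result is deterministic.
        node = min(remaining, key=lambda n: (len(neighbours[n]), str(n)))
        for other in neighbours[node]:
            neighbours[other].discard(node)
        edge_count -= len(neighbours[node])
        remaining.discard(node)
        del neighbours[node]
        density = edge_count / len(remaining)
        if density > best_density:
            best_nodes, best_density = set(remaining), density
    return best_nodes, best_density

# A 4-clique (a, b, c, d) with a sparse tail d - e - f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("e", "f")]
group, density = densest_subgraph(edges)
```

On this toy input the peeling strips the sparse tail and returns the tightly connected clique, the kind of hidden group the planning item refers to.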
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.