Building and Applying an Industry Knowledge Graph: Lessons from Beike Real Estate
The article explains how Beike Real Estate constructs an industry knowledge graph by integrating internal and external data, outlines the technical framework and data processing steps, and demonstrates its AI-driven applications such as intelligent Q&A, recommendation, and decision support for the real‑estate market.
Traditional industries are increasingly moving online, but low digitalization and slow information flow make it hard for companies to track market changes and assess their own position.
Industry knowledge graphs address this by aggregating and fusing internal and external data, revealing overall development trends and empowering further growth.
The presentation is organized around three questions: who we are, where we are, and where we are heading. Internal data (transactions, user behavior) answers the first by giving a clear view of company scale, while external data (competitor benchmarks, policy, upstream/downstream entities, POI, user groups) answers the second by quantifying market position.
To build the graph, Beike integrates five categories of external data: competitor products, professional content (policy, macro‑economics), upstream/downstream partners, the surrounding environment (hospitals, schools), and user groups. The data is then cleaned and matching entities are fused, with a focus on communities, buildings, stores, and metric systems.
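The article does not describe how Beike implements cleaning and fusion. As a rough illustration only, the Python sketch below fuses community records from different sources by grouping them on a normalized name within a city. The field names (`source`, `city`, `name`), the normalization rule, and the "first record wins" conflict policy are all assumptions made for this example, not Beike's method.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase and drop whitespace/punctuation so near-identical names compare equal."""
    return re.sub(r"[\s\-_.,()]+", "", name.strip().lower())


def fuse_entities(records: list[dict]) -> list[dict]:
    """Group records by (city, normalized name) and merge their attributes.

    Hypothetical policy: the first record seen wins on conflicting attributes,
    and every contributing source is kept for provenance.
    """
    groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for rec in records:
        groups[(rec["city"], normalize_name(rec["name"]))].append(rec)

    fused = []
    for (city, _), members in groups.items():
        merged = {"city": city, "name": members[0]["name"], "sources": []}
        for rec in members:
            merged["sources"].append(rec["source"])
            for key, value in rec.items():
                if key not in ("city", "name", "source"):
                    merged.setdefault(key, value)  # keep the earliest value
        fused.append(merged)
    return fused


# Made-up sample records: two describe the same Beijing community.
records = [
    {"source": "internal", "city": "Beijing", "name": "Sunshine Garden", "buildings": 12},
    {"source": "external_poi", "city": "Beijing", "name": "sunshine-garden",
     "nearest_school": "No. 1 Primary"},
    {"source": "internal", "city": "Shanghai", "name": "Sunshine Garden", "buildings": 8},
]
fused = fuse_entities(records)
```

Production-scale fusion would normally also use addresses, coordinates, and fuzzy matching; exact matching on a normalized key is simply the smallest version of the idea.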
The resulting graph contains 48 billion triples, more than 140 entity types (agents, stores, schools, parks, listings, etc.), about 230 relationship types, and about 1,800 attributes. It was initially stored in Neo4j/JanusGraph and is now being migrated to Dgraph.
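To make the terms concrete, the following in-memory Python sketch shows how typed entities, relationship triples, and attribute triples fit together. It is a toy model for illustration and does not reflect the actual schema or the storage APIs of Neo4j, JanusGraph, or Dgraph; the class, the identifiers, and the predicate names are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TinyGraph:
    """Toy triple store: a triple whose object is a known entity is a
    relationship; otherwise it is treated as an attribute value."""

    def __init__(self) -> None:
        self.entity_types: dict[str, str] = {}
        self.out_edges: dict[str, list[Triple]] = defaultdict(list)

    def add_entity(self, entity_id: str, entity_type: str) -> None:
        self.entity_types[entity_id] = entity_type

    def add_triple(self, subject: str, predicate: str, obj: str) -> None:
        self.out_edges[subject].append(Triple(subject, predicate, obj))

    def relations(self, subject: str) -> list[Triple]:
        return [t for t in self.out_edges.get(subject, []) if t.obj in self.entity_types]

    def attributes(self, subject: str) -> list[Triple]:
        return [t for t in self.out_edges.get(subject, []) if t.obj not in self.entity_types]


graph = TinyGraph()
graph.add_entity("listing:1", "listing")
graph.add_entity("community:sunshine", "community")
graph.add_entity("school:no1", "school")
graph.add_triple("listing:1", "located_in", "community:sunshine")  # relationship
graph.add_triple("community:sunshine", "near", "school:no1")       # relationship
graph.add_triple("listing:1", "area_sqm", "89")                    # attribute
```

Here `graph.relations("listing:1")` returns only the `located_in` triple, while `graph.attributes("listing:1")` returns the `area_sqm` triple.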
With the graph in place, Beike enables several AI‑driven capabilities: intelligent Q&A assistants for agents, recommendation and reasoning based on graph relationships, community discovery (risk alerts, user profiling), and intelligence analysis that supports efficiency gains, data growth, and strategic decision‑making.
Specific applications include the "XiaoBei" chatbot for real‑time knowledge answering, enhanced search that suggests related listings when no direct results are found, and AI‑generated VR house‑viewing scripts.
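The enhanced-search behavior can be pictured as a fallback over graph relationships. The sketch below is one possible reading, not Beike's implementation: if a community has no listings, it returns listings from communities linked to it in the graph. The function name, the two dictionaries, and the idea that "related" means neighboring communities are assumptions for illustration.

```python
def search_listings(
    community: str,
    listings_by_community: dict[str, list[str]],
    related_communities: dict[str, list[str]],
) -> dict:
    """Return direct matches, or listings from related communities as a fallback."""
    direct = listings_by_community.get(community, [])
    if direct:
        return {"match": "direct", "listings": direct}

    # No direct result: follow graph relationships to related communities.
    suggestions: list[str] = []
    for neighbor in related_communities.get(community, []):
        suggestions.extend(listings_by_community.get(neighbor, []))
    return {"match": "related", "listings": suggestions}


# Made-up sample data.
listings_by_community = {
    "Sunshine Garden": ["listing:1", "listing:2"],
    "Lakeside Court": ["listing:3"],
}
related_communities = {
    "Maple Residence": ["Lakeside Court", "Sunshine Garden"],
}

direct_result = search_listings("Sunshine Garden", listings_by_community, related_communities)
fallback_result = search_listings("Maple Residence", listings_by_community, related_communities)
```

Returning the `match` label alongside the listings lets the caller tell the user that the results are suggestions and not exact matches.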
The article concludes by emphasizing the graph’s role in boosting intelligence, supporting GMV growth, and guiding future development goals such as serving 200 million families.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.