
From Zero to One: Building and Deploying Knowledge Graphs at Beike Real Estate

This article details the evolution, architecture, and practical applications of knowledge graphs at Beike Real Estate, covering their historical background, five‑view advantages, data pipelines, ontology construction, intelligent search, recommendation, and chatbot integration, while also discussing challenges and future directions.

DataFunTalk

The talk, originally presented by senior knowledge‑graph engineer Wang Heqing at a DataFun Talk AI salon, introduces knowledge graphs, explains why they can be implemented at Beike, showcases concrete use cases, and outlines challenges and future prospects.

It begins with a brief history of knowledge graphs, from the semantic networks of the 1960s and later work on ontologies, through the Semantic Web and linked data, to Google's Knowledge Graph in 2012. Throughout, the purpose is the same: representing real-world entities with unique IDs, attribute-value pairs, and relationships.
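As a rough illustration of that data model (unique IDs, attribute-value pairs, relationships), here is a minimal in-memory sketch. The class names, the `near_station` predicate, and the sample IDs are hypothetical and are not taken from Beike's system.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A real-world thing with a unique ID and attribute-value pairs."""
    entity_id: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relation:
    """A directed, labeled edge between two entity IDs."""
    subject_id: str
    predicate: str
    object_id: str


class TinyGraph:
    def __init__(self):
        self.entities = {}   # entity_id -> Entity
        self.relations = []  # list of Relation

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity

    def relate(self, subject_id, predicate, object_id):
        # Only allow edges between entities that are already registered.
        for entity_id in (subject_id, object_id):
            if entity_id not in self.entities:
                raise KeyError(f"unknown entity: {entity_id}")
        self.relations.append(Relation(subject_id, predicate, object_id))

    def neighbors(self, subject_id, predicate):
        """Return the entities reachable from subject_id via predicate."""
        return [
            self.entities[r.object_id]
            for r in self.relations
            if r.subject_id == subject_id and r.predicate == predicate
        ]


# Hypothetical sample data, for illustration only.
graph = TinyGraph()
graph.add_entity(Entity("listing:1", {"rooms": 2, "area_m2": 68}))
graph.add_entity(Entity("station:7", {"name": "Example Station"}))
graph.relate("listing:1", "near_station", "station:7")
```

Calling `graph.neighbors("listing:1", "near_station")` returns the station entity, which is the kind of lookup a search or question-answering feature would build on.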

Five key advantages are described: Web‑level semantic linking, NLP‑level text extraction, KR‑level knowledge representation, AI‑level reasoning support, and DB‑level graph storage, emphasizing both generic and vertical domain scenarios such as search, chatbots, finance, e‑commerce, public safety, and agriculture.

Beike’s rich data sources—billions of structured entities (listings, customers, districts, stations) and massive unstructured conversational logs—enable the construction of a domain‑specific knowledge graph that powers intelligent search, recommendation, and question‑answering to improve user experience and business conversion.

The knowledge-graph system architecture is layered. The layers described are data acquisition (crawlers and internal databases), preprocessing (normalization, fusion, inference), storage (Elasticsearch or Neo4j, with backups in HDFS/Hive), and application (intelligent assistants, customer service, search, visualization). The ontology is built in Protégé and defines classes (e.g., traffic, location, person, organization) and properties (object properties, data properties, and custom constraints).
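To make the ontology vocabulary concrete, the sketch below models classes, object properties (entity to entity), and data properties (entity to literal) in plain Python and checks triples against them. It is a simplification under stated assumptions: the talk's ontology is authored in Protégé, not in Python, and the property names `worksFor`, `locatedIn`, and `name` are hypothetical.

```python
from dataclasses import dataclass

# Class names follow the examples given in the text.
CLASSES = {"Traffic", "Location", "Person", "Organization"}


@dataclass(frozen=True)
class ObjectProperty:
    """Links an instance of `domain` to an instance of `range`."""
    name: str
    domain: str
    range: str


@dataclass(frozen=True)
class DataProperty:
    """Links an instance of `domain` to a literal of type `datatype`."""
    name: str
    domain: str
    datatype: type


# Hypothetical properties, for illustration only.
OBJECT_PROPERTIES = {
    p.name: p
    for p in (
        ObjectProperty("worksFor", "Person", "Organization"),
        ObjectProperty("locatedIn", "Organization", "Location"),
    )
}
DATA_PROPERTIES = {
    p.name: p for p in (DataProperty("name", "Person", str),)
}


def check_object_triple(subject_class, prop_name, object_class):
    """True if the property exists and both ends match its domain and range."""
    prop = OBJECT_PROPERTIES.get(prop_name)
    return (
        prop is not None
        and prop.domain == subject_class
        and prop.range == object_class
    )


def check_data_triple(subject_class, prop_name, value):
    """True if the property exists, the domain matches, and the value type fits."""
    prop = DATA_PROPERTIES.get(prop_name)
    return (
        prop is not None
        and prop.domain == subject_class
        and isinstance(value, prop.datatype)
    )
```

A check like this is one place where a preprocessing step could reject malformed triples before they reach storage, although the source does not say how Beike enforces its constraints.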

In the IM intelligent assistant, user queries pass through NLU (word segmentation, NER, intent detection), are translated into SPARQL or SQL queries over the graph, and the results are rendered with conversational templates. Frequently asked questions are modeled as triples (who/what/where/why/how) to enable precise, semantics-driven answers.
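The query flow above (NLU, then query generation, then template rendering) can be sketched roughly as follows. This is a toy version under several assumptions: entity recognition is a lookup in a small name list, intent detection is keyword matching, the `ex:` prefix and predicate names are invented for the example, and the query is only built as a string, not executed. A production system would use trained segmentation, NER, and intent models.

```python
import re

# Hypothetical community names and intent keywords, for illustration only.
COMMUNITIES = ("Sunshine Garden",)
INTENT_KEYWORDS = {
    "average_price": ("price", "cost"),
    "nearest_station": ("subway", "station", "metro"),
}
# Hypothetical predicates in an example namespace.
INTENT_PREDICATE = {
    "average_price": "ex:averagePrice",
    "nearest_station": "ex:nearestStation",
}
TEMPLATES = {
    "average_price": "The average price in {entity} is {value}.",
    "nearest_station": "The nearest station to {entity} is {value}.",
}


def parse(query):
    """Return (entity, intent); either may be None if nothing matches."""
    text = query.lower()
    entity = next((n for n in COMMUNITIES if n.lower() in text), None)
    tokens = set(re.findall(r"[a-z]+", text))
    intent = next(
        (name for name, words in INTENT_KEYWORDS.items() if tokens & set(words)),
        None,
    )
    return entity, intent


def to_sparql(entity, intent):
    """Build a SPARQL query string for a recognized entity and intent."""
    # `entity` comes from the fixed name list above, never from raw user
    # text, so it is safe to place inside the string literal here.
    predicate = INTENT_PREDICATE[intent]
    return (
        "PREFIX ex: <http://example.org/kg#>\n"
        "SELECT ?value WHERE {\n"
        f'  ?s ex:name "{entity}" .\n'
        f"  ?s {predicate} ?value .\n"
        "}"
    )


def render(entity, intent, value):
    """Fill the conversational template for the detected intent."""
    return TEMPLATES[intent].format(entity=entity, value=value)
```

For the question "What is the average price in Sunshine Garden?", `parse` yields the community name and the `average_price` intent, `to_sparql` produces a query over the hypothetical `ex:averagePrice` predicate, and `render` formats whatever value the graph store returns.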

Additional applications include semantic search optimization, recommendation of nearby properties when direct matches are absent, and a custom graph‑visualization platform that lets end users explore relationships between entities.
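The nearby-property fallback can be sketched as below. The source does not say how "nearby" is determined, so this example assumes great-circle distance between community coordinates and a fixed radius; the field names, the coordinates, and the 3 km default are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * asin(sqrt(a))


def recommend(listings, communities, wanted, radius_km=3.0):
    """Return direct matches, or listings in nearby communities if none exist.

    listings: list of dicts with a "community" key.
    communities: dict mapping community name -> (lat, lon).
    """
    direct = [item for item in listings if item["community"] == wanted]
    if direct:
        return direct
    if wanted not in communities:
        return []  # No location known, so nothing to fall back on.

    lat, lon = communities[wanted]
    nearby = []
    for item in listings:
        other_lat, other_lon = communities[item["community"]]
        distance = haversine_km(lat, lon, other_lat, other_lon)
        if distance <= radius_km:
            nearby.append((distance, item))
    nearby.sort(key=lambda pair: pair[0])  # Closest communities first.
    return [item for _, item in nearby]


# Hypothetical sample data, for illustration only.
communities = {
    "A": (39.90, 116.40),
    "B": (39.91, 116.41),
    "C": (40.10, 116.80),
}
listings = [
    {"id": 1, "community": "B"},
    {"id": 2, "community": "C"},
]
```

With this data, a search for community "A" finds no direct match and falls back to the listing in "B", which lies within the radius, while "C" is too far away to be suggested.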

The presentation concludes with a summary of the five-view benefits, the three prerequisites for successful deployment (data, ontology, intelligent scenarios), and the challenges that remain on the way to building the most authoritative real-estate knowledge graph, such as the need for domain experts to curate the ontology and the effort of structuring massive volumes of unstructured data.

Author bio: Wang Heqing, senior knowledge‑graph engineer at Beike, previously at Sogou, responsible for research and productization of knowledge graphs. Team intro: Beike’s Intelligent Search team focuses on search, recommendation, chatbot, and large‑scale data warehousing, leveraging AI, NLP, and graph technologies.
