How Fangdd Scales Real‑Estate Search with Elasticsearch: Architecture & Lessons
This article explains how Fangdd leverages Elasticsearch to boost search performance across consumer, broker, and internal products, detailing a platformized architecture that separates indexing and querying, addresses operational challenges, and outlines design patterns for index management and incremental updates.
Fangdd has been exploring the use of internet technology to improve service and transaction efficiency in the real‑estate industry, a concept often referred to as the "Industrial Internet".
Long industry chains and many user roles demand high collaborative efficiency across multiple products.
Transforming a traditional industry with internet technology therefore requires a high SLA at every functional node of every product before user efficiency genuinely improves.
The company faces several real‑world search scenarios:
The consumer (C‑end) product provides high‑volume search across listings, projects, and updates.
Broker product offers complex queries for projects, listings, consultations, and online visits.
Back‑office operations products involve even more complex retrieval conditions, wide data associations, and large data volumes.
To improve performance and efficiency, Fangdd uses Elasticsearch as a platform for search and complex query services.
Beyond the obvious search use cases, Elasticsearch also allows the separation of complex query logic from core microservices, reducing both service and database complexity. For example, in the new‑home order‑transaction domain, order and transaction microservices handle write operations and simple queries while a dedicated query service syncs data to Elasticsearch for complex, multi‑service queries.
Challenges encountered include:
Multiple Elasticsearch clusters consume excessive server resources because different business lines cannot share resources.
Development teams focus on business features and lack time for cluster operation and optimization, affecting high availability.
High learning and resource costs for integrating Elasticsearch cause some teams to avoid it, increasing data‑model and code complexity.
To standardize search usage, Fangdd abstracts common problems:
Index creation and updates.
Full or batch index rebuilding.
Index querying.
Authentication, permission control, and monitoring during queries.
Platform maintenance.
Platform Design
The design separates index creation and querying into two services: Indexer and Searcher. All other services interact with Elasticsearch through these wrappers.
Indexer provides functions for creating indexes, incremental updates, and full rebuilds.
Searcher offers query interfaces, performing identity verification, permission checks, and monitoring.
An index management UI allows visual configuration of indexes.
Index Creation Details
Index creation is broken down into sub‑problems such as defining data scope, acquiring data, and synchronizing updates. Fangdd defines an IndexInfoProvider interface for these tasks:
<code>public interface IndexInfoProvider {

    /**
     * Minimum document ID in the index
     */
    long getMinId();

    /**
     * Maximum document ID in the index
     */
    long getMaxId();

    /**
     * Get all documents whose IDs are in [start, end)
     */
    List<IndexDocument> getDocsByIdRange(long start, long end);

    /**
     * Get documents for specific IDs
     */
    List<IndexDocument> getDocsByIds(List<Long> ids);
}
</code>

Implementations of IndexInfoProvider are exposed as Dubbo services, using the group attribute to separate different indexes, e.g.:
<code><dubbo:service group="amc_port" ref="amcPortIndexInfoProvider" interface="com.fangdd.searchplatform.indexer.protocol.IndexInfoProvider" version="1.0.0"/></code>

Administrators add new indexes via the index management UI, configure rebuild schedules and mappings, and the Indexer automatically registers the corresponding Dubbo consumer.
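To make the contract concrete, here is a minimal sketch of what an IndexInfoProvider implementation might look like. The source does not show one, so the class name ListingIndexInfoProvider, the in-memory TreeMap store, and the IndexDocument fields are all hypothetical; a real provider would read from the business database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical document type; the real IndexDocument is not shown in the source.
class IndexDocument {
    final long id;
    final String body;
    IndexDocument(long id, String body) { this.id = id; this.body = body; }
}

// Sketch of a provider backed by an in-memory sorted map; a real
// implementation would query the business database instead.
class ListingIndexInfoProvider {
    private final TreeMap<Long, IndexDocument> store = new TreeMap<>();

    void put(IndexDocument doc) { store.put(doc.id, doc); }

    public long getMinId() { return store.isEmpty() ? 0 : store.firstKey(); }

    public long getMaxId() { return store.isEmpty() ? 0 : store.lastKey(); }

    // All documents with IDs in the half-open range [start, end)
    public List<IndexDocument> getDocsByIdRange(long start, long end) {
        return new ArrayList<>(store.subMap(start, true, end, false).values());
    }

    // Documents for specific IDs; unknown IDs are silently skipped.
    public List<IndexDocument> getDocsByIds(List<Long> ids) {
        List<IndexDocument> out = new ArrayList<>();
        for (Long id : ids) {
            IndexDocument doc = store.get(id);
            if (doc != null) out.add(doc);
        }
        return out;
    }
}
```

Keeping the range half-open, [start, end), lets the rebuild loop chain segments without fetching any document twice.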
Full Rebuild Process
The full‑rebuild sequence splits the primary‑key range [min, max] into fixed‑size segments, calls getDocsByIdRange for each segment, and bulk‑writes the returned documents into a new Elasticsearch index.
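The segment arithmetic above can be sketched as follows; the FullRebuild class and the segment size are illustrative, not Fangdd's actual code. Because getDocsByIdRange uses a half-open range, the final segment must reach max + 1 to include the last document.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the full-rebuild driver: split [min, max] into fixed-size
// half-open segments [start, end) suitable for getDocsByIdRange.
class FullRebuild {
    static List<long[]> splitRange(long min, long max, long step) {
        List<long[]> segments = new ArrayList<>();
        for (long start = min; start <= max; start += step) {
            // end is exclusive, so the last segment extends to max + 1
            long end = Math.min(start + step, max + 1);
            segments.add(new long[] { start, end });
        }
        return segments;
    }
}
```

The rebuild loop would then iterate over these segments, call getDocsByIdRange(start, end) for each, and bulk‑index the results into the new index.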
Incremental Updates
When business data changes, the updater sends the index name and affected IDs to a message queue. The Indexer consumes the message, retrieves the latest documents via getDocsByIds, and updates Elasticsearch, achieving near‑real‑time incremental indexing.
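The incremental path can be sketched like this. The ChangeMessage shape, the delete-on-missing behavior, and the map standing in for the Elasticsearch index are assumptions; the source only describes the message-queue flow.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the incremental-update consumer: a change message carries
// the index name and affected IDs; the consumer re-fetches the latest
// documents and upserts them, mirroring Indexer's use of getDocsByIds.
class IncrementalIndexer {
    static class ChangeMessage {
        final String indexName;
        final List<Long> ids;
        ChangeMessage(String indexName, List<Long> ids) {
            this.indexName = indexName;
            this.ids = ids;
        }
    }

    // Stand-in for the Elasticsearch index: docId -> document body.
    final Map<Long, String> index = new HashMap<>();

    // latestDocs stands in for the IndexInfoProvider lookup.
    void onMessage(ChangeMessage msg, Map<Long, String> latestDocs) {
        for (Long id : msg.ids) {
            String doc = latestDocs.get(id);
            if (doc == null) {
                index.remove(id);   // source record gone: remove from index
            } else {
                index.put(id, doc); // create or update in place
            }
        }
    }
}
```

Re-fetching by ID on every message keeps the index eventually consistent even when messages arrive out of order, since each update writes the current state rather than a delta.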
Query Service
Applications request Elasticsearch queries through the Searcher, providing an appId and appKey for authentication. Each query is logged for monitoring, enabling usage statistics, rate‑limiting, and future throttling.
The Searcher supports:
Encapsulated Elasticsearch requests covering filters, fuzzy matching, aggregations, etc.
Direct Elasticsearch HTTP queries.
Elasticsearch Java API queries.
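The authentication and monitoring steps can be sketched as a small gatekeeper. The key store, the counter used for usage statistics, and the names SearcherAuth and authorize are all hypothetical; the source says only that callers present an appId and appKey and that each query is logged.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the Searcher's appId/appKey check with a per-app query
// counter feeding usage statistics and future rate limiting.
class SearcherAuth {
    private final Map<String, String> appKeys = new HashMap<>();
    private final Map<String, Integer> queryCounts = new HashMap<>();

    void register(String appId, String appKey) { appKeys.put(appId, appKey); }

    // Verifies the caller and records the query before the request is
    // allowed through to Elasticsearch.
    boolean authorize(String appId, String appKey) {
        String expected = appKeys.get(appId);
        if (expected == null || !expected.equals(appKey)) {
            return false; // unknown app or bad key: reject before querying
        }
        queryCounts.merge(appId, 1, Integer::sum); // usage stats for monitoring
        return true;
    }

    int queriesFor(String appId) { return queryCounts.getOrDefault(appId, 0); }
}
```

Counting queries at the gate, rather than inside each query path, gives a single place to later attach rate limiting and throttling.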
Future Plans
Generalize incremental interfaces to handle non‑sequential IDs.
Introduce safeguards for heavy aggregation queries to protect cluster resources.
Enhance real‑time monitoring and rate‑limiting for query endpoints.
Develop better operational and optimization tools for long‑term cluster maintenance.
Deployment Strategy
Different business scenarios are deployed on separate Elasticsearch clusters:
To‑C services run on isolated clusters to handle large public traffic and mitigate attacks.
To‑B services, serving enterprise customers, run on dedicated clusters with stricter SLA requirements.
Log‑analysis clusters (ELK) operate under a "write‑heavy, read‑light" pattern with distinct storage and query strategies.
Fangduoduo Tech
Sharing Fangduoduo's product and tech insights, delivering value, and giving back to the open community.