
Didi Elasticsearch Overview: Architecture, Deployment, Performance, and Operations

Didi’s Elasticsearch platform, built on ES 7.6 and deployed on physical machines with containerized gateway and control layers, provides a multi‑tenant, high‑performance search service—featuring a user console, operational controls, ZGC‑based latency reductions, cost‑saving compression, custom security, real‑time cross‑datacenter replication, and a roadmap toward ES 8.13.

Didi Tech

Elasticsearch is an open‑source, distributed, RESTful full‑text search engine built on Lucene. Every field can be indexed, and the system can scale horizontally to hundreds of servers, handling terabytes of data with very low latency for storage, search, and analysis.

Didi’s Elasticsearch (ES) platform powers the majority of the company’s online text retrieval, a portion of log processing, and vector‑search scenarios such as map POI lookup, order search, customer‑service queries, internal search, and log analysis. Since 2020 the platform has been upgraded from 2.x to 7.6.0, with continuous improvements in stability, cost control, efficiency, and ecosystem health.

Architecture

The overall product architecture consists of ES services deployed on physical machines, while the Gateway and control layers run in containers. Physical‑machine deployment was chosen for higher stability in online text‑search workloads.

Control Layer

Key functions include intelligent segment merge (preventing segment bloat and Full GC), index lifecycle management, index pre‑creation (avoiding midnight OOM spikes), and tenant governance.
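The index pre‑creation step can be reduced to a small scheduling routine: before midnight, compute tomorrow's daily index names and create them ahead of time so that the first write of the day never triggers on‑the‑fly index creation. A minimal sketch, assuming daily‑rolling indices with a `_YYYYMMDD` suffix (the naming convention is an illustrative assumption, not Didi's published format):

```python
from datetime import date, timedelta

def precreate_names(templates, today=None):
    """Return the index names to create ahead of midnight for a list of
    daily-rolling templates. The suffix format is an assumption."""
    today = today or date.today()
    tomorrow = today + timedelta(days=1)
    return [f"{t}_{tomorrow:%Y%m%d}" for t in templates]
```

A control‑layer job would run this in the evening and issue create‑index calls for each returned name, so the midnight write burst lands on indices that already exist.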

Gateway Layer

The Gateway handles read/write forwarding, query optimisation (e.g., rewriting BKD queries to numeric equality or range queries), three‑level rate limiting (by AppID, index‑template, and DSL), tenant authentication, and SQL support (a customised version of the open‑source ES‑SQL from NLPChina). All external search traffic is exposed through the Gateway API.
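The three‑level rate limiting can be modelled as a chain of counters checked in order: AppID first, then index template, then the individual DSL statement. A minimal sketch using fixed one‑second windows (the window scheme, key names, and limits are illustrative assumptions; Didi's actual algorithm is not published):

```python
import time

class TieredRateLimiter:
    """Sketch of three-level rate limiting: AppID -> template -> DSL.
    A request is rejected at the first tier whose limit is exceeded."""

    def __init__(self, limits):
        self.limits = limits    # {(level, key): max requests per second}
        self.windows = {}       # {(level, key): (window_start, count)}

    def allow(self, appid, template, dsl_hash):
        now = time.time()
        for level, key in (("appid", appid),
                           ("template", template),
                           ("dsl", dsl_hash)):
            limit = self.limits.get((level, key))
            if limit is None:
                continue                     # no limit configured at this tier
            start, count = self.windows.get((level, key), (now, 0))
            if now - start >= 1.0:           # roll over to a fresh 1 s window
                start, count = now, 0
            if count >= limit:
                return False                 # reject; request is throttled
            self.windows[(level, key)] = (start, count + 1)
        return True
```

A production limiter would typically use sliding windows or token buckets and roll back counters on rejection; the fixed window here keeps the tiering logic visible.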

User Console

The console provides business teams with a platform to request AppIDs, manage indices (creation, permission, mapping, cleanup), and perform queries via Kibana, DSL, or SQL. Monitoring panels show index metadata (document count, size) and read/write rates.

Operations Control Platform

The platform supports daily operational needs: cluster management (logical clusters spanning multiple physical machines), tenant management with per‑tenant rate limiting, template management (shard‑count adjustments, template throttling), anomaly analysis (template, slow‑query, and exception analysis), and operation logs.

Elasticsearch Applications at Didi

Online full‑text search (e.g., map POI start‑end lookup)

MySQL real‑time snapshots for order queries

One‑stop log retrieval via Kibana (trace logs)

Time‑series analysis for security monitoring

Simple OLAP for internal dashboards

Vector search for customer‑service RAG

Deployment Model

Physical‑machine + small‑cluster deployment, scaling up to roughly 100 physical nodes.

Access Methods

Create indices through the user console; business teams select the target cluster.

Query through the Gateway using a VIP address; an SDK abstracts the HTTP calls and routes requests to the appropriate ES cluster.
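The SDK's routing responsibility is essentially a lookup from index template to cluster VIP. A minimal sketch, assuming daily indices carry a date suffix (the routing table, VIP names, and suffix convention are illustrative assumptions):

```python
class SearchClient:
    """Sketch of an SDK that hides HTTP details and routes a request
    to the right ES cluster VIP based on the index template."""

    def __init__(self, routing):
        self.routing = routing  # {index template: gateway VIP}

    def resolve(self, index):
        # Strip a trailing date suffix like "_20240201" to recover the template.
        template = index.rsplit("_", 1)[0] if index[-1].isdigit() else index
        vip = self.routing.get(template)
        if vip is None:
            raise KeyError(f"no cluster route for {index}")
        return vip
```

The actual SDK would then issue the HTTP request against the resolved VIP, so business code never hard‑codes cluster addresses.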

Data Synchronisation

Two synchronisation approaches are used:

Real‑time: Log → ES, MQ (Kafka/Pulsar) → ES, MySQL → ES via a unified DSINK tool built on Flink.

Offline: Hive → ES using batch load. MapReduce generates Lucene files, which are stored in HDFS, pulled to DataNodes, and appended to ES via a custom AppendLucene plugin.

Engine Iteration – Fine‑Grained Tiered Protection

Cluster‑level isolation: four protection levels (log cluster, public cluster, independent cluster, dual‑active cluster). Mis‑assigned clusters can be migrated transparently via DCDR.

Clientnode isolation: reads and writes are served by separate clientnodes, so a failing write‑side clientnode impacts only writes, not queries.

Datanode region isolation: problematic indices can be labelled and migrated to specific nodes to avoid affecting other workloads.

Multi‑Active Construction – Didi Cross‑Datacenter Replication (DCDR)

DCDR provides push‑based, real‑time replication between ES clusters, using a leader‑follower index model with checkpoints, sequence numbers, and a write queue to avoid OOM during massive data copy.
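The interplay of sequence numbers, checkpoints, and bounded batching can be illustrated with a toy model: the leader assigns monotonically increasing sequence numbers, the follower's checkpoint records the highest one applied, and each push moves at most a fixed batch past the checkpoint so a large backlog never sits in memory at once. This is a sketch of the mechanism described above, not DCDR's actual implementation; all names and the batch size are assumptions:

```python
class DCDRLink:
    """Toy model of push-based leader-follower replication with
    checkpoints and a bounded per-push batch."""

    def __init__(self, batch_size=2):
        self.seq = 0
        self.leader_log = []     # (seq_no, op) pairs written on the leader
        self.follower = []       # ops applied on the follower
        self.checkpoint = -1     # highest seq_no the follower has applied
        self.batch_size = batch_size

    def write(self, op):
        self.leader_log.append((self.seq, op))
        self.seq += 1

    def push_batch(self):
        """Replicate at most batch_size ops past the checkpoint."""
        pending = [(s, op) for s, op in self.leader_log if s > self.checkpoint]
        for s, op in pending[: self.batch_size]:
            self.follower.append(op)
            self.checkpoint = s       # advance only after the op is applied
        return self.checkpoint
```

Because the checkpoint only advances after ops are applied, a crashed push simply resumes from the last checkpoint, and the batch cap is what prevents OOM during a massive initial copy.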

Performance Optimisation – JDK 17 + ZGC

JDK 11 + G1 produced GC pauses that exceeded P99 latency targets (e.g., 180 ms for POI queries against a 60 ms target). Switching to JDK 17 + ZGC reduced pause times to under 10 ms, while JDK 17 + G1 alone improved throughput by ~15 %. The upgrade work also covered Groovy syntax changes, plugin refactoring, dependency updates, and a ZGC monitoring metric suite.
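In a stock Elasticsearch `jvm.options` file, the collector switch itself is a one‑line change, using the file's version‑conditional syntax so the flag only applies on JDK 17+. This is a minimal sketch; the exact flag set Didi runs is not published:

```
## jvm.options sketch: enable ZGC on JDK 17 or newer
17-:-XX:+UseZGC
```

Any G1‑specific tuning lines (region size, reserve percent, and the like) would need to be removed or version‑gated at the same time.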

After ZGC deployment, payment‑cluster P99 latency dropped from 800 ms to 30 ms (96 % reduction) and average query time fell from 25 ms to 6 ms (75 % reduction). Log clusters saw a 15‑20 % write‑performance boost.

Cost Optimisation

Index mapping optimisation (disable unnecessary inverted/forward indexes).

Adopt ZSTD compression, reducing CPU usage by ~15 %.

Integrate with the big‑data asset‑management platform to identify and retire unused partitions and indices.
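The mapping optimisation above comes down to turning off index structures a field will never use: `"index": false` drops the inverted index for fields that are stored but never searched, and `"doc_values": false` drops the columnar forward index for fields never sorted or aggregated on. A sketch of such a request body, with illustrative field names (not Didi's actual schemas):

```python
import json

# Slimmed-down mapping: raw_payload is returned in results but never
# queried, sorted, or aggregated, so both index structures are disabled.
mapping = {
    "mappings": {
        "properties": {
            "order_id": {"type": "keyword"},
            "raw_payload": {
                "type": "keyword",
                "index": False,        # no inverted index
                "doc_values": False,   # no forward (columnar) index
            },
        }
    }
}
body = json.dumps(mapping)  # body for a create-index request
```

Both settings are standard Elasticsearch mapping parameters; the saving compounds across billions of documents, which is why mapping review sits alongside compression in the cost programme.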

Multi‑Tenant Resource Isolation

Inspired by Presto’s resource‑group isolation, the search thread pool is split into multiple sub‑pools per AppID, each with configurable thread counts and queue sizes. This prevents a single tenant’s bursty queries from saturating CPU and affecting other tenants.

The workflow extracts the AppID, looks up its isolation configuration, and dispatches the request to the corresponding sub‑thread‑group.
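That dispatch step can be sketched as a map from AppID to a dedicated bounded pool, with a shared fallback for tenants without an explicit configuration. Pool sizes and names here are illustrative assumptions, and a real engine-side implementation would also bound the per‑pool queue:

```python
from concurrent.futures import ThreadPoolExecutor

class TenantDispatcher:
    """Sketch of Presto-style resource-group isolation: each AppID gets
    its own thread pool, so one tenant's burst cannot starve the rest."""

    def __init__(self, pool_sizes, default_size=2):
        self.pools = {appid: ThreadPoolExecutor(max_workers=n)
                      for appid, n in pool_sizes.items()}
        self.default = ThreadPoolExecutor(max_workers=default_size)

    def submit(self, appid, fn, *args):
        # Look up the tenant's isolation config; fall back to the shared pool.
        pool = self.pools.get(appid, self.default)
        return pool.submit(fn, *args)
```

A burst from one AppID then queues only inside that tenant's pool, while other tenants' search threads keep running.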

Data Security

Gateway‑level authentication (AppID‑based).

ES‑node level authentication (Clientnode, Datanode, MasterNode).

Didi built a lightweight custom security plugin that performs simple string checks at the HTTP layer, enabling rolling upgrades, one‑click enable/disable, and avoiding the stability issues of X‑Pack.
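The "simple string checks at the HTTP layer" amount to parsing a credential header and comparing it against a static table, with no per-document security model. A minimal sketch in that spirit, assuming a Basic‑auth style header (the header name, scheme, and credential store are illustrative assumptions, not the plugin's actual design):

```python
import base64

def check_request(headers, credentials):
    """Return True if the request carries a known user:password pair."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Basic "):
        return False
    try:
        user, _, password = base64.b64decode(auth[6:]).decode().partition(":")
    except Exception:
        return False        # malformed header: reject
    return credentials.get(user) == password
```

Keeping the check this small is what makes rolling upgrades and one‑click enable/disable practical, in contrast to the heavier X‑Pack security machinery.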

Stability Governance

Pre‑incident prevention: yearly stability “mine‑clearing” projects solved 61 issues (e.g., Gateway Full GC, thread‑local leaks).

Real‑time monitoring and alerting (hardware, shard counts, master pending tasks, MQ lag).

Grafana dashboards for cluster, node, and shard metrics, enabling sub‑5‑minute fault localisation.

Self‑healing mechanisms (disk‑failure auto‑recovery, traffic‑spike throttling) and dual‑active failover drills.

Summary & Outlook

Currently Didi runs ES 7.6 across all online search scenarios. The next major goal is a smooth upgrade to ES 8.13 to gain a more efficient master node, automatic disk‑balancing, reduced segment memory footprint, and new features such as ANN vector search. Performance work will focus on write‑path optimisation and improved merge strategies, while exploring the machine‑learning capabilities of newer ES releases.

Tags: Performance Optimization, Big Data, Search Engine, Elasticsearch, Security, Didi, Multi-tenancy
Written by Didi Tech, the official Didi technology account.