
Elasticsearch Adoption Cases in Chinese Companies: JD.com, Ctrip, Qunar, 58.com, Didi and More

This article surveys how major Chinese internet companies such as JD.com, Ctrip, Qunar, 58.com and Didi have adopted Elasticsearch and the Elastic Stack for high‑volume order queries, log analysis, real‑time monitoring, security analytics, and large‑scale distributed search, describing their architecture evolution, shard strategies, and operational practices.

Architecture Digest

May 8, 2020


Many Chinese companies, including JD.com, Didi, Toutiao, Ele.me, 360 Security, Xiaomi and Vivo, use Elasticsearch.

Beyond search, combined with Kibana, Logstash and Beats, the Elastic Stack is widely applied in near‑real‑time big‑data analysis, including log analysis, metric monitoring, information security and other fields.

It helps teams explore massive structured and unstructured data, create visual reports on demand, set alert thresholds on monitoring data, and even apply machine-learning techniques to detect anomalies automatically.

1. Evolution of Elasticsearch in the JD Daojia (JD.com to Home) Order Center

In the JD Daojia order center, orders placed through external merchants and dependencies from internal upstream and downstream systems both generate a huge number of order queries, so the workload is read-heavy and write-light.

Order data is stored in MySQL, but relying solely on the database for massive queries is impractical, and MySQL does not handle complex queries well, so the order center uses Elasticsearch to bear the main query pressure.

As a powerful distributed search engine, Elasticsearch supports near-real-time storage and search and plays a major role in the JD Daojia order system. The ES cluster stores 1 billion documents and serves 500 million queries per day.

With rapid business growth, the ES cluster architecture has evolved into a real‑time active‑backup solution that ensures stable read/write performance.

The architecture uses a VIP (virtual IP) for external load balancing. The first layer consists of gateway nodes, which are ES client nodes that act as intelligent load balancers and distribute requests.

The second layer consists of data nodes that store data and execute operations. The cluster keeps three copies of the data: one set of primary shards and two sets of replica shards. Requests forwarded from the gateway nodes are balanced round-robin before they reach the data nodes. Adding a replica set and scaling out machines increases throughput and query performance.
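The round-robin forwarding described above can be sketched in a few lines of Python. This is only an illustration, not JD's gateway code: the class name `RoundRobinGateway` and the node names are hypothetical, and a real ES client node does much more than pick a target.

```python
from itertools import cycle


class RoundRobinGateway:
    """Hypothetical stand-in for a gateway node that forwards requests."""

    def __init__(self, data_nodes):
        if not data_nodes:
            raise ValueError("at least one data node is required")
        # cycle() yields the nodes in order and starts over after the last one.
        self._nodes = cycle(data_nodes)

    def pick_node(self):
        """Return the data node that should receive the next request."""
        return next(self._nodes)


# Node names are placeholders for illustration only.
gateway = RoundRobinGateway(["data-1", "data-2", "data-3"])
picked = [gateway.pick_node() for _ in range(5)]
print(picked)
```

Each data node receives requests in turn, so no single copy of the data absorbs all the query load.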

Choosing the appropriate number of shards is critical; the team performed extensive load testing and settled on a shard configuration that balances single‑ID query throughput and pagination aggregation performance.
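As a rough sketch of what such a configuration looks like, the snippet below builds an index-creation body with the standard `index.number_of_shards` and `index.number_of_replicas` settings. The article does not state the shard count the team settled on, so the value 5 is a placeholder, and the helper names are hypothetical. Two replicas match the one-primary, two-replica layout described above.

```python
import json


def order_index_body(primary_shards, replicas=2):
    """Build the settings body for creating an order index."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primary_shards, replicas):
    """Each primary shard exists once, plus once per replica set."""
    return primary_shards * (1 + replicas)


# primary_shards=5 is an assumed value, not the one used by JD Daojia.
body = order_index_body(primary_shards=5)
copies = total_shard_copies(5, 2)
print(json.dumps(body, indent=2))
print(copies)
```

More primary shards spread a large aggregation over more nodes, but every search must also visit more shards, which is the trade-off the load tests were meant to settle.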

Most ES query traffic targets recent orders, so older closed orders are archived to a historical order database after a certain number of days.
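A minimal sketch of this archiving rule follows. The article does not give the cutoff, so `ARCHIVE_AFTER_DAYS = 30` is an assumption, as are the field names `status` and `closed_at`.

```python
from datetime import datetime, timedelta

ARCHIVE_AFTER_DAYS = 30  # assumed value; the article only says "a certain number of days"


def split_for_archive(orders, now, archive_after_days=ARCHIVE_AFTER_DAYS):
    """Separate orders that stay in the live index from those to archive."""
    cutoff = now - timedelta(days=archive_after_days)
    keep, archive = [], []
    for order in orders:
        # Only closed orders older than the cutoff leave the live index.
        if order["status"] == "closed" and order["closed_at"] <= cutoff:
            archive.append(order)
        else:
            keep.append(order)
    return keep, archive


now = datetime(2020, 5, 8)
orders = [
    {"id": 1, "status": "closed", "closed_at": datetime(2020, 3, 1)},
    {"id": 2, "status": "closed", "closed_at": datetime(2020, 5, 1)},
    {"id": 3, "status": "open", "closed_at": None},
]
keep, archive = split_for_archive(orders, now)
print([o["id"] for o in keep], [o["id"] for o in archive])
```

Keeping only recent and open orders in the live index bounds its size, which keeps the hot query path fast as total order volume grows.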

There is no single best architecture; the goal is to find the most suitable one that can evolve as business grows, delivering higher throughput, better performance and stronger stability.

2. Ctrip Elasticsearch Application Cases

2.1 Ctrip Hotel Order Elasticsearch Practice

Ctrip indexes data from its sharded databases into Elasticsearch in real time and routes queries through an independent web service, which makes querying more convenient without sacrificing performance.

Elasticsearch was chosen for its lightweight nature, ease of use, and strong distributed support; the installation package is only a few tens of megabytes.

Reference: http://developer.51cto.com/art/201807/579354.htm

2.2 Ctrip Flight Elasticsearch Cluster Operations

Data flows from Kafka to Elasticsearch via ETL and is stored in hot, warm and cold tiers (HDFS for cold data, databases and caches for warm and hot data). There are two main application scenarios: traditional BI reporting for decision makers, and fast data-driven decision loops in which programs consume analysis results to adjust strategies.

Reference: http://www.sohu.com/a/199672012_411876

2.3 Ctrip Large-Scale Elasticsearch Cluster Management Insights

The largest log cluster has 120 data nodes on 70 physical servers and indexes 60 billion records per day (25 TB of new index files, or 50 TB with one replica).

Peak indexing rate reaches millions of records per second.

The retention period varies from 10 to 90 days, depending on business needs.

The cluster contains 3 441 indices, 17 000 shards and about 9.3 × 10¹¹ documents, consuming roughly 1 PB of disk.
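A quick back-of-the-envelope calculation puts these figures in perspective. It assumes decimal units (1 TB = 10¹² bytes, 1 PB = 10¹⁵ bytes) and an even spread across shards, neither of which the source states, so the results are rough averages only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_records = 60_000_000_000      # 60 billion records indexed per day
daily_index_bytes = 25 * 10**12     # 25 TB of new index files, before replication
total_docs = 930_000_000_000        # about 9.3 x 10^11 documents
total_shards = 17_000
total_disk_bytes = 10**15           # roughly 1 PB

avg_records_per_second = daily_records / SECONDS_PER_DAY
avg_bytes_per_record = daily_index_bytes / daily_records
avg_docs_per_shard = total_docs / total_shards
avg_gb_per_shard = total_disk_bytes / total_shards / 10**9

print(f"average indexing rate: {avg_records_per_second:,.0f} records/s")
print(f"average index size per record: {avg_bytes_per_record:.0f} bytes")
print(f"average documents per shard: {avg_docs_per_shard / 1e6:.1f} million")
print(f"average disk per shard: {avg_gb_per_shard:.1f} GB")
```

The daily average works out to roughly 0.7 million records per second, which is consistent with a peak rate in the millions.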

Reference: https://www.jianshu.com/p/6470754b8248

3. Qunar Order Center Elasticsearch Solution

In 2015, Qunar's hotel business exceeded 300,000 orders per day, and aggregated orders across platforms reached about 1 million per day.

Previously, a hot-table sharding approach kept orders from the most recent six months in one table and older orders in a history table, but this could not meet the scale required for the Ctrip-Elong integration.

With projected data volumes exceeding 100 million records, a new approach was needed.

Elasticsearch was introduced to handle storage and search for order data, abstracting the order model and separating searchable fields (stored in ES) from detailed fields (stored in DB).

Simple queries by OrderNo go to the DB, while complex searches go directly to Elasticsearch.
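The field split and the query routing described above can be sketched as follows. The field names are hypothetical, since the article does not list Qunar's schema, and the detail record is assumed to keep `order_no` as its key alongside the non-searchable fields.

```python
# Hypothetical searchable fields; the real order model is not given in the article.
SEARCHABLE_FIELDS = {"order_no", "user_id", "status", "check_in_date"}


def split_order(order):
    """Split an order into a search document (for ES) and a detail record (for the DB)."""
    search_doc = {k: v for k, v in order.items() if k in SEARCHABLE_FIELDS}
    # Assumption: the detail record is keyed by order_no and holds everything else.
    detail = {
        k: v for k, v in order.items()
        if k == "order_no" or k not in SEARCHABLE_FIELDS
    }
    return search_doc, detail


def choose_backend(query):
    """Send a lookup by order number alone to the DB, anything else to Elasticsearch."""
    return "db" if set(query) == {"order_no"} else "es"


order = {
    "order_no": "Q1001",
    "user_id": 42,
    "status": "paid",
    "check_in_date": "2015-10-01",
    "guest_note": "late arrival",
    "invoice_title": "ACME",
}
search_doc, detail = split_order(order)
print(search_doc)
print(detail)
print(choose_backend({"order_no": "Q1001"}), choose_backend({"status": "paid"}))
```

Keeping only searchable fields in the index keeps documents small, while primary-key lookups stay on the database, which already handles them well.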

Each index has 8 shards; a single index holds 140 million documents (total ~200 million), occupying 64 GB. The cluster machines have 240 GB disk capacity.

Reference: https://elasticsearch.cn/article/6197

4. Elasticsearch in 58.com Information Security Department

The Elastic Stack is fully deployed in 58.com's information security department. The referenced slides cover the integration background, storage selection, performance challenges, master and data node optimizations, security practices, tuning for high-throughput, low-latency search, and a localized Kibana for product and operations use.

Reference: https://elasticsearch.cn/slides/124

5. Didi Elasticsearch Multi‑Cluster Architecture Practice

Since early 2016 Didi has built an Elasticsearch platform with over 3 500 instances, more than 5 PB of data, and peak write throughput exceeding 20 million TPS.

Elasticsearch supports a wide range of scenarios: core ride‑hailing map search, multi‑dimensional queries for customer service and operations, and log services for thousands of internal platforms.

In a single‑cluster setup, writes and queries are managed by Sink and Gateway services.

5.1 Sink Service

Almost all data written to Elasticsearch is consumed from Kafka. The Sink service ingests business logs, MySQL binlog data and custom reports into Elasticsearch in real time.

The service protects the cluster from overwhelming write loads and has been extracted into a separate Didi Sink data delivery platform that can sync data to Elasticsearch, HDFS, Ceph and other storage systems.

With the multi-cluster architecture, the same MQ data can be written to multiple Elasticsearch clusters for disaster recovery and fault-tolerant rollback.
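The sketch below illustrates the two ideas in this section: writing in bounded batches so that a burst of messages never reaches a cluster as one oversized request, and fanning the same batch out to several clusters. It is not Didi's implementation. `InMemoryCluster` is a hypothetical stand-in for a real cluster client, and fixed-size batching is just one simple way to limit write pressure.

```python
class InMemoryCluster:
    """Hypothetical stand-in for an Elasticsearch cluster client."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.docs = []

    def bulk_write(self, docs):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        self.docs.extend(docs)


def batches(messages, size):
    """Yield consecutive slices of at most `size` messages."""
    for start in range(0, len(messages), size):
        yield messages[start:start + size]


def sink(messages, clusters, batch_size=2):
    """Write every batch to every cluster; return failed batch counts per cluster."""
    failures = {cluster.name: 0 for cluster in clusters}
    for batch in batches(messages, batch_size):
        for cluster in clusters:
            try:
                cluster.bulk_write(batch)
            except ConnectionError:
                # One cluster being down must not block writes to the others.
                failures[cluster.name] += 1
    return failures


primary = InMemoryCluster("es-primary")
backup = InMemoryCluster("es-backup", healthy=False)
messages = [{"id": i} for i in range(5)]
failures = sink(messages, [primary, backup])
print(len(primary.docs), len(backup.docs), failures)
```

Because the messages remain in the queue, a cluster that missed batches could be caught up later by re-consuming them, although this sketch only counts the failures.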

5.2 Gateway Service

All query traffic passes through the Gateway service, which implements Elasticsearch's HTTP REST and TCP protocols so that clients can access Elasticsearch through language-specific SDKs.

The Gateway also provides a SQL interface, access control, logging, rate limiting, degradation, index storage separation, DSL‑level throttling, and multi‑cluster disaster recovery capabilities.
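Rate limiting of the kind a gateway applies can be illustrated with a token bucket. The source does not say which algorithm Didi's Gateway uses, so the token bucket here is a swapped-in, commonly used technique, and the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Allow up to `capacity` burst requests, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock          # injectable so the behaviour can be tested
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# A fake clock makes the example deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=2, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]   # third request exceeds the burst
fake_now[0] = 0.5                            # half a second later, one token is back
after_wait = bucket.allow()
print(burst, after_wait)
```

A gateway could keep one bucket per client or per query pattern, which would correspond to the per-DSL throttling mentioned above.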

Reference: https://mp.weixin.qq.com/s/K44-L0rclaIM40hma55pPQ

6. Practical Order Search Solution Based on Elasticsearch

Elasticsearch supports structured data queries and frequent real-time updates, addressing pain points of traditional order queries and reporting.

The business line uses a service‑oriented approach: the Elasticsearch cluster and database are partitioned, and the order service encapsulates them into a unified external API for front‑end, back‑end and reporting applications.

Reference: https://my.oschina.net/u/2485991/blog/533163