Elasticsearch Version Upgrade: Architecture, Challenges, and Performance Optimization at Didi
Over seven months, Didi's Elasticsearch team upgraded more than 30 clusters, more than 2,000 nodes and 4 PB of data from version 2.3.3 to 6.6.1. The team overcame protocol and mapping incompatibilities with a multi-version Arius Gateway, a custom Java SDK, ECM and AMS. The upgrade saved 1 PB of storage, freed more than 400 machines, raised query performance by 40% and write throughput by 30%, and cut CPU use by 10%, for an estimated cost reduction of 800,000 (80w) per month.
This article describes how Didi's Elasticsearch team upgraded 30+ clusters, 2,000+ nodes, and 4 PB of data from version 2.3.3 to 6.6.1 over 7 months. The upgrade had to deal with protocol incompatibility between the two versions, mapping differences, and resource constraints, all with zero impact on user queries.
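To make the mapping differences concrete: Elasticsearch 5.0 replaced the `string` field type with `text` and `keyword`, and the `index` property changed from `analyzed` / `not_analyzed` / `no` to a boolean. The following is a minimal sketch of a converter for that one difference, not the team's actual tooling. The function names `convert_field` and `convert_mapping` are hypothetical, and mapping a non-indexed string to `keyword` with `index: false` is a choice made for this sketch.

```python
from copy import deepcopy


def convert_field(field: dict) -> dict:
    """Return a 6.x-style copy of one 2.x field definition (hypothetical helper)."""
    out = deepcopy(field)
    if out.get("type") == "string":
        # 2.x default for string fields was "analyzed".
        index = out.pop("index", "analyzed")
        if index == "analyzed":
            out["type"] = "text"
        else:
            # "not_analyzed" and "no" both become keyword in this sketch.
            out["type"] = "keyword"
            if index == "no":
                out["index"] = False
    elif "index" in out and isinstance(out["index"], str):
        # Non-string fields: "no" -> False, anything else -> indexed (the default).
        if out.pop("index") == "no":
            out["index"] = False
    # Recurse into object sub-fields and multi-fields.
    for key in ("properties", "fields"):
        if key in out:
            out[key] = {name: convert_field(sub) for name, sub in out[key].items()}
    return out


def convert_mapping(properties: dict) -> dict:
    """Convert the `properties` section of a 2.x mapping."""
    return {name: convert_field(field) for name, field in properties.items()}
```

A real migration also has to handle the one-mapping-type-per-index restriction introduced in 6.x, which this sketch does not cover.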
The team upgraded the surrounding architecture as well: Arius Gateway gained multi-version support, a custom ES Java SDK absorbed the TCP/HTTP differences between versions, and an ElasticSearch Cluster Manager (ECM) made cluster operations more efficient. The team also built AMS (Arius MetaData Service) for monitoring and analysis.
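As an illustration of what multi-version support in a gateway can involve, the sketch below routes a request to a cluster by index pattern and adapts it to that cluster's version. It is an assumption-laden example, not Arius Gateway's design: the `Cluster` type, the `ROUTES` table, the index patterns, and the URLs are all hypothetical. The one version rule it encodes is documented Elasticsearch behavior: from 6.0 on, REST requests with a body must carry a `Content-Type` header.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    version: tuple  # e.g. (2, 3, 3)
    url: str


# Hypothetical routing table: first matching pattern wins.
ROUTES = [
    ("order_log_*", Cluster("legacy", (2, 3, 3), "http://es2.example.internal:9200")),
    ("*", Cluster("upgraded", (6, 6, 1), "http://es6.example.internal:9200")),
]


def route(index: str) -> Cluster:
    """Pick the cluster that currently serves this index."""
    for pattern, cluster in ROUTES:
        if fnmatchcase(index, pattern):
            return cluster
    raise LookupError(f"no cluster configured for index {index!r}")


def prepare_request(index: str, path: str, body: Optional[str],
                    headers: Optional[dict] = None) -> dict:
    """Build the outgoing request, adapting it to the target cluster's version."""
    cluster = route(index)
    out_headers = dict(headers or {})
    # 6.0+ rejects bodies without an explicit Content-Type; 2.x did not require one.
    if body is not None and cluster.version >= (6, 0, 0):
        out_headers.setdefault("Content-Type", "application/json")
    return {"url": f"{cluster.url}/{index}/{path}", "headers": out_headers, "body": body}
```

Keeping the version logic in one place means clients can keep sending the same requests while individual indices move from the old cluster to the new one.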
Resources were optimized through data-driven storage tiering, mapping optimization, and the introduction of FastIndex for offline data import, which saved 1 PB of storage and returned more than 400 physical machines. On the performance side, query performance improved by 40%, write throughput by 30%, and CPU usage fell by 10%.
The upgrade was backed by extensive testing, including a system that replayed query traffic and compared the results to verify data consistency and performance parity. The migration shows how a large-scale Elasticsearch upgrade can be carried out while keeping the service stable and cutting costs by an estimated 800,000 (80w) per month.
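The replay comparison can be sketched as follows: send the same recorded query to both versions, strip the fields that legitimately differ (timing, scores), and compare what remains. This is a minimal sketch under stated assumptions, not the team's system. The helper names are hypothetical, and it assumes both responses use the standard search response layout with `hits.total` as a plain integer, as in 2.x and 6.x.

```python
def normalize(response: dict) -> dict:
    """Keep only the fields expected to match across versions."""
    hits = response.get("hits", {})
    return {
        "timed_out": response.get("timed_out", False),
        "total": hits.get("total"),
        "ids": [hit["_id"] for hit in hits.get("hits", [])],
    }


def compare(old_resp: dict, new_resp: dict, ordered: bool = True) -> list:
    """Return human-readable differences; an empty list means the responses agree."""
    old, new = normalize(old_resp), normalize(new_resp)
    diffs = []
    for key in ("timed_out", "total"):
        if old[key] != new[key]:
            diffs.append(f"{key}: {old[key]!r} != {new[key]!r}")
    old_ids, new_ids = old["ids"], new["ids"]
    if not ordered:
        # Scoring can differ between versions, so order may be ignored on request.
        old_ids, new_ids = sorted(old_ids), sorted(new_ids)
    if old_ids != new_ids:
        diffs.append(f"ids: {old_ids!r} != {new_ids!r}")
    return diffs


def latency_ratio(old_resp: dict, new_resp: dict) -> float:
    """Ratio of server-side `took` times; values below 1.0 mean the new cluster was faster."""
    return new_resp["took"] / max(old_resp["took"], 1)
```

Running `compare` over a sample of replayed production queries gives a consistency report, and aggregating `latency_ratio` over the same sample gives a first estimate of performance parity.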
Didi Tech
Official Didi technology account