Elasticsearch 8.x Performance Boosts, New Features, and Migration Guide
This article describes how upgrading from Elasticsearch 5.x/2.x to 8.x improves search, aggregation, and write performance while reducing storage costs. It introduces vector KNN search, synthetic _source, time‑series data streams (TSDS), searchable snapshots, and security enhancements, and provides example requests for enterprise search platforms.
Elasticsearch 8.x (8.13 at the time of writing) offers significant performance improvements over 5.x/2.x, including 30‑50% faster search, 60‑90% faster aggregations, 20‑30% faster writes, and roughly 20% lower storage costs. It also adds new capabilities such as vector KNN, RRF ranking, ESRE, searchable snapshots, and built‑in time‑series storage through time‑series data streams (TSDS).
The middleware search team upgraded the XSearch platform to support ES8 clusters, enabling seamless one‑click upgrades from older versions and reducing operational overhead.
Background : The company still runs many ES5.x and a few ES2.x clusters, causing slow queries, timeouts, and rising costs due to constant scaling. Growing demand for vector storage and retrieval driven by large language models further motivates the upgrade.
Performance Gains :
Search performance: +30‑50%
Aggregation performance: +60‑90% (some queries 2‑10× faster)
Write performance: +20‑30%
Storage cost: ~20% reduction
Range Query : ES8 reduces latency by ~30% compared with ES5, as shown by esrally benchmarks.
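To make the comparison concrete, here is a minimal Python sketch of the kind of range query such a benchmark exercises, together with the arithmetic behind a "~30% lower latency" figure. The field name `timestamp`, the helper names, and the sample latencies are hypothetical; they are not taken from the esrally runs mentioned above.

```python
def range_query(field, gte, lt):
    """Build an Elasticsearch range query body matching gte <= field < lt."""
    return {"query": {"range": {field: {"gte": gte, "lt": lt}}}}


def latency_reduction(old_ms, new_ms):
    """Fractional latency reduction when going from old_ms to new_ms."""
    if old_ms <= 0:
        raise ValueError("old_ms must be positive")
    return (old_ms - new_ms) / old_ms


# Hypothetical numbers: the same query at 100 ms on ES5 and 70 ms on ES8.
body = range_query("timestamp", "2024-01-01", "2024-02-01")
reduction = latency_reduction(100.0, 70.0)
print(body)
print(f"latency reduction: {reduction:.0%}")
```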
Wildcard Query : ES 7.9 added a dedicated wildcard field type, which makes wildcard and pattern matching on long string values more efficient. Example mapping and query:
PUT my-index-000001
{
"mappings": {
"properties": {
"my_wildcard": {
"type": "wildcard"
}
}
}
}
GET my-index-000001/_search
{
"query": {
"wildcard": {
"my_wildcard": "*quite*lengthy"
}
}
}
Vector KNN Retrieval : ES8 uses HNSW for approximate nearest‑neighbor search. Sample query:
POST image-index/_search
{
"knn": {
"field": "image-vector",
"query_vector": [-5, 9, -12],
"k": 10,
"num_candidates": 100
}
}
ES8 also integrates the Elastic Learned Sparse Encoder (ELSER) for sparse vector retrieval, usable via text_expansion queries:
GET my_index/_search
{
"query": {
"text_expansion": {
"ml.tokens": {
"model_id": ".elser_model_1",
"model_text": "Sample"
}
}
}
}
Mixed ranking combines BM25 relevance scores with KNN similarity using either linear weighting or Reciprocal Rank Fusion (RRF). Example of linear fusion for image search:
POST image-index/_search
{
"query": {
"multi_match": {
"query": "flower",
"fields": ["title", "description"],
"boost": 0.6
}
},
"knn": {
"field": "image-vector",
"query_vector": [-5, 9, -12],
"k": 5,
"num_candidates": 100,
"boost": 0.4
}
}
RRF, by contrast, merges the BM25 and vector rankings by rank position rather than by raw score, so it requires no weight tuning.
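The difference between the two fusion strategies can be sketched in plain Python. This illustrates the general techniques, not Elasticsearch's internal implementation: the function names, document IDs, and scores are made up, and the rank constant `k=60` is a commonly used value for RRF rather than a setting taken from this article.

```python
from collections import defaultdict


def linear_fusion(bm25_scores, knn_scores, w_text=0.6, w_vec=0.4):
    """Weighted sum of raw scores; a doc missing from one list scores 0 there."""
    doc_ids = set(bm25_scores) | set(knn_scores)
    fused = {
        d: w_text * bm25_scores.get(d, 0.0) + w_vec * knn_scores.get(d, 0.0)
        for d in doc_ids
    }
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up results for a single query.
bm25 = {"a": 12.0, "b": 9.5, "c": 3.1}   # unbounded text relevance scores
knn = {"b": 0.93, "d": 0.90, "a": 0.42}  # vector similarities

bm25_ranking = sorted(bm25, key=bm25.get, reverse=True)
knn_ranking = sorted(knn, key=knn.get, reverse=True)

print(linear_fusion(bm25, knn))
print(rrf([bm25_ranking, knn_ranking]))
```

In this toy data the weighted sum is dominated by the larger BM25 scores, so the two methods order the documents differently. RRF looks only at rank positions, which is why it can be used without choosing weights.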
Time‑Series Data Streams (TSDS) : ES8 adds time‑series database capabilities by storing metrics in time‑series data streams, reducing storage by ~70% and allowing native time‑based queries. Example index template:
{
"index_patterns": ["metrics-weather_sensors-*"],
"data_stream": {},
"template": {
"settings": {
"index.mode": "time_series",
"index.routing_path": ["sensor_id", "location"]
},
"mappings": {
"properties": {
"sensor_id": {"type": "keyword", "time_series_dimension": true},
"location": {"type": "keyword", "time_series_dimension": true},
"temperature": {"type": "half_float", "time_series_metric": "gauge"},
"humidity": {"type": "half_float", "time_series_metric": "gauge"}
}
}
}
}
Searchable Snapshots : Allows querying archived indices directly without restoring, improving efficiency for large backup data.
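As a rough sketch of how an archived index becomes queryable, the Python helper below builds a request for the snapshot mount API (`POST /_snapshot/<repository>/<snapshot>/_mount`). The repository, snapshot, and index names are hypothetical placeholders, and the helper only assembles the method, path, and body; sending the request is left to whichever HTTP client is in use.

```python
from urllib.parse import quote, urlencode


def mount_request(repository, snapshot, index, renamed_index=None,
                  storage="full_copy"):
    """Return (method, path, body) for mounting a snapshot as a searchable index."""
    if storage not in ("full_copy", "shared_cache"):
        raise ValueError("storage must be 'full_copy' or 'shared_cache'")
    path = "/_snapshot/{}/{}/_mount?{}".format(
        quote(repository, safe=""),
        quote(snapshot, safe=""),
        urlencode({"storage": storage, "wait_for_completion": "true"}),
    )
    body = {"index": index}
    if renamed_index is not None:
        # Mount under a different name so a live index with the same name is untouched.
        body["renamed_index"] = renamed_index
    return "POST", path, body


# Hypothetical names, for illustration only.
method, path, body = mount_request(
    "my_repository", "my_snapshot", "logs-2023",
    renamed_index="logs-2023-mounted", storage="shared_cache",
)
print(method, path)
print(body)
```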
Security : The free tier now includes built‑in authentication, enabled by default since 8.0, without third‑party plugins.
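Because ES8 clusters expect authenticated requests, every client call has to carry credentials. A minimal sketch of building the `Authorization` header for the two common schemes, HTTP Basic and API key, follows; the helper names and the credential values are placeholders, not values from the XSearch platform.

```python
import base64


def _b64(text):
    """Base64-encode a UTF-8 string and return the result as ASCII text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def basic_auth_header(username, password):
    """HTTP Basic scheme: base64 of 'username:password'."""
    return {"Authorization": "Basic " + _b64(f"{username}:{password}")}


def api_key_header(key_id, api_key):
    """Elasticsearch ApiKey scheme: base64 of 'id:api_key'."""
    return {"Authorization": "ApiKey " + _b64(f"{key_id}:{api_key}")}


# Placeholder credentials, for illustration only.
basic = basic_auth_header("search_user", "example-password")
keyed = api_key_header("example-id", "example-secret")
print(basic)
print(keyed)
```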
Application Scenarios : AIGC knowledge‑base services, insurance core policy search, and auto‑insurance core search have already migrated to ES8, leveraging vector KNN, multi‑path recall, and hybrid ranking.
Future Plans : Company‑wide rollout aims to replace dozens of ES5/2 clusters, saving ~30% of ECS resources annually, and to upgrade the XSearch platform to expose ES8 features such as TSDS, snapshots, OLAP, ML/NLP models, and security to all business lines.
ZhongAn Tech Team
China's first online insurer. Through tech innovation we make insurance simpler, warmer, and more valuable. Powered by technology, we support 50 billion RMB of policies and serve 600 million users with smart, personalized solutions. ZhongAn's hardcore tech and article shares are here.