
Elasticsearch Performance Tuning: Configuration Settings to Boost Write Throughput

This article details practical Elasticsearch tuning steps (adjusting index buffers, thread pools, refresh intervals, and translog settings) that raised average write speed from 3,000 to 8,000 documents per second and kept the cluster stable under load.

Top Architect

Background

The original setup runs Elasticsearch 5.6.0 on three Alibaba Cloud ECS nodes (16 GB RAM, 4 CPUs, HDD). Before optimization, write throughput averaged 3,000 docs/s and dropped sharply under stress testing, with long GC pauses and OOM errors.

Production Configuration

Key configuration changes added to elasticsearch.yml:

indices.memory.index_buffer_size: 20%
indices.memory.min_index_buffer_size: 96mb

# Search pool
thread_pool.search.size: 5
thread_pool.search.queue_size: 100
# Bulk pool
thread_pool.bulk.size: 16
thread_pool.bulk.queue_size: 300
# Index pool
thread_pool.index.size: 16
thread_pool.index.queue_size: 300

indices.fielddata.cache.size: 40%

discovery.zen.fd.ping_timeout: 120s
discovery.zen.fd.ping_retries: 6
discovery.zen.fd.ping_interval: 30s

Template for log indices:

PUT /_template/elk
{
  "order": 6,
  "template": "logstash-*",
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 6,
    "refresh_interval": "30s",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "30s"
  }
}
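For clusters managed from scripts, the same template can be sent programmatically. The following is a minimal sketch using only the Python standard library; the helper name build_template_request and the host http://your-es-host:9200 are placeholders introduced here, not part of the original setup.

```python
import json
import urllib.request

# Settings copied from the template above.
TEMPLATE_BODY = {
    "order": 6,
    "template": "logstash-*",
    "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 6,
        "refresh_interval": "30s",
        "index.translog.durability": "async",
        "index.translog.sync_interval": "30s",
    },
}


def build_template_request(base_url, name, body):
    """Build (but do not send) the PUT /_template/<name> request."""
    return urllib.request.Request(
        url=f"{base_url}/_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder host; replace with a real node address before sending.
    req = build_template_request("http://your-es-host:9200", "elk", TEMPLATE_BODY)
    print(req.get_method(), req.full_url)
    # To actually apply the template: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the payload before it touches a production cluster.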

Optimization Parameter Details

Disable analysis for non-text fields: map them as keyword (the 5.x replacement for not_analyzed strings) to avoid unnecessary tokenization.

Disable the _all field: it is not needed for log/APM data.
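As an illustration of the two mapping changes above, the sketch below builds an Elasticsearch 5.x index template body in which exact-match fields use the keyword type (stored without tokenization) and _all is disabled. The field names status, client_ip, and message are hypothetical; the article does not list its actual fields.

```python
import json

# Hypothetical field names; substitute the fields of your own log documents.
MAPPING_TEMPLATE = {
    "template": "logstash-*",
    "mappings": {
        "_default_": {
            # Skip building the catch-all _all field.
            "_all": {"enabled": False},
            "properties": {
                # keyword fields are indexed as a single term, not analyzed.
                "status": {"type": "keyword"},
                "client_ip": {"type": "keyword"},
                # Free text that should stay searchable remains analyzed.
                "message": {"type": "text"},
            },
        }
    },
}


def keyword_fields(template):
    """Return the names of fields mapped as keyword in the default mapping."""
    props = template["mappings"]["_default_"]["properties"]
    return sorted(name for name, spec in props.items() if spec["type"] == "keyword")


if __name__ == "__main__":
    print(json.dumps(MAPPING_TEMPLATE, indent=2))
    print(keyword_fields(MAPPING_TEMPLATE))
```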

Set replica count to 0: logs are retained for 7 days and the full data lives in Hadoop, so replicas can be omitted.

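Use Elasticsearch-generated IDs: when the client does not supply an ID, Elasticsearch can skip the lookup for an existing version of the document on each write.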
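In practice, using generated IDs means leaving _id out of each bulk action line. A minimal sketch of building such a bulk body follows; the function name, the index name, and the doc type "doc" are assumptions made for illustration.

```python
import json


def build_bulk_body(docs, index, doc_type="doc"):
    """Build an NDJSON _bulk body whose actions carry no _id,
    so Elasticsearch assigns one to each document itself."""
    lines = []
    for doc in docs:
        # Action line: target index and type only, deliberately no "_id".
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = [{"message": "GET /index.html"}, {"message": "GET /favicon.ico"}]
    print(build_bulk_body(sample, "logstash-2018.03.20"), end="")
```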

Increase index.refresh_interval to 30 s (the default is 1 s): this lowers refresh overhead when real-time search visibility isn't required.

Limit segment merge threads to avoid heavy I/O on mechanical disks:

curl -XPUT 'http://your-es-host:9200/nginx_log-2018-03-20/_settings' -H 'Content-Type: application/json' -d '{
  "index.merge.scheduler.max_thread_count" : 1
}'

With this setting, up to max_thread_count + 2 threads (here, 3) can operate on the disk at once, which keeps some concurrency without saturating a mechanical disk.
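To put the value 1 in context, the sketch below compares it with the documented default for this setting, max(1, min(4, processors / 2)), on the 4-CPU nodes described earlier. The "+ 2" rule is taken from the text above; the function names are made up for this example.

```python
def default_merge_threads(cpus: int) -> int:
    """Documented default for index.merge.scheduler.max_thread_count."""
    return max(1, min(4, cpus // 2))


def disk_threads(max_thread_count: int) -> int:
    """Threads that may hit the disk at once, per the rule quoted in the text."""
    return max_thread_count + 2


if __name__ == "__main__":
    cpus = 4  # node size from the Background section
    default = default_merge_threads(cpus)
    print("default max_thread_count:", default, "-> disk threads:", disk_threads(default))
    print("tuned max_thread_count: 1 -> disk threads:", disk_threads(1))
```

On a 4-CPU node the default works out to 2 merge threads, so lowering it to 1 reduces the number of threads competing for the spinning disk from 4 to 3.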

Asynchronous translog (index.translog.durability: async and index.translog.sync_interval: 30s): the translog is fsynced every 30 s instead of on every request, so a crash can lose the most recent writes. This is acceptable here because the logs are backed up in Hadoop.
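The size of that risk can be estimated with simple arithmetic. The sketch below is a rough upper bound only: it assumes the cluster-wide 8,000 docs/s figure from this article and that none of the writes in the current sync window have been fsynced when the crash happens.

```python
def max_docs_at_risk(docs_per_second: int, sync_interval_s: int) -> int:
    """Upper bound on documents written since the last translog fsync."""
    return docs_per_second * sync_interval_s


if __name__ == "__main__":
    # 8,000 docs/s (tuned throughput) and a 30 s sync interval, both from the article.
    print(max_docs_at_risk(8000, 30))  # prints 240000
```

A single node failure would affect only the shards on that node, so the real loss is normally well below this bound.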

Increase memory buffers: raise indices.memory.index_buffer_size to 20% (default 10%) and indices.memory.min_index_buffer_size to 96 MB (default 48 MB), so that in-memory segments are flushed to disk less often.
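The two settings interact: the percentage is taken from the JVM heap, and the minimum acts as a floor when the percentage yields less. The sketch below shows that calculation; the 8 GB heap is an assumption (the article states only 16 GB of RAM per node), and the function name is invented for this example.

```python
def effective_index_buffer_mb(heap_mb, pct=20.0, min_mb=96.0, max_mb=None):
    """Indexing buffer shared by all shards on a node, in MB.

    pct    -> indices.memory.index_buffer_size (as a percentage of heap)
    min_mb -> indices.memory.min_index_buffer_size
    max_mb -> indices.memory.max_index_buffer_size (unbounded when None)
    """
    size = heap_mb * pct / 100.0
    size = max(size, min_mb)       # never below the configured floor
    if max_mb is not None:
        size = min(size, max_mb)   # never above the configured ceiling
    return size


if __name__ == "__main__":
    heap_mb = 8 * 1024  # assumed heap size, not stated in the article
    print(effective_index_buffer_mb(heap_mb))             # tuned: 20%, 96 MB floor
    print(effective_index_buffer_mb(heap_mb, 10.0, 48.0))  # defaults: 10%, 48 MB floor
```

With a heap of this size the percentage dominates, and the 96 MB floor matters only on nodes with very small heaps.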

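Cap the fielddata cache (indices.fielddata.cache.size): a value around 15% is enough for log workloads where aggregations are rare, while aggregation-heavy queries need more room (the production configuration above allows up to 40%).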

Extend the node fault-detection timeouts (discovery.zen.fd.*) so that nodes slowed by heavy network traffic during bulk ingestion are not dropped from the cluster.
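The trade-off of longer timeouts is slower detection of a node that has genuinely hung. The sketch below uses a simplified model, assuming a node is declared failed after ping_retries consecutive pings each time out after ping_timeout; it ignores ping_interval and does not cover closed connections, which are noticed without waiting for timeouts.

```python
def approx_detection_seconds(ping_timeout_s: int, ping_retries: int) -> int:
    """Approximate time before an unresponsive node is considered failed."""
    return ping_timeout_s * ping_retries


if __name__ == "__main__":
    # Defaults: ping_timeout 30s, ping_retries 3.
    print("default:", approx_detection_seconds(30, 3), "s")
    # Values from the production configuration: 120s and 6.
    print("tuned:  ", approx_detection_seconds(120, 6), "s")
```

Under this model the tuned values tolerate about 12 minutes of unresponsiveness instead of about 90 seconds, which suits a log cluster that prefers stability over fast failover.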

Conclusion

Together, the settings above raise average write speed from 3,000 to 8,000 docs/s and allow the cluster to recover within 30 minutes after a stress test, with all metrics returning to normal.
