Understanding Elasticsearch Architecture: Segments, Translog, Refresh, Shard Allocation and Cluster Operations

This article provides a comprehensive overview of Elasticsearch's internal architecture, explaining how data flows from memory buffers to Lucene segments, the role of refresh and translog for durability, segment merging strategies, shard routing, replica consistency, allocation controls, hot‑cold data separation, and cluster discovery settings.

Architect

Elasticsearch Architecture Overview

This guide explains how data is written to Elasticsearch, covering the flow from buffer to segment creation, the role of the translog, refresh mechanisms, segment merging, shard routing, replica consistency, and cluster allocation settings.

Data Flow and Segments

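Incoming documents are first written to an in-memory indexing buffer. The buffer is periodically written out as a new Lucene segment, and a commit point (a file listing all known segments) records which segments belong to the index. A new segment becomes searchable as soon as it reaches the filesystem cache, before it is fsynced to disk, which is what makes search near-real-time.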

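Writing the buffer out as a searchable segment is called a refresh. By default it runs automatically once per second (index.refresh_interval), and it can also be triggered manually through the /_refresh API. On a write request, the ?refresh=wait_for parameter holds the response until a refresh has made the change visible to search.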
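The visibility rule can be pictured with a small toy model. This is a sketch for illustration only, not Elasticsearch or Lucene code; the ToyShard class and its method names are hypothetical.

```python
class ToyShard:
    """Toy model of near-real-time search (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self.buffer = []    # indexed documents that are not yet searchable
        self.segments = []  # immutable segments; each refresh adds at most one

    def index(self, doc):
        # New documents land in the in-memory buffer first.
        self.buffer.append(doc)

    def refresh(self):
        # A refresh freezes the current buffer into a new searchable segment.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def search(self, term):
        # Only documents that already sit in a segment are visible.
        return [doc for seg in self.segments for doc in seg if term in doc]


shard = ToyShard()
shard.index("hello world")
before = shard.search("hello")  # empty: the document is still in the buffer
shard.refresh()
after = shard.search("hello")   # the document is now in a segment
```

In this model, waiting for the next refresh before answering the client is roughly what ?refresh=wait_for asks for on a real write request.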

Translog and Durability

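The translog records each operation before it is committed to a segment, so operations that were not yet committed can be replayed after a crash. A flush performs a Lucene commit and clears the translog. Flush behavior can be tuned with index.translog.flush_threshold_size; older releases also accepted index.translog.flush_threshold_period and index.translog.flush_threshold_ops, which are no longer available in recent versions. Setting index.translog.durability to "async" fsyncs the translog in the background on an interval instead of on every request, which improves write throughput at the risk of losing the most recent operations if a node crashes.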
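The interplay of translog, flush, and crash recovery can be sketched as follows. This is a simplified toy model under stated assumptions: ToyTranslogShard and its fields are hypothetical names, and a plain dict stands in for committed segments.

```python
class ToyTranslogShard:
    """Toy write-ahead log model (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self.translog = []   # operations logged since the last flush
        self.committed = {}  # state made durable by the last flush
        self.memory = {}     # in-memory state, lost when the process crashes

    def index(self, doc_id, doc):
        # Log the operation first, then apply it in memory.
        self.translog.append((doc_id, doc))
        self.memory[doc_id] = doc

    def flush(self):
        # A flush commits the in-memory state and clears the translog.
        self.committed.update(self.memory)
        self.translog.clear()

    def crash_and_recover(self):
        # After a crash only committed state survives; replaying the
        # translog restores the operations made since the last flush.
        self.memory = dict(self.committed)
        for doc_id, doc in self.translog:
            self.memory[doc_id] = doc


shard = ToyTranslogShard()
shard.index("1", "first")
shard.flush()
shard.index("2", "second")  # logged, but not yet committed
shard.crash_and_recover()
```

The model assumes every translog append is durable, which corresponds to the per-request fsync setting; with "async" durability the tail of the log could be missing after a crash.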

Segment Merging

Background merge threads combine small segments into larger ones, which reduces the number of open file handles and the per-segment I/O overhead of each search. Merge throttling and policy are configurable through indices.store.throttle.max_bytes_per_sec (in releases that support store throttling), index.merge.policy.floor_segment, index.merge.policy.max_merge_at_once, and index.merge.policy.max_merged_segment. The number of merge threads is set with index.merge.scheduler.max_thread_count, whose default is computed as Math.min(3, Runtime.getRuntime().availableProcessors() / 2).
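To show how max_merge_at_once and max_merged_segment constrain a merge, here is a deliberately simplified selection routine. It is not Lucene's actual merge policy: it ignores deleted documents and segment tiers, assumes the merged size is the plain sum of its inputs, and the function names and default limits are illustrative assumptions.

```python
def pick_merge(segment_sizes_mb, max_merge_at_once=10, max_merged_segment_mb=5120):
    """Pick the smallest segments to merge, within a count and size cap."""
    chosen, total = [], 0
    for size in sorted(segment_sizes_mb):
        # Stop once either limit would be exceeded.
        if len(chosen) == max_merge_at_once or total + size > max_merged_segment_mb:
            break
        chosen.append(size)
        total += size
    # Merging a single segment with nothing else is pointless.
    return chosen if len(chosen) > 1 else []


def merge_once(segment_sizes_mb, **limits):
    """Return the segment sizes after one merge round (unchanged if no merge)."""
    chosen = pick_merge(segment_sizes_mb, **limits)
    if not chosen:
        return list(segment_sizes_mb)
    remaining = list(segment_sizes_mb)
    for size in chosen:
        remaining.remove(size)
    # Assumption: the merged segment is as large as the sum of its inputs.
    return remaining + [sum(chosen)]
```

For example, merge_once([1, 2, 3, 4000, 2000], max_merge_at_once=3) folds the three small segments into one of size 6 and leaves the two large ones alone.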

Shard Routing and Replicas

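A document is routed to a shard with shard = hash(routing) % number_of_primary_shards, where routing defaults to the document's _id. Because the result depends on number_of_primary_shards, that number is normally fixed when the index is created. Write acknowledgment is controlled with the wait_for_active_shards and timeout parameters: wait_for_active_shards takes a number of shard copies or all, and timeout bounds how long the request waits for those copies to become active. Older releases expressed the same idea through a consistency parameter with the values one, all, and quorum (a calculated majority of shard copies).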
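The routing formula and the majority calculation are easy to try out. In this sketch zlib.crc32 is only a stand-in for the hash function (Elasticsearch uses its own hash), so the shard numbers it produces will not match a real cluster; the function names are hypothetical.

```python
import zlib


def route_to_shard(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards, with a stand-in hash."""
    # Assumption: crc32 replaces the real hash purely for illustration.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def majority_of_copies(number_of_replicas):
    """Majority of shard copies: int((primary + number_of_replicas) / 2) + 1."""
    return (1 + number_of_replicas) // 2 + 1


shard_for_doc = route_to_shard("user-42", 5)        # routing value defaults to _id
same_doc_more_shards = route_to_shard("user-42", 6)  # may differ from shard_for_doc
```

The same routing value always maps to the same shard for a given shard count, while changing the count can move it elsewhere, which illustrates why the number of primary shards is not changed casually. A computed majority such as majority_of_copies(2) could be passed as the integer value of wait_for_active_shards.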

Allocation Controls

Cluster-wide allocation can be limited with settings such as cluster.routing.allocation.enable, disk watermarks (cluster.routing.allocation.disk.watermark.low, cluster.routing.allocation.disk.watermark.high), and per-node limits (cluster.routing.allocation.node_concurrent_recoveries, indices.recovery.max_bytes_per_sec, etc.). Manual control is possible via the /_cluster/reroute API (allocate, move, cancel, etc.) and the allocation-explain API.
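As an illustration of manual control, the following sketch builds the JSON body for a move command sent to /_cluster/reroute. It only constructs the request body and does not contact a cluster; the helper names, index name, and node names are made up for the example.

```python
import json


def move_command(index, shard, from_node, to_node):
    """Build one 'move' command for the reroute API."""
    return {
        "move": {
            "index": index,
            "shard": shard,
            "from_node": from_node,
            "to_node": to_node,
        }
    }


def reroute_body(*commands):
    """Wrap one or more commands in the body expected by POST /_cluster/reroute."""
    return json.dumps({"commands": list(commands)})


# Hypothetical index and node names, for illustration only.
body = reroute_body(move_command("logs-2017.01.01", 0, "node-a", "node-b"))
```

The resulting string would be sent as the request body with any HTTP client; several commands can be combined in a single call.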

Hot‑Cold Data Separation

Using node tags (hot, stale) and index routing filters, hot indices stay on a small set of nodes for fast writes while older data is moved to cold nodes for read-heavy workloads. Example template snippet:

{
  "order": 0,
  "template": "*",
  "settings": {
    "index.routing.allocation.require.tag": "hot"
  }
}
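The effect of the require filter can be sketched in a few lines. This toy function only mimics the matching rule (an index may be placed on nodes whose tag equals the required value); the node names are invented, and retag_settings merely builds the settings body one might send to move an index to another tier.

```python
def eligible_nodes(node_tags, required_tag):
    """Return the nodes whose tag matches index.routing.allocation.require.tag."""
    return sorted(name for name, tag in node_tags.items() if tag == required_tag)


def retag_settings(tag):
    """Settings body that changes the tag an existing index requires."""
    return {"index.routing.allocation.require.tag": tag}


# Hypothetical cluster layout: two hot nodes and two nodes tagged stale.
nodes = {"node-1": "hot", "node-2": "hot", "node-3": "stale", "node-4": "stale"}

new_index_targets = eligible_nodes(nodes, "hot")
aged_index_targets = eligible_nodes(nodes, "stale")
```

Under this model, applying retag_settings("stale") to an older index is what would make its shards relocate away from the hot nodes.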

Discovery and Cluster Settings

Since Elasticsearch 2.0, unicast discovery is the default. Settings such as discovery.zen.ping.unicast.hosts, discovery.zen.fd.ping_timeout, discovery.zen.ping_timeout, and network.host control cluster formation and fault detection. The discovery.zen.* settings apply to Zen discovery, which was replaced in Elasticsearch 7.0 (discovery.seed_hosts, cluster.initial_master_nodes). Example configuration:

network.host: "192.168.0.2"
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping_timeout: 100s
discovery.zen.fd.ping_timeout: 100s
discovery.zen.ping.unicast.hosts: ["10.19.0.97","10.19.0.98","10.19.0.99","10.19.0.100"]
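The value of discovery.zen.minimum_master_nodes is usually derived from the majority rule (master-eligible nodes / 2) + 1. A minimal sketch, assuming all four hosts in the example configuration are master-eligible (the function name is hypothetical):

```python
def minimum_master_nodes(master_eligible_nodes):
    """Majority of master-eligible nodes: (n / 2) + 1 with integer division."""
    return master_eligible_nodes // 2 + 1


# Assumption: the four unicast hosts in the example are all master-eligible.
quorum_for_example = minimum_master_nodes(4)
```

With four master-eligible nodes this yields 3, which matches the value in the example configuration and prevents two halves of a partitioned cluster from each electing a master.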
