
Understanding Elasticsearch Architecture: Segments, Translog, Refresh, Shard Allocation and Cluster Operations

This article provides a comprehensive overview of Elasticsearch's internal architecture, explaining how data flows from memory buffers to Lucene segments, the role of refresh and translog for durability, segment merging strategies, shard routing, replica consistency, allocation controls, hot‑cold data separation, and cluster discovery settings.

Architect

May 15, 2020

Elasticsearch Architecture Overview

This guide explains how data is written to Elasticsearch, covering the flow from buffer to segment creation, the role of the translog, refresh mechanisms, segment merging, shard routing, replica consistency, and cluster allocation settings.

Data Flow and Segments

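Incoming documents are first written to an in-memory buffer. A refresh writes the buffer out as a new Lucene segment in the filesystem cache, which makes the documents searchable without waiting for an fsync; this is what enables near-real-time search. A later flush fsyncs the segments to disk and writes a commit file that lists all known segments.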

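By default a refresh runs automatically every second (controlled by index.refresh_interval). It can also be triggered manually through the /_refresh API, and a write request can pass ?refresh=wait_for to block until a refresh has made the change visible to search.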
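The effect of a refresh on visibility can be illustrated with a toy model. This is a minimal sketch, not Elasticsearch's implementation: the ToyShard class and its methods are hypothetical names, and a "segment" is just an immutable tuple of strings.

```python
class ToyShard:
    """Toy model of near-real-time search; not Elasticsearch's real code."""

    def __init__(self):
        self.buffer = []    # documents indexed since the last refresh
        self.segments = []  # immutable, searchable segments

    def index(self, doc):
        # New documents land in the in-memory buffer and are not yet searchable.
        self.buffer.append(doc)

    def refresh(self):
        # A refresh turns the current buffer into a new searchable segment.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def search(self, term):
        # Only documents that already live in a segment are visible.
        return [doc for seg in self.segments for doc in seg if term in doc]


shard = ToyShard()
shard.index("hello elasticsearch")
before = shard.search("hello")  # document is still only in the buffer
shard.refresh()
after = shard.search("hello")   # document is now in a segment
print(before, after)
```

In this model a request using ?refresh=wait_for would correspond to waiting until the next refresh() call has run before returning.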

Translog and Durability

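The translog records each operation before it has been committed to a segment, so that it can be replayed after a crash. A flush performs a Lucene commit and then clears the translog. A flush is triggered when the translog reaches index.translog.flush_threshold_size; older releases additionally supported index.translog.flush_threshold_period and index.translog.flush_threshold_ops. The index.translog.durability setting can be changed from the default "request" to "async", which syncs the translog in the background instead of on every request; this improves write performance at the risk of losing the most recent operations if the node crashes.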
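The interplay between translog, flush, and crash recovery can be sketched as a small simulation. The class below is a hypothetical toy (ToyTranslogShard is not a real API), and it assumes the translog itself always survives the crash, which corresponds to the synchronous durability mode.

```python
class ToyTranslogShard:
    """Toy write-ahead-log model; names and structure are illustrative only."""

    def __init__(self):
        self.translog = []   # operation log, assumed to survive a crash
        self.buffer = []     # in-memory state, lost on a crash
        self.committed = []  # documents in committed segments, survive a crash

    def index(self, doc):
        self.translog.append(doc)  # record the operation first
        self.buffer.append(doc)    # then apply it in memory

    def flush(self):
        # Commit the buffered documents, then clear the translog.
        self.committed.extend(self.buffer)
        self.buffer.clear()
        self.translog.clear()

    def crash_and_recover(self):
        # Memory is gone; rebuild it by replaying the translog
        # on top of the last commit.
        self.buffer = list(self.translog)

    def all_docs(self):
        return self.committed + self.buffer


shard = ToyTranslogShard()
shard.index("a")
shard.flush()              # "a" is committed, translog is now empty
shard.index("b")           # "b" exists only in memory and in the translog
shard.crash_and_recover()
recovered = shard.all_docs()
print(recovered)
```

Without the translog, document "b" in this sketch would be lost, because it had not yet been part of a commit when the crash happened.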

Segment Merging

Background merge threads combine small segments into larger ones, which reduces the number of open file handles and the I/O overhead of having many segments. Merge throttling is configured with indices.store.throttle.max_bytes_per_sec, and the merge policy with index.merge.policy.floor_segment, index.merge.policy.max_merge_at_once, and index.merge.policy.max_merged_segment. The number of merge threads is set with index.merge.scheduler.max_thread_count, whose default is computed as Math.min(3, Runtime.getRuntime().availableProcessors() / 2).
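To make the thread-count formula concrete, here is a small sketch that evaluates it for a few core counts. The function name merge_thread_count is hypothetical; the arithmetic mirrors the Java expression quoted above, using integer division as Java does for int operands.

```python
def merge_thread_count(available_processors: int) -> int:
    # Mirrors Math.min(3, availableProcessors / 2) with Java-style
    # integer division.
    return min(3, available_processors // 2)


for cpus in (2, 4, 8, 16):
    print(cpus, "cores ->", merge_thread_count(cpus), "merge threads")
```

The value is capped at 3 no matter how many cores are available. For a single-core machine the expression as written evaluates to 0, so a real deployment would presumably need a lower bound of 1.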

Shard Routing and Replicas

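Documents are routed to a primary shard with shard = hash(routing) % number_of_primary_shards, where routing defaults to the document _id. How many shard copies must be active before a write proceeds is controlled with wait_for_active_shards (a number or all), combined with timeout to bound how long the request waits. Older releases expressed the same idea through the consistency parameter, with the values one, all, and quorum, where quorum is a calculated majority: int((primary + number_of_replicas) / 2) + 1.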
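The routing formula and the majority calculation can be sketched in a few lines. This is an illustration under stated assumptions: zlib.crc32 is only a stand-in for the hash function Elasticsearch actually uses, and shard_for and quorum are hypothetical helper names.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    # shard = hash(routing) % number_of_primary_shards
    # zlib.crc32 is a stand-in hash, chosen only because it is deterministic.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def quorum(number_of_replicas: int) -> int:
    # Majority of all copies: one primary plus its replicas.
    return (1 + number_of_replicas) // 2 + 1


print(shard_for("user-42", 5))
print(quorum(1), quorum(2), quorum(4))
```

Because the result depends on number_of_primary_shards, changing that number would send existing routing values to different shards, which is why the primary shard count is fixed when an index is created.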

Allocation Controls

Allocation can be restricted cluster-wide with cluster.routing.allocation.enable, by disk usage with the watermarks cluster.routing.allocation.disk.watermark.low and cluster.routing.allocation.disk.watermark.high, and by recovery limits such as cluster.routing.allocation.node_concurrent_recoveries and indices.recovery.max_bytes_per_sec. Shards can also be placed manually with the /_cluster/reroute API (allocate, move, cancel, etc.), and the allocation-explain API reports why a shard is or is not assigned.
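As a rough illustration, the request bodies for two of these controls can be built as plain dictionaries. The helper names, the index name, and the node names below are hypothetical; the bodies are meant for PUT /_cluster/settings and POST /_cluster/reroute respectively, and no HTTP call is made here.

```python
import json

ALLOWED_MODES = {"all", "primaries", "new_primaries", "none"}


def allocation_enable_body(mode: str) -> dict:
    # Body for PUT /_cluster/settings that switches shard allocation on or off.
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported allocation mode: {mode}")
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def move_shard_body(index: str, shard: int, from_node: str, to_node: str) -> dict:
    # Body for POST /_cluster/reroute with a single "move" command.
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


disable = allocation_enable_body("none")
move = move_shard_body("logs-2020.05.15", 0, "node-1", "node-2")
print(json.dumps(disable))
print(json.dumps(move))
```

A typical use of the first body is to set allocation to none before restarting a node and back to all afterwards, so that shards are not shuffled around during the restart.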

Hot‑Cold Data Separation

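Nodes are labelled with a custom tag attribute (here with the values hot and stale), and index-level routing filters pin each index to one group of nodes. New indices are kept on a small set of hot nodes for fast writes, while older data is moved to the cold nodes (tagged stale), which serve read-heavy workloads. An example template that places every new index on the hot nodes: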

{
  "order": 0,
  "template": "*",
  "settings": {
    "index.routing.allocation.require.tag": "hot"
  }
}
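Moving aged data to the cold nodes then amounts to changing the same setting on older indices. The sketch below assumes daily indices named like logs-YYYY.MM.DD, which is a hypothetical naming scheme, and only builds the body for PUT /<index>/_settings; the function names are illustrative as well.

```python
from datetime import date, datetime


def indices_to_cool(index_names, today: date, max_age_days: int):
    # Pick indices whose date suffix is older than max_age_days.
    old = []
    for name in index_names:
        suffix = name.rsplit("-", 1)[1]
        day = datetime.strptime(suffix, "%Y.%m.%d").date()
        if (today - day).days > max_age_days:
            old.append(name)
    return old


def relocate_body(tag_value: str) -> dict:
    # Settings body that re-pins an index to nodes carrying the given tag.
    return {"index.routing.allocation.require.tag": tag_value}


names = ["logs-2020.05.01", "logs-2020.05.14"]
cooling = indices_to_cool(names, date(2020, 5, 15), max_age_days=7)
body = relocate_body("stale")
print(cooling, body)
```

Once the setting changes, the cluster relocates the shards of those indices to nodes whose tag matches the new value.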

Discovery and Cluster Settings

Since Elasticsearch 2.0, unicast discovery is the default. Cluster formation and fault detection are controlled by network.host together with the Zen discovery settings discovery.zen.ping.unicast.hosts (the hosts to contact when joining), discovery.zen.ping_timeout, discovery.zen.fd.ping_timeout, and discovery.zen.minimum_master_nodes. These discovery.zen.* settings apply to releases before 7.0. Example configuration:

network.host: "192.168.0.2"
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping_timeout: 100s
discovery.zen.fd.ping_timeout: 100s
discovery.zen.ping.unicast.hosts: ["10.19.0.97","10.19.0.98","10.19.0.99","10.19.0.100"]
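The value of 3 for discovery.zen.minimum_master_nodes matches the usual quorum rule of (master-eligible nodes / 2) + 1, assuming all four listed hosts are master-eligible (the source does not say so explicitly). A minimal sketch of that calculation, with a hypothetical function name:

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    # Quorum rule: a strict majority of master-eligible nodes,
    # which helps avoid split-brain when the network partitions.
    return master_eligible_nodes // 2 + 1


for nodes in (3, 4, 5):
    print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

With four master-eligible nodes the quorum is 3, so the cluster tolerates the loss of only one of them, the same as a three-node setup.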
