
Elasticsearch Cluster Architecture and Distributed System Design

This article explains the architecture of Elasticsearch clusters, covering node roles; index, shard, and replica concepts; deployment models; and data storage mechanisms. It then compares two distributed system designs, one based on local file systems and one on a shared file system, and highlights their advantages and trade-offs.

Sohu Tech Products

Elasticsearch Cluster Architecture

Elasticsearch is a widely used open‑source search and analytics system that excels in three main areas: search, JSON document storage, and time‑series data analysis.

Node: a single running Elasticsearch instance, typically one process on a machine.

Index: a logical collection of documents, including mappings and inverted/forward index files, which may be distributed across one or many machines.

Shard: a partition of an index, managed by a node; each shard is either a primary or a replica.

Replica: a copy of a shard that provides fault tolerance and read scalability.
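In terms of the concepts above, shard and replica counts are set per index. A minimal settings body for a create-index request against the REST API might look as follows (the counts are illustrative; `number_of_shards` is fixed at creation time, while `number_of_replicas` can be changed later):

```json
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}
```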

When a document is indexed, it is routed to a primary shard, indexed there, and then replicated to that shard's replicas before the operation is acknowledged.
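The write path above relies on deterministic routing: a hash of the routing value (by default the document `_id`) modulo the number of primary shards. A minimal sketch follows; Elasticsearch itself uses a murmur3 hash, and a SHA-1-based stand-in is used here purely for illustration:

```python
import hashlib

def shard_for(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a primary shard.
    Elasticsearch uses murmur3; SHA-1 stands in here for illustration."""
    digest = hashlib.sha1(routing_value.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")
    return h % num_primary_shards

# The same document id always lands on the same shard, which is why
# the number of primary shards cannot change after index creation.
print(shard_for("doc-42", 5) == shard_for("doc-42", 5))  # True
```

Changing `num_primary_shards` would remap existing documents to different shards, which is exactly why resizing requires reindexing.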

If a primary shard fails, one of its replicas is promoted to primary; the lost copy is then rebuilt by copying data from a surviving copy to another node, a process that may temporarily reduce service capacity.
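The recovery step can be sketched as a toy cluster-state transition. The node names and the `dict`-based model below are illustrative, not Elasticsearch internals:

```python
def recover_shard(allocation: dict, failed_node: str) -> dict:
    """Toy model: allocation maps node -> shard copy role ('primary' or
    'replica'). On node failure, promote a surviving replica if the
    primary was lost, then rebuild the missing copy elsewhere."""
    surviving = {n: r for n, r in allocation.items() if n != failed_node}
    if allocation[failed_node] == "primary":
        # Promote the first surviving replica to primary.
        promoted = next(n for n, r in surviving.items() if r == "replica")
        surviving[promoted] = "primary"
    # Rebuilding the lost copy means a full data copy in practice.
    spare = next(n for n in ("node-a", "node-b", "node-c")
                 if n not in surviving and n != failed_node)
    surviving[spare] = "replica"
    return surviving

state = {"node-a": "primary", "node-b": "replica"}
print(recover_shard(state, "node-a"))  # {'node-b': 'primary', 'node-c': 'replica'}
```

The promotion itself is cheap; the expensive part in a local-file-system design is the data copy implied by the last step.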

Role Deployment Models

Two common deployment approaches are described:

1. Mixed Deployment

All node roles (master, data, and coordinating/transport) run in a single process.

Requests are randomly routed to any node, which uses a global routing table to forward them to the appropriate data nodes.

Advantages: simple setup, easy to start with a single node.

Disadvantages: resource contention in large clusters, higher connection count (each node maintains connections to all others), and no hot‑update support.

2. Tiered Deployment

Coordinating (transport) nodes handle request forwarding and result merging, while data nodes focus solely on data processing.

Roles are isolated, reducing interference and allowing independent scaling.

Advantages: better resource isolation, easier to avoid hot spots, supports hot updates by upgrading data nodes first.

Disadvantages: more complex configuration and planning of node counts.
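Role separation is configured per node. A minimal sketch of the relevant `elasticsearch.yml` settings (available since Elasticsearch 7.9; the comments map each node type to the tiers described above):

```yaml
# Dedicated master-eligible node
node.roles: [ master ]

# Dedicated data node
node.roles: [ data ]

# Coordinating-only node (the "transport" tier above):
# an empty role list leaves only request routing and result merging.
node.roles: [ ]
```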

Elasticsearch Data Layer Architecture

Indexes and metadata are stored on the local file system, with several store types for loading index files (niofs, mmapfs, simplefs, and Windows-specific smb variants). The default selects automatically per platform, but memory-mapped files (mmapfs) often yield the best read performance.
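The store type is a static per-index setting; it can also be set cluster-wide in `elasticsearch.yml`. A minimal fragment (the commented line shows how to force memory-mapped files explicitly):

```yaml
# "fs" (the default) lets Elasticsearch pick a sensible store type
# for the platform, typically a hybrid of mmapfs and niofs.
index.store.type: fs

# Force memory-mapped files explicitly:
# index.store.type: mmapfs
```

Note that mmapfs relies on the operating system allowing enough memory-mapped areas, so it is usually paired with a raised `vm.max_map_count` on Linux.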

Because data resides locally, node failures can cause data loss; replicas are used to mitigate this risk.

Replica Purpose

Ensure service availability by allowing traffic to be served from remaining replicas when one fails.

Guarantee data reliability: without replicas, losing the node that holds a primary shard would lose data and force a full reindex from the source.

Increase query capacity by adding replicas, scaling read throughput proportionally.
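The read-scaling point can be made concrete with a toy load calculation; the numbers are illustrative:

```python
def per_copy_load(queries_per_sec: float, num_replicas: int) -> float:
    """Reads can be served by the primary or any replica, so query
    load spreads across (1 + num_replicas) copies of each shard."""
    return queries_per_sec / (1 + num_replicas)

# 3000 q/s against a primary alone vs. a primary plus two replicas:
print(per_copy_load(3000, 0))  # 3000.0
print(per_copy_load(3000, 2))  # 1000.0
```

This is the sense in which adding replicas scales read throughput roughly proportionally, at the cost of the extra storage and write amplification discussed below.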

Challenges

Replica copies increase storage cost and may waste compute resources when not needed.

Write performance suffers because each write must be applied on the primary and then propagated to its replicas.

Adding replicas during scaling can be slow due to full data copy.

Distributed System Designs

1. Local File‑System Based Distributed System

Each shard stores its data on local disks, with a primary and one or more replicas. When a node crashes, a replica is promoted, and a new replica is built on another node, which can require copying large amounts of data.

2. Distributed File‑System (Shared Storage) Based Distributed System

Computation and storage are separated: shards contain only compute logic and reference files on a shared distributed file system (e.g., HDFS). If a node fails, a new compute node can quickly attach to the existing shared data without massive data transfer.

Advantages: elastic scaling of compute and storage independently, finer‑grained resource management, better hotspot handling.

Disadvantage: access to shared storage may be slower than local disks, though modern user‑space protocols have narrowed this gap.
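The recovery-time difference between the two designs can be sketched as a back-of-the-envelope calculation; the shard size and copy rate below are illustrative assumptions:

```python
def recovery_seconds(shard_bytes: int, copy_bytes_per_sec: int,
                     shared_storage: bool) -> float:
    """Local-FS design: a failed shard is rebuilt by copying all its data.
    Shared-storage design: a new compute node just reattaches to files
    already on the distributed file system (modeled here as zero copy)."""
    if shared_storage:
        return 0.0  # reattach only; no bulk data transfer
    return shard_bytes / copy_bytes_per_sec

shard = 50 * 1024**3   # a 50 GiB shard
rate = 100 * 1024**2   # ~100 MiB/s effective copy rate
print(recovery_seconds(shard, rate, shared_storage=False))  # 512.0
print(recovery_seconds(shard, rate, shared_storage=True))   # 0.0
```

In reality the shared-storage path still pays for opening and warming the attached files, but it avoids the bulk transfer that dominates local-FS recovery.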

Summary

The two architectures each have strengths and weaknesses; choosing between them depends on specific workload requirements, cost considerations, and operational constraints.

Understanding these trade‑offs helps designers build more reliable and efficient distributed data systems.

Tags: distributed systems, Elasticsearch, replication, data storage, cluster architecture, deployment models
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
