Elasticsearch Basics: Concepts, Installation, and Search Operations

This article introduces Elasticsearch, a distributed open‑source search and analytics engine. It explains the core concepts and architecture, compares them with a relational database, walks through installation and configuration, and covers indexing, analyzers, the query DSL, pagination, and sorting, with practical examples for building search functionality.

Sohu Tech Products

Elasticsearch is a distributed open‑source search and analytics engine built on Lucene, designed to simplify full‑text search via a RESTful API.

Core concepts: an Elasticsearch cluster consists of nodes; data is stored in indices, which contain types (deprecated in ES7), documents, and fields. Mapping defines field types and attributes.

Comparison with MySQL: index ↔ database, type ↔ table, document ↔ row, field ↔ column, mapping ↔ column definition.

From ES 6.0 onward an index can contain only one type, so each index has a single document structure; types were deprecated in 7.0 and removed in 8.0.

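In large‑scale scenarios Elasticsearch runs as a cluster of nodes. Each index is split into primary shards distributed across the nodes, and each primary shard can have one or more replica shards on other nodes for high availability.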
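As a rough illustration of the shard and replica model, the sketch below builds the settings body that could accompany index creation. number_of_shards and number_of_replicas are real index settings; the helper names and the values 3 and 1 are hypothetical and chosen only for the example.

```python
import json


def index_settings(primary_shards: int, replicas_per_primary: int) -> dict:
    """Build the `settings` body for index creation (values are illustrative)."""
    if primary_shards < 1 or replicas_per_primary < 0:
        raise ValueError("need at least one primary shard and zero or more replicas")
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_primary,
        }
    }


def total_shard_copies(primary_shards: int, replicas_per_primary: int) -> int:
    """Count primaries plus all of their replica copies across the cluster."""
    return primary_shards * (1 + replicas_per_primary)


body = index_settings(3, 1)
print(json.dumps(body, indent=2))
print(total_shard_copies(3, 1))  # 3 primaries + 3 replicas = 6
```

number_of_shards is chosen when the index is created, while number_of_replicas can be changed later on a live index, which is why the two are kept as separate parameters here.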

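Installation steps (ES 5.3.3 example): ensure JDK 8 or later is installed, download the tarball, extract it, create a dedicated user, give that user ownership of the extracted directory, then start the node with cd bin and ./elasticsearch. Two common startup failures are running as root, which Elasticsearch refuses to do, and a vm.max_map_count value that is too low. The fixes are to create an elasticsearch user and group to run the process, and to raise vm.max_map_count to at least 262144 in /etc/sysctl.conf.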
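The two startup problems above can be checked before launching the node. The following is a minimal sketch, assuming a Linux host where the live kernel value is exposed at /proc/sys/vm/max_map_count; the function names are hypothetical and this is not part of Elasticsearch itself.

```python
import os
from pathlib import Path

# Minimum value that Elasticsearch's bootstrap check asks for.
REQUIRED_MAX_MAP_COUNT = 262144
# Assumption: Linux exposes the live kernel setting at this path.
MAX_MAP_COUNT_PATH = Path("/proc/sys/vm/max_map_count")


def max_map_count_ok(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the kernel setting is high enough."""
    return current >= required


def preflight() -> list:
    """Collect human-readable problems; an empty list means nothing was found."""
    problems = []
    # os.geteuid exists only on Unix-like systems.
    if hasattr(os, "geteuid") and os.geteuid() == 0:
        problems.append("running as root: start Elasticsearch as a dedicated user")
    try:
        current = int(MAX_MAP_COUNT_PATH.read_text().strip())
    except (OSError, ValueError):
        current = None  # not Linux, or the value is unreadable; skip this check
    if current is not None and not max_map_count_ok(current):
        problems.append(
            f"vm.max_map_count={current}: set it to at least "
            f"{REQUIRED_MAX_MAP_COUNT} in /etc/sysctl.conf"
        )
    return problems


for problem in preflight():
    print(problem)
```

After editing /etc/sysctl.conf, running sysctl -p as root reloads the file so the new value takes effect without a reboot.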

Creating an index can be done via the REST API, e.g.:

PUT /test
{
  "settings": {},
  "mappings": {
    "type1": {
      "properties": {
        "field1": {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}
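The same request can be issued from code. This is a minimal sketch using only the Python standard library; it assumes a node listening on http://localhost:9200 (the default, but an assumption here), and build_request is a hypothetical helper. The request is only constructed, not sent, so the block runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Wrap a JSON body in an HTTP request aimed at the Elasticsearch REST API."""
    return urllib.request.Request(
        url=ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


# Same body as the console example above (typed mapping, as in ES 5.x).
create_index = build_request(
    "PUT",
    "/test",
    {
        "settings": {},
        "mappings": {
            "type1": {
                "properties": {
                    "field1": {"type": "text", "analyzer": "standard"}
                }
            }
        },
    },
)

print(create_index.get_method(), create_index.full_url)
# To actually send it: urllib.request.urlopen(create_index)
```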

Alternatively, indexing a document automatically creates the index and mapping:

PUT test/type1/1
{
  "user": "kimchy",
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elasticsearch"
}

An analyzer is a pipeline of zero or more character filters, exactly one tokenizer, and zero or more token filters, applied in that order. Elasticsearch provides built‑in analyzers and supports custom ones, such as a digit analyzer defined in the index's analysis settings.
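The article's own digit analyzer definition is not reproduced here, so the following is only one plausible shape for it: a custom analyzer built on the pattern tokenizer, which splits text wherever the regular expression matches. The names digit_tokenizer and digit_analyzer are hypothetical, and the emulate function is a local approximation in Python, not Lucene's implementation.

```python
import json
import re

# Split on every run of non-digit characters, leaving only digit runs as tokens.
DIGIT_PATTERN = "[^0-9]+"

# Hypothetical analysis settings; pass them in the body of index creation.
analysis_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "digit_tokenizer": {"type": "pattern", "pattern": DIGIT_PATTERN}
            },
            "analyzer": {
                "digit_analyzer": {"type": "custom", "tokenizer": "digit_tokenizer"}
            },
        }
    }
}


def emulate_digit_analyzer(text: str) -> list:
    """Approximate the tokens locally; empty strings from the split are dropped."""
    return [token for token in re.split(DIGIT_PATTERN, text) if token]


print(json.dumps(analysis_settings, indent=2))
print(emulate_digit_analyzer("order 2023-11-15 #42"))  # ['2023', '11', '15', '42']
```

On a running cluster, the _analyze API on the index (with "analyzer" and "text" in the body) is the reliable way to see which tokens an analyzer really produces.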

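Search APIs include basic queries (term, match, range) and compound queries (bool, dis_max). A term query looks up the exact term without analyzing it, so it suits keyword‑style fields, whereas match analyzes the query text first. Example term query: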

GET /index_name/_search
{
  "query": {
    "term": {
      "user": "张三"
    }
  }
}
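Compound queries combine the basic ones. As a sketch, the helper below assembles a bool query from match, term, and range clauses; the field names come from the sample document indexed earlier, while bool_query itself and the date bounds are hypothetical.

```python
import json


def bool_query(must=None, filter_=None, should=None, must_not=None) -> dict:
    """Assemble a bool query, omitting any clause list that is empty or None."""
    clauses = {
        "must": must,          # scored, all must match
        "filter": filter_,     # not scored, all must match
        "should": should,      # scored, optional
        "must_not": must_not,  # excludes matching documents
    }
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}


search_body = bool_query(
    must=[{"match": {"message": "trying out Elasticsearch"}}],
    filter_=[
        {"term": {"user": "kimchy"}},
        {"range": {"post_date": {"gte": "2009-11-01", "lt": "2009-12-01"}}},
    ],
)

print(json.dumps(search_body, indent=2))
```

Putting the term and range clauses under filter rather than must keeps them out of relevance scoring, which is usually what exact-match and date restrictions want.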

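Pagination uses from and size; by default from + size cannot exceed 10,000 (index.max_result_window), so deep pagination should use the scroll API, passing the _scroll_id from each response to fetch the next batch. Sorting is specified via the sort array.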
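A sketch of both approaches follows. page_body converts a 1-based page number into from and size with a sort clause, and scroll_hits walks the scroll API until a batch comes back empty. The send parameter is an assumption: any callable taking (method, path, body) and returning the parsed JSON response. Here it is replaced by an in-memory stand-in so the block runs without a cluster.

```python
from typing import Callable, Iterator

MAX_RESULT_WINDOW = 10000  # default value of index.max_result_window


def page_body(page: int, size: int, sort_field: str, order: str = "desc") -> dict:
    """Search body for a 1-based page number, sorted on one field."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("beyond the default result window; use scroll instead")
    return {
        "from": offset,
        "size": size,
        "query": {"match_all": {}},
        "sort": [{sort_field: {"order": order}}],
    }


def scroll_hits(send: Callable[[str, str, dict], dict], index: str,
                size: int = 1000, keep_alive: str = "1m") -> Iterator[dict]:
    """Yield every hit in the index by following _scroll_id between batches."""
    response = send("POST", f"/{index}/_search?scroll={keep_alive}",
                    {"size": size, "query": {"match_all": {}}, "sort": ["_doc"]})
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": response["_scroll_id"]})


# In-memory stand-in for a cluster: three canned responses, the last one empty.
_batches = iter([
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "3"}]}},
    {"_scroll_id": "s1", "hits": {"hits": []}},
])


def fake_send(method: str, path: str, body: dict) -> dict:
    return next(_batches)


scrolled_ids = [hit["_id"] for hit in scroll_hits(fake_send, "test")]
print(page_body(3, 20, "post_date")["from"])  # 40
print(scrolled_ids)  # ['1', '2', '3']
```

In real use, a finished scroll should be released with DELETE /_search/scroll so the cluster can free the search context before the keep-alive expires.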

A practical example demonstrates creating a user_index that uses the ik_max_word analyzer (from the IK analysis plugin, which must be installed separately) for Chinese names, bulk‑inserting sample documents, and running match queries with operator set to and so that all terms must match.
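The request bodies for that walkthrough might look like the sketch below. Only user_index, ik_max_word, and the and operator come from the article; the type name user, the field name name, and the sample names are hypothetical. The bulk body follows the newline-delimited format the _bulk endpoint expects: one action line, then one source line, with a trailing newline.

```python
import json

INDEX = "user_index"
DOC_TYPE = "user"  # hypothetical; ES 5.x and 6.x need a type, 7.x and later do not
FIELD = "name"     # hypothetical field holding the Chinese name

# Mapping that analyzes the name field with the IK plugin's ik_max_word analyzer.
create_body = {
    "mappings": {
        DOC_TYPE: {
            "properties": {FIELD: {"type": "text", "analyzer": "ik_max_word"}}
        }
    }
}


def bulk_body(names: list) -> str:
    """Newline-delimited body for POST /_bulk: action line, then document line."""
    lines = []
    for doc_id, name in enumerate(names, start=1):
        action = {"index": {"_index": INDEX, "_type": DOC_TYPE, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps({FIELD: name}))
    return "\n".join(lines) + "\n"  # the bulk API requires a final newline


def match_all_terms(text: str) -> dict:
    """Match query in which every analyzed term of `text` must be present."""
    return {"query": {"match": {FIELD: {"query": text, "operator": "and"}}}}


sample_bulk = bulk_body(["张三", "张三丰", "李四"])  # hypothetical sample names
print(sample_bulk)
print(json.dumps(match_all_terms("张三")))
```

With the default or operator, a document containing any one of the analyzed terms would match; and narrows the result to documents containing all of them.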

The article concludes that Elasticsearch, combined with Spring Boot, enables rapid addition of powerful search features, but the presented material only scratches the surface of its capabilities.
