
Elasticsearch Pagination: From+size, search_after, and Scroll – Differences, Advantages, and Use Cases

This article explains Elasticsearch’s three pagination methods—From + size, search_after, and Scroll—detailing their definitions, code examples, advantages, disadvantages, and suitable scenarios, while also discussing max_result_window limits, PIT views, and best practices for handling large result sets.

Big Data Technology Architecture

May 6, 2021


1. Frequently Asked Questions about Elasticsearch Pagination

How to retrieve all values of a field (about 1 million) without increasing max_result_window?

How to fetch 20 records per page for front‑end display and request the next 20 on page change?

What are the essential differences and application scenarios of from+size, scroll, and search_after?

2. Three Pagination Methods Supported by Elasticsearch

From + Size query

Search After query

Scroll query

Below we analyse the relationship, differences, pros & cons, and applicable scenarios of these three methods.

2.1 From + Size Pagination

2.1.1 Definition and Practical Example

Basic query:

GET kibana_sample_data_flights/_search

By default, this returns the first 10 matching documents.

Parameters:

from: start offset, default 0 (not 1).

size: number of documents to return, default 10.

Example with explicit from, size, query and sorting:

GET kibana_sample_data_flights/_search
{
  "from": 0,
  "size": 20,
  "query": { "match": { "DestWeather": "Sunny" } },
  "sort": [ { "FlightTimeHour": { "order": "desc" } } ]
}

This returns 20 documents; the from and size parameters define which slice of the result set is shown.
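For a front end that shows a fixed number of records per page, the page number has to be translated into from and size. The following is a minimal sketch of that arithmetic, reusing the query and sort from the example above; the function names and the 1-based page convention are assumptions made for illustration, not part of any Elasticsearch client.

```python
def page_to_from_size(page: int, page_size: int = 20) -> dict:
    """Translate a 1-based page number into Elasticsearch from/size values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # from is a zero-based offset, so page 1 starts at offset 0.
    return {"from": (page - 1) * page_size, "size": page_size}


def build_page_request(page: int, page_size: int = 20) -> dict:
    """Build a search body for one page (hypothetical helper)."""
    body = page_to_from_size(page, page_size)
    body["query"] = {"match": {"DestWeather": "Sunny"}}
    body["sort"] = [{"FlightTimeHour": {"order": "desc"}}]
    return body
```

Calling build_page_request(3) yields a body with "from": 40 and "size": 20, i.e. records 41 to 60 of the sorted result set.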

2.1.2 Advantages, Disadvantages, and Use Cases

Advantages

Supports random page jumps.

Disadvantages

Limited by max_result_window; cannot paginate indefinitely.

Deep pagination becomes slower because each deeper page requires loading more data.

Use Cases

1) Small datasets or large datasets where only the top N (N ≤ 10,000) results are needed.

2) Search engines that allow random page jumps (e.g., Google, Bing, Baidu).

2.1.3 Why From + Size Is Not Recommended for Deep Pagination

Elasticsearch limits the maximum pagination window to avoid performance degradation. The default index.max_result_window is 10,000, meaning with 10 items per page you can only reach page 1,000.

Attempting to request beyond this limit yields an error such as:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10001]. See the scroll api for a more efficient way to request large data sets."
      }
    ]
  }
}
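The limit can also be checked on the client before a request is sent. The sketch below assumes the default index.max_result_window of 10,000; the helper names are hypothetical and the error text is only a paraphrase of the server message.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window


def check_result_window(from_: int, size: int,
                        max_result_window: int = DEFAULT_MAX_RESULT_WINDOW) -> None:
    """Raise ValueError if from + size exceeds the result window."""
    if from_ + size > max_result_window:
        raise ValueError(
            f"from + size must be <= {max_result_window} but was {from_ + size}"
        )


def last_reachable_page(page_size: int,
                        max_result_window: int = DEFAULT_MAX_RESULT_WINDOW) -> int:
    """Return the highest 1-based page number that fits entirely in the window."""
    return max_result_window // page_size
```

With 10 items per page, last_reachable_page(10) gives 1000, which matches the page limit described above.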

Two common solutions are:

Use the scroll API for large data sets.

Increase index.max_result_window (e.g., to 50,000) via:

PUT kibana_sample_data_flights/_settings
{
  "index.max_result_window": 50000
}

Official recommendation: avoid excessive use of from+size for deep pagination or large result sets.

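Deep pagination is expensive because every shard must collect and sort its top from + size hits, and the coordinating node must then merge the candidates from all shards before discarding everything ahead of the requested page. CPU and memory usage therefore grow with page depth.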
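To make the cost of deep pagination concrete, the sketch below counts how many hits must be collected for a single from + size request. It assumes every shard holds at least from + size matching documents, so the numbers are an upper bound; the function name and the shard count in the usage note are illustrative only.

```python
def hits_to_merge(from_: int, size: int, num_shards: int) -> tuple:
    """Return (hits collected per shard, hits merged on the coordinating node).

    Each shard builds its own top (from + size) list, and the coordinating
    node merges one such list per shard before keeping only `size` hits.
    """
    per_shard = from_ + size
    return per_shard, per_shard * num_shards
```

For the last allowed page with the default window, hits_to_merge(9990, 10, 5) returns (10000, 50000): up to 50,000 candidates are merged to return just 10 documents.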

2.2 Search After Pagination

2.2.1 Definition and Practical Example

search_after uses the sort values from the last hit of the previous page to retrieve the next page.

Prerequisite: all subsequent requests must use the same sort order as the initial query, ensuring a stable result sequence.

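To keep results consistent across pages, search_after is usually paired with a Point In Time (PIT) view, introduced in Elasticsearch 7.10. A PIT provides a lightweight snapshot of the index at a specific moment, so index changes between requests do not shift the results.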

Example to create a PIT:

# Create PIT
POST kibana_sample_data_logs/_pit?keep_alive=1m

Search using the PIT and search_after:

# Step 1: First page (uses the PIT id returned by the request above)
POST /_search
{
  "size": 10,
  "query": { "match": { "host": "elastic" } },
  "pit": { "id": "<pit-id>", "keep_alive": "1m" },
  "sort": [ { "response.keyword": "asc" } ]
}

# Step 2: Subsequent page
POST /_search
{
  "size": 10,
  "query": { "match": { "host": "elastic" } },
  "pit": { "id": "<pit-id>", "keep_alive": "1m" },
  "sort": [ { "response.keyword": "asc" } ],
  "search_after": [ "200", 4 ]
}

The array in search_after contains the sort values of the last document from the previous page (e.g., "200" and the _shard_doc value 4). _shard_doc is an implicit tiebreaker field that Elasticsearch adds to PIT searches; it guarantees deterministic ordering when several documents share the same sort value.
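The hand-off between pages can be sketched as a small helper that derives the next request body from the previous request and its response. This is an illustration that operates on plain dictionaries shaped like the search request and response; next_page_body is a hypothetical name, and no Elasticsearch client is involved.

```python
from copy import deepcopy
from typing import Optional


def next_page_body(previous_body: dict, previous_response: dict) -> Optional[dict]:
    """Return the body for the next search_after page, or None when no hits remain."""
    hits = previous_response["hits"]["hits"]
    if not hits:
        return None  # an empty page means the result set is exhausted
    body = deepcopy(previous_body)  # keep query, sort, and size identical
    body["search_after"] = hits[-1]["sort"]  # sort values of the last hit
    # The response may carry an updated PIT id; use the most recent one.
    if "pit" in body and "pit_id" in previous_response:
        body["pit"]["id"] = previous_response["pit_id"]
    return body
```

Copying the previous body keeps the query and sort unchanged between pages, which is the prerequisite for a stable result sequence.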

2.2.2 Advantages, Disadvantages, and Use Cases

Advantages

Not strictly limited by max_result_window; can paginate beyond 10,000 results.

Disadvantages

Only supports forward pagination; random page jumps are impossible.

Use Cases

Mobile or feed‑style applications where users scroll forward (e.g., news feeds like Toutiao).

2.3 Scroll Traversal Query

2.3.1 Definition and Practical Example

The Scroll API retrieves large result sets (potentially all hits) in a cursor‑like fashion. It is suited to full data extraction rather than real‑time paging.

Typical workflow:

Issue an initial search with a scroll parameter to keep the context alive:

POST kibana_sample_data_logs/_search?scroll=3m
{
  "size": 100,
  "query": { "match": { "host": "elastic" } }
}

Repeatedly request the next batch using the returned _scroll_id until no hits remain:

POST _search/scroll
{
  "scroll": "3m",
  "scroll_id": "<scroll-id>"
}
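The two-step workflow above amounts to a loop that follows _scroll_id until a batch comes back empty. The sketch below shows only that control flow: start_search and next_batch are hypothetical callables that the caller supplies to send the two requests shown above and return the parsed JSON responses.

```python
from typing import Callable, Iterator


def scroll_all(start_search: Callable[[], dict],
               next_batch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every hit by following _scroll_id until a batch comes back empty."""
    response = start_search()  # the initial _search?scroll=... request
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit
        # Each response carries the _scroll_id to use for the next request.
        response = next_batch(response["_scroll_id"])
```

Injecting the two callables keeps the loop independent of any particular HTTP or Elasticsearch client library.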

2.3.2 Advantages, Disadvantages, and Use Cases

Advantages

Supports full‑index traversal.

Disadvantages

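Not real‑time: results reflect the index as it was when the initial search ran, so later changes are not visible. Response time can also be longer for large data sets.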

Requires heap memory to retain the scroll context.

Use Cases

Exporting or processing the entire dataset when pagination is insufficient.

When deep pagination beyond 10,000 results is needed, the official recommendation is to use PIT + search_after instead of scroll.

3. Summary

From + size: Suitable for random page jumps and top‑10k results.

search_after: Ideal for forward‑only pagination beyond 10k results.

Scroll: Best for full‑data traversal.

Increasing max_result_window only masks the problem; it should not be set excessively.

PIT provides a consistent snapshot view for reliable pagination.

All statements are based on the official Elasticsearch documentation.

