
How to Achieve MySQL‑LIKE Style Fuzzy Search in Elasticsearch 8.x

This article walks through the challenge of implementing MySQL LIKE-style leading-and-trailing wildcard search in Elasticsearch. It compares match, match_phrase, n-gram analysis, legacy wildcard queries, and the wildcard field type introduced in ES 7.9, with code samples, performance figures, and practical recommendations for choosing among them.

Rare Earth Juejin Tech Community

Nov 7, 2025


Introduction

When a product manager demanded MySQL-style LIKE '%keyword%' fuzzy search, the team working with Elasticsearch ran into a deeper technical challenge than expected.

Tokenization Basics

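Tokenization is the concept to understand first. Elasticsearch indexes and matches the tokens that an analyzer produces, not the raw text, which is why substring search does not come for free.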

Tokenization Example

Original text: "苹果手机真香" ("The Apple phone is really great")
Tokens: ["苹果", "手机", "真", "香"] ("Apple", "phone", "really", "great")
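To see why this matters for LIKE-style search, here is a minimal Python sketch. It is not Elasticsearch code: the token list is copied by hand from the example above, and both function names are made up for illustration. It contrasts a raw substring test with a whole-token lookup.

```python
# Illustration only: tokens are copied from the example above,
# not produced by a real analyzer.
text = "苹果手机真香"
tokens = ["苹果", "手机", "真", "香"]


def like_contains(text: str, keyword: str) -> bool:
    """MySQL LIKE '%keyword%' semantics: a plain substring test."""
    return keyword in text


def token_lookup(tokens: list[str], keyword: str) -> bool:
    """Inverted-index semantics: the keyword must equal a whole token."""
    return keyword in set(tokens)


print(like_contains(text, "果手"))   # True: "果手" is a substring of the text
print(token_lookup(tokens, "果手"))  # False: no token equals "果手"
print(token_lookup(tokens, "手机"))  # True: "手机" is a token
```

The keyword "果手" straddles two tokens, so a token-level lookup cannot find it even though the substring is present.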

Match Query and Its Limitation

GET /products/_search
{
  "query": {
    "match": {
      "name": "苹果手机"
    }
  }
}

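Problem: The default match query combines the query tokens with the or operator, so any document containing either "苹果" or "手机" matches. That returns many unrelated results such as "苹果电脑" (Apple computer) or "华为手机" (Huawei phone).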

Match with Operator "and"

GET /products/_search
{
  "query": {
    "match": {
      "name": {
        "query": "苹果手机",
        "operator": "and"
      }
    }
  }
}

Result: Only documents containing both "苹果" and "手机" are returned, but the two tokens may appear in any order and need not be adjacent.
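The difference between the two operators can be modeled in a few lines of Python. This is a rough sketch of the boolean logic only; it ignores scoring, and the sample documents and their token lists are hand-written assumptions rather than analyzer output.

```python
def match(doc_tokens, query_tokens, operator="or"):
    """Toy model of a match query: 'or' needs any token, 'and' needs all."""
    doc = set(doc_tokens)
    hits = [token in doc for token in query_tokens]
    return all(hits) if operator == "and" else any(hits)


# Hypothetical documents with hand-written tokenization.
docs = {
    "苹果手机": ["苹果", "手机"],
    "苹果电脑": ["苹果", "电脑"],
    "华为手机": ["华为", "手机"],
    "手机壳 苹果": ["手机", "壳", "苹果"],
}
query = ["苹果", "手机"]

or_hits = [name for name, toks in docs.items() if match(toks, query, "or")]
and_hits = [name for name, toks in docs.items() if match(toks, query, "and")]

print(or_hits)   # all four documents
print(and_hits)  # ['苹果手机', '手机壳 苹果']
```

Note that "手机壳 苹果" still passes the and check: both tokens are present, just in a different order.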

Match_phrase

GET /products/_search
{
  "query": {
    "match_phrase": {
      "name": "苹果手机"
    }
  }
}

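Result: The tokens must appear adjacent and in the given order, so the phrase matches exactly. Matching is still token-based, though: with the word-level tokenization shown earlier, a substring that cuts across token boundaries, such as "果手", finds nothing.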
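A phrase match can be pictured as looking for the query tokens as a contiguous run inside the document's token list. The sketch below is a simplification under that assumption; the real match_phrase also supports a slop parameter, which is ignored here, and the token lists are hand-written.

```python
def phrase_match(doc_tokens, query_tokens):
    """Toy model: query tokens must appear consecutively and in order."""
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


doc = ["苹果", "手机", "真", "香"]

print(phrase_match(doc, ["苹果", "手机"]))                 # True
print(phrase_match(["手机", "壳", "苹果"], ["苹果", "手机"]))  # False: wrong order
print(phrase_match(doc, ["果手"]))                        # False: not a token
```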

n‑gram + match_phrase (Pre‑7.9)

PUT /products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer"
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 3
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      }
    }
  }
}

GET /products/_search
{
  "query": {
    "match_phrase": {
      "name": "果手"
    }
  }
}

Result: The query matches "苹果手机" through the substring "果手", provided the query text is analyzed into the same n‑grams that were indexed.

✅ Supports substring matching anywhere.

❌ Index size grows roughly 3×.

❌ Query performance degrades.

❌ Requires careful n‑gram tuning.
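The trade-off above becomes concrete once the grams are written out. The following Python sketch approximates what an n-gram tokenizer with min_gram 2 and max_gram 3 emits; it is an illustration written for this article, not the tokenizer's actual implementation.

```python
def ngrams(text, min_gram=2, max_gram=3):
    """Emit every substring of length min_gram..max_gram, by start position."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


grams = ngrams("苹果手机")
print(grams)            # ['苹果', '苹果手', '果手', '果手机', '手机']
print("果手" in grams)  # True: the substring is now an indexed term
print(len(grams))       # 5 terms, versus 2 word-level tokens
```

Every two- and three-character substring becomes its own term, which is what makes "果手" findable and also what inflates the index.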

Wildcard Queries (Pre‑7.9)

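A wildcard query can also run against an ordinary keyword field, but it is risky on large indices. Note that the case_insensitive parameter used in the request that follows was only added in Elasticsearch 7.10.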

GET /products/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*iPhone*",
        "case_insensitive": true
      }
    }
  }
}

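A leading wildcard (*) means the pattern has no fixed prefix to narrow the search, so Elasticsearch has to check every term in the field's term dictionary, which causes high CPU and memory usage on large indices.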
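The cost difference can be illustrated with a sorted list standing in for the term dictionary. This is a deliberately simplified model with made-up terms and function names; Lucene's real data structures are different, but the idea that a fixed prefix allows skipping most terms while a leading wildcard does not is the same.

```python
import bisect
from fnmatch import fnmatchcase

# Hypothetical term dictionary for a keyword field, kept in sorted order.
terms = sorted([
    "galaxy s24", "iphone 15", "iphone 15 pro",
    "ipad air", "pixel 9", "apple iphone case",
])


def prefix_lookup(terms, prefix):
    """prefix* : binary-search to the first candidate, stop at the first miss."""
    i = bisect.bisect_left(terms, prefix)
    hits, visited = [], 0
    while i < len(terms):
        visited += 1
        if not terms[i].startswith(prefix):
            break
        hits.append(terms[i])
        i += 1
    return hits, visited


def leading_wildcard_scan(terms, pattern):
    """*text* : nothing to seek to, so every term is tested."""
    hits = [t for t in terms if fnmatchcase(t, pattern)]
    return hits, len(terms)


print(prefix_lookup(terms, "iphone"))            # 2 hits, 3 terms visited
print(leading_wildcard_scan(terms, "*iphone*"))  # 3 hits, all 6 terms visited
```

With six terms the difference is trivial; with millions of distinct values per shard, visiting every term on every query is what makes the legacy approach expensive.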

Wildcard Field Type (ES 7.9+)

PUT /products
{
  "mappings": {
    "properties": {
      "name": {
        "type": "wildcard"
      }
    }
  }
}

GET /products/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*果手*"
      }
    }
  }
}

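Performance: The reported figures are roughly 25 ms query latency and an index size of about 1.4×, with low impact on the cluster. Internally, the wildcard field indexes n‑grams of the whole value and keeps the full string to verify candidate matches, which is why leading wildcards stay cheap.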
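When the keyword comes from user input, the request body is usually built in application code. The helper below is a minimal sketch of that step: the function name is hypothetical, and it assumes that `*`, `?`, and `\` in the keyword should be treated literally and can be escaped with a backslash, as in Lucene's wildcard syntax.

```python
import json


def like_to_wildcard_query(field: str, keyword: str) -> dict:
    """Build a wildcard query body equivalent to LIKE '%keyword%'."""
    # Escape characters that the wildcard syntax would otherwise interpret.
    escaped = "".join("\\" + ch if ch in "*?\\" else ch for ch in keyword)
    return {"query": {"wildcard": {field: {"value": f"*{escaped}*"}}}}


body = like_to_wildcard_query("name", "果手")
print(json.dumps(body, ensure_ascii=False))
# {"query": {"wildcard": {"name": {"value": "*果手*"}}}}
```

The resulting dictionary can be serialized and sent as the body of GET /products/_search with whichever HTTP or Elasticsearch client the project already uses.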

Comparison Summary

match: Simple, low precision.

match + operator "and": Better relevance, order‑independent.

match_phrase: Exact phrase, order‑sensitive.

n‑gram + match_phrase: Full fuzzy capability, high index cost.

Legacy wildcard: Easy to use but terrible performance.

Wildcard field type: Best for front‑and‑back fuzzy matching with good performance.

Final Recommendation

Deploy an Elasticsearch 8.x cluster.

Use the wildcard field type for fuzzy matching requirements.

Keep traditional searches with match_phrase or other mature queries.
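One way to follow both recommendations on the same data is a multi-field mapping: the main field stays a text field for match_phrase, and a sub-field of type wildcard serves LIKE-style lookups. The sketch below is an assumption about how that could be laid out, not a configuration taken from the original article; the sub-field name wc and the two helper functions are hypothetical.

```python
import json

# Hypothetical index body: "name" for full-text search, "name.wc" for substring search.
mapping = {
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {"wc": {"type": "wildcard"}},
            }
        }
    }
}


def phrase_query(keyword: str) -> dict:
    """Traditional full-text search against the analyzed text field."""
    return {"query": {"match_phrase": {"name": keyword}}}


def substring_query(keyword: str) -> dict:
    """LIKE '%keyword%' style search against the wildcard sub-field."""
    return {"query": {"wildcard": {"name.wc": {"value": f"*{keyword}*"}}}}


print(json.dumps(mapping, ensure_ascii=False, indent=2))
print(json.dumps(substring_query("果手"), ensure_ascii=False))
```

The mapping would be sent with PUT /products at index creation time; storing the value twice costs additional disk space, so it is worth measuring on real data first.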

Tip: If a product manager asks for deep pagination, remind them that even large platforms limit pages for usability.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
