Fundamentals 10 min read

Understanding Brute‑Force, Boyer‑Moore, and KMP String‑Matching Algorithms with Java Code

This article explains the Brute‑Force (BF), Boyer‑Moore (BM), and Knuth‑Morris‑Pratt (KMP) string‑matching algorithms, illustrates their operation with step‑by‑step visual examples, discusses the inefficiencies of BF, the shift rules of BM, and the prefix‑suffix logic of KMP, and provides a complete Java implementation including the construction of the next array.

Sohu Tech Products
Sohu Tech Products
Sohu Tech Products
Understanding Brute‑Force, Boyer‑Moore, and KMP String‑Matching Algorithms with Java Code

The article revisits three classic string‑matching algorithms—Brute‑Force (BF), Boyer‑Moore (BM), and Knuth‑Morris‑Pratt (KMP)—showing how each compares a pattern string against a text string and highlighting their respective strengths and weaknesses.

BF is described as a straightforward but inefficient method that slides the pattern one position at a time, performing character‑by‑character comparisons until a mismatch occurs, which leads to many unnecessary checks.

BM improves on BF by using the "bad character" and "good suffix" heuristics to shift the pattern multiple positions in a single step, dramatically reducing the number of comparisons required.

KMP avoids redundant comparisons by pre‑computing a next array that records the length of the longest proper prefix which is also a suffix for each prefix of the pattern; during matching, the algorithm jumps to the next viable position based on this array when a mismatch (bad character) is encountered.

The construction of the next array is illustrated with a concrete example (pattern GTGTGC ), showing how the array values are derived by comparing characters, backtracking using previously computed next values, and handling mismatches.

The article then provides a complete Java implementation of the KMP algorithm, including the main search routine and the helper method that builds the next array. The code is presented unchanged and wrapped in tags for clarity.

// KMP algorithm core logic. str is the main string, pattern is the pattern string
public static int kmp(String str, String pattern) {
    // preprocess, generate next array
    int[] next = getNexts(pattern);
    int j = 0;
    // main loop, iterate over main string characters
    for (int i = 0; i < str.length(); i++) {
        while (j > 0 && str.charAt(i) != pattern.charAt(j)) {
            // when encountering a bad character, query next array and change pattern start
            j = next[j];
        }
        if (str.charAt(i) == pattern.charAt(j)) {
            j++;
        }
        if (j == pattern.length()) {
            // match found, return index
            return i - pattern.length() + 1;
        }
    }
    return -1;
}

// generate Next array
private static int[] getNexts(String pattern) {
    int[] next = new int[pattern.length()];
    int j = 0;
    for (int i = 2; i < pattern.length(); i++) {
        while (j != 0 && pattern.charAt(j) != pattern.charAt(i - 1)) {
            // backtrack to next[j]
            j = next[j];
        }
        if (pattern.charAt(j) == pattern.charAt(i - 1)) {
            j++;
        }
        next[i] = j;
    }
    return next;
}

public static void main(String[] args) {
    String str = "ATGTGAGCTGGTGTGTGCFAA";
    String pattern = "GTGTGCF";
    int index = kmp(str, pattern);
    System.out.println("首次出现位置:" + index);
}

Through detailed diagrams and the provided source code, readers gain a clear understanding of how each algorithm works and how to implement KMP efficiently in Java.

JavaalgorithmKMPstring matchingBoyer-MooreNext Array
Sohu Tech Products
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.