Understanding Brute‑Force, Boyer‑Moore, and KMP String‑Matching Algorithms with Java Code
This article explains the Brute‑Force (BF), Boyer‑Moore (BM), and Knuth‑Morris‑Pratt (KMP) string‑matching algorithms, illustrates their operation with step‑by‑step visual examples, discusses the inefficiencies of BF, the shift rules of BM, and the prefix‑suffix logic of KMP, and provides a complete Java implementation including the construction of the next array.
The article revisits three classic string‑matching algorithms—Brute‑Force (BF), Boyer‑Moore (BM), and Knuth‑Morris‑Pratt (KMP)—showing how each compares a pattern string against a text string and highlighting their respective strengths and weaknesses.
BF is described as a straightforward but inefficient method that slides the pattern one position at a time, performing character‑by‑character comparisons until a mismatch occurs, which leads to many unnecessary checks.
BM improves on BF by using the "bad character" and "good suffix" heuristics to shift the pattern multiple positions in a single step, dramatically reducing the number of comparisons required.
KMP avoids redundant comparisons by pre‑computing a next array that records the length of the longest proper prefix which is also a suffix for each prefix of the pattern; during matching, the algorithm jumps to the next viable position based on this array when a mismatch (bad character) is encountered.
The construction of the next array is illustrated with a concrete example (pattern GTGTGC ), showing how the array values are derived by comparing characters, backtracking using previously computed next values, and handling mismatches.
The article then provides a complete Java implementation of the KMP algorithm, including the main search routine and the helper method that builds the next array. The code is presented unchanged and wrapped in tags for clarity.
// KMP algorithm core logic. str is the main string, pattern is the pattern string
public static int kmp(String str, String pattern) {
// preprocess, generate next array
int[] next = getNexts(pattern);
int j = 0;
// main loop, iterate over main string characters
for (int i = 0; i < str.length(); i++) {
while (j > 0 && str.charAt(i) != pattern.charAt(j)) {
// when encountering a bad character, query next array and change pattern start
j = next[j];
}
if (str.charAt(i) == pattern.charAt(j)) {
j++;
}
if (j == pattern.length()) {
// match found, return index
return i - pattern.length() + 1;
}
}
return -1;
}
// generate Next array
private static int[] getNexts(String pattern) {
int[] next = new int[pattern.length()];
int j = 0;
for (int i = 2; i < pattern.length(); i++) {
while (j != 0 && pattern.charAt(j) != pattern.charAt(i - 1)) {
// backtrack to next[j]
j = next[j];
}
if (pattern.charAt(j) == pattern.charAt(i - 1)) {
j++;
}
next[i] = j;
}
return next;
}
public static void main(String[] args) {
String str = "ATGTGAGCTGGTGTGTGCFAA";
String pattern = "GTGTGCF";
int index = kmp(str, pattern);
System.out.println("首次出现位置:" + index);
}Through detailed diagrams and the provided source code, readers gain a clear understanding of how each algorithm works and how to implement KMP efficiently in Java.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.