Mastering Regular Expressions: Essential Rules and Advanced Techniques
This article provides a comprehensive guide to regular expressions, covering basic concepts, character classes, quantifiers, special symbols, greedy vs. lazy matching, backreferences, lookahead/lookbehind assertions, and practical tips for writing robust patterns in programming.
What Is a Regular Expression?
A regular expression (regex) is a string pattern that describes a set of strings; a regex engine then checks whether other text matches that pattern. Typical uses include validating email addresses, searching within text, and performing flexible replacements.
Basic Rules
Literal Characters
Letters, digits, Chinese characters, underscores, and any punctuation without special meaning match themselves directly.
Escape Characters
Special characters are escaped with a backslash, e.g., \r (carriage return), \n (newline), \t (tab), \\ (a literal backslash), and symbols such as \^, \$, and \. (a literal caret, dollar sign, and period).
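As a minimal sketch of why escaping matters, the following Python snippet (using the standard re module; the sample strings are made up for illustration) compares an unescaped period with an escaped one:

```python
import re

unescaped = re.compile(r"3.14")   # "." is a metacharacter: any character
escaped = re.compile(r"3\.14")    # "\." matches only a literal period

loose_hit = unescaped.fullmatch("3x14") is not None   # True
strict_hit = escaped.fullmatch("3x14") is not None    # False
strict_ok = escaped.fullmatch("3.14") is not None     # True

# \t inside the pattern matches a real tab character in the text.
has_tab = re.search(r"name\tvalue", "name\tvalue") is not None

# re.escape() escapes the metacharacters in arbitrary literal text.
literal_price = re.escape("$5.00")
price_hit = re.search(literal_price, "cost: $5.00") is not None
```

Writing patterns as raw strings (the r"..." prefix) keeps Python from interpreting the backslashes before the regex engine sees them.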
Character Classes
\d: any digit (0‑9)
\w: any word character (letters, digits, underscore)
\s: any whitespace character
.: any character except newline
Custom classes can be defined with brackets, e.g., [123] matches 1, 2, or 3, and [^abc] matches any character except a, b, or c.
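The classes above can be tried out with a short Python sketch. The sample strings are illustrative only, and re.findall is used simply to list every single-character match:

```python
import re

digits = re.findall(r"\d", "Room 42, floor 7")   # ['4', '2', '7']
word_chars = re.findall(r"\w", "a_1 -")          # ['a', '_', '1']
spaces = re.findall(r"\s", "a b\tc")             # [' ', '\t']
any_char = re.findall(r".", "a\nb")              # ['a', 'b'], newline skipped

custom = re.findall(r"[123]", "0123456")         # ['1', '2', '3']
negated = re.findall(r"[^abc]", "abcdef")        # ['d', 'e', 'f']
```

In Python 3, \d and \w on str patterns also match non-ASCII digits and letters; passing the re.ASCII flag restricts them to the ASCII ranges.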
Quantifiers
{n}: exactly n repetitions
{m,n}: between m and n repetitions
{m,}: at least m repetitions
?: 0 or 1 time
+: 1 or more times
*: 0 or more times
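To make the repetition counts concrete, here is a small Python sketch. The helper name matches and the sample list are hypothetical, introduced only for this illustration; each quantifier is applied to the single character a and checked against the whole string:

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

samples = ["", "a", "aa", "aaa", "aaaa"]

exactly_two = matches(r"a{2}", samples)      # ['aa']
two_to_three = matches(r"a{2,3}", samples)   # ['aa', 'aaa']
at_least_two = matches(r"a{2,}", samples)    # ['aa', 'aaa', 'aaaa']
optional = matches(r"a?", samples)           # ['', 'a']
one_or_more = matches(r"a+", samples)        # ['a', 'aa', 'aaa', 'aaaa']
zero_or_more = matches(r"a*", samples)       # every sample, including ''
```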
Special Symbols
^: start of string (or line with multiline mode)
$: end of string (or line with multiline mode)
\b: word boundary
|: alternation (OR)
( ): grouping and capturing
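A brief Python sketch of these symbols follows. The log text is invented for the example, and re.MULTILINE is the flag Python uses for multiline mode:

```python
import re

log = "error: disk full\nwarning: low memory\nerror: timeout"

# Without re.MULTILINE, ^ anchors only at the start of the whole string.
first_only = re.findall(r"^error", log)                 # ['error']
each_line = re.findall(r"^error", log, re.MULTILINE)    # ['error', 'error']

# With re.MULTILINE, $ also matches at the end of each line.
line_ends = re.findall(r"\w+$", log, re.MULTILINE)      # ['full', 'memory', 'timeout']

# \b keeps "cat" from matching inside "concatenate".
whole_word = re.findall(r"\bcat\b", "cat concatenate cat.")   # ['cat', 'cat']

# | chooses between alternatives; ( ) groups them and captures the text.
m = re.search(r"(error|warning): (\w+)", log)
level, first_word = m.group(1), m.group(2)              # 'error', 'disk'
```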
Advanced Rules
Greedy vs. Lazy Matching
Quantifiers are greedy by default, matching as much as possible; appending ? makes them lazy, matching as little as needed.
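The difference is easiest to see on text with repeated delimiters. A minimal Python sketch, using an invented HTML-like string:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)    # ['<b>bold</b> and <i>italic</i>']

# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)     # ['<b>', '</b>', '<i>', '</i>']
```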
Backreferences
Captured groups can be referenced later in the same pattern with \1, \2, etc.; each backreference matches the exact text its group captured, allowing reuse of previously matched text.
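Two common uses of backreferences, sketched in Python with made-up sample text: finding a doubled word, and requiring a closing quote to be the same character as the opening one.

```python
import re

# \1 must repeat exactly what group 1 captured.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
repeated_word = doubled.group(1)                  # 'is'

# Backreferences also work in replacement strings.
deduped = re.sub(r"\b(\w+) \1\b", r"\1", "this is is a test")   # 'this is a test'

# Group 1 captures the opening quote; \1 requires the same closing quote.
quoted = re.search(r"(['\"])(.*?)\1", 'say "it\'s" now')
inner = quoted.group(2)                           # "it's"
```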
Lookahead and Lookbehind
Positive lookahead (?=pattern) asserts that pattern follows without consuming characters, while negative lookahead (?!pattern) asserts it does not. Similarly, positive lookbehind (?<=pattern) and negative lookbehind (?<!pattern) assert conditions before the current position.
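The four assertions can be compared side by side in a short Python sketch. The sample text is invented; each pattern returns only the digits, because the assertions themselves consume nothing:

```python
import re

text = "price: $100, cost: 200 EUR, id 300"

# Positive lookahead: digits that are followed by " EUR".
before_eur = re.findall(r"\d+(?= EUR)", text)           # ['200']

# Negative lookahead: whole numbers not followed by " EUR".
not_before_eur = re.findall(r"\b\d+\b(?! EUR)", text)   # ['100', '300']

# Positive lookbehind: digits preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", text)          # ['100']

# Negative lookbehind: whole numbers not preceded by a dollar sign.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", text)    # ['200', '300']
```

Python's re module requires lookbehind patterns to have a fixed width, so something like (?<=\$+) is rejected; other engines differ on this point.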
Practical Tips
Use ^ and $ to anchor the entire string.
Use \b to match whole words.
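Avoid patterns that can match an empty string (such as a* on its own): they succeed at every position, and a hand-written loop that resumes searching where the last match ended can then loop forever.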
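Ensure alternation branches do not overlap ambiguously: most backtracking engines try branches left to right and keep the first one that succeeds, so put the longer or more specific branch first.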
Choose greedy or lazy quantifiers wisely based on the desired match.
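Several of these tips can be checked with a few lines of Python. This is only a sketch with made-up inputs; the behavior shown is that of Python's re module, and other engines may differ in detail:

```python
import re

# Anchoring: without ^ and $, a pattern may match just part of the input.
unanchored = re.search(r"\d{3}", "abc12345") is not None     # True
anchored = re.search(r"^\d{3}$", "abc12345") is not None     # False

# Alternation order: the first branch that succeeds wins.
short_first = re.search(r"get|getValue", "getValue").group()   # 'get'
long_first = re.search(r"getValue|get", "getValue").group()    # 'getValue'

# Empty matches: a* succeeds even where there is no "a" at all,
# so the result contains empty strings; a+ does not have this problem.
star_hits = re.findall(r"a*", "baab")
plus_hits = re.findall(r"a+", "baab")                          # ['aa']
```

In Python, re.fullmatch(pattern, text) is an alternative to wrapping a pattern in ^ and $ when the whole string must match.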
Source: Backend Tech Talk, author: 飒然Hang