
Mastering Regular Expressions: Essential Rules and Advanced Techniques

This article provides a comprehensive guide to regular expressions, covering basic concepts, character classes, quantifiers, special symbols, greedy vs. lazy matching, backreferences, lookahead/lookbehind assertions, and practical tips for writing robust patterns in programming.

Efficient Ops

What Is a Regular Expression?

A regular expression (regex) is a pattern, written as a string, that describes a set of strings. A regex engine checks whether other text matches that pattern, which makes regexes useful for validating input such as email addresses, searching within text, and performing flexible replacements.
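The three uses named above (validation, search, replacement) can be sketched with Python's standard `re` module. The email pattern and the function names here are assumptions made for illustration; the pattern is deliberately simplified and is not a complete address validator.

```python
import re

# Hypothetical, simplified email pattern (illustration only; real-world
# address validation is considerably more involved).
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")

def is_email(text: str) -> bool:
    """Validation: does the whole string fit the pattern?"""
    return EMAIL.match(text) is not None

def find_numbers(text: str) -> list:
    """Search: collect every run of digits in the text."""
    return re.findall(r"\d+", text)

def collapse_spaces(text: str) -> str:
    """Replacement: squeeze each run of whitespace into a single space."""
    return re.sub(r"\s+", " ", text)
```

For example, `find_numbers("a1b22c333")` returns the three digit runs as strings, and `collapse_spaces` turns tabs and newlines into single spaces.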

Basic Rules

Literal Characters

Letters, digits, Chinese characters, underscores, and any punctuation without special meaning match themselves directly.

Escape Characters

A backslash introduces an escape sequence, e.g., \r (carriage return), \n (newline), and \t (tab). It also removes the special meaning of a metacharacter: \\ matches a literal backslash, and \^, \$, and \. match a literal caret, dollar sign, and dot.
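A short Python sketch of the difference between an escaped and an unescaped metacharacter. The sample strings are made up for illustration; `re.escape` is the standard-library helper that adds the backslashes automatically.

```python
import re

# Unescaped, "." is a wildcard, so this pattern also matches "3x14".
assert re.fullmatch(r"3.14", "3x14") is not None

# Escaped, "\." matches only a literal dot.
assert re.fullmatch(r"3\.14", "3x14") is None
assert re.fullmatch(r"3\.14", "3.14") is not None

# "\$" matches a literal dollar sign instead of acting as an anchor.
price = re.search(r"\$\d+", "total: $42")

# re.escape() escapes metacharacters in text that should be matched literally.
literal = re.escape("a.b*c")
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the regex engine sees them.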

Character Classes

\d: any digit (0-9)

\w: any word character (letters, digits, underscore)

\s: any whitespace character

.: any character except newline

Custom classes are defined with square brackets: [123] matches 1, 2, or 3, and [^abc] matches any single character except a, b, or c.
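The classes above can be tried against a small sample string. This is a minimal Python sketch; the sample text and variable names are invented for the example.

```python
import re

sample = "Room 12, floor_3\tB"

digits = re.findall(r"\d", sample)            # each digit separately
words = re.findall(r"\w+", sample)            # runs of letters, digits, underscore
spaces = re.findall(r"\s", sample)            # two spaces and one tab
one_two_three = re.findall(r"[123]", sample)  # only the characters 1, 2, 3
only_abc = re.sub(r"[^abc]", "", "a1b2c3d4")  # delete everything except a, b, c

# "." stops at a newline unless the re.DOTALL flag is passed.
first_line = re.match(r".*", "line one\nline two").group()
all_lines = re.match(r".*", "line one\nline two", flags=re.DOTALL).group()
```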

Quantifiers

{n}: exactly n repetitions

{m,n}: between m and n repetitions

{m,}: at least m repetitions

?: 0 or 1 time

+: 1 or more times

*: 0 or more times
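Each quantifier above can be checked with a whole-string match. In this Python sketch, `full` is a hypothetical helper introduced only for the example, and the sample inputs are invented.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

checks = {
    "exactly 4 digits": full(r"\d{4}", "2024"),     # {n}
    "2 to 3 a's":       full(r"a{2,3}", "aaa"),     # {m,n}
    "at least 2 a's":   full(r"a{2,}", "aaaaa"),    # {m,}
    "optional u":       full(r"colou?r", "color"),  # ?
    "one or more":      full(r"\d+", "7"),          # +
    "zero or more":     full(r"\d*", ""),           # *
}
```

Every entry in `checks` is true. The counterexamples fail as expected: `\d{4}` rejects "202", `a{2,3}` rejects "aaaa", and `\d+` rejects the empty string.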

Special Symbols

^: start of string (or of each line in multiline mode)

$: end of string (or of each line in multiline mode)

\b: word boundary

|: alternation (OR)

( ): grouping and capturing
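A minimal Python sketch of these symbols in use, assuming a small made-up input. In Python, multiline mode is enabled with the `re.MULTILINE` flag.

```python
import re

text = "cat\ncatalog\nbobcat"

# Without re.MULTILINE, ^ matches only at the very start of the string.
start_only = re.findall(r"^cat", text)
# With re.MULTILINE, ^ also matches right after each newline.
each_line = re.findall(r"^cat", text, flags=re.MULTILINE)

# \b keeps "cat" from matching inside "catalog" or "bobcat".
whole_word = re.findall(r"\bcat\b", text)

# ( ) groups the alternation and captures which branch matched.
m = re.fullmatch(r"(jpg|png)-(\d+)", "png-7")
```

Here `m.groups()` holds the text captured by each pair of parentheses, in order.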

Advanced Rules

Greedy vs. Lazy Matching

Quantifiers are greedy by default, matching as much as possible; appending ? (as in *?, +?, or {m,n}?) makes them lazy, matching as little as needed.
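The difference is easiest to see on text with repeated delimiters. A short Python sketch, using an invented HTML-like string:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last "</b>", so there is one match spanning both tags.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first "</b>" it can, giving one match per tag pair.
lazy = re.findall(r"<b>.*?</b>", html)
```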

Backreferences

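Captured groups can be referenced later in the same pattern with \1, \2, etc. A backreference matches the exact text its group captured, not the group's pattern, which allows reuse of previously matched text.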
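As a sketch of how backreferences behave in Python, the pattern below finds a word that is immediately repeated. The sample sentences are invented; note that Python also accepts \1 and \2 in the replacement string passed to `re.sub`.

```python
import re

# \1 must repeat exactly what group 1 captured, so this finds doubled words.
doubled = re.compile(r"\b(\w+) \1\b")

m = doubled.search("this is is a test")

# In a replacement string, \1 inserts the captured text.
fixed = doubled.sub(r"\1", "this is is a test")

# With two groups, \2 refers to the second one.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@host")
```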

Lookahead and Lookbehind

Positive lookahead (?=pattern) asserts that pattern follows the current position without consuming characters, while negative lookahead (?!pattern) asserts that it does not. Similarly, positive lookbehind (?<=pattern) and negative lookbehind (?<!pattern) assert conditions on the text before the current position.
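The four assertions can be sketched in Python as follows. The file names, prices, and numbers are invented sample data. In each case, the text checked by the assertion is not part of the returned match.

```python
import re

# Positive lookahead: a name only if ".py" follows it.
py_names = re.findall(r"\w+(?=\.py)", "main.py util.js test.py")

# Negative lookahead: "foo" only if "bar" does NOT follow it.
foo_not_bar = re.search(r"foo(?!bar)", "foobar foobaz")

# Positive lookbehind: digits only if "USD" comes right before them.
usd_amounts = re.findall(r"(?<=USD)\d+", "USD100 EUR200 USD300")

# Negative lookbehind: a number only if no minus sign comes right before it.
non_negative = re.findall(r"(?<!-)\b\d+", "5 -3 8")
```

Python's `re` module requires a lookbehind pattern to have a fixed width; other engines differ on this point.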

Practical Tips

Use ^ and $ to anchor the entire string.

Use \b to match whole words.

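Watch for patterns that can match an empty string, such as \d*. A hand-written loop that restarts each search where the previous match ended never advances past an empty match, which can produce an infinite loop.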

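Order alternation branches deliberately. Backtracking engines generally try branches from left to right and take the first one that succeeds, so when one branch is a prefix of another (as in Jan|January), put the longer branch first or anchor the pattern.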

Choose greedy or lazy quantifiers wisely based on the desired match.
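Several of these tips can be observed directly in Python. The inputs below are invented, and the behaviour shown is that of Python's `re` module; other engines may differ in detail.

```python
import re

# Anchoring: without ^...$ a pattern can match just part of the input.
unanchored = re.search(r"\d{3}", "abc1234") is not None   # finds "123"
anchored = re.search(r"^\d{3}$", "abc1234") is not None   # whole string must fit

# Ordered alternation: the first branch that succeeds wins.
short_first = re.match(r"Jan|January", "January").group()
long_first = re.match(r"January|Jan", "January").group()

# Empty matches: \d* can match nothing at all, so the result also
# contains empty strings; \d+ returns only the real digit runs.
with_star = re.findall(r"\d*", "a1b")
with_plus = re.findall(r"\d+", "a1b")
```

`short_first` is "Jan" even though the longer branch would also have matched, which is why branch order matters when alternatives overlap.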

Source: Backend Tech Talk, author: 飒然Hang