
Python Regular Expressions (re Module) – Concepts, Syntax, and Common Functions

This article explains Python regular expressions, covering ordinary and special characters, special escape sequences, quantifier types, the re module’s compile function with flags, and the most frequently used pattern‑object methods such as match, search, findall, finditer, split, sub and subn, plus practical usage notes.

Python Programming Learning Circle

Regular expressions (Regex) are textual patterns composed of ordinary characters and special meta‑characters that describe and match strings according to specific syntactic rules.

Ordinary characters include all printable and non‑printable characters that are not defined as meta‑characters, such as letters, digits, punctuation and other symbols.

Special meta‑characters and their meanings are listed in the table below:

Special Character    Description
$                    Matches at the end of the string, or just before a trailing newline; with the MULTILINE flag it also matches at the end of each line.
()                   Groups a sub‑expression and captures its match for later use.
*                    Matches the preceding sub‑expression zero or more times.
+                    Matches the preceding sub‑expression one or more times.
.                    Matches any character except a newline (any character at all when the DOTALL flag is set).
[ ]                  Matches any single character inside the brackets, e.g., [0-9a-zA-Z].
?                    Matches the preceding sub‑expression zero or one time, or makes the preceding quantifier lazy.
\                    Escapes a meta‑character so it is treated as a literal, or introduces a special sequence such as \d.
^                    Matches the start of the string (and of each line with MULTILINE); as the first character inside brackets, negates the set.
{}                   Specifies an exact count or a range of repetitions, e.g., {n}, {n,}, {n,m}.
|                    Acts as an OR operator between two alternatives.
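A few of the meta‑characters above can be tried out in a short sketch. The sample strings and variable names are made up for illustration; only standard re functions are used.

```python
import re

# '.' matches any character except a newline, so "a\nc" is skipped.
dot = re.findall(r"a.c", "abc a-c a\nc")
print(dot)            # ['abc', 'a-c']

# '[ ]' with '+': one or more characters from the set 0-9.
digits = re.findall(r"[0-9]+", "room 101, floor 7")
print(digits)         # ['101', '7']

# '?' makes the preceding 'u' optional.
optional = re.findall(r"colou?r", "color colour")
print(optional)       # ['color', 'colour']

# '|' chooses between alternatives.
either = re.findall(r"cat|dog", "cat, dog, bird")
print(either)         # ['cat', 'dog']

# '{n,m}' bounds the repetition count: two or three digits per match.
bounded = re.findall(r"\d{2,3}", "1 22 333 4444")
print(bounded)        # ['22', '333', '444']

# '^' and '$' anchor the pattern, '()' captures, and '\.' is a literal dot.
groups = re.search(r"^(\w+)@(\w+)\.com$", "alice@example.com").groups()
print(groups)         # ('alice', 'example')
```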

A backslash followed by certain letters forms a special sequence: either a shorthand for a character class or a zero‑width assertion. Common ones include:

Escape    Description
\b        Word boundary.
\d        Digit character.
\w        Word character (letters, digits, underscore).
\s        Whitespace.
\B        Non‑word boundary.
\D        Non‑digit.
\W        Non‑word character.
\S        Non‑whitespace.
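The following sketch shows how some of these sequences behave on made‑up sample strings (the strings and variable names are illustrative assumptions, not part of the original article):

```python
import re

text = "Order 66 shipped to cat_lover on 2024-01-15"

# \d+ : runs of digits.
numbers = re.findall(r"\d+", text)
print(numbers)        # ['66', '2024', '01', '15']

# \w+ : runs of word characters; the underscore counts, the hyphen does not.
words = re.findall(r"\w+", "a_b c-d")
print(words)          # ['a_b', 'c', 'd']

# \s+ : any run of whitespace (spaces, tabs, newlines) as a separator.
parts = re.split(r"\s+", "one  two\tthree")
print(parts)          # ['one', 'two', 'three']

# \b : 'cat' only as a whole word; 'concat' and 'cat_lover' are rejected.
whole = re.findall(r"\bcat\b", "cat concat cat_lover cat.")
print(whole)          # ['cat', 'cat']

# \B : 'cat' only when it does NOT start at a word boundary.
inner = re.findall(r"\Bcat", "cat concat")
print(inner)          # ['cat']  (the one inside 'concat')

# \D : delete every non-digit character.
only_digits = re.sub(r"\D", "", "+1 (555) 010-9999")
print(only_digits)    # 15550109999
```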

Quantifiers have three important concepts:

Greedy (e.g., *, +, ?, {n,m}) – tries to match as much as possible, backtracking when necessary.

Lazy (a ? appended to a quantifier, e.g., *?, +?, ??) – matches as little as possible.

Possessive (a + appended to a quantifier, e.g., *+, ++, ?+; supported by the re module since Python 3.11) – matches as much as possible and never backtracks.
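The difference between the three modes is easiest to see side by side. This is a minimal sketch on a made‑up string; the possessive part runs only on Python 3.11 or later, where the re module accepts that syntax.

```python
import re
import sys

html = "<b>bold</b> and <i>italic</i>"

# Greedy: '.+' runs to the end, then backtracks to the last '>'.
greedy = re.findall(r"<.+>", html)
print(greedy)   # ['<b>bold</b> and <i>italic</i>']

# Lazy: '.+?' stops at the first '>' it can reach.
lazy = re.findall(r"<.+?>", html)
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']

if sys.version_info >= (3, 11):
    # Backtracking: 'a+' gives one 'a' back so the final 'a' can match.
    backtracking = re.search(r"a+a", "aaaa")
    print(backtracking.group())   # aaaa

    # Possessive: 'a++' keeps every 'a', so the final 'a' never matches.
    possessive = re.search(r"a++a", "aaaa")
    print(possessive)             # None
```

Possessive quantifiers are mainly useful for avoiding pointless backtracking on input that cannot match.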

The re module provides the compile() function to compile a pattern into a reusable object. Common flags include:

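Flag                 Meaning
re.S (DOTALL)        Dot matches newline characters.
re.I (IGNORECASE)    Case‑insensitive matching.
re.L (LOCALE)        Locale‑aware matching; valid only with bytes patterns, and its use is discouraged.
re.M (MULTILINE)     ^ and $ match start/end of each line.
re.X (VERBOSE)       Allows whitespace and comments in the pattern.
re.U (UNICODE)       Unicode‑aware matching (affects \w, \W, \b, \B, \d, \D, \s, \S); already the default for str patterns in Python 3, so the flag is redundant there.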
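As a rough illustration of the flags above, the sketch below applies them to a made‑up three‑line string. Flags can be passed to compile() or to the module‑level functions, and combined with the | operator.

```python
import re

text = "First line\nsecond LINE\nthird line"

# re.I: 'line' matches regardless of case.
any_case = re.findall(r"line", text, re.I)
print(any_case)       # ['line', 'LINE', 'line']

# Without re.M, '^' matches only at the start of the string.
first_only = re.findall(r"^\w+", text)
print(first_only)     # ['First']

# With re.M, '^' matches at the start of every line.
each_line = re.findall(r"^\w+", text, re.M)
print(each_line)      # ['First', 'second', 'third']

# Without re.S, '.' stops at '\n', so the pattern cannot span lines.
no_dotall = re.search(r"First.*third", text)
print(no_dotall)      # None

# With re.S, '.' also matches '\n'.
dotall = re.search(r"First.*third", text, re.S)
print(dotall is not None)   # True

# re.X: whitespace and '#' comments inside the pattern are ignored.
date = re.compile(r"""
    (\d{4})    # year
    -(\d{2})   # month
    -(\d{2})   # day
""", re.X | re.I)
date_groups = date.search("released 2024-01-15").groups()
print(date_groups)    # ('2024', '01', '15')
```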

Typical pattern‑object methods:

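match() – checks for a match only at the beginning of the string (or at the position given by the optional pos argument).

search() – scans the string and returns the first match found, or None.

findall() – returns a list of all non‑overlapping matches, as strings, or as the captured groups when the pattern contains groups.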

finditer() – returns an iterator yielding match objects.

split() – splits a string by the pattern.

sub() – replaces matches with a replacement string or function.

subn() – like sub() but also returns the number of substitutions made.
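The methods finditer(), split(), sub() and subn() can be sketched as follows. The sample string and variable names are made up for illustration.

```python
import re

pattern = re.compile(r"\d+")
text = "aaa111bbb22c3"

# finditer(): one match object per match, with its text and position.
spans = [(m.group(), m.span()) for m in pattern.finditer(text)]
print(spans)      # [('111', (3, 6)), ('22', (9, 11)), ('3', (12, 13))]

# split(): the digits act as separators; a match at the very end
# leaves an empty string as the last element.
pieces = pattern.split(text)
print(pieces)     # ['aaa', 'bbb', 'c', '']

# split() with maxsplit: stop after the first separator.
limited = pattern.split(text, maxsplit=1)
print(limited)    # ['aaa', 'bbb22c3']

# sub() with a replacement string.
replaced = pattern.sub("#", text)
print(replaced)   # aaa#bbb#c#

# sub() with a function: receives each match object, returns the new text.
doubled = pattern.sub(lambda m: str(int(m.group()) * 2), text)
print(doubled)    # aaa222bbb44c6

# subn(): same as sub(), plus the number of replacements made.
result, count = pattern.subn("#", text)
print(result, count)   # aaa#bbb#c# 3
```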

Example code snippets:

<code>import re
pattern = re.compile(r"\d+")
# match() only looks at the start of the string, which is "a" here
match = pattern.match("aaa123bbb")
print(match)  # None (no match at start)
# pos=3 and endpos=6 restrict matching to the slice "123"
match = pattern.match("aaa123bbb", 3, 6)
print(match)   # <re.Match object; span=(3, 6), match='123'>
print(match.group())  # 123
</code>
<code>import re
pattern = re.compile(r"\d+")
match = pattern.search("aaaa1111bbbb1234")
print(match.group())  # 1111
</code>
<code>import re
pattern = re.compile(r"\d+")
print(pattern.findall("aaaa1111bbbb1234cccc1243"))  # ['1111', '1234', '1243']
</code>

Important notes:

re.match only matches at the start of the string, whereas re.search searches anywhere and re.findall returns all matches.

Greedy quantifiers (*, +, ?, {m,n}) try to consume as much as possible; appending ? makes them lazy (non‑greedy), e.g., *?, +?, ??, {m,n}?.

