Understanding Regular Expressions in Python with Practical Code Examples
This article introduces Python's regular expression capabilities, covering fundamental concepts, metacharacters, character classes, quantifiers, and practical usage of the re module's functions such as search, match, findall, and sub, with clear code examples.
Regular expressions (often called regex) are powerful tools for text processing and pattern matching. In Python, the re module provides rich regex functionality, allowing developers to efficiently handle complex text.
Basic Concepts of Regular Expressions
A regular expression is a sequence of characters that defines a search pattern, useful for tasks such as searching, editing, and manipulating text.
Literal Text
Plain text in a regex matches the exact characters in the target string. For example, to find the word "cat" in "The cat sat on the mat", the pattern cat matches the text directly.
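A minimal sketch of the literal match described above, using re.search() from Python's re module (the variable names are only illustrative):

```python
import re

text = "The cat sat on the mat"

# A pattern made only of literal characters matches that exact text
match = re.search("cat", text)

print(match.group())  # cat
print(match.span())   # (4, 7)
```

The span shows where the match sits in the string: it starts at index 4 and ends just before index 7.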
Metacharacters
Metacharacters have special meanings in a regex:
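. matches any character except a newline (unless the re.DOTALL flag is set).
^ matches the start of a string (and, with the re.MULTILINE flag, the start of each line).
$ matches the end of a string, or just before a newline at the very end of the string.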
* matches zero or more repetitions of the preceding element.
+ matches one or more repetitions of the preceding element.
? matches zero or one repetition of the preceding element.
{n} matches exactly n repetitions of the preceding element.
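The metacharacters listed above can be tried out with a short sketch like the following. The sample strings are made up for illustration, and re.findall() and re.search() are used only as convenient ways to show what each pattern matches:

```python
import re

# . matches any single character except a newline
dot = re.findall(r"c.t", "cat cot c\nt")
print(dot)  # ['cat', 'cot']

# ^ and $ anchor the pattern to the start and end of the string
starts = bool(re.search(r"^The", "The end"))
ends = bool(re.search(r"end$", "The end"))
print(starts, ends)  # True True

# *, + and ? control how often the preceding element may repeat
star = re.findall(r"ab*", "a ab abb")
plus = re.findall(r"ab+", "a ab abb")
optional = re.findall(r"colou?r", "color colour")
print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['color', 'colour']

# {n} requires exactly n repetitions
exact = re.findall(r"a{2}", "a aa aaa")
print(exact)  # ['aa', 'aa']
```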
Character Classes
Character classes match any character from a given set. Common examples include:
[abc] matches any of a, b, or c.
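\d matches any digit. For str patterns in Python 3 this covers every Unicode decimal digit; it is equivalent to [0-9] only with the re.ASCII flag or a bytes pattern.
\w matches any word character (letters, digits, underscore); for str patterns this includes Unicode letters and digits.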
\s matches any whitespace character.
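As a rough illustration of the character classes above, the sketch below runs each one over a small made-up string. The last two lines are an aside on Python 3 behaviour: with str patterns \d also matches non-ASCII decimal digits unless the re.ASCII flag is passed.

```python
import re

letters = re.findall(r"[abc]", "a big cab")
print(letters)  # ['a', 'b', 'c', 'a', 'b']

digits = re.findall(r"\d", "Room 42")
print(digits)  # ['4', '2']

words = re.findall(r"\w+", "snake_case x2!")
print(words)  # ['snake_case', 'x2']

spaces = re.findall(r"\s", "a b\tc\n")
print(spaces)  # [' ', '\t', '\n']

# "\u0663" is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit
unicode_digits = re.findall(r"\d", "7\u0663")
ascii_digits = re.findall(r"\d", "7\u0663", flags=re.ASCII)
print(len(unicode_digits), len(ascii_digits))  # 2 1
```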
Quantifiers
Quantifiers specify how many times a character or group may appear:
* : 0 or more.
+ : 1 or more.
? : 0 or 1.
{n} : exactly n.
{n,} : n or more.
{n,m} : between n and m, inclusive.
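The counted quantifiers can be compared side by side with a sketch like this one. Note one assumption that goes beyond the list above: the patterns use \b (a word boundary) so that each run of digits is matched as a whole rather than in pieces.

```python
import re

text = "1 22 333 4444 55555"

# {n}: exactly three digits
exactly_three = re.findall(r"\b\d{3}\b", text)
print(exactly_three)  # ['333']

# {n,}: three or more digits
three_or_more = re.findall(r"\b\d{3,}\b", text)
print(three_or_more)  # ['333', '4444', '55555']

# {n,m}: between two and four digits
two_to_four = re.findall(r"\b\d{2,4}\b", text)
print(two_to_four)  # ['22', '333', '4444']
```

Without the \b anchors, a pattern such as \d{3} would also match the first three digits of the longer numbers.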
The re Module in Python
The re module is part of Python's standard library and requires no installation. Below are basic examples of its most common functions.
re.search()
This function scans the string and returns a match object for the first location where the pattern matches, or None if there is no match anywhere.
<code>import re
# Define the pattern
pattern = r"\d+"
# Define the text to analyze
text = "There are 123 apples."
# Apply the regex
match = re.search(pattern, text)
print(match.group())</code>
Output:
<code>123</code>
The pattern r"\d+" is a raw string in which \d matches any digit and + requires one or more occurrences.
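Because re.search() returns None when nothing matches, calling .group() on the result without a check raises an AttributeError. One way to guard against that is sketched below; first_number is a hypothetical helper written for this example, not part of the re module.

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    match = re.search(r"\d+", text)
    # A match object is truthy; a failed search returns None
    return match.group() if match else None

print(first_number("There are 123 apples."))  # 123
print(first_number("No apples here."))        # None
```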
re.match()
This function checks for a match only at the beginning of the string.
<code>import re
# Define the pattern
pattern = r"\d+"
# Define the text to analyze
text = "There are 123 apples."
# Apply the regex
match = re.match(pattern, text)
print(match)</code>
Output:
<code>None</code>
In this case, the string does not start with a digit, so no match is found.
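For contrast, here is a small sketch with one string that does begin with digits and one that does not. The sample strings are made up for illustration.

```python
import re

pattern = r"\d+"

# The digits are at the very start, so re.match() succeeds
at_start = re.match(pattern, "123 apples are here")
print(at_start.group())  # 123

# The digits appear later in the string, so re.match() returns None
later = re.match(pattern, "There are 123 apples.")
print(later is None)  # True
```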
re.findall()
This function returns a list of all non‑overlapping matches of the pattern in the string.
<code>import re
pattern = r"\d+"
text = "There are 123 apples in 2 trees."
matches = re.findall(pattern, text)
print(matches)</code>
Output:
<code>['123', '2']</code>
Two numbers are found in the string.
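Note that re.findall() returns strings, so numeric work needs a conversion step. The sketch below shows that, and also one behaviour not covered above: when the pattern contains capturing groups (parentheses), re.findall() returns tuples of the groups instead of the whole match.

```python
import re

text = "There are 123 apples in 2 trees."

# Matches come back as strings; convert them before doing arithmetic
numbers = [int(n) for n in re.findall(r"\d+", text)]
print(numbers)       # [123, 2]
print(sum(numbers))  # 125

# With two capturing groups, each result is a (number, word) tuple
pairs = re.findall(r"(\d+) (\w+)", text)
print(pairs)  # [('123', 'apples'), ('2', 'trees')]
```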
re.sub()
This function replaces occurrences of the pattern with a replacement string.
<code>import re
# Define the pattern
pattern = r"\d+"
# Define the text to analyze
text = "There are 123 apples."
# Apply the regex
replaced_text = re.sub(pattern, "many", text)
print(replaced_text)</code>
Output:
<code>There are many apples.</code>
The function replaces all matches, so multiple numbers can be replaced at once:
<code>import re
# Define the pattern
pattern = r"\d+"
# Define the text to analyze
text = "There are 123 apples in 3 trees."
# Apply the regex
replaced_text = re.sub(pattern, "many", text)
print(replaced_text)</code>
Output:
<code>There are many apples in many trees.</code>
When to Use Which Function
re.search() – use when you need to find the first occurrence of a pattern anywhere in the string.
re.match() – use when you need to verify that the string starts with a specific pattern.
re.findall() – use when you need all occurrences of a pattern as a list.
re.sub() – use when you need to replace every occurrence of a pattern with another string.
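To make the differences concrete, the following sketch applies the same pattern to one made-up string with each of the functions covered in this article:

```python
import re

pattern = r"\d+"
text = "Call 555 or 777"

# First match anywhere in the string
found = re.search(pattern, text)
print(found.group())  # 555

# Match only at the beginning: the string starts with "Call", so None
at_start = re.match(pattern, text)
print(at_start)  # None

# Every non-overlapping match, as a list of strings
all_found = re.findall(pattern, text)
print(all_found)  # ['555', '777']

# Every match replaced by the given string
replaced = re.sub(pattern, "N", text)
print(replaced)  # Call N or N
```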
This concludes the basic tutorial; advanced usage will be covered in future articles.
Python Programming Learning Circle