Working with Python
Regular Expressions
Regular expressions in Python use the standard library module re.
It is recommended to use raw strings for regex patterns to avoid issues with escaping special characters. Since Python 3.6, an invalid escape sequence such as "\d" in a non-raw string raises a DeprecationWarning, which Python 3.12 upgraded to a SyntaxWarning.
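As a minimal sketch of why raw strings matter, consider \b, which is a valid string escape (backspace) and so produces no warning at all, but silently changes what the pattern means. The variable names here are only illustrative.

```python
import re

text = "the cat sat"

# In a normal string, "\b" is the backspace character, not a regex word boundary.
plain = "\bcat\b"
# In a raw string, the backslashes reach the regex engine untouched.
raw = r"\bcat\b"

plain_match = re.search(plain, text)  # looks for backspace + "cat" + backspace
raw_match = re.search(raw, text)      # looks for the whole word "cat"

print(plain_match)         # None
print(raw_match.group(0))  # cat

# Doubling each backslash in a normal string is equivalent to the raw string.
print("\\bcat\\b" == raw)  # True
```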
Basic regular expression matching
Typically I want to use re.search instead of re.match. The difference is that re.search scans the entire string and matches the expression anywhere, while re.match only matches at the beginning of the string.
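import re

s = "There are 13 dogs outside."

# re.match only tries at the start of the string, so this finds nothing.
m = re.match(r"(\d+)", s)
if m is None:
    print("No match")

# re.search scans the whole string and finds the digits.
m = re.search(r"(\d+)", s)
if m is not None:
    print(f"Match found: {m.group(1)} dogs")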
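The anchoring behavior can be compared side by side in a short sketch. It also uses re.fullmatch, which is not mentioned above and is the function to reach for when the pattern must cover the whole string. The sample string is an assumption chosen so that re.match succeeds.

```python
import re

s = "13 dogs outside"

# re.match anchors at the start of the string; it succeeds here
# because the string begins with digits.
at_start = re.match(r"\d+", s)

# re.fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(r"\d+", s)
whole_ok = re.fullmatch(r"\d+ dogs outside", s)

# re.search finds the first match anywhere in the string.
anywhere = re.search(r"dogs", s)

print(at_start.group(0))     # 13
print(whole)                 # None
print(whole_ok is not None)  # True
print(anywhere.start())      # 3
```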
Both re.match and re.search return a Match object, or None if no match is found. On the Match object, group(0) is the entire portion of the string that matched, and group(1), group(2), and so on are the substrings captured by each pair of parentheses, in order.
import re

s = "There are 13 dogs outside."

# re.match must start at the beginning, so .*? consumes the leading text.
m = re.match(r".*?(\d+)\s(\w+)", s)
if m is None:
    print("No match")
else:
    print(f"Matched String: {m.group(0)}")
    print(f"First Paren: {m.group(1)}")
    print(f"Second Paren: {m.group(2)}")

# re.search needs no leading .*? because it scans for the pattern.
m = re.search(r"(\d+)\s(\w+)", s)
if m is not None:
    print(f"Matched String: {m.group(0)}")
    print(f"First Paren: {m.group(1)}")
    print(f"Second Paren: {m.group(2)}")
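When there are several groups, it can be easier to fetch them all at once or to name them. The sketch below uses the same sentence; the group names count and animal are made up for illustration.

```python
import re

s = "There are 13 dogs outside."

# (?P<name>...) gives a group a name in addition to its number.
m = re.search(r"(?P<count>\d+)\s(?P<animal>\w+)", s)
if m is not None:
    print(m.groups())         # ('13', 'dogs')
    print(m.group("count"))   # 13
    print(m.group(2))         # dogs (named groups keep their numbers)
    print(m.groupdict())      # {'count': '13', 'animal': 'dogs'}
```

Captured groups are always strings, so convert with int(m.group("count")) if a number is needed.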
Regular Expression Substitution
Use the re.sub() function to replace text based on a regular expression.
import re

s = "There are 13 dogs by 4 windows."

# Every run of digits is replaced, not just the first one.
s = re.sub(r"\d+", "2", s)
print(s)
# There are 2 dogs by 2 windows.
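re.sub() also accepts a count argument, group references in the replacement, and a function as the replacement. The following is a small sketch of those three options; the helper name double is hypothetical.

```python
import re

s = "There are 13 dogs by 4 windows."

# count=1 limits the substitution to the first match.
first_only = re.sub(r"\d+", "2", s, count=1)
print(first_only)  # There are 2 dogs by 4 windows.

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\d+) (\w+)", r"\2 (\1)", s)
print(swapped)  # There are dogs (13) by windows (4).

# A function receives each Match object and returns the replacement text.
def double(m):
    return str(int(m.group(0)) * 2)

doubled = re.sub(r"\d+", double, s)
print(doubled)  # There are 26 dogs by 8 windows.
```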