This is a small notebook to review some examples of regular expressions
import re # This just imports python's regex package
In Python we can use re.search(pattern, str) to search for a given pattern in a string.
This call returns a single Match object describing the first place the pattern was found in the provided str, or None if the pattern does not occur anywhere in it.
pattern = 'world'
string = 'hello world'
match = re.search(pattern, string)
print(match)
Note how the match object reports span=(6, 11). That is because the word 'world' begins at index 6 and ends just before index 11: the end of a span is exclusive, so the last matched character sits at index 10.
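Since re.search gives back None when nothing matches, it is usually worth checking the result before using it. The cell below is only a small sketch of that check; describe_search is a hypothetical helper name invented for this illustration, not something from the re module.

```python
import re

# Hypothetical helper (not part of the re module): summarise a search result.
def describe_search(pattern, text):
    match = re.search(pattern, text)
    if match is None:
        # re.search returns None when the pattern is not found anywhere
        return "no match"
    return f"found {match.group(0)!r} at {match.span()}"

found = describe_search('world', 'hello world')
missing = describe_search('planet', 'hello world')
print(found)
print(missing)
```

Calling a method such as group() directly on a None result raises an AttributeError, which is why the explicit check comes first.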
We can extract more granular information by using the start(), end(), and group() methods of the match object.
print(f"Match at index {match.start()}-{match.end()}")
print(f"Full match: {match.group(0)}")
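One way to convince yourself how start() and end() relate to the original string is to slice with them. This is a minimal sketch; the variable names text, m, and sliced are chosen here for illustration only.

```python
import re

text = 'hello world'
m = re.search('world', text)

# span() returns (start, end); end is exclusive, just like Python slicing
start, end = m.span()
sliced = text[start:end]

print(sliced)        # same text as m.group(0)
print(end - start)   # length of the matched text
```

Because the end index is exclusive, the span can be passed straight into a slice without any +1 or -1 adjustment.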
In many instances we won't know the exact expressions, so we'll need to leverage the special patterns in regular expressions to find our matches.
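As a rough illustration of what those special patterns look like, the cell below uses a character class, a shorthand class, and quantifiers. The sample sentence is made up for this example, and raw strings (r'...') are used so that backslashes reach the regex engine unchanged.

```python
import re

# Made-up sample text, only for illustration
sample = 'Order 66 shipped to unit B-12 on 2024-01-27'

digits = re.search(r'\d+', sample)               # \d is any digit, + means one or more
date = re.search(r'\d{4}-\d{2}-\d{2}', sample)   # {n} means exactly n repetitions
code = re.search(r'[A-Z]-\d+', sample)           # [A-Z] is any one uppercase letter

print(digits.group(0))
print(date.group(0))
print(code.group(0))
```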
We can also find all instances (as an iterator) using re.finditer(<pattern>, <string>), which can be looped through to see each result.
string_2 = 'There are 3 blind mice, 5 little pigs, 12 angry men, 500 hats of Bartholomew Cubbins'
pattern_2 = '[0-9]+'
for match in re.finditer(pattern_2, string_2):
    print(match.group(0))
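If only the matched text is needed and not the positions, re.findall is a possible alternative: it returns a plain list of strings instead of Match objects. A small sketch using the same sentence (the names text, numbers, and as_ints are chosen here for illustration):

```python
import re

text = 'There are 3 blind mice, 5 little pigs, 12 angry men, 500 hats of Bartholomew Cubbins'

# findall returns the matched substrings directly as a list of str
numbers = re.findall(r'[0-9]+', text)

# the matches are still strings, so convert them before doing arithmetic
as_ints = [int(n) for n in numbers]

print(numbers)
print(sum(as_ints))
```

re.finditer remains the better fit when start() and end() of each match matter, since findall discards that information.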
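You may have noticed that each match is accessed through a group index, as in match.group(0). Group 0 is always the entire match.
Wrapping part of a pattern in () creates a capture group, and capture groups are numbered from 1 in the order their opening parentheses appear. A group that starts with ?: is non-capturing: it groups part of the pattern without receiving a number.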
string_3 = "yesterday was January 26th, today is January 27th, next week it will be February 2nd"
pattern_3 = '([a-zA-Z]+) ([0-9]+)(?:st|[nr]d|th)'
for match in re.finditer(pattern_3, string_3):
    print(f"Month: {match.group(1)} - Day: {match.group(2)}")
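Numbered groups can get hard to follow as patterns grow, so Python also lets a group be given a name with (?P<name>...). The cell below is a sketch of the same month/day pattern using named groups; the group names month and day are choices made for this example.

```python
import re

text = "yesterday was January 26th, today is January 27th, next week it will be February 2nd"

# Same pattern as above, but each capture group carries a name
named_pattern = r'(?P<month>[a-zA-Z]+) (?P<day>[0-9]+)(?:st|[nr]d|th)'

dates = []
for m in re.finditer(named_pattern, text):
    # group('name') looks a group up by name; group(1) and group(2) still work too
    dates.append((m.group('month'), int(m.group('day'))))

print(dates)
```

The (?:st|[nr]d|th) suffix stays non-capturing, so it takes part in matching but does not appear among the groups.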