Named Groups of Regular Expressions

Make regular expressions more readable with named groups.

In [8]:
m.group(2)
Out[8]:
'Mackenzie'
In [16]:
m.group('first_name')
Out[16]:
'Mackenzie'

The Zen of Python, by Tim Peters

Readability counts.

In [1]:
import re

Regular expressions can be used to indicate if a string matches a pattern or not.
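
As a minimal sketch of that yes/no use, the snippet below checks two sample strings against a simple pattern. The pattern and the helper name `looks_like_name` are illustrative assumptions, not part of this notebook's own cells.

```python
import re

# Hypothetical pattern for illustration: one or more ASCII letters.
word_pattern = re.compile(r'[A-Za-z]+')

def looks_like_name(text):
    """Return True if text consists only of ASCII letters."""
    # fullmatch returns a match object on success and None on failure.
    return word_pattern.fullmatch(text) is not None

print(looks_like_name('Mackenzie'))
print(looks_like_name('Mack3nzie'))
```

Testing the result against `None` turns the match object into a plain boolean, which is all that is needed when only a match/no-match answer matters.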

Regular expressions can also be used for parsing. The substrings of interest are captured in groups. The traditional way of referring to a group is by index number. Python also allows referring to a group by name.

Using names gives both the regular expression and the references to match groups more meaning. Names make Python code more readable.

In [2]:
foo_pattern = re.compile('''
    ^
    ([A-Za-z]+)
    ,[ ]
    ([A-Za-z]+)
    $
''', re.VERBOSE)
In [3]:
s = 'James, Mackenzie'
In [4]:
m = re.match(foo_pattern, s)
m
Out[4]:
<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>
In [5]:
m.groups()
Out[5]:
('James', 'Mackenzie')
In [6]:
m.group(0)
Out[6]:
'James, Mackenzie'
In [7]:
m.group(1)
Out[7]:
'James'
In [8]:
m.group(2)
Out[8]:
'Mackenzie'
In [9]:
foo_pattern = re.compile('''
    ^
    (?P<last_name>[A-Za-z]+)
    ,[ ]
    (?P<first_name>[A-Za-z]+)
    $
''', re.VERBOSE)
In [10]:
m = re.match(foo_pattern, s)
m
Out[10]:
<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>
In [11]:
m.groups()
Out[11]:
('James', 'Mackenzie')
In [12]:
m.group(0)
Out[12]:
'James, Mackenzie'
In [13]:
m.group(1)
Out[13]:
'James'
In [14]:
m.group(2)
Out[14]:
'Mackenzie'
In [15]:
m.group('last_name')
Out[15]:
'James'
In [16]:
m.group('first_name')
Out[16]:
'Mackenzie'
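
The named-group cells above can be pulled together into one self-contained sketch. It also uses `Match.groupdict()`, which this notebook does not otherwise show, to get all named groups at once as a dictionary. The variable name `name_pattern` and the inline pattern comments are illustrative additions.

```python
import re

# Same pattern as the named-group version above, under a
# descriptive (hypothetical) variable name.
name_pattern = re.compile(r'''
    ^
    (?P<last_name>[A-Za-z]+)    # family name before the comma
    ,[ ]                        # comma followed by one literal space
    (?P<first_name>[A-Za-z]+)   # given name after the comma
    $
''', re.VERBOSE)

m = name_pattern.match('James, Mackenzie')

# Index and name refer to the same captured substring.
assert m.group(2) == m.group('first_name')

# groupdict() maps every group name to its captured text.
parts = m.groupdict()
print(parts)
print('{first_name} {last_name}'.format(**parts))
```

Because `re.VERBOSE` ignores whitespace and `#` comments outside character classes, each group can be documented right where it is defined, and the names carry through to the code that reads the match.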

Was Python the first to implement names for groups in regular expressions?