CRÜCIAL PŸTHON Week 2: Iteration

In [1]:
from IPython.core.display import Image 
Image(url='http://labrosa.ee.columbia.edu/crucialpython/logo.png', width=600) 
Out[1]:

Iterators

Python makes it easy to iterate over objects that are collections of things (iterables). Among other things, this keeps for loops really clean.

In [2]:
a_list = ['hey', [1, 2, 3], 2, 10, {}]
for item in a_list:
    print item
hey
[1, 2, 3]
2
10
{}
In [3]:
# Another example of an iterator is a file.
# Each line keeps its trailing newline, so print leaves a blank line after it.
with open('some_file.txt', 'r') as f:
    for line in f:
        print line
there are some lines

in this text file

here's another one

and another
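
Under the hood, a for loop asks the object for an iterator with iter() and then calls next() on it until StopIteration is raised. The sketch below spells that out by hand. manual_loop is a made-up helper name, not part of Python, and print is called with a single parenthesized argument so the code runs under both Python 2 and Python 3.

```python
a_list = ['hey', [1, 2, 3], 2, 10, {}]


def manual_loop(iterable):
    """Collect items the way a for loop visits them (hypothetical helper)."""
    collected = []
    iterator = iter(iterable)        # what `for` does once, up front
    while True:
        try:
            item = next(iterator)    # what `for` does on every pass
        except StopIteration:        # raised when nothing is left
            break
        collected.append(item)
    return collected


print(manual_loop(a_list) == a_list)  # True
```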

enumerate

Often, you want both the items in an object and their indices. A naive way to do it would be to construct an iterator over the list indices and then access your list using each index. enumerate() provides a cleaner way: it yields (index, item) pairs.

In [4]:
a_dict = {}
# This is kind of ugly and only works for objects that support len() and indexing
for n in xrange(len(a_list)):
    if isinstance(a_list[n], int):
        a_dict[n] = n*a_list[n]
# You know about pprint (pretty print), right?
import pprint
pprint.pprint(a_dict)
{2: 4, 3: 30}
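
For comparison, here is a sketch of the same loop written with enumerate(), which hands back the index and the item together, so there is no a_list[n] lookup and no len() call. It rebuilds a_list and a_dict so that it runs on its own, and uses print with one parenthesized argument so it works under both Python 2 and Python 3.

```python
a_list = ['hey', [1, 2, 3], 2, 10, {}]
a_dict = {}
# enumerate yields (index, item) pairs
for n, item in enumerate(a_list):
    if isinstance(item, int):
        a_dict[n] = n * item
print(a_dict)  # {2: 4, 3: 30}
```
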
In [6]:
# File objects don't have a len() because their lines are read lazily, not loaded into memory up front.
# But, you can still use enumerate, which will give you an index.
# Plus, it's cleaner.
for n, item in enumerate(open('some_file.txt', 'r')):
    a_dict[n + 5] = item
pprint.pprint(a_dict)
{2: 4,
 3: 30,
 5: 'there are some lines\n',
 6: 'in this text file\n',
 7: "here's another one\n",
 8: 'and another\n'}
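
enumerate() also accepts a starting index as its second argument, which would avoid the n + 5 arithmetic. A minimal sketch, assuming a plain list of strings (lines, a stand-in introduced here) in place of some_file.txt:

```python
# Stand-in for the lines of some_file.txt, so the sketch needs no file on disk
lines = ['there are some lines\n', 'in this text file\n',
         "here's another one\n", 'and another\n']
a_dict = {2: 4, 3: 30}
# Start counting at 5 instead of 0
for n, item in enumerate(lines, 5):
    a_dict[n] = item
print(sorted(a_dict))  # [2, 3, 5, 6, 7, 8]
```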

zip

One situation where you might be tempted to iterate over list indices is when you need to access multiple lists at a time. zip() is your friend here.

In [7]:
b_list = ['cool', 'great', 'awesome', 'nice', 'sweet']
c_list = range(10, 15)
# Naw
for n in xrange(len(a_list)):
    print n, a_list[n], b_list[n], c_list[n]
0 hey cool 10
1 [1, 2, 3] great 11
2 2 awesome 12
3 10 nice 13
4 {} sweet 14
In [8]:
# Yeah
for a_item, b_item, c_item in zip(a_list, b_list, c_list):
    print a_item, b_item, c_item
hey cool 10
[1, 2, 3] great 11
2 awesome 12
10 nice 13
{} sweet 14
In [10]:
# When you need indices and items across lists, enumerate and zip!
for n, (a_item, b_item, c_item) in enumerate(zip(a_list, b_list, c_list)):
    print n, a_item, b_item, c_item
0 hey cool 10
1 [1, 2, 3] great 11
2 2 awesome 12
3 10 nice 13
4 {} sweet 14
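
Two properties of zip() are worth knowing: it stops as soon as the shortest input runs out, and zip(*pairs) undoes a zip. The sketch below wraps the calls in list() so it behaves the same under Python 2 (where zip returns a list) and Python 3 (where it is lazy).

```python
b_list = ['cool', 'great', 'awesome', 'nice', 'sweet']
c_list = list(range(10, 15))

# zip stops at the shortest input: only three pairs come out here
pairs = list(zip(b_list, c_list[:3]))
print(pairs)  # [('cool', 10), ('great', 11), ('awesome', 12)]

# zip(*pairs) transposes the pairs back into two tuples
words, numbers = zip(*pairs)
print(words)    # ('cool', 'great', 'awesome')
print(numbers)  # (10, 11, 12)

# Pairing keys with values is a common way to build a dict
lookup = dict(zip(b_list, c_list))
print(lookup['nice'])  # 13
```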

itertools

The itertools module provides a lot of fancy and useful functions for combining iterators. In general, if you have some kind of funky iteration you want to try, check that it isn't already done more cleanly here: http://docs.python.org/2/library/itertools.html

In [11]:
import itertools
In [13]:
# We can add lists to concatenate them, but we can't add list + file.
# Plus, we don't know how many items the file has in it.
# Chain lets us iterate over file first, then the list.
# We could do this in two for loops, but then we'd have to duplicate code.
for item in itertools.chain(open('some_file.txt', 'r'), a_list):
    print item
there are some lines

in this text file

here's another one

and another

hey
[1, 2, 3]
2
10
{}
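
A self-contained sketch of the same idea, assuming a generator function (fake_file, a hypothetical stand-in for the open file) so that no file on disk is needed. Like a file, the generator has no len() and can't be added to a list, but chain() walks it and then the list in one loop. chain.from_iterable() does the same for a single collection of iterables.

```python
import itertools


def fake_file():
    """Hypothetical stand-in for open('some_file.txt'): yields lines lazily."""
    yield 'there are some lines\n'
    yield 'in this text file\n'


a_list = ['hey', [1, 2, 3], 2, 10, {}]

# One loop over both sources: the generator first, then the list
combined = list(itertools.chain(fake_file(), a_list))
print(len(combined))  # 7

# from_iterable takes one iterable of iterables instead of separate arguments
flattened = list(itertools.chain.from_iterable([[1, 2], (3,), 'ab']))
print(flattened)  # [1, 2, 3, 'a', 'b']
```
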
In [14]:
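# izip is a lazy zip: it yields pairs one at a time instead of building a list.
# Like zip, it stops at the shortest input, so the last item of a_list is dropped.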
for a, b in itertools.izip(a_list, open('some_file.txt', 'r')):
    print a, b
hey there are some lines

[1, 2, 3] in this text file

2 here's another one

10 and another

In [15]:
# izip_longest runs until the longest input is exhausted, padding with None
for a, b in itertools.izip_longest(a_list, open('some_file.txt', 'r')):
    print a, b
hey there are some lines

[1, 2, 3] in this text file

2 here's another one

10 and another

{} None
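
The None that pads the shorter input in the output above can be replaced using the fillvalue keyword. A sketch, assuming a list of strings (lines, a stand-in introduced here) in place of the file. The try/except import is only there because the function is named zip_longest in Python 3.

```python
try:
    from itertools import izip_longest  # Python 2 name
except ImportError:
    from itertools import zip_longest as izip_longest  # Python 3 name

a_list = ['hey', [1, 2, 3], 2, 10, {}]
# Stand-in for the four lines of some_file.txt
lines = ['there are some lines\n', 'in this text file\n',
         "here's another one\n", 'and another\n']

# Pad the shorter input with an empty string instead of None
padded = list(izip_longest(a_list, lines, fillvalue=''))
print(len(padded))  # 5
print(padded[-1])   # ({}, '')
```
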
In [16]:
short_list = ['dang', 'cool', 'nice']
# Get all length-n permutations of items from the list: order matters, so (a, b) and (b, a) both appear
for item_a, item_b in itertools.permutations(short_list, 2):
    print item_a, item_b
dang cool
dang nice
cool dang
cool nice
nice dang
nice cool
In [17]:
# Get all length-n combinations of items from the list: order doesn't matter, so each pair appears once
for item_a, item_b in itertools.combinations(short_list, 2):
    print item_a, item_b
dang cool
dang nice
cool nice
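
One way to see how the two relate: if you ignore the order inside each pair, the six permutations collapse onto the three combinations. A small sketch of that check, using frozenset to throw away the ordering (the variable names are just for illustration):

```python
import itertools

short_list = ['dang', 'cool', 'nice']
perms = list(itertools.permutations(short_list, 2))
combs = list(itertools.combinations(short_list, 2))
print(len(perms))  # 6
print(len(combs))  # 3

# Ignoring order collapses the permutations onto the combinations
as_sets = set(frozenset(pair) for pair in perms)
print(as_sets == set(frozenset(pair) for pair in combs))  # True
```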

List comprehension

A truly crucial tool: build a list with an inline for loop

In [18]:
# Basic example: Read in all lines
print [line for line in open('some_file.txt', 'r')]
['there are some lines\n', 'in this text file\n', "here's another one\n", 'and another\n']
In [19]:
# We can apply functions to each item too
print [line.strip() for line in open('some_file.txt', 'r')]
['there are some lines', 'in this text file', "here's another one", 'and another']
In [21]:
# Also can use conditional statements
print [line.strip() for line in open('some_file.txt', 'r') if 'another' not in line]
['there are some lines', 'in this text file']
In [22]:
# Also can do nested loops in a list comprehension! The for clauses read left to right, outermost first.
print [line.strip() for f in ['some_file.txt', 'some_other_file.txt'] for line in open(f, 'r')]
['there are some lines', 'in this text file', "here's another one", 'and another', 'these are lines', 'from a different text file', 'a second one, specifically']
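
A comprehension is shorthand for a loop that appends to a list. The sketch below unrolls the last example into explicit loops to show the order in which the for clauses run. It assumes a dict of in-memory lines (files, a stand-in introduced here, with shortened contents) instead of the two text files.

```python
# Stand-in for the two text files: file name -> list of lines
files = {
    'some_file.txt': ['there are some lines\n', 'in this text file\n'],
    'some_other_file.txt': ['these are lines\n'],
}
names = ['some_file.txt', 'some_other_file.txt']

# Comprehension: the leftmost for clause is the outer loop
flat = [line.strip() for f in names for line in files[f]]

# The same thing written out as nested loops
unrolled = []
for f in names:
    for line in files[f]:
        unrolled.append(line.strip())

print(flat == unrolled)  # True
print(flat)  # ['there are some lines', 'in this text file', 'these are lines']
```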