# Chapter 1 - Pythonic Thinking

Book by Brett Slatkin. Summary notes by Tyler Banks.

## Item 1: Know Which Version of Python You’re Using

In [57]:
import sys
print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]
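
As a minimal sketch of how the version information above can be used at runtime, the check below compares `sys.version_info` against a minimum version. The `MIN_VERSION` constant and the `running_supported_python` helper are hypothetical names chosen for illustration; they are not from the book.

```python
import sys

# Hypothetical minimum version for illustration; adjust to your project.
MIN_VERSION = (3, 6)


def running_supported_python(version_info=sys.version_info):
    """Return True when the given version is at least MIN_VERSION."""
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= MIN_VERSION


if not running_supported_python():
    raise RuntimeError('Python %d.%d or newer is required' % MIN_VERSION)
```

Comparing tuples avoids parsing the human-readable `sys.version` string.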


## Item 2: Follow the PEP 8 Style Guide

http://www.python.org/dev/peps/pep-0008/

Whitespace

• Use 4 spaces per indentation level (spaces, not tabs)
• Lines should be 79 characters or fewer
• Continuation lines of long expressions should be indented by 4 extra spaces
• Functions and classes should be separated by two blank lines
• In a class, methods should be separated by one blank line
• Don't put spaces around list indexes, function calls, or keyword argument assignments
• Put one space before and after a variable assignment

Naming

• Functions, variables and attributes should be in lowercase_underscore format
• Protected instance attributes should be in _leading_underscore format.
• Private instance attributes should be in __double_leading_underscore format
• Classes and exceptions should be in CapitalizedWord format
• Module constants should be in ALL_CAPS format
• Instance methods in a class should use self as the name of the first parameter, referring to the object
• Class methods should use cls as the name of the first parameter, referring to the class

Expressions and Statements

• Use inline negation (if a is not b) instead of negating a positive expression (if not a is b)
• Don't check for empty values by checking the length (if len(alist) == 0). Use if not alist, which relies on empty values evaluating to False
• Avoid single-line if statements, for and while loops, and except compound statements. Spread them over multiple lines for clarity
• Always put import statements at the top of a file
• Always use absolute names for modules when importing, not names relative to the current module's path. For example, to import the foo module from the bar package, use from bar import foo, not just import foo
• Imports should be grouped in sections in the following order: standard library modules, third-party modules, your own modules. Within each section, list imports in alphabetical order

See Pylint (http://www.pylint.org/), a static analyzer that checks your source code against the PEP 8 style guide and flags many common errors.
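
The short module below is a sketch that applies several of the conventions listed in this item at once. The `ShoppingCart` class, the `describe` function, and the `TOTAL_FORMAT` constant are hypothetical names invented for illustration.

```python
import math  # Standard library imports come first, in alphabetical order.

TOTAL_FORMAT = 'total: %.2f'  # Module constant in ALL_CAPS.


class ShoppingCart:
    """Hypothetical class, used only to illustrate naming and spacing."""

    def __init__(self, prices=None):
        self._prices = list(prices or [])  # Protected attribute.

    def total(self):
        return math.fsum(self._prices)

    def is_empty(self):
        # Rely on the implicit falsiness of an empty list.
        return not self._prices


def describe(cart, fallback=None):
    # Inline negation: `is not` rather than `not ... is`.
    if fallback is not None and cart.is_empty():
        return fallback
    return TOTAL_FORMAT % cart.total()
```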

## Item 3: Know the Differences Between bytes, str, and unicode

General Python 3

• There are two types that represent sequences of characters: bytes and str
• bytes contains raw 8-bit values; str contains Unicode characters
In [58]:
#Convert between str and bytes using encode and decode
string = "this is text"
print(string)

bytes_ = string.encode('utf-8')
print("{}".format(bytes_))

string1 = bytes_.decode('utf-8')
print(string1)

print(bytes_ == string)
print(string == string1)

this is text
b'this is text'
this is text
False
True

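• bytes and str instances never compare equal, even when they represent the same characters
• Files returned by open default to text mode, which expects str and applies a text encoding (often UTF-8, though the default depends on the platform), not binary mode
• Use 'wb' (or 'rb') to write (or read) binary data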
In [59]:
#import os
#with open('/tmp/random.bin', 'wb') as f:
#    f.write(os.urandom(10))

• bytes contains sequences of 8-bit values; str contains sequences of Unicode characters. The two can't be used together with operators like > or +
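
A minimal sketch of the point above: mixing bytes and str with + raises TypeError, and small helper functions can normalize input to one type at the program's boundaries. The helper names `to_str` and `to_bytes` and the choice of UTF-8 are assumptions made for illustration.

```python
def to_str(bytes_or_str):
    """Return a str, decoding UTF-8 bytes when necessary."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode('utf-8')
    return bytes_or_str


def to_bytes(bytes_or_str):
    """Return bytes, encoding str as UTF-8 when necessary."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode('utf-8')
    return bytes_or_str


# Mixing the two types directly fails.
try:
    b'one' + 'two'
except TypeError as error:
    mixing_error = error
else:
    mixing_error = None

# Normalizing first makes the operation well defined.
combined = to_str(b'one') + to_str('two')
```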

## Item 4: Write Helper Functions Instead of Complex Expressions

• Don't cram too much logic into a single-line expression
• Move complex expressions into helper functions, especially when the same logic is repeated
• An if/else expression is more readable than using Boolean operators like or and and in expressions
In [60]:
#Example:
my_values = {'red':[9,8,7]}
print(my_values.get('red', [''])[0] or 0)
print(my_values.get('blue', [''])[0] or 0)

9
0

• The preceding expression reads: look up 'red' in my_values (falling back to [''] if the key is missing), take the first element ([0]), and if that element is falsy, use 0 instead
• Do something like this instead
In [61]:
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found

# Call the helper instead of repeating the complex expression
print(get_first_int(my_values, 'blue'))

0


## Item 5: Know How to Slice Sequences

• Slicing is built in to list, str, and bytes
• Slicing can be extended to any class that implements the __getitem__ and __setitem__ special methods (see Item 28 on inheriting from collections.abc)
• The basic form is alist[start:end], where start is inclusive and end is exclusive
In [62]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('First four:', a[:4])
print('Last four: ', a[-4:])
print('Middle two:', a[3:-3])

First four: ['a', 'b', 'c', 'd']
Last four:  ['e', 'f', 'g', 'h']
Middle two: ['d', 'e']

• Writing alist[0:len(alist)] is redundant; leave out the 0 start index and the len(alist) end index and write alist[:]
• Slicing a list produces a whole new list, so modifying the result won't affect the original list
In [63]:
b = a[:]
b[0:2] = (1,2)
b[2:4] = ['z','y']
print(b)
print(a)

[1, 2, 'z', 'y', 'e', 'f', 'g', 'h']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

In [64]:
b = a[:]
assert b == a and b is not a

• Slicing is forgiving of start and end indexes that are out of bounds, making it easy to express slices at the front or back of a list (such as a[:20] or a[-20:])
• Assigning to a list slice replaces that range in the original list, even if the lengths are different
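
The two points above can be sketched as follows. The list contents and the replacement values are arbitrary and chosen only for illustration.

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Out-of-bounds slice indexes are silently clamped to the list's limits.
first_twenty = a[:20]
last_twenty = a[-20:]

# Indexing a single out-of-bounds position, by contrast, raises IndexError.
try:
    a[20]
except IndexError:
    direct_index_failed = True
else:
    direct_index_failed = False

# Slice assignment replaces the range even when the lengths differ:
# five items ('c' through 'g') are replaced by three, so the list shrinks.
b = a[:]
b[2:7] = [99, 22, 14]
```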

## Item 6: Avoid Using start, end, and stride in a Single Slice

• Specifying start, end, and stride together in one slice can be confusing
• Prefer positive stride values in slices without start or end indexes, and avoid negative stride values if possible
• Avoid using start, end, and stride together in a single slice
• If you need all three, consider two assignments (one to slice, another to stride) or use islice from itertools
In [65]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print(a)
b = a[0:6:2]
print(b)
#Good
c = a[0:6]
d = c[::2]
print(d)
assert b == d

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
['a', 'c', 'e']
['a', 'c', 'e']
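
As a sketch of the `islice` alternative mentioned in the list above, the snippet below produces the same three items as the combined slice. Note that `islice` does not accept negative values for start, stop, or step.

```python
from itertools import islice

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# islice(iterable, start, stop, step) yields items lazily,
# so wrap it in list() to materialize the result.
every_other = list(islice(a, 0, 6, 2))

# A bare negative stride is the one common, readable case: reversing.
reversed_copy = a[::-1]
```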


## Item 7: Use List Comprehensions Instead of map and filter

• A list comprehension is a compact expression for deriving one list from another
• List comprehensions are clearer than map and filter because they don't require extra lambda expressions
• Ex: You want to compute the square of each number in a list
In [66]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

• List comprehension is easier to use and allows for filtering
In [67]:
even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

#Bad, confusing use of map and filter
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
assert even_squares == list(alt)

[4, 16, 36, 64, 100]

• Dictionaries and sets have their own equivalents: dict comprehensions and set comprehensions
In [68]:
chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
rank_dict = {rank: name for name, rank in chile_ranks.items()}
chile_len_set = {len(name) for name in rank_dict.values()}
print(rank_dict)
print(chile_len_set)

{1: 'ghost', 2: 'habanero', 3: 'cayenne'}
{8, 5, 7}


## Item 8: Avoid More Than Two Expressions in List Comprehensions

• List comprehensions support multiple levels of loops and multiple conditions per loop level
• For readability, don't use more than two expressions (loops or conditions) in a single comprehension
• Ex: flatten a matrix
In [69]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

• Squaring each cell while keeping the two-dimensional layout
In [70]:
squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]

In [71]:
# Additional Examples
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [x for x in a if x > 4 if x % 2 == 0]
c = [x for x in a if x > 4 and x % 2 == 0]
print(a)
print(b)
print(c)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[6, 8, 10]
[6, 8, 10]

In [72]:
# Bad
# my_lists = [
#     [[1, 2, 3], [4, 5, 6]],
#     …
# ]
# flat = [x for sublist1 in my_lists
#         for sublist2 in sublist1
#         for x in sublist2]
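
A clearer alternative to the three-level comprehension above is to write out normal loops. The sketch below uses hypothetical sample data for `my_lists`, since the original values are elided.

```python
# Hypothetical sample data for illustration only.
my_lists = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

# Plain nested loops make the nesting depth visible through indentation.
flat = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat.extend(sublist2)
```

The loop version takes a few more lines, but the indentation shows at a glance which level is being iterated.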


## Item 9: Consider Generator Expressions for Large Comprehensions

• A list comprehension builds a whole new list with one item per input value. That is fine for small inputs, but large inputs can consume enough memory to crash your program
• Example: reading a file and returning the number of characters on each line
In [73]:
value = [len(x) for x in open('data/i9_file.txt')]
print(value)

[21, 6, 15, 16, 20]

• Generator expressions don't materialize the whole output sequence when they run. Instead, they evaluate to an iterator that yields one item at a time as each is requested
• A generator expression is created by putting list-comprehension-like syntax between () characters
In [74]:
it = (len(x) for x in open('data/i9_file.txt'))
print(it)

<generator object <genexpr> at 0x0000017B4729FD00>

In [75]:
print(next(it))

21

In [76]:
roots = ((x,x**0.5) for x in it)
print(next(roots))

(6, 2.449489742783178)

• Chaining generators like this executes quickly in Python
• For large streams of input, generators are the best tool
• The iterators returned by generator expressions are stateful, so be careful to consume them only once
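
The statefulness warning above can be sketched with a self-contained example. The `io.StringIO` object is a hypothetical stand-in for a file on disk, and its contents are made up for illustration.

```python
import io

# Hypothetical stand-in for a file on disk, so the example is self-contained.
fake_file = io.StringIO('first line\nsecond\nthird one\n')

# Two chained generator expressions; nothing is read until next() is called.
lengths = (len(line) for line in fake_file)
roots = ((n, n ** 0.5) for n in lengths)

first = next(roots)         # Pulls exactly one line through both generators.
remaining = list(roots)     # Consumes everything that is left.
second_pass = list(roots)   # The iterator is now exhausted.
```

Once `roots` is exhausted, iterating it again produces nothing and raises no error, which is why a generator should be consumed only once.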

## Item 10: Prefer enumerate over range

• range is useful for loops that iterate over a set of integers
• It is clumsy for iterating over a list when you also need each item's index; enumerate handles that case
In [77]:
#from random import randint
#random_bits = 0
#for i in range(64):
#    if randint(0, 1):
#        random_bits |= 1 << i

In [78]:
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print('%s is delicious' % flavor)

vanilla is delicious
chocolate is delicious
pecan is delicious
strawberry is delicious

In [79]:
# Clumsy
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry

In [80]:
# Much better
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry

• You can even specify the number at which enumerate starts! Notice the second enumerate argument
In [81]:
for i, flavor in enumerate(flavor_list, 1):
    print('%d: %s' % (i, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


## Item 11: Use zip to Process Iterators in Parallel

In [82]:
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

In [83]:
# Start code
longest_name = None
max_letters = 0
for i in range(len(names)):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count
print(longest_name)

Cecilia

In [84]:
# Better
for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia

In [85]:
# Best
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia

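• zip stops as soon as the shortest iterator is exhausted, so it silently truncates its output when the inputs differ in length. Be careful
• In Python 3, zip is a lazy generator that produces tuples
• Use zip_longest from itertools to iterate over multiple iterators in parallel regardless of their lengths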
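
The truncation behavior described above can be sketched as follows. The extra name and the `fillvalue=0` choice are assumptions made for illustration.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']  # One more name than counts.
letters = [7, 4, 5]

# zip stops at the shortest input, silently dropping 'Rosalind'.
truncated = list(zip(names, letters))

# zip_longest keeps going and pads the shorter input with fillvalue.
padded = list(zip_longest(names, letters, fillvalue=0))
```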

## Item 12: Avoid else Blocks After for and while Loops

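• Python allows an else block immediately after a for or while loop's body
• The else block runs only if the loop body did not encounter a break statement; it also runs when the loop never executes at all, as with an empty sequence
• This behavior is unintuitive and confusing, so avoid else blocks after loops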
In [86]:
for x in []:
    print('Never runs')
else:
    print('For Else block!')

For Else block!
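
One way to avoid a loop's else block is to move the loop into a helper function that returns early. The sketch below shows both styles side by side; the function names are hypothetical and the coprime check is only an illustrative task.

```python
def coprime(a, b):
    """Return True when a and b share no divisor greater than 1."""
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False  # Early return replaces break + else.
    return True


def coprime_with_else(a, b):
    # Same check written with for/else, shown only for comparison.
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            result = False
            break
    else:
        # Runs only when the loop finished without hitting break.
        result = True
    return result
```

Both functions return the same results, but the early-return version does not require the reader to remember when a loop's else block runs.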


## Item 13: Take Advantage of Each Block in try/except/else/finally

• try/finally lets you run cleanup code regardless of whether exceptions were raised in the try block
• else helps minimize the amount of code in the try block and visually distinguishes the success case from the try/except blocks
• else can be used to perform additional actions after a successful try block but before the common cleanup in finally
In [87]:
import json

UNDEFINED = object()


def divide_json(path):
    handle = open(path, 'r+')  # May raise IOError
    try:
        data = handle.read()  # May raise UnicodeDecodeError
        op = json.loads(data)  # May raise ValueError
        value = (
            op['numerator'] /
            op['denominator'])  # May raise ZeroDivisionError
    except ZeroDivisionError as e:
        return UNDEFINED
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        handle.write(result)  # May raise IOError
        return value
    finally:
        handle.close()  # Always runs
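
To make the order of the blocks in divide_json easier to see, here is a small sketch that records which blocks run on the success path and on the failure path. The `trace_blocks` helper is hypothetical and exists only for illustration.

```python
def trace_blocks(numerator, denominator):
    """Record which blocks run; a hypothetical helper for illustration."""
    steps = []
    try:
        steps.append('try')
        value = numerator / denominator
    except ZeroDivisionError:
        steps.append('except')
        value = None
    else:
        # Runs only when the try block raised nothing.
        steps.append('else')
    finally:
        # Runs on both paths, after except or else.
        steps.append('finally')
    return value, steps
```

On success the order is try, else, finally; when the division fails it is try, except, finally.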
