Effective Python - 59 Specific Ways to Write Better Python.¶

Chapter 1 - Pythonic-Thinking¶

Book by Brett Slatkin. Summary notes by Tyler Banks.

Item 1: Know Which Version of Python You’re Using¶

In [57]:

import sys
print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=4, releaselevel='final', serial=0)
3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]

Item 2: Follow the PEP 8 Style Guide¶

http://www.python.org/dev/peps/pep-0008/

Whitepaces

Use 4 whitespaces
Lines should be 79 characters or less
Continuation of long expressions should be intented by 4 extra spaces
Functions and classes shoulde be separated by two blank lines
In a class methods should be spearated by one blank
Don't put spaces around list indexes, function calls, args or assignments
Put one space before and after variable assignment

Naming

Functions, variables and attributes should be in lowercase_underscore format
Protected instance attributes should be in _leading_underscore format.
Private instance attributes should be in __double_leading_underscore format
Classes and exceptions should be in CapitalizedWord format
Module constants should be in ALL_CAPS format
Instance methods in class should use self as the name of the first parameter, refering to the object
Class methods should use cls as the name of the first parameter, refering to the class

Experessions and Statemens

Use inline negation (if a is not b) instead of negation positive statments (if not a is b)
Don't check for empyt values by checking length (if len(alist) == 0). Use if not alist
Avoid single line if statements, for, and while loops, and except statements. Spread over a series of lines.
Always put import statements at the top of a file
Always put absolute names for modules, not relative paths. (from bar import foo) not import foo
Imports should be in sections in the following order: Standard library modules, Third party modules, Your own modules. Subsections should be in alphabetical order

See Pylint (http://www.pylint.org/) to analyze your source code and automatically fix it up!

Item 3: Know the Differences Between `bytes`, `str`, and `unicode`¶

General Python 3

There are two types that represent sequences of characters: bytes and str
bytes contain raw 8-bit values, str contains unicode

In [58]:

#Convert between str and bytes using encode and decode
string = "this is text"
print(string)

bytes_ = string.encode('utf-8')
print("{}".format(bytes_))

string1 = bytes_.decode('utf-8')
print(string1)

print(bytes == string)
print(string == string1)

this is text
b'this is text'
this is text
False
True

bytes and str are never equivilent
Files opened will default to UTF-8 encoding not binary
Use 'wb' to open binary files

In [59]:

#with open('/tmp/random.bin', 'wb') as f:
#    f.write(os.urandom(10))

bytes contain sequences of 8 bit values. str contains unicode. They can't be used together with operators like > or +

Item 4: Write Helper Functions Instead of Complex Expressions¶

Don't overcomplicate one line statements
Move complex expressions to helper functions, especially for repeated code
if/else is more readable than or/and

In [60]:

#Example:
my_values = {'red':[9,8,7]}
print(my_values.get('red', [''])[0] or 0)
print(my_values.get('blue', [''])[0] or 0)

9
0

The preceeding reads: from my_values, if 'red' exists (otherwise return '') get the first value ([0]) if it exists, otherwise return 0
Do something like this instead

In [61]:

def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found

print(my_values.get('blue', [''])[0] or 0)

Item 5: Know How to Slice Sequences¶

Slicing is built in to list,str, and bytes
Slicing can be extended to any class that implements __getitem__ and __setitem__ methods. (Inherticance from collections.abc -- Item 28)
Basic form is alist[start,end] and start is inclusive and end is exclusive.

In [62]:

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('First four:', a[:4])
print('Last four: ', a[-4:])
print('Middle two:', a[3:-3])

First four: ['a', 'b', 'c', 'd']
Last four:  ['e', 'f', 'g', 'h']
Middle two: ['d', 'e']

Using alist[0:len(alist)] is redundant
Slicing a list will result in a whole new list and modifying the result won't affect the original list

In [63]:

b = a[:]
b[0:2] = (1,2)
b[2:4] = ['z','y']
print(b)
print(a)

[1, 2, 'z', 'y', 'e', 'f', 'g', 'h']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

In [64]:

b = a[:]
assert b == a and b is not a

Slicing is forgiving of start and end indexes that are out of bounds making it easy to express slices in the front or back of the list
Assigning a list slice will replace the range even if their sizes are different

Item 6: Avoid Using `start`,`end`, and `stride` in a single slice¶

Using start, end, and stride in a slice can be confusing
Prefer using positive stride in slices without start or end indexes and avoid using negative stride if possible
Avoid using start,end, and stride in a single slice
Consider doing two assignments (one to slice, another to stride) or use isslice from itertools

In [65]:

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print(a)
#Bad
b = a[0:6:2]
print(b)
#Good
c = a[0:6]
d = c[::2]
print(d)
assert b == d

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
['a', 'c', 'e']
['a', 'c', 'e']

Item 7: Use List Comprehensions Instead of `map` and `filter`¶

List comprehension -- deriving one list from another
Lists are easier to use than map and filter because they don't require lambda functions.
Ex: You want to compute the square of each number in a list

In [66]:

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

List comprehension is easier to use and allows for filtering

In [67]:

even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

#Bad, confusing use of map and filter
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
assert even_squares == list(alt)

[4, 16, 36, 64, 100]

Dictionaries and sets have their own equivilents.

In [68]:

chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
rank_dict = {rank: name for name, rank in chile_ranks.items()}
chile_len_set = {len(name) for name in rank_dict.values()}
print(rank_dict)
print(chile_len_set)

{1: 'ghost', 2: 'habanero', 3: 'cayenne'}
{8, 5, 7}

Item 8: Avoid More Than Two Expressions in List Comprehensions¶

List comprehension allows for more than one loop level
Don't use more than two for readability
Ex: flatten a matrix

In [69]:

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Squaring each

In [70]:

squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]

In [71]:

# Additional Examples
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [x for x in a if x > 4 if x % 2 == 0]
c = [x for x in a if x > 4 and x % 2 == 0]
print(a)
print(b)
print(c)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[6, 8, 10]
[6, 8, 10]

In [72]:

# Bad
# my_lists = [
# [[1, 2, 3], [4, 5, 6]],
# …
# ] flat =
# [
# x
# for sublist1 in my_lists
# for sublist2 in sublist1
# for x in sublist2]

Item 9: Consider Generator Expressions for Large Comprehensions¶

List comprehension works well for small lists but large inputs could crash your program due to memory use
Example: reading a file and returning the number of characters on each line

In [73]:

value = [len(x) for x in open('data/i9_file.txt')]
print(value)

[21, 6, 15, 16, 20]

Generator expressions don't materialize the whole input sequence when run, it uses an iterator to yeild values as they're called
Generators are created by puting list-comprehension in between () characters

In [74]:

it = (len(x) for x in open('data/i9_file.txt'))
print(it)

<generator object <genexpr> at 0x0000017B4729FD00>

In [75]:

print(next(it))

In [76]:

roots = ((x,x**0.5) for x in it)
print(next(roots))

(6, 2.449489742783178)

Chaining generators like this runs quickly in Python.
Useful for large stream of input generators are the best tool
Iterators are stateful and you need to be careful to only read once

Item 10: Prefer `enumerate` over `range`¶

range is useful for loops over a set of integers
Not so much for lists

In [77]:

#random_bits = 0
#for i in range(64):
#    if randint(0, 1):
#        random_bits |= 1 << i

In [78]:

flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print('%s is delicious' % flavor)

vanilla is delicious
chocolate is delicious
pecan is delicious
strawberry is delicious

In [79]:

#Clumsy
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry

In [80]:

# Much better 
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry

You can even specify the number at which enumerate starts! Notice the second enumerate argument

In [81]:

for i, flavor in enumerate(flavor_list, 1):
    print('%d: %s' % (i, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry

Item 11: Use `zip` to Process Iterators in Parallel¶

In [82]:

names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

In [83]:

# Start code
longest_name = None
max_letters = 0
for i in range(len(names)):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count
print(longest_name)

Cecilia

In [84]:

# Better
for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia

In [85]:

# Best
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia

Zip stops when the first iterator is exhausted, be careful
Zip is a lazy generator producing a tupple
Use zip_longest from itertools to iterate over multiple iterators regardless of length

Item 12: Avoid else Blocks After for and while Loops¶

Python loops allow for else blocks after loops (while and for)
else only runs if the loop body did not encounter a break statement
Confusing, don't use

In [86]:

for x in []:
    print('Never runs')
else:
    print('For Else block!')

For Else block!

Item 13: Take Advantage of Each Block in `try`/`except`/`else`/`finally`¶

try/finally allows for you to run cleanup code regardless of exceptions raised in try block
else helps minimize the amout of code in try and distinguishes success case from try/except block
else can be used to perform additional actions after successful try block but before cleanup in finally

In [87]:

UNDEFINED = object()
def divide_json(path):
    handle = open(path, 'r+') # May raise IOError
    try:
        data = handle.read() # May raise UnicodeDecodeError
        op = json.loads(data) # May raise ValueError
        value = (
        op['numerator'] /
        op['denominator']) # May raise ZeroDivisionError
    except ZeroDivisionError as e:
        return UNDEFINED
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        handle.write(result) # May raise IOError
        return value
    finally:
        handle.close() # Always runs

In [ ]: