Effective Python - 59 Specific Ways to Write Better Python.

Chapter 2 - Functions

Book by Brett Slatkin. Summary notes by Tyler Banks.

Item 14: Prefer Exceptions to Returning None

  • Instead of returning None to signal a special case, raise an exception from the helper function and handle it in the calling function
  • Functions that return None with a special meaning are error prone because None and other values such as 0 or the empty string all evaluate to False in a condition
In [65]:
#Bad
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
    
x, y = 0, 5
result = divide(x, y)
if not result:  # Wrong: 0 / 5 is a valid result (0.0) but is also falsy
    print('Invalid inputs')
In [66]:
#Better
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError as e:
        raise ValueError('Invalid inputs') from e
        
x, y = 5, 2
try:
    result = divide(x, y)
except ValueError:
    print('Invalid inputs')
else:
    print('The result is %.1f' % result)
The result is 2.5

Item 15: Know How Closures Interact with Variable Scope

  • Python supports closures: functions that refer to variables from the scope in which they were defined. This is why the helper function below can access the group argument of sort_priority.
  • Functions are first-class objects. You can refer to them directly, assign them to variables, pass them as arguments to other functions, and compare them in expressions and if statements. This is how the sort method below can accept the helper function as its key argument.
In [67]:
def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)
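
A short usage sketch for sort_priority. The sample numbers and group are illustrative values chosen here, not part of the notes above.

```python
def sort_priority(values, group):
    def helper(x):
        # helper closes over `group` from the enclosing function's scope
        if x in group:
            return (0, x)
        return (1, x)
    # Tuples compare element by element, so group members sort first
    values.sort(key=helper)

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sort_priority(numbers, group)
print(numbers)  # [2, 3, 5, 7, 1, 4, 6, 8]
```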

When referencing a variable, the interpreter searches the scopes in the following order:

  1. Current function scope
  2. Any enclosing scopes (containing functions, such as sort_priority above)
  3. Scope of the module that contains the code (global scope)
  4. The built-in scope (that contains functions like len and str)

If none of these scopes defines the variable, a NameError exception is raised.
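
The lookup order can be sketched with one name bound at several levels. All names below (limit, outer, uses_missing, and so on) are hypothetical and exist only for this illustration.

```python
limit = 'module'  # module (global) scope

def outer():
    limit = 'enclosing'  # enclosing scope for the two inner functions

    def inner_local():
        limit = 'local'  # found first: the current function scope
        return limit

    def inner_enclosing():
        return limit  # no local binding, so the enclosing scope is used

    return inner_local(), inner_enclosing()

def uses_module():
    return limit  # no local or enclosing binding: module scope

def uses_builtin():
    return len('abc')  # `len` is only found in the built-in scope

def uses_missing():
    return undefined_name  # bound nowhere: NameError when called

print(outer())         # ('local', 'enclosing')
print(uses_module())   # module
print(uses_builtin())  # 3
try:
    uses_missing()
except NameError as e:
    print('NameError:', e)
```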

  • Assignment works differently: if the variable doesn't exist in the current scope, Python treats the assignment as a new variable definition in that scope
  • Example:
In [68]:
def sort_priority2(numbers, group):
    found = False  # Scope: 'sort_priority2'
    def helper(x):
        if x in group:
            found = True  # Scope: 'helper' (a new local variable, this is bad!)
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found
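
Running the function shows the symptom: the sort is correct, but the flag never changes. The input values are illustrative.

```python
def sort_priority2(numbers, group):
    found = False
    def helper(x):
        if x in group:
            found = True  # binds a new local in helper, not the outer found
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
found = sort_priority2(numbers, {2, 3, 5, 7})
print(found)    # False, even though group members were seen
print(numbers)  # [2, 3, 5, 7, 1, 4, 6, 8]
```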

To fix this, use nonlocal

  • NOTE: nonlocal does not traverse up to the module-level scope (that is what the global statement is for)
In [69]:
def sort_priority3(numbers, group):
    found = False
    def helper(x):
        nonlocal found
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found
  • The author cautions against using nonlocal for anything beyond simple functions like this one
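
When the shared state grows beyond a single flag, one alternative is to hold it in a small helper class whose instances are callable. This is a minimal sketch; the class name Sorter and the sample data are assumptions for illustration.

```python
class Sorter:
    """Callable sort key that remembers whether a group member was seen."""

    def __init__(self, group):
        self.group = group
        self.found = False

    def __call__(self, x):
        if x in self.group:
            self.found = True  # plain attribute assignment, no nonlocal needed
            return (0, x)
        return (1, x)

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
sorter = Sorter({2, 3, 5, 7})
numbers.sort(key=sorter)
print(sorter.found)  # True
print(numbers)       # [2, 3, 5, 7, 1, 4, 6, 8]
```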

Item 16: Consider Generators Instead of Returning Lists

Consider the following code, which returns a list of the indexes at which each word starts:

In [70]:
def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result

words = 'This is some text to test'
result = index_words(words)
print(result[:])
[0, 5, 8, 13, 18, 21]

Problems:

  • The code is noisy and dense
  • Every new result needs a call to append, and the whole list must be built in memory before anything is returned

Resolution:

  • Use a generator
  • Generators are functions that use yield expressions
  • Calling a generator function does not run its body; it immediately returns an iterator
  • With each call to the built-in next function, the iterator advances the generator to its next yield expression

Ex:

In [71]:
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

result = list(index_words_iter(words))
print(result)
[0, 5, 8, 13, 18, 21]
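
To see the step-by-step behavior, the iterator can be advanced by hand with next instead of being drained by list. This is a small sketch using a shorter, illustrative input string.

```python
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

it = index_words_iter('This is some text')
first = next(it)   # runs the body up to the first yield
second = next(it)  # resumes after that yield and runs to the next one
rest = list(it)    # drains whatever is left
print(first, second, rest)  # 0 5 [8, 13]

try:
    next(it)  # nothing left to produce
except StopIteration:
    print('exhausted')
```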

Item 17: Be Defensive When Iterating Over Arguments

  • Beware of functions that iterate over inputs multiple times. If the arguments are iterators you may see strange behavior
  • Consider the following:
In [72]:
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
  • This works for lists, but if the input is too big to hold in memory we'll need to supply a generator instead.
  • However, an iterator produces its results only once, and numbers is iterated over twice (by sum and by the for loop). Note: an exhausted iterator raises StopIteration, which sum and for loops treat as a normal end of input, so the second pass silently sees no values.
  • You could copy the input into a list at the beginning (numbers = list(numbers)) and use that, but this again only works for small inputs
  • A better solution is to supply a function that returns a new iterator each time it is called
In [73]:
def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)

def normalize_func(get_iter):
    total = sum(get_iter()) # New iterator
    result = []
    for value in get_iter(): # New iterator
        percent = 100 * value / total
        result.append(percent)
    return result

path = "data/none.txt"
#percentages = normalize_func(lambda: read_visits(path))
  • However, having to pass a lambda like this is clumsy
  • The best solution is to write a class that implements the __iter__ method as a generator
  • Such a class is known as an iterable container
In [74]:
class ReadVisits(object):
    def __init__(self, data_path):
        self.data_path = data_path
    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                yield int(line)
                
def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers): # An iterator - bad!
        raise TypeError('You have to supply a container!')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
  • You can detect that a value is an iterator (instead of a container) by calling iter on it twice: an iterator returns itself both times, while a container returns a new iterator object on each call
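
The three cases can be compared side by side: a plain generator that silently produces nothing, a container that works, and a generator rejected by the defensive check. This is a self-contained sketch; the temporary file, its contents, and the compact list-comprehension form of the functions are assumptions made so the example can run anywhere.

```python
import os
import tempfile

def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)

class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path
    def __iter__(self):
        # A fresh generator (and a fresh file handle) on every iteration
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

def normalize(numbers):
    total = sum(numbers)  # exhausts an iterator argument
    return [100 * value / total for value in numbers]

def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):  # only true for iterators
        raise TypeError('You have to supply a container!')
    total = sum(numbers)
    return [100 * value / total for value in numbers]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'visits.txt')  # hypothetical sample data
    with open(path, 'w') as f:
        f.write('15\n35\n80\n')

    silent_bug = normalize(read_visits(path))
    good = normalize_defensive(ReadVisits(path))
    try:
        normalize_defensive(read_visits(path))
        rejected = False
    except TypeError:
        rejected = True

print(silent_bug)  # []
print(good)        # three percentages that sum to 100
print(rejected)    # True
```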

Item 18: Reduce Visual Noise with Variable Positional Arguments

  • Use *args to accept optional positional arguments; this removes visual noise at the call site
  • Compare the following:
In [75]:
def log(message, values):
    if not values:
        print(message)
    else:
        values_str = ', '.join(str(x) for x in values)
        print('%s: %s' % (message, values_str))
    
log('nums are', [1, 2])
log('yo', [])
nums are: 1, 2
yo
In [76]:
def log(message, *values):
    if not values:
        print(message)
    else:
        values_str = ', '.join(str(x) for x in values)
        print('%s: %s' % (message, values_str))
    
log('nums are', [1, 2])
log('yo') #Nicer
nums are: [1, 2]
yo
  • You can also pass lists as variable arguments using *
In [77]:
favorites = [7, 33, 99]
log('faves', *favorites)
faves: 7, 33, 99

Problems with accepting a variable number of positional arguments:

  • Variable arguments are always turned into a tuple before they are passed to the function. If the caller uses * on a generator, it is iterated to exhaustion and every value is held in memory at once. Limit the use of *args to inputs that are known to be small.
  • You can't add new positional arguments to the function in the future without migrating every caller.
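
Both problems can be sketched briefly. The names my_generator and my_func, and the extended log signature with a leading sequence parameter, are hypothetical and used only for illustration.

```python
def my_generator():
    for i in range(5):
        yield i

def my_func(*args):
    # args is always a tuple, so the generator was fully consumed by the call
    return args

it = my_generator()
args = my_func(*it)
print(args)      # (0, 1, 2, 3, 4)
print(list(it))  # [] because the generator is already exhausted

# Adding a new leading positional parameter breaks old call sites silently
def log(sequence, message, *values):
    values_str = ', '.join(str(x) for x in values)
    return '%s - %s: %s' % (sequence, message, values_str)

new_call = log(1, 'Favorites', 7, 33)
old_call = log('Favorite numbers', 7, 33)  # written for log(message, *values)
print(new_call)  # 1 - Favorites: 7, 33
print(old_call)  # Favorite numbers - 7: 33
```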

Item 19: Provide Optional Behavior with Keyword Arguments

  • All positional arguments to functions can be passed by keyword. Ex:
In [78]:
def remainder(number, divisor):
    return number % divisor

print(remainder(20, 7))
print(remainder(20, divisor=7))
print(remainder(number=20, divisor=7))
print(remainder(divisor=7, number=20))
6
6
6
6
  • Positional arguments MUST be specified before keyword arguments
  • Each argument can only be specified once
In [79]:
remainder(number=20, 7)
  File "<ipython-input-79-9265fd4030d2>", line 1
    remainder(number=20, 7)
                        ^
SyntaxError: positional argument follows keyword argument
In [ ]:
remainder(20, number=7)  # TypeError: remainder() got multiple values for argument 'number'

Keyword args provide several benefits:

  1. Readability: remainder(20, 7) is not as clear as remainder(number=20, divisor=7)
  2. Default values: the function definition can specify default behavior, e.g. def remainder(number, divisor=1):
  3. Backwards compatibility: new keyword arguments with defaults extend a function without requiring existing callers to migrate.
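
A minimal sketch of points 2 and 3. The function flow_rate and its parameters are hypothetical: imagine that period and units_per_kg were added after the first callers had already been written.

```python
def flow_rate(weight_diff, time_diff, period=1, units_per_kg=1):
    # period and units_per_kg have defaults, so older two-argument
    # calls keep working and keep returning the same values
    return ((weight_diff * units_per_kg) / time_diff) * period

per_second = flow_rate(1.0, 4)             # existing caller, unchanged
per_hour = flow_rate(1.0, 4, period=3600)  # opts in to the new behavior
scaled = flow_rate(1.0, 4, period=3600, units_per_kg=2)

print(per_second)  # 0.25
print(per_hour)    # 900.0
print(scaled)      # 1800.0
```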

Item 20: Use None and Docstrings to Specify Dynamic Default Arguments

  • This matters when a keyword argument's default value must be computed at call time (a dynamic default) rather than fixed once
  • Ex:
In [ ]:
from datetime import datetime
from time import sleep
def log(message, when=datetime.now()):
    print('%s: %s' % (when, message))
log('hi')
sleep(0.1)
log('hi')
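  • Both timestamps are the same because default argument values are evaluated only once, when the function is defined (typically at module load), not on each call.
  • Use None as the default value and compute the real value inside the function; document the actual behavior in the docstring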
In [ ]:
def log(message, when=None):
    """Log a message with a timestamp.
    Args:
        message: Message to print.
        when: datetime of when the message occurred.
            Defaults to the present time.
    """
    when = datetime.now() if when is None else when
    print('%s: %s' % (when, message))

log('hi')
sleep(0.1)
log('hi')
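  • The same pitfall applies to mutable defaults such as an empty dictionary: every call that falls back to the default returns a reference to the same object, so a change made through one result shows up in the other.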
  • Ex, with fix:
In [ ]:
import json
#Bad
def decode(data, default={}):
    try:
        return json.loads(data)
    except ValueError:
        return default
    
foo = decode('bad')
foo['a'] = 5
bar = decode('another bad')
bar['b'] = 1
print('Bad results')
print('a: ', foo)
print('b: ', bar)
In [ ]:
#Fixed    
def decode(data, default=None):
    """Load JSON data from a string.
    Args:
        data: JSON data to decode.
        default: Value to return if decoding fails.
            Defaults to an empty dictionary.
    """
    if default is None:
        default = {}
    try:
        return json.loads(data)
    except ValueError:
        return default

foo = decode('bad')
foo['a'] = 5
bar = decode('another bad')
bar['b'] = 1
print('Good results')
print('a: ', foo)
print('b: ', bar)
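
The difference between the two versions can be checked directly with an identity test. This is a sketch only; the names decode_shared and decode_fresh are hypothetical stand-ins for the bad and fixed decode functions above.

```python
import json

def decode_shared(data, default={}):  # the dict is created once, at def time
    try:
        return json.loads(data)
    except ValueError:
        return default

def decode_fresh(data, default=None):
    if default is None:
        default = {}  # a new dict on every call
    try:
        return json.loads(data)
    except ValueError:
        return default

foo = decode_shared('bad')
foo['a'] = 5
bar = decode_shared('another bad')
bar['b'] = 1
print(foo is bar, foo)  # True {'a': 5, 'b': 1}

foo2 = decode_fresh('bad')
foo2['a'] = 5
bar2 = decode_fresh('another bad')
bar2['b'] = 1
print(foo2 is bar2, foo2, bar2)  # False {'a': 5} {'b': 1}
```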

Item 21: Enforce Clarity with Keyword-Only Arguments

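  • Keyword arguments make the intention of a function call clear
  • Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions
  • Use the * symbol in the argument list to mark the end of positional arguments and the start of keyword-only arguments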
  • See the difference:
In [ ]:
def safe_division_b(number, divisor,
    ignore_overflow=False,
    ignore_zero_division=False):
    return 0

def safe_division_c(number, divisor, *, #Note the star
    ignore_overflow=False,
    ignore_zero_division=False):
    return 0
In [80]:
print(safe_division_b(1,2,True,False))
0
In [81]:
print(safe_division_c(1,2,True,False))
TypeError: safe_division_c() takes 2 positional arguments but 4 were given
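
A runnable sketch of the keyword-only version. The function body here is an assumption filled in for illustration (the definitions above simply return 0 as a placeholder); only the signature is taken from the notes.

```python
def safe_division_c(number, divisor, *,  # everything after * is keyword-only
                    ignore_overflow=False,
                    ignore_zero_division=False):
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise

# The flags must be named, which makes the call self-documenting
print(safe_division_c(1, 0, ignore_zero_division=True))  # inf

# Passing the flags positionally is rejected
try:
    safe_division_c(1, 0, False, True)
except TypeError as e:
    print('TypeError:', e)
```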