What Does It Take to Be An Expert At Python

Notebook based on James Powell's talk at PyData 2017

https://www.youtube.com/watch?v=7lmCu8wz8ro

If you want to become an expert in Python, you should definitely watch this PyData talk from James Powell.

Video Index

  • metaclasses: 18:50
  • metaclasses (explained): 40:40
  • decorator: 45:20
  • generator: 1:04:30
  • context manager: 1:22:37
  • summary: 1:40:00

Definitions

Python is a language oriented around protocols. For a given behavior, piece of syntax, bytecode, or top-level function, there is a way to tell Python how to implement it on an arbitrary object via underscore methods. The exact correspondence is usually guessable; if you can't guess it, you can look it up by searching for "python data model".

Metaclass: a mechanism that hooks into the class construction process. Questions it answers: are the methods you depend on actually implemented, across the boundary between library code and user code? How do you enforce a constraint?

Decorator: hooks into the idea that everything in Python creates a structure at run time. Lets you wrap sets of functions with before and after behavior.

Generators: take a single computation that would otherwise run eagerly, from the injection of its parameters to the final result, and interleave it with other code by adding yield points. At each yield point you can hand back an intermediate result, or one small piece of the computation, and return control to the caller. Think of a generator as a way to take one long computation and break it up into small parts.

Context managers: a structure that ties two actions together, a setup action and a teardown action, and makes sure they always happen in concert with each other.

In [7]:
# some behavior that I want to implement -> write some __ function __
# top-level function or top-level syntax -> corresponding __
# x + y -> __add__
# init x -> __init__
# repr(x) --> __repr__
# x() -> __call__

class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs
        
    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)
    
    def __add__(self, other):
        return Polynomial(*(x + y for x, y in zip(self.coeffs, other.coeffs)))
    
    def __len__(self):
        return len(self.coeffs)
    
    def __call__(self):
        pass
    

3 Core Patterns to understand object orientation

  • Protocol view of python
  • Built-in inheritance protocol (where to go)
  • Caveats around how object orientation in python works
In [8]:
p1 = Polynomial(1, 2, 3)
p2 = Polynomial(3, 4, 3)
In [9]:
p1 + p2
Out[9]:
Polynomial(*(5, 7, 7))
In [10]:
len(p1)
Out[10]:
3
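The `__call__` method in the class above is left as a stub. Below is a minimal sketch of one way to fill it in. It assumes the coefficients are stored from the highest power down to the constant term (the notebook never fixes the order), so `Polynomial(1, 2, 3)` is read as x**2 + 2*x + 3. The subclass name `CallablePolynomial` is hypothetical and used only for illustration.

```python
class Polynomial:
    def __init__(self, *coeffs):
        self.coeffs = coeffs

    def __repr__(self):
        return 'Polynomial(*{!r})'.format(self.coeffs)


class CallablePolynomial(Polynomial):  # hypothetical name, for illustration only
    def __call__(self, x):
        # Horner's rule; assumes coeffs run from the highest power to the constant term
        result = 0
        for c in self.coeffs:
            result = result * x + c
        return result


p = CallablePolynomial(1, 2, 3)  # x**2 + 2*x + 3 under the assumption above
value = p(2)                     # p(x) dispatches to type(p).__call__(p, x)
print(value)
```

The point is the protocol rather than the arithmetic: once `__call__` exists, the object can be used with ordinary call syntax.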

Metaclasses

In [26]:
# File 1 - library.py

class Base:
    def food(self):
        return 'foo'
In [27]:
# File2 - user.py

assert hasattr(Base, 'foo'), "you broke it, you fool!"

class Derived(Base):
    def bar(self):
        return self.foo()
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-27-08fa6af0fb76> in <module>()
      1 # File2 - user.py
      2 
----> 3 assert hasattr(Base, 'foo'), "you broke it, you fool!"
      4 
      5 class Derived(Base):

AssertionError: you broke it, you fool!
In [41]:
# File 1 - library.py

class Base:
    def foo(self):
        return self.bar()
In [42]:
# File2 - user.py

assert hasattr(Base, 'foo'), "you broke it, you fool!"

class Derived(Base):
    def bar(self):
        return 'bar'
In [45]:
Derived.bar
Out[45]:
<function __main__.Derived.bar>
In [18]:
def _():
    class Base:
        pass

from dis import dis
In [19]:
dis(_) # LOAD_BUILD_CLASS
  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               1 (<code object Base at 0x10f2daed0, file "<ipython-input-18-a194b247271c>", line 2>)
              4 LOAD_CONST               2 ('Base')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               2 ('Base')
             10 CALL_FUNCTION            2
             12 STORE_FAST               0 (Base)
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE
In [165]:
# Catch Building of Classes

class Base:
    def foo(self):
        return self.bar()

old_bc = __build_class__
def my_bc(*a, **kw):
    print('my buildclass ->', a, kw)
    return old_bc(*a, **kw)
import builtins
builtins.__build_class__ = my_bc
In [1]:
# Catch Building of Classes

class Base:
    def foo(self):
        return self.bar()

old_bc = __build_class__
def my_bc(fun, name, base=None, **kw):
    if base is Base:
        print('Check if bar method defined')
    if base is not None:
        return old_bc(fun, name, base, **kw)
    return old_bc(fun, name, **kw)

import builtins
builtins.__build_class__ = my_bc
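Both hooks above call through the global name `old_bc`, which is looked up at call time, so rerunning such cells can leave a hook that ends up calling itself. The following is a sketch, assuming the hook is only needed temporarily: it saves the original under a separate name and restores it in a `finally` block. The names `original_bc`, `logging_bc`, and `seen` are illustrative and not from the talk.

```python
import builtins

original_bc = builtins.__build_class__
seen = []  # names of the classes built while the hook is installed


def logging_bc(func, name, *bases, **kw):
    seen.append(name)
    # Always delegate to the saved original, never to whatever is currently installed
    return original_bc(func, name, *bases, **kw)


builtins.__build_class__ = logging_bc
try:
    class Base:
        def foo(self):
            return self.bar()

    class Derived(Base):
        def bar(self):
            return 'bar'
finally:
    # Undo the patch even if a class body raises
    builtins.__build_class__ = original_bc

print(seen)
```

Restoring the original this way avoids having to reload `builtins` afterwards.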
In [80]:
import builtins
In [79]:
import importlib
importlib.reload(builtins)
Out[79]:
<module 'builtins' (built-in)>
In [2]:
class BaseMeta(type):
    def __new__(cls, name, bases, body):
        print('BaseMeta.__new__', cls, name, bases, body)
        return super().__new__(cls, name, bases, body)

class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()
BaseMeta.__new__ <class '__main__.BaseMeta'> Base () {'__module__': '__main__', '__qualname__': 'Base', 'foo': <function Base.foo at 0x107378048>}
In [167]:
class BaseMeta(type):
    def __new__(cls, name, bases, body):
        if not 'bar' in body:
            raise TypeError('bad user class')
        return super().__new__(cls, name, bases, body)

class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()
my buildclass -> (<function BaseMeta at 0x10749cf28>, 'BaseMeta', <class 'type'>) {}
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-167-f2133308a9ce> in <module>()
----> 1 class BaseMeta(type):
      2     def __new__(cls, name, bases, body):
      3         if not 'bar' in body:
      4             raise TypeError('bad user class')
      5         return super().__new__(cls, name, bases, body)

<ipython-input-165-f7e591caa0e8> in my_bc(*a, **kw)
      8 def my_bc(*a, **kw):
      9     print('my buildclass ->', a, kw)
---> 10     return old_bc(*a, **kw)
     11 import builtins
     12 builtins.__build_class__ = my_bc

<ipython-input-1-8d1486f0ff44> in my_bc(fun, name, base, **kw)
     10                 print('Check if bar method defined')
     11     if base is not None:
---> 12         return old_bc(fun, name, base, **kw)
     13     return old_bc(fun, name, **kw)
     14 

... last 1 frames repeated, from the frame below ...

<ipython-input-1-8d1486f0ff44> in my_bc(fun, name, base, **kw)
     10                 print('Check if bar method defined')
     11     if base is not None:
---> 12         return old_bc(fun, name, base, **kw)
     13     return old_bc(fun, name, **kw)
     14 

RecursionError: maximum recursion depth exceeded
In [5]:
class BaseMeta(type):
    def __new__(cls, name, bases, body):
        if name != 'Base' and 'bar' not in body:
            raise TypeError('bad user class')
        return super().__new__(cls, name, bases, body)

class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()
    
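    def __init_subclass__(*a, **kw):
        print('init_subclass', a, kw)
        # Zero-argument super() needs a named first parameter, so name the class explicitly
        cls, *rest = a
        return super(Base, cls).__init_subclass__(*rest, **kw)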
In [8]:
help(Base.__init_subclass__)
Help on method __init_subclass__ in module __main__:

__init_subclass__(*a, **kw) method of __main__.BaseMeta instance
    This method is called when a class is subclassed.
    
    The default implementation does nothing. It may be
    overridden to extend subclasses.
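To see the constraint fire, here is a small sketch that defines one conforming and one non-conforming subclass, then expresses the same check with `__init_subclass__` instead of a metaclass. The class names `Good`, `Bad`, and `Base2` are made up for this illustration.

```python
class BaseMeta(type):
    def __new__(cls, name, bases, body):
        # Skip the check for Base itself; every other class must define bar
        if name != 'Base' and 'bar' not in body:
            raise TypeError('bad user class')
        return super().__new__(cls, name, bases, body)


class Base(metaclass=BaseMeta):
    def foo(self):
        return self.bar()


class Good(Base):
    def bar(self):
        return 'bar'


try:
    class Bad(Base):  # no bar method, so BaseMeta.__new__ rejects the class
        pass
except TypeError as exc:
    error = str(exc)

print(Good().foo(), '|', error)


# The same constraint without a metaclass (Python 3.6+)
class Base2:
    def foo(self):
        return self.bar()

    def __init_subclass__(cls, **kw):
        super().__init_subclass__(**kw)
        if 'bar' not in vars(cls):
            raise TypeError('bad user class')
```

In both versions the error is raised when the user's class statement runs, not later when `foo` is first called.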

Decorators

In [12]:
# dec.py
In [77]:
def add(x, y=10):
    return x + y
In [14]:
add(10, 20)
Out[14]:
30
In [15]:
add
Out[15]:
<function __main__.add>
In [28]:
# Name of function
add.__name__
Out[28]:
'add'
In [27]:
# What module function is assigned to
add.__module__
Out[27]:
'__main__'
In [26]:
# Default values
add.__defaults__
Out[26]:
(10,)
In [25]:
# Byte code for function
add.__code__.co_code
Out[25]:
b'|\x00|\x01\x17\x00S\x00'
In [24]:
# Variable names function interacts with
add.__code__.co_varnames
Out[24]:
('x', 'y')

What's your source code?

In [31]:
from inspect import getsource
In [32]:
getsource(add)
Out[32]:
'def add(x, y=10):\n    return x + y\n'
In [33]:
print(getsource(add))
def add(x, y=10):
    return x + y

In [35]:
# What file are you in? 
from inspect import getfile
getfile(add)
Out[35]:
'<ipython-input-19-3cec442ba064>'
In [37]:
from inspect import getmodule
getmodule(add)
Out[37]:
<module '__main__'>
In [38]:
print('add(10)', add(10))
print('add(20, 30)', add(20, 30))
print('add("a", "b")', add("a", "b"))
add(10) 20
add(20, 30) 50
add("a", "b") ab
In [40]:
#Count how long it took to run
In [41]:
from time import time

def add_timer(x, y=10):
    before = time()
    rv = x + y
    after = time()
    print('elapsed:', after - before)
    return rv
In [42]:
print('add(10)', add_timer(10))
print('add(20, 30)', add_timer(20, 30))
print('add("a", "b")', add_timer("a", "b"))
elapsed: 0.0
add(10) 20
elapsed: 9.5367431640625e-07
add(20, 30) 50
elapsed: 9.5367431640625e-07
add("a", "b") ab

But what if we have multiple functions that require timing?

In [49]:
def sub(x, y=10):
    return x - y
In [46]:
print('sub(10)', sub(10))
print('sub(20, 30)', sub(20, 30))
sub(10) 0
sub(20, 30) -10
In [55]:
def timer(func, x, y=10):
    before = time()
    rv = func(x, y)
    after = time()
    print('elapsed', after - before)
    return rv
In [56]:
print('add(10)', timer(add, 10))
print('add(20, 30)', timer(add, 20, 30))
print('add("a", "b")', timer(add, "a", "b"))
elapsed 9.5367431640625e-07
add(10) 20
elapsed 9.5367431640625e-07
add(20, 30) 50
elapsed 9.5367431640625e-07
add("a", "b") ab
In [74]:
def timer(func):
    def f(x, y=10):
        before = time()
        rv = func(x, y)
        after = time()
        print('elapsed', after - before)
        return rv
    return f
In [78]:
add = timer(add)
In [79]:
print('add(10)', add(10))
print('add(20, 30)', add(20, 30))
print('add("a", "b")', add("a", "b"))
elapsed 9.5367431640625e-07
add(10) 20
elapsed 9.5367431640625e-07
add(20, 30) 50
elapsed 9.5367431640625e-07
add("a", "b") ab
In [80]:
# Don't need to do add = timer(add) with decorators...
In [87]:
@timer
def add_dec(x, y=10):
    return x + y

@timer
def sub_dec(x, y=10):
    return x - y
In [88]:
print('add(10)', add_dec(10))
print('add(20, 30)', add_dec(20, 30))
print('add("a", "b")', add_dec("a", "b"))
print('sub(10)', sub_dec(10))
print('sub(20, 30)', sub_dec(20, 30))
elapsed 9.5367431640625e-07
add(10) 20
elapsed 1.1920928955078125e-06
add(20, 30) 50
elapsed 1.1920928955078125e-06
add("a", "b") ab
elapsed 9.5367431640625e-07
sub(10) 0
elapsed 0.0
sub(20, 30) -10
In [91]:
# Don't hardcode parameters in decorator functions
def timer_k(func):
    def f(*args, **kwargs):
        before = time()
        rv = func(*args, **kwargs)
        after = time()
        print('elapsed', after - before)
        return rv
    return f
In [92]:
@timer_k
def add_dec(x, y=10):
    return x + y

@timer_k
def sub_dec(x, y=10):
    return x - y

print('add(10)', add_dec(10))
print('add(20, 30)', add_dec(20, 30))
print('add("a", "b")', add_dec("a", "b"))
print('sub(10)', sub_dec(10))
print('sub(20, 30)', sub_dec(20, 30))
elapsed 9.5367431640625e-07
add(10) 20
elapsed 9.5367431640625e-07
add(20, 30) 50
elapsed 9.5367431640625e-07
add("a", "b") ab
elapsed 0.0
sub(10) 0
elapsed 9.5367431640625e-07
sub(20, 30) -10
In [93]:
# What if I want to run a function n number of times
In [94]:
# Let's have add run 3 times in a row and sub run twice in a row
In [97]:
n = 2

def ntimes(f):
    def wrapper(*args, **kwargs):
        for _ in range(n):
            print('running {.__name__}'.format(f))
            rv = f(*args, **kwargs)
        return rv
    return wrapper
    
        
@ntimes
def add_dec(x, y=10):
    return x + y

@ntimes
def sub_dec(x, y=10):
    return x - y
In [98]:
print('add(10)', add_dec(10))
print('add(20, 30)', add_dec(20, 30))
print('add("a", "b")', add_dec("a", "b"))
print('sub(10)', sub_dec(10))
print('sub(20, 30)', sub_dec(20, 30))
running add_dec
running add_dec
add(10) 20
running add_dec
running add_dec
add(20, 30) 50
running add_dec
running add_dec
add("a", "b") ab
running sub_dec
running sub_dec
sub(10) 0
running sub_dec
running sub_dec
sub(20, 30) -10

Higher Order Decorators

In [103]:
def ntimes(n):
    def inner(f):
        def wrapper(*args, **kwargs):
            for _ in range(n):
                print('running {.__name__}'.format(f))
                rv = f(*args, **kwargs)
            return rv
        return wrapper
    return inner
    
        
@ntimes(2)
def add_hdec(x, y=10):
    return x + y

@ntimes(4)
def sub_hdec(x, y=10):
    return x - y
In [104]:
print('add(10)', add_hdec(10))
print('add(20, 30)', add_hdec(20, 30))
print('add("a", "b")', add_hdec("a", "b"))
print('sub(10)', sub_hdec(10))
print('sub(20, 30)', sub_hdec(20, 30))
running add_hdec
running add_hdec
add(10) 20
running add_hdec
running add_hdec
add(20, 30) 50
running add_hdec
running add_hdec
add("a", "b") ab
running sub_hdec
running sub_hdec
running sub_hdec
running sub_hdec
sub(10) 0
running sub_hdec
running sub_hdec
running sub_hdec
running sub_hdec
sub(20, 30) -10
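One side effect of the wrappers above is that the decorated function reports the wrapper's metadata, so `add.__name__` is no longer `'add'`. The sketch below uses `functools.wraps` from the standard library to copy the metadata across. It is an addition for illustration and not something the notebook itself uses; `bare_timer` is a hypothetical name for a decorator without `wraps`.

```python
from functools import wraps
from time import time


def timer(func):
    @wraps(func)  # copy __name__, __doc__, etc. from func onto the wrapper
    def wrapper(*args, **kwargs):
        before = time()
        rv = func(*args, **kwargs)
        after = time()
        print('elapsed', after - before)
        return rv
    return wrapper


def bare_timer(func):  # same shape, but without wraps
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@timer
def add(x, y=10):
    return x + y


@bare_timer
def sub(x, y=10):
    return x - y


print(add.__name__)  # add
print(sub.__name__)  # wrapper
```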

Generators

In [105]:
# gen.py - use whenever sequencing is needed
In [109]:
# top-level syntax, function -> underscore method
# x()               __call__

def add1(x, y):
    return x + y

class Adder:
    def __call__(self, x, y):
        return x + y
add2 = Adder()    
In [110]:
add1(10, 20)
Out[110]:
30
In [111]:
add2(10, 20)
Out[111]:
30
In [113]:
# top-level syntax, function -> underscore method
# x()               __call__

def add1(x, y):
    return x + y

class Adder:
    def __init__(self):
        self.z = 0
        
    def __call__(self, x, y):
        self.z += 1
        return x + y + self.z

add2 = Adder()    
In [130]:
from time import sleep
# This example has storage... and has eager return of the result sets
def compute():
    rv = []
    for i in range(10):
        sleep(.5)
        rv.append(i)
    return rv
In [120]:
compute()
Out[120]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

This is wasteful: we have to wait for the entire computation to finish and hold every result in memory, when we really only care about each number, one at a time.

In [129]:
class Compute:
    def __call__(self):
        rv = []
        for i in range(100000):
            sleep(5)
            rv.append(i)
        return rv
    
    def __iter__(self):
        self.last = 0
        return self
    
    def __next__(self):
        rv = self.last
        self.last += 1
        if self.last > 10:
            raise StopIteration()
        sleep(.5)
        return rv
        
compute = Compute()

# THIS IS UGLY... now let's make a generator
In [131]:
#This is a generator... don't eagerly compute. Return to user as they ask for it...

def compute():
    for i in range(10):
        sleep(.5)
        yield i
In [125]:
# for x in xs:
#    pass

# xi = iter(xs)    -> __iter__
# while True:
#   x = next(xi)   -> __next__
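The commented outline above can be written out as runnable code. This sketch adds the one piece the outline leaves implicit: the loop ends when `next` raises `StopIteration`. The list `xs` and the two result lists are illustrative names.

```python
xs = [10, 20, 30]

# Version 1: the for statement
seen_by_for = []
for x in xs:
    seen_by_for.append(x)

# Version 2: roughly what the for statement does behind the scenes
seen_by_hand = []
xi = iter(xs)              # calls type(xs).__iter__
while True:
    try:
        x = next(xi)       # calls type(xi).__next__
    except StopIteration:  # the iterator signals exhaustion by raising
        break
    seen_by_hand.append(x)

print(seen_by_for == seen_by_hand)
```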
In [136]:
for val in compute():
    print(val)
0
1
2
3
4
5
6
7
8
9
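Because the generator hands back values on request, a caller that only needs the first few never pays for the rest. Here is a small sketch of that, using `itertools.islice` from the standard library. The `calls` list is a stand-in for the slow step (the `sleep(.5)` in the notebook) so the effect can be observed without waiting.

```python
from itertools import islice

calls = []  # records which values were actually computed


def compute():
    for i in range(10):
        calls.append(i)  # stands in for the slow step
        yield i


first_three = list(islice(compute(), 3))
print(first_three)
print(calls)
```

Only the values that were requested show up in `calls`; the remaining iterations of the loop never run.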
In [133]:
class Api:
    def run_this_first(self):
        first()
    def run_this_second(self):
        second()
    def run_this_last(self):
        last()    
In [137]:
def api():
    first()
    yield
    second()
    yield
    last()
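The generator version encodes the required ordering in its control flow, so a caller cannot run the steps out of order. Below is a runnable sketch of driving it. The bodies of `first`, `second`, and `last` are hypothetical stand-ins that just record their names, since the notebook never defines them.

```python
log = []


# Hypothetical stand-ins for the notebook's first(), second() and last()
def first():
    log.append('first')


def second():
    log.append('second')


def last():
    log.append('last')


def api():
    first()
    yield
    second()
    yield
    last()


steps = api()
next(steps)                # runs first(), then pauses at the first yield
log.append('caller work')  # the caller interleaves its own code here
next(steps)                # runs second(), then pauses at the second yield
next(steps, None)          # runs last(); the default absorbs StopIteration
print(log)
```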

Context Manager

In [2]:
# cty.py

from sqlite3 import connect
In [156]:
# with ctx() as x:
#   pass

# mgr = ctx()
# x = mgr.__enter__()
# try:
#   pass
# finally:
#   mgr.__exit__(...)   # receives the exception details, if any

class temptable:
    def __init__(self, cur):
        self.cur = cur
    def __enter__(self):
        print('__enter__')
        self.cur.execute('create table points(x int, y int)')
    def __exit__(self, *args):
        print('__exit__')
        self.cur.execute('drop table points')

with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
        for row in cur.execute('select sum(x * y) from points'):
            print(row)
__enter__
(1, 1)
(1, 2)
(2, 1)
(2, 2)
(9,)
__exit__
In [162]:
rm test.db
In [164]:
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
    
class contextmanager:
    def __init__(self, cur):
        self.cur = cur
    def __enter__(self):
        self.gen = temptable(self.cur)
        next(self.gen)
    def __exit__(self, *args):
        next(self.gen, None)
        
with connect('test.db') as conn:
    cur = conn.cursor()
    with contextmanager(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
        for row in cur.execute('select sum(x * y) from points'):
            print(row)
created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
(9,)
dropped table
In [8]:
class contextmanager:
    def __init__(self, gen):
        self.gen = gen
    def __call__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        return self
    def __enter__(self):
        self.gen_inst = self.gen(*self.args, **self.kwargs)
        next(self.gen_inst)
    def __exit__(self, *args):
        next(self.gen_inst, None)

def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
temptable = contextmanager(temptable)
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table
In [12]:
class contextmanager:
    def __init__(self, gen):
        self.gen = gen
    def __call__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs
        return self
    def __enter__(self):
        self.gen_inst = self.gen(*self.args, **self.kwargs)
        next(self.gen_inst)
    def __exit__(self, *args):
        next(self.gen_inst, None)
        
@contextmanager
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    yield
    cur.execute('drop table points')
    print('dropped table')
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table
In [15]:
from sqlite3 import connect
from contextlib import contextmanager
        
@contextmanager
def temptable(cur):
    cur.execute('create table points(x int, y int)')
    print('created table')
    try:
        yield
    finally:
        cur.execute('drop table points')
        print('dropped table')
            
with connect('test.db') as conn:
    cur = conn.cursor()
    with temptable(cur):
        cur.execute('insert into points (x, y) values(1, 1)')
        cur.execute('insert into points (x, y) values(1, 2)')
        cur.execute('insert into points (x, y) values(2, 1)')
        cur.execute('insert into points (x, y) values(2, 2)')
        for row in cur.execute("select x, y from points"):
            print(row)
created table
(1, 1)
(1, 2)
(2, 1)
(2, 2)
dropped table
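The `try`/`finally` around `yield` in the last version is what guarantees the teardown when the body of the `with` block raises. The sketch below compares the two shapes side by side. It uses an in-memory SQLite database (`':memory:'`) instead of `test.db`, which is an assumption made to keep the example self-contained, and the helper names `temptable_unsafe`, `temptable_safe`, `table_exists`, and `survives_error` are made up for this illustration.

```python
from contextlib import contextmanager
from sqlite3 import connect


@contextmanager
def temptable_unsafe(cur):
    cur.execute('create table points(x int, y int)')
    yield
    cur.execute('drop table points')      # skipped if the body raises


@contextmanager
def temptable_safe(cur):
    cur.execute('create table points(x int, y int)')
    try:
        yield
    finally:
        cur.execute('drop table points')  # runs even if the body raises


def table_exists(cur):
    cur.execute("select count(*) from sqlite_master "
                "where type = 'table' and name = 'points'")
    return cur.fetchone()[0] == 1


def survives_error(make_ctx):
    """Return True if the points table is still present after the body raises."""
    conn = connect(':memory:')  # in-memory database, so test.db is not touched
    cur = conn.cursor()
    try:
        with make_ctx(cur):
            raise ValueError('boom')
    except ValueError:
        pass
    leftover = table_exists(cur)
    conn.close()
    return leftover


print(survives_error(temptable_unsafe))
print(survives_error(temptable_safe))
```

With the unsafe version the table is left behind after the error; with the safe version it is dropped.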