Functions

We have actually already seen Python functions: all methods are functions that belong to a class (unbound methods, a concept that exists only in Python 2) or to an instance (bound methods). In Python we call a function any callable object that belongs to a module.

In [1]:
def f(x): 
    return x
In [2]:
f
Out[2]:
<function __main__.f>
In [3]:
class A: pass
In [4]:
a = A()
In [5]:
a.f = f
In [6]:
a.f
Out[6]:
<function __main__.f>
In [7]:
class A:
    def f(self, x):
        return x
In [8]:
A.f
Out[8]:
<function __main__.f>
In [9]:
a = A()
In [10]:
a.f
Out[10]:
<bound method A.f of <__main__.A object at 0x7ff04f1afb10>>

So we already know how to write a function. Now let's play a bit more with functions!

lambda

Define a simple function in one line using lambda:

In [1]:
f = lambda x, y: x * y
In [2]:
f
Out[2]:
<function __main__.<lambda>>
In [3]:
f(2, 4)
Out[3]:
8

That is equivalent to:

In [5]:
def f(x, y): return x * y
In [6]:
f(2, 4)
Out[6]:
8
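
Lambdas are most handy as short throwaway arguments to other functions. A minimal sketch, where the `pairs` list is made-up sample data used only for illustration:

```python
# Hypothetical sample data: (name, quantity) pairs.
pairs = [("pear", 3), ("apple", 7), ("fig", 1)]

# Sort by the second element of each pair using an anonymous function.
by_quantity = sorted(pairs, key=lambda pair: pair[1])
print(by_quantity)

# The same lambda bound to a name behaves like a regular function.
second = lambda pair: pair[1]
print(second(("kiwi", 5)))
```

For anything longer than a single expression, a regular def is usually easier to read.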

Count the most frequent words in a text

As a first step we can define a function to download a text from a URL and save it to the hard disk.

So we need to import urlopen from the standard library's urllib module, whose layout unfortunately changed between Python 2 and Python 3, plus an external library (html2text) to transform the HTML into text.

In [8]:
from sys import version_info as vinfo

#=============================================
# solve compatibility between python2 and python3
if vinfo.major == 2:
    from urllib2 import urlopen
else:
    from urllib.request import urlopen

# import the library that converts HTML into text
import html2text

Define some global variables

In [9]:
# Measure for Measure ~ Shakespeare
URL = "http://shakespeare.mit.edu/measure/full.html"
FNAME = "shakespeare.txt"

Define a function to download the HTML, decode the bytes into text (UTF-8 is the default in Python 3), convert the HTML to plain text and clean up the string.

In [10]:
def get_text_from_url(url):
    # urlopen returns bytes that need to be decoded into text
    html = ''.join([line.decode() for line in urlopen(url)])
    return html2text.html2text(html).replace('\n\n', '\n')
In [11]:
def write_text(text, fname):
    with open(fname, 'w') as txt:
        txt.writelines(text)
In [ ]:
def write_text2(text, fname):
    txt = open(fname, 'w')
    txt.writelines(text)
    txt.close()

def write_text3(text, fname):
    txt = open(fname, 'w')
    for line in text:
        txt.write(line)
    txt.close()

We can also write the second function in these two other ways, but if we do not use the with statement when we open a file, we have to remember to close it ourselves!

We can also iterate over the text and write it piece by piece, as write_text3 does.
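
To see what the with statement does for us, we can check the closed attribute of the file object inside and outside the block. This is only a sketch: the temporary directory and the file name `example.txt` are assumptions chosen so that nothing important gets overwritten.

```python
import os
import tempfile

# Write to a throwaway file in a temporary directory (the path is illustrative).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.txt")

with open(path, "w") as txt:
    txt.write("first line\n")
    inside = txt.closed      # the file is still open inside the block

after = txt.closed           # the with block has closed it for us

with open(path) as txt:
    content = txt.read()

print(inside, after, repr(content))
```

The file is closed even if an exception is raised inside the block, which is the main reason to prefer with over an explicit close().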

Then we have to call our functions to execute:

In [12]:
write_text(get_text_from_url(URL), FNAME)

Time for Coding

Now define your own function, open the file, and count the frequency of each word present in the text.

A possible solution could be:

In [30]:
def word_counter0(fname, nwds):
    counter = {}
    with open(fname, 'r') as txt:
        for line in txt:
            for wd in line.lower().split():
                counter[wd] = 1 + counter.get(wd, 0)
    words = sorted(counter.keys())
    return words
    #for wd in words[:nwds]:
        #print("%25s: %s" % (wd, counter[wd]))

Maybe it is nicer to print the most frequent words first, so we can write:

In [62]:
def get_second(x): 
    return x[1]

def word_counter1(fname, nwds):
    counter = {}
    with open(fname, 'r') as txt:
        for line in txt:
            for wd in line.lower().split():
                counter[wd] = 1 + counter.get(wd, 0)
    words = sorted(counter.items(), key=get_second, reverse=True)
    return words
    #for wd, num in words[:nwds]:
        #print("%25s: %d" % (wd, num))
In [63]:
from operator import itemgetter
    

def word_counter2(fname, nwds):
    counter = {}
    with open(fname, 'r') as txt:
        for line in txt:
            for wd in line.lower().split():
                counter[wd] = 1 + counter.get(wd, 0)
    words = sorted(counter.items(), key=itemgetter(1), reverse=True)
    return words
    #for wd, num in words[:nwds]:
        #print("%25s: %d" % (wd, num))
In [71]:
import dis

print('get_second')
dis.dis(get_second)

gitm = itemgetter(1)
print('itemgetter')
dis.dis(gitm._call)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-71-eaa4493728c8> in <module>()
      6 gitm = itemgetter(1)
      7 print('itemgetter')
----> 8 dis.dis(gitm._call)

AttributeError: 'operator.itemgetter' object has no attribute '_call'
get_second
  3           0 LOAD_FAST                0 (x) 
              3 LOAD_CONST               1 (1) 
              6 BINARY_SUBSCR        
              7 RETURN_VALUE         
itemgetter

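The disassembly of itemgetter fails because, in CPython, itemgetter is implemented in C: it has no Python bytecode to show and no _call attribute. What matters here is that itemgetter returns a callable object; itemgetter(1) always returns the second element: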

In [14]:
item1 = itemgetter(1)
In [16]:
item1([0, 1, 2, 3])
Out[16]:
1

Or we can choose which items we want to extract from an object.

In [17]:
item2 = itemgetter(1, 3)
In [18]:
item2([0, 1, 2, 3])
Out[18]:
(1, 3)

So itemgetter is what is called a factory: an object that returns another object.
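
Since every call to itemgetter builds a new callable, we can prepare several sort keys up front. A small sketch, where the `items` list is hypothetical data shaped like the items of the counter dictionary:

```python
from operator import itemgetter

# Hypothetical (word, count) pairs.
items = [("the", 3), ("and", 2), ("cat", 1), ("dog", 2)]

# Each call to itemgetter returns a new callable object.
by_count = itemgetter(1)
by_count_then_word = itemgetter(1, 0)

# Most frequent first; ties keep their original order (sorting is stable).
most_frequent = sorted(items, key=by_count, reverse=True)
# Least frequent first; ties are broken alphabetically.
least_frequent = sorted(items, key=by_count_then_word)

print(most_frequent)
print(least_frequent)
```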

With a bit more effort we can make the function cleaner and more concise:

In [33]:
from collections import Counter


def word_counter3(fname, nwds):
    with open(fname) as txt:
        words = Counter(txt.read().lower().split()).most_common(nwds)
        return words
        #for wd, num in Counter(txt.read().lower().split()).most_common(nwds):
            #print("%25s: %d" % (wd, num))

Now we can compare the speed of the different implementations:

In [67]:
from timeit import timeit

NWORDS = 20
NUMBER = 100

t0 = timeit("word_counter0(FNAME, NWORDS)",
            setup="from __main__ import word_counter0, FNAME, NWORDS",
            number=NUMBER)
t1 = timeit("word_counter1(FNAME, NWORDS)",
            setup="from __main__ import word_counter1, FNAME, NWORDS",
            number=NUMBER)
t2 = timeit("word_counter2(FNAME, NWORDS)",
            setup="from __main__ import word_counter2, FNAME, NWORDS",
            number=NUMBER)
t3 = timeit("word_counter3(FNAME, NWORDS)",
            setup="from __main__ import word_counter3, FNAME, NWORDS",
            number=NUMBER)

print("word_counter0: ", t0)
print("word_counter1: ", t1)
print("word_counter2: ", t2)
print("word_counter3: ", t3)
word_counter0:  1.3633682700001373
word_counter1:  1.327042075999998
word_counter2:  1.2726883029990859
word_counter3:  0.6232755900000484
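
timeit also accepts a callable instead of a string, which avoids the setup import line. A minimal sketch, where `build_squares` is a hypothetical function used only as something to measure:

```python
from timeit import timeit


def build_squares(n=1000):
    """Hypothetical workload: build a list of squares."""
    return [i * i for i in range(n)]


# Pass the function object itself; timeit calls it `number` times.
elapsed = timeit(build_squares, number=100)

# A lambda lets us pass arguments to the function being measured.
elapsed_small = timeit(lambda: build_squares(10), number=100)

print("build_squares(1000):", elapsed)
print("build_squares(10):  ", elapsed_small)
```

The absolute numbers depend on the machine, so only compare timings taken in the same session.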

Master Mind

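OK, counting words is boring! Let's implement your first game.

As a first step, implement a function that, given a certain attempt, returns the number of blacks and whites: a black marks a correct symbol in the correct position, a white marks a correct symbol in the wrong position.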

In [ ]:
def check(solution, attempt):
    """Return a tuple with the number of black and white. ::

    >>> check(['a', 'b', 'c', 'd', 'e'], ['e', 'd', 'c', 'b', 'a'])
    (1, 4)
    >>> check(['l', 'l', 'l', 'c'], ['c', 'c', 'c', 'c'])
    (1, 0)
    """
    pass

Then we can implement another function that manages the game.

In [ ]:
def mastermind():
    pass

We can read a text input from a user using the input function:

In [8]:
from sys import version_info as vinfo

# solve compatibility between python2 and python3
input = raw_input if vinfo.major == 2 else input
In [9]:
input("Try: ")
Try: pietro
Out[9]:
'pietro'

Time for coding!

A possible solution for the first function could be:

In [ ]:
def check(solution, attempt):
    """Return a tuple with the number of black and white. ::

    >>> check(['a', 'b', 'c', 'd', 'e'], ['e', 'd', 'c', 'b', 'a'])
    (1, 4)
    >>> check(['l', 'l', 'l', 'c'], ['c', 'c', 'c', 'c'])
    (1, 0)
    """
    solution1 = list(solution)
    white = 0
    for c in attempt:
        if c in solution1:
            white += 1
            solution1.remove(c)

    black = 0
    for a, s in zip(attempt, solution):
        if a == s:
            black += 1
    return black, white - black

or in a more concise form:

In [ ]:
def check2(solution, attempt):
    """Return a tuple with the number of black and white. ::

    >>> check2(['a', 'b', 'c', 'd', 'e'], ['e', 'd', 'c', 'b', 'a'])
    (1, 4)
    >>> check2(['l', 'l', 'l', 'c'], ['c', 'c', 'c', 'c'])
    (1, 0)
    """
    solution1 = list(solution)
    white = len([solution1.remove(c) for c in attempt if c in solution1])
    black = len([(a, s) for a, s in zip(attempt, solution) if a == s])
    return black, white - black
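
Another way to get the same counts is to treat the two sequences as multisets. The sketch below is an alternative, not the solution above: `check_counter` is a hypothetical name, and it relies on the `&` operator of collections.Counter, which keeps the minimum count of each symbol.

```python
from collections import Counter


def check_counter(solution, attempt):
    """Hypothetical alternative to check, based on collections.Counter."""
    # Symbols in common regardless of position (multiset intersection).
    common = sum((Counter(solution) & Counter(attempt)).values())
    # Symbols that are also in the right position.
    black = sum(a == s for a, s in zip(attempt, solution))
    return black, common - black


print(check_counter("abcde", "edcba"))
print(check_counter("lllc", "cccc"))
```

Both calls reproduce the doctest results, (1, 4) and (1, 0).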

Then I split the logic into two other functions:

In [ ]:
from random import choice


def user(board=None, length=4, validset=''):
    """Interface between the user and the mastermind function."""
    return tuple(input("Try: "))


def solver(board=None, length=4, validset=''):
    pass


def mastermind(length=4, show=False, validset='abcdefghijkl',
               getattempt=user):
    solution = [choice(validset) for _ in range(length)]
    print("Choose %d characters from: %s" % (length, validset))
    if show:
        print(''.join(solution))

    board = []
    black = 0
    while black < length:
        attempt = getattempt(board=board, length=length, validset=validset)
        black, white = check(solution, attempt)
        board.append((attempt, black, white))
        print(" " * 15 + "BLACK: %d; WHITE: %d." % (black, white))
    print("Congratulations, you won!")

Mutable default arguments

As a general rule, it is good to avoid mutable objects as default parameters in functions and methods, because they can produce unexpected results!

In [25]:
def append(a, flist=[]):
    flist.append(a)
    return flist
In [26]:
append(10, [0, 1, 2, 3])
Out[26]:
[0, 1, 2, 3, 10]
In [28]:
append(10)
Out[28]:
[10]
In [29]:
append(10)
Out[29]:
[10, 10]

Why? Because the objects that provide the default values are not created at the time our function is called. They are created once, when the def statement that defines the function is executed! If we want a new list every time we call the function, we have to use a syntax like:

In [30]:
def append(a, flist=None):
    flist = [] if flist is None else flist
    flist.append(a)
    return flist
In [31]:
append(10)
Out[31]:
[10]
In [32]:
append(10)
Out[32]:
[10]
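
We can watch the shared default directly: Python stores the default values on the function object, in the `__defaults__` attribute. A small sketch that redefines the first, problematic version of append:

```python
def append(a, flist=[]):
    flist.append(a)
    return flist


# The default list is created once and stored on the function object.
print(append.__defaults__)

first = append(10)

# The stored default has been modified by the call...
print(append.__defaults__)

# ...because the returned list and the default are the very same object.
same_object = first is append.__defaults__[0]
print(same_object)
```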

Time for coding!

Using a mutable object, write a singleton. A singleton is an object that is instantiated only once. Suppose that you want a connection to a database, but you don't want to instantiate several connections, just one.

In [140]:
class Connection:
    number_of_connection = 0
    def __init__(self):
        Connection.number_of_connection += 1
        self.n_con = Connection.number_of_connection
    
    def __repr__(self):
        """Return a representation. ::
        
            >>> c0 = Connection()
            >>> c0
            Connection(1/1)
            >>> c1 = Connection()
            >>> c1
            Connection(2/2)
        """
        return "Connection(%d/%d)" % (self.n_con, Connection.number_of_connection)
In [135]:
c0 = Connection()
c0
Out[135]:
Connection(1/1)
In [136]:
c1 = Connection()
c1
Out[136]:
Connection(2/2)
In [137]:
c0
Out[137]:
Connection(1/2)

The expected behaviour of your function is:

>>> c0 = get_connection()
>>> c1 = get_connection()
>>> c0 is c1
True
>>> c2 = Connection()
>>> c0 is c2
False

A possible solution is:

In [141]:
def get_connection(_connection=[None, ]):
    """
    >>> c0 = get_connection()
    >>> c1 = get_connection()
    >>> c0 is c1
    True
    >>> c2 = Connection()
    >>> c0 is c2
    False
    """
    if _connection[0] is None:
        _connection[0] = Connection()
    return _connection[0]
In [142]:
c0 = get_connection()
c1 = get_connection()
In [143]:
c0
Out[143]:
Connection(1/1)
In [144]:
c1
Out[144]:
Connection(1/1)
In [145]:
c0 is c1
Out[145]:
True
In [146]:
c2 = Connection()
In [147]:
c2
Out[147]:
Connection(2/2)
In [121]:
c0 is c2
Out[121]:
False
In [122]:
id(c0)
Out[122]:
140264372436880
In [123]:
id(c1)
Out[123]:
140264372436880
In [125]:
id(c2)
Out[125]:
140264372435216
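
The mutable default works, but the same effect can be had without exposing a hidden parameter. One option, sketched here under the assumption that Python 3 is available, is functools.lru_cache, which remembers the result of the first call. `Resource` and `get_resource` are hypothetical stand-ins for Connection and get_connection.

```python
from functools import lru_cache


class Resource:
    """Hypothetical stand-in for the Connection class."""


@lru_cache(maxsize=None)
def get_resource():
    # The body runs only on the first call; later calls return the cached object.
    return Resource()


r0 = get_resource()
r1 = get_resource()
r2 = Resource()

print(r0 is r1)
print(r0 is r2)
```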

Arbitrary number of arguments and/or keyword arguments

In [33]:
def func(*args, **kwargs):
    for arg in args:
        print(arg)
    for key, arg in kwargs.items():
        print(key, arg)
In [34]:
func('apple', 'orange', 'banana')
apple
orange
banana
In [35]:
func(fruit=['apple', 'orange', 'banana'], other=['turnip', 'carrots', 'potatoes'])
fruit ['apple', 'orange', 'banana']
other ['turnip', 'carrots', 'potatoes']
In [36]:
func('apple', 'orange', 'banana', other=['turnip', 'carrots', 'potatoes'])
apple
orange
banana
other ['turnip', 'carrots', 'potatoes']
In [37]:
func(*['apple', 'orange', 'banana'])
apple
orange
banana
In [38]:
func(**dict(fruit=['apple', 'orange', 'banana'], other=['turnip', 'carrots', 'potatoes']))
fruit ['apple', 'orange', 'banana']
other ['turnip', 'carrots', 'potatoes']
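
The star syntax can be mixed with ordinary parameters. In the sketch below, `describe` is a hypothetical function with one required parameter, extra positional arguments, a keyword-only parameter (`sep`, Python 3 only) and extra keyword arguments:

```python
def describe(name, *args, sep=", ", **kwargs):
    """Hypothetical example mixing named, *args and **kwargs parameters."""
    extras = sep.join(str(arg) for arg in args)
    # Sort the keyword arguments so that the output order is predictable.
    options = sep.join("%s=%s" % (k, v) for k, v in sorted(kwargs.items()))
    return "%s: [%s] {%s}" % (name, extras, options)


values = ("apple", "orange")
options = {"color": "red", "size": 2}

# * unpacks the tuple into positional arguments, ** the dict into keywords.
print(describe("fruit", *values, **options))
print(describe("empty"))
```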

Time for coding!

Define a function that accepts an arbitrary number of parameters and returns their product.

In [ ]:
def multiply(*args):
    """Return the product of all the arguments. ::
    
        >>> multiply(0, 1, 2, 3)
        0
        >>> multiply(1, 2, 3)
        6
    """
    pass

A possible solution:

In [106]:
def multiply(*args):
    """Return the product of all the arguments. ::
    
        >>> multiply(0, 1, 2, 3)
        0
        >>> multiply(1, 2, 3)
        6
    """
    result = 1
    for arg in args:
        result *= arg
    return result
In [107]:
multiply(0, 1, 2, 3)
Out[107]:
0
In [108]:
multiply(1, 2, 3)
Out[108]:
6
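
The explicit loop can also be expressed with functools.reduce and operator.mul. This is just an alternative sketch; `multiply_reduce` is a hypothetical name, and the initial value 1 makes a call with no arguments return 1 instead of raising an error.

```python
from functools import reduce
from operator import mul


def multiply_reduce(*args):
    """Return the product of all the arguments (1 if there are none)."""
    return reduce(mul, args, 1)


print(multiply_reduce(0, 1, 2, 3))
print(multiply_reduce(1, 2, 3))
print(multiply_reduce())
```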

Nested Functions

In [10]:
def maker(n):
    def action(x):
        return x ** n
    return action
In [12]:
f = maker(2)

print(f(3))
print(f(4))
9
16
In [14]:
g = maker(3)
print(g(4))
64

The function maker returns a function (action). The value n is fixed when maker is called and is then used as a constant by action; a function that remembers values from its enclosing scope like this is called a closure.
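
We can peek at the value that each returned function carries around. In CPython it is stored in the `__closure__` attribute of the function; the sketch below redefines maker so that it runs on its own, and the names `square` and `cube` are just illustrative.

```python
def maker(n):
    def action(x):
        return x ** n
    return action


square = maker(2)
cube = maker(3)

# Each returned function keeps its own copy of the enclosing variable n.
n_square = square.__closure__[0].cell_contents
n_cube = cube.__closure__[0].cell_contents
print(n_square, n_cube)

# The names of the captured variables are listed on the code object.
print(square.__code__.co_freevars)
print(square(5), cube(5))
```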

How could this feature of the language be useful for our problem?

In [18]:
def percent(tot):
    def perc(i):
        return "%03d%%" % int(i * 100 / tot)
    return perc
In [19]:
p100 = percent(100)
In [20]:
p100(1)
Out[20]:
'001%'
In [21]:
p100(25)
Out[21]:
'025%'
In [22]:
p250 = percent(250)
In [23]:
p250(1)
Out[23]:
'000%'
In [24]:
p250(25)
Out[24]:
'010%'

Time for coding!

Using a closure, define a function that returns a function that behaves like itemgetter.

In [76]:
itemget1 = itemgetter(1)
itemget1([0, 1, 2, 3, 4])
Out[76]:
1
In [78]:
itemget1_2 = itemgetter(1, 2)
itemget1_2([0, 1, 2, 3, 4])
Out[78]:
(1, 2)

A possible solution could be:

In [1]:
def getitems(*args):
    if len(args) == 1:
        def func(obj):
            return obj[args[0]]
    else:
        def func(obj):
            return tuple(obj[i] for i in args)
    return func
In [74]:
itm = getitems(1)
itm([0, 1, 2, 3, 4])
Out[74]:
1
In [75]:
itm = getitems(1, 2)
itm([0, 1, 2, 3, 4])
Out[75]:
(1, 2)
In [4]:
getval = getitems('pippo', 'pluto')
In [5]:
getval({'pippo': 1, 'pluto':2})
Out[5]:
(1, 2)

Print and update a row

Write a function that prints a download bar with the percentage, using the new language feature that we have just learnt.

In [8]:
import time
for i in range(101):
    print("\r%3d%%" % i, end='', flush=True)  # to overwrite the row we need to add '\r'
    time.sleep(0.05)
100%

A possible solution could be:

In [9]:
def percent(total, step, fill='#', empty='-', barsize=30):
    total -= 1

    def printpercent(i):
        rest = i / total
        ifill = int(rest * barsize)
        print('\r[%s%s] %3d%%' % (fill * ifill,
                                  empty * (barsize - ifill),
                                  int(rest * 100.)), end='', flush=True)
    return printpercent
In [10]:
perc = percent(100, 2)
for i in range(100): 
    perc(i)
    time.sleep(0.05)
[##############################] 100%
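
A bar that prints directly is hard to check, so it can help to separate building the string from printing it. The sketch below is a variation on the solution above, not a replacement: `make_bar` is a hypothetical name, it returns the text instead of printing it, and it assumes Python 3 division and a total of at least 2.

```python
def make_bar(total, fill="#", empty="-", barsize=30):
    """Return a function that formats the bar for step i (0 <= i < total)."""
    last = total - 1  # index of the final step

    def bar(i):
        ratio = i / last
        nfill = int(ratio * barsize)
        return "[%s%s] %3d%%" % (fill * nfill,
                                 empty * (barsize - nfill),
                                 int(ratio * 100))
    return bar


bar = make_bar(11, barsize=10)
for i in (0, 5, 10):
    print(bar(i))
```

To animate it, print each string with a leading '\r', end='' and flush=True, as in the loop shown earlier.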

Decorator

Function

There is a special type of object called a decorator. We have already seen one: property.

class A:
    @property
    def read_only(self):
        return 5

Decorators are objects that are called to do something before and after a method or a function.

In [11]:
VERBOSE = True

def verbose(func):
    def wrapper(*args, **kargs):
        if VERBOSE:
            print("Before executing: %s" % func.__name__)
        result = func(*args, **kargs)
        if VERBOSE:
            print("After executing: %s" % func.__name__)
        return result
    return wrapper
In [12]:
@verbose
def add(a, b):
    return a + b
In [13]:
add(1, 2)
Before executing: add
After executing: add
Out[13]:
3
In [12]:
def notimplemented(func):
    def wrapper(*args, **kargs):
        print("%s is not implemented yet." % func.__name__)
    return wrapper
In [13]:
@notimplemented
def add(a, b):
    return a + b
In [14]:
add(1, 2)
add is not implemented yet.
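
One side effect of wrapping is that the decorated function takes the name and docstring of the wrapper. The standard library offers functools.wraps to copy them over. A minimal sketch, with a simplified `verbose` decorator written only for this example:

```python
from functools import wraps


def verbose(func):
    @wraps(func)  # copy __name__, __doc__, etc. from func to wrapper
    def wrapper(*args, **kwargs):
        print("Calling %s" % func.__name__)
        return func(*args, **kwargs)
    return wrapper


@verbose
def add(a, b):
    """Return a + b."""
    return a + b


print(add(1, 2))
# Without @wraps these would report 'wrapper' and None.
print(add.__name__, add.__doc__)
```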

Decorator with parameters

In [79]:
DEPRECATEMSG = ("WARNING: `{func}` is deprecated and will be removed"
                " in the next release, use `{use}` instead.")

def deprecated(use, msg=DEPRECATEMSG):
    def decorator(func):
        def wrapper(*args, **kargs):
            print(msg.format(func=func.__name__, use=use))
            return func(*args, **kargs)
        return wrapper
    return decorator
In [80]:
@deprecated("numpy.add")
def add(a, b):
    return a + b
In [81]:
add(1, 2)
WARNING: `add` is deprecated and will be removed in the next release, use `numpy.add` instead.
Out[81]:
3
In [28]:
@deprecated("numpy.add", "WARNING: `{func}` will definitely be removed in the next release.")
def add(a, b):
    return a + b
In [29]:
add(1, 2)
WARNING: `add` will definitely be removed in the next release.
Out[29]:
3
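
Printing is fine for a demo, but the standard library has a dedicated channel for this kind of message: the warnings module and the DeprecationWarning category. The sketch below is one possible variant, not the version above; the message text is made up, and catch_warnings is used only to capture the warning so that we can inspect it.

```python
import warnings
from functools import wraps


def deprecated(use):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller of func.
            warnings.warn("`%s` is deprecated, use `%s` instead."
                          % (func.__name__, use),
                          DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("numpy.add")
def add(a, b):
    return a + b


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = add(1, 2)

print(result, caught[0].category.__name__, caught[0].message)
```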

Class

Not only functions, but also classes can be used as decorators!

In [82]:
class Deprecated:
    def __init__(self, func, use, msg):
        self.func = func
        self.use = use
        self.msg = msg

    def __call__(self, *args, **kargs):
        print(self.msg.format(func=self.func.__name__, use=self.use))
        return self.func(*args, **kargs)
    
def deprecated(use, msg=DEPRECATEMSG):
    def decorator(func):
        return Deprecated(func, use, msg)
    return decorator
        
In [83]:
@deprecated("numpy.add", "WARNING: `{func}` will definitely be removed in the next release.")
def add(a, b):
    return a + b
In [84]:
add(1, 2)
WARNING: `add` will definitely be removed in the next release.
Out[84]:
3

Time for coding!

Develop a decorator function or class to measure the execution time of a function or method.

To measure the time you can use the function time in the time module.

In [86]:
import time
In [87]:
time.time()
Out[87]:
1392474055.6548846
In [89]:
tstart = time.time()
tstop = time.time()
print(tstop - tstart)
4.649162292480469e-05

A possible solution using a function:

In [14]:
def timeit(func):
    def wrapper(*args, **kargs):
        start = time.time()
        result = func(*args, **kargs)
        stop = time.time()
        print("`{func}` required: {time}s".format(func=func.__name__, time=stop - start))
        return result
    return wrapper
In [15]:
@timeit
def add(a, b):
    return a + b
In [16]:
add(1, 2)
`add` required: 1.430511474609375e-06s
Out[16]:
3
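
time.time() follows the wall clock, which may have a coarse resolution or be adjusted while we measure. For short intervals Python 3.3 and later also provide time.perf_counter(). A sketch of the same idea, where `timed` and the `last_elapsed` attribute are hypothetical names chosen for this example:

```python
import time
from functools import wraps


def timed(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # Store the duration on the wrapper instead of printing it.
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    wrapper.last_elapsed = None  # no call measured yet
    return wrapper


@timed
def add(a, b):
    return a + b


print(add(1, 2))
print("`add` required: %.3es" % add.last_elapsed)
```

Keeping the measurement as an attribute leaves the caller free to format or log it.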

This decorator is nice, but its output is not easy to read.

In [17]:
def timeit(nice=False, msg="`{func}` required: {time}"):

    def decorator(func):

        def beautiful(x):
            symbols = ('s', 'ms', 'µs', 'ns')
            step = 1e3
            if nice:
                for i, symbol in enumerate(symbols):
                    value = x * step**i
                    if value // 1:
                        break
            else:
                value = x
                symbol = symbols[0]
            return "%5.2f%s" % (value, symbol)
            
        def wrapper(*args, **kargs):
            start = time.time()
            result = func(*args, **kargs)
            stop = time.time()
            print(msg.format(func=func.__name__, time=beautiful(stop - start)))
            return result
        return wrapper
    return decorator
In [23]:
@verbose
@timeit()
def add(a, b):
    return a + b
In [22]:
timeit(nice=True)(add)(1, 2)
`add` required:  1.19µs
`wrapper` required: 36.00µs
Out[22]:
3
In [20]:
@timeit(nice=True)
def add(a, b):
    return a + b
In [21]:
add(1, 2)
`add` required:  1.19µs
Out[21]:
3
In [24]:
def add(): pass

def add(x, y): return x*y
In [26]:
add()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-09e9fb56b0a4> in <module>()
----> 1 add()

TypeError: add() missing 2 required positional arguments: 'x' and 'y'
In [9]:
def prova(x):
    for i in range(x):
        yield i
In [10]:
[i*2 for i in prova(5)]
Out[10]:
[0, 2, 4, 6, 8]
In [17]:
rng = range(5)
rng[-1]
Out[17]:
4

Summary

What we have seen:

  • How to read/write a file;
  • How to use the with statement;
  • How to measure performance using timeit;
  • How to use mutable objects as default parameter;
  • How to define an arbitrary number of arguments;
  • How to define and use nested functions;
  • How to use and write a decorator.