# Unit 6, Lecture 3

Numerical Methods and Statistics

# Lecture Goals

1. Know what a Python function is and be able to define one
2. Be able to call a function and understand how arguments are passed to the function
3. Know the difference between arguments and variables
4. Understand how to return values
5. Be able to design and document a function
6. Be able to convert a function to work with NumPy arrays
In [1]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import matplotlib


# Defining Functions

Sometimes it's nice to use functions instead of writing the equation for a geometric distribution each time.

# Function Definition

You can define your own functions using the def keyword.

In [2]:
def print_hello():
    print('Hello')

print_hello()

Hello


You can give a function arguments by listing them inside the parentheses, like this:

In [3]:
def print_string(s):
    print(s) # <--- I can now use s anywhere inside the function

print_string('Go dog')
print_string('See dog go')

Go dog
See dog go


What if you want to return something? You can, with the return statement.

In [4]:
def square(x):
    return x * x

x_squared = square(2)
print(x_squared)

4


## What is the difference between returning and printing?

In [5]:
def square_print(x):
    print(x * x)

square_print(5) * 4

25

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
2     print(x * x)
3
----> 4 square_print(5) * 4

TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
In [6]:
square(5) * 4

Out[6]:
100
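The two cells above can be compared side by side. This is a minimal sketch that stores what each function hands back; the variable names `printed` and `returned` are introduced here only for illustration.

```python
def square_print(x):
    print(x * x)   # displays the value; with no return statement the result is None

def square(x):
    return x * x   # hands the value back to the caller

printed = square_print(5)   # shows 25 on screen, but nothing is returned
returned = square(5)        # shows nothing, but the value is kept

print(printed is None)
print(returned * 4)
```

A function that only prints is fine for displaying results, but the value cannot be reused in later calculations.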

You can pass multiple arguments, and your function can span multiple lines, just like for loops and if statements.

In [5]:
def geometric(n, p):
    P_n = (1 - p)**(n - 1) * p
    return P_n
geometric(5, 0.1)

Out[5]:
0.06561
In [6]:
Q = np.arange(1,100)
plt.plot(Q, geometric(Q, 0.1), 'o')
plt.show()
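Because `geometric` uses only arithmetic, it accepts a whole array of `n` values at once. As a rough sanity check (a sketch, not part of the original lecture), the probabilities over the plotted range should add up to just under 1, since the terms beyond `n = 99` are left out.

```python
import numpy as np

def geometric(n, p):
    P_n = (1 - p)**(n - 1) * p
    return P_n

Q = np.arange(1, 100)        # n = 1, 2, ..., 99
probs = geometric(Q, 0.1)    # one probability per element of Q
total = probs.sum()          # close to, but slightly below, 1

print(probs.shape)
print(total)
```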


You can write your own documentation by placing a string, called a docstring, on the first line of the function. Use ''' instead of a single ' so that the string can span multiple lines.

In [7]:
def my_geom(n, p=0.5):
    '''This function will compute the probability of n from a geometric distribution given p

    Args:
        p: The probability of success
        n: The number of times before and including success

    returns: the probability'''
    return (1 - p)**(n - 1) * p

In [8]:
help(my_geom)

Help on function my_geom in module __main__:

my_geom(n, p=0.5)
This function will compute the probability of n from a geometric distribution given p

Args:
p: The probability of success
n: The number of times before and including success

returns: the probability



Notice I used a default value for p

In [9]:
my_geom(n=4)

Out[9]:
0.0625

Notice I used the name of the argument with the n=4. This is called a named argument. Positional arguments are just regular arguments, matched to parameters by their order. You can mix named and positional arguments in the same call. For example:

In [38]:
my_geom(3, p=0.3)

Out[38]:
0.14699999999999996

There is one rule, though: you must put positional arguments before named arguments. For example, this will not work:

In [39]:
my_geom(n=3, 0.2)

  File "<ipython-input-39-c496f4e9a124>", line 1
my_geom(n=3, 0.2)
^
SyntaxError: positional argument follows keyword argument
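The valid ways of calling the function can be sketched as follows. `my_geom` is repeated here without its docstring so that the block runs on its own; the names `a` through `d` are only for illustration.

```python
def my_geom(n, p=0.5):
    return (1 - p)**(n - 1) * p

a = my_geom(3, 0.3)        # both positional
b = my_geom(3, p=0.3)      # positional first, then named
c = my_geom(n=3, p=0.3)    # both named
d = my_geom(p=0.3, n=3)    # named arguments can appear in any order

print(a == b == c == d)
```

All four calls bind n to 3 and p to 0.3, so they compute the same value.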


## Things to consider in creating a function

1. Write the signature and doc string describing what your function should do and its input/output
2. Write a function that generates correct output for most inputs
3. Consider edge cases and possible user error
4. Validate for numpy (optional)

## Example

For our example, we'll write a function that computes the $n$th term of the Fibonacci sequence. The Fibonacci sequence is a sequence whose $n$th element is the sum of the $n-1$ and $n-2$ elements. The first two elements are defined to be $1$. Here are the first few terms:

$$1,\,1,\,2,\,3,\,5,\,8$$

So our function should return 5 if it is passed 4 (first term is indexed as 0).

## Write doc string

In [10]:
def fib():
    '''Returns the ith element from the Fibonacci sequence

    '''


## Define the input/output

We decide that only one input, the index, is necessary.

In [11]:
def fib(i):
    '''Returns the ith element from the Fibonacci sequence

    Args:
        i: the index of the element in the Fibonacci sequence

    returns: the element at the given index in the Fibonacci sequence
    '''


## Make it work

Now we write the body and test it out! I'll omit the docstring to make the slides easier to view. We should use inline comments, which start with a #, to explain the logic of the program.

In [12]:
def fib_1(i):
    last = 1 # our n - 1 element
    last_last = 1 # our n - 2 element
    for j in range(i): #go until we are at the ith element
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [13]:
fib_1(4)

Out[13]:
8

It looks pretty good, except that it is off by one index from our original example: we wanted 5 for an input of 4, and 8 is the element at index 5.

In [14]:
def fib_2(i):
    last = 1 # our n - 1 element
    last_last = 1 # our n - 2 element
    #sub 1 because last and last_last start as
    #elements 1 and 0, so the first pass gives n=2
    for j in range(i - 1):
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [15]:
fib_2(4)

Out[15]:
5
In [16]:
fib_2(5)

Out[16]:
8
In [17]:
fib_2(6)

Out[17]:
13

Looks like it works on our original example

## Consider Edge Case

How does it deal with the starting point?

In [18]:
fib_2(0)

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-18-d4a739e2f51f> in <module>()
----> 1 fib_2(0)

<ipython-input-14-dbfebd64ba2d> in fib_2(i)
8         last_last = last #our new n - 2
9         last = current # our new n - 1
---> 10     return current

UnboundLocalError: local variable 'current' referenced before assignment
In [19]:
fib_2(1)

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-19-b9eccf2b3e3b> in <module>()
----> 1 fib_2(1)

<ipython-input-14-dbfebd64ba2d> in fib_2(i)
8         last_last = last #our new n - 2
9         last = current # our new n - 1
---> 10     return current

UnboundLocalError: local variable 'current' referenced before assignment
In [20]:
fib_2(2)

Out[20]:
2
In [21]:
def fib_3(i):

    #deal with edge cases
    if i == 0 or i == 1:
        return 1

    last = 1 # our n - 1 element
    last_last = 1 # our n - 2 element
    #sub 1 because last and last_last start as
    #elements 1 and 0, so the first pass gives n=2
    for j in range(i - 1):
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [22]:
fib_3(0)

Out[22]:
1
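Rather than checking indices one cell at a time, the edge cases and the ordinary cases can be compared against the known terms in a single pass. This is a small sketch of that idea; the `expected` list comes from the example sequence above, and `fib_3` is repeated in compact form so the block runs on its own.

```python
def fib_3(i):
    # deal with edge cases
    if i == 0 or i == 1:
        return 1
    last = 1       # our n - 1 element
    last_last = 1  # our n - 2 element
    for j in range(i - 1):
        current = last + last_last
        last_last = last
        last = current
    return current

expected = [1, 1, 2, 3, 5, 8]                        # terms at index 0..5
results = [fib_3(i) for i in range(len(expected))]   # includes i = 0 and i = 1

print(results == expected)
```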

## Deal with user error

Make your function easy to use by gracefully dealing with bad input conditions.

In [23]:
def fib_4(i):

    #check that i is valid
    if type(i) != int or i < 0:
        return None

    #deal with edge cases
    if i == 0 or i == 1:
        return 1

    last = 1 # our n - 1 element
    last_last = 1 # our n - 2 element
    #sub 1 because last and last_last start as
    #elements 1 and 0, so the first pass gives n=2
    for j in range(i - 1):
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [24]:
print(fib_4(-4))

None

In [25]:
print(fib_4('fdsa'))

None

In [26]:
fib_4(8)

Out[26]:
34
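Returning None is one way to signal bad input. Another common Python convention, shown here only as a possible alternative and not as the approach this lecture uses, is to raise an exception with a message. The function name `fib_strict` and the message strings are made up for this sketch.

```python
def fib_strict(i):
    '''Hypothetical variant of fib_4 that raises instead of returning None'''
    if type(i) is not int:
        raise TypeError('i must be an integer index')
    if i < 0:
        raise ValueError('i must be non-negative')
    last_last, last = 1, 1          # elements 0 and 1
    for _ in range(i):
        last_last, last = last, last_last + last
    return last_last                # element i

try:
    fib_strict(-4)
except ValueError as err:
    message = str(err)              # the caller sees why the input was rejected

print(message)
print(fib_strict(8))
```

The trade-off is that the caller must handle the exception, but a mistake can no longer pass silently as a None value.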

Dealing with user error is a fascinating topic and gets into user interface design, which is covered nicely in this introductory video from the web design community.

# How to convert a function into a numpy function

np.vectorize(fxn) will turn your function into a numpy version. You pass it your function (fxn) and it returns a new function which is yours but upgraded to work on numpy arrays.

In [27]:
import numpy as np
from math import sin

#Don't actually do this, this is an example. Just use np.sin instead.
my_np_sin = np.vectorize(sin) #<-- I'm turning the math sine into my own version

x = np.linspace(0,3,5)
print(my_np_sin(x))

[ 0.          0.68163876  0.99749499  0.7780732   0.14112001]

In [28]:
def my_distribution(x):
    '''my special snowflake distribution'''
    if x > 40:
        return 0.2
    else:
        return 0.8

x = np.linspace(0,100,10)
print(my_distribution(x))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-b2bbe32da974> in <module>()
7
8 x = np.linspace(0,100,10)
----> 9 print(my_distribution(x))

<ipython-input-28-b2bbe32da974> in my_distribution(x)
1 def my_distribution(x):
2     '''my special snowflake distribution'''
----> 3     if x > 40:
4         return 0.2
5     else:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
In [29]:
numpy_version_distribution = np.vectorize(my_distribution)
print(numpy_version_distribution(x))

[ 0.8  0.8  0.8  0.8  0.2  0.2  0.2  0.2  0.2  0.2]


What happened to the doc string, though? It is carried over to the new function, but help describes the np.vectorize wrapper rather than your function, so you read it with np.info instead.

In [30]:
np.info(numpy_version_distribution)

my special snowflake distribution


Using np.vectorize is not always necessary. It is usually needed only when the function contains loops or if statements; plain arithmetic already works element by element on arrays.

In [31]:
def foo(x):
    return 2 ** x
data = np.linspace(1,2,10)

print(foo(data))

[ 2.          2.16011948  2.33305808  2.5198421   2.72158     2.93946898
3.1748021   3.42897593  3.70349885  4.        ]
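For a single if/else like the one in my_distribution, an array expression is another option. The sketch below uses np.where, which is not covered in this lecture, to pick 0.2 or 0.8 for each element based on the condition.

```python
import numpy as np

x = np.linspace(0, 100, 10)

# np.where(condition, value_if_true, value_if_false) works element by element
values = np.where(x > 40, 0.2, 0.8)

print(values)
```

This gives the same array as the vectorized my_distribution, without calling a Python function once per element.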


## Revise

It's always good to revise your functions after considering each of these aspects. Let's say we want our Fibonacci function to let the user specify the starting point. That requires two values, the first and second elements of the sequence.

In [32]:
def fib_5(i, start=(1,1)):

    #check that i is valid
    if type(i) != int or i < 0:
        return None

    #deal with edge cases
    if i == 0:
        return start[0]
    if i == 1:
        return start[1]

    last = start[1] # our n - 1 element
    last_last = start[0] # our n - 2 element
    #sub 1 because last and last_last start as
    #elements 1 and 0, so the first pass gives n=2
    for j in range(i - 1):
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [33]:
fib_5(2, start=(1, 2))

Out[33]:
3
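A single call does not exercise the start argument very hard, so it helps to check several terms with a starting pair whose two values differ. The sketch below uses a compact helper with the hypothetical name `fib_from` and the starting pair (2, 1), which produces the Lucas numbers.

```python
def fib_from(i, start=(1, 1)):
    '''Hypothetical compact helper: ith element of a sequence that begins with start'''
    last_last, last = start          # elements 0 and 1
    for _ in range(i):
        last_last, last = last, last_last + last
    return last_last                 # element i

default_terms = [fib_from(i) for i in range(6)]
lucas_terms = [fib_from(i, start=(2, 1)) for i in range(6)]

print(default_terms)   # [1, 1, 2, 3, 5, 8]
print(lucas_terms)     # [2, 1, 3, 4, 7, 11]
```

Checking a few terms past index 2 matters here, because an error in which starting value is treated as the n - 1 element only shows up from index 3 onward.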

Now we'll put it all together with numpy and the doc string

In [34]:
def py_fib(i, start=(1,1)):
    '''Returns the nth element from the Fibonacci sequence

    Args:
        i: the index of the element in the Fibonacci sequence
        start: the first and second elements of the sequence. Default is 1,1

    returns: the element at the given index in the Fibonacci sequence
    '''
    #check that i is valid
    if type(i) != int or i < 0:
        return None

    #deal with edge cases
    if i == 0:
        return start[0]
    if i == 1:
        return start[1]

    last = start[1] # our n - 1 element
    last_last = start[0] # our n - 2 element
    #sub 1 because last and last_last start as
    #elements 1 and 0, so the first pass gives n=2
    for j in range(i - 1):
        current = last + last_last #the add
        last_last = last #our new n - 2
        last = current # our new n - 1
    return current

In [35]:
#you can specify that it returns integers
#(np.int was removed from recent NumPy versions; the built-in int works)
fib = np.vectorize(py_fib, otypes=[int])

In [36]:
i = np.arange(10)
fib(i)

Out[36]:
array([ 1,  1,  2,  3,  5,  8, 13, 21, 34, 55])
In [37]:
np.info(fib)

Returns the nth element from the Fibonacci sequence

Args:
i: the index of the element in the Fibonacci sequence
start: the first and second elements of the sequence. Default is 1,1

returns: the element at the given index in the Fibonacci sequence
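One thing to watch with the vectorized version is the start argument: np.vectorize tries to loop over every argument it is given, including a tuple. A possible way around this, sketched below under the assumption that start should be passed through unchanged, is the `excluded` parameter of np.vectorize. The helper `fib_seq` is a compact stand-in for py_fib and is not from the lecture.

```python
import numpy as np

def fib_seq(i, start=(1, 1)):
    '''Hypothetical compact stand-in for py_fib'''
    last_last, last = start          # elements 0 and 1
    for _ in range(i):
        last_last, last = last, last_last + last
    return last_last                 # element i

# excluded=['start'] tells np.vectorize to hand the tuple over as-is
fib_vec = np.vectorize(fib_seq, otypes=[int], excluded=['start'])

result = fib_vec(np.arange(6), start=(1, 2))
print(result)
```

Here only i is looped over, and every call receives the same (1, 2) starting pair.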