Last week, we learned how to run Python code, assign variables, and write control flow statements, which allows us to write programs that can do calculations. In fact, this is all you really need to write programs (apart from reading in and writing out data, which we will talk about later). However, if you try to write more complicated programs than those you wrote for Problem Set 1, they will quickly become very long and unreadable. So one very important rule in programming is to avoid repetition.
Let's say you want to write a program that checks if certain numbers are prime, and then you want to check later in the program if another set of values is prime. You could write:
for value in [3,5,6,8,9]:
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            prime = False
    if prime:
        print value, "is prime"
3 is prime
5 is prime
then:
for value in [34,12,27,21,23,13]:
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            prime = False
    if prime:
        print value, "is prime"
23 is prime
13 is prime
But this is inefficient, because you are duplicating the code that checks if a value is prime. This is why, if you ever need to do something twice or more, and it is a well-defined task, think about putting it in a function.
For example:
def is_prime(value):
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            prime = False
    return prime
Once the function is defined, you can then do:
for value in [3,5,6,8,9]:
    if is_prime(value):
        print value, "is prime"

for value in [34,12,27,21,23,13]:
    if is_prime(value):
        print value, "is prime"
3 is prime
5 is prime
23 is prime
13 is prime
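One caveat with is_prime as written: for 0 and 1 the inner loop never runs, so the function returns True. The following is a minimal sketch of a stricter variant, not a replacement for the version in these notes. The name is_prime_strict is hypothetical, and returning as soon as a divisor is found is a choice made here for illustration.

```python
def is_prime_strict(value):
    # By definition, primes are integers greater than 1
    if value < 2:
        return False
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            # A divisor was found, so there is no need to keep looping
            return False
    return True

# Collect the primes below 10 using the stricter check
small_primes = []
for value in range(10):
    if is_prime_strict(value):
        small_primes.append(value)
```

Because the check lives in one function, a fix like this only has to be made in one place rather than in every loop that tests for primes.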
The syntax for a function is:
def function_name(arguments):
    # code here
    return values
Functions are the building blocks of programs - think of them as basic units that are given a certain input and accomplish a certain task. Over time, you can build up more complex programs while preserving readability.
As with if statements and for and while loops, indentation is very important because it shows where the function starts and ends.
Note: it is a common convention to always use lowercase names for functions.
A function can take multiple arguments...
def add(a, b):
    return a + b

print add(1,3)
print add(1.,3.2)
print add(4,3.)
4
4.2
7.0
... and can also return multiple values:
def double_and_halve(value):
    return value * 2., value / 2.

print double_and_halve(5.)
(10.0, 2.5)
If multiple values are returned, you can store them in separate variables.
d, h = double_and_halve(5.)
print d
10.0
print h
2.5
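The parentheses in the printed result above are a hint about what is happening: the "multiple values" are packed into a single tuple, and assigning to d, h unpacks it. A small sketch of this, reusing double_and_halve from above (the variable names result, doubled, and halved are chosen here for illustration):

```python
def double_and_halve(value):
    return value * 2., value / 2.

# Without unpacking, the function hands back one tuple
result = double_and_halve(5.)
is_tuple = isinstance(result, tuple)

# The elements can be accessed by position...
doubled = result[0]
halved = result[1]

# ...or unpacked directly, as in the notes
d, h = double_and_halve(5.)
```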
Functions can call other functions:
def is_divisible_by(value, other_value):
    return value % other_value == 0

def is_prime(value):
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if is_divisible_by(value, i):
            prime = False
    return prime

for value in range(2, 30):
    if is_prime(value):
        print value, "is prime"
2 is prime
3 is prime
5 is prime
7 is prime
11 is prime
13 is prime
17 is prime
19 is prime
23 is prime
29 is prime
Just because you can put code in functions doesn't mean you always should. It's best to try and break up the code into units that make sense - in the end, your function should ideally have a name that everyone can understand.
If we take the example from above, the following, in my opinion, is not as good even though the loop that calls it is one line shorter:
def is_prime(value):
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            prime = False
    if prime:
        print value, "is prime"

for value in [3,5,6,8,9]:
    is_prime(value)
3 is prime
5 is prime
The issue is that to me, is_prime means - just from the name - that it will return True or False depending on whether the value passed is prime, and doesn't say anything about printing. If you wanted to do this, the function should ideally be called print_message_if_prime or something similar. So don't define functions based on just making the shortest possible program, but also take into account that functions should be basic units that make sense conceptually.
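One way to keep each function a single conceptual unit is to separate computing a result from printing it. The sketch below is only one possible arrangement: find_primes is a hypothetical helper name, and is_prime is the version from earlier in these notes. The print call is written with parentheses and a single string so that it also runs under Python 3.

```python
def is_prime(value):
    # Same check as earlier in the notes: returns True or False, prints nothing
    prime = True
    for i in range(2, int(value ** 0.5) + 1):
        if value % i == 0:
            prime = False
    return prime

def find_primes(values):
    # Only collects results; the caller decides what to do with them
    primes = []
    for value in values:
        if is_prime(value):
            primes.append(value)
    return primes

primes = find_primes([3, 5, 6, 8, 9])
for value in primes:
    print("%d is prime" % value)
```

Here each name says what the function does, and the list returned by find_primes can be reused for something other than printing.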
Some of you may have already noticed that there are a few functions that are defined by default in Python:
x = [1,3,6,8,3]
len(x)
5
sum(x)
21
int(1.2)
1
A full list of built-in functions is available in the Python documentation. Note that there are not that many - these are only the most common functions. Most functions are in fact kept inside modules, which we will now cover.
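A few more built-in functions that need no import, applied to the same list as above. This is just a small sample for illustration; the variable names are chosen here.

```python
x = [1, 3, 6, 8, 3]

smallest = min(x)      # smallest element
largest = max(x)       # largest element
ordered = sorted(x)    # a new sorted list; x itself is left unchanged
magnitude = abs(-2.5)  # absolute value
```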
One of the strengths of Python is that there are many built-in add-ons - or modules - which contain existing functions, classes, and variables which allow you to do complex tasks in only a few lines of code. In addition, there are many other third-party modules (e.g. Numpy, Scipy, Matplotlib) that can be installed, and you can also develop your own modules that include functionalities you commonly use.
The built-in modules are referred to as the Standard Library, and you can find a full list of the available functionality in the Python Documentation.
To use modules in your Python session or script, you need to import them. The following example shows how to import the built-in math module, which contains a number of useful mathematical functions:
import math
You can then access functions and other objects in the module with math.<function>, for example:
math.sin(2.3)
0.7457052121767203
math.factorial(20)
2432902008176640000
math.pi
3.141592653589793
Because these modules exist, if what you want to do is very common, it has probably already been written, and you won't need to write it yourself (making your code easier to read).
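To make the point concrete, here is a sketch comparing a hand-written factorial with the one from the math module. The function factorial_by_hand is a hypothetical name written for this comparison only.

```python
import math

def factorial_by_hand(n):
    # Multiply 2 * 3 * ... * n together
    result = 1
    for i in range(2, n + 1):
        result = result * i
    return result

# Both approaches give the same answer, but the second needs no extra code
by_hand = factorial_by_hand(20)
from_module = math.factorial(20)
same = by_hand == from_module
```

The module version is shorter, and anyone reading the program already knows what math.factorial does.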
For example, the numpy module, which we will talk about next week, contains useful functions for finding e.g. the mean, median, and standard deviation of a sequence of numbers:
import numpy as np
li = [1,2,7,3,1,3]
np.mean(li)
2.8333333333333335
np.median(li)
2.5
np.std(li)
2.0344259359556172
Notice that in the above case, we used import numpy as np instead of import numpy. This renames the module within the program so that it's not as long to type.
Finally, it's also possible to simply import the functions needed directly:
from math import sin, cos
sin(3.4)
cos(3.4)
-0.9667981925794611
You may find examples on the internet that use e.g. from module import *, but this is not recommended, because it makes programs difficult to debug: common debugging tools that rely on just looking at the source will not know which functions are being imported (more on this later).
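Another practical risk of importing everything is that two modules can define the same name, and the later import silently replaces the earlier one. The sketch below does not use a star import at all; it simply lists the public names that the standard math and cmath modules have in common, to show how easily such a clash could happen. The variable names are chosen here for illustration.

```python
import math
import cmath

# Public names defined by both modules
shared = []
for name in dir(math):
    if not name.startswith("_") and name in dir(cmath):
        shared.append(name)

# With explicit module prefixes there is no ambiguity about which sqrt is used
real_root = math.sqrt(4)      # a float
complex_root = cmath.sqrt(4)  # a complex number
```

If both modules were imported with a star, a bare sqrt would refer to whichever module happened to be imported last.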
How do you know which modules exist in the first place? The Python documentation contains a list of modules in the Standard Library, but you can also simply search the web (Google is your friend!). Once you have a module that you think should contain the right kind of function, you can either look at the documentation for that module, or you can use the tab-completion in IPython:
In [2]: math.<TAB>
math.acos math.degrees math.fsum math.pi
math.acosh math.e math.gamma math.pow
math.asin math.erf math.hypot math.radians
math.asinh math.erfc math.isinf math.sin
math.atan math.exp math.isnan math.sinh
math.atan2 math.expm1 math.ldexp math.sqrt
math.atanh math.fabs math.lgamma math.tan
math.ceil math.factorial math.log math.tanh
math.copysign math.floor math.log10 math.trunc
math.cos math.fmod math.log1p
math.cosh math.frexp math.modf
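Outside IPython, a similar listing can be obtained with the built-in dir function, which returns the names defined in a module as a list of strings. A minimal sketch that filters the list for logarithm-related names (the filter on "log" is just an example):

```python
import math

# Keep only the names in the math module that contain "log"
log_functions = []
for name in dir(math):
    if "log" in name:
        log_functions.append(name)
```

Once a promising name has been found, calling the built-in help on it, e.g. help(math.log10), shows its documentation.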
Consider the following example:
a = 1

def show_var():
    print a, b

b = 2
show_var()
1 2
In this case, the function knows about the variables defined outside the function.
Now consider the following example:
a = 1

def show_var():
    print a
    a = 2

show_var()
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-27-e84ac8cb6bcc> in <module>()
      5     a = 2
      6 
----> 7 show_var()

<ipython-input-27-e84ac8cb6bcc> in show_var()
      2 
      3 def show_var():
----> 4     print a
      5     a = 2
      6 

UnboundLocalError: local variable 'a' referenced before assignment
What happened? To understand this behavior, we have to talk about variable scope.
Variables assigned anywhere inside a function are part of the local scope of the function. Any variable in the local scope takes precedence over any other variable of the same name throughout the function, even on lines that come before the assignment:
def show_var():
    print a
    a = 2
In this case, a is defined inside the function and so it doesn't matter if a is used anywhere else in the Python code. The above function will therefore not work because a is used before it is defined.
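The sketch below reproduces the error in a form that can be caught, and shows one possible fix: passing the value in as an argument so that the function no longer depends on where a is defined. The names show_var_broken and show_var_fixed are hypothetical, and the functions return the value instead of printing it so that the result is easy to check.

```python
a = 1

def show_var_broken():
    # 'a' is assigned further down, so Python treats it as local
    # throughout the function, including on this line
    value = a
    a = 2
    return value

def show_var_fixed(a):
    # 'a' is now an argument, so it already has a value when it is read
    value = a
    a = 2
    return value

try:
    show_var_broken()
    raised = False
except UnboundLocalError:
    raised = True

fixed_result = show_var_fixed(a)
```

Note that assigning a = 2 inside show_var_fixed only changes the local a; the variable defined at the top level still has the value 1 afterwards.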
Variables defined at the top level of a script are in the global scope. If a variable is not defined inside a function, then variables from the global scope are used.
a = 1

def show_var():
    print a
In this case, a gets used from the global scope.
This is very useful because it means that modules don't have to be imported inside functions; you can import them at the top level:
import numpy as np

def normalize(x):
    return x / np.mean(x)
This works because modules are objects in the same sense as any other variable.
In practice, however, you should avoid writing functions that rely on ordinary global variables, such as:
a = 1

def show_var():
    print a
because it makes the code harder to read. The exceptions are modules, and variables that remain constant during the execution of the program.
We just touched on the idea of constants being used in functions - but Python does not really have constants, so how do we recognize these? We now need to speak about coding style.
There is a set of style guidelines referred to as PEP8, published as one of the Python Enhancement Proposals. These guidelines are not compulsory, but you should follow them as much as possible, especially when you have to work with other people or need other people to read your code.
You don't need to read the guidelines now, but I will first give a couple of examples, then I will show you a tool that can help you follow the guidelines. The following example does not follow the style guidelines:
pi = 3.1415926

def CalculateValues(x):
    return x*pi
Constants should be made uppercase, and function names should be lower case separated by underscores (the so-called camel-case used above is reserved for classes).
This is the correct way to write the code:
PI = 3.1415926

def calculate_values(x):
    return x * PI
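The same conventions applied to another small case, as a sketch: an uppercase name marks a value that is meant to stay constant, which is the kind of global variable that is reasonable to use inside a function. SECONDS_PER_DAY and days_to_seconds are hypothetical names chosen for illustration.

```python
# Uppercase signals that this value should not be reassigned elsewhere
SECONDS_PER_DAY = 24 * 60 * 60

def days_to_seconds(days):
    # Reads the constant from the global scope; nothing is modified
    return days * SECONDS_PER_DAY

week_in_seconds = days_to_seconds(7)
```

Python does not enforce this; the uppercase name is only a signal to the reader.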
Other guidelines include always using 4 spaces for indentation. In practice, you can check your code with the pep8.py script. Download the script to the folder where you are writing code, and do:
python pep8.py my_script.py
where my_script.py is the script you want to check. For example, you might see:
my_script.py:2:1: W191 indentation contains tabs
The errors include the line number. In general, try and make sure that your scripts do not return any errors!