Why Python?

  • Concise, intuitive programming language.
  • Ability to play with data and computation ideas.
    • Data persistence
    • Matrix, vector structures and operations are easy
    • Interpreter, using same language as programs.
    • Functional and object-oriented programming style
  • Rich, easy-to-use visualization
  • Fast computation (see the Python Speed page and search it for "speed")
  • Large community of users and developers
  • It is FREE for all platforms, even Raspberry Pi and Android phones

What is Python?

  • an open-source language and environment for computing and graphics
  • started in 1989 by Guido van Rossum in the Netherlands
  • is a multi-paradigm programming language
  • has dynamic typing
  • has garbage collection
  • easily extensible in C, C++ and Fortran
  • available for Unix, Windows, and MacOS systems
  • is the language of choice for many researchers in AI
  • jobs! (search for python on Indeed)

Installing and Running Python

Python is installed on our department's systems.

Download and install python on your own computer by following instructions at http://www.python.org or install the Anaconda distribution.

On our systems, enter the ipython interactive environment with

ipython

To quit, type control-d

To run python code in file code.py, either type

run code.py

in ipython, or type

python code.py

at the unix command prompt.

When in ipython, you may type python statements or expressions, which are evaluated, or ipython commands. See the Video tutorial on using ipython, in five parts by Jeff Rush, for help getting started with ipython.

Here we will use the jupyter notebook to provide examples that you can download and run.

Assigning values to variables

In [ ]:
x = 33
y = 432.1
x + y
In [ ]:
who
In [ ]:
whos
In [ ]:
me = 'chuck'
you = 'jill'
In [ ]:
whos
In [ ]:
me + you
In [ ]:
me + ' ' + you

Defining and using functions

In [ ]:
def add(x, increment=1):
    '''This is a very powerful addition function.
    Usage:
      >>> add(10)
      11
      >>> add(10, 3)
      13'''
    return x + increment    
In [ ]:
add?
In [ ]:
add(30)
In [ ]:
add(30, 100)
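
The add? shortcut only works inside ipython. As a small sketch of the plain-python equivalents, the docstring can be read from the function's __doc__ attribute (or with help(add)), and an argument that has a default can be passed by name. The function is repeated here, with a shortened docstring, only so the example runs on its own.

```python
def add(x, increment=1):
    '''Return x plus increment (increment defaults to 1).'''
    return x + increment

# Arguments with defaults can be passed by name.
print(add(30, increment=100))   # 130

# The text shown by add? in ipython is stored on the function itself.
print(add.__doc__)
```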

Random search for polynomial roots, using python

Let's say you need to know, approximately, the roots of $x^3 - 4x^2 -11x + 30$. What are they?

You have a fast computer at your disposal, and python is installed on it. What do you do?

Define a function named f that takes one argument x and returns the value of the above polynomial.

In [ ]:
def f(x):
    return x**3 - 4 * x**2 - 11 * x + 30
In [ ]:
f(2.3)

Now we need to figure out how to generate a random number, say between -10 and 10, in python. Try searching the net for python random number. You soon discover the random module.

Modules in python are defined by python source files. The random module is defined in a file named random.py. Where might it be on our system? Try

locate random.py

On my computer I see it is in /usr/lib64/python3.5. Take a look at the file contents.

To use this module, you must first import it.

import random

This interprets the contents of the file, defining variables, functions, and classes for your use during the python session.

On the random module web page we found, you can read through the available functions, such as uniform(a,b) which generates a pseudo-random number between a and b. Call it like

randomNumber = random.uniform(-10, 10)

because uniform lives in the module name's namespace.

Now we need a way to loop a number of times, each time generating a new pseudo-random number and testing it as the argument to our polynomial function f. We can use a for loop, like

In [ ]:
for i in range(5):
    print(i)

range(5) produces a lazy sequence object (not a generator, and not a list) representing 0, 1, 2, 3, 4, so the same loop could be written

In [ ]:
for i in [0, 1, 2, 3, 4]:
    print(i)
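
A quick way to see what range(5) really is: the sketch below prints the object itself, converts it to a list, and shows that it supports len and indexing without ever building the list.

```python
r = range(5)

print(r)              # range(0, 5)
print(list(r))        # [0, 1, 2, 3, 4]
print(len(r), r[2])   # 5 2

# Unlike a generator, a range can be iterated more than once.
print(sum(r), sum(r))  # 10 10
```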

Now, we are ready to try a bunch of random argument values.

In [ ]:
import random
In [ ]:
for i in range(5):
    x = random.uniform(-10, 10)
    fx = f(x)
    print(x, fx)

Hmm...we need to try many more $x$ values to find some for which $f(x)\approx 0$. And we must keep track of the $x$ value for which $|f(x)|$ is the closest to zero. We could use two variables for this, one for the best $x$ so far and one for its $f(x)$, but let's do this in a more pythonic way...as a pair, or a tuple, like

bestSoFar = (x,fx)

The elements of a tuple cannot be modified, but they can be accessed by

bestSoFar[0]  # x
bestSoFar[1]  # f(x)
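
Here is a minimal sketch of what "cannot be modified" means in practice. The pair used is just an illustrative value, $x = 2.5$ with $f(2.5) = -6.875$ for the polynomial above.

```python
bestSoFar = (2.5, -6.875)

# Unpacking assigns each element to its own name.
x, fx = bestSoFar
print(x, fx)

# Tuples are immutable, so item assignment raises TypeError.
try:
    bestSoFar[0] = 3.0
except TypeError as err:
    print('TypeError:', err)

# To "update" the pair, build a new tuple and rebind the name.
bestSoFar = (3.0, -12.0)
print(bestSoFar)
```

This is why the search code later replaces the whole pair rather than changing one element of it.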

Lists can also be accessed this way. Multiple elements can be accessed with slices.

In [ ]:
nums = [0, 1, 2, 3, 4, 5]
nums
In [ ]:
nums[2:4]
In [ ]:
nums[:3]
In [ ]:
nums[2:]
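
Slices also accept negative indices and a step. A few more forms, as a quick sketch on the same list:

```python
nums = [0, 1, 2, 3, 4, 5]

print(nums[-1])    # 5, negative indices count from the end
print(nums[-2:])   # [4, 5], the last two elements
print(nums[::2])   # [0, 2, 4], every second element
print(nums[::-1])  # [5, 4, 3, 2, 1, 0], a reversed copy
```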

Okay, so now we have x, fx, and bestSoFar. How do we test to see if we have a new best result and then update bestSoFar?

Develop your python thoughts in small steps, testing each piece.

In [ ]:
firstx = 0
bestSoFar = (firstx, f(firstx))
bestSoFar
In [ ]:
newx = 3
fx = f(newx)
fx

Which $x$ is better: the one with $f(x) = 30$ or the one with $f(x) = -12$?

In [ ]:
abs(fx) < abs(bestSoFar[1])

So, we can update bestSoFar using

if abs(fx) < abs(bestSoFar[1]):
    bestSoFar = (x, fx)

It's just that simple.

Now, putting the pieces together, here is the complete python code to find, from a set of 100 random values, the value of $x$ for which $f(x)$ is closest to zero.

In [ ]:
import random

def f(x):
    return x**3 - 4*x**2 - 11*x + 30

bestSoFar = (0, f(0))

for step in range(100):
    x = random.uniform(-10, 10)
    fx = f(x)
    if (abs(fx) < abs(bestSoFar[1])):
        bestSoFar = (x, fx)
In [ ]:
bestSoFar

Not such a great root value. $f(x)$ is not very close to zero.

Let's try 10000 random numbers.

In [ ]:
bestSoFar = (0, f(0))

for step in range(10000):
    x = random.uniform(-10, 10)
    fx = f(x)
    if (abs(fx) < abs(bestSoFar[1])):
        bestSoFar = (x, fx)
In [ ]:
bestSoFar

That's better. Try again with more numbers? But this is getting tedious. Let's write a function to do this that takes one argument as the number of values of $x$ to try.

Also, let's include an argument that accepts the function for which we want to find a root. How do you pass a function into another function as an argument?

We will do this in class. First, download this jupyter notebook. Then, run

 jupyter notebook

open this notebook, then scroll down to this cell and define and test the function.

In [ ]:
def findRoot(f, xmin, xmax, n=1000):
    bestSoFar = (0, f(0))
    for step in range(n):
        x = random.uniform(xmin, xmax)
        fx = f(x)
        if (abs(fx) < abs(bestSoFar[1])):
            bestSoFar = (x, fx)
    return bestSoFar
In [ ]:
findRoot(f, -10, 10, 100000)
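
A single call returns only one approximate root, but a cubic can have up to three real roots. One possible way to look for all of them, sketched here, is to call findRoot on several narrower intervals. The three intervals below are assumptions, picked because $f$ changes sign inside each one, and the fixed seed is only there to make the run repeatable. The functions are repeated so the example runs on its own.

```python
import random

def f(x):
    return x**3 - 4*x**2 - 11*x + 30

def findRoot(f, xmin, xmax, n=1000):
    # Same function as in the cell above, repeated so this example runs alone.
    bestSoFar = (0, f(0))
    for step in range(n):
        x = random.uniform(xmin, xmax)
        fx = f(x)
        if abs(fx) < abs(bestSoFar[1]):
            bestSoFar = (x, fx)
    return bestSoFar

random.seed(0)  # assumption: fixed seed, only to make the run repeatable

# Assumed intervals, each chosen to contain one sign change of f.
intervals = [(-4, -2), (1, 3), (4, 6)]
roots = [findRoot(f, lo, hi, 10000) for lo, hi in intervals]
for x, fx in roots:
    print(round(x, 2), fx)
```

Narrowing the interval also makes each random sample more useful, since none are wasted far from the root.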

Plotting

Now for some fun. Let's plot the progress of our search by collecting the bestSoFar values every time they are updated. How do we plot in python? Again, you can search the net. To speed things up, here is an example.

import matplotlib.pyplot as plt
plt.ion()  #interactive plotting on
plt.plot(range(10), 'o-')

In this jupyter notebook, we will specify the plots to be inline, rather than calling plt.ion().

In [ ]:
import matplotlib.pyplot as plt
%matplotlib inline
In [ ]:
plt.plot(range(10), 'o-')

Ta-da! A plot! The x-axis values were assumed to be $0,\ldots,n-1$ where $n$ is the number of values given as the first argument.

Here is another example. Let's plot the values of the sine function, using the sin function defined in the numpy module. We will also make use of numpy's linspace function to generate some $x$ values.

In [ ]:
import numpy as np

xs = np.linspace(-10,10,100)
plt.plot(xs,np.sin(xs))

If we were working from inside the ipython environment, this would add the sine curve to the current plot. We could first clear the figure by doing

plt.clf()

Now, to collect the bestSoFar values. Let's just build up a list called trace by adding the new value of bestSoFar whenever it is changed. We can use the list append method for this.

trace.append(bestSoFar)

assuming trace was initialized.

Here is a function for this. Let's put it with the import statements in a python source file, named findRoot.py. Notice that the function returns two things, the bestSoFar pair, and the trace, as a tuple.

In [ ]:
import random
import matplotlib.pyplot as plt

def findRoot(f, xmin, xmax, maxSteps=1000):
    bestSoFar = (0, f(0))
    trace = [bestSoFar]
    for step in range(maxSteps):
        x = random.uniform(xmin, xmax)
        fx = f(x)
        if (abs(fx) < abs(bestSoFar[1])):
            bestSoFar = (x,fx)
            trace.append(bestSoFar)
    return (bestSoFar, trace)
In [ ]:
findRoot(f, -10, 10, 100)

Now we see this function returned the best guess at the root, but it also returned the sequence of best guesses so far.

From this notebook, we can write this code into a file named findRoot.py.

In [ ]:
%%writefile findRoot.py
import random
import matplotlib.pyplot as plt

def findRoot(f, xmin, xmax, maxSteps=1000):
    bestSoFar = (0, f(0))
    trace = [bestSoFar]
    for step in range(maxSteps):
        x = random.uniform(xmin, xmax)
        fx = f(x)
        if (abs(fx) < abs(bestSoFar[1])):
            bestSoFar = (x,fx)
            trace.append(bestSoFar)
    return (bestSoFar, trace)

This file can be loaded into ipython and used as shown here.

In [ ]:
import findRoot
In [ ]:
def f(x):
    return x**3 - 4 * x**2 - 11 * x + 30
In [ ]:
findRoot.findRoot(f, -10, 10, 1000)

What we want to plot is the second element, $f(x)$, of each pair in the trace list, and the trace list is the second item returned by findRoot.

Hmm...here is a chance to show the beauty of list comprehensions. Remember set notation? What is the set $\{x^2 | x \in \mathbb{N}, x < 10\}$? List comprehensions mimic this notation. The same collection, as a python list, is

In [ ]:
[x**2 for x in range(10)]
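
To make the correspondence concrete, here is a short sketch of the same comprehension written as an explicit loop, followed by a version with a condition and a version that builds an actual set. The names squares and evenSquares are just illustrative.

```python
# The comprehension written as an explicit loop.
squares = []
for x in range(10):
    squares.append(x**2)
print(squares)

# A condition can be added, like an extra constraint in set notation.
evenSquares = [x**2 for x in range(10) if x % 2 == 0]
print(evenSquares)   # [0, 4, 16, 36, 64]

# Curly braces build an actual set instead of a list.
print({x**2 for x in range(10)} == set(squares))   # True
```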

Now, if we assign the result of findRoot, like

In [ ]:
result = findRoot.findRoot(f, -10, 10, 10000)

we can collect just the best $f(x)$ values by

In [ ]:
values = [a[1] for a in result[1]]
values

and these can be plotted like

In [ ]:
plt.plot(values, 'o-')
plt.plot([0, len(values)-1], [0, 0], 'r--')

We computer scientists are all about testing our code, right? So, for every module (meaning every python source file) you write, you should include some testing code. An easy way to do this is to add some statements at the end of the file that call the functions defined above. We don't want these called every time you import your file, though. Only when it is run, by doing

run findRoot

in ipython, or by doing

python findRoot.py

from the unix command line. When your code is run by either of these, the variable __name__ has the string value '__main__', so the testing code can be in the true block of an if statement that checks this. The whole file is now

In [ ]:
%%writefile findRoot.py
import random
import matplotlib.pyplot as plt

def findRoot(f, xmin, xmax, maxSteps=1000):
    bestSoFar = (0, f(0))
    trace = [bestSoFar]
    for step in range(maxSteps):
        x = random.uniform(xmin, xmax)
        fx = f(x)
        if (abs(fx) < abs(bestSoFar[1])):
            bestSoFar = (x, fx)
            trace.append(bestSoFar)
    return (bestSoFar, trace)

if __name__ == '__main__':
    
    def f(x):
        return x**3 - 4*x**2 - 11*x + 30
    
    result = findRoot(f, -10, 10, 10000)
    print(result)
    values = [bests[1] for bests in result[1]]
    plt.plot(values,'o-')
    plt.plot([0, len(values)-1], [0, 0], 'r--')
In [ ]:
import findRoot
In [ ]:
run findRoot

Dictionaries

Python includes a very efficient implementation of associative maps, which are called dictionaries in python. Each entry has a key and a value. Keys must be hashable, which in practice means immutable objects such as strings, numbers, and tuples.
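
As a quick sketch of the restriction on keys, a tuple works as a key while a list does not. The dictionary labels and its contents are made up for this illustration.

```python
# Tuples are immutable and hashable, so they work as keys.
labels = {(0, 0): 'origin', (1, 0): 'east'}
print(labels[(1, 0)])   # east

# Lists are mutable and unhashable, so using one as a key raises TypeError.
try:
    labels[[0, 1]] = 'north'
except TypeError as err:
    print('TypeError:', err)
```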

Here is a dictionary for associating grades with people. The key is a string.

In [ ]:
grades = {'Jim' : 88.2, 'Kim': 93, 'Slim': 75.2}
grades
In [ ]:
grades['Jim']
In [ ]:
grades['Kim'] = 94.2
In [ ]:
grades
In [ ]:
grades['Wim'] = 52
In [ ]:
grades
In [ ]:
grades['Nim']
In [ ]:
for k, v in grades.items():
    print(k,v)
In [ ]:
grades.get('Nim', 'grade missing')
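
Indexing with a missing key, as in grades['Nim'], raises a KeyError. The sketch below collects the usual ways to deal with that, using the same grades dictionary (repeated so the example runs on its own).

```python
grades = {'Jim': 88.2, 'Kim': 94.2, 'Slim': 75.2, 'Wim': 52}

# Membership tests look at the keys.
print('Nim' in grades)   # False

# Indexing a missing key raises KeyError, which can be caught.
try:
    grades['Nim']
except KeyError as err:
    print('No grade for', err)

# get never raises; it returns None unless a default is given.
print(grades.get('Nim'))                    # None
print(grades.get('Jim', 'grade missing'))   # 88.2
```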

Pass by reference versus pass by value

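Every variable in python is a reference to an object. What differs between types is whether the object is mutable (lists, dictionaries) or immutable (ints, floats, strings, tuples). So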

In [ ]:
x = [1, 2]
x
In [ ]:
y = x
y
In [ ]:
y.append(3)
y
In [ ]:
x

Careful, though. Concatenating two lists with + creates a new list, and the assignment rebinds y to it.

In [ ]:
x = [1, 2]
x
In [ ]:
y = x
y
In [ ]:
y = y + [3]
y
In [ ]:
x
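
The is operator tests whether two names refer to the same object, which makes the difference between the two experiments visible. A minimal sketch:

```python
x = [1, 2]
y = x
print(y is x)     # True: two names, one list

y.append(3)       # mutates the shared list
print(x)          # [1, 2, 3]

y = y + [4]       # builds a new list and rebinds y
print(y is x)     # False
print(x, y)       # [1, 2, 3] [1, 2, 3, 4]
```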

With immutable types such as ints, an operation like += always creates a new object and rebinds the name, so the result is more what you would expect.

In [ ]:
x = 42
x
In [ ]:
y = x
y
In [ ]:
y += 1
y
In [ ]:
x

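Therefore, arguments are always passed to functions as references to objects. A function can change the caller's object only by mutating it in place; assigning to the parameter just rebinds the local name, and immutable objects such as ints and strings cannot be mutated at all.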

In [ ]:
def changeInt(x):
    x = x + 2
    print('In changeInt', x)

def changeStr(s):
    s = s + ' you'
    print('In changeStr', s)

def changeList(lst):
    lst = lst + [23, 52.0]
    print('In changeList', lst)

def changeList2(lst):
    lst.append([23, 52.0])
    print('In changeList2', lst)

def changeDict(dict):
    dict['newone'] = 42
    print('In changeDict', dict)
In [ ]:
num = 33
s = "hello"
lst = [1, 2, 3]
dict = {'a':34, 'b':22}
In [ ]:
print(num)
changeInt(num)
print(num)
print()

print(s)
changeStr(s)
print(s)
print()

print(lst)
changeList(lst)
print(lst)
print()

print(lst)
changeList2(lst)
print(lst)
print()

print(dict)
changeDict(dict)
print(dict)
print()

Writing and debugging a module

Say you are editing some python code in a file named search.py and you write some test code in a file named testSearch.py. You want to run testSearch.py or use control-c control-c in emacs to repeatedly run the test code after editing search.py. You must force python to reload the search module by starting testSearch.py with

import search
import importlib
importlib.reload(search)
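
To see why a reload is needed, here is a self-contained sketch using importlib.reload from the standard library. It writes a throwaway module to a temporary directory, imports it, edits it, and imports it again. The module name search_demo and its VERSION variable are hypothetical, made up for this demonstration.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True   # avoid a cached .pyc hiding the quick edit below

# Write a small module to a temporary directory and make it importable.
tmpdir = tempfile.mkdtemp()
modfile = pathlib.Path(tmpdir) / 'search_demo.py'
modfile.write_text('VERSION = 1\n')
sys.path.insert(0, tmpdir)

import search_demo
print(search_demo.VERSION)       # 1

# Simulate editing the file, then importing again.
modfile.write_text('VERSION = 20\n')
import search_demo               # no effect: the module is already loaded
print(search_demo.VERSION)       # still 1

importlib.reload(search_demo)    # re-executes the edited source
print(search_demo.VERSION)       # 20
```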

Confusing scope of variables

In most languages, since

In [ ]:
x = 20
def addTen():
    print(x + 10)
In [ ]:
addTen()

works, you would expect

In [ ]:
x = 20
def addTen():
    x = x + 10

to work, too, but

In [ ]:
addTen()

What's going on?

The assignment to x in the second function forces x to be local to the function, therefore its use on the right-hand side is referencing a local variable that does not have a value yet. The first version of the function does not assign to x so uses the reference to x defined outside the function. If you want to change x in the outer environment, then you must set its value using a returned result, or add a global statement.

In [ ]:
def addTen():
    return x + 10
In [ ]:
x
In [ ]:
x = addTen()
x

Watch out! This still messes me up from time to time.

Passing by reference is great, except when it's not

Data structures like lists are easily changed in a function. Here is one.

In [ ]:
def removeLast(stuff):
    return stuff.pop()
In [ ]:
nums = range(10)
In [ ]:
nums
In [ ]:
nums = list(range(10))
In [ ]:
nums
In [ ]:
removeLast(nums)
In [ ]:
removeLast(nums)
In [ ]:
removeLast(nums)
In [ ]:
nums

Now let's try appending elements.

In [ ]:
def addToEnd(stuff, item):
    stuff.append(item)
In [ ]:
addToEnd(nums, 10)
In [ ]:
nums

Okay. Let's add more than one element.

In [ ]:
def addToEnd(stuff, items):
    stuff = stuff + items  # builds a new list and rebinds the local name stuff
In [ ]:
addToEnd(nums, [20, 21])
In [ ]:
nums

Rats! What's going on?

The + operator on lists creates a new list rather than modifying the old list. And the reference to the new list is assigned to the local variable stuff. So, stuff first referred to the nums list which was used on the right hand side of the assignment, but then stuff was reassigned to refer to the new list. What do you do if you want to change nums?

In [ ]:
def addToEnd(stuff, items):
    return stuff + items
In [ ]:
addToEnd(nums, [20, 21])

Oops. Oh yeah . . .

In [ ]:
nums = addToEnd(nums, [20, 21])
In [ ]:
nums
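
Another option is to modify the caller's list in place instead of returning a new one. A sketch using the list extend method follows; for lists, the += operator behaves the same way as extend, unlike stuff = stuff + items. The name addToEnd2 and the starting list are only for illustration.

```python
def addToEnd(stuff, items):
    stuff.extend(items)   # modifies the caller's list in place

nums = [0, 1, 2]
addToEnd(nums, [20, 21])
print(nums)               # [0, 1, 2, 20, 21]

def addToEnd2(stuff, items):
    stuff += items        # for lists, += also extends in place

addToEnd2(nums, [30])
print(nums)               # [0, 1, 2, 20, 21, 30]
```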

Let's say we use a list to represent the state of a search problem. We write a function to modify the state in several ways, to represent the successors of that state. As an example, let's just add 1 to an element.

In [ ]:
def addOne(state, index):
    state[index] = state[index] + 1  # or state[index] += 1
In [ ]:
state = [1, 2, 3]
state
In [ ]:
addOne(state, 0)
In [ ]:
state

That works. But now state does not have its original value.

In [ ]:
addOne(state, 1)
In [ ]:
state

If you want to keep it, you can always store the original state in a separate variable, right?

In [ ]:
origState = [1, 2, 3]
In [ ]:
state = origState
In [ ]:
addOne(state, 0)
In [ ]:
state
In [ ]:
origState

Doh! origState and state are references to the same object. The solution is to use the copy function from the copy module. (See the documentation.)

In [ ]:
import copy
In [ ]:
origState = [1, 2, 3]
state = copy.copy(origState)
addOne(state, 0)
In [ ]:
state
In [ ]:
origState
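
One caution: copy.copy makes a shallow copy, which is enough for a flat list of numbers like this state. If the state were a list of lists, the inner lists would still be shared, and copy.deepcopy would be needed. A minimal sketch, with a made-up nested list:

```python
import copy

orig = [[1, 2], [3, 4]]

shallow = copy.copy(orig)      # new outer list, same inner lists
shallow[0][0] = 99
print(orig)                    # [[99, 2], [3, 4]]

orig = [[1, 2], [3, 4]]
deep = copy.deepcopy(orig)     # inner lists are copied too
deep[0][0] = 99
print(orig)                    # [[1, 2], [3, 4]]
```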

Now, let's get fancy and use a list comprehension to create a list of all ways of adding 1 to the elements of the state.

In [ ]:
[addOne(copy.copy(state), i) for i in range(len(state))]

Arrgh! What happened? None was returned by addOne. Why? Because we didn't return anything!

Try again.

In [ ]:
def addOne(state, index):
    state[index] += 1
    return state
In [ ]:
state = [1, 2, 3]
In [ ]:
[addOne(copy.copy(state), i) for i in range(len(state))]

Should we remove an element from the beginning or the end of a list?

Which end of a list is it faster to pop from? Remember your data structures? (A python list is a dynamic array, not a linked list.)

Use the ipython magic function timeit.

In [ ]:
def testPopFront():
    x = list(range(10000))
    for i in range(100):
        x.pop(0)
        
def testPopBack():
    x = list(range(10000))
    for i in range(100):
        x.pop()
In [ ]:
timeit testPopFront()
In [ ]:
timeit testPopBack()

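Popping from the back is faster, about twice as fast in this test. Each x.pop(0) must shift all of the remaining elements, while x.pop() does not. Note that both timings also include the time to build the list.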
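
If you really need to remove items from the front, the standard library's collections.deque is designed for that. Below is a sketch of the same test with a deque, timed with the timeit module rather than the ipython magic. The function name testPopFrontDeque is made up here, and no timing result is claimed since it depends on your machine.

```python
import timeit
from collections import deque

def testPopFrontDeque():
    x = deque(range(10000))
    for i in range(100):
        x.popleft()      # removes from the front without shifting elements
    return x

# timeit.timeit is the plain-python counterpart of the ipython timeit magic.
seconds = timeit.timeit(testPopFrontDeque, number=100)
print('100 runs took', seconds, 'seconds')

remaining = testPopFrontDeque()
print(remaining[0], len(remaining))   # 100 9900
```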
