In [1]:

%matplotlib inline

Overview of Python¶

Python is interpreted¶

Python is an interpreted language, in contrast to languages such as Java and C, which are typically compiled ahead of time.
This means we can type statements into the interpreter and they are executed immediately.

In [2]:

5 + 5

Out[2]:

10

Groups of statements are all executed one after the other:

In [3]:

x = 5
y = 'Hello There'
z = 10.5

We can visualize the above code using PythonTutor.

In [4]:

x + 5

Out[4]:

10

Assignments versus equations¶

In Python when we write x = 5 this means something different from an equation $x=5$.
Unlike variables in mathematical models, variables in Python can refer to different things as more statements are interpreted.

In [5]:

x = 1
print('The value of x is', x)

x = 2.5
print('Now the value of x is', x)

x = 'hello there'
print('Now it is ', x)

The value of x is 1
Now the value of x is 2.5
Now it is  hello there

Calling Functions¶

We can call functions in the conventional way, using round brackets:

In [6]:

round(3.14)

Out[6]:

3

Types¶

Values in Python have an associated type.
If we combine types incorrectly we get an error.

In [7]:

print(y)

Hello There

In [8]:

y + 5

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-b85a2dbb3f6a> in <module>
----> 1 y + 5

TypeError: can only concatenate str (not "int") to str

The type function¶

We can query the type of a value using the type function.

In [9]:

type(1)

Out[9]:

int

In [10]:

type('hello')

Out[10]:

str

In [11]:

type(2.5)

Out[11]:

float

In [12]:

type(True)

Out[12]:

bool

Null values¶

Sometimes we need to represent "no data" or "not applicable".
In Python we use the special value None.
This corresponds to null in Java or NULL in SQL.

In [13]:

result = None

When we fetch the value None in the interactive interpreter, no result is printed out.

In [14]:

result

Testing for Null values¶

We can check whether there is a result or not using the is operator:

In [15]:

result is None

Out[15]:

True

In [16]:

x = 5
x is None

Out[16]:

False

Converting values between types¶

We can convert values between different types.

Converting to floating-point¶

To convert an integer to a floating-point number use the float() function.

In [17]:

x = 1
x

Out[17]:

1

In [18]:

type(x)

Out[18]:

int

In [19]:

y = float(x)
y

Out[19]:

1.0

Converting to integers¶

To convert a floating-point number to an integer use the int() function, which discards the fractional part rather than rounding.

In [20]:

type(y)

Out[20]:

float

In [21]:

int(y)

Out[21]:

1

Variables are not typed¶

Variables themselves, on the other hand, do not have a fixed type.
It is only the values that they refer to that have a type.
This means that the type referred to by a variable can change as more statements are interpreted.

In [22]:

y = 'hello'
print('The type of the value referred to by y is ', type(y))
y = 5.0
print('And now the type of the value is ', type(y))

The type of the value referred to by y is  <class 'str'>
And now the type of the value is  <class 'float'>

Polymorphism¶

The meaning of an operator depends on the types we are applying it to.

In [23]:

1 + 1

Out[23]:

2

In [24]:

'a' + 'b'

Out[24]:

'ab'

In [25]:

'1' + '1'

Out[25]:

'11'
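
The same holds for other operators. As a small illustrative sketch (the variable names are made up for this example), the * operator multiplies numbers but repeats strings and lists:

```python
# The * operator means different things for different operand types.
int_product = 2 * 3          # arithmetic multiplication
repeated_text = 'ab' * 3     # string repetition
repeated_list = [0] * 3      # list repetition

print(int_product)     # 6
print(repeated_text)   # ababab
print(repeated_list)   # [0, 0, 0]
```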

Conditional Statements and Indentation¶

The syntax for control structures in Python uses colons and indentation.
Beware that white-space affects the semantics of Python code.
Statements that are indented by the same amount are grouped together into a block; the usual convention is four spaces per level.

`if` statements¶

In [26]:

x = 5
if x > 0:
    print('x is strictly positive.')
    print(x)
    
print('finished.')

x is strictly positive.
5
finished.

Visualize the above on PythonTutor.

Changing indentation¶

In [27]:

x = 0
if x > 0:
    print('x is strictly positive.')
print(x)
    
print('finished.')

0
finished.

Visualize the above on PythonTutor.

`if` and `else`¶

In [28]:

x = 0
print('Starting.')
if x > 0:
    print('x is strictly positive.')
else:
    if x < 0:
        print('x is strictly negative.')
    else:
        print('x is zero.')
print('finished.')

Starting.
x is zero.
finished.

Visualize the above on PythonTutor.

`elif`¶

In [29]:

print('Starting.')
if x > 0:
    print('x is strictly positive')
elif x < 0:
    print('x is strictly negative')
else:
    print('x is zero')
print('finished.')

Starting.
x is zero
finished.

Lists¶

We can use lists to hold an ordered sequence of values.

In [30]:

l = ['first', 'second', 'third']
l

Out[30]:

['first', 'second', 'third']

Lists can contain values of different types, even in the same list.

In [31]:

another_list = ['first', 'second', 'third', 1, 2, 3]
another_list

Out[31]:

['first', 'second', 'third', 1, 2, 3]

Mutable Data Structures¶

Lists are mutable; their contents can change as more statements are interpreted.

In [32]:

l.append('fourth')
l

Out[32]:

['first', 'second', 'third', 'fourth']

References¶

Whenever we bind a variable to a value in Python we create a reference.
A reference is distinct from the value that it refers to.
Variables are names for references.

In [33]:

X = [1, 2, 3]
Y = X

Side effects¶

The above code creates two different references (named X and Y) to the same value [1, 2, 3].
Because lists are mutable, changing them can have side effects on other variables.
If we append something to X what will happen to Y?

In [34]:

X.append(4)
X

Out[34]:

[1, 2, 3, 4]

In [35]:

Y

Out[35]:

[1, 2, 3, 4]

Visualize the above on PythonTutor.

State and identity¶

The state referred to by a variable is different from its identity.
To compare state use the == operator.
To compare identity use the is operator.
When we compare identity we check equality of references.
When we compare state we check equality of values.

Example¶

We will create two different lists, with two associated variables.

In [36]:

X = [1, 2]
Y = [1]
Y.append(2)

Visualize the above code on PythonTutor.

Comparing state¶

In [37]:

X

Out[37]:

[1, 2]

In [38]:

Y

Out[38]:

[1, 2]

In [39]:

X == Y

Out[39]:

True

Comparing identity¶

In [40]:

X is Y

Out[40]:

False

Copying data prevents side effects¶

In this example, because we have two different lists, we avoid side effects:

In [41]:

Y.append(3)
X

Out[41]:

[1, 2]

In [42]:

X == Y

Out[42]:

False

In [43]:

X is Y

Out[43]:

False

Iteration¶

We can iterate over each element of a list in turn using a for loop:

In [44]:

my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print(i)

first
second
third
fourth

Visualize the above on PythonTutor.

Including more than one statement inside the loop¶

In [45]:

my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print("The next item is:")
    print(i)
    print()

The next item is:
first

The next item is:
second

The next item is:
third

The next item is:
fourth

Visualize the above code on PythonTutor.

Looping a specified number of times¶

To perform a statement a certain number of times, we can iterate over a list of the required size.

In [46]:

for i in [0, 1, 2, 3]:
    print("Hello!")

Hello!
Hello!
Hello!
Hello!

The `range` function¶

To save from having to manually write the numbers out, we can use the function range() to count for us.
We count starting at 0 (as in Java and C++).

In [47]:

list(range(4))

Out[47]:

[0, 1, 2, 3]

`for` loops with the `range` function¶

In [48]:

for i in range(4):
    print("Hello!")

Hello!
Hello!
Hello!
Hello!
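
range() also accepts optional start and step arguments. A brief sketch, with variable names chosen only for illustration:

```python
# range(stop) counts from 0 up to, but not including, stop.
first_four = list(range(4))

# range(start, stop) begins counting at start instead of 0.
from_two = list(range(2, 6))

# range(start, stop, step) advances by step each time.
every_third = list(range(2, 10, 3))

print(first_four)   # [0, 1, 2, 3]
print(from_two)     # [2, 3, 4, 5]
print(every_third)  # [2, 5, 8]
```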

List Indexing¶

Lists can be indexed using square brackets to retrieve the element stored in a particular position.

In [49]:

my_list

Out[49]:

['first', 'second', 'third', 'fourth']

In [50]:

my_list[0]

Out[50]:

'first'

In [51]:

my_list[1]

Out[51]:

'second'

List Slicing¶

We can also specify a range of positions.
This is called slicing.
The example below slices from position 0 (inclusive) to 2 (exclusive).

In [52]:

my_list[0:2]

Out[52]:

['first', 'second']

Indexing from the start or end¶

If we leave out the starting index it implies the beginning of the list:

In [53]:

my_list[:2]

Out[53]:

['first', 'second']

If we leave out the final index it implies the end of the list:

In [54]:

my_list[2:]

Out[54]:

['third', 'fourth']

Copying a list¶

We can conveniently copy a list by indexing from start to end:

In [55]:

new_list = my_list[:]

In [56]:

new_list

Out[56]:

['first', 'second', 'third', 'fourth']

In [57]:

new_list is my_list

Out[57]:

False

In [58]:

new_list == my_list

Out[58]:

True
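
Note that a slice copy is shallow: it creates a new outer list, but any nested lists are still shared between the original and the copy. The sketch below uses a hypothetical nested list to illustrate this, together with copy.deepcopy() from the standard library, which copies the nested values as well.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append(99)           # mutate an inner list in place

print(shallow[0])  # [1, 2, 99]: the inner list is shared
print(deep[0])     # [1, 2]: the deep copy is unaffected
```

For a flat list of immutable values, such as the list of strings above, a shallow copy is usually all we need.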

Negative Indexing¶

Negative indices count from the end of the list:

In [59]:

my_list[-1]

Out[59]:

'fourth'

In [60]:

my_list[:-1]

Out[60]:

['first', 'second', 'third']
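
Negative indices and slices can be combined, and a slice accepts an optional third value giving the step. A short sketch using the same list contents (redefined under the illustrative name items so that the example is self-contained):

```python
items = ['first', 'second', 'third', 'fourth']

last_two = items[-2:]          # from the second-to-last element to the end
every_other = items[::2]       # every second element, starting at position 0
reversed_items = items[::-1]   # a step of -1 walks the list backwards

print(last_two)        # ['third', 'fourth']
print(every_other)     # ['first', 'third']
print(reversed_items)  # ['fourth', 'third', 'second', 'first']
```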

Collections¶

Lists are an example of a collection.
A collection is a type of value that can contain other values.
There are other collection types in Python:
- tuple
- set
- dict

Tuples¶

Tuples are another way to combine different values.
The combined values can be of different types.
Like lists, they have a well-defined ordering and can be indexed.
To create a tuple in Python, use round brackets instead of square brackets:

In [61]:

tuple1 = (50, 'hello')
tuple1

Out[61]:

(50, 'hello')

In [62]:

tuple1[0]

Out[62]:

50

In [63]:

type(tuple1)

Out[63]:

tuple

Tuples are immutable¶

Unlike lists, tuples are immutable. Once we have created a tuple we cannot add values to it.

In [64]:

tuple1.append(2)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-64-46e3866e32ee> in <module>
----> 1 tuple1.append(2)

AttributeError: 'tuple' object has no attribute 'append'
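
Replacing an existing element fails as well. The sketch below catches the resulting TypeError so that it runs to completion, and then shows that "changing" a tuple really means building a new one. The variable names are illustrative only.

```python
point = (50, 'hello')

try:
    point[0] = 51              # item assignment is not allowed on tuples
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# Concatenation builds a new tuple and leaves the original untouched.
longer = point + (2,)
print(point)   # (50, 'hello')
print(longer)  # (50, 'hello', 2)
```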

Sets¶

Lists can contain duplicate values.
A set, in contrast, contains no duplicates.
Sets can be created from lists using the set() function.

In [65]:

X = set([1, 2, 3, 3, 4])
X

Out[65]:

{1, 2, 3, 4}

In [66]:

type(X)

Out[66]:

set

Alternatively we can write a set literal using the { and } brackets.

In [67]:

X = {1, 2, 3, 4}
type(X)

Out[67]:

set

Sets are mutable¶

Sets are mutable like lists:

In [68]:

X.add(5)
X

Out[68]:

{1, 2, 3, 4, 5}

Adding a value that is already present has no effect, because a set never contains duplicates:

In [69]:

X.add(5)
X

Out[69]:

{1, 2, 3, 4, 5}

Sets are unordered¶

Sets do not have an ordering.
Therefore we cannot index or slice them:

In [70]:

X[0]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-70-19c40ecbd036> in <module>
----> 1 X[0]

TypeError: 'set' object is not subscriptable

Operations on sets¶

Union: $X \cup Y$

In [71]:

X = {1, 2, 3}
Y = {4, 5, 6}
X | Y

Out[71]:

{1, 2, 3, 4, 5, 6}

Intersection: $X \cap Y$:

In [72]:

X = {1, 2, 3, 4}
Y = {3, 4, 5}
X & Y

Out[72]:

{3, 4}

Difference $X - Y$:

In [73]:

X - Y

Out[73]:

{1, 2}
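
Sets support a few further operators that follow the same pattern. A minimal sketch, reusing the values of X and Y from above (the result names are our own):

```python
X = {1, 2, 3, 4}
Y = {3, 4, 5}

symmetric = X ^ Y            # elements in exactly one of the two sets
is_subset = {3, 4} <= X      # True if every element of {3, 4} is in X
has_five = 5 in X            # membership test

print(symmetric)   # {1, 2, 5}
print(is_subset)   # True
print(has_five)    # False
```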

Dictionaries¶

A dictionary contains a mapping between keys and corresponding values.
- Mathematically it is a function with a finite domain: each key maps to exactly one value, although different keys may map to the same value.
Given a key, we can very quickly look up the corresponding value.
The values can be any type (and need not all be of the same type).
Keys can be any hashable type, which includes immutable built-in types such as numbers and strings.
The built-in dictionary type is named dict.
In other programming languages they are sometimes called associative arrays.

Creating a dictionary¶

A dictionary contains a set of key-value pairs.
To create a dictionary:

In [74]:

students = { 107564: 'Xu', 108745: 'Ian', 102567: 'Steve' }

The above initialises the dictionary students so that it contains three key-value pairs.
The keys are the student id numbers (integers).
The values are the names of the students (strings).
Although we use the same brackets as for sets, this is a different type of collection:

In [75]:

type(students)

Out[75]:

dict

Accessing the values in a dictionary¶

We can access the value corresponding to a given key using the same square-bracket syntax that we use to access particular elements of a list:

In [76]:

students[108745]

Out[76]:

'Ian'

Accessing a non-existent key will generate a KeyError:

In [77]:

students[123]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-77-26e887eb0296> in <module>
----> 1 students[123]

KeyError: 123
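
To avoid the error we can test for the key first, or ask for a default. A short sketch using the in operator and the dictionary method get(); the dictionary is redefined here so that the example is self-contained, and the default 'unknown' is an arbitrary choice:

```python
students = {107564: 'Xu', 108745: 'Ian', 102567: 'Steve'}

# The in operator tests whether a key is present.
has_ian = 108745 in students
has_123 = 123 in students
print(has_ian)   # True
print(has_123)   # False

# get() returns None for a missing key instead of raising KeyError...
missing = students.get(123)
print(missing is None)   # True

# ...or a default value of our choosing.
name = students.get(123, 'unknown')
print(name)   # unknown
```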

Updating dictionary entries¶

Dictionaries are mutable, so we can update the mapping:

In [78]:

students[108745] = 'Fred'
print(students[108745])

Fred

We can also grow the dictionary by adding new keys:

In [79]:

students[104587] = 'John'
print(students[104587])

John

Dictionary keys can be any immutable type¶

We can use any immutable (hashable) type for the keys of a dictionary.
For example, we can map names onto integers:

In [80]:

age = { 'John':21, 'Steve':47, 'Xu': 22 }

In [81]:

age['Steve']

Out[81]:

47

Creating an empty dictionary¶

We often want to initialise a dictionary with no keys or values.
To do this call the function dict():

In [82]:

result = dict()

We can then progressively add entries to the dictionary, e.g. using iteration:

In [83]:

for i in range(5):
    result[i] = i**2
print(result)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Iterating over a dictionary¶

We can use a for loop with dictionaries, just as we can with other collections such as sets.
When we iterate over a dictionary, we iterate over the keys.
We can then perform some computation on each key inside the loop.
Typically we will also access the corresponding value.

In [84]:

for student_id in students:
    print(students[student_id])

Xu
Fred
Steve
John
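
When we need both the key and the value, the dictionary method items() saves the extra lookup. A minimal sketch, with the dictionary redefined so that the example is self-contained (the loop variable names are our own):

```python
students = {107564: 'Xu', 108745: 'Fred', 102567: 'Steve', 104587: 'John'}

# items() yields (key, value) tuples, which we unpack in the loop header.
for student_id, name in students.items():
    print(student_id, name)

# keys() and values() give the keys or the values on their own.
ids = list(students.keys())
names = list(students.values())
```

Dictionaries preserve insertion order in Python 3.7 and later, so the entries appear in the order in which they were added.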

The size of a collection¶

We can count the number of values in a collection using the len (length) function.
This can be used with any type of collection (list, set, tuple etc.).

In [85]:

len(students)

Out[85]:

4

In [86]:

len(['one', 'two'])

Out[86]:

2

In [87]:

len({'one', 'two', 'three'})

Out[87]:

3

Empty collections¶

Empty collections have a size of zero:

In [88]:

empty_list = []
len(empty_list) == 0

Out[88]:

True

Arrays¶

Python also has arrays, which contain a single type of value; i.e. we cannot have different types of value within the same array.
Arrays are mutable like lists; we can modify the existing elements of an array.
However, we typically do not change the size of the array; i.e. it has a fixed length.

The `numpy` module¶

Arrays are provided by a separate module called numpy. Modules correspond to packages in e.g. Java.
We can import the module and then give it a shorter alias.

In [89]:

import numpy as np

We can now use the functions defined in this package by prefixing them with np.
The function array() creates an array given a list.

Creating an array¶

We can create an array from a list by using the array() function defined in the numpy module:

In [90]:

x = np.array([0, 1, 2, 3, 4])
x

Out[90]:

array([0, 1, 2, 3, 4])

In [91]:

type(x)

Out[91]:

numpy.ndarray

Functions over arrays¶

When we use arithmetic operators on arrays, we create a new array with the result of applying the operator to each element.

In [92]:

y = x * 2
y

Out[92]:

array([0, 2, 4, 6, 8])

The same goes for functions:

In [93]:

x = np.array([-1, 2, 3, -4])
y = abs(x)
y

Out[93]:

array([1, 2, 3, 4])

Populating Arrays¶

To populate an array with a range of values we use the np.arange() function:

In [94]:

x = np.arange(0, 10)
x

Out[94]:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

We can also use floating point increments.

In [95]:

x = np.arange(0, 1, 0.1)
x

Out[95]:

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

Basic Plotting¶

We will use a module called matplotlib to plot some simple graphs.
This module provides functions which are very similar to MATLAB plotting commands.

In [96]:

import matplotlib.pyplot as plt

y = x*2 + 5
plt.plot(x, y)
plt.show()

Plotting a sine curve¶

In [97]:

from numpy import pi, sin

x = np.arange(0, 2*pi, 0.01)
y = sin(x)
plt.plot(x, y)
plt.show()

Plotting a histogram¶

We can use the hist() function in matplotlib to plot a histogram:

In [98]:

# Generate some random data
data = np.random.randn(1000)

# hist() returns the bin counts, the bin edges and the drawn patches
counts, bin_edges, patches = plt.hist(data)
plt.show()

Computing histograms as arrays¶

The function histogram() in the numpy module counts frequencies into bins and returns the result as a tuple of two 1-dimensional arrays: the counts in each bin, followed by the bin edges.

In [99]:

np.histogram(data)

Out[99]:

(array([ 14,  41, 128, 178, 243, 203, 109,  66,  14,   4]),
 array([-2.81515826, -2.19564948, -1.57614071, -0.95663193, -0.33712315,
         0.28238562,  0.9018944 ,  1.52140318,  2.14091195,  2.76042073,
         3.3799295 ]))
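
Because the result is a pair of arrays, it is convenient to unpack it into two variables. A minimal sketch, assuming NumPy is installed; the seeded generator and the variable names are our own choices, made so that the example is repeatable:

```python
import numpy as np

# A seeded generator makes the sample repeatable (the seed is arbitrary).
rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)

# histogram() returns (counts, bin_edges); 10 bins is the default.
counts, bin_edges = np.histogram(sample)

print(len(counts))      # 10 bins
print(len(bin_edges))   # 11 edges: one more than the number of bins
print(counts.sum())     # 1000: every sample falls in some bin
```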

Defining new functions¶

In [100]:

def squared(x):
    return x ** 2

squared(5)

Out[100]:

25

Local Variables¶

Variables created inside functions are local to that function.
They are not accessible to code outside of that function.

In [101]:

def squared(x):
    temp = x ** 2
    return temp

squared(5)

Out[101]:

25

In [102]:

temp

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-102-da77557ed0c8> in <module>
----> 1 temp

NameError: name 'temp' is not defined
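
A related point is that assigning to a name inside a function creates a new local variable, even if a global variable of the same name already exists. A small sketch, using made-up names:

```python
total = 100                  # a global variable

def add_one(x):
    total = x + 1            # creates a new local variable named total
    return total

result = add_one(5)

print(result)   # 6
print(total)    # 100: the global variable is unchanged
```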

Functional Programming¶

Functions are first-class citizens in Python.
They can be passed around just like any other value.

In [103]:

squared

Out[103]:

<function __main__.squared(x)>

In [104]:

y = squared
y

Out[104]:

<function __main__.squared(x)>

In [105]:

y(5)

Out[105]:

25

Mapping the elements of a collection¶

We can apply a function to each element of a collection using the built-in function map().
This will work with any collection: list, set, tuple or string.
It takes as arguments another function and the collection we want to apply it to.
In Python 3 it returns an iterator over the results, which we can convert to a list using list().

In [106]:

list(map(squared, [1, 2, 3, 4]))

Out[106]:

[1, 4, 9, 16]
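
In Python 3 the object returned by map() is a lazy iterator, so its results can be consumed only once. A short sketch of this behaviour (squared is redefined so that the example is self-contained; the other names are illustrative):

```python
def squared(x):
    return x ** 2

squares = map(squared, [1, 2, 3, 4])   # nothing is computed yet

first_pass = list(squares)    # consumes the iterator
second_pass = list(squares)   # the iterator is now exhausted

print(first_pass)    # [1, 4, 9, 16]
print(second_pass)   # []
```

If the results are needed more than once, store the list rather than the map object.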

List Comprehensions¶

Because this is such a common operation, Python has a special syntax to do the same thing, called a list comprehension.

In [107]:

[squared(i) for i in [1, 2, 3, 4]]

Out[107]:

[1, 4, 9, 16]

If we want a set instead of a list we can use a set comprehension:

In [108]:

{squared(i) for i in [1, 2, 3, 4]}

Out[108]:

{1, 4, 9, 16}
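
Dictionaries have a comprehension syntax too. As a minimal sketch, the dictionary of squares that was built earlier with a for loop can be written in one expression (the variable names here are our own):

```python
# Building the dictionary with an explicit loop...
loop_result = dict()
for i in range(5):
    loop_result[i] = i ** 2

# ...and with a dictionary comprehension: key: value before the for clause.
squares = {i: i ** 2 for i in range(5)}

print(squares)                  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
print(squares == loop_result)   # True
```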

Cartesian product using list comprehensions¶

Image courtesy of Quartl.

The Cartesian product $A \times B$ of two collections can be expressed by using multiple for clauses in a comprehension.

Example¶

In [109]:

A = {'x', 'y', 'z'}
B = {1, 2, 3}
{(a,b) for a in A for b in B}

Out[109]:

{('x', 1),
 ('x', 2),
 ('x', 3),
 ('y', 1),
 ('y', 2),
 ('y', 3),
 ('z', 1),
 ('z', 2),
 ('z', 3)}

Cartesian products with other collections¶

The syntax for Cartesian products can be used with any collection type.

In [110]:

first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')

[(first_name, surname) for first_name in first_names for surname in surnames]

Out[110]:

[('Steve', 'Smith'),
 ('Steve', 'Doe'),
 ('Steve', 'Rabbit'),
 ('John', 'Smith'),
 ('John', 'Doe'),
 ('John', 'Rabbit'),
 ('Peter', 'Smith'),
 ('Peter', 'Doe'),
 ('Peter', 'Rabbit')]

Joining collections using a zip¶

The Cartesian product pairs every combination of elements.
If we want a 1-1 pairing we use an operation called a zip.
A zip pairs values at the same position in each sequence.
Therefore:
- it is normally used with sequences, because sets have no defined positions; and
- the collections should be of the same length, since zip() stops at the end of the shortest one.

In [111]:

list(zip(first_names, surnames))

Out[111]:

[('Steve', 'Smith'), ('John', 'Doe'), ('Peter', 'Rabbit')]
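
When the inputs differ in length, zip() silently drops the unmatched elements. The sketch below shows this, along with zip_longest() from the standard itertools module, which pads instead; the shortened surnames tuple and the fill value '?' are chosen purely for illustration.

```python
from itertools import zip_longest

first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe')            # deliberately one element shorter

# zip() stops as soon as the shortest input is exhausted.
pairs = list(zip(first_names, surnames))
print(pairs)    # [('Steve', 'Smith'), ('John', 'Doe')]

# zip_longest() pads the shorter input with a fill value instead.
padded = list(zip_longest(first_names, surnames, fillvalue='?'))
print(padded)   # [('Steve', 'Smith'), ('John', 'Doe'), ('Peter', '?')]
```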

Anonymous Function Literals¶

We can also write anonymous functions.
These are function literals, and do not necessarily have a name.
They are called lambda expressions (after the $\lambda$-calculus).

In [112]:

list(map(lambda x: x ** 2, [1, 2, 3, 4]))

Out[112]:

[1, 4, 9, 16]

Filtering data¶

We can filter a list by applying a predicate to each element of the list.
A predicate is a function which takes a single argument, and returns a boolean value.
filter(p, X) is equivalent to $\{ x \in X : p(x) \}$ in set-builder notation.

In [113]:

list(filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1]))

Out[113]:

[2, 3, 1]

We can use both filter() and map() on other collections such as strings or sets.

In [114]:

list(filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1}))

Out[114]:

[1, 2, 3]

Filtering using a list comprehension¶

Again, because this is such a common operation, we can use simpler syntax to say the same thing.
We can express a filter using a list-comprehension by using the keyword if:

In [115]:

data = [-5, 2, 3, -10, 0, 1]
[x for x in data if x > 0]

Out[115]:

[2, 3, 1]

We can also filter and then map in the same expression:

In [116]:

from numpy import sqrt
[sqrt(x) for x in data if x > 0]

Out[116]:

[1.4142135623730951, 1.7320508075688772, 1.0]

The reduce function¶

The reduce() function cumulatively applies a two-argument function to the elements of a list from left to right, combining them into a single return value.

In [117]:

from functools import reduce
reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])

Out[117]:

15

Big Data¶

The map() and reduce() functions form the basis of the map-reduce programming model.
Map-reduce is the basis of modern highly-distributed large-scale computing frameworks.
It is used in BigTable, Hadoop and Apache Spark.
See these examples in Python for Apache Spark.
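
As a rough single-machine sketch of the idea (not how Hadoop or Spark are actually invoked), a word count can be written as a map step that counts the words in each line, followed by a reduce step that merges the per-line counts. The function names and the sample text are invented for illustration.

```python
from functools import reduce

lines = ['the cat sat', 'the cat ran', 'a dog sat']

# Map step: turn each line into a dictionary of word counts for that line.
def count_words(line):
    counts = {}
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# Reduce step: merge two dictionaries of counts into a new one.
def merge_counts(left, right):
    merged = dict(left)
    for word, count in right.items():
        merged[word] = merged.get(word, 0) + count
    return merged

# The third argument to reduce() is the starting value (an empty dictionary).
word_counts = reduce(merge_counts, map(count_words, lines), {})
print(word_counts)
```

In a distributed framework the map step would run in parallel on different machines, which is possible because each line is processed independently of the others.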