In [1]:
%matplotlib inline

Overview of Python

(c) 2019 Steve Phelps

Python is interpreted

  • Python is an interpreted language, in contrast to Java and C, which are compiled languages.

  • This means we can type statements into the interpreter and they are executed immediately.

In [2]:
5 + 5
Out[2]:
10
  • Groups of statements are all executed one after the other:
In [3]:
x = 5
y = 'Hello There'
z = 10.5
In [4]:
x + 5
Out[4]:
10

Assignments versus equations

  • In Python when we write x = 5 this means something different from an equation $x=5$.

  • Unlike variables in mathematical models, variables in Python can refer to different things as more statements are interpreted.

In [5]:
x = 1
print('The value of x is', x)

x = 2.5
print('Now the value of x is', x)

x = 'hello there'
print('Now it is ', x)
The value of x is 1
Now the value of x is 2.5
Now it is  hello there

Calling Functions

We can call functions in the conventional way, with the arguments inside round brackets.

In [6]:
round(3.14)
Out[6]:
3

Types

  • Values in Python have an associated type.

  • If we combine types incorrectly we get an error.

In [7]:
print(y)
Hello There
In [8]:
y + 5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-b85a2dbb3f6a> in <module>
----> 1 y + 5

TypeError: can only concatenate str (not "int") to str
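
One way to resolve this kind of error is to convert one operand explicitly so that both have the same type. A minimal sketch, reusing the value of y from the earlier cell (the names as_text and as_number are illustrative only):

```python
y = 'Hello There'

# Convert the integer to a string before concatenating.
as_text = y + str(5)
print(as_text)

# Going the other way only works if the string holds a numeral.
as_number = int('5') + 5
print(as_number)
```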

The type function

  • We can query the type of a value using the type function.
In [9]:
type(1)
Out[9]:
int
In [10]:
type('hello')
Out[10]:
str
In [11]:
type(2.5)
Out[11]:
float
In [12]:
type(True)
Out[12]:
bool

Null values

  • Sometimes we need to represent "no data" or "not applicable".

  • In Python we use the special value None.

  • This corresponds to null in Java or NULL in SQL.

In [13]:
result = None
  • When we fetch the value None in the interactive interpreter, no result is printed out.
In [14]:
result

Testing for Null values

  • We can check whether there is a result or not using the is operator:
In [15]:
result is None
Out[15]:
True
In [16]:
x = 5
x is None
Out[16]:
False

Converting values between types

  • We can convert values between different types.

Converting to floating-point

  • To convert an integer to a floating-point number use the float() function.
In [17]:
x = 1
x
Out[17]:
1
In [18]:
type(x)
Out[18]:
int
In [19]:
y = float(x)
y
Out[19]:
1.0

Converting to integers

  • To convert a floating-point number to an integer use the int() function, which discards the fractional part.
In [20]:
type(y)
Out[20]:
float
In [21]:
int(y)
Out[21]:
1
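
Note that int() discards the fractional part rather than rounding to the nearest integer. A short sketch contrasting it with round() (the variable names are illustrative):

```python
# int() truncates toward zero.
truncated_pos = int(2.9)
truncated_neg = int(-2.9)

# round() goes to the nearest integer instead.
rounded = round(2.9)

# Strings containing numerals can be converted as well.
from_text = float('10.5')

print(truncated_pos, truncated_neg, rounded, from_text)
```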

Variables are not typed

  • Variables themselves, on the other hand, do not have a fixed type.
  • It is only the values that they refer to that have a type.
  • This means that the type referred to by a variable can change as more statements are interpreted.
In [22]:
y = 'hello'
print('The type of the value referred to by y is ', type(y))
y = 5.0
print('And now the type of the value is ', type(y))
The type of the value referred to by y is  <class 'str'>
And now the type of the value is  <class 'float'>

Polymorphism

  • The meaning of an operator depends on the types we are applying it to.
In [23]:
1 + 1
Out[23]:
2
In [24]:
'a' + 'b'
Out[24]:
'ab'
In [25]:
'1' + '1'
Out[25]:
'11'
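
The same holds for other operators. A small sketch showing * and + applied to values of different types (the variable names are illustrative):

```python
number_product = 2 * 3       # multiplication of integers
text_repeat = 'ab' * 3       # repetition of a string
mixed_sum = 1 + 1.5          # an int and a float combine to give a float
print(number_product, text_repeat, mixed_sum)
```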

Conditional Statements and Indentation

  • The syntax for control structures in Python uses colons and indentation.

  • Beware that white-space affects the semantics of Python code.

  • Statements that are indented by the same amount are grouped together into a block (four spaces per level is the usual convention).

if statements

In [26]:
x = 5
if x > 0:
    print('x is strictly positive.')
    print(x)
    
print('finished.')
x is strictly positive.
5
finished.

Changing indentation

In [27]:
x = 0
if x > 0:
    print('x is strictly positive.')
print(x)
    
print('finished.')
0
finished.

if and else

In [28]:
x = 0
print('Starting.')
if x > 0:
    print('x is strictly positive.')
else:
    if x < 0:
        print('x is strictly negative.')
    else:
        print('x is zero.')
print('finished.')
Starting.
x is zero.
finished.

elif

In [29]:
print('Starting.')
if x > 0:
    print('x is strictly positive')
elif x < 0:
    print('x is strictly negative')
else:
    print('x is zero')
print('finished.')
Starting.
x is zero
finished.

Lists

We can use lists to hold an ordered sequence of values.

In [30]:
l = ['first', 'second', 'third']
l
Out[30]:
['first', 'second', 'third']

Lists can contain values of different types, even in the same list.

In [31]:
another_list = ['first', 'second', 'third', 1, 2, 3]
another_list
Out[31]:
['first', 'second', 'third', 1, 2, 3]

Mutable Datastructures

Lists are mutable; their contents can change as more statements are interpreted.

In [32]:
l.append('fourth')
l
Out[32]:
['first', 'second', 'third', 'fourth']

References

  • Whenever we bind a variable to a value in Python we create a reference.

  • A reference is distinct from the value that it refers to.

  • Variables are names for references.

In [33]:
X = [1, 2, 3]
Y = X

Side effects

  • The above code creates two different references (named X and Y) to the same value [1, 2, 3]

  • Because lists are mutable, changing them can have side-effects on other variables.

  • If we append something to X what will happen to Y?

In [34]:
X.append(4)
X
Out[34]:
[1, 2, 3, 4]
In [35]:
Y
Out[35]:
[1, 2, 3, 4]
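
It can help to contrast mutating a shared list with rebinding a variable to a new list. The following sketch assumes the same starting values as the cells above:

```python
X = [1, 2, 3]
Y = X            # Y is a second reference to the same list

Y.append(4)      # mutates the shared list, so X sees the change
print(X)

Y = Y + [5]      # builds a new list and rebinds Y; X is untouched
print(X)
print(Y)
```

Only the append() call has a side effect on X; the + operator creates a fresh list.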

State and identity

  • The state referred to by a variable is different from its identity.

  • To compare state use the == operator.

  • To compare identity use the is operator.

  • When we compare identity we check equality of references.

  • When we compare state we check equality of values.

Example

  • We will create two different lists, with two associated variables.
In [36]:
X = [1, 2]
Y = [1]
Y.append(2)

Comparing state

In [37]:
X
Out[37]:
[1, 2]
In [38]:
Y
Out[38]:
[1, 2]
In [39]:
X == Y
Out[39]:
True

Comparing identity

In [40]:
X is Y
Out[40]:
False

Copying data prevents side effects

  • In this example, because we have two different lists we avoid side effects
In [41]:
Y.append(3)
X
Out[41]:
[1, 2]
In [42]:
X == Y
Out[42]:
False
In [43]:
X is Y
Out[43]:
False

Iteration

  • We can iterate over each element of a list in turn using a for loop:
In [44]:
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print(i)
first
second
third
fourth

Including more than one statement inside the loop

In [45]:
my_list = ['first', 'second', 'third', 'fourth']
for i in my_list:
    print("The next item is:")
    print(i)
    print()
The next item is:
first

The next item is:
second

The next item is:
third

The next item is:
fourth

Looping a specified number of times

  • To perform a statement a certain number of times, we can iterate over a list of the required size.
In [46]:
for i in [0, 1, 2, 3]:
    print("Hello!")
Hello!
Hello!
Hello!
Hello!

The range function

  • To avoid writing the numbers out by hand, we can use the function range() to count for us.

  • We count starting at 0 (as in Java and C++).

In [47]:
list(range(4))
Out[47]:
[0, 1, 2, 3]

for loops with the range function

In [48]:
for i in range(4):
    print("Hello!")
Hello!
Hello!
Hello!
Hello!
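
range() also accepts a start value and a step. A brief sketch (the variable names are illustrative):

```python
from_two = list(range(2, 6))        # start at 2, stop before 6
evens = list(range(0, 10, 2))       # step of 2
countdown = list(range(3, 0, -1))   # a negative step counts down
print(from_two, evens, countdown)
```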

List Indexing

  • Lists can be indexed using square brackets to retrieve the element stored in a particular position.
In [49]:
my_list
Out[49]:
['first', 'second', 'third', 'fourth']
In [50]:
my_list[0]
Out[50]:
'first'
In [51]:
my_list[1]
Out[51]:
'second'

List Slicing

  • We can also specify a range of positions.

  • This is called slicing.

  • The example below indexes from position 0 (inclusive) to 2 (exclusive).

In [52]:
my_list[0:2]
Out[52]:
['first', 'second']

Indexing from the start or end

  • If we leave out the starting index it implies the beginning of the list:
In [53]:
my_list[:2]
Out[53]:
['first', 'second']
  • If we leave out the final index it implies the end of the list:
In [54]:
my_list[2:]
Out[54]:
['third', 'fourth']

Copying a list

  • We can conveniently copy a list by slicing from start to end:
In [55]:
new_list = my_list[:]
In [56]:
new_list
Out[56]:
['first', 'second', 'third', 'fourth']
In [57]:
new_list is my_list
Out[57]:
False
In [58]:
new_list == my_list
Out[58]:
True
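
A slice copy is shallow: the new list is a distinct object, but its elements are the same objects as in the original. This matters only when the elements are themselves mutable. A small sketch with a hypothetical nested list:

```python
nested = [[1, 2], [3, 4]]
shallow = nested[:]          # new outer list, same inner lists

shallow.append([5, 6])       # affects only the copy
shallow[0].append(99)        # the inner list is shared, so both see this

print(nested)
print(shallow)
```

If the inner lists need to be independent as well, copy.deepcopy() from the standard library copies them too.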

Negative Indexing

  • Negative indices count from the end of the list:
In [59]:
my_list[-1]
Out[59]:
'fourth'
In [60]:
my_list[:-1]
Out[60]:
['first', 'second', 'third']
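
Slices also accept negative start positions and an optional step. A short sketch using the same list (the result names are illustrative):

```python
my_list = ['first', 'second', 'third', 'fourth']

last_two = my_list[-2:]         # the final two elements
every_other = my_list[::2]      # positions 0 and 2
reversed_copy = my_list[::-1]   # a reversed copy; my_list is unchanged
print(last_two, every_other, reversed_copy)
```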

Collections

  • Lists are an example of a collection.

  • A collection is a type of value that can contain other values.

  • There are other collection types in Python:

    • tuple
    • set
    • dict

Tuples

  • Tuples are another way to combine different values.

  • The combined values can be of different types.

  • Like lists, they have a well-defined ordering and can be indexed.

  • To create a tuple in Python, use round brackets instead of square brackets

In [61]:
tuple1 = (50, 'hello')
tuple1
Out[61]:
(50, 'hello')
In [62]:
tuple1[0]
Out[62]:
50
In [63]:
type(tuple1)
Out[63]:
tuple

Tuples are immutable

  • Unlike lists, tuples are immutable. Once we have created a tuple we cannot add values to it.
In [64]:
tuple1.append(2)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-64-46e3866e32ee> in <module>
----> 1 tuple1.append(2)

AttributeError: 'tuple' object has no attribute 'append'
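
Although a tuple cannot be changed in place, we can build a new tuple from an existing one, and we can unpack its elements into separate variables. A minimal sketch (the names number, greeting and tuple2 are illustrative):

```python
tuple1 = (50, 'hello')

# Unpack the two elements into two variables.
number, greeting = tuple1

# Concatenation creates a new tuple; tuple1 itself is unchanged.
tuple2 = tuple1 + (2,)
print(number, greeting, tuple2)
```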

Sets

  • Lists can contain duplicate values.

  • A set, in contrast, contains no duplicates.

  • Sets can be created from lists using the set() function.

In [65]:
X = set([1, 2, 3, 3, 4])
X
Out[65]:
{1, 2, 3, 4}
In [66]:
type(X)
Out[66]:
set
  • Alternatively we can write a set literal using the { and } brackets.
In [67]:
X = {1, 2, 3, 4}
type(X)
Out[67]:
set

Sets are mutable

  • Sets are mutable like lists:
In [68]:
X.add(5)
X
Out[68]:
{1, 2, 3, 4, 5}
  • Adding a value that is already present has no effect, so duplicates never arise:
In [69]:
X.add(5)
X
Out[69]:
{1, 2, 3, 4, 5}

Sets are unordered

  • Sets do not have an ordering.

  • Therefore we cannot index or slice them:

In [70]:
X[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-70-19c40ecbd036> in <module>
----> 1 X[0]

TypeError: 'set' object is not subscriptable

Operations on sets

  • Union: $X \cup Y$
In [71]:
X = {1, 2, 3}
Y = {4, 5, 6}
X | Y
Out[71]:
{1, 2, 3, 4, 5, 6}
  • Intersection: $X \cap Y$
In [72]:
X = {1, 2, 3, 4}
Y = {3, 4, 5}
X & Y
Out[72]:
{3, 4}
  • Difference: $X - Y$
In [73]:
X - Y
Out[73]:
{1, 2}
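
Sets support a few further operations that are often useful. A short sketch using the same X and Y (the result names are illustrative):

```python
X = {1, 2, 3, 4}
Y = {3, 4, 5}

has_three = 3 in X            # membership test
either_not_both = X ^ Y       # symmetric difference
is_subset = {3, 4} <= X       # subset test
print(has_three, either_not_both, is_subset)
```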

Dictionaries

  • A dictionary contains a mapping between keys, and corresponding values.

    • Mathematically it is a function with a finite domain and range. It need not be one-to-one: different keys may map to the same value.
  • Given a key, we can very quickly look up the corresponding value.

  • The values can be any type (and need not all be of the same type).

  • Keys can be any immutable (hashable) type.

  • The corresponding built-in type is called dict.

  • In other programming languages they are sometimes called associative arrays.

Creating a dictionary

  • A dictionary contains a set of key-value pairs.

  • To create a dictionary:

In [74]:
students = { 107564: 'Xu', 108745: 'Ian', 102567: 'Steve' }
  • The above initialises the dictionary students so that it contains three key-value pairs.

  • The keys are the student id numbers (integers).

  • The values are the names of the students (strings).

  • Although we use the same brackets as for sets, this is a different type of collection:

In [75]:
type(students)
Out[75]:
dict

Accessing the values in a dictionary

  • We can access the value corresponding to a given key using the same syntax that we use to access a particular element of a list:
In [76]:
students[108745]
Out[76]:
'Ian'
  • Accessing a non-existent key will generate a KeyError:
In [77]:
students[123]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-77-26e887eb0296> in <module>
----> 1 students[123]

KeyError: 123
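
To avoid the KeyError we can test for the key first with the in operator, or use the dictionary's get() method. A minimal sketch, assuming the same initial students dictionary (the names missing and fallback are illustrative):

```python
students = {107564: 'Xu', 108745: 'Ian', 102567: 'Steve'}

# Test for membership before indexing.
if 123 in students:
    print(students[123])
else:
    print('No student with id 123')

# get() returns None, or a default we supply, instead of raising KeyError.
missing = students.get(123)
fallback = students.get(123, 'unknown')
print(missing, fallback)
```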

Updating dictionary entries

  • Dictionaries are mutable, so we can update the mapping:
In [78]:
students[108745] = 'Fred'
print(students[108745])
Fred
  • We can also grow the dictionary by adding new keys:
In [79]:
students[104587] = 'John'
print(students[104587])
John

Dictionary keys can be any immutable type

  • We can use any immutable type for the keys of a dictionary.

  • For example, we can map names onto integers:

In [80]:
age = { 'John':21, 'Steve':47, 'Xu': 22 }
In [81]:
age['Steve']
Out[81]:
47

Creating an empty dictionary

  • We often want to initialise a dictionary with no keys or values.

  • To do this call the function dict():

In [82]:
result = dict()
  • We can then progressively add entries to the dictionary, e.g. using iteration:
In [83]:
for i in range(5):
    result[i] = i**2
print(result)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

Iterating over a dictionary

  • We can use a for loop with dictionaries, just as we can with other collections such as sets.
  • When we iterate over a dictionary, we iterate over the keys.
  • We can then perform some computation on each key inside the loop.
  • Typically we will also access the corresponding value.
In [84]:
for student_id in students:
    print(students[student_id])
Xu
Fred
Steve
John
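
When both the key and the value are needed, the items() method yields them as pairs, which saves the extra lookup. A sketch assuming the dictionary contents at this point in the notebook (the names student_id, name and lines are illustrative):

```python
students = {107564: 'Xu', 108745: 'Fred', 102567: 'Steve', 104587: 'John'}

lines = []
for student_id, name in students.items():
    # Each iteration unpacks one (key, value) pair.
    lines.append(f'{student_id}: {name}')
print('\n'.join(lines))
```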

The size of a collection

  • We can count the number of values in a collection using the len (length) function.

  • This can be used with any type of collection (list, set, tuple etc.).

In [85]:
len(students)
Out[85]:
4
In [86]:
len(['one', 'two'])
Out[86]:
2
In [87]:
len({'one', 'two', 'three'})
Out[87]:
3

Empty collections

  • Empty collections have a size of zero:
In [88]:
empty_list = []
len(empty_list) == 0
Out[88]:
True

Arrays

  • Python also has arrays which contain a single type of value.

  • i.e. we cannot have different types of value within the same array.

  • Arrays are mutable like lists; we can modify the existing elements of an array.

  • However, we typically do not change the size of the array; i.e. it has a fixed length.

The numpy module

  • Arrays are provided by a separate module called numpy. Modules correspond to packages in e.g. Java.

  • We can import the module and then give it a shorter alias.

In [89]:
import numpy as np
  • We can now use the functions defined in this package by prefixing them with np.

  • The function array() creates an array given a list.

Creating an array

  • We can create an array from a list by using the array() function defined in the numpy module:
In [90]:
x = np.array([0, 1, 2, 3, 4])
x
Out[90]:
array([0, 1, 2, 3, 4])
In [91]:
type(x)
Out[91]:
numpy.ndarray

Functions over arrays

  • When we use arithmetic operators on arrays, we create a new array with the result of applying the operator to each element.
In [92]:
y = x * 2
y
Out[92]:
array([0, 2, 4, 6, 8])
  • Many functions are likewise applied to each element, for example the built-in abs():
In [93]:
x = np.array([-1, 2, 3, -4])
y = abs(x)
y
Out[93]:
array([1, 2, 3, 4])

Populating Arrays

  • To populate an array with a range of values we use the np.arange() function:
In [94]:
x = np.arange(0, 10)
x
Out[94]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
  • We can also use floating point increments.
In [95]:
x = np.arange(0, 1, 0.1)
x
Out[95]:
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

Basic Plotting

  • We will use a module called matplotlib to plot some simple graphs.

  • This module provides functions which are very similar to MATLAB plotting commands.

In [96]:
import matplotlib.pyplot as plt

y = x*2 + 5
plt.plot(x, y)
plt.show()

Plotting a sine curve

In [97]:
from numpy import pi, sin

x = np.arange(0, 2*pi, 0.01)
y = sin(x)
plt.plot(x, y)
plt.show()

Plotting a histogram

  • We can use the hist() function in matplotlib to plot a histogram:
In [98]:
# Generate some random data
data = np.random.randn(1000)

# hist() returns the bin counts, the bin edges and the drawn patches
counts, bin_edges, patches = plt.hist(data)
plt.show()

Computing histograms as arrays

  • The function histogram() in the numpy module counts frequencies into bins and returns a pair of 1-dimensional arrays: the count in each bin, followed by the bin edges.
In [99]:
np.histogram(data)
Out[99]:
(array([ 14,  41, 128, 178, 243, 203, 109,  66,  14,   4]),
 array([-2.81515826, -2.19564948, -1.57614071, -0.95663193, -0.33712315,
         0.28238562,  0.9018944 ,  1.52140318,  2.14091195,  2.76042073,
         3.3799295 ]))

Defining new functions

In [100]:
def squared(x):
    return x ** 2

squared(5)
Out[100]:
25

Local Variables

  • Variables created inside functions are local to that function.

  • They are not accessible to code outside of that function.

In [101]:
def squared(x):
    temp = x ** 2
    return temp

squared(5)
Out[101]:
25
In [102]:
temp
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-102-da77557ed0c8> in <module>
----> 1 temp

NameError: name 'temp' is not defined

Functional Programming

  • Functions are first-class citizens in Python.

  • They can be passed around just like any other value.

In [103]:
squared
Out[103]:
<function __main__.squared(x)>
In [104]:
y = squared
y
Out[104]:
<function __main__.squared(x)>
In [105]:
y(5)
Out[105]:
25

Mapping the elements of a collection

  • We can apply a function to each element of a collection using the built-in function map().

  • This will work with any iterable collection: list, set, tuple or string.

  • It takes as arguments another function and the collection we want to apply it to.

  • It returns an iterator over the results of applying the function, which we can convert to a list using list().

In [106]:
list(map(squared, [1, 2, 3, 4]))
Out[106]:
[1, 4, 9, 16]
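
In Python 3, map() produces its results lazily, which is why the cell above wraps it in list(). A short sketch of that behaviour, redefining the squared function from earlier so that the example is self-contained:

```python
def squared(x):
    return x ** 2

results = map(squared, [1, 2, 3, 4])   # nothing is computed yet
first = next(results)                  # computes only the first value
rest = list(results)                   # consumes the remaining values
again = list(results)                  # the iterator is now exhausted
print(first, rest, again)
```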

List Comprehensions

  • Because this is such a common operation, Python has a special syntax to do the same thing, called a list comprehension.
In [107]:
[squared(i) for i in [1, 2, 3, 4]]
Out[107]:
[1, 4, 9, 16]
  • If we want a set instead of a list we can use a set comprehension:
In [108]:
{squared(i) for i in [1, 2, 3, 4]}
Out[108]:
{1, 4, 9, 16}

Cartesian product using list comprehensions

image courtesy of [Quartl](https://commons.wikimedia.org/wiki/User:Quartl)

The Cartesian product of two collections $X = A \times B$ can be expressed by using multiple for clauses in a comprehension.

Example

In [109]:
A = {'x', 'y', 'z'}
B = {1, 2, 3}
{(a,b) for a in A for b in B}
Out[109]:
{('x', 1),
 ('x', 2),
 ('x', 3),
 ('y', 1),
 ('y', 2),
 ('y', 3),
 ('z', 1),
 ('z', 2),
 ('z', 3)}

Cartesian products with other collections

  • The syntax for Cartesian products can be used with any collection type.
In [110]:
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')

[(first_name, surname) for first_name in first_names for surname in surnames]
Out[110]:
[('Steve', 'Smith'),
 ('Steve', 'Doe'),
 ('Steve', 'Rabbit'),
 ('John', 'Smith'),
 ('John', 'Doe'),
 ('John', 'Rabbit'),
 ('Peter', 'Smith'),
 ('Peter', 'Doe'),
 ('Peter', 'Rabbit')]
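
The standard library offers the same operation as itertools.product(). The sketch below checks that it yields the same pairs, in the same order, as the nested comprehension (the names pairs and same are illustrative):

```python
from itertools import product

first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe', 'Rabbit')

pairs = list(product(first_names, surnames))
same = pairs == [(f, s) for f in first_names for s in surnames]
print(len(pairs), same)
```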

Joining collections using a zip

  • The Cartesian product pairs every combination of elements.

  • If we want a 1-1 pairing we use an operation called a zip.

  • A zip pairs values at the same position in each sequence.

  • Note that:

    • zip accepts any iterable, but pairing by position is only meaningful for ordered sequences (not sets); and
    • if the inputs differ in length, zip stops at the end of the shortest one.
In [111]:
list(zip(first_names, surnames))
Out[111]:
[('Steve', 'Smith'), ('John', 'Doe'), ('Peter', 'Rabbit')]
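
The sketch below shows what zip() does when the inputs differ in length, how itertools.zip_longest() pads instead of truncating, and how zipped pairs can be turned into a dictionary. The shortened surnames tuple and the result names are assumptions made for illustration:

```python
from itertools import zip_longest

first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe')   # deliberately one element shorter

# zip() stops as soon as the shortest input runs out.
truncated = list(zip(first_names, surnames))

# zip_longest() pads the shorter input with a fill value.
padded = list(zip_longest(first_names, surnames, fillvalue='?'))

# A sequence of pairs converts directly into a dictionary.
lookup = dict(zip(first_names, surnames))
print(truncated, padded, lookup)
```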

Anonymous Function Literals

  • We can also write anonymous functions.
  • These are function literals, and do not necessarily have a name.
  • They are called lambda expressions (after the $\lambda$-calculus).
In [112]:
list(map(lambda x: x ** 2, [1, 2, 3, 4]))
Out[112]:
[1, 4, 9, 16]

Filtering data

  • We can filter a list by applying a predicate to each element of the list.

  • A predicate is a function which takes a single argument, and returns a boolean value.

  • filter(p, X) is equivalent to $\{ x \in X : p(x) \}$ in set-builder notation.

In [113]:
list(filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1]))
Out[113]:
[2, 3, 1]

We can use both filter() and map() on other collections such as strings or sets.

In [114]:
list(filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1}))
Out[114]:
[1, 2, 3]

Filtering using a list comprehension

  • Again, because this is such a common operation, we can use simpler syntax to say the same thing.

  • We can express a filter using a list-comprehension by using the keyword if:

In [115]:
data = [-5, 2, 3, -10, 0, 1]
[x for x in data if x > 0]
Out[115]:
[2, 3, 1]
  • We can also filter and then map in the same expression:
In [116]:
from numpy import sqrt
[sqrt(x) for x in data if x > 0]
Out[116]:
[1.4142135623730951, 1.7320508075688772, 1.0]

The reduce function

  • The reduce() function cumulatively applies a two-argument function to the elements of a list from left to right, combining the running result with the next element, so that the whole list is reduced to a single return value.
In [117]:
from functools import reduce
reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])
Out[117]:
15
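
To make the order of evaluation concrete, the following sketch compares reduce() with an explicit loop that performs the same accumulation, and shows the optional third argument that supplies a starting value (the variable names are illustrative):

```python
from functools import reduce

values = [0, 1, 2, 3, 4, 5]

# Equivalent to ((((0 + 1) + 2) + 3) + 4) + 5.
total = reduce(lambda x, y: x + y, values)

# An explicit loop that performs the same accumulation.
acc = values[0]
for v in values[1:]:
    acc = acc + v

# A third argument supplies the starting value; it is required for empty input.
empty_total = reduce(lambda x, y: x + y, [], 0)
print(total, acc, empty_total)
```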

Big Data

  • The map() and reduce() functions form the basis of the map-reduce programming model.

  • Map-reduce is the basis of modern highly-distributed large-scale computing frameworks.

  • It is used in Google's MapReduce, Hadoop and Apache Spark.

  • See these examples in Python for Apache Spark.
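
As a rough single-process illustration of the map-reduce idea, the sketch below counts words across several documents using only map() and reduce(). The documents list and the helper functions count_words and merge_counts are hypothetical, and this is not the API of Hadoop or Spark; those frameworks distribute the same two phases over many machines.

```python
from functools import reduce

documents = ['the cat sat', 'the dog sat', 'the end']

def count_words(text):
    # Map step: turn one document into a dictionary of word counts.
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge_counts(left, right):
    # Reduce step: combine two partial results into a new dictionary.
    merged = dict(left)
    for word, n in right.items():
        merged[word] = merged.get(word, 0) + n
    return merged

partial_counts = map(count_words, documents)
totals = reduce(merge_counts, partial_counts, {})
print(totals)
```

Because each document is mapped independently and merge_counts is associative, the work could in principle be split across workers and the partial results merged in any grouping.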