Introduction to Python

Copyright Steve Phelps 2014

Python is interpreted

  • Python is an interpreted language, in contrast to Java and C, which are compiled languages.

  • This means we can type statements into the interpreter and they are executed immediately.

In [5]:
5 + 5
Out[5]:
10
In [2]:
x = 5
y = 'Hello There'
z = 10.5
In [3]:
x + 5
Out[3]:
10

Assignments versus equations

  • In Python when we write x = 5 this means something different from an equation $x=5$.

  • Unlike variables in mathematical models, variables in Python can refer to different things as more statements are interpreted.

In [14]:
x = 1
print 'The value of x is ', x

x = 2.5
print 'Now the value of x is ', x

x = 'hello there'
print 'Now it is ', x
The value of x is  1
Now the value of x is  2.5
Now it is  hello there
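
Because an assignment rebinds a name rather than asserting equality, a statement such as x = x + 1 is valid code, although it would be a contradiction if read as an equation. A minimal sketch, written in Python 3 syntax (where print is a function):

```python
x = 1
# The right-hand side is evaluated first (1 + 1), then x is rebound to the result.
x = x + 1
print(x)  # prints 2
```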

Calling Functions

We can call functions in the conventional way, using round brackets.

In [37]:
print round(3.14)
3.0

Types

  • Values in Python have an associated type.

  • If we combine types incorrectly we get an error.

In [4]:
print y
Hello There
In [5]:
y + 5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-c3d4037f4656> in <module>()
----> 1 y + 5

TypeError: cannot concatenate 'str' and 'int' objects
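
One way to avoid this error is to convert one of the operands explicitly so that both have compatible types. The sketch below uses Python 3 syntax; the variable names message and total are only illustrative.

```python
y = 'Hello There'

# Convert the integer to a string before concatenating.
message = y + ' ' + str(5)
print(message)  # Hello There 5

# Or convert a numeric string to an integer before adding.
total = int('5') + 5
print(total)  # 10
```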

The type function

  • We can query the type of a value using the type function.
In [16]:
type(1)
Out[16]:
int
In [17]:
type('hello')
Out[17]:
str
In [18]:
type(2.5)
Out[18]:
float
In [29]:
type(True)
Out[29]:
bool
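
When the goal is to test a type rather than display it, the built-in isinstance() function is a common alternative to comparing the result of type(). A short sketch in Python 3 syntax, with illustrative variable names:

```python
value = 2.5

exact_match = type(value) is float            # compares the exact type
is_float = isinstance(value, float)           # also accepts subclasses
is_number = isinstance(value, (int, float))   # a tuple checks several types

print(exact_match, is_float, is_number)  # True True True

# bool is a subclass of int, so this check succeeds as well.
print(isinstance(True, int))  # True
```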

Null values

  • Sometimes we need to represent "no data" or "not applicable".

  • In Python we use the special value None.

  • This corresponds to null in Java or NULL in SQL.

In [6]:
result = None
  • When we fetch the value None in the interactive interpreter, no result is printed out.
In [7]:
result
  • We can check whether there is a result or not using the is operator:
In [8]:
result is None
Out[8]:
True
In [9]:
x = 5
x is None
Out[9]:
False
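
A typical use of None is as the return value of a function that may have nothing to report. The function find_first_negative below is a hypothetical example, written in Python 3 syntax:

```python
def find_first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for v in values:
        if v < 0:
            return v
    return None

result = find_first_negative([3, 1, 4])
if result is None:
    print('no negative value found')
else:
    print('found', result)
```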

Converting values between types

  • We can convert values between different types.

  • To convert an integer to a floating-point number use the float() function.

  • To convert a floating-point number to an integer use the int() function, which discards the fractional part.
In [7]:
x = 1
print type(x)
print x
<type 'int'>
1
In [6]:
y = float(x)
print type(y)
print y
<type 'float'>
1.0
In [8]:
print int(y)
1
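
Note that int() truncates towards zero rather than rounding, and that both int() and float() also accept strings containing a number. A brief sketch in Python 3 syntax, with illustrative variable names:

```python
truncated_positive = int(2.9)    # fractional part discarded
truncated_negative = int(-2.9)   # truncation is towards zero, not downwards
parsed = float('10.5')           # strings holding a number can be converted too

print(truncated_positive)  # 2
print(truncated_negative)  # -2
print(parsed)              # 10.5
```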

Converting to and from ASCII values

  • The function ord() converts a character to its ASCII code, and chr() converts an ASCII code back to a character.
In [4]:
print ord('a')
97
In [5]:
print chr(97)
a
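
Since ord() and chr() are inverses of one another, they can be combined to do arithmetic on characters. The helper next_letter is a hypothetical name used only for this sketch (Python 3 syntax):

```python
def next_letter(c):
    """Return the character whose code is one greater than that of c."""
    return chr(ord(c) + 1)

print(next_letter('a'))  # b

# Convert every character of a string to its code.
codes = [ord(c) for c in 'abc']
print(codes)  # [97, 98, 99]
```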

Variables are not typed

  • Variables themselves, on the other hand, do not have a fixed type.
  • It is only the values that they refer to that have a type.
  • This means that the type referred to by a variable can change as more statements are interpreted.
In [15]:
y = 'hello'
print 'The type of the value referred to by y is ', type(y)
y = 5.0
print 'And now the type of the value is ', type(y)
The type of the value referred to by y is  <type 'str'>
And now the type of the value is  <type 'float'>

Polymorphism

  • The meaning of an operator depends on the types we are applying it to.
In [17]:
1 / 5
Out[17]:
0
In [18]:
1.0 / 5.0
Out[18]:
0.2
In [19]:
1 + 1
Out[19]:
2
In [30]:
'a' + 'b'
Out[30]:
'ab'
In [31]:
'1' + '1'
Out[31]:
'11'
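
The integer result of 1 / 5 shown above is Python 2 behaviour. In Python 3 the / operator always performs true division, and the separate // operator performs floor division. A short sketch for Python 3, with illustrative variable names:

```python
true_division = 1 / 5      # always a float in Python 3
floor_division = 1 // 5    # rounds down to an integer
negative_floor = -7 // 2   # floors towards negative infinity
repeated = 'ab' * 3        # * means repetition when applied to a string

print(true_division)   # 0.2
print(floor_division)  # 0
print(negative_floor)  # -4
print(repeated)        # ababab
```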

Conditional Statements and Indentation

  • The syntax for control structures in Python uses colons and indentation.

  • Beware that white-space affects the semantics of Python code.

In [1]:
x = 5
if x > 0:
    print 'x is strictly positive'
    print x
    
print 'finished.'
x is strictly positive
5
finished.
In [1]:
x = 0
if x > 0:
    print 'x is strictly positive'
print x
    
print 'finished'
0
finished
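
The same colon-and-indentation rules apply when a conditional has several branches, which are introduced with elif and else. The function describe is a hypothetical example (Python 3 syntax):

```python
def describe(x):
    """Classify a number by its sign."""
    if x > 0:
        return 'strictly positive'
    elif x == 0:
        return 'zero'
    else:
        return 'strictly negative'

print(describe(5))   # strictly positive
print(describe(0))   # zero
print(describe(-2))  # strictly negative
```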

Lists

We can use lists to hold an ordered sequence of values.

In [28]:
l = ['first', 'second', 'third']
print l
['first', 'second', 'third']

Lists can contain values of different types, even in the same list.

In [29]:
another_list = ['first', 'second', 'third', 1, 2, 3]
print another_list
['first', 'second', 'third', 1, 2, 3]

Mutable Data Structures

Lists are mutable; their contents can change as more statements are interpreted.

In [32]:
l.append('fourth')
print l
['first', 'second', 'third', 'fourth']

References

  • Whenever we bind a variable to a value in Python we create a reference.

  • A reference is distinct from the value that it refers to.

  • Variables are names for references.

In [22]:
X = [1, 2, 3]
Y = X
  • The above code creates two different references (named X and Y) to the same value [1, 2, 3].

  • Because lists are mutable, changing them can have side-effects on other variables.

  • If we append something to X what will happen to Y?

In [23]:
X.append(4)
X
Out[23]:
[1, 2, 3, 4]
In [24]:
Y
Out[24]:
[1, 2, 3, 4]
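
If this side-effect is not wanted, one option is to bind Y to a copy of the list instead of to the same list. A minimal sketch in Python 3 syntax; note that list() makes a shallow copy, so nested mutable values would still be shared.

```python
X = [1, 2, 3]
Y = list(X)   # a new list containing the same elements

X.append(4)   # changes only the list that X refers to
print(X)  # [1, 2, 3, 4]
print(Y)  # [1, 2, 3]
```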

State and identity

  • The state referred to by a variable is different from its identity.

  • To compare state use the == operator.

  • To compare identity use the is operator.

  • When we compare identity we check equality of references.

  • When we compare state we check equality of values.

In [33]:
X = [1, 2]
Y = [1]
Y.append(2)
In [34]:
X == Y
Out[34]:
True
In [35]:
X is Y
Out[35]:
False
In [36]:
Y.append(3)
X
Out[36]:
[1, 2]
In [38]:
X = Y
In [39]:
X is Y
Out[39]:
True

Iteration

  • We can iterate over each element of a list in turn using a for loop:
In [33]:
for i in l:
    print i
first
second
third
fourth
  • To execute a statement a certain number of times, we can iterate over a list of the required size.
In [39]:
for i in [0, 1, 2, 3]:
    print "Hello!"
Hello!
Hello!
Hello!
Hello!

For loops with the range function

  • To save from having to manually write the numbers out, we can use the function range() to count for us. As in Java and C, we count starting at 0.
In [40]:
range(4)
Out[40]:
[0, 1, 2, 3]
In [42]:
for i in range(4):
    print "Hello!"
Hello!
Hello!
Hello!
Hello!
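
The range() function also accepts a start value and a step. In Python 3 it returns a lazy sequence rather than a list, so the sketch below wraps it in list() to display the values (variable names are illustrative):

```python
counting = list(range(4))         # stop only: starts at 0
window = list(range(2, 6))        # start (inclusive) and stop (exclusive)
stepped = list(range(0, 10, 3))   # start, stop and step

print(counting)  # [0, 1, 2, 3]
print(window)    # [2, 3, 4, 5]
print(stepped)   # [0, 3, 6, 9]
```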

List Indexing

  • Lists can be indexed using square brackets to retrieve the element stored in a particular position.
In [43]:
print l[0]
first
In [44]:
print l[1]
second

List Slicing

  • We can also specify a range of positions.

  • This is called slicing.

  • The example below indexes from position 0 (inclusive) to 2 (exclusive).

In [45]:
print l[0:2]
['first', 'second']
  • If we leave out the starting index it implies the beginning of the list:
In [47]:
print l[:2]
['first', 'second']
  • If we leave out the final index it implies the end of the list:
In [48]:
print l[2:]
['third', 'fourth']

Negative Indexing

  • Negative indices count from the end of the list:
In [49]:
print l[-1]
fourth
In [50]:
print l[:-1]
['first', 'second', 'third']
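
Slices also accept an optional third value, the step, and negative values can be combined with it. A short sketch in Python 3 syntax that rebuilds the list l from the earlier cells:

```python
l = ['first', 'second', 'third', 'fourth']

every_other = l[::2]    # step of 2 over the whole list
reversed_l = l[::-1]    # a negative step walks backwards
last_two = l[-2:]       # from the second-to-last element to the end

print(every_other)  # ['first', 'third']
print(reversed_l)   # ['fourth', 'third', 'second', 'first']
print(last_two)     # ['third', 'fourth']
```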

Collections

  • Lists are an example of a collection.

  • A collection is a type of value that can contain other values.

  • There are other collection types in Python:

    • tuple
    • set
    • dict
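
As a brief illustration of the last of these, a dict maps keys to values. The keys and values in the sketch below are made up purely for illustration (Python 3 syntax):

```python
stock = {'apples': 3, 'pears': 0}   # keys map to values
stock['plums'] = 7                  # dicts are mutable: add a new key

print(stock['apples'])    # 3
print('pears' in stock)   # True (membership tests look at the keys)
print(sorted(stock))      # ['apples', 'pears', 'plums']
```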

Tuples

  • Tuples are another way to combine different values.

  • The combined values can be of different types.

  • Like lists, they have a well-defined ordering and can be indexed.

  • To create a tuple in Python, use round brackets instead of square brackets.

In [12]:
tuple1 = (50, 'hello')
tuple1
Out[12]:
(50, 'hello')
In [3]:
tuple1[0]
Out[3]:
50
In [4]:
type(tuple1)
Out[4]:
tuple

Tuples are immutable

  • Unlike lists, tuples are immutable. Once we have created a tuple we cannot add values to it.
In [11]:
tuple1.append(2)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-a29bee310764> in <module>()
----> 1 tuple1.append(2)

AttributeError: 'tuple' object has no attribute 'append'
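
Although a tuple cannot be changed in place, we can build a new tuple from an existing one, and we can unpack a tuple into separate variables. A sketch in Python 3 syntax; the names tuple2, number and greeting are illustrative.

```python
tuple1 = (50, 'hello')

# Concatenation creates a new tuple; tuple1 itself is unchanged.
tuple2 = tuple1 + (2,)
print(tuple2)  # (50, 'hello', 2)
print(tuple1)  # (50, 'hello')

# Unpacking assigns each element to its own variable.
number, greeting = tuple1
print(number)    # 50
print(greeting)  # hello
```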

Sets

  • Lists can contain duplicate values.

  • A set, in contrast, contains no duplicates.

  • Sets can be created from lists using the set() function.

In [1]:
X = set([1, 2, 3, 3, 4])
X
Out[1]:
{1, 2, 3, 4}
In [8]:
type(X)
Out[8]:
set
  • Alternatively we can write a set literal using the { and } brackets.
In [27]:
X = {1, 2, 3, 4}
type(X)
Out[27]:
set

Sets are mutable

  • Sets are mutable like lists:
In [3]:
X.add(5)
X
Out[3]:
{1, 2, 3, 4, 5}
  • Duplicates are automatically removed:
In [4]:
X.add(5)
X
Out[4]:
{1, 2, 3, 4, 5}

Sets are unordered

  • Sets do not have an ordering.

  • Therefore we cannot index or slice them:

In [28]:
X[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-1d61f6e5db90> in <module>()
----> 1 X[0]

TypeError: 'set' object does not support indexing

Operations on sets

  • Union: $X \cup Y$
In [11]:
X = {1, 2, 3}
Y = {4, 5, 6}
X.union(Y)
Out[11]:
{1, 2, 3, 4, 5, 6}
  • Intersection: $X \cap Y$:
In [12]:
X = {1, 2, 3, 4}
Y = {3, 4, 5}
X.intersection(Y)
Out[12]:
{3, 4}
  • Difference $X - Y$:
In [13]:
X - Y
Out[13]:
{1, 2}
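
The union and intersection methods also have operator forms, and membership can be tested with the in keyword. The sketch below (Python 3 syntax, illustrative variable names) compares results against set literals rather than printing sets, because the printed order of a set is not guaranteed.

```python
X = {1, 2, 3, 4}
Y = {3, 4, 5}

union = X | Y          # same as X.union(Y)
intersection = X & Y   # same as X.intersection(Y)
difference = X - Y     # elements of X that are not in Y
symmetric = X ^ Y      # elements in exactly one of the two sets

print(union == {1, 2, 3, 4, 5})   # True
print(intersection == {3, 4})     # True
print(symmetric == {1, 2, 5})     # True
print(3 in X)                     # True
```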

Arrays

  • Python also has fixed-length arrays which contain a single type of value

  • i.e. we cannot have different types of value within the same array.

  • Arrays are provided by a separate module called numpy. Modules correspond to packages in e.g. Java.

  • We can import the module and then give it a shorter alias.

In [10]:
import numpy as np
  • We can now use the functions defined in this package by prefixing them with np.

  • The function array() creates an array given a list.

In [55]:
x = np.array([0, 1, 2, 3, 4])
print x
print type(x)
[0 1 2 3 4]
<type 'numpy.ndarray'>

Functions over arrays

  • When we use arithmetic operators on arrays, we create a new array with the result of applying the operator to each element.
In [56]:
y = x * 2
print y
[0 2 4 6 8]
  • The same goes for functions:
In [4]:
x = np.array([-1, 2, 3, -4])
y = abs(x)
print y
[1 2 3 4]

Populating Arrays

  • To populate an array with a range of values we use the np.arange() function:
In [58]:
x = np.arange(0, 10)
print x
[0 1 2 3 4 5 6 7 8 9]
  • We can also use floating point increments.
In [11]:
x = np.arange(0, 1, 0.1)
print x
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9]

Basic Plotting

  • We will use a module called matplotlib to plot some simple graphs.

  • This module provides functions which are very similar to MATLAB plotting commands.

In [12]:
import matplotlib.pyplot as plt
%matplotlib inline

y = x*2 + 5
plt.plot(x, y)
Out[12]:
[<matplotlib.lines.Line2D at 0x7fdbb00a69d0>]

Plotting a sine curve

In [13]:
from numpy import pi, sin

x = np.arange(0, 2*pi, 0.01)
y = sin(x)
plt.plot(x, y)
Out[13]:
[<matplotlib.lines.Line2D at 0x7fdba98de290>]

Plotting a histogram

  • We can use the hist() function in matplotlib to plot a histogram
In [37]:
# Generate some random data
data = np.random.randn(1000)

ax = plt.hist(data)

Computing histograms as matrices

  • The function histogram() in the numpy module counts frequencies into bins and returns a tuple of two arrays: the count in each bin, followed by the edges of the bins.
In [38]:
np.histogram(data)
Out[38]:
(array([  1,  14,  44, 122, 202, 288, 186, 105,  26,  12]),
 array([-3.78273819, -3.08261624, -2.3824943 , -1.68237235, -0.98225041,
       -0.28212847,  0.41799348,  1.11811542,  1.81823737,  2.51835931,
        3.21848125]))
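
To make the two parts of this result concrete, here is a rough pure-Python sketch of equal-width binning. The function simple_histogram is a hypothetical helper written for illustration; it is not how numpy is implemented, and numpy may treat floating-point edge cases differently.

```python
def simple_histogram(values, bins=10):
    """Count values into equal-width bins; return (counts, edges)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError('values must not all be equal in this sketch')
    width = (hi - lo) / bins
    # There is always one more edge than there are bins.
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # The last bin is closed on the right, so the maximum falls into it.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts, edges

counts, edges = simple_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5)
print(counts)  # [2, 2, 2, 2, 3]
print(edges)   # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```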

Defining new functions

In [13]:
def squared(x):
    return x ** 2

print squared(5)
25

Local Variables

  • Variables created inside functions are local to that function.

  • They are not accessible to code outside of that function.

In [41]:
def squared(x):
    result = x ** 2
    return result

print squared(5)
25
In [15]:
print result
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-15-819e2eca8219> in <module>()
----> 1 print result

NameError: name 'result' is not defined
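
A related consequence is that assigning to a name inside a function does not affect a variable of the same name outside it. A minimal sketch in Python 3 syntax, reusing the names from the cells above:

```python
result = 'outer value'

def squared(x):
    result = x ** 2   # a new local variable; the outer result is untouched
    return result

print(squared(5))  # 25
print(result)      # outer value
```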

Functional Programming

  • Functions are first-class citizens in Python.

  • They can be passed around just like any other value.

In [16]:
print(squared)
<function squared at 0x7ffd3ea6ee60>
In [17]:
y = squared
print y
<function squared at 0x7ffd3ea6ee60>
In [18]:
print y(5)
25
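
Being first-class also means that a function can be passed as an argument to another function. The function apply_twice below is a hypothetical example written for this sketch (Python 3 syntax):

```python
def squared(x):
    return x ** 2

def apply_twice(f, x):
    """Apply the function f to x, then apply f again to the result."""
    return f(f(x))

print(apply_twice(squared, 3))  # 81, i.e. (3 ** 2) ** 2
```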

Mapping the elements of a collection

  • We can apply a function to each element of a collection using the built-in function map().

  • map() works with any collection: list, set, tuple or string.

  • It takes as arguments a function and the collection we want to apply it to.

  • It returns the results of applying the function, as a list.

In [16]:
map(squared, [1, 2, 3, 4])
Out[16]:
[1, 4, 9, 16]
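
The list shown above is what Python 2 returns. In Python 3, map() returns a lazy iterator instead, so it is usually wrapped in list() when a list is needed. A short sketch for Python 3, with illustrative variable names:

```python
def squared(x):
    return x ** 2

# list() consumes the iterator and collects the results.
squares = list(map(squared, [1, 2, 3, 4]))
print(squares)  # [1, 4, 9, 16]

# Any function of one argument can be mapped, including built-ins.
lengths = list(map(len, ('a', 'bb', 'ccc')))
print(lengths)  # [1, 2, 3]
```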

List Comprehensions

  • Because this is such a common operation, Python has a special syntax to do the same thing, called a list comprehension.
In [17]:
[squared(i) for i in [1, 2, 3, 4]]
Out[17]:
[1, 4, 9, 16]
  • If we want a set instead of a list we can use a set comprehension:
In [42]:
{squared(i) for i in [1, 2, 3, 4]}
Out[42]:
{1, 4, 9, 16}

Cartesian product using list comprehensions

The Cartesian product of two collections $X = A \times B$ can be expressed by using multiple for statements in a comprehension.

In [43]:
A = {'x', 'y', 'z'}
B = {1, 2, 3}
{(a,b) for a in A for b in B}
Out[43]:
{('x', 1),
 ('x', 2),
 ('x', 3),
 ('y', 1),
 ('y', 2),
 ('y', 3),
 ('z', 1),
 ('z', 2),
 ('z', 3)}

Cartesian products with other collections

  • The syntax for Cartesian products can be used with any collection type.
In [25]:
first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe')

[(first_name, surname) for first_name in first_names for surname in surnames]
Out[25]:
[('Steve', 'Smith'),
 ('Steve', 'Doe'),
 ('John', 'Smith'),
 ('John', 'Doe'),
 ('Peter', 'Smith'),
 ('Peter', 'Doe')]
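
The standard library offers the same operation as the function product() in the itertools module. The sketch below (Python 3 syntax, illustrative variable names) checks that it yields the same pairs, in the same order, as the comprehension above.

```python
from itertools import product

first_names = ('Steve', 'John', 'Peter')
surnames = ('Smith', 'Doe')

pairs = list(product(first_names, surnames))
comprehension = [(f, s) for f in first_names for s in surnames]

print(pairs == comprehension)  # True
print(len(pairs))              # 6
```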

Anonymous Function Literals

  • We can also write anonymous functions.
  • These are function literals, and do not necessarily have a name.
  • They are called lambda expressions (after the $\lambda$-calculus).
In [21]:
map(lambda x: x ** 2, [1, 2, 3, 4])
Out[21]:
[1, 4, 9, 16]
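
Lambda expressions are convenient wherever a small function is needed only once, for example as the key argument of the built-in sorted() function. A sketch in Python 3 syntax with made-up data:

```python
words = ['banana', 'fig', 'apple']

# Sort by length instead of alphabetically.
by_length = sorted(words, key=lambda w: len(w))
print(by_length)  # ['fig', 'apple', 'banana']
```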

Filtering data

  • We can filter a list by applying a predicate to each element of the list.

  • A predicate is a function which takes a single argument, and returns a boolean value.

  • filter(p, X) is equivalent to $\{ x \in X : p(x) \}$ in set-builder notation.

In [40]:
filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1])
Out[40]:
[2, 3, 1]

We can use both filter() and map() on other collections such as strings or sets.

In [1]:
filter(lambda x: x != ' ', 'hello world')
Out[1]:
'helloworld'
In [2]:
map(ord, 'hello world')
Out[2]:
[104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]
In [26]:
filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1})
Out[26]:
[1, 2, 3]
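
The outputs above are those of Python 2. In Python 3, filter() returns a lazy iterator whatever the type of its input, so the result has to be converted explicitly to the collection we want. A sketch for Python 3, with illustrative variable names:

```python
positives = list(filter(lambda x: x > 0, [-5, 2, 3, -10, 0, 1]))
print(positives)  # [2, 3, 1]

# For a string, join the surviving characters back together.
no_spaces = ''.join(filter(lambda c: c != ' ', 'hello world'))
print(no_spaces)  # helloworld

# A set has no ordering, so sort the result to get a predictable list.
from_set = sorted(filter(lambda x: x > 0, {-5, 2, 3, -10, 0, 1}))
print(from_set)  # [1, 2, 3]
```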

Filtering using a list comprehension

  • Again, because this is such a common operation, we can use simpler syntax to say the same thing.

  • We can express a filter using a list-comprehension by using the keyword if:

In [15]:
data = [-5, 2, 3, -10, 0, 1]
[x for x in data if x > 0]
Out[15]:
[2, 3, 1]
  • We can also filter and then map in the same expression:
In [20]:
from numpy import sqrt
[sqrt(x) for x in data if x > 0]
Out[20]:
[1.4142135623730951, 1.7320508075688772, 1.0]

The reduce function

  • The reduce() function repeatedly applies a two-argument function, from left to right, to a running result and the next element of the list, producing a single return value.
In [22]:
reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])
Out[22]:
15
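
In Python 3, reduce() is no longer a built-in and must be imported from the functools module. It also accepts an optional initial value, which is returned unchanged when the list is empty. A sketch for Python 3, with illustrative variable names:

```python
from functools import reduce

# Evaluated as ((((0 + 1) + 2) + 3) + 4) + 5.
total = reduce(lambda x, y: x + y, [0, 1, 2, 3, 4, 5])
print(total)  # 15

# The third argument is the starting value of the running result.
factorial = reduce(lambda x, y: x * y, [1, 2, 3, 4], 1)
print(factorial)  # 24

empty_total = reduce(lambda x, y: x + y, [], 0)
print(empty_total)  # 0
```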

Big Data

  • The map() and reduce() functions form the basis of the map-reduce programming model.

  • Map-reduce is the basis of modern highly-distributed large-scale computing frameworks.

  • It is used in BigTable, Hadoop and Apache Spark.

  • See these examples in Python for Apache Spark.
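
As a toy illustration of the idea, the sketch below counts words on a single machine by mapping each line to its own word counts and then reducing those partial results into one total. The helper names count_words and merge_counts and the sample lines are invented for this example; a real framework would distribute both steps across many machines. Python 3 syntax is assumed.

```python
from functools import reduce

lines = ['the cat', 'the dog', 'a cat']

def count_words(line):
    """Map step: turn one line into a dict of word counts."""
    counts = {}
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

def merge_counts(a, b):
    """Reduce step: combine two dicts of word counts into a new dict."""
    merged = dict(a)
    for word, n in b.items():
        merged[word] = merged.get(word, 0) + n
    return merged

totals = reduce(merge_counts, map(count_words, lines), {})
print(totals == {'the': 2, 'cat': 2, 'dog': 1, 'a': 1})  # True
```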

Reading Text Files

  • To read an entire text file as a list of lines use the readlines() method of a file object.
In [26]:
# The with statement closes the file automatically, even if an error occurs.
with open('/etc/group') as f:
    result = f.readlines()
In [27]:
# Print the first line
print result[0]
root:x:0:

To concatenate into a single string:

In [ ]:
single_string = ''.join(result)
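
Each line returned by readlines() keeps its trailing newline character. The sketch below strips it while iterating over the file directly. It is written in Python 3 syntax and first creates a small temporary file, with made-up contents, so that it does not depend on any particular file existing on the system.

```python
import os
import tempfile

# Create a small temporary file so that the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

# A file object can be iterated over line by line.
with open(path) as f:
    lines = [line.rstrip('\n') for line in f]

os.remove(path)  # clean up the temporary file

print(lines)  # ['first line', 'second line']
```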