Introduction to Python (A crash course)

For those of you who know Python, this aims to refresh your memory. For those of you who don't know Python but do know programming, this class aims to give you an idea of how Python is similar to, and different from, your favorite programming language.

Printing

From the interactive python environment:

In [1]:
print "Hello World"
Hello World

From a file:

In [2]:
#!/usr/bin/env python

print "Hello World!"
Hello World!

Standard I/O

In [3]:
#  Writing to standard out:

print "Python is awesome!"
Python is awesome!
In [4]:
#  Reading from standard input and writing to standard output

name = raw_input("What is your name?")
print name
What is your name?evimaria
evimaria

Data types

Basic data types:

  1. Strings
  2. Integers
  3. Floats
  4. Booleans

These are all objects in Python.

In [5]:
#String
a = "apple"
type(a)
#print type(a)
Out[5]:
str
In [6]:
#Integer 
b = 3
type(b)
#print type(b)
Out[6]:
int
In [7]:
#Float  
c = 3.2
type(c)
#print type(c)
Out[7]:
float
In [8]:
#Boolean
d = True
type(d)
#print type(d)
Out[8]:
bool

Unlike C and some other languages, Python doesn't require you to declare variable types explicitly.

Pay special attention to whether a value is an integer or a float when you assign it to a variable; otherwise you may get values you do not expect in your programs.

In [9]:
14/b
Out[9]:
4
In [10]:
14/c
Out[10]:
4.375

In Python 2, if you divide an integer by an integer, the answer is an integer: the result is rounded down (floored), not rounded to the nearest integer, which is why 14/3 gives 4 rather than 5. If you want a floating point answer, one of the numbers must be a float. Simply appending a decimal point will do the trick:

In [11]:
14./b
Out[11]:
4.666666666666667
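
A minimal sketch of the division rules, written so that it runs under both Python 2 and Python 3 (print is called with parentheses for that reason). The variable names are only illustrative. The `//` operator always floors, and converting one operand with `float()` forces a floating point result on either version.

```python
floored = 14 // 3          # floor division: always an integer result
negative = -14 // 3        # floors toward negative infinity, not toward zero
as_float = float(14) / 3   # convert one operand explicitly to get a float

print(floored)
print(negative)
print(as_float)
```

Note that flooring and truncating differ for negative numbers: `-14 // 3` is -5, not -4.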

Strings

String manipulation will be very important for many of the tasks we will do. Therefore let us play around a bit with strings.

In [12]:
#Concatenating strings

a = "Hello"  # String
b = " World" # Another string
print a + b  # Concatenation
Hello World
In [13]:
# Slicing strings

a = "World"

print a[0]
print a[-1]
print "World"[0:4]
print a[::-1]
W
d
Worl
dlroW
In [14]:
# Popular string functions
a = "Hello World"
print "-".join(a)
print a.startswith("Wo")
print a.endswith("rld")
print a.replace("o","0").replace("d","[)").replace("l","1")
print a.split()
print a.split('o')
H-e-l-l-o- -W-o-r-l-d
False
True
He110 W0r1[)
['Hello', 'World']
['Hell', ' W', 'rld']

Strings are an example of an immutable data type. Once you instantiate a string, you cannot change any of its characters.

In [15]:
string = "string"
string[-1] = "y"  #Here we attempt to assign the last character in the string to "y"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-b377f6c05723> in <module>()
      1 string = "string"
----> 2 string[-1] = "y"  #Here we attempt to assign the last character in the string to "y"

TypeError: 'str' object does not support item assignment
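
Since a string cannot be changed in place, the usual workaround is to build a new string. A minimal sketch (the variable names are illustrative, and print is called with parentheses so the snippet runs under both Python 2 and Python 3):

```python
string = "string"

# Option 1: slice off the last character and concatenate a new one
via_slice = string[:-1] + "y"

# Option 2: convert to a (mutable) list, edit it, and join it back together
chars = list(string)
chars[-1] = "y"
via_list = "".join(chars)

print(via_slice)
print(via_list)
print(string)   # the original string is untouched
```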

Whitespace in Python

Python uses indents and whitespace to group statements together. To write a short loop in C, you might use:

for (i = 0; i < 5; i++) {
    printf("Hi \n");
}

Python does not use curly braces like C, so the same program as above is written in Python as follows:

In [16]:
for i in range(5):
    print "Hi \n"
Hi 

Hi 

Hi 

Hi 

Hi 

If you have nested for-loops, there is a further indent for the inner loop.

In [17]:
for i in range(3):
    for j in range(3):
        print i, j
    
    print "This statement is within the i-loop, but not the j-loop"
0 0
0 1
0 2
This statement is within the i-loop, but not the j-loop
1 0
1 1
1 2
This statement is within the i-loop, but not the j-loop
2 0
2 1
2 2
This statement is within the i-loop, but not the j-loop

File I/O

In [18]:
# Writing to a file
with open("example.txt", "w") as f:
    f.write("Hello World! \n")
    f.write("How are you? \n")
    f.write("I'm fine.")
In [19]:
# Reading from a file
with open("example.txt", "r") as f:
    data = f.readlines()
    for line in data:
        words = line.split()
        print words
['Hello', 'World!']
['How', 'are', 'you?']
["I'm", 'fine.']
In [20]:
# Count lines and words in a file
lines = 0
words = 0
the_file = "example.txt"

with open(the_file, 'r') as f:
    for line in f:
        lines += 1
        words += len(line.split())
print "There are %i lines and %i words in the %s file." % (lines, words, the_file)
There are 3 lines and 7 words in the example.txt file.

Lists, Tuples, Sets and Dictionaries

Numbers and strings alone are not enough! We need data types that can hold multiple values.

Lists:

Lists are mutable, that is, they can be altered after they are created. A list is a collection of items, and those items can be of differing types.

In [21]:
groceries = []

# Add to list
groceries.append("oranges")  
groceries.append("meat")
groceries.append("asparagus")

# Access by index
print groceries[2]
print groceries[0]

# Find number of things in list
print len(groceries)

# Sort the items in the list
groceries.sort()
print groceries

# List Comprehension (compare values with !=; "is not" tests identity)
veggie = [x for x in groceries if x != "meat"]
print veggie

# Remove from list
groceries.remove("asparagus")
print groceries

#The list is mutable
groceries[0] = 2
print groceries
asparagus
oranges
3
['asparagus', 'meat', 'oranges']
['asparagus', 'oranges']
['meat', 'oranges']
[2, 'oranges']

List Comprehension

Recall the mathematical notation:

$$L_1 = \left\{x^2 : x \in \{0\ldots 9\}\right\}$$

$$L_2 = \left(1, 2, 4, 8,\ldots, 2^{12}\right)$$

$$M = \left\{x \mid x \in L_1 \text{ and } x \text{ is even}\right\}$$

In [22]:
L1 = [x**2 for x in range(10)]
L2 = [2**i for i in range(13)]
L3 = [x for x in L1 if x % 2 == 0]
print L1
print L2 
print L3
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]
[0, 4, 16, 36, 64]

Prime numbers with list comprehension

In [23]:
noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
print noprimes
primes = [x for x in range(2, 50) if x not in noprimes]
print primes
[4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 10, 15, 20, 25, 30, 35, 40, 45, 12, 18, 24, 30, 36, 42, 48, 14, 21, 28, 35, 42, 49]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
In [24]:
primes = [x for x in range(2, 50) if x not in [j for i in range(2, 8) for j in range(i*2, 50, i) ]]
print primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Tuples:

Tuples are an immutable type. Like strings, once you create them, you cannot change them. It is their immutability that allows you to use them as keys in dictionaries. However, they are similar to lists in that they are a collection of data and that data can be of differing types.

In [25]:
# Tuple grocery list

groceries = ('orange', 'meat', 'asparagus', 2.5, True)

print groceries

#print groceries[2]

#groceries[2] = 'milk'
('orange', 'meat', 'asparagus', 2.5, True)
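
To illustrate the point about dictionary keys, here is a minimal sketch with made-up data (the `grid` dictionary and its labels are hypothetical). A tuple works as a key, while a list in the same position raises a TypeError because lists are mutable and therefore unhashable.

```python
# Hypothetical example data: (row, column) coordinates mapped to labels
grid = {(0, 0): "origin", (1, 2): "treasure"}
label = grid[(1, 2)]
print(label)

try:
    bad = {[0, 0]: "origin"}   # a list cannot be used as a dictionary key
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(list_key_ok)
```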

Sets:

A set is an unordered collection of items that cannot contain duplicates. Sets support the same operations as sets in mathematics.

In [26]:
numbers = range(10)
evens = [2, 4, 6, 8]

evens = set(evens)
numbers = set(numbers)

# Use difference to remove the evens (0 remains, because it is not listed in evens)
odds = numbers - evens

print odds

# Note: Set also allows for use of union (|), and intersection (&)
set([0, 1, 3, 5, 7, 9])
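
A short sketch of the other set operations mentioned in the note, using small made-up sets. The results are passed through `sorted()` only to get a predictable printing order, since sets themselves are unordered; print is called with parentheses so the snippet runs under both Python 2 and Python 3.

```python
evens = set([0, 2, 4, 6, 8])
small = set([0, 1, 2, 3])

union = evens | small         # items in either set
common = evens & small        # items in both sets
deduped = set([1, 1, 2, 3, 3])  # duplicates collapse automatically

print(sorted(union))
print(sorted(common))
print(sorted(deduped))
```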

Dictionaries:

A dictionary is a map of keys to values. Keys must be unique.

In [27]:
# A simple dictionary

simple_dic = {'cs591': 'data-mining tools'}

# Access by key
print simple_dic['cs591']
data-mining tools
In [28]:
# A longer dictionary
classes = {
    'cs591': 'data-mining tools',
    'cs565': 'data-mining algorithms'
}

# Check if item is in dictionary
print 'cs530' in classes

# Add new item
classes['cs530'] = 'algorithms'
print classes['cs530']

# Print just the keys
print classes.keys()

# Print just the values
print classes.values()

# Print the items in the dictionary
print classes.items()

# Print dictionary pairs another way
for key, value in classes.items():
    print key, value
False
algorithms
['cs530', 'cs591', 'cs565']
['algorithms', 'data-mining tools', 'data-mining algorithms']
[('cs530', 'algorithms'), ('cs591', 'data-mining tools'), ('cs565', 'data-mining algorithms')]
cs530 algorithms
cs591 data-mining tools
cs565 data-mining algorithms
In [29]:
# Complex Data structures
# Dictionaries inside a dictionary!

professors = {
    "prof1": {
        "name": "Evimaria Terzi",
        "department": "Computer Science",
        "research interests": ["algorithms", "data mining", "machine learning",]
    },
    "prof2": {
        "name": "Chris Dellarocas",
        "department": "Management",
        "research interests": ["market analysis", "data mining", "computational education",],
    }
}

for prof in professors:
    print professors[prof]["name"]
Chris Dellarocas
Evimaria Terzi
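
Records in a nested dictionary do not always share the same keys, and indexing a missing key raises a KeyError. A minimal sketch, using hypothetical records, of how `dict.get` with a default value can guard against that:

```python
# Hypothetical records; the second one has no "office" entry
people = {
    "p1": {"name": "Ada", "office": "101"},
    "p2": {"name": "Grace"},
}

offices = {}
for key in people:
    # get() returns the default instead of raising KeyError
    offices[key] = people[key].get("office", "unknown")

print(offices["p1"])
print(offices["p2"])
```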

Iterators and Generators

We can loop over the elements of a list using a for loop.

In [30]:
for i in [1,2,3,4]:
    print i
1
2
3
4

When we use for with a dictionary, it loops over the keys of the dictionary (in no guaranteed order).

In [31]:
for k in {'evimaria': 'terzi', 'aris': 'anagnostopoulos'}:
    print k
aris
evimaria

When we use for with a string, it loops over the characters of the string.

In [32]:
for l in 'python is magic':
    print l
p
y
t
h
o
n
 
i
s
 
m
a
g
i
c

All of these are iterable objects.

In [33]:
list({'evimaria': 'terzi', 'aris': 'anagnostopoulos'})
Out[33]:
['aris', 'evimaria']
In [34]:
list('python is magic')
Out[34]:
['p', 'y', 't', 'h', 'o', 'n', ' ', 'i', 's', ' ', 'm', 'a', 'g', 'i', 'c']
In [35]:
print '-'.join('evimaria')
print '-'.join(['a','b','c'])
e-v-i-m-a-r-i-a
a-b-c

The function iter takes an iterable object as input and returns an iterator.

In [36]:
i = iter('magic')
print i
print i.next()
print i.next()
print i.next()
print i.next()
print i.next()
print i.next()
<iterator object at 0x1026f1690>
m
a
g
i
c
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-36-1ef58edcb098> in <module>()
      6 print i.next()
      7 print i.next()
----> 8 print i.next()

StopIteration: 
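
The StopIteration exception above is exactly what a for loop relies on to know when to stop. A rough sketch of what a for loop does behind the scenes, using the built-in `next()` function (available in Python 2.6+ and Python 3, whereas the `.next()` method is specific to Python 2):

```python
it = iter("magic")
letters = []

while True:
    try:
        letters.append(next(it))   # ask the iterator for its next item
    except StopIteration:
        break                      # the iterator is exhausted, so stop looping

print(letters)
```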

Many functions take iterators as inputs.

In [37]:
a = [x for x in range(10)]
print sum(iter(a))
45

Generators are functions that produce a sequence of results rather than a single value.

In [38]:
def func(n):
    for i in range(n):
        yield i
In [39]:
g = func(10)
print g
print g.next()
print g.next()
<generator object func at 0x1026caf00>
0
1
In [40]:
def demonstrate(n):
    print 'begin execution of the function'
    for i in range(n):
        print 'before yield'
        yield i*i
        print 'after yield'
In [41]:
g = demonstrate(5)
print g.next()
print g.next()
print g.next()
print g.next()
begin execution of the function
before yield
0
after yield
before yield
1
after yield
before yield
4
after yield
before yield
9
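
The output above shows that the body of a generator runs only when a value is requested. One consequence is that a generator can describe an endless sequence, as long as the caller takes only a finite part of it. A minimal sketch (the `fibonacci` function is an illustration, not part of the original notebook), using `itertools.islice` to take the first ten values:

```python
from itertools import islice

def fibonacci():
    """Yield Fibonacci numbers forever; each value is computed on demand."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite loop never runs away
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```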

Combining everything you learned about iterators and generators

In [42]:
g = (x for x in range(10))
print g
print sum(g)
y = [x for x in range(10)]
print y
<generator object <genexpr> at 0x1026caf50>
45
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Functions

In [43]:
def displayperson(name,age):
    print "My name is "+ name +" and I am "+age+" years old."
    return
    
displayperson("Bob","40")
My name is Bob and I am 40 years old.

Lambda functions

Python supports the creation of anonymous functions (i.e. functions that are not bound to a name) at runtime, using a construct called "lambda".

In [44]:
def f (x): return x**2
print f(8)
64
In [45]:
g = lambda x: x**2
print g(8)
64
In [46]:
f = lambda x, y : x + y
print f(2,3)
5

The first two pieces of code are equivalent to each other! Note that there is no "return" statement in the lambda function; the value of its expression is returned automatically. A lambda function does not need to be assigned to a variable; it can be used within the code wherever a function is expected.

In [47]:
def multiply (n): return lambda x: x*n
 
f = multiply(2)
g = multiply(6)
print f(10), g(10)
20 60
In [48]:
multiply(3)(30)
Out[48]:
90
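
As a rough illustration of using a lambda wherever a function is expected, the sketch below passes one as the `key` argument of the built-ins `sorted` and `max`. The course/enrollment pairs are made-up data, and print is called with parentheses so the snippet runs under both Python 2 and Python 3.

```python
# Hypothetical (course, enrollment) pairs
courses = [("cs591", 40), ("cs565", 75), ("cs530", 60)]

# Sort by the second element of each pair, without naming a helper function
by_size = sorted(courses, key=lambda pair: pair[1])

# The same key function picks out the largest course
largest = max(courses, key=lambda pair: pair[1])

print(by_size)
print(largest)
```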

The map() function

The advantage of the lambda operator can be seen when it is used in combination with the map() function. map() is a function with two arguments:

r = map(func,s)

func is a function and s is a sequence (e.g., a list). map returns a list containing the result of applying func to each element of s.

In [49]:
def dollar2euro(x):
    return 0.89*x
def euro2dollar(x):
    return 1.12*x

amounts= (100, 200, 300, 400)
euros = map(dollar2euro, amounts)
print euros
[89.0, 178.0, 267.0, 356.0]
In [50]:
amounts= (100, 200, 300, 400)
dollars = map(euro2dollar, amounts)
print dollars
[112.00000000000001, 224.00000000000003, 336.00000000000006, 448.00000000000006]
In [51]:
map(lambda x: 0.89*x, amounts)
Out[51]:
[89.0, 178.0, 267.0, 356.0]

map can also be applied to more than one list, as long as the function takes one argument per list and the lists have the same length.

In [52]:
a = [1,2,3,4,5]
b = [-1,-2,-3, -4, -5] 
c = [10, 20 , 30, 40, 50]

l1 = map(lambda x,y: x+y, a,b)
print l1
l2 = map (lambda x,y,z: x-y+z, a,b,c)
print l2
[0, 0, 0, 0, 0]
[12, 24, 36, 48, 60]

The filter() function

The function filter(function, list) keeps only the elements of a list for which the function function returns True; all other elements are filtered out.

In [53]:
nums = [i for i in range(100)]
print nums
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
In [54]:
even = filter(lambda x: x%2==0 and x!=0, nums)
print even
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]
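
Both map and filter can also be written as list comprehensions, which many Python programmers find easier to read. A minimal sketch comparing the two styles; the `list(...)` wrappers are an addition so that the snippet also runs under Python 3, where map and filter return lazy iterators instead of lists.

```python
amounts = (100, 200, 300, 400)
nums = range(100)

# map versus the equivalent list comprehension
doubled_map = list(map(lambda x: 2 * x, amounts))
doubled_comp = [2 * x for x in amounts]

# filter versus the equivalent list comprehension
even_filter = list(filter(lambda x: x % 2 == 0 and x != 0, nums))
even_comp = [x for x in nums if x % 2 == 0 and x != 0]

print(doubled_map == doubled_comp)
print(even_filter == even_comp)
```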

The reduce() function

The function reduce(function, list) applies the two-argument function function cumulatively to the elements of the list, from left to right. The output of reduce(function, list) is a single value. For example, if list = [a1,a2,a3,...,a10], then after the first step the computation is equivalent to reducing [function(a1,a2),a3,...,a10], and so on until a single value remains.

In [55]:
print reduce(lambda x,y: x+y, [x for x in range(10)])
45
In [56]:
print reduce (lambda x,y: x if x>y else y, [1, 15, 26, -27])
26
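
To make the step-by-step description concrete, here is a rough re-implementation of reduce as a plain loop. The name `my_reduce` is hypothetical, and the sketch ignores the optional initial value that the real reduce accepts and does not reproduce its error handling for empty input. The `functools` import is there because reduce is a built-in only in Python 2; in Python 3 it has to be imported.

```python
from functools import reduce  # needed on Python 3, harmless on Python 2.6+

def my_reduce(function, items):
    """Fold items from left to right into a single value."""
    iterator = iter(items)
    result = next(iterator)               # start with the first element
    for item in iterator:
        result = function(result, item)   # combine the running result with the next item
    return result

total = my_reduce(lambda x, y: x + y, range(10))
biggest = my_reduce(lambda x, y: x if x > y else y, [1, 15, 26, -27])

print(total)
print(biggest)
print(total == reduce(lambda x, y: x + y, range(10)))
```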

Libraries

Python is a high-level, open-source language, and the Python world is inhabited by many packages or libraries that provide useful things like array operations, plotting functions, and much more. We can (and we should) import libraries of functions to expand the capabilities of Python in our programs.

In [57]:
import random
myList = [2, 109, False, 10, "data", 482, "mining"]
random.choice(myList)
Out[57]:
109
In [58]:
from random import shuffle
x = [[i] for i in range(10)]
shuffle(x)
print x
[[2], [0], [6], [4], [8], [7], [1], [5], [9], [3]]
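
The outputs above will differ from run to run. When you need reproducible results, you can seed the random number generator first. A minimal sketch; the seed value 42 is arbitrary and chosen only for illustration.

```python
import random

random.seed(42)   # fix the generator's starting state
first = [random.randint(0, 9) for _ in range(5)]

random.seed(42)   # same seed, so the same sequence follows
second = [random.randint(0, 9) for _ in range(5)]

print(first == second)
```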

APIs

In [68]:
# Getting data from an API

import requests

width = '200'
height = '300'
#response = requests.get('http://placekitten.com/' + width + '/' + height)
response = requests.get('http://lorempixel.com/400/200/sports/1/')

print response

with open('image.jpg', 'wb') as f:
    f.write(response.content)
<Response [200]>
In [69]:
from IPython.display import Image
Image(filename="image.jpg")
Out[69]:
