This is a Jupyter Notebook for the Python tutorial class

This notebook will provide commands used within the tutorial. Every cell can contain free text (a.k.a. markdown, using a markup language à la Wikipedia) or code. By selecting a cell and either pressing the "play" symbol above or SHIFT+ENTER, the cell content is executed. Try the cell below!

In [1]:
print("Hello! This is a Python tutorial!")
Hello! This is a Python tutorial!

The print function prints things on screen. You can print multiple things by separating them with commas.

In [ ]:
print("Two plus two is four")
print(2, " plus " , 2, " is ", 2 + 2)

> Exercise 0

Create a new cell below this line using the menu above: Insert > Insert Cell below. Type some mathematical expression and execute the cell.

In order to store results of an operation for later usage, you can declare a variable. Note that = does not mean equality in mathematical terms, but assignment. n=n+1 means that n+1 will be first computed, then assigned to the variable n.

In [ ]:
result = (10.4 + 22)*13 + 44.0/72.0
print(result)

result = result + 1
print(result)

You can comment your code using #. Everything after # on a line is ignored by Python.

In [ ]:
# some mathematical equations will follow
x = 5 # defining value of x
y = 5*x**2 # this is a parabola
print(x, y)

> Exercise 1

Compute and print the euclidean (i.e. straight line) distance between two 2D points A=(Ax,Ay)=(0,0) and B=(Bx,By)=(1,1).

In [ ]:
ax = 0
ay = 0
bx = 1
by = 1
distance =   # here you calculate the euclidean distance
# here you can ask Python to print the result, stored in the variable called "distance"

TESTS

Tests verify whether a condition is satisfied or not. Their result is a Boolean (True or False). The if statement allows executing a piece of code only if a condition is satisfied. It can have an optional else part, indicating the commands to execute if the test result is False.

Enter a weekday into the cell below and run the code. Note that your weekday will be a string. So, the day name should be enclosed in single or double quotes, such as "banana" and '"how are things?" "all good!"'

In [ ]:
day =    #enter a day "Monday", ..."Friday", ..."Sunday"
if day == "Friday":
    print("Burrito?")
else:
    print("Sandwich?")

The if statement can have any number of elif tests (i.e. alternative conditions). Only one group of statements is executed — those controlled by the first test that passes. Assign a grade in the first line below, and execute the code.

In [ ]:
grade =    #Give a grade (A, B, C ... F) in " "
if grade == "A":
    print("Congratulations!")
elif grade == "B":
    print("That's pretty good.")
elif grade == "C":
    print("Well, it's passing, anyway.")
else:
    print("ok, this is embarrassing")
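
The "first test that passes" rule matters most when several conditions are true at the same time. The sketch below uses a hypothetical numeric score (not part of the original exercise) to illustrate it: the value satisfies all three tests, yet only the first matching branch runs.

```python
score = 95  # hypothetical value, chosen so that every test below is True

# Tests are evaluated from top to bottom; the first one that passes wins
# and all the remaining elif/else branches are skipped.
if score >= 90:
    message = "excellent"
elif score >= 70:
    message = "good"
elif score >= 50:
    message = "pass"
else:
    message = "fail"

print(message)
```

Reordering the tests (for instance, checking score >= 50 first) would change the result, so the most specific condition should usually come first.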

Indentation is required and must be consistent. Standard indentation is 4 spaces or one tab. Rule of thumb: does a line end with a colon? Indent the next lines!

In [ ]:
day =   #enter a day "Monday", ..."Friday", ..."Sunday"
print("It's Lunch time...")
if day == "Sunday":
    print("Yay, it's Sunday!")
    print("Pub lunch anyone?")
print("I'm hungry!")

> Exercise 2

Compute the euclidean distance between two 2D points, print a warning signal if the points have the same coordinates, print the distance otherwise.

In [ ]:
ax = 0
ay = 0
bx = 1
by = 1
# complete this code!

FUNCTIONS

A function is a section of code that, given some input parameters (a.k.a. arguments), performs some operations and returns an output. A function works like a black box: when calling it, the only thing you see is the output it returns, its internal mechanisms are hidden.

Python offers a list of built-in functions, i.e. functions that are available by default. For Python 2, see the list here: https://docs.python.org/2/library/functions.html. For Python 3, see the list here: https://docs.python.org/3/library/functions.html. Below are a couple of examples.

In [ ]:
largest = max(2, 4)
number = abs(-5)

Packages containing additional functions can be imported in your code. Some packages are available by default with Python, some are downloadable from third parties… you can also make your own packages!

In [ ]:
import sys  # import the sys package into the environment
# access its functions like this: sys.exit(0)

from time import *  # wildcard import
# All functions implemented in time are directly callable
# from within your code, e.g. sleep(1)

from math import cos  # import just a specific function
# access it directly in the code like this: cos(0.1)

You can create your own functions. This is especially useful when the same operation is required in multiple parts of your code.

In [ ]:
def some_math(x, y):
    constant = 2.5
    result = (abs(y)*abs(x))/constant
    return result #return a value to the caller

result = some_math(2., 3.)
print(result) #2.4

result2 = some_math(4., 7.)
print(result2)

A scope is the region of code from which a variable is accessible (visible). The region inside a function is called the local scope; the main code is called the global scope. Variables declared in a local scope are not accessible outside it. Names are looked up first in the local scope, then in the global scope (which also holds whatever you imported), then among Python's built-in names.

In the example below the variable result is in the local scope of the function my_function, and the variable number is in the global scope. What happens when you try to access number from within my_function? And when you try to access result from the global scope? Edit the code to find out!

In [ ]:
from math import sin, sqrt
def my_function(val):
    #print(number) #will this work?
    if val > 0:
        result = sqrt(val)
    else:
        result = sin(val)
    return result

number = 0.7
dist = my_function(number)
#print(result) #and this, will it work?

> Exercise 3

Define and call a function computing the euclidean distance between two 2D points. The function should take four parameters (x and y coordinates of the two points).

In [ ]:
# define a function called "distance", here

dist = distance(0, 0, 1, 1) #this is a call to your function
print(dist)

LISTS

Lists are sequences of elements enclosed in square brackets. You can refer to an individual value by following the list variable name with an index in square brackets. A subset of the list (a slice) can be obtained by providing an interval (first position included, last position excluded). WARNING: positions in a list start from 0!

In [ ]:
my_list = [3.14, 42.0, 101.0]
print(my_list[1])
print(my_list[0:2])

You can edit the value stored in any position in the list.

Lists can support heterogeneous data. Try to edit the cells above to set up a list containing heterogeneous data, e.g. [3.14, "hello", [101.0, 1]].

In [ ]:
my_list[1] = 1.44

The function len tells you the number of items in the list.

In [ ]:
len(my_list)

append allows adding elements to a list. Any item can be appended to a list, even another list.

In [ ]:
my_list.append(5.2)
print(my_list)
In [ ]:
my_list.append([5.2, 99.99])
print(my_list)

extend allows adding the elements of a list, one by one, to an existing list.

In [ ]:
my_list.extend([5.2, 99.99]) 
print(my_list)
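
The difference between append and extend is easiest to see side by side. The minimal sketch below uses two fresh lists (the names a and b are just for illustration) and applies each method with the same argument.

```python
a = [1, 2]
b = [1, 2]

a.append([3, 4])   # adds ONE new element, which is itself a list
b.extend([3, 4])   # adds each element of [3, 4] separately

print(a, len(a))
print(b, len(b))
```

After append the list has three items, the last one being a nested list; after extend it has four plain numbers.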

STRINGS

Strings can be indexed and sliced like lists.

In [ ]:
hi = "hello world!" 
print(hi[1])
print(hi[0:5])
print(len(hi))

However, strings are not lists. For instance, try to see what happens if you run the cell below

In [ ]:
hi[1] = 4
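
Since a string cannot be changed in place, a modified version has to be built as a new string. One possible way, sketched here with the slicing syntax introduced for lists, is to glue together the parts you want to keep and the new character.

```python
hi = "hello world!"

# Build a NEW string: first character + replacement + everything from position 2
changed = hi[0] + "4" + hi[2:]

print(changed)
print(hi)  # the original string is untouched
```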

Strings come with lots of useful methods helping their manipulation. Check the examples below.

In [ ]:
"hello world".split()
In [ ]:
" hello ".strip()
In [ ]:
"Hello WoRLd".lower()

SETS AND DICTIONARIES

Sets are unordered collections of unique elements, defined with curly braces. They cannot be indexed; items can only be added or removed. Only unique elements will be added!

In [ ]:
my_set = {3.14, 42.0, 101.0}
my_set.add(10.0)
my_set.remove(3.14)
print(my_set)
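
To see the uniqueness rule in action, the short sketch below (with a hypothetical set of letters) adds an element that is already present. The in keyword, which tests membership, is shown as a common way to query a set given that indexing is not available.

```python
letters = {"a", "b"}

letters.add("a")   # "a" is already there: the set does not change
letters.add("c")   # "c" is new: the set grows

print(len(letters))
print("a" in letters)   # membership test
print("z" in letters)
```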

Dictionaries store pairs of data, one being the key and the other the value. Keys are unique in the dictionary.

In [ ]:
numbers = {'one': 1, 'two': 2, 'three': 3 } 
print(numbers['two'])
numbers['four'] = 4 # a new key and value association
print(numbers)
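
Because keys are unique, assigning to a key that already exists replaces its value instead of creating a second entry. A minimal sketch with made-up data:

```python
ages = {"anna": 31, "bob": 25}   # hypothetical data

ages["bob"] = 26     # existing key: the old value is overwritten
ages["carl"] = 40    # new key: a new pair is added

print(ages)
print(len(ages))
print("anna" in ages)   # "in" checks whether a KEY is present
```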

TYPE CASTING

In some cases, Python can figure out what type data should be at declaration. The function type reports the type of a variable.

In [ ]:
a = 1
b = 3.14
c = a+b
print(type(a), type(b), type(c))

Dedicated functions help to explicitly convert objects from one type to another.

In [ ]:
print(float("3.14"))
print(int("42"))
print(str(42))

Complex data types have their own constructor functions as well. Analyze what happens with the list and set functions:

In [ ]:
print(list("42"))
print(list({'one': 1, 'two': 2}))
print(set({'one': 1, 'two': 2}))
print(set(["4", "2", "2"]))

The set function creates sets, which means that some information may be lost. This is however useful if you are trying to identify unique elements in a list!

In [ ]:
a_list = [1, 2, 4, 5, 1, 3, 1, 4, 1, 5, 3, 2, 1]
a_set = set(a_list)
a_list_unique = list(a_set)

print(a_list)
print(a_list_unique)

MUTABLE AND IMMUTABLE OBJECTS

Assigning one variable to another (y = x) does not copy the data: it makes the new name point to the same position in memory where the data is stored.

Immutable objects: the value stored in a memory position cannot be modified. A modification actually saves the result in a new memory position. This applies to basic types (int, float, bool, str, …).

In [ ]:
x = 1
y = x
x = 2
print(x, y)

Mutable objects: the content of the memory position can be modified. Effect: changing that memory position affects all variables pointing to it. This applies to complex data structures (e.g. lists, dictionaries, sets, …).

In [ ]:
x = [1,1]
y = x
x[0] = 2
print(x, y)

Copying the content of a mutable object into a new memory position must be done explicitly. For lists, this can be done by extracting the values stored in it.

In [ ]:
x = [1, 1]
y = x[:] # shortcut for x[0:len(x)]
y[1] = 2
print(x, y)

The deepcopy function, within the copy module, allows copying the values of any data structure into a new memory position.

In [ ]:
from copy import deepcopy
y = deepcopy(x)
x[0] = 2
print(x, y)
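
The need for deepcopy becomes visible with nested structures. In the sketch below (the variable names are illustrative), a slice copy creates a new outer list but still shares the inner lists, whereas deepcopy duplicates everything.

```python
from copy import deepcopy

nested = [[1, 1], [2, 2]]

shallow = nested[:]        # new outer list, but the inner lists are shared
deep = deepcopy(nested)    # inner lists are copied as well

nested[0][0] = 99          # modify an INNER list through the original name

print(shallow[0][0])       # affected: shares the inner list with nested
print(deep[0][0])          # unaffected: fully independent copy
```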

> Exercise 4

Define a function computing the euclidean distance between two 2D points. Represent points as lists containing 2 elements (i.e. x and y coordinate).

In [ ]:
# write here a function that measures the euclidean distance between two points

#these two lists contain each the coordinate of a 2D point
p1 = [0, 0]
p2 = [1, 1]

# call your function here, to measure the distance between p1 and p2

LOOPS

Loops allow repeating a group of commands multiple times (iterating). There are several ways of looping.

A while loop performs the same statements over and over until some test becomes False.

In [ ]:
p = [42.0, 10.1, 3.14]
n = 0
while n < 3:
    print(n, p[n])
    n = n + 1
#Warning! If n is not incremented, this loop will never end! Infinite loops are usually bad.

A for loop performs the same statements for each value in a list

In [ ]:
for n in [1, 2, 3]:
    print("This is the number", n)
     

range is an iterable. It creates a sequence of integers, from the first number up to but not including the second number. As an optional parameter, the step size can be defined (default step is 1). range does not store numbers in memory, it is just the rule that generates them: “start from 0, stop at 99999 in steps of 1”.

In [ ]:
range(0, 12, 3) # produce numbers from 0 to 12 excluded, in steps of 3

To get all numbers defined by range, use type casting:

In [ ]:
list(range(0, 4))
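
A few more range examples may help to make the rules concrete. This is only an illustrative sketch: the negative step is not used elsewhere in this notebook, but it follows the same start, stop, step logic.

```python
steps = list(range(0, 12, 3))        # 12 itself is excluded
countdown = list(range(5, 0, -1))    # a negative step counts down; 0 is excluded

big = range(0, 100000)               # no numbers are stored, only the rule
print(steps)
print(countdown)
print(len(big), big[-1])             # length and last value are still available
```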

range is typically used in a for loop:

In [ ]:
for i in range(10):
    print(i**2)

> Exercise 5

Define and call a function computing the euclidean distance between two n-dimensional points. Hint: you will need a loop!

In [ ]:
# define here a function that calculates the euclidean distance between n-dimensional points

# these are two 3D points
p1 = [0, 0, 0]
p2 = [1, 1, 1]

# calculate their euclidean distance calling your function, here!

READING AND WRITING FILES

The format of paths in Unix-based systems and Windows differs.

  • for Mac/Linux filename = "/home/Matteo/newfolder"
  • for Windows filename = "C:\Users\Matteo\newfolder"

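In Windows paths, the backslash can cause problems because Python treats it as an escape character inside ordinary strings: \n, for instance, means "new line", and in Python 3 the sequence \U starts a Unicode escape, so the first cell below raises a SyntaxError. Doubling each backslash lets you write Windows-style paths: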

In [ ]:
filename1 = "C:\Users\Matteo\newfolder"
print(filename1)
In [ ]:
filename2 = "C:\\Users\\Matteo\\newfolder"
print(filename2)
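
Another option, not used in the cells above, is a raw string: prefixing the string with r tells Python to keep backslashes exactly as typed. A minimal sketch, reusing the same example path:

```python
escaped = "C:\\Users\\Matteo\\newfolder"   # every backslash doubled
raw = r"C:\Users\Matteo\newfolder"         # raw string: no escape processing

print(raw)
print(escaped == raw)   # both spellings describe the same text
```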

The os package offers operating system-dependent functionalities. For instance, it can help check whether a file exists:

In [3]:
import os as O

if not O.path.isfile("absentfile.dat"):
    print("file not found!")
else:
    print("file found!")  
    
if not O.path.isfile("data/myfile.dat"):
    print("file not found!")
else:
    print("file found!")
file not found!
file found!

Other useful functions provided by the os package:

In [ ]:
dirname =    #insert the name of a directory
O.path.isdir(dirname) #check if folder exists
O.chdir(dirname) #change working directory
O.getcwd() #get current working directory
O.path.abspath(dirname) #get absolute path

Below are two examples of how to read a file line by line. The open function creates a file handle, which provides methods to manipulate a file. The handle can be opened in read (second parameter equal to "r"), write ("w") or append ("a") mode. Remember to close the file with the close() method once you have finished manipulating it.

In [4]:
fin = open("data/myfile.dat", "r")

line = fin.readline() #read the first line (a string)
while line: #an empty string means end of file
    print(line)
    line = fin.readline()
fin.close() #close the file (destroy the handle)
1.0	2.6

2.1	3.9

3.2	5.1

4.6	6.7

5.2	7.1

6.3	8.7

7.5	9.9

8.1	10.4

9.3	11.1
In [5]:
#There Is More Than One Way To Do It (TIMTOWTDI)
fin = open("data/myfile.dat", "r")
for line in fin:
    print(line)
fin.close()
1.0	2.6

2.1	3.9

3.2	5.1

4.6	6.7

5.2	7.1

6.3	8.7

7.5	9.9

8.1	10.4

9.3	11.1
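
Two details of the cells above are worth a sketch. First, the empty lines in the output appear because every line read from the file still ends with "\n", and print adds a newline of its own. Second, Python's with statement closes the file automatically, which is a common alternative to calling close() by hand. The example below is self-contained: it writes a small throwaway file in a temporary folder (a hypothetical path, not the tutorial's data/myfile.dat) and reads it back.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere
path = os.path.join(tempfile.mkdtemp(), "example.dat")
with open(path, "w") as fout:
    fout.write("1.0\t2.6\n2.1\t3.9\n")

lines = []
with open(path, "r") as fin:            # the file is closed when this block ends
    for line in fin:
        lines.append(line.rstrip("\n")) # drop the trailing newline

print(lines)
print(fin.closed)   # the handle was closed automatically
```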

To save what you read for later usage, create an empty list and append the data to it. Note that, unless type cast, numbers are read as strings.

In [6]:
fin = open("data/myfile.dat", "r")
data = [] #init an empty list
for line in fin:
    columns = line.split()
    #columns is a list of strings e.g. ["1.0", "2.6"]
    data.append(columns)
fin.close()
# data is a list of lists, i.e. [["1.0","2.6"],["2.1","3.9"],...]
# Try to print it!
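
To do arithmetic on the values, each string has to be converted with float. The sketch below shows one way to do it; to stay self-contained it loops over a short list of strings standing in for the lines of the file (values copied from the output shown earlier) instead of opening data/myfile.dat.

```python
# Stand-in for the lines of a file (tab-separated, ending with a newline)
raw_lines = ["1.0\t2.6\n", "2.1\t3.9\n", "3.2\t5.1\n"]

data = []  # will become a list of lists of floats
for line in raw_lines:
    columns = line.split()                        # e.g. ["1.0", "2.6"]
    row = [float(columns[0]), float(columns[1])]  # strings -> numbers
    data.append(row)

print(data)
print(type(data[0][0]))
```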

> Exercise 6

Open the file "1PRE.pdb". Parse it, and print all the lines containing atoms part of a tryptophan, residue name "TRP". Advanced: append lines of interest in a list, and print the list after file parsing is complete.

In [ ]:
# open the file in read mode

# read the file line-by-line

# for each line, check whether the residue "TRP" is present

# close the file handle

If a file is opened in write mode ("w"), strings can be written to it. A new line must be stated explicitly with \n. Note that opening an existing file in "w" mode erases its previous content.

In [7]:
#enter the path where to save the file
fout = open("myfile2.dat", "w")
fout.write("hello\n")
fout.close()

Data can be formatted into a string using format specifiers such as %s (generic string), %5.2f (floating point, at least 5 characters wide, 2 of which are digits after the decimal point), %-8.2f (floating point with left alignment), …

In [ ]:
a = 12.345
print("%s   %4.2f   %3.1f"%(a, a, a))
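
The effect of the width and of the minus sign is easier to see when the fields are delimited. In this small sketch the vertical bars and the two sample values are only there for illustration: each number is placed in a field 8 characters wide, first right-aligned (the default), then left-aligned.

```python
a = 3.14159
b = 42.5

right = "|%8.2f|%8.2f|" % (a, b)     # right-aligned in 8 characters
left = "|%-8.2f|%-8.2f|" % (a, b)    # left-aligned in 8 characters

print(right)
print(left)
```

Using the same width on every row is what keeps columns aligned when several lines are written one after the other.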

> Exercise 7

Parse "1PRE.pdb" and write to an output file the x, y and z coordinates of all atoms belonging to a tryptophan. Advanced: make sure that columns are nicely aligned.

In [ ]:
# upgrade the code from exercise 6 here!