A programming language is a language we use to communicate with a machine and ask it to solve a specific task. We specify an algorithm that solves the task in a particular way, and the machine executes it.
Programming languages differ in several ways. Broadly, they fall into two categories: (1) compiled and (2) interpreted programming languages.
Interpreted and compiled programming languages
Earlier, we thought about an algorithm for getting to our institute or university. That example showed that the more specific you are, the less the machine needs to "infer": it does not have to work as hard to understand what you are asking it to do. So, the more specific you are, the faster the algorithm executes.
This is related to the level of abstraction. Binary code consists of 0s and 1s. That is how the machine operates on data and performs computations, but we cannot practically write programs directly in binary code.
One level of abstraction up, we find machine code and assembly languages, which are still too cumbersome for humans to use directly for data analysis.
Then come the middle-level languages, such as C/C++ and Fortran, in which tasks run very fast. On top of them are the high-level languages, including scripting languages, such as MATLAB, Java, and Python.
Compared to high-level or scripting languages such as MATLAB and Python, C/C++ is typically much faster.
That's because C gives you access to lower-level operations (e.g., memory allocation), so you can be more specific about what the machine should do and how to do it. However, you often have to write more code in C/C++ than in Python, which takes longer. In addition, C/C++ code is compiled, while languages such as Python are interpreted.
Now, what are the differences between them?
Some of them are:
(Image source: memgraph.com)
A notebook document can contain computer code (e.g., Python scripts) and text elements (e.g., paragraphs, equations, figures, links). Thus, these documents contain analysis descriptions and results, as well as executable cells that can be run to perform data analysis.
Here, we can directly "talk" to the Python interpreter. You can run the code of a cell by selecting it and pressing "shift + enter". Try it out!
# function 'print(arg)': it writes the argument 'arg' to the user's console
print("Hallo Welt!")
Hallo Welt!
We can think of a variable as a name or label that we use to refer to a particular object. Such an object can be a value stored in memory (e.g., n = 15).
We can store values in variables without specifying their types. That's because Python is a dynamically typed language: when we assign a value to a variable, the Python interpreter determines its type from the value.
For example, if we do this:
a = 2 # int (integer)
b = 7.1 # float (float/decimal)
c = True # bool (boolean)
d = "Hallo!" # str (string, list of characters)
print(type(a), a)
print(type(b), b)
print(type(c), c)
print(type(d), d)
<class 'int'> 2
<class 'float'> 7.1
<class 'bool'> True
<class 'str'> Hallo!
With type() we can see the type of each variable.
Note that we did not explicitly specify the data type of each variable. The Python interpreter infers the data type from the value you store.
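To make dynamic typing more concrete, the following minimal sketch rebinds one name to values of two different types. The variable names (value, first_type, second_type) are only illustrative and are not part of the course material.

```python
# The same name can refer to objects of different types over time.
value = 15                          # `value` refers to an int
first_type = type(value).__name__

value = "fifteen"                   # now the same name refers to a str
second_type = type(value).__name__

print(first_type, second_type)
```

The type belongs to the object, not to the name, which is why no declaration is needed.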
You can also convert a value to a different type. The conversion returns a new value; the variable itself keeps its type unless you reassign it:
str(a) # it converts the value of `a` into a string.
'2'
int(b) # from float (decimal) to integer
7
Notice that b was 7.1. Here we are forcing the conversion of a decimal value into an integer, so the interpreter truncates the original value (the fractional part is dropped).
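Truncation is not the same as rounding. As a small sketch (the variable names are hypothetical), the snippet below compares int() with the built-in round() and with math.floor() for a positive and a negative value.

```python
import math

b = 7.9
truncated = int(b)               # drops the fractional part
rounded = round(b)               # rounds to the nearest integer
negative_truncated = int(-7.9)   # truncation moves towards zero
negative_floored = math.floor(-7.9)  # floor moves towards minus infinity

print(truncated, rounded, negative_truncated, negative_floored)
```

For positive numbers int() and math.floor() agree; they differ only for negative values.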
Imagine that you want to store 100 values of a specific variable (e.g., participant_age, neuron_type). It is not convenient to create 100 variables to store each value.
In such cases, we use Python lists (comparable to vectors or arrays in other languages).
For example:
# list of integers
my_list_int = [3,1,2,5,6]
# list of strings
my_list_strings = ["red", "green", "blue", "yellow", "black"]
# list with different types of elements
my_list_elements = ["3",1.5,2,True]
Note that a list can store different types of elements.
(Image source: LearnByExample.org)
To access an element of a list, we write the index of that element in square brackets. Indices start at 0.
For example:
print(my_list_strings[0])
print(my_list_strings[2])
red
blue
Let's consider the following list of integers:
list_of_integers = [1,5,2,9,4,8,6]
print(list_of_integers)
[1, 5, 2, 9, 4, 8, 6]
We can get the length of a list (i.e., the number of elements it has) with len(my_list):
len(list_of_integers)
7
We can access each item by indicating its index:
list_of_integers[3]
9
list_of_integers[-1] # -1 refers to the last element!
6
We can also access a specific sub-list of a list (slicing).
If we write our_list[:a], it returns every element of our_list up to, but not including, index a.
list_of_integers[:3]
[1, 5, 2]
list_of_integers[-3:]
[4, 8, 6]
We can also extract a sub-list from the middle of a list:
list_of_integers[1:4]
[5, 2, 9]
We can reverse a Python list as follows:
list_of_integers[::-1]
[6, 8, 4, 9, 2, 5, 1]
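All of the slices above are special cases of the general form our_list[start:stop:step]. The sketch below uses a hypothetical list called numbers to show a step of 2 and to confirm that slicing returns a new list rather than modifying the original.

```python
numbers = [1, 5, 2, 9, 4, 8, 6]

every_second = numbers[::2]     # indices 0, 2, 4, 6
middle_step = numbers[1:6:2]    # indices 1, 3, 5
reversed_copy = numbers[::-1]   # a reversed copy

print(every_second)
print(middle_step)
print(numbers)                  # the original list is unchanged
```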
To add an element to the end of a list, use append() as follows:
list_of_integers.append(10)
print(list_of_integers)
[1, 5, 2, 9, 4, 8, 6, 10]
You can remove an element from a list by passing its value to remove().
For example, to remove the element 9 we do:
list_of_integers.remove(9)
print(list_of_integers)
[1, 5, 2, 4, 8, 6, 10]
list_of_integers = [5,1,6,1,2,3]
list_of_integers.remove(1)
print(list_of_integers)
[5, 6, 1, 2, 3]
Note that remove(item) removes only the first matching item it finds (searching from left to right), not all of them.
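If you need to drop every occurrence of a value, one common approach is to build a new list with a list comprehension. The sketch below also shows that remove() raises a ValueError when the item is absent. The names values, without_ones and removed are hypothetical.

```python
values = [5, 1, 6, 1, 2, 3]

# Keep every element that is different from 1.
without_ones = [v for v in values if v != 1]
print(without_ones)

# remove() raises ValueError if the item is not in the list.
try:
    values.remove(42)
    removed = True
except ValueError:
    removed = False
print(removed)
```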
Lists allow us to store different types of values such as integers, floats, strings... and also lists! In this case, we call them nested lists.
(Image source: LearnByExample.org)
Let's check the implementation of such an example below:
L = ['a', 'b', ['cc', 'dd', ['eee', 'fff']], 'g', 'h']
print(L[2])
# Prints ['cc', 'dd', ['eee', 'fff']]
print(L[2][2])
# Prints ['eee', 'fff']
print(L[2][2][0])
# Prints eee
['cc', 'dd', ['eee', 'fff']]
['eee', 'fff']
eee
We can also use this property to represent a matrix:
A = [[1,2,3],[1,0,5],[2,0,0]]
B = [[1,2,3],[1,0,5],[2,0,0]]
C = [[0,0,0],[0,0,0],[0,0,0]]
print("A\n", A)
print("B\n", B)
print("C\n", C)
A
 [[1, 2, 3], [1, 0, 5], [2, 0, 0]]
B
 [[1, 2, 3], [1, 0, 5], [2, 0, 0]]
C
 [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
In this case, a matrix is represented as a list of lists.
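When a matrix is stored as a list of lists, element (i, j) is reached with two indices, and each row should be its own list. The following sketch (with hypothetical names n_rows, n_cols, zeros and aliased) builds a zero matrix with a list comprehension and contrasts it with the tempting but problematic [[0] * n] * m form, where every row is the same list object.

```python
n_rows, n_cols = 3, 3

# Each row is created separately, so rows are independent.
zeros = [[0 for _ in range(n_cols)] for _ in range(n_rows)]
zeros[0][1] = 7          # row 0, column 1
print(zeros)

# Pitfall: the outer multiplication repeats the SAME inner list.
aliased = [[0] * n_cols] * n_rows
aliased[0][1] = 7        # changes "all" rows, since they are one object
print(aliased)
```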
Python dictionaries (dict()) allow us to store information and index it with keys.
They are collections of (key, value) pairs. Since Python 3.7, dictionaries preserve the order in which items were inserted, but items are accessed by key rather than by position.
Just like lists, the values of dictionaries can hold data of different types (i.e., integers, floats, strings, lists, dictionaries, etc).
For example:
my_dict = {
"firstname" : "Ash",
"lastname" : "Ketchum",
"residence" : "Palette Town",
"PLZ" : 101,
"region" : "Kanto",
"Team" : [
"Charizard", "Squirtle", "Butterfly", "Pidgeotto", "Bulbasaur","Pikachu"
]}
print(my_dict)
{'firstname': 'Ash', 'lastname': 'Ketchum', 'residence': 'Palette Town', 'PLZ': 101, 'region': 'Kanto', 'Team': ['Charizard', 'Squirtle', 'Butterfly', 'Pidgeotto', 'Bulbasaur', 'Pikachu']}
print(my_dict['firstname'], my_dict['lastname'], "from", my_dict["residence"])
Ash Ketchum from Palette Town
print(my_dict["Team"][0]) # First element of "Team" list
Charizard
Python dictionaries help us to store our data in complex data structures.
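A few everyday dictionary operations can be sketched as follows: adding a key, reading a key that may be missing with get(), and looping over the (key, value) pairs with items(). The dictionary trainer and its contents are hypothetical and only loosely follow the example above.

```python
trainer = {"firstname": "Ash", "region": "Kanto"}

trainer["badges"] = 8                           # add a new key
nickname = trainer.get("nickname", "unknown")   # default if key is missing

lines = []
for key, value in trainer.items():              # iterate over the pairs
    lines.append(f"{key}: {value}")

print(nickname)
print(lines)
```

Using get() avoids the KeyError that trainer["nickname"] would raise for a missing key.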
To control the execution flow of your algorithm, Python provides if (condition), elif (condition), and else.
if (condition):
    # your statements if `condition` is True
else:
    # your statements if `condition` is False
If one condition is not enough, we can nest a second one:
if (condition1):
    # your statements if `condition1` is True
else:
    # your statements if `condition1` is False
    if (condition2):
        # your statements if `condition2` is True
    else:
        # your statements if `condition2` is False
or equivalently:
if (condition1):
    # your statements if `condition1` is True
elif (condition2):
    # your statements if `condition1` is False but `condition2` is True
else:
    # your statements if both `condition1` and `condition2` are False
x = -10
if (x > 0):
    print("Positive")
elif (x < 0):
    print("Negative")
else:
    print("Zero")
Negative
Repetition statements (for and while)
We can access each element of a list with for and while.
for allows us to iterate over each element of a list or range.
For example:
N = 5
for i in range(N): # it goes from 0 to N-1
    print(i)
0
1
2
3
4
On the other hand, while also loops, but for as long as a given condition holds.
while (conditional_statement):
    # actions
    # ...
While the condition is True, all the inner statements will be executed.
Note: Make sure that the condition will turn False at some point. Otherwise, the while loop will never end.
N = 5
i = 0
while (i < N): # it goes from 0 to N-1
    print(i)
    i = i + 1
0
1
2
3
4
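The loops above iterate over a range of numbers. As a complementary sketch, the snippet below iterates over the elements of a list directly, with enumerate() to obtain index and element together, and with an index-driven while loop. The list colors and the result lists are hypothetical names chosen for illustration.

```python
colors = ["red", "green", "blue"]

# 1) for: visit each element directly
upper_colors = []
for color in colors:
    upper_colors.append(color.upper())

# 2) for + enumerate: get the index and the element at the same time
labelled = []
for index, color in enumerate(colors):
    labelled.append(f"{index}: {color}")

# 3) while: advance an index until the end of the list is reached
collected = []
i = 0
while i < len(colors):
    collected.append(colors[i])
    i = i + 1

print(upper_colors)
print(labelled)
print(collected)
```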
Compute z as the squared difference between the corresponding elements of lists x and y, such that $z_i = (x_i - y_i)^2$. Assume that the input lists x and y have the same length. Finally, print each value of z and its index in the list.
Use a repetition statement to access each element of the lists.
Example input:
x = [1,2,3,4,5]
y = [2,2,2,2,2]
Expected output:
Index 0 , value: 1
Index 1 , value: 0
Index 2 , value: 1
Index 3 , value: 4
Index 4 , value: 9
Solution
# Here your code!
# 1. Be sure you can generate an output as the expected one indicated above.
# To compute the value of a number `x` to the power of two, you do: `x**2`.
Bonus:
Compute the matrix multiplication of A = [[2,5,2],[1,0,-2],[3,1,1]] and B = [[-2,1,0],[-2,2,1],[0,0,3]] and save the result in C. Then, print each row of the matrix C.
Example input:
A = [[2,5,2],[1,0,-2],[3,1,1]]
B = [[-2,1,0],[-2,2,1],[0,0,3]]
Expected output:
[-14, 12, 11]
[-2, 1, -6]
[-8, 5, 4]
Solution
A = [[2,5,2],[1,0,-2],[3,1,1]]
B = [[-2,1,0],[-2,2,1],[0,0,3]]
# Here your code!
# Suggestions: Use `for` instead of the `while` statement
# Start by defining a zero-matrix
# Notice you have to access each element of each row of `A`
# then, you have to access each element of each column of `B`
# Finally, you have to access each element of the new matrix (`C`)
(Image source: LearnByExample)
def hallo():
    print("Hallo Welt!")
hallo()
Hallo Welt!
A function can receive input parameters (arguments) to perform computations. In that case, list those parameters, separated by commas, inside the parentheses of the definition.
For example:
def hallo(name, lastname):
    print("Hallo,", name, lastname, "!")
hallo(name='Eren', lastname='Jäger')
Hallo, Eren Jäger !
If you want to allow any number of positional arguments, you can define a function as follows:
def print_args(*args):
    print(args)
print_args(1, 1, 3, 5, 8)
(1, 1, 3, 5, 8)
Additionally, you can do the same for keyword arguments:
def print_keyword_args(**kwargs):
    print(kwargs)
print_keyword_args(name="Zeke", lastname="Jäger")
{'name': 'Zeke', 'lastname': 'Jäger'}
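Inside the function, *args arrives as a tuple and **kwargs as a dictionary, so both can be looped over like any other collection. The sketch below uses two hypothetical functions, total and describe, and also shows how a list can be unpacked into separate arguments with * at the call site.

```python
def total(*numbers):
    """Sum any number of positional arguments."""
    result = 0
    for number in numbers:       # `numbers` is a tuple
        result += number
    return result

def describe(**fields):
    """Format any number of keyword arguments as 'key=value' pairs."""
    return ", ".join(f"{key}={value}" for key, value in fields.items())

values = [1, 1, 3, 5, 8]
sum_direct = total(1, 1, 3, 5, 8)
sum_unpacked = total(*values)    # unpack the list into five arguments
description = describe(name="Zeke", lastname="Jäger")

print(sum_direct, sum_unpacked)
print(description)
```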
A function can also return values by using the return statement:
# It returns the sum of two values
def my_sum(a, b):
    return a + b
x = my_sum(2, 8)
print(x)
10
Write a function that receives two inputs: (1) v, a list of numbers, and (2) a, a scalar. The function should return a list y such that y = v*a (i.e., every element of v multiplied by a).
Example input:
v = [1,2,3,4,9,8,7,6,5]
a = 10
Expected output:
[10, 20, 30, 40, 90, 80, 70, 60, 50]
Solution
v = [1,2,3,4,9,8,7,6,5]
a = 10
# your solution here
(Bonus #1)
Write a function that returns the distance, as an absolute value, between every pair of components of two lists x and y. Use abs(). Call the function and print the result, one row per line.
Example input:
x=[1,2,3,4,5]; y=[1,2,3,4,5]
Expected output:
[0, 1, 2, 3, 4]
[1, 0, 1, 2, 3]
[2, 1, 0, 1, 2]
[3, 2, 1, 0, 1]
[4, 3, 2, 1, 0]
Solution Bonus #1
x=[1,2,3,4,5]
y=[1,2,3,4,5]
# your solution here
(Bonus #2)
Write a function that computes the matrix product of two matrices A and B (each one represented as a list of lists). Print each row of the resulting matrix separately.
Example input:
A = [[1,2,3],[2,0,5],[8,0,0]]
B = [[1,2,3],[1,2,3],[1,2,3]]
Expected output:
[6, 12, 18]
[7, 14, 21]
[8, 16, 24]
Solution Bonus #2
A = [[1,2,3],[2,0,5],[8,0,0]]
B = [[1,2,3],[1,2,3],[1,2,3]]
# your solution here
A decorator is a function that allows us to extend the behavior of another function, which it receives as input, without explicitly changing it.
Let's consider the following function, which sums two numbers a and b:
def add_together(a, b):
    return a + b
print(add_together(2,3))
5
Let's try to extend its behavior so that it operates on a list of pairs (a, b).
def decorator_list(func):
    def inner(list_of_tuples):
        l_result = []
        for val in list_of_tuples:
            # Note that `func` is available in this scope
            l_result.append(func(val[0], val[1]))
        return l_result
    return inner
@decorator_list
def add_together(a, b):
    return a + b
print(add_together([(1,2),(2,3),(3,4),(4,5),(5,6),(6,7)]))
[3, 5, 7, 9, 11, 13]
As you saw, we did not need to change the body of our first function, add_together. Instead, we just added @decorator_name (here, @decorator_list) before its definition.
So, decorators are an elegant way to extend the functionality of your functions without modifying them!
(Spoiler) We will look at decorators in more detail tomorrow. They will help us speed up our Python code 🔥!
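As a further illustration of the same pattern, here is a minimal sketch of a decorator that reports how long a call takes. The names timed and slow_sum are hypothetical and are not part of the course material; functools.wraps and time.perf_counter come from the Python standard library.

```python
import functools
import time

def timed(func):
    """Hypothetical decorator: print how long each call to `func` takes."""
    @functools.wraps(func)            # keep the name and docstring of `func`
    def wrapper(*args, **kwargs):
        t_start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - t_start
        print(f"{func.__name__} took {elapsed:.6f} seconds")
        return result                 # pass the original result through
    return wrapper

@timed
def slow_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

answer = slow_sum(1000)
print(answer)
```

Accepting *args and **kwargs in the wrapper lets the same decorator be applied to functions with any signature.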
Given the function divide as follows:
def divide(a, b):
    print(a/b)
Create a decorator called smart_divide(func) that receives as input the function divide(a, b). The decorated function should first print "I am going to divide" followed by a and b. Then it should execute divide only if b is not 0. Otherwise, it should print "Whoops! cannot divide" and return None.
def smart_divide(func):
    # your code here
@smart_divide
def divide(a, b):
    print(a/b)
divide(a=7, b=2)
I am going to divide 7 and 2
3.5
divide(a=7, b=0)
I am going to divide 7 and 0
Whoops! cannot divide
A distance measure between two vectors as lists
Create a function called vector_distance that receives three input parameters:
x: a list of integers.
y: a list of integers.
f: a function that operates on two input parameters, a and b.
Then, vector_distance should return the mean distance obtained by applying the function f to each pair of elements (a, b).
Note: assume that x and y have the same length (i.e., number of elements).
x = [1,2,4,8]
y = [1,3,9,27]
def abs_distance(a, b):
    return abs(a-b)
def sq_distance(a, b):
    return (a-b)**2
def inter_distance(a, b):
    _min = min(a,b)
    _max = max(a,b)
    return (_max / (_min + _max))
def vector_distance(x, y, f):
    # your code here!
    print("Not implemented yet!")
Try your implementation:
print("abs_distance :", vector_distance(x, y, abs_distance))
print("sq_distance :", vector_distance(x, y, sq_distance))
print("inter_distance:", vector_distance(x, y, inter_distance))
Not implemented yet!
abs_distance : None
Not implemented yet!
sq_distance : None
Not implemented yet!
inter_distance: None
Bonus: A distance measure between two matrices as nested lists.
Create a distance function called matrix_distance that, instead of computing a distance f(a,b), receives as inputs two matrices X and Y as nested lists.
First, create a function called matmul to multiply two matrices X and Y (nested lists).
Then, modify the function matmul to apply the function vector_distance to a row and a column, instead of multiplying and summing their elements.
A Python module is a reusable chunk of code that you may want to include in your programs or projects.
Unlike in languages such as C/C++, the term "library" has no specific technical meaning in Python.
It is commonly used to refer to a collection of modules.
import math
math.cos(math.pi) # -1.0
-1.0
math.sin(math.pi) # approximately 0 (not exactly, due to floating-point rounding)
1.2246467991473532e-16
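Because floating-point results such as the one above are only approximately equal to the mathematical value, comparing them with == is fragile. A small sketch using math.isclose from the standard library (the variable names are hypothetical):

```python
import math

value = math.sin(math.pi)

is_exactly_zero = (value == 0.0)                    # exact comparison
is_close_to_zero = math.isclose(value, 0.0, abs_tol=1e-9)

print(is_exactly_zero, is_close_to_zero)
```

An absolute tolerance (abs_tol) is needed when comparing against zero, since the default relative tolerance scales with the magnitude of the values.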
# round a float up or down
print( math.ceil(3.3))
print( math.floor(3.3))
4
3
import random
print("A random number in [0, 1):")
print(random.random())
A random number in [0, 1):
0.5824475071608946
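The call above returns a different number on every run. If you need reproducible results, for example to compare two versions of an analysis, you can fix the seed of the generator with random.seed(). The sketch below uses an arbitrary seed value (42) and hypothetical variable names.

```python
import random

random.seed(42)                                   # fix the generator state
first_run = [random.random() for _ in range(3)]

random.seed(42)                                   # same seed, same sequence
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)
```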
from time import time
t_start = time()
n_numbers = 10_000
for i in range(n_numbers):
    temp = random.random()
t_end = time()
total_time = t_end - t_start
print("Total time", total_time, "Seconds")
Total time 0.0014522075653076172 Seconds
Computation times usually fluctuate a lot. To get a better estimate, we compute the mean (and standard deviation) of a list of measured times.
Let's see this point with an example:
l_times = []
# define a function to sum the elements of a list
def my_sum(my_list):
    result = 0
    for i in range(len(my_list)):
        result = result + my_list[i]
    return result
# generating random numbers
random_numbers = []
for i in range(n_numbers):
    temp = random.random()
    random_numbers.append(temp)
# measuring time
for i in range(10):
    t_start = time()
    my_sum(random_numbers)
    total_time = time() - t_start
    l_times.append(total_time)
l_times
[0.002279996871948242, 0.009655237197875977, 0.0012543201446533203, 0.0012154579162597656, 0.011293411254882812, 0.0012998580932617188, 0.003256082534790039, 0.0012285709381103516, 0.0011463165283203125, 0.0012488365173339844]
print("mean", sum(l_times)/len(l_times))
mean 0.0033878087997436523
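The standard deviation mentioned above can be obtained with the statistics module from the standard library. The following is only a sketch under some assumptions: measure is a hypothetical helper, the workload is an arbitrary sum over a range, and time.perf_counter is used in place of time() because it is a higher-resolution clock intended for measuring short durations.

```python
import statistics
import time

def measure(func, n_repeats=10):
    """Hypothetical helper: time `func` several times, return the durations."""
    durations = []
    for _ in range(n_repeats):
        t_start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - t_start)
    return durations

durations = measure(lambda: sum(range(10_000)))
mean_time = statistics.mean(durations)
std_time = statistics.stdev(durations)   # sample standard deviation

print("mean", mean_time, "std", std_time)
```

Reporting both numbers makes it easier to judge whether a difference between two implementations is larger than the run-to-run fluctuation.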
Create a function that returns a list of n ≥ 1 random numbers between -1 and 1. Validate n: if n is less than 1, return an empty list.
(Bonus):
Measure the time (in seconds) needed to compute the sum of 1000 random numbers.
Solution
# your code here !
You can create your own modules and packages in Python. This makes your code easier to maintain and debug.
Some of the benefits of modularizing your code are:
Follow the next steps:
Create a new Python file (*.py). Name the file my_python_module.py.
Add the following code to it:
import random
def create_random_numbers(n):
    my_list = []
    for i in range(n):
        random_num = random.random()
        random_num = (random_num*2) - 1
        my_list.append(random_num)
    return my_list
We can import the module as follows:
import my_python_module
One way to see the contents of your Python module is to call dir(your_module).
dir(my_python_module)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'create_random_numbers', 'random']
Let's use the function of my_python_module:
n_random_numbers = 50
random_numbers = my_python_module.create_random_numbers(n_random_numbers)
That's it!
By using modules, we can make our analysis more robust, as we are reusing existing code. When we create our own modules, we can publish them to help ourselves and other programmers/scientists use them in future analyses 😀.
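The import statement comes in a few variants, and they work the same way for your own modules as for the standard library. Since a custom module file may not exist on every machine, this sketch uses the standard modules math and random instead. The alias rnd and the variable names are arbitrary choices.

```python
import math                    # import the whole module
import random as rnd           # import a module under a shorter alias
from math import floor, ceil   # import selected names only

root = math.sqrt(16)           # access through the module name
lower = floor(3.7)             # imported names are used directly
upper = ceil(3.2)

rnd.seed(0)
sample = rnd.random()          # access through the alias

print(root, lower, upper)
```

Importing the whole module keeps the origin of each function visible (math.sqrt), which tends to make longer analyses easier to read.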
Add the following function to your module my_python_module and test it:
def add_together(a, b):
    return a + b
# your code here !
Day #1 of the summer course "Introduction to High-Performance Computing in Python for Scientists!".
Goethe Research Academy for Early Career Researchers (GRADE), Goethe University Frankfurt, Germany. June 2022.