We will be using Python as the programming language, and we will use this first lesson to make sure your Python distribution is working properly. We will also make sure you have a way to use the command line on your machine.
Before diving into the Python, I pause here to remind you that this course is meant to help you unleash the power of your computer on your scientific problems. Python is just the language of instruction. That said, let's start talking about how Python works.
Python is an interpreted language, which means that the Python interpreter translates each line of code you write into a set of instructions that your machine can understand. This stands in contrast to compiled languages. For these languages (the dominant ones being Fortran, C, and C++), your entire code is translated into machine language before you ever run it. When you execute your program, it is already in machine language.
So, whenever you want your Python code to run, you give it to the Python interpreter.
There are many ways to launch the Python interpreter. One way is to type
python
on the command line of a terminal. This launches the vanilla Python interpreter. We will never really use this in the course. Rather, we will have a greatly enhanced Python experience using a notebook through JupyterLab.
Traditionally, the first program anyone writes when learning a new language is called "Hello, world." In this program, the words "Hello, world." are printed on the screen.
We will first write and run this little program using a JupyterLab console. After launching JupyterLab, you probably already have the Launcher in your JupyterLab window. If you do not, you can expand the Files tab at the left of your JupyterLab window (if it is not already expanded) by clicking on that tab, or alternatively hit ctrl+b (or cmd+b on macOS). At the top of the Files tab is a + sign, which gives you a Jupyter Launcher.
In the Jupyter Launcher, click the Python 3 icon under Console. This will launch a console, which has a large white space above a prompt that says In []:. You can enter Python code in this prompt, and it will be executed.
To print "Hello, world.", enter the code below. To execute the code, hit shift+enter.
print('Hello, world.')
Hooray! We just printed "Hello, world." to the screen. To do this, we used Python's built-in print() function. The print() function takes a string as an argument and prints that string to the screen. We will learn more about function syntax later, but we can already see the rough syntax with the print() function.
Now let's use our new knowledge of the print() function to have our computer say a bit more than just "Hello, world." Type these lines in at the prompt, hitting enter each time you need a new line. After you've typed them all in, hit shift+enter to run them.
# The first few lines from The Zen of Python by Tim Peters
print('Beautiful is better than ugly.')
print('Explicit is better than implicit.')
print('Simple is better than complex.')
print('Complex is better than complicated.')
Note that the first line is preceded by a # sign, and the Python interpreter ignored it. The # sign denotes a comment, which is ignored by the interpreter but is very important for the human reader!
While the console prompt was a nice way to enter these lines, a better option is to store them in a file and then have the Python interpreter run the lines in the file. This is how you typically store Python code, and the suffix of such files is .py.
So, let's create a .py file. To do this, use the JupyterLab Launcher to launch a text editor. Once it is launched, you can right click on the tab of the text editor window to change the name. We will call this file zen.py. Within this file, enter the four lines of code you previously entered in the console prompt. Be sure to save it.
To run the code in this file, you can invoke the Python interpreter at the command line, followed by the file name. I.e., enter
python zen.py
at the command line. Note that when you run code this way, the interpreter exits after completion of running the code, and you do not get a prompt.
To shut down the console, you can click on the Running tab at the left of the JupyterLab window and click on SHUTDOWN next to the console.
At this point, we have introduced JupyterLab, its text editor, and the console, as well as the Python interpreter itself. You might be asking....
From the Project Jupyter website:
Project Jupyter is an open source project that was born out of the IPython Project in 2014 as it evolved to support interactive data science and scientific computing across all programming languages.
So, Jupyter is an extension of IPython that pushes interactive computing further. It is language agnostic, as its name suggests. The name "Jupyter" is a combination of Julia (a new language for scientific computing), Python (which you know and love), and R (the dominant tool for statistical computation). However, you can run over 40 different languages in JupyterLab, not just Julia, Python, and R.
Central to Jupyter/JupyterLab are Jupyter notebooks. In fact, the document you are reading right now was generated from a Jupyter notebook. We will use Jupyter notebooks extensively in the bootcamp, along with .py files and the console.
When writing code you will reuse, you should develop fully tested modules using .py files. You can always import those modules when you are using a Jupyter notebook (more on modules and importing them later in the bootcamp). So, a Jupyter notebook is not good for an application where you are building reusable code or scripts. However, Jupyter notebooks are very useful in the following applications.
Now that we know what Jupyter notebooks are and what the motivation is for using them, let's start!
To launch a Jupyter notebook, click on the Notebook icon of the JupyterLab launcher. If you want to open an existing notebook, right click on it in the Files tab of the JupyterLab window and open it.
A Jupyter notebook consists of cells. The two main types of cells you will use are code cells and markdown cells, and we will go into their properties in depth momentarily. First, an overview.
A code cell contains actual code that you want to run. You can specify a cell as a code cell using the pulldown menu in the toolbar of your Jupyter notebook. Otherwise, you can hit Esc and then y (denoted Esc - y) while a cell is selected to specify that it is a code cell. Note that you will have to hit enter after doing this to start editing it.
If you want to execute the code in a code cell, hit Enter while holding down the Shift key (denoted Shift + Enter). Note that code cells are executed in the order you shift-enter them. That is to say, the order in which you hit Shift + Enter on the cells is the order in which the code is executed, regardless of where the cells sit in the notebook. If you did not explicitly execute a cell early in the document, its results are not known to the Python interpreter. This is a very important point and is often a source of confusion and frustration for students.
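To make the point about execution order concrete, here is a minimal sketch that imitates two notebook cells outside of a notebook. The cell contents and the names cell_1, cell_2, and namespace are made up for illustration, and exec() only stands in for what the notebook does when you hit Shift + Enter; it is not how you would normally run code.

```python
# Each string stands in for the contents of one notebook code cell (hypothetical cells).
cell_1 = "a = 3"
cell_2 = "b = a + 1"

# This dictionary plays the role of the interpreter's memory shared by all cells.
namespace = {}

# "Running" cell 2 before cell 1 fails, because the name a is not yet known.
try:
    exec(cell_2, namespace)
    ran_out_of_order = True
except NameError:
    ran_out_of_order = False

# Running the cells in order works.
exec(cell_1, namespace)
exec(cell_2, namespace)

print(ran_out_of_order)   # False
print(namespace["b"])     # 4
```

The same thing happens in a notebook if you skip a cell near the top and then run a later cell that depends on it.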
Markdown cells contain text. The text is written in markdown, a lightweight markup language. You can read about its syntax here. Note that you can also insert HTML into markdown cells, and this will be rendered properly. As you are typing the contents of these cells, the results appear as text. Hitting Shift + Enter renders the text in the formatting you specify.
You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting Esc - m in the cell. Again, you have to hit enter after using the quick keys to bring the cell into edit mode.
In general, when you want to add a new cell, you can click the + icon on the notebook toolbar. The shortcut to insert a cell below is Esc - b and to insert a cell above is Esc - a. Alternatively, you can execute a cell and automatically add a new one below it by hitting Alt + Enter.
Below is an example of a code cell printing hello, world.
Notice that the output of the print statement appears in the same cell, though separate from the code block.
# Say hello to the world.
print('hello, world.')
If you evaluate a Python expression that returns a value, that value is displayed as output of the code cell. This only happens, however, for the last line of the code cell.
# Would show 9 if this were the last line, but it is not, so shows nothing
4 + 5
# I hope we see 11.
5 + 6
When we learn about plotting with matplotlib and seaborn later in the course, you will learn about displaying graphics in Jupyter notebooks.
There are some keyboard shortcuts that are convenient to use in JupyterLab. We already encountered many of them. Importantly, pressing Esc brings you into command mode in which you are not editing the contents of a single cell, but are doing things like adding cells. Below are some useful quick keys. If two keys are separated by a + sign, they are pressed simultaneously, and if they are separated by a - sign, they are pressed in succession.
Quick keys | mode | action |
---|---|---|
Shift + Enter | both | run selected cell or cells - if no cells below, insert a code cell below |
Esc | edit | enter command mode |
m | command | switch cell to Markdown cell |
y | command | switch cell to code cell |
a | command | insert cell above |
b | command | insert cell below |
Shift + M | command | merge multiple selected cells into one cell |
dd | command | delete cell |
Tab | edit | code completion (or indent if at start of line) |
Ctrl + / | edit | toggle comment |
Ctrl + ] | edit | indent |
Ctrl + [ | edit | dedent |
Alt + Enter | edit | execute cell and insert a cell below |
There are many others (and they are shown in the pulldown menus within JupyterLab), but these are the ones I seem to encounter most often.
Python is an interpreted language. We can collect sequences of commands into a text file and save it as a Python program. By convention, these files have the file extension ".py", for example hello.py.
We can also enter individual commands at the Python prompt which are immediately evaluated and carried out by the Python interpreter. This is very useful for the programmer/learner to understand how to use certain commands (often before one puts these commands together in a longer Python program). Python’s role can be described as Reading the command, Evaluating it, Printing the evaluated value and repeating (Loop) the cycle – this is the origin of the REPL abbreviation.
Python comes with a basic terminal prompt; you may see examples from this with >>> marking the input:
>>> 2 + 2
4
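The read, evaluate, print, loop cycle can be imitated in a few lines. This is only a toy sketch: the list of inputs is made up, it handles expressions but not statements such as assignments, and the real interpreter prompt does considerably more.

```python
# A toy read-eval-print loop over a fixed list of inputs (hypothetical inputs).
inputs = ["2 + 2", "47 * 11", "10 / 0.5"]

results = []
for line in inputs:          # Read: take the next line of input
    value = eval(line)       # Evaluate: compute the value of the expression
    print(">>>", line)       # Echo the input the way the prompt would show it
    print(value)             # Print: display the evaluated value
    results.append(value)    # Loop: go on to the next line
```

After the loop finishes, results holds 4, 517, and 20.0, the same values the interactive prompt would have displayed.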
We are using a more powerful REPL interface, the Jupyter Notebook. Blocks of code appear with an In prompt next to them:
4 + 5
To edit the code, click inside the code area. You should get a colored border around it. To run it, press Shift-Enter.
10 + 10000
42 - 1.5
47 * 11
10 / 0.5
2 + 2
# This is a comment
2 + 2
2 + 2 # and a comment on the same line as code
Parentheses can be used for grouping:
2 * 10 + 5
2 * (10 + 5)
Whether you are programming in Python or pretty much any other language, you will be working with variables. While the precise definition of a variable will vary from language to language, we'll focus on Python variables here. Like many of the concepts in this course, though, the knowledge you gain about Python variables will translate to other languages.
We will talk more about objects later, but a variable, like everything in Python, is an object. For now, you can think of it this way. The following can be properties of a variable:
Its type: is it an integer, like 2, or a string, like 'Hello, world.'?
Depending on the type of the variable, you can do different things to it and other variables of similar type. This, as with most things, is best explored by example. We'll go through some of the properties of variables and things you can do to them in this tutorial.
Variable names in Python can contain the alphanumeric characters a-z, A-Z, 0-9 and some special characters such as _. Normal variable names must start with a letter or an underscore; they cannot start with a digit.
By convention, variable names start with a lower-case letter, and Class names start with a capital letter.
In addition, there are a number of Python keywords that cannot be used as variable names. These keywords are:
False, None, True, and, as, assert, async, await, break, class, continue,
def, del, elif, else, except, finally, for, from, global, if, import,
in, is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield
Note: Be aware of the keyword lambda, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.
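If you are ever unsure whether a name is allowed, you can ask Python itself. The sketch below uses the standard library's keyword module and the string method isidentifier(); the candidate names are made up for illustration.

```python
import keyword

# lambda is a keyword, but a shortened form such as lam is not.
print(keyword.iskeyword("lambda"))   # True
print(keyword.iskeyword("lam"))      # False

# A usable variable name must be a valid identifier and must not be a keyword.
candidates = ["x", "_tmp", "speed2", "2fast", "lambda"]
valid = [name for name in candidates
         if name.isidentifier() and not keyword.iskeyword(name)]
print(valid)                         # ['x', '_tmp', 'speed2']
```

The full list of keywords for your Python version is available as keyword.kwlist.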
# variable assignments
x = 1.5
First, Python creates the object 1.5. Everything in Python is an object, and so is the floating point number 1.5. This object is stored somewhere in memory. Next, Python binds a name to the object. The name is x, and we often refer casually to x as a variable, an object, or even the value 1.5. However, technically, x is a name that is bound to the object 1.5. Another way to say this is that x is a reference to the object.
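One way to see that names are bound to objects is to bind a second name to the same object and then rebind the first. This is a small sketch; the name y is introduced only for illustration, and the is operator and the built-in id() function are used here to ask whether two names refer to the same object.

```python
x = 1.5
y = x                    # bind a second name to the same object

print(x is y)            # True: both names refer to one object
print(id(x) == id(y))    # True: same identity

x = 2.5                  # rebind x to a different object
print(y)                 # 1.5: y is still bound to the original object
print(x is y)            # False
```

Rebinding x does not touch the object 1.5 itself, which is why y is unaffected.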
Note, however, that if the last line does not return a value, for example when we assign a value to a variable, there is no visible output from the code cell.
Once the variable x has been created through assignment of 1.5 in this example, we can make use of it:
x*3
x**2
In computer programs we often find statements like
x = x + 1
If we read this as an equation, as we are used to from mathematics, x = x + 1, we could subtract x on both sides to find that 0 = 1. We know this is not true, so something is wrong here.
The answer is that “equations” in computer code are not equations but assignments. They always have to be read in the following two-step way:
Evaluate the value on the right hand side of the equal sign
Assign this value to the variable name shown on the left hand side. (In Python: bind the name on the left hand side to the object shown on the right hand side.)
Some computer science literature uses the following notation to express assignments and to avoid the confusion with mathematical equations:
$$x \leftarrow x + 1$$
Let’s apply our two-step rule to the assignment x = x + 1 given above:
Evaluate the value on the right hand side of the equal sign: for this we need to know what the current value of x is. Let’s assume x is currently 4. In that case, the right hand side x+1 evaluates to 5.
Assign this value (i.e. 5) to the variable name shown on the left hand side x.
Let’s confirm with the Python prompt that this is the correct interpretation:
x = 4
x = x + 1
x
In Python, multiple assignments can be made in a single statement. Either a single value can be assigned to several variables simultaneously:
a = b = c = 0 # initialise a, b and c with 0
or multiple values could be assigned to multiple variables using unpacking:
a, b, c = 5, 3.2, 7
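The two-step reading of assignment (evaluate the right hand side first, then bind the names on the left) also explains a common idiom: swapping two variables with unpacking. A minimal sketch:

```python
a, b = 5, 3.2

# The right hand side (b, a) is evaluated first, giving the values 3.2 and 5.
# Only then are the names a and b on the left bound to those values.
a, b = b, a

print(a, b)   # 3.2 5
```

No temporary variable is needed, because both old values are read before either name is rebound.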
Although not explicitly specified, a variable does have a type associated with it. The type is derived from the value that was assigned to it.
type(4 / 2)
If we assign a new value to a variable, its type can change.
x = 1
type(x)
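To see the type actually change, we can record the type after each assignment. In this sketch the names first_type, second_type, and third_type are introduced only for illustration.

```python
x = 1
first_type = type(x)     # int

x = 1.0                  # binding x to a float changes the type reported for x
second_type = type(x)    # float

x = 'one'                # and binding it to a string changes it again
third_type = type(x)     # str

print(first_type, second_type, third_type)
```

The type belongs to the object, not to the name, so the same name can refer to objects of different types over the course of a program.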
If we try to use a variable that has not yet been defined we get a NameError:
print(y)
# integers
x = 1
type(x)
# float
x = 1.0
type(x)
# boolean
b1 = True
b2 = False
type(b1)
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)
print(x)
print(x.real, x.imag)
x = 1.9
print(x, type(x))
x = int(x)
print(x, type(x))
z = complex(x)
print(z, type(z))
a, b = 3+5j, complex(3, 5)
print(a, b)
x = float(z)
Complex variables cannot be cast to floats or integers. We need to use z.real or z.imag to extract the part of the complex number we want:
print(z.real, type(z.real))
print(z.imag, type(z.imag))
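The failed cast can also be handled without stopping the program. The sketch below uses a try/except block, which has not been introduced yet, so take it as a preview; the name cast_worked is made up for illustration.

```python
z = complex(1)           # the complex number 1 + 0j

try:
    float(z)             # raises TypeError, even though the imaginary part is zero
    cast_worked = True
except TypeError as err:
    cast_worked = False
    print("TypeError:", err)

# Extract the real part explicitly, or take the magnitude with abs().
print(z.real)            # 1.0
print(abs(z))            # 1.0
```

Python refuses the cast even when the imaginary part is zero, so the choice of which part to keep is always made explicitly.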
Operators allow you to do things with variables, like add them. They are represented by special symbols, like + and *. For now, we will focus on arithmetic operators. Python's arithmetic operators are
action | operator |
---|---|
addition | + |
subtraction | - |
multiplication | * |
division | / |
exponentiation | ** |
modulus | % |
floor division | // |
Warning: Do not use the ^ operator to raise to a power. That is actually the operator for bitwise XOR, which we will not cover in the course.
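A quick comparison shows why this matters: with integers, ^ does not raise an error, it silently gives a different answer. The names power and xor are used only for illustration.

```python
power = 2 ** 3   # exponentiation: 2 * 2 * 2
xor = 2 ^ 3      # bitwise XOR of binary 10 and 11, which is binary 01

print(power)     # 8
print(xor)       # 1
```

Because no error is raised, this kind of mistake can go unnoticed in a calculation.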
10 % 3
1 + 2, 1 - 2, 1 * 2, 1 / 2
1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0
# Integer division of float numbers
3.0 // 2.0
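Floor division and modulus go together: the first gives the quotient rounded down, the second gives what is left over. Here is a small sketch with made-up numbers; divmod() is a built-in function that returns both at once.

```python
a, b = 17, 5

q = a // b                 # quotient, rounded down
r = a % b                  # remainder

print(q, r)                # 3 2
print(q * b + r == a)      # True: quotient and remainder reconstruct a
print(divmod(a, b))        # (3, 2)

# Floor division rounds toward negative infinity, not toward zero.
print(-17 // 5, -17 % 5)   # -4 3
```

The last line is worth remembering, since some other languages round toward zero instead.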
# Note! The power operator in Python isn't ^, but **
2 ** 3
and, using the fact that $\sqrt[n]{x} = x^{1/n}$, we can compute $\sqrt{3} = 1.732050\dots$ using **:
3**0.5
(3*2)**4
The order of operations is also as we would expect. Exponentiation comes first, followed by multiplication and division, floor division, and modulo. Next comes addition and subtraction. In order of precedence, our arithmetic operator table is
precedence | operators |
---|---|
1 | ** |
2 | * , / , // , % |
3 | + , - |
You can also group operations with parentheses. Operations within parentheses are always evaluated first. As a watchout, do not use excessive parentheses. So often, I see students not trusting the order of operations and polluting their code with lots of parentheses, making it unreadable. This has been the source of countless bugs I've encountered in student code through the years.
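As a small illustration of the precedence table, the three expressions below use the same numbers and operators and differ only in their parentheses. The names are made up for illustration.

```python
no_parens = 2 + 3 * 4 ** 2        # ** first (16), then * (48), then + (50)
group_sum = (2 + 3) * 4 ** 2      # the sum is forced first: 5 * 16
group_all = ((2 + 3) * 4) ** 2    # the sum, then the product, then the square: 20 ** 2

print(no_parens)   # 50
print(group_sum)   # 80
print(group_all)   # 400
```

Parentheses are needed only where you want something other than the default order.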
Let's practice.
1 + 4**2
1 + 4/2
1**3 + 2**3 + 3**3 + 4**3
(1 + 2 + 3 + 4)**2
Interestingly, we also demonstrated (for $n = 4$) that the sum of the first $n$ cubes is equal to the square of the sum of the first $n$ integers. Fun!
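We can check this identity for more than one value of $n$. The sketch below uses a for loop and the built-in sum() and range() functions, which we have not covered yet, so treat it as a preview. It is a numerical check for $n$ from 1 through 10, not a proof.

```python
checks = []
for n in range(1, 11):
    sum_of_cubes = sum(k**3 for k in range(1, n + 1))    # 1**3 + 2**3 + ... + n**3
    square_of_sum = sum(range(1, n + 1))**2              # (1 + 2 + ... + n)**2
    checks.append(sum_of_cubes == square_of_sum)

print(all(checks))   # True
```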
The Boolean operators are spelled out as the words and, not, and or.
5 == 5.0
not False
True or False
Comparison operators: >, <, >= (greater or equal), <= (less or equal), == (equality), is (identical).
2 > 1, 2 < 1
2 > 2, 2 < 2
2 >= 2, 2 <= 2
# equality
[1,2] == [1,2]
# objects identical?
l1 = l2 = [1,2]
l1 is l2
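The difference between == and is shows up when two separate objects happen to hold the same contents. In this sketch, l3 is a name introduced for illustration; it refers to a second list built independently of l1 and l2.

```python
l1 = l2 = [1, 2]   # two names bound to one list object
l3 = [1, 2]        # a separate list with the same contents

print(l1 == l3)    # True: equal values
print(l1 is l3)    # False: different objects
print(l1 is l2)    # True: the same object

# Changing the shared list through one name is visible through the other.
l1.append(3)
print(l2)          # [1, 2, 3]
print(l3)          # [1, 2]
```

In most code you want ==, which compares values; is answers the narrower question of whether two names refer to the very same object.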
Operator | What it means |
---|---|
== | Equal to |
!= | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |