
Introduction to Python Day 1: Basic Syntax, Variables & Arrays¶

written by Jackie Champagne (UT Austin), adapted by Hannah Hasson (U of Rochester)¶

Hi there! Welcome to our brief introductory Python course for scientific coding. We will be teaching you skills ranging from the most basic tasks to some more advanced plotting techniques that will be helpful to you in research.

These Colab notebooks work the same way as Jupyter notebooks, a code editor you can also use offline. They have a mix of text cells and code cells. You can execute a code cell by clicking the "play" button in the top left of the cell, or by pressing Shift+Enter.

Make sure to execute all the code cells as we go through the lesson!

The whole notebook is saved automatically at regular intervals, but you can also save the outputs from your code as text files, plots, or images separate from the notebook. Notebooks are a great tool when you are building code from scratch and want to troubleshoot it or make a quick plot.

If you are attending this workshop live, the Questions in this notebook are meant to be a few short lines of code that you will write during the workshop. The Exercises, in a separate notebook, are longer problems you will work on after lecture, with the expectation that all exercises are finished by the end of the course. You are encouraged to work collaboratively.

Let's get started!

Importing packages¶

The first thing you'll need to do when writing any code is import the packages you expect to use. Packages are collections of functions and constants built for some purpose. For example, the numpy package has mathematical functions like sine (numpy.sin) and constants like $\pi$ (numpy.pi).

You can put your import statement anywhere in your code, as long as it's written before you call anything from the package. But it's cleaner to put them all at the top, so that's what we'll do.

For this tutorial, let's load up numpy and a couple of other useful packages. To run the code in a cell, click on the cell and press SHIFT+ENTER or click the play button at the top left of the cell. To add a comment to your code, put a hash symbol # in front of it; Python ignores everything after the # on that line. Use comments to add notes explaining your code.

Some examples of imports and commenting:

In [ ]:

import numpy as np #'import' loads up the package. 

# you can use 'as' to define a shortcut so you don't need to type
# numpy before every use of the package. Most people use 'np'.

import matplotlib.pyplot as plt

from scipy import integrate #'from' allows you to import a specific sub-package

Now we don't have to import these packages again for the rest of our notebook! Calling a function or constant from one of your imported packages is simple.

function:

nameofpackage.somefunction(arguments,go,here)

constant:

nameofpackage.someconstant

Execute the example below to call the constant pi

In [ ]:

np.pi #remember we renamed numpy as np
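
Here is a small sketch of both patterns side by side: calling a function from a package and using one of its constants. The variable names root and sine_value are made up for this illustration.

```python
import numpy as np

# Function call: package.function(arguments)
root = np.sqrt(16)              # square root of 16

# A constant used as the argument of a function
sine_value = np.sin(np.pi / 2)  # sine of 90 degrees (numpy works in radians)

print(root)
print(sine_value)
```

Both results come back as floats, which is why you will see a decimal point in the output.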

If you try to use something from a package without importing it first, you will get a sassy little error message (a NameError, because Python does not know the name pandas yet):

In [ ]:

pandas.read_csv("some_filename.pdf")

Setting Variables¶

Variables are a way to store information that you can later access or modify. The information stored may be numbers, characters, or lists of either. Variables are typically given descriptive names to help us and others understand what our code is doing.

To define a variable, use the = sign.

The variable gets assigned the value of whatever you put to the right of the equal sign. This can be a number, text, or many other data types.

In [ ]:

a = 1
b = 2

Now Python will know that a is 1 for the rest of this notebook, unless you assign a new value to a. Setting variables is useful for things like constants, such as g=9.8 (acceleration due to gravity in $m/s^2$). Note that a variable name can't start with a number!

To check that this worked, we can print it out. The syntax for printing something is print(thing_you_want_to_print).

In [ ]:

print(a) #this prints the value of a
print("a") #this prints the letter a

Every new print statement goes to a new line. If you want to print multiple things together on the same line, you can separate them with commas inside your print function. Python puts a space between each item.

In [ ]:

print("a equals", a, "and a+b is", a+b)

Boolean logic: comparing variables¶

The double equals sign, ==, is the equality comparison operator. It compares two values and gives back True if they are equal and False if they are not. Values that can only be True or False are called booleans. This will come in handy when you write code where your data must meet a certain condition.

Remembering that we set a=1 earlier:

In [ ]:

a == 1 #check if a is equal to 1

In [ ]:

a == 2 #check if a is equal to 2

Do NOT confuse the single equals = (assign variable) with the double == (check if two things are equal).

We can create criteria with multiple booleans:

OR statements (or): statement is TRUE if either A or B is true (or if both are true); statement is FALSE if both statements A and B are false.

AND statements (and): statement is TRUE if and only if both A and B are true; statement is FALSE if one or both is false.

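In Python, the phrase a == 1 or 2 does not do what you might expect. Python reads it as (a == 1) or 2, and because a nonzero number like 2 counts as true, the result never comes out false. The full statement must be a == 1 or a == 2 so that each comparison is written out separately.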

In [ ]:

a == 1 or a == 2 #True OR False

In [ ]:

a == 1 or b == 2 #True OR True

In [ ]:

a == 1 and a == 2 #True AND False

In [ ]:

a == 1 and b == 2 #True AND True
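
To see why the shorthand goes wrong, here is a minimal sketch using a made-up variable c (so that a and b from above are left alone). The names wrong and right are also just for illustration.

```python
c = 3  # a value that is neither 1 nor 2

wrong = c == 1 or 2       # Python reads this as (c == 1) or 2
right = c == 1 or c == 2  # two complete comparisons

print(wrong)  # 2, which Python treats as true
print(right)  # False
```

The shorthand version hands back the number 2 instead of False, so a condition written that way would always pass.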

There are also other comparison operators. Here are all the basic ones you will use:

==   equal to
!=   not equal to
<    less than
<=   less than or equal to
>    greater than
>=   greater than or equal to

In [ ]:

a >= 0

In [ ]:

a < 1

Variable types¶

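There are 3 kinds of basic variables we will use most in Python: floats, integers, and strings. (The True/False values from the previous section are a fourth basic type, called booleans.)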

A floating point value (float) is a number with a decimal point. An integer is a whole number. A string has quotes around it and is treated as a word rather than a numeric value.

Question 1: What kinds of variables are the following? Fill it in as a comment on each line.¶

In [ ]:

i = 1 # type here
j = 2.43 #
k = 'Hello world!' # 
L = 3. #
m = "123456" # 

Notice that this next line gives you an error. Why?

In [ ]:

n = Hello world!

You can convert between variable types if necessary, using the following commands:

int()
float()
str()

int() returns the whole-number part of a float, chopping off everything after the decimal point (it does not round). float() turns an integer into a float by adding a .0, which sounds pointless but is sometimes necessary for Python arithmetic. str() turns a value into a string, so that Python reads it as text rather than as a number.

Check out what each of these does to L (which we defined above as 3.):

In [ ]:

print(int(L)) #make it an integer
print(float(L)) #it's already a float
print(str(L)) #make it a string
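
A few more conversions, as a rough sketch with made-up values and variable names, to show that int() chops rather than rounds and that a converted string behaves like text:

```python
chopped = int(2.99)         # everything after the decimal point is dropped
chopped_negative = int(-2.99)
as_float = float(7)         # integer 7 becomes the float 7.0
as_text = str(7) + str(7)   # strings are joined, not added

print(chopped)           # 2
print(chopped_negative)  # -2
print(as_float)          # 7.0
print(as_text)           # 77
```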

Question 2: Convert `i` (defined in the previous example) to a float and to a string, assigning each to a new variable name. Convert `j` into an integer and assign it to a variable. Use `print(type(variable_name))` to check each answer.¶

In [ ]:

# solution here

Now convert k to an integer. What happens?¶

In [ ]:

# solution here

Finally, convert m to an integer. Then check its type with the command `type(new_variable_name)`.¶

In [ ]:

# solution here

Python arithmetic¶

The syntax for doing arithmetic is the following:

+           add
-           subtract
*           multiply
/           divide
**          power
np.log()    log-base e (natural log)
np.log10()  log-base 10
np.exp()    exponential

In [ ]:

3**2

In [ ]:

8 + 9

In [ ]:

i + j

In [ ]:

i * j

In [ ]:

j - i

In [ ]:

i / j 

Wait a minute... where did that extra 0.0000000000000002 come from when we subtracted (j - i)? This is due to something called floating-point error. The computer stores these numbers in binary (0's and 1's) before doing the subtraction, and because many decimals cannot be represented exactly in binary, small errors are introduced.
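
One practical consequence is that checking floats with == can surprise you. The sketch below uses the classic 0.1 + 0.2 example (not from this notebook) and numpy's isclose function, which checks whether two numbers agree to within a small tolerance. The variable name total is made up.

```python
import numpy as np

total = 0.1 + 0.2

print(total == 0.3)            # False, because of floating-point error
print(np.isclose(total, 0.3))  # True, equal to within a small tolerance

# The same idea applied to the subtraction from above (j - i)
print(np.isclose(2.43 - 1, 1.43))  # True
```

So when a condition depends on two floats being "equal", a tolerance-based check like this is usually the safer choice.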

Arrays and Lists¶

When working with data, you usually won't be dealing with just one number, but a collection of values. These collections can consist of floats, integers, strings, or a combination of them. We distinguish here two types of data structures: lists and arrays.

A list is denoted by brackets: [ ], while an array must be defined with np.array(). The prefix "np" is required because array is a function of the numpy package, which we imported as np.

We talk about the size of 2D arrays in terms of their dimensions: (rows, columns). You will later reference a specific element in an array by its (row, column) coordinate in the array. This is called indexing.

The following is a 1D list:

In [ ]:

beemovie = ['Barry B. Benson', 'Vanessa Bloome', 'Ray Liotta as Ray Liotta']
print(beemovie)

For strings, a list is fine, but you will need to use arrays in order to manipulate numbers mathematically. The array function is part of numpy. Here are a 1D array and a 2D array:

In [ ]:

myarray = np.array([1, 2, 3]) #1D
my2darray = np.array([[1, 2], [1, 2]]) #2D
print(myarray)
print(my2darray)

Recall that in the beginning we imported numpy as np, so when we call the 'array' function from numpy we write np.array().

Notice the array function has parentheses (), and then the whole array must be enclosed in a set of brackets [ ] inside that. Within that, each row of the array should be in its own set of brackets, separated by commas.
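
To make the difference between lists and arrays concrete, here is a minimal sketch that multiplies each by 2. The variable names are made up for this illustration.

```python
import numpy as np

numbers_list = [1, 2, 3]
numbers_array = np.array([1, 2, 3])

doubled_list = numbers_list * 2    # repeats the whole list
doubled_array = numbers_array * 2  # multiplies each element by 2

print(doubled_list)   # [1, 2, 3, 1, 2, 3]
print(doubled_array)  # [2 4 6]
```

Only the array does the math you probably wanted, which is why arrays are the tool of choice for numerical data.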

Question 3: Create the following 2D array and then print it:¶

1 2 3
4 5 6

In [ ]:

#solution here

Populating Arrays¶

You don't always have to put values into your array manually, especially if, for example, you want an array of evenly spaced numbers to apply some function to.

Here are two ways to make arrays of evenly spaced numbers:

The first is np.linspace() and the second is np.arange(). Both give you an array with numbers between two values that are linearly spaced (e.g. 2, 4, 6, 8, 10), but they do it slightly differently.

The syntax is the following:

np.linspace(beginning_number, end_number, number_of_points)
np.arange(beginning_number, end_number, step_size)

The step_size here corresponds to the difference between one point and the next in your array of values.

Typically np.linspace is used when you know how many datapoints you need, and np.arange is used when you want to jump by a certain amount between each number. This just saves you some mental math when you know only how many points you want or only the spacing.

An important note is that np.linspace is an inclusive function, meaning that the end number you give is included in the output array. However, np.arange is exclusive, meaning the end number is not included. Keep this in mind as we move forward!
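
Here is a quick sketch of the difference, using made-up numbers and variable names (by_count and by_step) rather than the ones from the question below:

```python
import numpy as np

by_count = np.linspace(0, 10, 5)  # 5 points from 0 to 10, end number included
by_step = np.arange(0, 10, 2)     # steps of 2 starting at 0, stopping before 10

print(by_count)  # values 0, 2.5, 5, 7.5, 10
print(by_step)   # values 0, 2, 4, 6, 8
```

Notice that 10 shows up in the linspace result but not in the arange result.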

Question 4: Create a `linspace` array with ten entries between 1 and 100. Create an `arange` array from 100 to 200 (with 200 included!) spaced by 10s. Print them to check :)¶

In [ ]:

# solution here

Another quick way to create an array is through np.zeros. This populates an array with, well, zeros. It might sound useless at first, but it's an easy way to make an array that you will later replace with different values. It helps keep arrays at a fixed length, for instance. More on that later.

np.zeros takes the shape of the array you want. If it's 1D, that is just the number of elements:

np.zeros(3) #1D array of 3 zeros

If it's larger than 1D, you need two sets of parentheses, because the shape (rows, columns) is passed as a single argument:

np.zeros((rows, columns))
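
As a small sketch of both forms (the sizes and the names row and grid are made up for illustration):

```python
import numpy as np

row = np.zeros(4)        # 1D: four zeros
grid = np.zeros((2, 4))  # 2D: 2 rows and 4 columns, note the double parentheses

print(row)
print(grid)
```

The zeros are stored as floats, so they print as 0. rather than 0.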

Question 5: Create a 3x3 array of zeros (and print it out)¶

In [ ]:

#solution here

PAUSE HERE AND TAKE A BREAK!¶

Indexing¶

An index is the position of some element in an array. Remember above we talked about using the coordinates (row, column) of an element in an array to get its value?

Note that Python uses zero-based indexing!!¶

This means that the first value of an array is the 0th index. Repeating that: the first value in an array is the zeroth index. So the second element has index 1, the third has index 2, and so on...

To call a certain value from an array, call the array name followed by brackets containing the index of the value you want:

arrayname[0]

Here is a quick example:

In [ ]:

hi = np.array([92, 73, 85, 61])
hi[2] #get the 3rd element of the array

The value inside the brackets can also be a variable, so long as the variable is an integer. You can't have the 1.5th element of a list.

To index a 2D array, you would give the row and column indices:

arrayname[0,0] # again in row, col notation

For example:

In [ ]:

hello = np.array([["who", "what", "when"], 
                  ["where", "why", "how"]])

hello[1,2] #get the element in the 2nd row & 3rd column

A helpful shortcut is that you can also count backwards in your array with a negative sign, so the last value in your array is always array[-1].
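
A short sketch of counting from the front and from the back, using made-up arrays called scores and grid:

```python
import numpy as np

scores = np.array([92, 73, 85, 61])
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(scores[0])    # 92, the first element
print(scores[-1])   # 61, the last element
print(scores[-2])   # 85, second from the end
print(grid[1, -1])  # 6, last column of the second row
```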

Question 6: Print out the first and last value in your linspace array.¶

In [ ]:

#solution here

Slicing arrays¶

Finally, you can also grab slices of arrays between certain index values. This is helpful if you want to plot only a small subsample of your data, for example.

For slicing, use a colon. Syntax:

:x - from beginning to index x
x: - from index x until the end
a:b - from index a up to, but not including, index b
a:b:c - every c'th entry from index a up to, but not including, index b

These can be combined, e.g. a::c goes from index a until the end in steps of c.

Slicing is exclusive, so the last index of a range isn't included. For example, if you want to take index two through six of an array you should do:

array[2:7]

You can also slice a 2D matrix, which just has two slicing arguments:

array_2D[3:10,4:30]
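
Each slicing pattern above can be tried on a small example. This sketch uses made-up arrays called data and table:

```python
import numpy as np

data = np.arange(10)  # the integers 0 through 9

print(data[:3])     # [0 1 2]
print(data[7:])     # [7 8 9]
print(data[2:7])    # [2 3 4 5 6], index 7 is not included
print(data[1:8:3])  # [1 4 7]
print(data[4::2])   # [4 6 8]

table = np.array([[0, 1, 2, 3],
                  [4, 5, 6, 7],
                  [8, 9, 10, 11]])

corner = table[0:2, 1:3]  # rows 0 and 1, columns 1 and 2
print(corner)
```

Because the values in data match their index positions, it is easy to check by eye which entries each slice picks out.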

Question 7: Print out the following:¶

a) your linspace array through (including) index 5;¶

b) your arange array beginning at index 1;¶

c) your linspace array for indices 4 through 8; and¶

d) your full arange array in steps of 2 indices.¶

In [ ]:

#solution here

Array Manipulation and Attributes¶

You can manipulate all elements of an array with one statement. Check it out:
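
For instance, here is a minimal sketch that converts a whole array of temperatures from Celsius to Fahrenheit in a single line. The values and names are made up for illustration.

```python
import numpy as np

celsius = np.array([0.0, 25.0, 100.0])

# The arithmetic is applied to every element at once
fahrenheit = celsius * 9 / 5 + 32

print(fahrenheit)  # values 32, 77, 212
```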

Question 8: Create a new array which is your `arange` array divided by 2. Create another new array which is your `linspace` array + 2.¶

In [ ]:

#solution here

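You can add values to the end of an array using np.append(). Note that np.append() gives back a new array and leaves the original unchanged, so assign the result to a variable if you want to keep it. Use it like this: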

np.append(array, something_appended)

You can even append another array, like this:

np.append(array, [5, 6])
np.append(array1, array2)
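
A small sketch, with made-up names first and longer, showing that np.append hands back a new array while the original stays as it was:

```python
import numpy as np

first = np.array([1, 2, 3])
longer = np.append(first, [4, 5])  # store the result in a new variable

print(first)   # [1 2 3], unchanged
print(longer)  # [1 2 3 4 5]
```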

Question 9: Create a new array which is another linear array from 100 to 200 and append it to your linspace array.¶

In [ ]:

#solution here

The last part of today will be showing you how to get different information from an array. Some of these are attributes or methods of the array, and some of them are functions of numpy, so you may need to look this up again in the future.

Attributes and methods of the array are called by nameofarray.command (methods need parentheses):

ndim - number of dimensions of your array
size - total number of elements in the array
shape - shape given by (rows, columns)
flatten() - collapses the array into a 1D array
T - transpose of the matrix
reshape(x, y) - change the dimensions of the array to x, y -- the total number of elements (x*y) MUST match the original

Functions of numpy are called by np.command(nameofarray):

sum - sum of all the elements in the array
min - minimum value in the array
max - maximum value in the array
sort - sorted copy of the array in ascending order (along each row for a 2D array)
dot - matrix multiplication; takes two arrays, np.dot(array1, array2)

One exception is len, which is built into Python rather than numpy. Call it as len(nameofarray) to get the number of elements along the first axis (the number of rows of a 2D array).
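
Here is a rough sketch that tries several of these on a made-up 2x3 array called grid. Note that len is a Python built-in, so it is called as len(grid) with no np in front.

```python
import numpy as np

grid = np.array([[3, 1, 2],
                 [6, 5, 4]])

# Called on the array itself
print(grid.ndim)       # 2
print(grid.size)       # 6
print(grid.shape)      # (2, 3)
print(grid.flatten())  # [3 1 2 6 5 4]
print(grid.T.shape)    # (3, 2)

# Called through numpy
print(np.sum(grid))    # 21
print(np.min(grid))    # 1
print(np.max(grid))    # 6
print(np.sort(grid))   # each row sorted in ascending order

# Python built-in
print(len(grid))       # 2, the number of rows
```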

Question 10: Find the shape of your last array, and then print the sum of that array¶

In [ ]:

#solution here

You will have received a link to exercises to do at the end of each day's lesson (Exercises.ipynb). Please do the Day 1 exercises with your fellow students before the start of the next session!¶
