Scientific Programming I

Justin Kitzes

Thanks to Greg Wilson and Matt Davis for ideas.

1. Individual things

The most basic components of any programming language are "things", also called variables or, in some languages such as Python, objects.

The most common basic "things" in Python are integers, floats, strings, booleans, and some special objects of various types. We'll meet many of these as we go through the lesson.

TIP: To run the code in a cell quickly, press Ctrl+Enter. To run the code in a cell and then proceed to the next cell, press Shift+Enter. The Help menu above lists more shortcuts under the Keyboard Shortcuts menu item. You can also use the buttons under the menu to delete cells (use the Cut button), create new ones, move cells, etc.

In [ ]:
# A thing
In [ ]:
# Use print to show multiple things in the same cell
In [ ]:
# Things can be stored as variables
In [ ]:
# We can use type to determine what kind of thing we have
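
A minimal sketch of what the cells above might contain. The variable names and values (count, height, name, is_hungry) are illustrative assumptions, not part of the original lesson.

```python
# A few individual "things" stored as variables (names are illustrative)
count = 3          # an integer
height = 1.75      # a float
name = "bear"      # a string
is_hungry = True   # a boolean

# print can show several things from the same cell
print(count, height, name, is_hungry)

# type reports what kind of thing each variable holds
print(type(count), type(height), type(name), type(is_hungry))
```

In a notebook, the last expression in a cell is displayed automatically, which is why print is only needed when you want to show more than one thing.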

2. Commands that operate on things

While things are great, things by themselves aren't that much use to us. Right away, we'd like to start doing stuff, that is, performing operations, with various things. There are three common means of performing an operation on a thing.

2.1 Use an operator

All of the basic math operators work like you think they should for numbers. They can also do some useful operations on other things, like strings. There are also boolean operators that compare quantities and give back a bool variable as a result.

In [ ]:
# Standard math operators work as expected on numbers
In [ ]:
# There are also operators for strings
In [ ]:
# Boolean operators compare two things
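
One possible way to fill in the cells above, offered as a sketch. All variable names and numbers are made up for illustration, and the division behavior shown assumes Python 3.

```python
# Standard math operators on numbers
total = 2 + 3 * 4       # multiplication is applied before addition
ratio = 7 / 2           # true division gives a float in Python 3
floor_ratio = 7 // 2    # floor division drops the fractional part
squared = 3 ** 2        # ** is exponentiation

# Operators on strings
greeting = "Hello, " + "world"   # + joins two strings
repeated = "ab" * 3              # * repeats a string

# Comparisons give back a bool
is_bigger = total > squared
is_equal = floor_ratio == 3

print(total, ratio, floor_ratio, squared)
print(greeting, repeated)
print(is_bigger, is_equal)
```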

2.2 Use a function

Functions will be very familiar to anyone who has programmed in any language, and they work as you would expect.

In [ ]:
# There are thousands of functions that operate on things
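
As a small sketch, here are a few of Python's built-in functions applied to things. The variable names and arguments are illustrative assumptions.

```python
# Built-in functions take things as arguments and give back new things
biggest = max(4, -2, 9)        # largest of the arguments
magnitude = abs(-2.5)          # absolute value
n_chars = len("petri dish")    # number of characters in a string
rounded = round(12.7)          # nearest integer
as_text = str(3.14)            # convert a number to a string

print(biggest, magnitude, n_chars, rounded, as_text)
```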

TIP: To find out what a function does, you can type its name and then a question mark to get a pop-up help window. The window will describe the function and also tell you what input parameters it takes. Parameters in square brackets are optional.

In [ ]:
 

TIP: Many useful functions are not built in to core Python, but are found in external scientific packages. These need to be imported into your Python notebook (or program) before they can be used. In addition to functions, these external packages also often include new types of things, such as numeric arrays and data tables. We'll meet several of the most important scientific packages in the next lesson. When these packages are imported into Python, we usually refer to them as modules.

In [ ]:
# Import the random module and use a function from it
# After importing a module, use a '.' to look inside of it
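
A minimal sketch of importing and using the random module. The alias rnd is an assumption, chosen here only because a later exercise refers to rnd.random(); the seed value is arbitrary.

```python
# Import the standard-library random module under a shorter name.
# The alias "rnd" is an assumption, matching the name used in a later exercise.
import random as rnd

rnd.seed(42)                # fixing the seed makes the example repeatable
draw = rnd.random()         # a float in the interval [0.0, 1.0)
roll = rnd.randint(1, 6)    # an integer from 1 to 6, both ends included

print(draw, roll)
```

The '.' in rnd.random() is what "looks inside" the module to find the function named random.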

2.3 Use a method

Before we get any further into the Python language, we have to say a word about "objects". We will not be teaching object-oriented programming in this workshop, but you will encounter objects throughout Python (in fact, even seemingly simple things like ints and strings are actually objects in Python).

In the simplest terms, you can think of an object as a small bundled "thing" that contains within itself both data and functions that operate on that data. For example, strings in Python are objects that contain a set of characters and also various functions that operate on the set of characters. When bundled in an object, these functions are called "methods". The reason to bundle a function up as a method, instead of making it free floating, is that it makes it clear what kind of variable the function can operate on.

Instead of the "normal" function(arguments) syntax, methods are called using the syntax variable.method(arguments). Think of the '.' as short for "look inside".

In [ ]:
# A string is an object
In [ ]:
# Strings have many methods
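
As an illustration of the variable.method(arguments) syntax, the sketch below calls a few string methods. The variable species and its value are assumptions made up for this example.

```python
# A string is an object, so it carries its own methods
species = "Ursus americanus"

shouted = species.upper()                   # a new string in upper case
short = species.replace("Ursus", "U.")      # a new string with text swapped
n_s = species.count("s")                    # how many lowercase "s" characters
is_ursus = species.startswith("Ursus")      # a bool

print(shouted)
print(short)
print(n_s, is_ursus)
```

None of these methods change species itself. Each one gives back a new thing, leaving the original string as it was.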

EXERCISE 1 - Introducing logistic growth

Throughout this lesson, we will successively build towards a program that will calculate the logistic growth of a population of bacteria in a petri dish (or bears in the woods, if you prefer). Logistic growth produces a classic S shaped curve in which a population initially grows very fast, then slows down over time until it reaches a steady state value known as the carrying capacity.

For example, when there are only a few bears in the woods and lots of food, they reproduce very fast. As the woods get full of bears and food gets harder to find, growth slows until the number of bears just balances with the amount of food (and hunters) in the woods, at which point the population stops growing.

A commonly used discrete time equation for logistic population growth is

$$ n_{t+1} = n_{t} + r n_{t} (1 - n_{t} / K) $$

where $n_t$ is the population size at time $t$, $r$ is the maximum net per capita growth rate, and $K$ is the carrying capacity of the dish/woods.
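
To make the equation concrete, here is one time step worked through as a sketch. The parameter values are hypothetical, chosen only for illustration, and are deliberately different from the ones used in the exercise below.

```python
# One step of the discrete logistic equation with hypothetical values
r = 0.5     # maximum net per capita growth rate (illustrative)
K = 50      # carrying capacity (illustrative)
n_t = 20    # population size at time t (illustrative)

# n_{t+1} = n_t + r * n_t * (1 - n_t / K)
n_next = n_t + r * n_t * (1 - n_t / K)

print(n_next)   # about 26: the population grows by 0.5 * 20 * 0.6 = 6
```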

To get started, write Python expressions that do the following:

  1. Create variables for r, K, and n0 (the population at time $t = 0$), setting these equal to 0.3, 100, and 10, respectively.
  2. Create the variable n1 and calculate its value. Do the same for n2.
  3. Check the type of n2 - what is it?

Bonus

  1. Figure out how to test whether n2 is an integer (a mathematical integer, not necessarily whether it is an integer type) (HINT: look at the methods of n2 by typing n2. and pressing tab.)
  2. Modify your calculations for n1 and n2 so that these values are rounded to the nearest integer. When you do so, you might notice that your answer to Bonus #1 above stops working. Why?
In [ ]:
 

3. Collections of things

Once the number of variables that you are interested in starts getting large, working with them all individually starts to get unwieldy. To help stay organized, we can use collections of things.

There are many types of collections in Python, including lists, tuples, dictionaries, and NumPy arrays. Here we'll look just at the most flexible and simplest container, the list. Lists are declared using square brackets []. You can get individual elements out of a list using the syntax a[idx].

In [ ]:
# Lists are created with square bracket syntax
In [ ]:
# Lists (and all collections) are also indexed with square brackets
# NOTE: The first index is zero, not one
In [ ]:
# Lists can be sliced by putting a colon between indexes
# NOTE: The end value is not inclusive
In [ ]:
# You can leave off the start or end if desired
In [ ]:
# Lists are objects, like everything else, and have methods such as append
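
A short sketch of creating, indexing, slicing, and appending to a list. The list counts and its values are assumptions made up for this example.

```python
# Lists are created with square brackets
counts = [10, 13, 17, 22, 28]

# Indexing starts at zero
first = counts[0]
third = counts[2]

# Slices run from the start index up to, but not including, the end index
middle = counts[1:3]
head = counts[:2]      # leaving off the start means "from the beginning"
tail = counts[3:]      # leaving off the end means "through the last item"

# append is a list method that adds one item to the end, in place
counts.append(35)

print(first, third, middle, head, tail)
print(counts, len(counts))
```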

EXERCISE 2 - Storing population size in a list

Copy your code from Exercise 1 into the box below, and do the following:

  1. Modify your code so that the values of n0, n1, and n2 are stored in a list and not as separate individual variables. HINT: You can start off by declaring an empty list using the syntax n = [], and then append each new calculated value of nt to the list.
  2. Get the first and last values in the list, calculate their ratio, and print out "Grew by a factor of " followed by the result.

Bonus

  1. Extract the last value in two different ways: first, by using the index for the last item in the list, and second, presuming that you do not know how long the list is.
In [ ]:
######################################
# This code deletes our old variables
try:
    del n0, n1, n2, r, K
except NameError:
    pass
######################################

4. Repeating yourself

So far, everything that we've done could, in principle, be done by hand calculation. In this section and the next, we really start to take advantage of the power of programming languages to do things for us automatically.

We start here with ways to repeat yourself. The two most common ways of doing this are known as for loops and while loops. For loops in Python are useful when you want to cycle over all of the items in a collection, such as a list, and while loops are useful when you want to cycle for an indefinite amount of time until some condition is met.

In [ ]:
# A basic for loop - pay attention to the white space!
In [ ]:
# Sum all of the values in a collection using a for loop
In [ ]:
# Sometimes we want to loop over the indexes of a collection, not just the items
In [ ]:
# While loops are useful when you don't know how many steps you will need,
# and want to stop once a certain condition is met.
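
The sketch below shows one way the loop cells above could be filled in. The list values, the variable names, and the threshold of 100 are all illustrative assumptions.

```python
values = [3, 1, 4, 1, 5]

# A basic for loop: the indented block runs once per item
total = 0
for v in values:
    total = total + v

# Looping over indexes instead of items, using range and len
doubled = []
for i in range(len(values)):
    doubled.append(values[i] * 2)

# A while loop keeps going until its condition becomes False
x = 1
steps = 0
while x < 100:
    x = x * 2
    steps = steps + 1

print(total, doubled)
print(x, steps)
```

The indentation is what tells Python which lines belong to the loop. The first unindented line after the block marks where the loop ends.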

TIP: Once we start really generating useful and large collections of data, it becomes unwieldy to inspect our results manually. The code below shows how to make a very simple plot. We'll do more plotting later.

In [ ]:
# Jupyter notebook command to load plotting functions
%pylab inline

# Make some data and plot it

EXERCISE 3 - Using loops to repeat calculations

Let's get smart about calculating our population size. Copy your code from Exercise 2 into the box below, and do the following:

  1. Write a for loop to calculate and store the values of nt for 100 time steps. HINT: Use the range function to pick the number of time steps. For each step, append the new population value to a list called n. What indexing strategy can you use to get the last value of the list each time around the for loop?
  2. Plot the population sizes in the list n.
  3. Play around with the values of r and K and see how they change the plot. What happens if you set r to 1.9 or 3, or values in between?

Bonus

  1. Modify your code to use a while loop that will stop your calculation after the population size exceeds 80. HINT: Start a step counter i at 1, and check that the population size is less than 80 each time around the loop. Increment the step counter within the loop so that you have a record of what step the calculation stopped at.
In [ ]:
 
In [ ]:
 

5. Making choices

Often we want to check if a condition is True and take one action if it is, and another action if the condition is False. We can achieve this in Python with an if statement.

TIP: You can use any expression that returns a boolean value (True or False) in an if statement. Common boolean operators are ==, !=, <, <=, >, >=.

In [ ]:
# A simple if statement
In [ ]:
# If statements can also use boolean variables
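
A minimal sketch of if statements, first with a comparison and then with a boolean variable. The names temperature, state, is_raining, and advice are assumptions invented for this example.

```python
# A simple if statement with elif and else branches
temperature = 12
if temperature < 0:
    state = "frozen"
elif temperature < 25:
    state = "mild"
else:
    state = "hot"

# An if statement can test a boolean variable directly
is_raining = False
if is_raining:
    advice = "take an umbrella"
else:
    advice = "leave the umbrella at home"

print(state)
print(advice)
```

Only the first branch whose condition is True runs, so the order of the if and elif tests matters.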

EXERCISE 4 - Adding some randomness to the model

Let's introduce some element of randomness into our population growth model. We'll model a simple "catastrophe" process, in which a catastrophe occurs in 10% of the time steps and reduces the population back down to its initial size. Copy your code from Exercise 3 into the box below, and do the following:

  1. Inside your for loop, add a variable called cata, for catastrophe, that will be True if a catastrophe has occurred, and False if it hasn't. A simple way to do this is to generate a pseudorandom number between 0 and 1 using rnd.random() (this assumes the random module has been imported under the name rnd, for example with import random as rnd). Check whether this number is less than 0.1 - this check will be True 10% of the time.
  2. Using your boolean variable cata, add an if statement to your for loop that checks whether cata is true in each time step. If it is true, set the population back to the initial size. Otherwise, perform the usual logistic growth calculation.
  3. Plot your results. Run the cell again to see a different random population growth path.

Bonus

  1. Count the number of time steps in your list n in which the population was above 50.
In [ ]:
 
In [ ]:
 

6. Creating chunks with functions and modules

One way to write a program is to simply string together commands, like the ones above, in one long file, and then to run that file to generate your results. This may work, but it can be cognitively difficult to follow the logic of programs written in this style. Also, it does not allow you to reuse your code easily - for example, what if we wanted to run our logistic growth model for several different choices of initial parameters?

The most important ways to "chunk" code into more manageable pieces are to create functions and then to gather these functions into your own modules, and eventually packages. Below we will discuss how to create functions and modules. A third common type of "chunk" in Python is classes, but we will not be covering object-oriented programming in this workshop.

In [ ]:
# It's very easy to write your own functions
In [ ]:
# Once a function is "run" and saved in memory, it's available just like any other function
In [ ]:
# It's useful to include docstrings to describe what your function does
In [ ]:
# All arguments must be present, or the function call will raise an error
In [ ]:
# Keyword arguments can be used to make some arguments optional by giving them a default value
# All mandatory arguments must come first, in order
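
As a sketch of the points in the cells above, the two functions below show a docstring, a mandatory argument, and a keyword argument with a default. The function names fahr_to_celsius and scale are hypothetical examples, not part of the original lesson.

```python
def fahr_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def scale(value, factor=2):
    """Return value multiplied by factor (factor defaults to 2)."""
    return value * factor


# Once defined, these are called like any other function
boiling = fahr_to_celsius(212)

# The keyword argument can be left out or given explicitly
doubled = scale(5)
tripled = scale(5, factor=3)

# Leaving out a mandatory argument raises a TypeError
try:
    fahr_to_celsius()
    missing_ok = True
except TypeError as err:
    missing_ok = False
    print("Missing argument:", err)

print(boiling, doubled, tripled)
```

Giving factor a default keeps the common call short while still letting callers override it when they need to.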

EXERCISE 5 - Creating a logistic growth function

Finally, let's turn our logistic growth model into a function that we can use over and over again. Copy your code from Exercise 4 into the box below, and do the following:

  1. Turn your code into a function called logistic_growth that takes four arguments: r, K, n0, and p (the probability of a catastrophe). Make p a keyword argument with a default value of 0.1. Have your function return the n list of population sizes.
  2. In a subsequent cell, call your function with different values of the parameters to make sure it works. Store the returned value of n and make a plot from it.

Bonus

  1. Refactor your function by pulling out the line that actually performs the calculation of the new population given the old population. Make this line another function called grow_one_step that takes in the old population, r, and K, and returns the new population. Have your logistic_growth function use the grow_one_step function to calculate the new population in each time step.
In [ ]: