Chapter 1: Getting started

-- A Python Course for the Humanities by Folgert Karsdorp and Maarten van Gompel


In [ ]:
print("Ready, set, GO!")

Everyone can learn how to program and the best way to learn is by doing. In this tutorial you will be asked to write a lot of code. Click any block of code in this tutorial, such as the one above, and press ctrl+enter to run it. Let's begin right away and write our first little program!


Quiz!

In the code box below, write a simple program that calculates how many minutes there are in seven weeks.

In [ ]:
# insert your code here

Great! You have written your first little program, and you've done it without any help! So, can we now go beyond using our programming language as a simple calculator? Before we ask you to write another program, we first have to explain something about 'assignment'.

We can assign values to variables using the = operator. A variable is just a name we give to a particular value. You can imagine it as a box you put a certain value into, and on which you write a name with a black marker. The following code block contains two operations. First, we assign the value 2 to the name x. After that, x will hold the value 2. You might say Python stored the value 2 in x. Then we print the value using the print() function.

In [ ]:
x = 2
print(x)

Now that we stored the value 2 in x, we can use the variable x to do things like the following:

In [ ]:
print(x * x)
print(x == x)
print(x > 6)

Can you figure out what is happening here?

Variables do not only hold numbers. They can also hold text. Such pieces of text are called strings. For example:

In [ ]:
book = "The Lord of the Flies"
print(book)

A string in Python must be enclosed in quotes (either single or double quotes). Without those quotes Python thinks it's dealing with variables that have been defined earlier. book is a variable to which we assign the string "The Lord of the Flies", but that same string is not a variable but a value!

Variable names can be chosen arbitrarily. We give a certain value a name, and we are free to pick one to our liking. It is, however, recommended to use meaningful names, as we will use the variable names in our code directly and not the values they hold.

In [ ]:
# not recommended...
banana = "The Lord of the Flies"
print(banana)

You are free to use the name banana to hold the title "The Lord of the Flies" but you will agree that this naming is not transparent.

Variables can vary and we can update our variables. Say we have counted how many books we have in our office:

In [ ]:
number_of_books = 100

Then, when we obtain a new book, we can update the number of books accordingly:

In [ ]:
number_of_books = number_of_books + 1
print(number_of_books)

Updates like these happen a lot. Python therefore provides a shortcut and you can write the same thing using +=:

In [ ]:
number_of_books += 5
print(number_of_books)
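
As a small check, the sketch below uses a made-up counter called pages_read (a hypothetical name, not part of the course material) to show that the long form and the += shortcut perform the same kind of update:

```python
# pages_read is a hypothetical variable used only for this illustration
pages_read = 40
pages_read = pages_read + 10   # long form: take the old value, add 10, store the result
print(pages_read)

pages_read += 10               # shortcut: the same kind of update in fewer characters
print(pages_read)
```

Both lines read the current value, add to it, and store the result under the same name.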

For now the final interesting thing we would like to mention about variables is that we can assign the value of one variable to another variable. We will explain more about this later on, but here you just need to understand the basic mechanism. Before you evaluate the following code block, can you predict what Python will print?

In [ ]:
book = "The Lord of the Flies"
reading = book
print(reading)
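
One way to picture this mechanism, sketched here with the hypothetical names favourite and current: the second variable receives the value that the first one holds at that moment. Giving the first variable a new value afterwards leaves the second one untouched.

```python
# favourite and current are hypothetical names chosen for this sketch
favourite = "Emma"
current = favourite          # current now holds the value "Emma"
favourite = "Persuasion"     # favourite gets a new value
print(favourite)
print(current)               # still the value it received earlier
```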

Quiz!

Now that you understand all about assigning values to variables, it is time for our second programming quiz. We want you to write some code that defines a variable, name, and assigns to it a string that is your name. If your first name is shorter than 5 characters, use your last name. If your last name is also shorter than 5 characters, use the combination of your first and last name.

In [ ]:
# insert your code here
print(name)

What we have learnt

To finish this section, here is an overview of the concepts you have learnt. Go through the list and make sure you understand all the concepts.

  • variable
  • value
  • assignment to variables
  • difference between variables and values
  • strings
  • integers
  • varying variables

String manipulation

Many disciplines within the humanities work on texts. Quite naturally, programming for the humanities will focus a lot on manipulating texts. In the last quiz you were asked to define a variable that points to a string that represents your name. We have already seen some basic arithmetic in our very first calculation. Not only numbers can be added together: strings can be added too, or, more precisely, concatenated:

In [ ]:
book = "The Lord of the Flies"
print(name + " likes " + book + "?")

A string consists of a number of characters. We can access the individual characters with the help of indexing. For example, to find only the first letter of your name, you can type in:

In [ ]:
first_letter = name[0]
print(first_letter)

Notice that to access the first letter, we use the index 0. This might seem odd, but just remember that indexes in Python start at zero.


Quiz!

Now, if you know the length of your name you can ask for the last letter of your name:

In [ ]:
last_letter = name[# fill in the last index of your name (tip indexes start at 0)]
print(last_letter)

It is rather inconvenient to have to know how long a string is if we want to find out what its last letter is. Python provides a simple way of accessing a string from the rear:

In [ ]:
last_letter = name[-1]
print(last_letter)

Alternatively, there is the function len() which returns the length of a string:

In [ ]:
print(len(name))

Do you understand the following?

In [ ]:
print(name[len(name)-1])
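
To see how these pieces fit together, here is a minimal sketch on a made-up string (word is a hypothetical variable, used so the example does not depend on your own name). Because indexes start at 0, the last index is always one less than the length:

```python
word = "humanities"          # hypothetical example string
print(len(word))             # the number of characters in the string
print(word[0])               # the first character
print(word[len(word) - 1])   # the last character, counted from the front
print(word[-1])              # the last character, counted from the rear
```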

Quiz!

Now can you write some code that defines a variable but_last_letter and assign to it the second-to-last letter of your name?

In [ ]:
but_last_letter = # insert your code here
print(but_last_letter)

You're starting to become a real expert in indexing strings. Now what if we would like to find out what the last two or three letters of our name are? In Python we can use so-called slice-indexes or slices for short. To find the first two letters of our name we type in:

In [ ]:
first_two_letters = name[0:2]
print(first_two_letters)

The 0 index is optional, so we could just as well type in name[:2]. This says take all characters of name until you reach index 2. We can also start at index 2 and leave the end index unspecified:

In [ ]:
without_first_two_letters = name[2:]

Because we did not specify the end index, Python continues until it reaches the end of our string. If we would like to find out what the last two letters of our name are, we can type in:

In [ ]:
last_two_letters = name[-2:]
print(last_two_letters)
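
The three slice forms can be compared side by side. The sketch below uses a hypothetical string word rather than your name. Note that the character at the end index itself is not included in the slice, which is why the two halves fit together again without overlap:

```python
word = "Python"     # hypothetical example string with 6 characters
start = word[:2]    # the characters at index 0 and 1
rest = word[2:]     # everything from index 2 onwards
tail = word[-2:]    # the last two characters
print(start, rest, tail)
print(start + rest == word)   # the two halves make up the whole string again
```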

Take a look at the following picture. Do you fully understand it?


Quiz!

Can you define a variable middle_letters and assign to it all letters of your name except for the first two and the last two?

In [ ]:
middle_letters = # insert your code here
print(middle_letters)

Given the following two words, can you write code that prints out the word humanities using only slicing and concatenation? (So, no quotes are allowed in your code.)

In [ ]:
word1 = "human"
word2 = "opportunities"
# insert your code here

What we have learnt

To finish this section, here is an overview of what we have learnt. Go through the list and make sure you understand all the concepts.

  • concatenation (i.e. addition of strings)
  • indexing
  • slicing
  • len()

Lists

Consider the sentence below:

In [ ]:
sentence = "Python's name is derived from the television series Monty Python's Flying Circus."

Words are made up of characters, and so are string objects in Python. As we will see, it is always preferable to represent our data as naturally as possible. Now for the sentence above, it seems more natural to describe it in terms of words than in terms of characters. Say we want to access the first word in our sentence. If we type in:

In [ ]:
first_word = sentence[0]
print(first_word)

Python only prints the first letter of our sentence. (Think about this if you do not understand why.) We can transform our sentence into a list of words (represented by strings) using the split() method as follows:

In [ ]:
words = sentence.split()
print(words)

By calling the method split on our sentence, Python splits the sentence on spaces and returns a list of words. In many ways a list functions like a string. We can access all of its components using indexes and we can use slice indexes to access parts of the list. Let's try it!
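
As a warm-up, here is a minimal sketch on a different, made-up sentence (the names line and tokens are hypothetical and not part of the exercise). It also shows that split() only cuts at whitespace, so punctuation stays attached to the neighbouring word:

```python
line = "To be, or not to be."   # hypothetical example sentence
tokens = line.split()           # cut the string wherever there is whitespace
print(tokens)
print(len(tokens))              # the number of items in the list
print(tokens[-1])               # the last item; note the full stop
print(tokens[2:4])              # a slice of a list is again a list
```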


Quiz!

Write a small program that defines a variable first_word and assign to it the first word of our word list. Play around a little with the indexes to see if you really understand how it works.

In [ ]:
first_word = # insert your code here
print(first_word)

A list acts like a container where we can store all kinds of information. We can access a list using indexes and slices. We can also add new items to a list. For that you use the method append. Let's see how it works. Say we want to keep a list of all our good reads. We start with an empty list and we will add some good books to it:

In [ ]:
#start with an empty list
good_reads = []
good_reads.append("The Hunger games")
good_reads.append("A Clockwork Orange")
print(good_reads)

Now, if for some reason we don't like a particular book anymore, we can change it as follows:

In [ ]:
good_reads[0] = "Pride and Prejudice"
print(good_reads)

Quiz!

Here's another small Quiz! Try to change the title of the second book in our good reads collection.

In [ ]:
# insert your code here
print(good_reads)

We just changed one element in a list. Note that if you do the same thing for a string, you will get an error:

In [ ]:
name = "Pythen"
name[4] = "o"

This is because strings (and some other types) are immutable. That is, they cannot be changed, as opposed to lists which are mutable. Let's explore some other ways in which we can manipulate lists.
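
Since a string cannot be changed in place, one common workaround is to build a new string from slices of the old one. A minimal sketch (corrected is a hypothetical variable name):

```python
name = "Pythen"
# Build a new string: everything before index 4, the new letter, everything after index 4
corrected = name[:4] + "o" + name[5:]
print(corrected)
print(name)   # the original string is unchanged
```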

remove()

Let's assume our good reads collection has grown a lot and we would like to remove some of the books from the list. Python provides the method remove that acts upon a list and takes as its argument the item we would like to remove.

In [ ]:
good_reads = ["The Hunger games", "A Clockwork Orange", 
              "Pride and Prejudice", "Water for Elephants",
              "The Shadow of the Wind", "Bel Canto"]

good_reads.remove("Water for Elephants")

print(good_reads)

If we try to remove a book that is not in our collection, Python raises an error, in this case a ValueError (don't be afraid, your computer won't break ;-)):

In [ ]:
good_reads.remove("White Oleander")

Quiz!

Define a variable good_reads as an empty list. Now add some of your favorite books to it (at least three) and print the last two books you added.

In [ ]:
# insert your code here

Just as with strings, we can concatenate two lists. Here is an example:

In [ ]:
#first we specify two lists of strings:
good_reads = ["The Hunger games", "A Clockwork Orange", 
              "Pride and Prejudice", "Water for Elephants",
              "The Shadow of the Wind", "Bel Canto"]

bad_reads = ["Fifty Shades of Grey", "Twilight"]

all_reads = good_reads + bad_reads
print(all_reads)

sort()

It is always nice to organise your bookshelf. We can sort our collection with the following expression:

In [ ]:
good_reads.sort()
print(good_reads)
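
One detail that often surprises newcomers: sort() reorders the list itself and does not hand back a new list. The sketch below, using a hypothetical list called shelf, makes this visible by storing whatever sort() returns:

```python
shelf = ["Twilight", "Bel Canto", "A Clockwork Orange"]   # hypothetical list
result = shelf.sort()   # sorts the list itself, in alphabetical order
print(shelf)
print(result)           # sort() returns None, not a sorted copy
```

So writing shelf = shelf.sort() would replace the list with None, which is rarely what you want.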

nested lists

Up to this point, our lists have only consisted of strings. However, a list can contain all kinds of data types, such as integers and even lists! Do you understand what is happening in the following example?

In [ ]:
nested_list = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(nested_list[0])
print(nested_list[0][0])
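
If the double index looks puzzling, it can help to take it one step at a time. The sketch below uses a list of the same shape under the hypothetical name grid: the first index picks one of the inner lists, and the second index picks an item inside that inner list.

```python
grid = [[1, 2, 3, 4], [5, 6, 7, 8]]   # hypothetical name, same shape as the example above
second_row = grid[1]      # the first index picks an inner list
print(second_row)
print(second_row[2])      # the second index picks an item inside it
print(grid[1][2])         # both steps written at once
print(len(grid), len(grid[0]))   # two inner lists, each with four items
```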

We can put this to use to enhance our good reads collection with a score for every book we have. An entry in our collection will consist of the title of our book and a score between 1 and 10. The first element is the title; the second the score: [title, score]. We initialize an empty list:

In [ ]:
good_reads = []

And add two books to it:

In [ ]:
good_reads.append(["Pride and Prejudice", 8])
good_reads.append(["A Clockwork Orange", 9])

Quiz!

Update the good_reads collection with some of your own books and give them all a score. Can you print out the score you gave to the first book in the list? (Tip: you can pile up indexes)

In [ ]:
# insert your code here

What we have learnt

To finish this section, here is an overview of the new concepts and functions you have learnt. Go through them and make sure you understand them all.

  • list
  • mutable versus immutable
  • .split()
  • .append()
  • nested lists
  • .remove()
  • .sort()

Dictionaries

Our little good reads collection is starting to look good and we can perform all kinds of manipulations on it. Now, imagine that our list is large and we would like to look up the score we gave to a particular book. How are we going to find that book? For this purpose Python provides another more appropriate data structure, named dictionary. A dictionary is similar to the dictionaries you have at home. It consists of entries, or keys, that hold a value. Let's define one:

In [ ]:
my_dict = {"book": "physical objects consisting of a number of pages bound together",
           "sword": "a cutting or thrusting weapon that has a long metal blade",
           "pie": "dish baked in pastry-lined pan often with a pastry top"}

Take a close look at the new syntax. Notice the curly brackets and the colons. Keys are located at the left side of the colon; values at the right side. To look up the value of a given key, we 'index' the dictionary using that key:

In [ ]:
description = my_dict["sword"]
print(description)

We say 'index', because we use the same syntax with square brackets when indexing lists or strings. The difference is that we don't use a position number to index a dictionary, but a key. Like lists, dictionaries are mutable, which means we can add entries to them and remove entries from them. Let's define an empty dictionary and add some books to it. The titles will be our keys and the scores their values. Watch the syntax to add a new entry:

In [ ]:
good_reads = {}
good_reads["Pride and Prejudice"] = 8
good_reads["A Clockwork Orange"] = 9

In a way this is similar to what we have seen before when we altered our book list. There we indexed the list using an integer to access a particular book. Here we directly use the title of the book. Can you imagine why this is so useful?
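
One detail worth knowing when adding entries: a key can occur only once in a dictionary. Assigning to a key that already exists replaces its value instead of creating a second entry. A minimal sketch with a hypothetical dictionary scores:

```python
scores = {}                    # hypothetical dictionary mapping title to score
scores["Emma"] = 7
scores["Dracula"] = 9
scores["Emma"] = 8             # same key again: the old value is replaced
print(scores["Emma"])
print(len(scores))             # still two entries
```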


Quiz!

Update the new good reads data structure with your own books. Try to print out the score you gave for one of the books.

In [ ]:
# insert your code here

keys(), values()

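To retrieve all the books we have in our collection, we can ask the dictionary to return its keys. In Python 3 the result is a view object rather than a true list, but you can loop over it or turn it into a list with list():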

In [ ]:
good_reads.keys()

Similarly we can ask for the values:

In [ ]:
good_reads.values()
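
If you need real lists, for example to index or slice them, you can wrap the results in list(). The sketch below assumes a small hypothetical dictionary scores. In Python 3.7 and later, the entries come back in the order in which they were added:

```python
scores = {"Emma": 7, "Dracula": 9}   # hypothetical dictionary
titles = list(scores.keys())         # turn the keys into a list
marks = list(scores.values())        # turn the values into a list
print(titles)
print(marks)
print(titles[0], marks[0])           # lists can be indexed by position
```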

What we have learnt

To finish this section, here is an overview of the new concepts and functions you have learnt. Make sure you understand them all.

  • dictionary
  • indexing or accessing keys of dictionaries
  • adding items to a dictionary
  • .keys()
  • .values()

Conditions

Simple conditions

A lot of programming has to do with executing a certain piece of code if a particular condition holds. We have already seen two conditions at the very beginning of the chapter. Here we give a brief overview. Can you figure out what all of the conditions do?

In [ ]:
print("2 < 5 =", 2 < 5)
print("3 >= 7 =", 3 >= 7)
print("3 == 4 =", 3 == 4)
print("school == homework =", "school" == "homework")
print("Python != perl =", "Python" != "perl")

if, elif and else

The dictionary is a much better data structure for our good reads collection. However, even with dictionaries we might forget which books we added to the collection. What happens if we try to get the score of a book that is not in our collection (and hopefully never will be...)?

In [ ]:
good_reads["Folgert's awesomeness"]

We get an error: a KeyError, which basically means "the key you asked me to look up is not in the dictionary". We will learn a lot more about error handling later, but for now we would like to prevent our program from raising it in the first place. Let's write a little program that prints "X is in the collection" if a particular book is in the collection and "X is NOT in the collection" if it is not.

In [ ]:
book = "White Oleander"
if book in good_reads:
    print(book + " is in the collection")
    print("A lot more")
    print("Still more to come")
else:
    print(book + " is NOT in the collection")

A lot of new syntax here. Let's go through it step by step. First we ask if the value we assigned to book is in our collection. The part after if evaluates to either True or to False. Let's type that in:

In [ ]:
book in good_reads

Because our book is not in the collection, Python returns False. Let's do the same thing for a book that we know is in the collection:

In [ ]:
"A Clockwork Orange" in good_reads

Indeed, it is in the collection. Back to our if statement. If the expression after if evaluates to True, our program will go on to the next line and print book + " is in the collection". Let's try that as well:

In [ ]:
if "A Clockwork Orange" in good_reads:
    print("Found it!")

In [ ]:
if book in good_reads:
    print("Found it!")

Notice that the print statement in the last code block is not executed. That is because the value we assigned to book is not in our collection and thus the part after if did not evaluate to True. In our little program above we used another statement besides if, namely else. It shouldn't be too hard to figure out what's going on here. The part after else will be executed if the condition after if evaluated to False. In English: if the book is not in the collection, print that it is not.
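
The same pattern can be used to guard a dictionary lookup, so that a missing key never triggers a KeyError. A minimal sketch, assuming a hypothetical dictionary scores and a hypothetical variable found that records which branch ran:

```python
scores = {"Emma": 7, "Dracula": 9}   # hypothetical dictionary
title = "Ulysses"
if title in scores:
    # this lookup is safe, because we only get here if the key exists
    print(title, "has score", scores[title])
    found = True
else:
    print(title, "is NOT in the collection")
    found = False
```

Try changing title to "Emma" to see the other branch run.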

Indentation!

Before we continue, we must first explain that the layout of our code is not optional. Unlike many other languages, Python does not use curly braces to mark the start and end of blocks of code. The only delimiters are a colon (:) and the indentation of the code. This indentation must be used consistently throughout your code. The convention is to use 4 spaces per level of indentation. This means that after you have used a colon (such as in our if statement), the next line should be indented by four spaces more than the previous line.
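
To make this concrete, the sketch below records which lines actually run. The names steps and pages are hypothetical. The two indented lines belong to the if statement and are skipped when the condition is False. The last line is not indented, so it runs in any case:

```python
steps = []                            # hypothetical list recording which lines ran
pages = 120
if pages > 300:
    steps.append("long book")         # indented: part of the if block
    steps.append("needs a weekend")   # still part of the if block
steps.append("done")                  # not indented: runs whatever the condition says
print(steps)
```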

Sometimes we have various conditions that should each lead to something different. For that Python provides the elif statement. We use it similarly to if and else. Note however that you can only use elif after an if statement! Above we asked whether a book was in the collection. We can do the same thing for parts of strings or for items in a list. For example we could test whether the letter a is in the word banana:

In [ ]:
"a" in "banana"

Likewise the following evaluates to False:

In [ ]:
"z" in "banana"

Let's use this in an if-elif-else combination:

In [ ]:
word = "rocket science"
if "a" in word:
    print(word + " contains the letter a")
elif "s" in word:
    print(word + " contains the letter s")
else:
    print("What a weird word!")
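
It is worth noting that Python checks the branches from top to bottom and only runs the first one whose condition holds. The sketch below uses a word that contains both letters (result is a hypothetical variable); the elif branch is never reached, even though its condition would be True as well:

```python
word = "bananas"   # contains both the letter a and the letter s
if "a" in word:
    result = "contains the letter a"
elif "s" in word:
    result = "contains the letter s"   # skipped: an earlier branch already matched
else:
    result = "contains neither letter"
print(result)
```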

Quiz!

Let's practice our new condition skills a little. Write a small program that defines a variable weight. If the weight is > 50 pounds, print "There is a $25 charge for luggage that heavy." If it is not, print: "Thank you for your business." Change the value of weight to see both statements. (Tip: make use of the < or > operators)

In [ ]:
# insert your code here

and, or, not

Up to this point, our conditions have consisted of single expressions. However, quite often we would like to test for multiple conditions and then execute a particular piece of code. Python provides a number of ways to do that. The first is with the and operator. and allows us to combine two expressions that both need to be true in order to make the entire expression evaluate to True. Let's see how that works:

In [ ]:
word = "banana"
if "a" in word and "b" in word:
    print("Both a and b are in " + word)

If one of the expressions evaluates to False, nothing will be printed:

In [ ]:
if "a" in word and "z" in word:
    print("Both a and z are in " + word)

Quiz!

Replace and with or in the if statement below. What happens?

In [ ]:
word = "banana"
if "a" in word and "z" in word:
    print("Both a and z are in " + word)

In the code block below, can you add an else statement that prints that at least one of the letters was not found?

In [ ]:
if "a" in word and "z" in word:
    print("Both a and z are in " + word)
# insert your code here

Finally we can use not to test for conditions that are not true.

In [ ]:
if "z" not in word:
    print("z is not in " + word)

Non-empty objects, such as strings or lists with something in them, count as True in a condition because they exist. Empty strings, lists, dictionaries etc., on the other hand, count as False because in a way they do not exist (the same goes for the number 0). We can use this principle to, for example, only execute a piece of code if a certain list contains any values:

In [ ]:
numbers = [1, 2, 3, 4]
if numbers:
    print("I found some numbers!")

Now if our list were empty, Python wouldn't print anything:

In [ ]:
numbers = []
if numbers:
    print("I found some numbers!")
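
If you want to check how Python judges a particular value, the built-in function bool() (not otherwise covered in this chapter) converts any value to True or False. A minimal sketch with hypothetical variable names:

```python
# bool() reports how Python treats a value when it is used as a condition
non_empty_list = bool([1, 2, 3])
empty_list = bool([])
non_empty_string = bool("text")
empty_string = bool("")
empty_dict = bool({})
print(non_empty_list, non_empty_string)        # both True
print(empty_list, empty_string, empty_dict)    # all False
```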

Quiz!

Can you write code that prints "This is an empty list" if the provided list does not contain any values?

In [ ]:
numbers = []
# insert your code here

Can you do the same thing, but this time using the function len()?

In [ ]:
numbers = []
# insert your code here

What we have learnt

To finish this section, here is an overview of the new functions, statements and concepts we have learnt. Go through them and make sure you understand what their purpose is and how they are used.

  • conditions
  • indentation
  • if
  • elif
  • else
  • True
  • False
  • empty objects are false
  • not
  • in
  • and
  • or
  • multiple conditions
  • ==
  • <
  • >
  • !=
  • KeyError

Loops

Programming is most useful if we can perform a certain action on a range of different elements. For example, given a list of words, we would like to know the length of all words, not just one. Now you could do this by going through all the indexes of a list of words and print the length of the words one at a time, taking up as many lines of code as you have indices. Needless to say, this is rather cumbersome.

Python provides the so-called for-statements that allow us to iterate through any iterable object and perform actions on its elements. The basic format of a for-statement is:

for X in iterable:

That reads almost like English. We can print all letters of the word banana as follows:

In [ ]:
for letter in "banana":
    print(letter)

The code in the loop is executed as many times as there are letters, with a different value for the variable letter at each iteration. Read the previous sentence again.
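
One way to convince yourself of this is to collect the value of the loop variable at every iteration. The sketch below stores each letter in a hypothetical list called letters and then counts how many it gathered:

```python
letters = []                # hypothetical list that collects the loop variable
for letter in "banana":
    letters.append(letter)  # runs once for every character in the string
print(letters)
print(len(letters))         # six iterations for six characters
```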

Likewise we can print all the items that are contained in a list:

In [ ]:
colors = ["yellow", "red", "green", "blue", "purple"]
for whatever in colors:
    print("This is color " + whatever)

Since dictionaries are iterable objects too, we can iterate through our good reads collection as well. This will iterate over the keys of a dictionary:

In [ ]:
for book in good_reads:
    print(book)

We can also iterate over both the keys and the values of a dictionary. This is done as follows:

In [ ]:
good_reads.items()

In [ ]:
for book, score in good_reads.items():
    print(book + " has score " + str(score))

Using items() will, at each iteration, return a nice pair of the key and the value. In the example above the variable book will loop over the keys of the dictionary, and the variable score loops over the respective values.

The above way is the most elegant way of looping over dictionaries, but try to see if you understand the following alternative as well:

In [ ]:
for book in good_reads:
    print(book, "has score", good_reads[book])
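
Loops become especially useful when you carry a result along from one iteration to the next. The sketch below, which assumes a hypothetical dictionary scores and a hypothetical variable total, adds up all scores by starting at zero and updating the total with += inside the loop:

```python
scores = {"Emma": 7, "Dracula": 9, "Ulysses": 8}   # hypothetical dictionary
total = 0                       # start value, set before the loop begins
for title, score in scores.items():
    total += score              # update the running total at each iteration
print(total)
```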

Quiz!

The function len() returns the length of an iterable item:

In [ ]:
len("banana")

We can use this function to print the length of each word in the color list. Write your code in the box below:

In [ ]:
colors = ["yellow", "red", "green", "blue", "purple"]
# insert your code here

Now write a small program that iterates through the list colors and appends all colors that contain the letter r to the list colors_with_r. (Tip: use colors_with_r.append)

In [ ]:
colors = ["yellow", "red", "green", "blue", "purple"]
colors_with_r = []
# insert your code here

What we have learnt

Here is an overview of the new concepts, statements and functions we have learnt in this section. Again, go through the list and make sure you understand them all.

  • loop
  • for statement
  • iterable objects
  • variable assignment in a for loop

Final Quiz!

We have covered a lot of ground. Now it is time to put everything we have learned together. The following quiz might be quite hard and we would be very impressed if you get it right!

What we want you to do is write code that counts the number of word tokens in which the letter a is present in a small corpus. You need to do this on the basis of a frequency distribution of words that is represented by a dictionary. In this dictionary frequency_distribution, keys are words and values are the frequencies. Assign your value to the variable number_of_as.

In [ ]:
frequency_distribution = {"Beg": 1, "Goddard's": 1, "I": 3, "them": 2, "absent": 1, "already": 1,
                          "alteration": 1, "amazement": 2, "appeared": 1, "apprehensively": 1, 
                          "associations": 1, 'clever': 1, 'clock': 1, 'composedly': 1, 
                          'deeply': 7, 'do': 7, 'encouragement': 1, 'entrapped': 1,
                          'expressed': 1, 'flatterers': 1, 'following': 12, 'gone': 9, 
                          'happening': 4, 'hero': 2, 'housekeeper': 1, 'ingratitude': 1, 
                          'like': 1, 'marriage': 15, 'not': 25, 'opportunities': 1,
                          'outgrown': 1, 'playfully': 2, 'remain': 1, 'required': 2, 
                          'ripening': 1, 'slippery': 1, 'touch': 1, 'twenty-five': 1,
                          'ungracious': 2, 'unwell': 1, 'verses': 1, 'yards': 5}
number_of_as = 0
# insert your code here

You've reached the end of the chapter.