Python basics: Part 1

Before we can start working with data, we need to work out some of the basics of Python. The goal is to learn enough so that we can do some interesting data work; we do not need to be Python Jedi.

In this notebook we will cover

  1. Assignment
  2. Calculation
  3. Help
  4. Strings

Remember: Ask questions as we go.

Assignment

Run the following code block (ctrl+enter). What does the code do?

In [42]:
# Assign the value 10 to the variable x
x = 10

# Print the value of x to the screen
print(x)
10

The print() function is a built-in Python function. More on it later.

When writing in a code block, we use the pound sign (#) to create a comment. The Python interpreter ignores what follows the #. You can put a comment on the same line as some code. (So we use # to make comments in code blocks and we use # to make headers in markdown blocks.)
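
As a small illustration of a same-line comment, here is a sketch with made-up variable names (price, quantity, and total are not used elsewhere in this notebook):

```python
# A comment on its own line describes the code that follows
price = 20          # A comment can also follow code on the same line
quantity = 3        # Python ignores everything after the pound sign

total = price * quantity
print(total)        # Prints 60
```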

Key concept: In the code cell above, you assigned the value of 10 to the variable named x. When we asked the Python interpreter to print x, it printed the value assigned to x.

In [2]:
# Variable names should typically be self-documenting
my_age = 40   # I am old!
print(my_age)
40

I used the underscore symbol in my variable name to make it more readable. Variable names must start with a letter or an underscore. Names can include letters, numbers, and underscores. Variable names are case sensitive. Before running the code below, what do you think the output will look like?

In [3]:
myAge = 25     # My auto insurance is much cheaper
myage = 35     # I can run for president
MYAGE = 50     # I can join AARP
my_age_2 = 16 

print(myAge)
print(myage, MYAGE)
print(my_age, my_age_2, myAge)
25
35 50
40 16 25

We can print multiple variables with a single call to print(). Notice that the my_age variable from the earlier code block is available to me in later code blocks. What happens when you run the following code?

In [4]:
2_my_age = 5 
  File "<ipython-input-4-8140e86d2413>", line 1
    2_my_age = 5
     ^
SyntaxError: invalid token
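
The name 2_my_age fails because it starts with a number. If you are unsure whether a name is allowed, one option is the string method isidentifier(), which checks a name against the naming rules without raising an error. A minimal sketch (good_name and bad_name are names invented for this example):

```python
# isidentifier() returns True when the text follows Python's naming rules
good_name = 'my_age'.isidentifier()
bad_name = '2_my_age'.isidentifier()

print(good_name)    # True
print(bad_name)     # False
```

Reserved words such as class pass this check, but they still cannot be used as variable names.
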
In [5]:
# Assign the value in my_age_2 to another variable
my_age_3 = my_age_2
print(my_age_2, my_age_3)

my_age_2 = 17
print(my_age_2, my_age_3)
16 16
17 16

Calculation

Computers are very good at calculating...

In [8]:
# Multiplication
z = 2*3
zz = 2 * 3                      # Does whitespace matter?
zzz = 2     *     3     

print('This is z, zz, and zzz.')
print(z, zz, zzz)

z_mult = z * zz
print('What is z times zz?')
print(z_mult)
This is z, zz, and zzz.
6 6 6
What is z times zz?
36

Notice that we can use the print() function to directly print messages to the screen. We put the text to be printed inside of single quotation marks. When we do that, we are creating a string. More on that soon.
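
Text and values can also go into the same call to print(). The sketch below reuses z from the cell above. The sep argument is an optional part of print() that has not been introduced yet.

```python
z = 2 * 3

# print() accepts several items and puts a space between them
print('The value of z is', z)

# The optional sep argument changes what goes between the items
print('z', z, sep=' = ')
```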

In [9]:
# Division
b = 10/2
bb = 20/5
bbb = 6/4
print(b, bb, bbb)
5.0 4.0 1.5

When we used division, the answers were displayed with decimal points, but when we multiplied they were not. Python is creating different types of variables.

Key concept: Every object in Python has a type. There are several types (integer, float, string, list...) which we will learn about as we proceed. We have been dealing mostly with integers so far. Dividing two integers (10 and 2, for example) creates a float (5.0). You can learn about a variable's type using the type() function.

In [10]:
print(b)
print(type(b))

print(z)
print(type(z))
5.0
<class 'float'>
6
<class 'int'>

Types are important. Different types of objects can do different things. We will see this often in class.
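
To make the idea concrete, here is a short sketch with three values that look alike but have different types. The names whole, ratio, and label are made up for this example.

```python
whole = 7          # an integer
ratio = 7 / 2      # a float, 3.5
label = '7'        # a string

print(type(whole), type(ratio), type(label))

# Adding an integer and a float gives a float
mixed = whole + ratio
print(mixed, type(mixed))

# int() and float() convert between numeric types; int() drops the decimal part
print(int(ratio), float(whole))
```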

In [11]:
# Exponents (note that it is not ^)
a = 2**3
print(a)
8
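
The caret does not raise an error, which makes the mistake easy to miss. In Python, ^ is the bitwise exclusive-or operator. A quick comparison (power and xor are names made up for this example):

```python
power = 2 ** 3     # exponentiation: 2 * 2 * 2
xor = 2 ^ 3        # bitwise exclusive-or, not a power

print(power)       # 8
print(xor)         # 1
```
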
In [12]:
# We will often use the natural logarithm of a variable in our analysis (why?). 
d = 10
log_d = log(d)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-12-ecfc5e33b2dd> in <module>
      1 # We will often use the natural logarithm of a variable in our analysis (why?).
      2 d = 10
----> 3 log_d = log(d)

NameError: name 'log' is not defined

What just happened? While Python has many built-in functions, the natural logarithm is not one of them. We need to add a package that includes the logarithm.

Key concept: An important (and powerful) feature of Python is that we can add to 'vanilla' Python by importing packages. The numpy (numerical Python) package is chock full of numerical functions. To use the functions in a package, we first import it.

We normally import all of the packages we are going to use at the beginning of the file. For today, we can just import it here.

In [13]:
# Import the numpy package and give it the shorter name np.
import numpy as np

# To use a function from the numpy package, we use the 'dot' syntax
log_d = np.log(d)
print(log_d)

# The inverse of the natural log is the exponential function
should_be_d = np.exp(log_d)
print(should_be_d)
2.302585092994046
10.000000000000002

We will dig into packages and the 'dot' syntax later.
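
numpy is not the only place to find a logarithm. As a sketch, the math module that ships with Python offers log and exp for single numbers. The example also hints at why should_be_d printed as 10.000000000000002: floating-point arithmetic is approximate, so taking the log and then the exponential may not return exactly 10. The variable names are reused from the cell above for illustration.

```python
import math

d = 10
log_d = math.log(d)              # natural logarithm
should_be_d = math.exp(log_d)    # the exponential undoes the log, up to rounding

print(log_d)
print(should_be_d)

# Compare floats with a tolerance rather than with ==
print(math.isclose(should_be_d, d))   # True
```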

Practice: Calculation

Take a few minutes and try the following. Feel free to chat with those around you if you get stuck. I am here, too.

  1. Suppose you lend $300 for one year at 5 percent interest. What is the repayment amount? Create the variable principal and set it to 300 and the variable i and set it equal to 0.05. Create a variable named payoff to hold the payoff amount. Print the value of the payoff.
In [14]:
principal = 300
i = 0.05
payoff = principal * (1+i)
print(payoff)
315.0
  2. In a code cell, enter
    r = 5
    r = r+1
    
    and run the cell. What is the value of r?
In [16]:
r = 5
r = r+1
print(r)
6
  3. In the code cell below, enter
    r=r+1
    print(r)
    
    and run the code. What happened?
In [22]:
r = r+1
print(r)
12

Rerun the previous cell (ctrl+enter). What is the value of r? Rerun the cell again. And again. What is happening?

  4. In a code cell, set m=2 and n=3. Write some code that swaps the values of m and n.
In [24]:
# Initialize the variables
m=2
n=3
print(n, m)

# Now swap
temp = m
m = n
n = temp
print(n,m)

# There are easier ways to swap variables. We will see this later...
3 2
2 3
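
One of the easier ways mentioned in the comment above is to assign both names in a single statement. A minimal sketch, using the same m and n:

```python
m = 2
n = 3

# The right-hand side is evaluated first, then both names are assigned
m, n = n, m
print(m, n)   # 3 2
```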

Help, or object introspection

The Jupyter Notebook has an easy way to get help about an object. In a code cell below, enter print? to learn about the print function.

In [25]:
print?

Now try r?

In [26]:
r?

In general, we can use the ? to learn about any object in our programs.
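
The ? shortcut is a feature of Jupyter and IPython. In a plain Python session, the built-in help() function and an object's __doc__ attribute give similar information. A short sketch (doc_text is a name made up for this example):

```python
# help() prints the documentation for an object
help(len)

# The same kind of text is stored as a string in the __doc__ attribute
doc_text = print.__doc__
print(doc_text)
```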

Strings

Strings are sequences of characters. Python is very good at manipulating strings, whereas languages like MATLAB and Stata are less so. (More on this in the future.)

In Python, we put strings into quotation marks to assign them to variables. Either single or double quotes will do. These are all legitimate strings:

name = 'Kim Ruhl'
address = "7444 Soc Sci"
zip_code = '53705'

Notice that zip_code looks like an integer. It is not! Enter these three strings in a code cell. Then try z = zip_code/3.

In [27]:
name = 'Kim Ruhl'
address = "7444 Soc Sci"
zip_code = '53705'
In [29]:
# Why would I want to divide a zip code by 3? I have no idea!

z = zip_code/3    
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-f540f1f0f3dc> in <module>
      1 # Why would I want to divide a zip code by 3? I have no idea!
      2 
----> 3 z = zip_code/3

TypeError: unsupported operand type(s) for /: 'str' and 'int'
In [30]:
# What type of object is zip_code?

print(type(zip_code))
print(zip_code)
<class 'str'>
53705

We asked Python to divide a string by an integer. It does not know how to do that.
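
If we really did want to do arithmetic with the digits, we could convert the string first. The sketch below uses the built-in int() function. The names zip_number and boston_zip are made up for this example.

```python
zip_code = '53705'

# int() builds an integer from a string made of digits
zip_number = int(zip_code)
print(type(zip_number))     # <class 'int'>

# Now division works (even if the result means nothing)
print(zip_number / 3)

# Converting drops leading zeros, one reason zip codes are kept as strings
boston_zip = int('02134')
print(boston_zip)           # 2134
```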

Python does know how to do some "math" with strings. Open a code cell and try this:

first = 'Bucky'
last = 'Badger'
name = first + last
print(name)
In [31]:
first = 'Bucky'
last = 'Badger'
name = first + last
print(name)
BuckyBadger

What does first*2 do? What about 2*first?

In [34]:
print(first*2)
print(2*first)
BuckyBucky
BuckyBucky
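
A few related operations, sketched under the assumption that first is still 'Bucky'. The len() and str() functions are built in. The name label is made up for this example.

```python
first = 'Bucky'

# len() counts the characters in a string
print(len(first))            # 5

# Repetition is handy for simple separator lines
print('-' * 20)

# A string and a number cannot be added directly; convert with str() first
label = first + str(1)
print(label)                 # Bucky1
```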

Quotation marks

Single and double quotes do the same thing. The statements first_name = 'Kim' and first_name = "Kim" create identical strings.

We need double quotation marks when the string contains a single quote. The statement opinion = "I don't like hot weather" works because the Python interpreter knows the string is contained in the double quotes, so it treats the single quote as a character. In a code cell try opinion = "I don't like hot weather" and opinion_2 = 'I don't like hot weather'

In [35]:
opinion = "I don't like hot weather"
print(opinion)
I don't like hot weather
In [37]:
opinion_2 = 'I don't like hot weather'
print(opinion_2)
  File "<ipython-input-37-0a4dfffcf0f9>", line 1
    opinion_2 = 'I don't like hot weather'
                       ^
SyntaxError: invalid syntax
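
Another way around the error is to escape the inner quote with a backslash, which tells Python to treat it as an ordinary character. A short sketch (reply is a name made up for this example):

```python
# Double quotes outside, single quote inside
opinion = "I don't like hot weather"

# Single quotes outside: escape the inner quote with a backslash
opinion_2 = 'I don\'t like hot weather'

print(opinion == opinion_2)   # True

# Single quotes outside make it easy to include double quotes
reply = 'She said "me neither"'
print(reply)
```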

Lastly, we can use triple quotes (made up of either single or double quote characters) to create strings that break over several lines.

second_coming = """
Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
"""
In [38]:
second_coming = """
Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
"""
print(second_coming)
Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart; the centre cannot hold;
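
The line breaks are stored inside the string as newline characters. The sketch below uses a shorter string (verse, a name made up for this example) and the built-in repr() function to make them visible.

```python
verse = """
Turning and turning in the widening gyre
The falcon cannot hear the falconer;
"""

# repr() shows each line break as \n
print(repr(verse))

# Three line breaks: one after the opening quotes and one at the end of each line
print(verse.count('\n'))      # 3

# strip() removes leading and trailing whitespace, including line breaks
print(verse.strip())
```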

A second use of triple quotes is to create long comments. We have been using # to create comments in our code, but we can also use triple quotes. You will often see a triple-quoted string at the beginning of a program.

"""
This program computes average GDP growth in the United States.
Kim J. Ruhl
August 14, 2018
"""

This bit of code will not do anything. It is only there for humans to read.

Practice: Strings

Take a few minutes and try the following. Feel free to chat with those around you if you get stuck. The TA and I are here, too.

  1. In which of the following is x a string? Edit this markdown cell and type 'string' or 'not string' next to each example.

    1. x = '10' String
    2. x = 10 Not a string
    3. x = "Hello World" String
    4. x = 'Lake Mendota' String
    5. x = 3.5 Not a string
    6. x = '[email protected]' String
  2. Fix this expression:

    whose_car = 'Jane's'
  3. In our first and last name example, our output was BuckyBadger. Go back and fix your code so that name has a space between the first and last names. When you print name, it should look like: Bucky Badger.

In [43]:
whose_car = "Jane's"
print(whose_car)
Jane's
In [44]:
name = first + ' ' + last
print(name)
Bucky Badger