Testing

Testing can be done at different levels, depending on how the code will be used.

Single Use Code: Sometimes we write code for a one-time data analysis. In this case we know exactly what is going in and what we expect to get out. We do sanity checks by running the code on an input whose expected output we already know, and that is about it.

Repetitive Use Code: When you get good at functional programming you start to use the same function over and over. In this case it is good to write a set of tests that you can run to make sure your function behaves the way you expect in all situations, not just the first situation you wrote it for.

Test Driven Development: In large or contract-driven projects, a project can be defined by its tests (e.g. "the program should do x when given y"). In this case you can write the tests first, and when your program satisfies all of them, you are done.
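
As a minimal sketch of the test-first workflow, assume a made-up requirement: a function called mean_temperature (hypothetical, not part of the lesson's module) should return the average of a list of temperatures. The test is written first, and the function is filled in afterwards until the test passes.

```python
# Step 1: write the test first. It encodes the requirement
# "given [70, 80, 90] the program should return 80".
def test_mean_temperature():
    assert mean_temperature([70, 80, 90]) == 80, 'mean of 70, 80, 90 should be 80'


# Step 2: write just enough code to satisfy the test.
# mean_temperature is a hypothetical function used only for illustration.
def mean_temperature(temperatures):
    return sum(temperatures) / float(len(temperatures))


# Step 3: run the test. No error means the requirement is met.
test_mean_temperature()
```

In a real project there would be many such tests, one for each stated requirement.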

Testing the product

The simplest testing you can do is to check that you get what you expect in a specific circumstance. Let's use the module we wrote yesterday:

In [1]:
import plot_temperature
In [2]:
dir(plot_temperature)
Out[2]:
['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 'convert_fahrenheit_to_celsius',
 'np',
 'plot_data',
 'pyplot',
 'read_csv_file']

We will start by testing read_csv_file on this data file:

In [3]:
cat mosquito_data_A1.csv
year,temperature,rainfall,mosquitos
2001,87,222,198
2002,72,103,105
2003,77,176,166
2004,89,236,210
2005,88,283,242
2006,89,151,147
2007,71,121,117
2008,88,267,232
2009,85,211,191
2010,75,101,106
In [4]:
def test_output_read_csv_file():
    year, temperature, rainfall, mosquitos = plot_temperature.read_csv_file('mosquito_data_A1.csv')
    year_key = [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010]
    for yr_guess, yr_solution in zip(year, year_key):
        assert yr_guess == yr_solution, 'year doesn\'t match key'

There are two new statements in here:

  1. zip: this loops over several sequences together (e.g. it takes the first item of every array, then the second, then the third, etc.)
  2. assert: assert statements have the format: assert condition, message. If the condition is not true, Python raises an AssertionError and shows the message; if it is true, nothing happens
In [5]:
test_output_read_csv_file()
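
To see zip and assert on their own, here is a small sketch using two short lists typed in by hand (the values are the first three rows of the data file). The try/except is only there so the failing assert can be shown without stopping the example.

```python
years = [2001, 2002, 2003]
mosquitos = [198, 105, 166]

# zip pairs the first items together, then the second items, and so on.
pairs = list(zip(years, mosquitos))
print(pairs)

# An assert whose condition is true does nothing.
assert len(pairs) == 3, 'expected three pairs'

# An assert whose condition is false raises AssertionError with the message.
try:
    assert years[0] == 1999, 'year doesn\'t match key'
except AssertionError as error:
    message = str(error)
    print('AssertionError: ' + message)
```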

Alternatively, you can compare the whole array at once:

In [9]:
import numpy as np

def test_output_read_csv_file():
    year, temperature, rainfall, mosquitos = plot_temperature.read_csv_file('mosquito_data_A1.csv')
    assert (year == np.array([2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010])).all()
In [10]:
test_output_read_csv_file()

This test checks that, given a known input, we get the output we expect. When a test passes, nothing is printed.
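
NumPy also provides comparison helpers in numpy.testing. The sketch below is an alternative to the .all() comparison, not the method this lesson uses, and the short arrays are typed in by hand as a stand-in for the output of read_csv_file.

```python
import numpy as np

# Stand-in for the first array returned by read_csv_file (floats, as in the output shown later).
year = np.array([2001., 2002., 2003.])
year_key = np.array([2001, 2002, 2003])

# Passes silently when every element matches; on a mismatch it raises
# AssertionError and reports which values differ.
np.testing.assert_array_equal(year, year_key)

# For computed floating-point values, compare within a tolerance instead.
np.testing.assert_allclose(np.array([0.1 + 0.2]), np.array([0.3]))
```

The advantage over a bare assert is the error report: a failure lists the mismatching values instead of only saying that something was false.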

Challenge

Write a test for convert_fahrenheit_to_celsius

Solution

In [13]:
def test_convert_fahrenheit_to_celsius():
    temp_c = plot_temperature.convert_fahrenheit_to_celsius(32)
    assert temp_c == 0, '32F != 0C'
In [14]:
test_convert_fahrenheit_to_celsius()
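
A single point can pass by accident, so it may be worth checking several known temperatures. The sketch below assumes Python 3.5 or later (for math.isclose) and uses a stand-in copy of the conversion formula so that it runs without the plot_temperature module.

```python
import math


# Stand-in copy of the lesson's conversion so the example runs on its own.
def convert_fahrenheit_to_celsius(temp_in_f):
    return (temp_in_f - 32) * 5 / 9.0


def test_known_temperatures():
    # (fahrenheit, expected celsius) pairs: freezing, boiling,
    # the point where both scales agree, and body temperature
    known_pairs = [(32, 0.0), (212, 100.0), (-40, -40.0), (98.6, 37.0)]
    for temp_f, expected_c in known_pairs:
        temp_c = convert_fahrenheit_to_celsius(temp_f)
        # isclose tolerates the tiny rounding error of floating-point arithmetic
        assert math.isclose(temp_c, expected_c, abs_tol=1e-9), '%sF != %sC' % (temp_f, expected_c)


test_known_temperatures()
```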

Testing inputs

Sometimes you may want to check that a user gave you the right kind of input. For instance, in read_csv_file you may want to check that what is passed in is a string. This check usually has to live inside the function itself.

In [22]:
def read_csv_file(filename):
    '''
    This code will read in a CSV file of year, temperature, rainfall, and number of mosquitos and return 4 arrays, one for each column
    '''
    assert type(filename) is str, 'filename must be a string'
    year, temperature, rainfall, mosquitos = np.genfromtxt(filename, skip_header = 1, delimiter = ',', unpack = True)
    return year, temperature, rainfall, mosquitos
In [24]:
read_csv_file('mosquito_data_A1.csv')
Out[24]:
(array([ 2001.,  2002.,  2003.,  2004.,  2005.,  2006.,  2007.,  2008.,
        2009.,  2010.]),
 array([ 87.,  72.,  77.,  89.,  88.,  89.,  71.,  88.,  85.,  75.]),
 array([ 222.,  103.,  176.,  236.,  283.,  151.,  121.,  267.,  211.,  101.]),
 array([ 198.,  105.,  166.,  210.,  242.,  147.,  117.,  232.,  191.,  106.]))

The same check can also be written with isinstance, which accepts subclasses of str as well:

In [25]:
def read_csv_file(filename):
    '''
    This code will read in a CSV file of year, temperature, rainfall, and number of mosquitos and return 4 arrays, one for each column
    '''
    assert isinstance(filename, str), 'filename must be a string'
    year, temperature, rainfall, mosquitos = np.genfromtxt(filename, skip_header = 1, delimiter = ',', unpack = True)
    return year, temperature, rainfall, mosquitos
In [26]:
read_csv_file('mosquito_data_A1.csv')
Out[26]:
(array([ 2001.,  2002.,  2003.,  2004.,  2005.,  2006.,  2007.,  2008.,
        2009.,  2010.]),
 array([ 87.,  72.,  77.,  89.,  88.,  89.,  71.,  88.,  85.,  75.]),
 array([ 222.,  103.,  176.,  236.,  283.,  151.,  121.,  267.,  211.,  101.]),
 array([ 198.,  105.,  166.,  210.,  242.,  147.,  117.,  232.,  191.,  106.]))

What happens if the assertion is not true?

In [27]:
read_csv_file(4)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-27-946ff756682b> in <module>()
----> 1 read_csv_file(4)

<ipython-input-25-75f92f4ed7c9> in read_csv_file(filename)
      3     This code will read in a CSV file of year, temperature, rainfall, and number of mosquitos and return 4 arrays, one for each column
      4     '''
----> 5     assert isinstance(filename, str), 'filename must be a string'
      6     year, temperature, rainfall, mosquitos = np.genfromtxt(filename, skip_header = 1, delimiter = ',', unpack = True)
      7     return year, temperature, rainfall, mosquitos

AssertionError: filename must be a string
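
One caveat: assert statements are skipped when Python is started with the -O option, so an input check written as an assert can silently disappear. A common alternative, sketched here with a hypothetical helper named check_filename (not part of plot_temperature), is to raise an exception explicitly.

```python
def check_filename(filename):
    # Hypothetical helper, not part of plot_temperature.
    # Unlike assert, this check still runs when Python is started with -O.
    if not isinstance(filename, str):
        raise TypeError('filename must be a string')
    return filename


try:
    check_filename(4)
except TypeError as error:
    caught = str(error)
    print('TypeError: ' + caught)
```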

Challenge

Add a check to convert_fahrenheit_to_celsius() that tests its input.

Solution

In [29]:
def convert_fahrenheit_to_celsius(temp_in_f):
    '''
    This code will convert a temperature from fahrenheit to celsius
    '''
    assert isinstance(temp_in_f, float) or isinstance(temp_in_f, int), 'temperature must be an int or float'
    temp_in_c = (temp_in_f - 32) * 5 / 9.0
    return temp_in_c
In [32]:
convert_fahrenheit_to_celsius(4)
convert_fahrenheit_to_celsius(10.)
convert_fahrenheit_to_celsius('a')
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-32-05c4afff0b4d> in <module>()
      1 convert_fahrenheit_to_celsius(4)
      2 convert_fahrenheit_to_celsius(10.)
----> 3 convert_fahrenheit_to_celsius('a')

<ipython-input-29-0465d69b5f77> in convert_fahrenheit_to_celsius(temp_in_f)
      3     This code will convert a temperature from fahrenheit to celsius
      4     '''
----> 5     assert isinstance(temp_in_f, float) or isinstance(temp_in_f, int), 'temperature must be an int or float'
      6     temp_in_c = (temp_in_f - 32) * 5 / 9.0
      7     return temp_in_c

AssertionError: temperature must be an int or float
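
The input check can itself be tested: a test can pass in bad input on purpose and confirm that the function refuses it. This is a sketch using a stand-in copy of the function above so that it runs on its own; the test name is only a suggestion.

```python
# Stand-in copy of the function above so the example runs on its own.
def convert_fahrenheit_to_celsius(temp_in_f):
    assert isinstance(temp_in_f, float) or isinstance(temp_in_f, int), 'temperature must be an int or float'
    temp_in_c = (temp_in_f - 32) * 5 / 9.0
    return temp_in_c


def test_convert_rejects_string():
    rejected = False
    try:
        convert_fahrenheit_to_celsius('a')
    except AssertionError:
        # The input check fired, which is the behaviour we want.
        rejected = True
    assert rejected, 'string input should raise an AssertionError'


test_convert_rejects_string()
```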

Move out of the notebook

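  1. Copy and paste the updated versions of read_csv_file and convert_fahrenheit_to_celsius into plot_temperature.py and save it
  2. Copy and paste the test_... functions into a new file, add import plot_temperature (and import numpy as np if a test uses np) at the top, and save it as test_plot_temperature.py
  3. Type nosetests at the command line, in the directory that contains both files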