Testing inside an IPython Notebook

In my courses and as part of Software Carpentry I'm teaching more and more in IPython notebooks. One of the things I/we teach is testing, so I wondered if I could teach it in a notebook as well. With quite a lot of help from Matthias Bussonnier I got it working so I thought I'd share.

An example

We will test a function that caluculates the GC-content of a DNA sequence. The GC-content is simply the percentage of bases in the DNA sequence that are either G's or C's. So, for example, the GC-content of 'ATTGC' is 40%.

The function we are testing is get_gc_content() and it takes a single argument, which is a string represting a sequence. This function is in a custom module called dna_analysis.py.

Create the module to test

We can use the %%file magic to save a block of code to a file, so we can use that to create the module that we're going to test.

In [1]:
%%file dna_analysis.py

"""Code for analyzing DNA sequences"""

from __future__ import division

def get_gc_content(seq):
    """Determine the GC content of a sequence"""
    seq = seq.upper()
    gc_content = 100 * (seq.count('G') + seq.count('C')) / len(seq)
    return gc_content
Writing dna_analysis.py

Create the test module

  1. To use nosetests we need to save the test code to a Python file starting with the word test
  2. We then need to import the function(s) we are going to test
  3. Then we right a simple function (whose name starts with test) that calls the function and checks to see if it returns the right value
In [2]:
%%file test_dna.py

from dna_analysis import get_gc_content

def test_get_gc_content_zero():
    assert get_gc_content('ATTATTAAA') == 0
    
def test_get_gc_content_lowercase():
    assert get_gc_content('atgcatgc') == 50
    
def test_get_gc_content_multiline():
    sequence = """atta
gccg
attt
cccg"""
    assert get_gc_content(sequence) == 50
Writing test_dna.py

Run nosetests

We would normally run nosetests from the command line, so in IPython we can just call !nosetests.

In [3]:
!nosetests
..F
======================================================================
FAIL: test_dna.test_get_gc_content_multiline
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ethan/Dropbox/Teaching/ProgBiol/repo/ipynbs/test_dna.py", line 15, in test_get_gc_content_multiline
    assert get_gc_content(sequence) == 50
AssertionError

----------------------------------------------------------------------
Ran 3 tests in 0.018s

FAILED (failures=1)

This is exactly we output we want since the original function handles most basic cases, but not multiline strings.

Other kinds of testing

There is also a nice example of running doctests in the notebook.

Value?

I wonder a bit about how valuable this is in the sense that we probably wouldn't normally run tests this way (at least I don't at the moment). But, I think at least for a short workshop where we're teaching all of the Python in a notebook that the value of reducing the cognitive load relative to switching environments for the testing portion might outweigh doing something in a way that might not be exactly how we would do it in day to day work.