Notebook

This notebook proposes a few requirements for notebooks that are used as source code. The best practices will:

Make more readable notebooks.
Make more reusable notebooks.
Make more diffable notebooks.
Make more reproducible notebooks.

Tim made a cool project https://github.com/betatim/joli/tree/master/joli and it got me thinking about some tests for notebooks if someone was crazy enough use them as source. Maybe pytest is a good framework for testing notebooks. Encouraging testing earlier will improve the longevity of computational thought.

In [1]:

    %reload_ext pidgin

In [2]:

    import pytest, IPython, importnb

nb  = json.loads(pathlib.Path('2018-11-30-Restart-run-all-precommit.ipynb').read_text())

In [3]:

    def monotically_increasing_execution_results(nb):
`monotically_increasing_execution_results` tests if the execution count of the code cells in __nb__ are monotonically increasing.  This check
will assure better different and more reliable state.  This concept is inline with the Million notebook analysis. In that work the execution count
is a feature metric for reproducibility.
    
__exc__ is a list of `enumerate`d code cells and their `"execution_count"`.
        
        exc = [
            (id+1, object['execution_count'])
            for id, object in enumerate(
                object for object in nb['cells']
                if object['cell_type'] == 'code' and object['execution_count'] is not None
            ) 
        ]
Strip cells with `None` for an `"execution_count"`s.

        while exc[-1][1] is None: exc.pop()
        
        assert all(itertools.starmap(int.__eq__, exc)), """The cells are out of order."""

def monotically_increasing_execution_results(nb):

monotically_increasing_execution_results tests if the execution count of the code cells in nb are monotonically increasing. This check will assure better different and more reliable state. This concept is inline with the Million notebook analysis. In that work the execution count is a feature metric for reproducibility.

exc is a list of enumerated code cells and their "execution_count".

    exc = [
        (id+1, object['execution_count'])
        for id, object in enumerate(
            object for object in nb['cells']
            if object['cell_type'] == 'code' and object['execution_count'] is not None
        ) 
    ]

Strip cells with None for an "execution_count"s.

    while exc[-1][1] is None: exc.pop()

    assert all(itertools.starmap(int.__eq__, exc)), """The cells are out of order."""

In [4]:

    def has_markdown_docstring(nb):
`has_markdown_docstring` ensures the source starts with __Markdown__, and thereby the docstring.  This opinion is a consequent of the `importnb` library.
    
        assert nb['cells'][0]['cell_type'] == 'markdown', """The cell in notebook source code should be a __Markdown__ docstring."""

def has_markdown_docstring(nb):

has_markdown_docstring ensures the source starts with Markdown, and thereby the docstring. This opinion is a consequent of the importnb library.

    assert nb['cells'][0]['cell_type'] == 'markdown', """The cell in notebook source code should be a __Markdown__ docstring."""

In [5]:

    # content of conftest.py
    import pytest, pidgin, pathlib, json, itertools

In [6]:

    def pytest_collect_file(parent, path):
`pytest_collect_file` will collect notebooks.

        if path.ext == ".ipynb": return NotebookFile(path, parent)

def pytest_collect_file(parent, path):

pytest_collect_file will collect notebooks.

    if path.ext == ".ipynb": return NotebookFile(path, parent)

In [7]:

    class NotebookFile(pytest.File):
        def collect(self):
`NotebookFile.collect` reads the notebook and runs just one test for now.
            
            nb = __import__('json').load(self.fspath.open())
            yield from(
                AggregateNotebookTests(callable.__name__, self, nb, callable) for callable in (
                    monotically_increasing_execution_results, 
                    has_markdown_docstring
                )
            )

class NotebookFile(pytest.File):
    def collect(self):

NotebookFile.collect reads the notebook and runs just one test for now.

        nb = __import__('json').load(self.fspath.open())
        yield from(
            AggregateNotebookTests(callable.__name__, self, nb, callable) for callable in (
                monotically_increasing_execution_results, 
                has_markdown_docstring
            )
        )

In [8]:

    class AggregateNotebookTests(pytest.Item):
`AggregateNotebookTests` is a `pytest.Item` for testing features of notebook data.

        def runtest(self):  return self.callable(self.nb)
        
        def __init__(self, name, parent, nb, callable):
            super().__init__(name, parent)
            self.nb, self.callable = nb, callable

`AggregateNotebookTests.repr_failure` is really similar the to the <b><i>render_traceback</i></b> attribute provided by `IPython` to customize tracebacks.

        
        def reportinfo(self): return self.fspath, 0, "usecase: %s" % self.name

class AggregateNotebookTests(pytest.Item):

AggregateNotebookTests is a pytest.Item for testing features of notebook data.

    def runtest(self):  return self.callable(self.nb)

    def __init__(self, name, parent, nb, callable):
        super().__init__(name, parent)
        self.nb, self.callable = nb, callable

AggregateNotebookTests.repr_failure is really similar the to the render_traceback attribute provided by IPython to customize tracebacks.

    def reportinfo(self): return self.fspath, 0, "usecase: %s" % self.name

In [9]:

    def _run_pytest_example(): 
        pytest.main('--nbval 2018-11-29-Notebooks-as-source-tests.md.ipynb'.split(), [__import__(__name__)])
        
    
    if __name__ == '__main__' and not __import__('os').environ.get('PYTEST_CURRENT_TEST', None):
        with IPython.utils.capture.capture_output() as report:
            _run_pytest_example()
        print('\n'.join(report.stdout.splitlines()))

============================= test session starts =============================
platform win32 -- Python 3.6.6, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: C:\Users\deathbeds\deathbeds.github.io, inifile:
plugins: xonsh-0.8.1, doctestplus-0.1.3, cov-2.6.0, nbval-0.9.1, hypothesis-3.66.16, pidgin-0.3.0, importnb-0.5.1
collected 12 items

2018-11-29-Notebooks-as-source-tests.md.ipynb .....FFF....               [100%]

================================== FAILURES ===================================
_______ deathbeds\2018-11-29-Notebooks-as-source-tests.md.ipynb::Cell 5 _______
Notebook cell execution failed
Cell 5: Unrun reference cell has outputs

Input:
    def pytest_collect_file(parent, path):
`pytest_collect_file` will collect notebooks.

        if path.ext == ".ipynb": return NotebookFile(path, parent)

_______ deathbeds\2018-11-29-Notebooks-as-source-tests.md.ipynb::Cell 6 _______
Notebook cell execution failed
Cell 6: Unrun reference cell has outputs

Input:
    class NotebookFile(pytest.File):
        def collect(self):
`NotebookFile.collect` reads the notebook and runs just one test for now.
            
            nb = __import__('json').load(self.fspath.open())
            yield from(
                AggregateNotebookTests(callable.__name__, self, nb, callable) for callable in (
                    monotically_increasing_execution_results, 
                    has_markdown_docstring
                )
            )

_______ deathbeds\2018-11-29-Notebooks-as-source-tests.md.ipynb::Cell 7 _______
Notebook cell execution failed
Cell 7: Unrun reference cell has outputs

Input:
    class AggregateNotebookTests(pytest.Item):
`AggregateNotebookTests` is a `pytest.Item` for testing features of notebook data.

        def runtest(self):  return self.callable(self.nb)
        
        def __init__(self, name, parent, nb, callable):
            super().__init__(name, parent)
            self.nb, self.callable = nb, callable

`AggregateNotebookTests.repr_failure` is really similar the to the <b><i>render_traceback</i></b> attribute provided by `IPython` to customize tracebacks.

        
        def reportinfo(self): return self.fspath, 0, "usecase: %s" % self.name


===================== 3 failed, 9 passed in 6.62 seconds ======================

In [ ]: