Get Contents

An example notebook that extracts headings from a set of notebooks to build a contents list.

Get Notebook File Directories

Grab an ordered list of module notebook directories.

In [7]:
import os

basedir='tm351'
weeks = sorted([w for w in os.listdir(basedir) if w.startswith('Part') and w.endswith('Notebooks')])
weeks
Out[7]:
['Part 01 Notebooks',
 'Part 02 Notebooks',
 'Part 03 Notebooks',
 'Part 04 Notebooks',
 'Part 05 Notebooks',
 'Part 07 Notebooks',
 'Part 08 Notebooks',
 'Part 09 Notebooks',
 'Part 10 Notebooks',
 'Part 11 Notebooks',
 'Part 12 Notebooks',
 'Part 14 Notebooks',
 'Part 15 Notebooks',
 'Part 16 Notebooks',
 'Part 20 Notebooks',
 'Part 21 Notebooks',
 'Part 22 Notebooks',
 'Part 23 Notebooks',
 'Part 25 Notebooks',
 'Part 26 Notebooks']

Parse Notebook

We can parse a notebook by grabbing its markdown cells, splitting each cell's source into lines, and keeping the lines that start with a markdown heading marker (i.e. a #). Note that this produces false positives when a markdown cell contains marked-up code whose lines begin with a # comment. (A full parser would have to exclude code marked up inside backticks, for example.)

In [39]:
import nbformat as nb
# IPython.nbformat has moved to the standalone nbformat package
import nbformat.v4.nbbase as nb4
In [46]:
def getContentFromNotebook(notebook, md=None, debug=True):
    ''' Grab header lines from notebook markdown cells. '''
    md = [] if md is None else md
    tmpnb=nb.read(notebook,nb.NO_CONVERT)
    for i in tmpnb['cells']:
        if (i['cell_type']=='markdown'):
            for line in i['source'].splitlines():
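                if line.startswith('#'):
                    # Swap only the leading '#' markers for tabs; keep any '#' inside the heading text
                    level = len(line) - len(line.lstrip('#'))
                    txt = '\t' * level + line[level:]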
                    if debug: print(txt)
                    md.append(txt)
    return md 
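
The false positives mentioned above come mostly from comment lines inside fenced code blocks. The following is a minimal sketch of one way to skip them, working on a markdown string rather than a notebook file. The helper name `headings_outside_fences` and the sample text are hypothetical and not part of the original notebook. It assumes fences open and close with a line starting with three backticks or three tildes, and it only accepts headings that have whitespace after the # markers, which is stricter than the `startswith('#')` test used above.

```python
import re

# Hypothetical helper for illustration; not part of the original notebook.
FENCE = re.compile(r'^\s*(`{3}|~{3})')      # opening or closing code fence
HEADING = re.compile(r'^(#{1,6})\s+(.*)$')  # '#' markers, whitespace, heading text

def headings_outside_fences(source):
    """Return (level, text) pairs for headings that are not inside a fenced code block."""
    found = []
    in_fence = False
    for line in source.splitlines():
        if FENCE.match(line):
            # Simplification: any fence line toggles the state
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        m = HEADING.match(line)
        if m:
            found.append((len(m.group(1)), m.group(2).strip()))
    return found

fence = '`' * 3
sample = '\n'.join([
    '# Title',
    fence + 'python',
    '# a comment, not a heading',
    'x = 1',
    fence,
    '## Section',
])
print(headings_outside_fences(sample))
```

The source string for each cell could come from `i['source']` in the loop above. Inline code in single backticks is not handled separately, since it rarely starts a line with a #.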
In [49]:
def processNotebookFolders(weeks, debug=True):
    ''' Find notebooks in folders and extract headings from them. '''
    md=[]
    for week in weeks:
        weekdir = os.path.join(basedir, week)
        notebooks = sorted([f for f in os.listdir(weekdir) if f.endswith('.ipynb')])
        for notebook in notebooks:
            notebookpath = os.path.join(weekdir, notebook)
            txt = '{}/{}'.format(week, notebook)
            if debug: print(txt)
            md.append(txt)
            md = getContentFromNotebook(notebookpath, md, debug)
        if debug: print()
    return md

md = processNotebookFolders(weeks, False)
In [50]:
def saveContentsAsMarkdown(md, fn='contents.md'):
    ''' Write out contents list to a markdown file. '''
    with open(fn, 'w') as outfile:
        outfile.write('\n'.join(md))
    return fn

fn = saveContentsAsMarkdown(md, fn='contents.md')
# Preview the output file
!head $fn
Part 01 Notebooks/01.1 Getting started with IPython and Notebooks - Bootcamp.ipynb
	 Python recap and IPython Notebook "bootcamp"
		 Notebook cells
			 Code cells
			 Markdown cells and cell editting
			 In passing, say 'Hi' to the command line...
		 What next?
Part 01 Notebooks/01.2 Python recap.ipynb
	 Python recap
		 Some Python basics
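
Markdown renderers will not treat the tab-indented lines above as a nested list. As a rough sketch, the list could be converted to nested bullets before saving. The function name `to_markdown_list` is hypothetical, and the two-space indent per level is an assumption that suits headings that do not skip levels.

```python
def to_markdown_list(lines):
    """Convert tab-indented contents lines into nested Markdown bullets."""
    out = []
    for line in lines:
        # Depth is the number of leading tabs (0 for notebook paths)
        depth = len(line) - len(line.lstrip('\t'))
        text = line.strip()
        if text:
            out.append('  ' * depth + '- ' + text)
    return out

example = [
    'Part 01 Notebooks/01.2 Python recap.ipynb',
    '\t Python recap',
    '\t\t Some Python basics',
]
print('\n'.join(to_markdown_list(example)))
```

The result could then be passed to `saveContentsAsMarkdown` in place of `md`.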