IPython (Interactive Python) is an enhanced Python shell which provides a more robust and productive development environment for users. There are several key features that set it apart from the standard Python shell.


In IPython, all your inputs and outputs are saved. Two variables, In and Out, are filled in as you work: In holds the inputs and Out maps prompt numbers to results. All outputs are also saved automatically to variables of the form _N, where N is the prompt number, and inputs to _iN. This allows you to quickly recover the result of a prior computation by referring to its number, even if you forgot to store it as a variable.

In [1]:
import numpy as np
np.sin(4)**2
In [2]:
_1
In [3]:
_i1
'import numpy as np\nnp.sin(4)**2'
In [4]:
_1 / 4.
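
To make the bookkeeping concrete, here is a toy model of input/output caching in plain Python. It is only a sketch: the `History` class and its `run` method are hypothetical names invented for illustration, and IPython's real implementation is considerably more involved.

```python
class History:
    """Toy model of IPython-style input/output caching (illustrative only)."""

    def __init__(self):
        self.inputs = [""]   # index 0 is unused so prompt numbers start at 1
        self.outputs = {}    # prompt number -> result, in the spirit of Out

    def run(self, source, namespace):
        """Evaluate one expression, record it, and return its prompt number."""
        n = len(self.inputs)
        self.inputs.append(source)
        result = eval(source, namespace)
        if result is not None:
            self.outputs[n] = result
            namespace[f"_{n}"] = result   # mimic the _N shortcut
        namespace[f"_i{n}"] = source      # mimic the _iN shortcut
        return n


ns = {}
history = History()
history.run("2 ** 10", ns)
history.run("_1 / 4.", ns)   # reuse the first result by its prompt number
print(history.outputs)       # {1: 1024, 2: 256.0}
```

The second call works only because the first result was stored under the name `_1`, which is the same convenience the `_N` variables provide in a real session.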

Output is asynchronous

All output is displayed asynchronously as it is generated in the kernel. If you execute the next cell, you will see the output one piece at a time rather than all at once at the end.

In [5]:
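import time
for i in range(8):
    print(i, flush=True)  # show each number as soon as it is produced
    time.sleep(0.5)       # pause so the piecewise output is visible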


If you want details regarding the properties and functionality of any Python object currently loaded into IPython, append ? to its name to reveal any details that are available:

In [6]:
some_dict = {}
some_dict?
Type:        dict
String form: {}
Length:      0
Docstring:
dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's
    (key, value) pairs
dict(iterable) -> new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs
    in the keyword argument list.  For example:  dict(one=1, two=2)

If available, additional detail is provided with two question marks, including the source code of the object itself.

In [7]:
from numpy.linalg import cholesky
cholesky??
Signature: cholesky(a)
def cholesky(a):
    Cholesky decomposition.

    Return the Cholesky decomposition, `L * L.H`, of the square matrix `a`,
    where `L` is lower-triangular and .H is the conjugate transpose operator
    (which is the ordinary transpose if `a` is real-valued).  `a` must be
    Hermitian (symmetric if real-valued) and positive-definite.  Only `L` is
    actually returned.

    a : (..., M, M) array_like
        Hermitian (symmetric if all elements are real), positive-definite
        input matrix.

    L : (..., M, M) array_like
        Upper or lower-triangular Cholesky factor of `a`.  Returns a
        matrix object if `a` is a matrix object.

       If the decomposition fails, for example, if `a` is not
       positive-definite.


    .. versionadded:: 1.8.0

    Broadcasting rules apply, see the `numpy.linalg` documentation for
    details.

    The Cholesky decomposition is often used as a fast way of solving

    .. math:: A \\mathbf{x} = \\mathbf{b}

    (when `A` is both Hermitian/symmetric and positive-definite).

    First, we solve for :math:`\\mathbf{y}` in

    .. math:: L \\mathbf{y} = \\mathbf{b},

    and then for :math:`\\mathbf{x}` in

    .. math:: L.H \\mathbf{x} = \\mathbf{y}.

    >>> A = np.array([[1,-2j],[2j,5]])
    >>> A
    array([[ 1.+0.j,  0.-2.j],
           [ 0.+2.j,  5.+0.j]])
    >>> L = np.linalg.cholesky(A)
    >>> L
    array([[ 1.+0.j,  0.+0.j],
           [ 0.+2.j,  1.+0.j]])
    >>> np.dot(L, L.T.conj()) # verify that L * L.H = A
    array([[ 1.+0.j,  0.-2.j],
           [ 0.+2.j,  5.+0.j]])
    >>> A = [[1,-2j],[2j,5]] # what happens if A is only array_like?
    >>> np.linalg.cholesky(A) # an ndarray object is returned
    array([[ 1.+0.j,  0.+0.j],
           [ 0.+2.j,  1.+0.j]])
    >>> # But a matrix object is returned if A is a matrix object
    >>> LA.cholesky(np.matrix(A))
    matrix([[ 1.+0.j,  0.+0.j],
            [ 0.+2.j,  1.+0.j]])

    extobj = get_linalg_error_extobj(_raise_linalgerror_nonposdef)
    gufunc = _umath_linalg.cholesky_lo
    a, wrap = _makearray(a)
    t, result_t = _commonType(a)
    signature = 'D->D' if isComplexType(t) else 'd->d'
    r = gufunc(a, signature=signature, extobj=extobj)
    return wrap(r.astype(result_t, copy=False))
File:      ~/anaconda3/envs/dev/lib/python3.6/site-packages/numpy/linalg/linalg.py
Type:      function
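
Outside IPython, much of the same information can be gathered with the standard library's inspect module. The following is a minimal sketch, not how IPython itself builds its display; `describe` and `show_source` are hypothetical helper names, and only a subset of the fields is collected.

```python
import inspect
import textwrap


def describe(obj):
    """Collect roughly the fields that `obj?` prints (illustrative subset)."""
    info = {
        "Type": type(obj).__name__,
        "String form": str(obj),
        "Docstring": inspect.getdoc(obj),
    }
    try:
        info["Length"] = len(obj)
    except TypeError:
        pass  # not every object has a length
    return info


def show_source(func):
    """Return the source text of a function written in Python, as `func??` shows."""
    return inspect.getsource(func)


for field, value in describe({}).items():
    print(f"{field}: {value}")

# textwrap.dedent is implemented in Python, so its source can be retrieved.
print(show_source(textwrap.dedent)[:200])
```

Source retrieval only works for objects defined in Python files; for built-ins implemented in C, inspect.getsource raises an error, which is why the text above says the extra detail is shown "if available".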

This syntax can also be used to search namespaces with wildcards (*).

In [8]:
%matplotlib inline
import pylab as plt
plt.*plot*?
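
A rough plain-Python analogue of a wildcard search is to filter the names in a namespace with shell-style patterns. This sketch assumes fnmatch semantics, which may differ in detail from IPython's own matching, and `search_namespace` is a hypothetical helper name.

```python
import fnmatch
import math


def search_namespace(namespace, pattern):
    """Return the public names in `namespace` that match a shell-style pattern."""
    return sorted(
        name for name in dir(namespace)
        if not name.startswith("_") and fnmatch.fnmatchcase(name, pattern)
    )


# The math module stands in for any imported module or object.
print(search_namespace(math, "*sin*"))
```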

Tab completion

Because IPython supports introspection, it can tab-complete commands that have been partially typed. Press the <tab> key at any point while typing a command.

Place your cursor after the partially-completed command below and press tab:

In [ ]:

This can even be used to help with specifying arguments to functions, which can sometimes be difficult to remember:

In [ ]:
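
The idea behind completion can be seen with the standard library's rlcompleter module, which inspects live objects to find matching names. This is only an illustration: IPython uses its own, more capable completer, and `completions` is a hypothetical helper name.

```python
import math
import rlcompleter


def completions(prefix, namespace):
    """List every completion the stdlib completer offers for `prefix`."""
    completer = rlcompleter.Completer(namespace)
    matches = []
    state = 0
    while True:
        match = completer.complete(prefix, state)
        if match is None:
            break
        matches.append(match)
        state += 1
    return matches


# Ask what could follow the partially typed "math.sq".
print(completions("math.sq", {"math": math}))
```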

System commands

In IPython, you can type ls to see your files or cd to change directories, just like you would at a regular system prompt:

In [10]:
ls ../data
AIS/                          microbiome_missing.csv
baseball-archive-2011.sqlite  mushroom.csv
baseball.csv                  nashville_precip.txt
baseball.dat*                 occupancy.csv
besx97e.dta                   pima-indians-diabetes.data.txt
bikeshare.csv                 pima-indians-diabetes.metadata.txt
bodyfat.dat*                  pitches.csv
brasil_capitals.txt           pitches.md
cancer.csv                    prostate.data.txt
cdystonia.csv                 radon.csv
concrete.csv                  salmon.txt
credit.csv                    srrs2.dat*
cty.dat*                      survey.db
ebola/                        test_scores.csv
heart_rate.csv                titanic.html
heart_rate.txt                titanic.xls
measles.csv                   TNNASHVI.txt
measles.xlsx                  vlbw.csv
melanoma_data.py              walker.txt
microbiome/                   wine.dat*
microbiome.csv                wisconsin_breast_cancer.csv

Virtually any system command can be accessed by prepending !, which passes any subsequent command directly to the OS.

In [11]:
!locate sklearn | grep pdf 
/bin/sh: 1: locate: not found

You can even use Python variables in commands sent to the OS:

In [12]:
file_type = 'csv'
!ls ../data/*$file_type
../data/baseball.csv	../data/microbiome_missing.csv
../data/bikeshare.csv	../data/mushroom.csv
../data/cancer.csv	../data/occupancy.csv
../data/cdystonia.csv	../data/pitches.csv
../data/concrete.csv	../data/radon.csv
../data/credit.csv	../data/test_scores.csv
../data/heart_rate.csv	../data/vlbw.csv
../data/measles.csv	../data/wisconsin_breast_cancer.csv

The output of a system command using the exclamation point syntax can be assigned to a Python variable.

In [13]:
data_files = !ls ../data/microbiome/
In [14]:
data_files
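
In a plain Python script, where the ! syntax is not available, a similar result can be obtained with the subprocess module. A minimal sketch, assuming the command writes one item per line; `capture_lines` is a hypothetical helper name.

```python
import subprocess
import sys


def capture_lines(command):
    """Run a command (a list of arguments) and return its stdout as a list of lines."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return completed.stdout.splitlines()


# sys.executable stands in for `ls` so the example runs on any platform.
lines = capture_lines([sys.executable, "-c", "print('a.csv'); print('b.csv')"])
print(lines)  # ['a.csv', 'b.csv']
```

On a Unix-like system, capture_lines(["ls", "../data/microbiome/"]) would play the role of the assignment to data_files above.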

Qt Console

If you type at the system prompt:

$ jupyter qtconsole

instead of opening in a terminal, IPython will start a graphical console that at first sight appears just like a terminal but is in fact much more capable than a text-only terminal. It is a specialized terminal designed for interactive scientific work: it supports full multi-line editing with color highlighting and graphical calltips for functions, it can keep multiple IPython sessions open simultaneously in tabs, and when scripts run it can display figures inline, directly in the work area.


Jupyter Notebook

Over time, the IPython project grew to include several components:

  • an interactive shell
  • a REPL protocol
  • a notebook document format
  • a notebook document conversion tool
  • a web-based notebook authoring tool
  • tools for building interactive UI (widgets)
  • interactive parallel Python

As each component evolved, several grew to the point that they warranted projects of their own. For example, pieces like the notebook and protocol are not even specific to Python. As a result, the IPython team created Project Jupyter, which is the new home of language-agnostic projects that began as part of IPython, such as the notebook in which you are reading this text.

The HTML notebook that is part of the Jupyter project supports interactive data visualization and easy high-performance parallel computing.

In [16]:
import matplotlib.pyplot as plt

def f(x):
    return (x-3)*(x-5)*(x-7)+85

import numpy as np
x = np.linspace(0, 10, 200)
y = f(x)
plt.plot(x, y)
[<matplotlib.lines.Line2D at 0x7c294cdd74e0>]

The notebook lets you document your workflow using either HTML or Markdown.

The Jupyter Notebook consists of two related components:

  • A JSON-based Notebook document format for recording and distributing Python code and rich text.
  • A web-based user interface for authoring and running notebook documents.
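
To give a feel for the document format, the sketch below builds a very small notebook as a Python dictionary and serializes it with the json module. It assumes version 4 of the notebook format; the cell contents are made up for illustration, and real notebooks usually carry more metadata.

```python
import json

# A minimal notebook in the version 4 format: one Markdown cell, one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Title\n", "Some *rich* text."],
        },
        {
            "cell_type": "code",
            "execution_count": None,   # the cell has not been run yet
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

text = json.dumps(notebook, indent=1)
print(text)
```

Writing this text to a file with the .ipynb extension should give a document that the web interface can open, which is what makes notebooks easy to store, diff, and share.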

The Notebook can be used by starting the Notebook server with the command:

$ jupyter notebook

This initiates an IPython engine, which is a Python instance that takes Python commands over a network connection.

The IPython controller provides an interface for working with a set of engines, to which one or more IPython clients can connect.

The Notebook gives you everything that a browser gives you. For example, you can embed images, videos, or entire websites.

In [17]:
from IPython.display import HTML
HTML("<iframe src=http://fonnesbeck.github.io/Bios8366 width=700 height=350></iframe>")
/home/fonnesbeck/anaconda3/envs/dev/lib/python3.6/site-packages/IPython/core/display.py:689: UserWarning: Consider using IPython.display.IFrame instead
  warnings.warn("Consider using IPython.display.IFrame instead")
In [18]:
from IPython.display import YouTubeVideo

Remote Code

Use %load to add remote code

In [19]:
# %load http://matplotlib.org/mpl_examples/shapes_and_collections/scatter_demo.py
"""Simple demo of a scatter plot."""
import numpy as np
import matplotlib.pyplot as plt

N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2  # 0 to 15 point radii

plt.scatter(x, y, s=area, c=colors, alpha=0.5)

Mathjax Support

MathJax is a JavaScript implementation of LaTeX math rendering that allows equations, such as an inline $\alpha$, to be embedded into HTML. For example, this markup:

"""$$ \int_{a}^{b} f(x)\, dx \approx \frac{1}{2} \sum_{k=1}^{N} \left( x_{k} - x_{k-1} \right) \left( f(x_{k}) + f(x_{k-1}) \right). $$"""

becomes this:

$$ \int_{a}^{b} f(x)\, dx \approx \frac{1}{2} \sum_{k=1}^{N} \left( x_{k} - x_{k-1} \right) \left( f(x_{k}) + f(x_{k-1}) \right). $$
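
The formula rendered above is the trapezoidal rule for numerical integration. As a small worked example, it can be written directly in Python; this sketch assumes equally spaced points $x_0, \dots, x_N$ on $[a, b]$, and `trapezoid` is a hypothetical function name.

```python
def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal-width trapezoids."""
    width = (b - a) / n
    xs = [a + k * width for k in range(n + 1)]   # x_0, ..., x_N
    total = 0.0
    for k in range(1, n + 1):
        # (x_k - x_{k-1}) * (f(x_k) + f(x_{k-1})), as in the sum above
        total += (xs[k] - xs[k - 1]) * (f(xs[k]) + f(xs[k - 1]))
    return total / 2


approx = trapezoid(lambda x: x ** 2, 0.0, 1.0, 1000)
print(approx)  # close to the exact value 1/3
```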

SymPy Support

SymPy is a Python library for symbolic mathematics. It supports:

  • polynomials
  • calculus
  • solving equations
  • discrete math
  • matrices
In [20]:
from sympy import *
x, y = symbols("x y")
In [21]:
eq = ((x+y)**2 * (x+1))
eq
$$\left(x + 1\right) \left(x + y\right)^{2}$$
In [22]:
expand(eq)
$$x^{3} + 2 x^{2} y + x^{2} + x y^{2} + 2 x y + y^{2}$$
In [23]:
(1/cos(x)).series(x, 0, 6)
$$1 + \frac{x^{2}}{2} + \frac{5 x^{4}}{24} + O\left(x^{6}\right)$$
In [24]:
limit((sin(x)-x)/x**3, x, 0)
$$- \frac{1}{6}$$
In [25]:
diff(cos(x**2)**2 / (1+x), x)
$$- \frac{4 x \sin{\left (x^{2} \right )} \cos{\left (x^{2} \right )}}{x + 1} - \frac{\cos^{2}{\left (x^{2} \right )}}{\left(x + 1\right)^{2}}$$
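
The limit computed in cell 24 can be checked by hand. Assuming the standard Taylor expansion of $\sin x$ about $0$, the leading terms cancel and only the cubic term survives:

$$ \sin x = x - \frac{x^{3}}{6} + O\left(x^{5}\right) \quad\Longrightarrow\quad \frac{\sin x - x}{x^{3}} = -\frac{1}{6} + O\left(x^{2}\right) \longrightarrow -\frac{1}{6} \quad \text{as } x \to 0. $$

This agrees with the value SymPy returns.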

Magic functions

IPython has a set of predefined ‘magic functions’ that you can call with a command line style syntax. These include:

  • %run
  • %edit
  • %debug
  • %timeit
  • %paste
  • %load_ext
In [26]:
%lsmagic
Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%python  %%python2  %%python3  %%ruby  %%script  %%sh  %%svg  %%sx  %%system  %%time  %%timeit  %%writefile

Automagic is ON, % prefix IS NOT needed for line magics.

Timing the execution of code; the timeit magic exists both in line and cell form:

In [27]:
%timeit np.linalg.eigvals(np.random.rand(100,100))
11.8 ms ± 1.06 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [28]:
%%timeit a = np.random.rand(100, 100)
np.linalg.eigvals(a)
14.4 ms ± 4.37 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
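
Outside IPython, a comparable measurement can be made with the standard library's timeit module. The sketch below mimics the "mean ± std. dev. of 7 runs, 100 loops each" report; the statement being timed is an arbitrary stand-in, and the way the spread is computed here is an assumption rather than a copy of what the magic does.

```python
import statistics
import timeit

# Setup code is executed but not timed, like the first line of a %%timeit cell.
setup = "data = list(range(1000, 0, -1))"
stmt = "sorted(data)"

runs, loops = 7, 100
totals = timeit.repeat(stmt, setup=setup, repeat=runs, number=loops)
per_loop = [t / loops for t in totals]   # seconds per single execution

mean = statistics.mean(per_loop)
spread = statistics.stdev(per_loop)
print(f"{mean * 1e6:.1f} µs ± {spread * 1e6:.1f} µs per loop "
      f"(mean ± std. dev. of {runs} runs, {loops} loops each)")
```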

IPython also creates aliases for a few common interpreters, such as bash, ruby, and perl.

These are all equivalent to %%script <name>.

In [29]:
%%ruby
puts "Hello from Ruby #{RUBY_VERSION}"
Hello from Ruby 2.3.3
In [30]:
%%bash
echo "hello from $BASH"
hello from /bin/bash
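
Conceptually, a script cell hands its body to another program and shows what that program prints. A minimal sketch of the idea with the subprocess module follows; `run_cell` is a hypothetical helper name, and the real magic offers more options than are shown here.

```python
import subprocess
import sys


def run_cell(interpreter, body):
    """Feed `body` to `interpreter` on standard input and return what it prints."""
    completed = subprocess.run(
        interpreter, input=body, capture_output=True, text=True, check=True
    )
    return completed.stdout


# With bash installed, run_cell(["bash"], 'echo "hello from $BASH"') would mirror
# the cell above; the Python interpreter is used here so the sketch runs anywhere.
output = run_cell([sys.executable, "-"], "print('hello from a subprocess')")
print(output)
```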

The rpy2 package provides an IPython extension (formerly IPython's rmagic) that contains some magic functions for working with R. This extension can be loaded using the %load_ext magic as follows:

In [35]:
%load_ext rpy2.ipython

If the above generates an error, it is likely that you do not have the rpy2 module installed. You can install this now via:

!pip install rpy2
In [37]:
x,y = np.arange(10), np.random.normal(size=10)
%R print(lm(rnorm(10)~rnorm(10)))
lm(formula = rnorm(10) ~ rnorm(10))


In [38]:
%%R -i x,y -o XYcoef
lm.fit <- lm(y~x)
print(summary(lm.fit))
XYcoef <- coef(lm.fit)
lm(formula = y ~ x)

     Min       1Q   Median       3Q      Max 
-2.40415 -0.46348 -0.08018  0.37039  2.11385 

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.24996    0.76438   0.327    0.752
x           -0.09938    0.14318  -0.694    0.507

Residual standard error: 1.301 on 8 degrees of freedom
Multiple R-squared:  0.0568,	Adjusted R-squared:  -0.0611 
F-statistic: 0.4818 on 1 and 8 DF,  p-value: 0.5073

In [39]:
XYcoef
FloatVector with 2 elements.
0.249965 -0.099384


In addition to MathJax support, you may declare a LaTeX cell using the %%latex cell magic:

In [40]:
%%latex
\begin{align}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{align}
\begin{align} \nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\ \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\ \nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\ \nabla \cdot \vec{\mathbf{B}} & = 0 \end{align}


Jupyter also enables objects to declare a JavaScript representation. At first, this may seem odd as output is inherently visual and JavaScript is a programming language. However, this opens the door for rich output that leverages the full power of JavaScript and associated libraries such as D3 for output.

In [41]:
%%javascript
alert("Hello world!");
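
An object declares such a representation by defining a specially named method that returns JavaScript source as a string. The class below is a hypothetical example, not part of any library; it relies on the `_repr_javascript_` hook of IPython's display protocol, and whether the script is actually executed depends on the frontend in use.

```python
import json


class Greeter:
    """Hypothetical object offering both a text and a JavaScript representation."""

    def __init__(self, message):
        self.message = message

    def __repr__(self):
        # Plain-text fallback for frontends that cannot run JavaScript.
        return f"Greeter({self.message!r})"

    def _repr_javascript_(self):
        # json.dumps produces a safely quoted JavaScript string literal.
        return f"alert({json.dumps(self.message)});"


greeter = Greeter("Hello world!")
print(greeter._repr_javascript_())  # alert("Hello world!");
```

Making a `Greeter` instance the last expression of a notebook cell would let the frontend pick the richest representation it supports.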

Exporting and Converting Notebooks

In Jupyter, one can convert an .ipynb notebook document file into various static formats via the nbconvert tool. Currently, nbconvert is a command line tool, run as a script using Jupyter.

In [42]:
!jupyter nbconvert --to html Section0-IPython_and_Jupyter.ipynb
[NbConvertApp] WARNING | pattern 'Section0-IPython_and_Jupyter.ipynb' matched no files
This application is used to convert notebook files (*.ipynb) to various other
formats.



Arguments that take values are actually convenience aliases to full
Configurables, whose aliases are listed on the help line. For more information
on full configurables, see '--help-all'.

--debug
    set log level to logging.DEBUG (maximize logging output)
--generate-config
    generate default config file
-y
    Answer yes to any questions instead of prompting.
--execute
    Execute the notebook prior to export.
--allow-errors
    Continue notebook execution even if one of the cells throws an error and include the error message in the cell output (the default behaviour is to abort conversion). This flag is only relevant if '--execute' was specified, too.
--stdin
    read a single notebook file from stdin. Write the resulting notebook with default basename 'notebook.*'
--stdout
    Write notebook output to stdout instead of files.
--inplace
    Run nbconvert in place, overwriting the existing notebook (only 
    relevant when converting to notebook format)
--clear-output
    Clear output of current file and save in place, 
    overwriting the existing notebook.
--no-prompt
    Exclude input and output prompts from converted document.
--log-level=<Enum> (Application.log_level)
    Default: 30
    Choices: (0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL')
    Set the log level by value or name.
--config=<Unicode> (JupyterApp.config_file)
    Default: ''
    Full path of a config file.
--to=<Unicode> (NbConvertApp.export_format)
    Default: 'html'
    The export format to be used, either one of the built-in formats, or a
    dotted object name that represents the import path for an `Exporter` class
--template=<Unicode> (TemplateExporter.template_file)
    Default: ''
    Name of the template file to use
--writer=<DottedObjectName> (NbConvertApp.writer_class)
    Default: 'FilesWriter'
    Writer class used to write the  results of the conversion
--post=<DottedOrNone> (NbConvertApp.postprocessor_class)
    Default: ''
    PostProcessor class used to write the results of the conversion
--output=<Unicode> (NbConvertApp.output_base)
    Default: ''
    overwrite base name use for output files. can only be used when converting
    one notebook at a time.
--output-dir=<Unicode> (FilesWriter.build_directory)
    Default: ''
    Directory to write output(s) to. Defaults to output to the directory of each
    notebook. To recover previous default behaviour (outputting to the current
    working directory) use . as the flag value.
--reveal-prefix=<Unicode> (SlidesExporter.reveal_url_prefix)
    Default: ''
    The URL prefix for reveal.js. This can be a a relative URL for a local copy
    of reveal.js, or point to a CDN.
    For speaker notes to work, a local reveal.js prefix must be used.
--nbformat=<Enum> (NotebookExporter.nbformat_version)
    Default: 4
    Choices: [1, 2, 3, 4]
    The nbformat version to write. Use this to downgrade notebooks.

To see all available configurables, use `--help-all`


    The simplest way to use nbconvert is
    > jupyter nbconvert mynotebook.ipynb
    which will convert mynotebook.ipynb to the default format (probably HTML).
    You can specify the export format with `--to`.
    Options include ['asciidoc', 'custom', 'html', 'html_ch', 'html_embed', 'html_toc', 'html_with_lenvs', 'html_with_toclenvs', 'latex', 'latex_with_lenvs', 'markdown', 'notebook', 'pdf', 'python', 'rst', 'script', 'selectLanguage', 'slides', 'slides_with_lenvs']
    > jupyter nbconvert --to latex mynotebook.ipynb
    Both HTML and LaTeX support multiple output templates. LaTeX includes
    'base', 'article' and 'report'.  HTML includes 'basic' and 'full'. You
    can specify the flavor of the format used.
    > jupyter nbconvert --to html --template basic mynotebook.ipynb
    You can also pipe the output to stdout, rather than a file
    > jupyter nbconvert mynotebook.ipynb --stdout
    PDF is generated via latex
    > jupyter nbconvert mynotebook.ipynb --to pdf
    You can get (and serve) a Reveal.js-powered slideshow
    > jupyter nbconvert myslides.ipynb --to slides --post serve
    Multiple notebooks can be given at the command line in a couple of 
    different ways:
    > jupyter nbconvert notebook*.ipynb
    > jupyter nbconvert notebook1.ipynb notebook2.ipynb
    or you can specify the notebooks list in a config file, containing::
        c.NbConvertApp.notebooks = ["my_notebook.ipynb"]
    > jupyter nbconvert --config mycfg.py

Currently, nbconvert supports HTML (default), LaTeX, Markdown, reStructuredText, Python and HTML5 slides for presentations. Some types can be post-processed, such as LaTeX to PDF (this requires Pandoc and a LaTeX distribution providing xelatex to be installed, however).
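
When conversions are scripted, it can help to assemble the command in Python first. The sketch below only builds the argument list from options that appear in the help text above (`--to` and `--output-dir`); `nbconvert_command` is a hypothetical helper name and nothing is executed.

```python
import shlex


def nbconvert_command(notebook, export_format="html", output_dir=None):
    """Build the argument list for a `jupyter nbconvert` call (nothing is run)."""
    command = ["jupyter", "nbconvert", "--to", export_format]
    if output_dir is not None:
        command.append(f"--output-dir={output_dir}")
    command.append(notebook)
    return command


cmd = nbconvert_command("mynotebook.ipynb", "latex")
print(shlex.join(cmd))  # jupyter nbconvert --to latex mynotebook.ipynb
```

Passing the list to subprocess.run(cmd, check=True) would then perform the conversion, provided Jupyter is installed and the notebook exists.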

In [43]:
!jupyter nbconvert --to pdf Section2_1-Introduction-to-Pandas.ipynb
[NbConvertApp] Converting notebook Section2_1-Introduction-to-Pandas.ipynb to pdf
[NbConvertApp] Writing 48153 bytes to notebook.tex
[NbConvertApp] Building PDF
[NbConvertApp] Running xelatex 3 times: ['xelatex', 'notebook.tex']
[NbConvertApp] Running bibtex 1 time: ['bibtex', 'notebook']
[NbConvertApp] WARNING | bibtex had problems, most likely because there were no citations
[NbConvertApp] PDF successfully created
[NbConvertApp] Writing 46106 bytes to Section2_1-Introduction-to-Pandas.pdf

A very useful online service is the IPython Notebook Viewer, which displays your notebook as a static HTML page that is easy to share with others:

In [44]:
%%html
<iframe src=http://nbviewer.ipython.org/2352771 width=700 height=300></iframe>

Also, GitHub renders Jupyter Notebooks stored in its repositories.

Reproducible Research

Reproducible research means reproducing conclusions from a single experiment based on the measurements from that experiment.

The most basic form of reproducibility is a complete description of the data and associated analyses (including code!) so the results can be exactly reproduced by others.

Reproducing calculations can be onerous, even with one's own work!

Scientific data are becoming larger and more complex, making simple descriptions inadequate for reproducibility. As a result, most modern research is irreproducible without tremendous effort.

Reproducible research is not yet part of the culture of science in general, or scientific computing in particular.

Scientific Computing Workflow

There are a number of steps to scientific endeavors that involve computing:


Many of the standard tools impose barriers between one or more of these steps. This can make it difficult to iterate on or reproduce work.

The Jupyter notebook eliminates or reduces these barriers to reproducibility.

IPython Notebook Viewer: displays static HTML versions of notebooks, and includes a gallery of notebook examples.

NotebookCloud: a service that allows you to launch and control IPython Notebook servers on Amazon EC2 from your browser.

A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data: a landmark example of reproducible research in genomics, with Git repo, IPython notebook, data and scripts.

Jacques Ravel and K Eric Wommack. 2014. All Hail Reproducibility in Microbiome Research. Microbiome, 2:8.

Benjamin Ragan-Kelley et al. 2013. Collaborative cloud-enabled tools allow rapid, reproducible biological insights. The ISME Journal, 7, 461–464. doi:10.1038/ismej.2012.123