**Christopher Fonnesbeck**

*Department of Biostatistics, Vanderbilt University School of Medicine*

Reproducible research: reproducing conclusions from a single experiment based on the measurements from that experiment.

The most basic form of reproducibility is a complete description of the data and associated analyses (including code!) so the results can be *exactly* reproduced by others.

Reproducing calculations can be onerous, even with one's own work!

Scientific data are becoming larger and more complex, making simple descriptions inadequate for reproducibility. As a result, most modern research is irreproducible without tremendous effort.

There are a number of steps to scientific endeavors that involve computing:

Many of the standard tools impose barriers between one or more of these steps, which can make it difficult to iterate and to reproduce work.

**IPython** is an enhanced Python shell which provides a more robust and productive development environment for users.

It includes the **HTML notebook** featured here, as well as support for **interactive data visualization** and easy high-performance **parallel computing**.

In [1]:

```
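import numpy as np
import matplotlib.pyplot as plt

def f(x):
    # Cubic polynomial, evaluated elementwise when x is a NumPy array
    return (x-3)*(x-5)*(x-7)+85

x = np.linspace(0, 10, 200)
y = f(x)
plt.plot(x, y)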
```

Out[1]:

The HTML notebook lets you document your workflow using either HTML or Markdown.

The IPython Notebook consists of two related components:

- A JSON-based Notebook document format for recording and distributing Python code and rich text.
- A web-based user interface for authoring and running notebook documents.
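As a rough illustration of the document format, the sketch below builds a tiny notebook as a plain Python dictionary and serializes it with the standard `json` module. The field names follow the publicly documented nbformat version 4 layout, which is an assumption here: other versions of the format arrange these fields differently, and the cell contents are invented for illustration.

```python
import json

# Hypothetical two-cell notebook, laid out per nbformat 4 (an assumption;
# other format versions arrange these fields differently).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["A short note written in *Markdown*."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not yet executed
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# Serialize to the text that would be stored in an .ipynb file.
text = json.dumps(notebook, indent=1)

# Reading it back yields the same structure: the document is plain JSON.
restored = json.loads(text)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)
```

Because the file is ordinary JSON, it can be inspected, diffed, and shared with the same tools used for any other text file.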

The Notebook can be used by starting the Notebook server with the command:

```
$ ipython notebook
```

This initiates an **IPython engine**, which is a Python instance that takes Python commands over a network connection.

The **IPython controller** provides an interface for working with a set of engines, to which one or more **IPython clients** can connect.

The Notebook gives you everything that a browser gives you. For example, you can embed images, videos, or entire websites.

In [2]:

```
from IPython.display import HTML
HTML("<iframe src=http://co-op.nashvl.org width=700 height=350></iframe>")
```

Out[2]:

In [3]:

```
from IPython.display import YouTubeVideo
YouTubeVideo("BS4Wd5rwNwE")
```

Out[3]:

Use `%load` to add remote code.

MathJax is a JavaScript library that renders LaTeX math, allowing equations to be embedded in HTML.

$$ \int_{a}^{b} f(x)\, dx \approx \frac{1}{2} \sum_{k=1}^{N} \left( x_{k} - x_{k-1} \right) \left( f(x_{k}) + f(x_{k-1}) \right). $$
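The formula above is the trapezoidal rule, and a minimal sketch of it in plain Python may make the sum concrete. The function name `trapezoid` and the choice of an equally spaced grid are assumptions made for this example; the formula itself allows unevenly spaced points.

```python
import math

def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal-width panels."""
    # Grid points x_0 = a, ..., x_n = b
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    total = 0.0
    for k in range(1, n + 1):
        # One term of the sum: (x_k - x_{k-1}) * (f(x_k) + f(x_{k-1}))
        total += (xs[k] - xs[k - 1]) * (f(xs[k]) + f(xs[k - 1]))
    return 0.5 * total

# The integral of sin(x) over [0, pi] is exactly 2.
approx = trapezoid(math.sin, 0.0, math.pi, 1000)
print(approx)
```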

SymPy is a Python library for symbolic mathematics. It supports:

- polynomials
- calculus
- solving equations
- discrete math
- matrices

In [4]:

```
from sympy import *
%load_ext sympyprinting
x, y = symbols("x y")
```

In [5]:

```
eq = ((x+y)**2 * (x+1))
eq
```

Out[5]:

In [6]:

```
expand(eq)
```

Out[6]:

In [7]:

```
(1/cos(x)).series(x, 0, 6)
```

Out[7]:

IPython has a set of predefined ‘magic functions’ that you can call with a command line style syntax. These include:

- `%run`
- `%edit`
- `%debug`
- `%timeit`
- `%paste`
- `%load_ext`

In [8]:

```
%lsmagic
```

The `timeit` magic times the execution of code; it exists in both line and cell form:

In [9]:

```
%timeit np.linalg.eigvals(np.random.rand(100,100))
```

In [10]:

```
%%timeit a = np.random.rand(100, 100)
np.linalg.eigvals(a)
```
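Outside IPython, the standard library `timeit` module gives comparable measurements. The sketch below is only an approximation of what the magic reports: the workload `sum_of_squares` is a hypothetical stand-in for the eigenvalue computation, and the loop and repeat counts are fixed by hand here rather than chosen automatically.

```python
import timeit

def sum_of_squares(n=1000):
    # Hypothetical workload standing in for the eigenvalue computation
    return sum(i * i for i in range(n))

# Run the function 100 times per trial, for 3 trials
trials = timeit.repeat(sum_of_squares, number=100, repeat=3)

# Keep the fastest trial and convert it to seconds per call
best_per_call = min(trials) / 100
print("best of 3: {0:.3e} s per call".format(best_per_call))
```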

IPython also creates aliases for a few common interpreters, such as bash, ruby, and perl.

These are all equivalent to `%%script <name>`.

In [11]:

```
%%ruby
puts "Hello from Ruby #{RUBY_VERSION}"
```

In [12]:

```
%%bash
echo "hello from $BASH"
```
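A script cell amounts, roughly, to launching the named interpreter as a child process and handing it the cell body. The sketch below imitates that with the standard `subprocess` module, under the assumption that the body is passed on standard input. It uses the current Python interpreter as the child so that it runs without bash or ruby installed, and `cell_body` is a hypothetical name for this example.

```python
import subprocess
import sys

# Body of a hypothetical cell, written for the Python interpreter
cell_body = 'print("hello from a child interpreter")\n'

# Launch the interpreter and pass the cell body on standard input;
# "-" tells Python to read its program from stdin.
result = subprocess.run(
    [sys.executable, "-"],
    input=cell_body,
    capture_output=True,
    text=True,
)
output = result.stdout
print(output, end="")
```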

IPython has an `rmagic` extension that contains some magic functions for working with R via rpy2. This extension can be loaded using the `%load_ext` magic as follows:

In [13]:

```
%load_ext rmagic
```

In [14]:

```
x, y = np.arange(10), np.random.normal(size=10)
```

In [15]:

```
%%R -i x,y -o XYcoef
lm.fit <- lm(y~x)
par(mfrow=c(2,2))
print(summary(lm.fit))
plot(lm.fit)
XYcoef <- coef(lm.fit)
```

In [16]:

```
XYcoef
```

Out[16]:

Before running the next cell, make sure you have started your cluster; you can use the Clusters tab in the dashboard to do so.

In [18]:

```
from IPython.parallel import Client
client = Client()
dv = client.direct_view()
```

In [19]:

```
len(dv)
```

Out[19]:

In [20]:

```
def where_am_i():
    # Imports live inside the function so they also run on each remote engine
    import os
    import socket
    return "In process with pid {0} on host: '{1}'".format(
        os.getpid(), socket.gethostname())
```

In [21]:

```
where_am_i_direct_results = dv.apply(where_am_i)
where_am_i_direct_results.get()
```
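The submit-then-collect pattern in the cell above can be imitated without a cluster using the standard library. The sketch below is an analogy only, not how the parallel machinery is implemented: it uses threads inside one process instead of separate engines, so every result reports the same pid, and `n_workers` is a hypothetical stand-in for the number of engines.

```python
import os
import socket
from concurrent.futures import ThreadPoolExecutor

def where_am_i():
    return "In process with pid {0} on host: '{1}'".format(
        os.getpid(), socket.gethostname())

n_workers = 4  # hypothetical stand-in for the number of engines

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    # submit() returns immediately with a Future, playing the role of
    # the asynchronous result object in the cell above
    futures = [pool.submit(where_am_i) for _ in range(n_workers)]
    # result() blocks until the call has finished, playing the role of get()
    results = [future.result() for future in futures]

print(results)
```

With real engines on several machines, the same call would return a different pid, and possibly a different host name, for each engine.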

Out[21]:

**IPython Notebook Viewer**: Displays static HTML versions of notebooks, and includes a gallery of notebook examples.

**NotebookCloud**: A service that allows you to launch and control IPython Notebook servers on Amazon EC2 from your browser.

**A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data**: A landmark example of reproducible research in genomics: Git repo, IPython notebook, data and scripts.