ADSC, Singapore May 23rd by Jonas Arnfred
For scientific computing, fast iterations and rapid prototyping are essential for designing experiments and working with data. When we can see the result of what we are doing right away, it is a lot simpler to adjust algorithm parameters and fine-tune plots to suit our purpose. IPython is an interactive Python console with a browser-based notebook that comes with support for code, text, mathematical expressions and inline plots. It is made with scientific computing in mind and allows for a MATLAB-like iterative approach to working with data. In this presentation I will give an introduction to IPython and showcase how it can be used for working with data, plotting and fast prototyping.
This tutorial will cover basic principles of using IPython for development and scientific research. I will assume a basic familiarity with Python coding in general, although I think the code examples should be easy enough to understand for those who don't have it. If you have questions, you are of course welcome to interrupt me and ask me to clarify something.
I will start with a few notes on installing IPython before I introduce the basic features and showcase inline plotting. Then I will go through a small example computing the Mandelbrot set to showcase my normal workflow in IPython. Finally we will have a look at plotting data from the Mandelbrot set using the Pandas and ggplot libraries.
On Debian/Ubuntu you can install IPython by first installing pip:
sudo apt-get install python-pip
Then, using pip, you can easily install IPython and its dependencies:
sudo pip install ipython[notebook]
For pylab (plotting) you will need numpy and matplotlib:
sudo pip install numpy matplotlib
On other platforms you can find instructions on the IPython installation page.
IPython consists of a few different shells that make it easy to enter python and evaluate python expressions. In this tutorial we will focus on the IPython Notebook which is a browser based shell with inline graphics. To open IPython Notebook, open a terminal and navigate to your project directory, then type:
ipython notebook --pylab inline
The '--pylab inline' option means that any command creating a pylab plot or showing an image will display its output inline in the notebook rather than in a separate window.
After entering the command, IPython opens a browser window (or a new tab in an already open browser) showing the directory page, from where you can load saved IPython notebooks or create a new one. When you create a new notebook you are met with a page featuring a text input (a cell) in which you can enter any Python code and have it evaluated by pressing the run button or by pressing Ctrl + Enter:
n = 4+8
"I would like %i shrubberies" % n
'I would like 12 shrubberies'
To insert a new cell below, open the Insert menu and click 'Insert Cell Below' or press Ctrl + m and then b (as in below). You can change any cell and reevaluate it. Any variable assignment that you evaluate will be available in all other cells:
"'n' declared above has a value of %i" % n
"'n' declared above has a value of 12"
Besides Python code, a cell can also contain Markdown-formatted text (like this one). Markdown is a simple syntax for formatting text documents which makes it easy to include links, lists, code examples etc. You can find a brief overview of the Markdown syntax here. In Markdown cells you can also include $\LaTeX$ expressions by wrapping the text in dollar signs (the input \$\LaTeX\$ is rendered as $\LaTeX$), as well as HTML. The input <span style="color:red">Some text</span> is rendered like: Some text
import numpy
%timeit numpy.sqrt(numpy.ones((400, 400)))
100 loops, best of 3: 2.67 ms per loop
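The `%timeit` magic only exists inside IPython. As a rough sketch, a comparable measurement in a plain Python script could use the standard library's `timeit` module. The loop and repeat counts below are fixed by hand as an assumption; `%timeit` chooses them automatically.

```python
import timeit

# Run the expression 100 times per measurement and take 3 measurements.
runs = timeit.repeat(
    stmt="numpy.sqrt(numpy.ones((400, 400)))",
    setup="import numpy",
    repeat=3,
    number=100,
)

# Report the fastest measurement, converted to milliseconds per call.
best_ms_per_loop = min(runs) / 100 * 1000
print("100 loops, best of 3: %.2f ms per loop" % best_ms_per_loop)
```

The timing you get will of course depend on your machine.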
Since we've loaded IPython Notebook with pylab inline, plotting a figure is as simple as calling 'plot', which uses matplotlib:
values = numpy.arange(30)**2
plot(values)
[<matplotlib.lines.Line2D at 0x7ff867114390>]
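The bare `plot` call works because pylab mode imports matplotlib's plotting functions into the notebook namespace. Outside pylab mode, a minimal sketch of the same plot with explicit imports might look as follows; the `Agg` backend is only chosen here so that the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import numpy

values = numpy.arange(30) ** 2

fig, ax = plt.subplots()
lines = ax.plot(values)  # returns a list containing one Line2D object
# In an interactive session, plt.show() would now display the figure.
```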
As an example I will iteratively go through the example of coding up the algorithm that computes the Mandelbrot fractal. This example probably makes more sense in the live tutorial than in the notes, but in case anyone is interested I've kept it here.
For the definitions of the Mandelbrot set I take the liberty of quoting wikipedia:
The Mandelbrot set is a mathematical set of points whose boundary is a distinctive and easily recognizable two-dimensional fractal shape. The set is closely related to Julia sets (which include similarly complex shapes) and is named after the mathematician Benoit Mandelbrot, who studied and popularized it.
Mandelbrot set images are made by sampling complex numbers and determining for each whether the result tends towards infinity when a particular mathematical operation is iterated on it. Treating the real and imaginary parts of each number as image coordinates, pixels are colored according to how rapidly the sequence diverges, if at all.
More precisely, the Mandelbrot set is the set of values of c in the complex plane for which the orbit of 0 under iteration of the complex quadratic polynomial $z_{n+1}=z_n^2+c$ remains bounded.
This means that for an $m \times n$ grid of complex numbers $c$ going from, say, $-x - iy$ to $x + iy$, we apply the function $z_{n+1}=z_n^2+c$ repeatedly and observe when each value diverges. We then create an $m \times n$ image where each pixel records how many iterations it took before the corresponding complex value in the grid diverged.
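Before moving to a whole grid, it may help to see the iteration for a single value of $c$. The sketch below uses a hypothetical helper, `escape_time`, and relies on the standard result that a point whose norm exceeds 2 is guaranteed to diverge.

```python
def escape_time(c, max_iterations=30):
    """Return the iteration index at which |z| first exceeds 2,
    or max_iterations if that never happens."""
    z = 0
    for i in range(max_iterations):
        z = z * z + c
        if abs(z) > 2:
            return i
    return max_iterations

# c = 1 gives the orbit 1, 2, 5, ...: the norm exceeds 2 at index 2.
print(escape_time(1))
# c = -1 gives the orbit -1, 0, -1, 0, ...: it stays bounded.
print(escape_time(-1))
```

The first call prints 2 and the second prints 30, the maximum, which is how points that appear to belong to the set are marked.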
To do this, we need a grid of evenly spaced complex numbers to start with. The most straightforward way (I know of) in numpy is to create two arrays, a row of real parts and a column of imaginary parts, and add them as follows:
import numpy
t1 = numpy.linspace(0,1,5).reshape((1, 5))
t2 = numpy.linspace(1,3,3).reshape((3, 1))
print(t1)
print(t2)
t1 + t2*1j
[[ 0.    0.25  0.5   0.75  1.  ]]
[[ 1.]
 [ 2.]
 [ 3.]]
array([[ 0.00+1.j,  0.25+1.j,  0.50+1.j,  0.75+1.j,  1.00+1.j],
       [ 0.00+2.j,  0.25+2.j,  0.50+2.j,  0.75+2.j,  1.00+2.j],
       [ 0.00+3.j,  0.25+3.j,  0.50+3.j,  0.75+3.j,  1.00+3.j]])
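The addition works because of numpy broadcasting: a (1, 5) array plus a (3, 1) array yields a (3, 5) array. As a small check, the sketch below builds the same grid a second time with `numpy.meshgrid` and compares the two; the variable names are my own.

```python
import numpy

row = numpy.linspace(0, 1, 5).reshape((1, 5))     # real parts, shape (1, 5)
column = numpy.linspace(1, 3, 3).reshape((3, 1))  # imaginary parts, shape (3, 1)

# Broadcasting stretches both operands to the common shape (3, 5).
grid = row + column * 1j
print(grid.shape)

# The same grid built explicitly with meshgrid, for comparison.
real, imag = numpy.meshgrid(numpy.linspace(0, 1, 5), numpy.linspace(1, 3, 3))
print(numpy.array_equal(grid, real + imag * 1j))
```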
We can create a grid of size $m \times n = 600 \times 600$ with values ranging from $-2.3$ to $1$ for the real part and $-1.4$ to $1.4$ for the imaginary part:
m = 600 # Height of plot
n = 600 # Width of plot
values_real = numpy.linspace(-2.3, 1, n).reshape((1,n))
values_imag = numpy.linspace(-1.4, 1.4, m).reshape((m,1))
initial_values = values_real + values_imag*1j
initial_values
array([[-2.30000000-1.4j       , -2.29449082-1.4j       , -2.28898164-1.4j       , ...,
         0.98898164-1.4j       ,  0.99449082-1.4j       ,  1.00000000-1.4j       ],
       [-2.30000000-1.39532554j, -2.29449082-1.39532554j, -2.28898164-1.39532554j, ...,
         0.98898164-1.39532554j,  0.99449082-1.39532554j,  1.00000000-1.39532554j],
       [-2.30000000-1.39065109j, -2.29449082-1.39065109j, -2.28898164-1.39065109j, ...,
         0.98898164-1.39065109j,  0.99449082-1.39065109j,  1.00000000-1.39065109j],
       ...,
       [-2.30000000+1.39065109j, -2.29449082+1.39065109j, -2.28898164+1.39065109j, ...,
         0.98898164+1.39065109j,  0.99449082+1.39065109j,  1.00000000+1.39065109j],
       [-2.30000000+1.39532554j, -2.29449082+1.39532554j, -2.28898164+1.39532554j, ...,
         0.98898164+1.39532554j,  0.99449082+1.39532554j,  1.00000000+1.39532554j],
       [-2.30000000+1.4j       , -2.29449082+1.4j       , -2.28898164+1.4j       , ...,
         0.98898164+1.4j       ,  0.99449082+1.4j       ,  1.00000000+1.4j       ]])
Let's now apply the function $z_{n+1}=z_n^2+c$ to this grid. For each round we want to know which values are about to diverge. Because I don't have the patience to iterate an infinite number of rounds, we simply test whether each value is bigger than a certain threshold. Conveniently enough, Wikipedia notes that we only need to test whether the norm of the value is above 2. For a complex number $z = x + iy$ the norm is $\sqrt{x^2 + y^2} = \sqrt{z \cdot \overline{z}}$, so the code tests the squared norm against 4. For each iteration we log which numbers have newly diverged in the iterations matrix. We can then display this as a heatmap using matplotlib's imshow function:
values = initial_values
max_iterations = 30
iterations = numpy.ones(initial_values.shape) * max_iterations
for i in range(max_iterations):
    values = values**2 + initial_values
    divergent = (values * conj(values)).real > 4 # Squared norm above 4, i.e. norm above 2
    divergent = divergent & (iterations == max_iterations) # Test that we haven't already found this number
    iterations[divergent] = i
imshow(iterations)
-c:6: RuntimeWarning: overflow encountered in multiply
-c:6: RuntimeWarning: invalid value encountered in multiply
-c:5: RuntimeWarning: overflow encountered in square
-c:5: RuntimeWarning: invalid value encountered in square
<matplotlib.image.AxesImage at 0x7ff866ec1e90>
Let's set matplotlib defaults to a bigger plot size:
rcParams['figure.figsize'] = 10, 10
imshow(iterations)
<matplotlib.image.AxesImage at 0x7ff866dfea10>
This is a bit cumbersome for experimentation though, so let's define a function so we can more easily play around with the values:
def mandelbrot(width, height, x_lim=(-2.3, 1), y_lim=(-1.4, 1.4), max_iterations=30):
    m = height # Height of plot
    n = width # Width of plot
    values_real = numpy.linspace(x_lim[0], x_lim[1], n).reshape((1,n))
    values_imag = numpy.linspace(y_lim[0], y_lim[1], m).reshape((m,1))
    initial_values = values_real + values_imag*1j
    # The grid values serve both as c and as the starting point of the iteration
    values = initial_values
    iterations = numpy.ones(initial_values.shape) * max_iterations
    for i in range(max_iterations):
        values = values**2 + initial_values
        divergent = (values * conj(values)).real > 4 # Squared norm above 4, i.e. norm above 2
        divergent = divergent & (iterations == max_iterations) # Test that we haven't already found this number
        iterations[divergent] = i
    return iterations
Nice, so let's try to zoom in a bit:
mandelbrot_data = mandelbrot(600, 600, (-0.56, -0.55), (-0.56,-0.55), 90)
imshow(mandelbrot_data)
-c:12: RuntimeWarning: overflow encountered in multiply
-c:12: RuntimeWarning: invalid value encountered in multiply
-c:11: RuntimeWarning: overflow encountered in square
-c:11: RuntimeWarning: invalid value encountered in square
<matplotlib.image.AxesImage at 0x7ff8655f0250>
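When zooming repeatedly it can be easier to think in terms of a center point and a window width than in terms of explicit limits. The helper below, `zoom_window`, is a hypothetical convenience function of my own that turns one into the other.

```python
def zoom_window(center, width):
    """Return (x_lim, y_lim) for a square window of the given width
    around a complex center point."""
    half = width / 2.0
    x_lim = (center.real - half, center.real + half)
    y_lim = (center.imag - half, center.imag + half)
    return x_lim, y_lim

# Approximately the window used in the zoom above.
x_lim, y_lim = zoom_window(-0.555 - 0.555j, 0.01)
print(x_lim, y_lim)
```

The two tuples can then be passed on, for example as mandelbrot(600, 600, x_lim, y_lim, 90).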
To finish, I would like to give a brief overview of data exploration using the Pandas library together with ggplot for Python. Pandas is a library, inspired by R data frames, that provides easy-to-use data structures and data analysis tools for Python. Similarly, ggplot is a plotting library originally developed for R and later partially ported to Python. Both can be installed with pip:
pip install pandas ggplot
To keep with the theme of the Mandelbrot set, we'll take a look at the data produced by our mandelbrot function and plot it in various ways. First though, we'll need to construct a Pandas DataFrame with the relevant information:
import pandas, ggplot
# Each row (line) of the image becomes one column of the DataFrame
mandelbrot_df = pandas.DataFrame({ i : f for i,f in enumerate(mandelbrot_data) })
mandelbrot_df
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | ... |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 90 | 75 | 70 | 69 | 68 | 68 | 68 | 70 | 67 | 66 | 66 | 66 | 64 | 63 | 63 | 62 | 62 | 62 | 62 | 63 | ... |
1 | 73 | 85 | 68 | 68 | 67 | 67 | 67 | 66 | 65 | 65 | 65 | 64 | 63 | 63 | 62 | 62 | 62 | 62 | 62 | 62 | ... |
2 | 70 | 69 | 68 | 67 | 67 | 66 | 66 | 65 | 65 | 64 | 64 | 63 | 63 | 62 | 62 | 62 | 62 | 62 | 61 | 61 | ... |
3 | 69 | 69 | 68 | 67 | 67 | 66 | 66 | 65 | 65 | 64 | 64 | 63 | 63 | 62 | 62 | 62 | 62 | 61 | 61 | 61 | ... |
4 | 69 | 69 | 68 | 67 | 66 | 66 | 65 | 65 | 64 | 64 | 64 | 63 | 63 | 62 | 62 | 62 | 61 | 61 | 61 | 60 | ... |
5 | 69 | 69 | 68 | 67 | 67 | 66 | 66 | 65 | 64 | 64 | 64 | 63 | 63 | 62 | 62 | 61 | 61 | 61 | 61 | 60 | ... |
6 | 69 | 69 | 69 | 68 | 67 | 66 | 66 | 65 | 64 | 64 | 64 | 63 | 63 | 62 | 62 | 61 | 61 | 61 | 60 | 60 | ... |
7 | 71 | 70 | 70 | 73 | 71 | 67 | 66 | 65 | 64 | 64 | 64 | 63 | 63 | 63 | 62 | 61 | 61 | 61 | 60 | 60 | ... |
8 | 74 | 71 | 71 | 75 | 90 | 69 | 67 | 66 | 65 | 64 | 64 | 64 | 63 | 63 | 62 | 61 | 61 | 61 | 60 | 60 | ... |
9 | 79 | 73 | 74 | 85 | 81 | 90 | 80 | 87 | 66 | 65 | 65 | 65 | 65 | 64 | 63 | 62 | 61 | 61 | 60 | 60 | ... |
10 | 82 | 76 | 76 | 90 | 90 | 90 | 77 | 69 | 67 | 66 | 66 | 66 | 70 | 69 | 90 | 62 | 61 | 61 | 60 | 60 | ... |
11 | 90 | 90 | 80 | 81 | 87 | 90 | 90 | 86 | 68 | 67 | 67 | 68 | 70 | 90 | 78 | 63 | 62 | 61 | 60 | 60 | ... |
12 | 90 | 90 | 86 | 90 | 90 | 90 | 74 | 71 | 69 | 68 | 69 | 90 | 82 | 90 | 67 | 64 | 63 | 62 | 61 | 60 | ... |
13 | 90 | 90 | 90 | 90 | 90 | 79 | 76 | 72 | 70 | 70 | 71 | 75 | 90 | 76 | 90 | 90 | 64 | 63 | 61 | 60 | ... |
14 | 90 | 90 | 90 | 90 | 90 | 83 | 90 | 87 | 86 | 71 | 72 | 87 | 90 | 84 | 84 | 90 | 72 | 90 | 64 | 61 | ... |
15 | 90 | 88 | 90 | 90 | 90 | 88 | 90 | 90 | 84 | 74 | 74 | 76 | 90 | 90 | 90 | 78 | 90 | 67 | 63 | 61 | ... |
16 | 90 | 83 | 90 | 90 | 90 | 90 | 90 | 90 | 81 | 77 | 77 | 78 | 81 | 90 | 89 | 90 | 70 | 64 | 62 | 62 | ... |
17 | 78 | 80 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 86 | 90 | 89 | 90 | 90 | 76 | 90 | 77 | 66 | 62 | 62 | ... |
18 | 76 | 80 | 84 | 90 | 90 | 90 | 90 | 90 | 90 | 89 | 90 | 90 | 90 | 89 | 74 | 70 | 69 | 66 | 63 | 62 | ... |
19 | 79 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 77 | 70 | 68 | 66 | 65 | 64 | 63 | ... |
20 | 74 | 77 | 90 | 90 | 90 | 90 | 86 | 90 | 90 | 90 | 90 | 84 | 79 | 74 | 72 | 69 | 66 | 65 | 64 | 63 | ... |
21 | 72 | 90 | 90 | 90 | 90 | 90 | 82 | 90 | 90 | 90 | 90 | 83 | 79 | 90 | 90 | 72 | 67 | 65 | 64 | 64 | ... |
22 | 70 | 72 | 72 | 75 | 90 | 84 | 79 | 90 | 90 | 90 | 90 | 90 | 81 | 90 | 88 | 86 | 72 | 70 | 66 | 65 | ... |
23 | 69 | 70 | 71 | 72 | 76 | 76 | 79 | 85 | 90 | 90 | 90 | 86 | 84 | 89 | 90 | 90 | 77 | 69 | 67 | 66 | ... |
24 | 69 | 69 | 70 | 71 | 73 | 74 | 84 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 86 | 90 | 90 | 68 | 67 | ... |
25 | 68 | 69 | 70 | 71 | 73 | 75 | 79 | 85 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 80 | 78 | 73 | 69 | 69 | ... |
26 | 68 | 68 | 70 | 85 | 86 | 81 | 79 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 75 | 73 | 71 | 70 | ... |
27 | 67 | 67 | 68 | 90 | 80 | 90 | 84 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 77 | 78 | 76 | 90 | ... |
28 | 66 | 67 | 67 | 69 | 73 | 86 | 90 | 90 | 90 | 88 | 88 | 87 | 90 | 90 | 90 | 90 | 80 | 87 | 90 | 90 | ... |
29 | 66 | 66 | 67 | 70 | 90 | 83 | 90 | 90 | 90 | 88 | 84 | 84 | 90 | 90 | 90 | 90 | 86 | 85 | 90 | 90 | ... |
30 | 65 | 66 | 66 | 83 | 80 | 90 | 83 | 90 | 90 | 90 | 81 | 82 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
31 | 65 | 65 | 65 | 66 | 68 | 69 | 76 | 80 | 90 | 90 | 80 | 79 | 82 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
32 | 64 | 64 | 65 | 65 | 66 | 68 | 72 | 90 | 90 | 78 | 76 | 78 | 82 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
33 | 64 | 64 | 64 | 65 | 66 | 67 | 72 | 76 | 76 | 74 | 75 | 79 | 90 | 90 | 90 | 90 | 90 | 85 | 79 | 90 | ... |
34 | 64 | 64 | 64 | 65 | 66 | 66 | 68 | 70 | 71 | 72 | 74 | 78 | 90 | 90 | 90 | 90 | 90 | 90 | 76 | 74 | ... |
35 | 63 | 64 | 64 | 65 | 66 | 66 | 68 | 69 | 70 | 73 | 79 | 81 | 90 | 90 | 90 | 90 | 90 | 90 | 77 | 74 | ... |
36 | 63 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 71 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 87 | 90 | 77 | 76 | ... |
37 | 62 | 63 | 64 | 66 | 72 | 70 | 71 | 70 | 71 | 90 | 90 | 90 | 90 | 90 | 90 | 89 | 84 | 81 | 78 | 78 | ... |
38 | 62 | 62 | 64 | 90 | 87 | 86 | 76 | 72 | 73 | 76 | 81 | 89 | 90 | 90 | 90 | 90 | 90 | 81 | 80 | 81 | ... |
39 | 62 | 62 | 64 | 67 | 90 | 90 | 90 | 75 | 75 | 81 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 84 | 83 | ... |
40 | 62 | 62 | 64 | 65 | 69 | 90 | 90 | 78 | 79 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 87 | ... |
41 | 61 | 62 | 64 | 66 | 69 | 75 | 90 | 82 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
42 | 61 | 63 | 90 | 76 | 87 | 75 | 90 | 88 | 90 | 90 | 90 | 88 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
43 | 61 | 63 | 71 | 76 | 87 | 90 | 85 | 90 | 90 | 89 | 85 | 84 | 86 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
44 | 60 | 61 | 63 | 66 | 68 | 90 | 90 | 90 | 90 | 90 | 85 | 81 | 81 | 82 | 88 | 90 | 90 | 90 | 90 | 90 | ... |
45 | 60 | 61 | 62 | 63 | 66 | 90 | 90 | 90 | 87 | 90 | 84 | 80 | 79 | 80 | 82 | 86 | 90 | 90 | 90 | 90 | ... |
46 | 60 | 61 | 61 | 63 | 64 | 68 | 73 | 79 | 90 | 90 | 90 | 78 | 78 | 80 | 90 | 88 | 90 | 90 | 90 | 87 | ... |
47 | 60 | 61 | 61 | 62 | 65 | 81 | 90 | 90 | 90 | 86 | 78 | 76 | 76 | 79 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
48 | 60 | 61 | 61 | 62 | 90 | 90 | 90 | 90 | 77 | 90 | 75 | 74 | 75 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
49 | 60 | 61 | 61 | 62 | 65 | 71 | 69 | 84 | 74 | 71 | 72 | 73 | 74 | 77 | 86 | 90 | 90 | 90 | 90 | 90 | ... |
50 | 61 | 61 | 62 | 63 | 63 | 65 | 66 | 68 | 69 | 70 | 71 | 73 | 75 | 77 | 82 | 86 | 90 | 90 | 90 | 90 | ... |
51 | 61 | 62 | 62 | 63 | 63 | 65 | 66 | 67 | 68 | 69 | 71 | 74 | 90 | 80 | 84 | 88 | 90 | 90 | 90 | 90 | ... |
52 | 62 | 62 | 63 | 63 | 64 | 64 | 66 | 67 | 69 | 70 | 73 | 87 | 90 | 86 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
53 | 66 | 67 | 64 | 64 | 64 | 65 | 70 | 71 | 74 | 72 | 78 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
54 | 76 | 90 | 66 | 65 | 66 | 67 | 90 | 90 | 77 | 75 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 88 | 84 | 83 | ... |
55 | 90 | 77 | 67 | 67 | 67 | 78 | 90 | 90 | 90 | 86 | 82 | 85 | 89 | 90 | 90 | 90 | 90 | 88 | 90 | 80 | ... |
56 | 90 | 90 | 70 | 68 | 69 | 73 | 88 | 83 | 89 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | ... |
57 | 90 | 75 | 71 | 69 | 70 | 72 | 75 | 79 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 88 | 74 | ... |
58 | 90 | 86 | 72 | 73 | 90 | 78 | 78 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 88 | 85 | 90 | 90 | 72 | ... |
59 | 90 | 76 | 75 | 78 | 89 | 89 | 90 | 90 | 90 | 90 | 90 | 90 | 90 | 86 | 85 | 80 | 88 | 88 | 74 | 71 | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
600 rows × 600 columns
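Note the orientation of this table: because the dictionary is keyed by image row, each image row ends up as a column of the DataFrame. A tiny stand-in array (made up for illustration) shows the effect.

```python
import numpy
import pandas

# Stand-in for the image: 2 rows, 3 columns.
small = numpy.array([[1, 2, 3],
                     [4, 5, 6]])

df = pandas.DataFrame({i: row for i, row in enumerate(small)})
print(df.shape)        # 3 rows and 2 columns: the image has been transposed
print(df[1].tolist())  # column 1 holds image row 1
print(numpy.array_equal(df.values, small.T))
```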
This gives us a data structure that is easy to work with using Pandas' functionality. For the moment though, we are mostly interested in creating a narrow table with x and y values to plot the data. For this purpose we pick a few image rows (stored as columns of the DataFrame) and use Pandas' melt function to stack them into a single column of values, with a second column recording which line each value came from:
select_rows_df = pandas.melt(mandelbrot_df[[100,200,300,400]], var_name="line", value_name="iteration") # picking out row 100, 200, 300 and 400 for inspection
select_rows_df
 | line | iteration |
---|---|---|
0 | 100 | 90 |
1 | 100 | 62 |
2 | 100 | 61 |
3 | 100 | 61 |
4 | 100 | 60 |
5 | 100 | 60 |
6 | 100 | 59 |
7 | 100 | 59 |
8 | 100 | 59 |
9 | 100 | 59 |
10 | 100 | 59 |
11 | 100 | 59 |
12 | 100 | 59 |
13 | 100 | 59 |
14 | 100 | 59 |
15 | 100 | 60 |
16 | 100 | 60 |
17 | 100 | 60 |
18 | 100 | 61 |
19 | 100 | 62 |
20 | 100 | 65 |
21 | 100 | 78 |
22 | 100 | 85 |
23 | 100 | 90 |
24 | 100 | 77 |
25 | 100 | 90 |
26 | 100 | 90 |
27 | 100 | 90 |
28 | 100 | 90 |
29 | 100 | 78 |
30 | 100 | 79 |
31 | 100 | 74 |
32 | 100 | 86 |
33 | 100 | 61 |
34 | 100 | 60 |
35 | 100 | 59 |
36 | 100 | 58 |
37 | 100 | 57 |
38 | 100 | 56 |
39 | 100 | 55 |
40 | 100 | 55 |
41 | 100 | 55 |
42 | 100 | 55 |
43 | 100 | 54 |
44 | 100 | 54 |
45 | 100 | 54 |
46 | 100 | 54 |
47 | 100 | 54 |
48 | 100 | 54 |
49 | 100 | 55 |
50 | 100 | 55 |
51 | 100 | 56 |
52 | 100 | 90 |
53 | 100 | 90 |
54 | 100 | 90 |
55 | 100 | 74 |
56 | 100 | 60 |
57 | 100 | 90 |
58 | 100 | 90 |
59 | 100 | 58 |
... | ... | ... |
2400 rows × 2 columns
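To make the reshaping concrete, here is a minimal sketch of melt on a toy table with two columns of three values each. The numbers are only illustrative.

```python
import pandas

# Wide format: one column per selected line.
wide = pandas.DataFrame({100: [90, 62, 61], 200: [55, 54, 54]})

# Long format: one row per value, labelled with the line it came from.
narrow = pandas.melt(wide, var_name="line", value_name="iteration")
print(narrow)
```

Two columns of three values become six rows, just as four columns of 600 values become the 2400 rows above.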
Now we can take a look at the histogram of iteration values for these 4 rows by using ggplot and the geom_histogram method:
from ggplot import *
ggplot(select_rows_df, aes(x="iteration")) + geom_histogram(binwidth = 5)
<ggplot: (8794026754077)>
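To look at the numbers behind such a histogram without a plotting library, the counts can be computed with `numpy.histogram`. This is a sketch on the first eight iteration values from the table above; where the bin edges start is my own assumption and ggplot may place them differently.

```python
import numpy

# First eight values of the "iteration" column shown earlier.
iteration = numpy.array([90, 62, 61, 61, 60, 60, 59, 59])

binwidth = 5
# Bin edges from the smallest value upwards in steps of binwidth.
edges = numpy.arange(iteration.min(), iteration.max() + binwidth, binwidth)
counts, edges = numpy.histogram(iteration, bins=edges)

for left, count in zip(edges[:-1], counts):
    print("%d to %d: %d" % (left, left + binwidth, count))
```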
How about looking at the densities for different lines? We can group the values by the 'line' variable and display the histogram as a density plot:
ggplot(select_rows_df, aes(x="iteration", color="line", fill="line")) + geom_density(alpha=0.3)
<ggplot: (8794026732405)>
Finally how about we show a line plot illustrating the mean of the iteration values as we move down the image line by line?
accumulated_rows_df = mandelbrot_df.mean().reset_index()
accumulated_rows_df
 | index | 0 |
---|---|---|
0 | 0 | 72.740000 |
1 | 1 | 72.440000 |
2 | 2 | 72.031667 |
3 | 3 | 72.081667 |
4 | 4 | 72.193333 |
5 | 5 | 72.008333 |
6 | 6 | 71.663333 |
7 | 7 | 71.410000 |
8 | 8 | 71.325000 |
9 | 9 | 70.953333 |
10 | 10 | 70.566667 |
11 | 11 | 70.483333 |
12 | 12 | 70.086667 |
13 | 13 | 70.195000 |
14 | 14 | 69.980000 |
15 | 15 | 69.848333 |
16 | 16 | 69.741667 |
17 | 17 | 69.766667 |
18 | 18 | 69.010000 |
19 | 19 | 68.685000 |
20 | 20 | 68.690000 |
21 | 21 | 69.113333 |
22 | 22 | 68.986667 |
23 | 23 | 68.983333 |
24 | 24 | 68.738333 |
25 | 25 | 69.246667 |
26 | 26 | 69.381667 |
27 | 27 | 69.458333 |
28 | 28 | 69.316667 |
29 | 29 | 69.453333 |
30 | 30 | 69.826667 |
31 | 31 | 70.451667 |
32 | 32 | 70.235000 |
33 | 33 | 70.575000 |
34 | 34 | 70.766667 |
35 | 35 | 71.000000 |
36 | 36 | 71.133333 |
37 | 37 | 71.493333 |
38 | 38 | 71.570000 |
39 | 39 | 71.090000 |
40 | 40 | 70.578333 |
41 | 41 | 70.438333 |
42 | 42 | 70.356667 |
43 | 43 | 70.035000 |
44 | 44 | 69.601667 |
45 | 45 | 69.190000 |
46 | 46 | 69.500000 |
47 | 47 | 69.375000 |
48 | 48 | 69.088333 |
49 | 49 | 68.985000 |
50 | 50 | 68.501667 |
51 | 51 | 68.260000 |
52 | 52 | 68.485000 |
53 | 53 | 68.431667 |
54 | 54 | 68.435000 |
55 | 55 | 68.381667 |
56 | 56 | 68.756667 |
57 | 57 | 68.773333 |
58 | 58 | 68.458333 |
59 | 59 | 67.991667 |
... | ... | ... |
600 rows × 2 columns
ggplot(accumulated_rows_df, aes(x="index", y=0)) + geom_line(position="jitter")
<ggplot: (8794053609001)>
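The column named 0 in the table above is simply the default name that reset_index gives to the values of an unnamed Series. As a sketch on a toy array, the same per-line mean can be checked against numpy and given more descriptive column names; `row` and `mean_iteration` are names I chose for illustration.

```python
import numpy
import pandas

# Stand-in for the image: 2 rows, 3 columns.
small = numpy.array([[1.0, 2.0, 3.0],
                     [4.0, 6.0, 8.0]])
df = pandas.DataFrame({i: row for i, row in enumerate(small)})

# Mean of every DataFrame column, i.e. of every image row.
per_row_mean = df.mean().reset_index()
per_row_mean.columns = ["row", "mean_iteration"]
print(per_row_mean)

# The same numbers straight from numpy.
print(numpy.allclose(per_row_mean["mean_iteration"], small.mean(axis=1)))
```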
For ggplot you can find more plotting options here. Pandas is a very powerful data analysis tool and we've barely scratched the surface. To get a glimpse of what you can do with Pandas I suggest taking a look at the 10 minute tour of Pandas.