Back to the main Index

How to use IPython widgets to analyze pandas DataFrames

Pandas DataFrames are a powerful and convenient tool for analyzing the results of multiple calculations or a large dataset. Several AbiPy objects (e.g. robots) can construct DataFrames, and one can then use the pandas plot method for quick plots or the seaborn package for more advanced visualizations.

Unfortunately, this approach becomes a bit tedious when we have to deal with several columns or when we need to customize the plot and we are not pandas/seaborn gurus.

For this reason, AbiPy provides a set of IPython widgets that wrap the DataFrame plot method and the seaborn API. The widgets are defined in abipy.display.seabornw for seaborn and abipy.display.pandasw for pandas:

In [12]:
from __future__ import print_function, division, unicode_literals, absolute_import
%matplotlib notebook

import pandas as pd
import seaborn as sns # See https://seaborn.pydata.org/examples/index.html

# Import abipy widgets
import abipy.display.pandasw as pdw
import abipy.display.seabornw as snw

The idea is very simple. Instead of calling the seaborn API directly:

In [13]:
tips = sns.load_dataset("tips")

sns.jointplot(x="total_bill", y="tip", data=tips, kind="reg");

we call snw.joinplot to build a widget. The widget has controllers that allow us to select the variables to plot and change the options available for this kind of plot. Click the Run interact button to produce the figure:

In [15]:
snw.joinplot(tips)
Out[15]:
<function abipy.display.seabornw.joinplot.<locals>.sns_joinplot>
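
To give an idea of how such a wrapper can work, here is a minimal sketch built on ipywidgets.interact_manual, which is one way to get a widget with a run button. It is not AbiPy's actual implementation: the names column_choices and jointplot_widget are hypothetical, and the options exposed are a small illustrative subset of what seaborn accepts.

```python
import pandas as pd


def column_choices(df):
    """Return the names of the numeric columns, i.e. the candidates for x and y."""
    return list(df.select_dtypes(include="number").columns)


def jointplot_widget(df):
    """Build an interactive jointplot for df (hypothetical helper, not the AbiPy code)."""
    # Imported lazily so that column_choices can also be used outside a notebook.
    import ipywidgets as ipw
    import seaborn as sns

    numeric = column_choices(df)
    if len(numeric) < 2:
        raise ValueError("jointplot needs at least two numeric columns")

    @ipw.interact_manual(
        x=ipw.Dropdown(options=numeric, value=numeric[0]),
        y=ipw.Dropdown(options=numeric, value=numeric[1]),
        kind=["scatter", "reg", "resid", "kde", "hex"],
    )
    def sns_jointplot(x, y, kind):
        # Called each time the run button is clicked, with the current widget values.
        sns.jointplot(x=x, y=y, data=df, kind=kind)

    return sns_jointplot
```

In a notebook, jointplot_widget(tips) would display dropdowns for x, y and kind together with the run button, and return the inner function, which is consistent with the kind of Out value shown above.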

Use pdw.plot to create a widget wrapping the plot method of a pandas DataFrame:

In [16]:
iris = sns.load_dataset("iris")
pdw.plot(iris)
Out[16]:
<function abipy.display.pandasw.plot.<locals>.plot_dataframe>

Now we can start to play with the seaborn API and our widgets:

In [6]:
titanic = sns.load_dataset("titanic")
In [7]:
snw.countplot(titanic)
Out[7]:
<function abipy.display.seabornw.countplot.<locals>.sns_countplot>
In [26]:
tips = sns.load_dataset("tips")
In [28]:
snw.swarmplot(tips)
Out[28]:
<function abipy.display.seabornw.swarmplot.<locals>.sns_swarmplot>
In [29]:
snw.lmplot(tips)
Out[29]:
<function abipy.display.seabornw.lmplot.<locals>.sns_lmplot>
In [30]:
exercise = sns.load_dataset("exercise")
In [16]:
snw.factorplot(exercise)
Out[16]:
<function abipy.display.seabornw.factorplot.<locals>.sns_factorplot>
In [8]:
snw.violinplot(tips)
Out[8]:
<function abipy.display.seabornw.violinplot.<locals>.sns_violinplot>
In [18]:
snw.stripplot(tips)
Out[18]:
<function abipy.display.seabornw.stripplot.<locals>.sns_stripplot>
In [19]:
snw.swarmplot(tips)
Out[19]:
<function abipy.display.seabornw.swarmplot.<locals>.sns_swarmplot>
In [20]:
snw.pointplot(tips)
Out[20]:
<function abipy.display.seabornw.pointplot.<locals>.sns_pointplot>
In [21]:
snw.barplot(tips)
Out[21]:
<function abipy.display.seabornw.barplot.<locals>.sns_barplot>
In [31]:
import numpy as np; np.random.seed(0)
uniform_data = np.random.rand(10, 12)
snw.heatmap(uniform_data)
Out[31]:
<function abipy.display.seabornw.heatmap.<locals>.sns_heatmap>
In [32]:
flights = sns.load_dataset("flights")
# Keyword arguments are required by recent pandas versions
flights = flights.pivot(index="month", columns="year", values="passengers")
In [33]:
snw.clustermap(flights)
Out[33]:
<function abipy.display.seabornw.clustermap.<locals>.sns_clustermap>
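
The pivot step matters because heatmap-style plots operate on a matrix: the long-form flights table (one row per month and year) is reshaped into wide form, with months as rows and years as columns. The toy numbers below are made up purely to show the reshaping; only DataFrame.pivot itself is taken from the cell above.

```python
import pandas as pd

# Made-up long-form data: one row per (month, year) pair.
long_df = pd.DataFrame({
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "year": [1949, 1949, 1950, 1950],
    "passengers": [112, 118, 115, 126],
})

# Wide form: one row per month, one column per year, passengers as cell values.
wide_df = long_df.pivot(index="month", columns="year", values="passengers")
print(wide_df)
```

Each (month, year) pair must occur only once in the long-form table, otherwise pivot raises a ValueError.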
In [25]:
# Generate a random dataset with strong simple effects and an interaction
import numpy as np
n = 80
rs = np.random.RandomState(11)
x1 = rs.randn(n)
x2 = x1 / 5 + rs.randn(n)
b0, b1, b2, b3 = .5, .25, -1, 2
y = b0  + b1 * x1 + b2 * x2 + b3 * x1 * x2 + rs.randn(n)
df = pd.DataFrame(np.c_[x1, x2, y], columns=["x1", "x2", "y"])

# Show a scatterplot of the predictors with the estimated model surface
snw.interactplot(df)
Out[25]:
<function abipy.display.seabornw.interactplot.<locals>.sns_interactplot>
