#!/usr/bin/env python
# coding: utf-8

# > This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python.
# 

# # 6.2. Creating beautiful statistical plots with seaborn

# 1. Let's import NumPy, pandas, matplotlib, and seaborn.

# In[ ]:


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
get_ipython().run_line_magic('matplotlib', 'inline')


# 2. We generate a random dataset (following this example on seaborn's website: http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/linear_models.ipynb).

# In[ ]:


x1 = np.random.randn(80)
x2 = np.random.randn(80)
x3 = x1 * x2
# Three response variables built from the predictors plus Gaussian noise.
y1 = .5 + 2 * x1 - x2 + 2.5 * x3 + 3 * np.random.randn(80)
y2 = .5 + 2 * x1 - x2 + 2.5 * np.random.randn(80)
y3 = y2 + np.random.randn(80)


# 3. Seaborn implements many easy-to-use statistical plotting functions. For example, here is how to create a violin plot, which shows the distribution of several sets of points side by side.

# In[ ]:


plt.figure(figsize=(4, 3))
# Pass the arrays through `data=`; recent seaborn versions expect keyword arguments.
sns.violinplot(data=[x1, x2, x3]);


# 4. Seaborn also implements all-in-one statistical visualization functions. For example, a single function (`regplot`) can perform *and* display a linear regression between two variables.

# In[ ]:


plt.figure(figsize=(4, 3))
# `x` and `y` are keyword arguments in recent seaborn versions.
sns.regplot(x=x2, y=y2);


# 5. Seaborn has built-in support for pandas data structures. Here, we display the pairwise correlations between all variables defined in a `DataFrame`.

# In[ ]:


df = pd.DataFrame(dict(x1=x1, x2=x2, x3=x3, y1=y1, y2=y2, y3=y3))
# `sns.corrplot` has been removed from seaborn; an annotated heatmap of
# the Pearson correlation matrix `df.corr()` is used here as a substitute.
sns.heatmap(df.corr(), annot=True, vmin=-1, vmax=1, cmap="coolwarm");


# > You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
# 
# > [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages).
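

# As a rough numerical companion to the `regplot` step, the straight line it draws can be approximated with an ordinary least-squares fit. This is a minimal sketch, assuming the default (degree-1, non-robust) fit; the seeded generator `rng` and the names `slope`, `intercept`, and `r` are illustrative and not part of the original recipe.

# In[ ]:


import numpy as np

# Assumption: a fixed seed so the numbers are reproducible between runs.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(80)
x2 = rng.standard_normal(80)
y2 = .5 + 2 * x1 - x2 + 2.5 * rng.standard_normal(80)

# Degree-1 least-squares fit of y2 on x2 (highest-order coefficient first).
slope, intercept = np.polyfit(x2, y2, 1)

# Pearson correlation between the two variables.
r = np.corrcoef(x2, y2)[0, 1]

print("slope = %.3f, intercept = %.3f, r = %.3f" % (slope, intercept, r))


# For a simple linear regression, the fitted slope equals `r * std(y2) / std(x2)`, which gives a quick sanity check on the values printed above.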