Importing Pandas, NumPy and Plotly (getpass is also imported so that the API key can be typed into the console running IPython without being echoed)
import pandas as pd
import numpy as np
import plotly
import getpass
Instantiating Plotly with username and API key
api_key=getpass.getpass()
p = plotly.plotly('nipun.batra.1', api_key)
Create a random Pandas DataFrame consisting of 3 columns and 100 rows
df = pd.DataFrame({'A': np.random.rand(100), 'B': np.random.rand(100), 'C': np.random.rand(100)}, index=np.array(range(100)))
df
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 0 to 99
Data columns (total 3 columns):
A    100  non-null values
B    100  non-null values
C    100  non-null values
dtypes: float64(3)
df.describe()
| | A | B | C |
|---|---|---|---|
| count | 100.000000 | 100.000000 | 100.000000 |
| mean | 0.466735 | 0.497376 | 0.506264 |
| std | 0.272216 | 0.286994 | 0.266733 |
| min | 0.007997 | 0.007294 | 0.000326 |
| 25% | 0.234741 | 0.243540 | 0.296997 |
| 50% | 0.462034 | 0.521211 | 0.511778 |
| 75% | 0.690845 | 0.711072 | 0.730888 |
| max | 0.986756 | 0.993439 | 0.979052 |
Standard plot produced by Matplotlib/Pandas
df.plot()
<matplotlib.axes.AxesSubplot at 0x4a24e50>
Function to create Plotly series (a list with one dict per column, each holding "x", "y" and "name") from the Pandas DataFrame
def df_to_iplot(df):
    '''
    Converting a Pandas DataFrame to the Plotly interface
    '''
    x = df.index.values
    lines = {}
    for key in df:
        lines[key] = {}
        lines[key]["x"] = x
        lines[key]["y"] = df[key].values
        lines[key]["name"] = key
    # Appending all lines
    lines_plotly = [lines[key] for key in df]
    return lines_plotly
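To make the shape of the returned list easier to see, here is a minimal sketch that runs without a Plotly account. The name `df_to_traces` and the three-row frame `small` are hypothetical and used only for illustration; the function is a condensed equivalent of the conversion above, not a replacement for it.

```python
import pandas as pd


def df_to_traces(df):
    """Return one dict per column with the keys "x", "y" and "name"."""
    x = df.index.values
    # Iterating over a DataFrame yields its column names.
    return [{"x": x, "y": df[col].values, "name": col} for col in df]


# Small hypothetical frame, chosen so the output is easy to inspect.
small = pd.DataFrame({"A": [0.1, 0.2, 0.3], "B": [0.9, 0.8, 0.7]})
traces = df_to_traces(small)

for trace in traces:
    # Every trace shares the same x values (the index) and carries its own y values.
    print(trace["name"], trace["x"].tolist(), trace["y"].tolist())
```

Each column becomes one line in the plot, and the column name becomes the legend entry.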
Plotting the DataFrame using iplot
p.iplot(df_to_iplot(df))
That is it! You can pan, zoom and do a bunch more now!
Now let us try some time series
date_rng = pd.date_range('2013-01-01 00:00','2013-01-03 10:00',freq='300s')
df2=pd.DataFrame({'A':np.random.rand(len(date_rng)), 'B':np.random.rand(len(date_rng))}, index=date_rng)
df2
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 697 entries, 2013-01-01 00:00:00 to 2013-01-03 10:00:00
Freq: 300S
Data columns (total 2 columns):
A    697  non-null values
B    697  non-null values
dtypes: float64(2)
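The 697 entries follow from the range and the 300 second step. A quick check, as a sketch; the names `start`, `end`, `step` and `expected_rows` are introduced here for illustration only:

```python
import pandas as pd

start = pd.Timestamp("2013-01-01 00:00")
end = pd.Timestamp("2013-01-03 10:00")
step = pd.Timedelta(seconds=300)

# 58 hours at one sample every 5 minutes; both endpoints are included, hence the + 1.
expected_rows = (end - start) // step + 1

date_rng = pd.date_range(start, end, freq="300s")
print(expected_rows, len(date_rng))
```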
Standard Pandas plot
df2.plot()
<matplotlib.axes.AxesSubplot at 0x504c150>
p.iplot(df_to_iplot(df2))
We now observe that the x axis shows the epoch time in nanoseconds rather than dates and times
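The epoch values can be reproduced without Plotly. The sketch below uses a hypothetical three-entry index `idx` and contrasts the raw integer view of the timestamps with readable strings. `strftime` is my choice for the illustration and is an assumption, not necessarily identical to what any particular conversion in this notebook produces.

```python
import pandas as pd

# Hypothetical short index with the same 300 second spacing as df2.
idx = pd.date_range("2013-01-01 00:00", periods=3, freq="300s")

# Integer view: nanoseconds since 1970-01-01, which is what ends up on the x axis
# when the raw index values are interpreted as plain numbers.
epoch_ns = idx.values.astype("datetime64[ns]").astype("int64")

# String view: one readable label per timestamp.
labels = idx.strftime("%Y-%m-%d %H:%M:%S").tolist()

for ns, label in zip(epoch_ns.tolist(), labels):
    print(ns, label)
```

Passing the string labels instead of the raw values is one way to let the plot treat the axis as dates.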
We modify the function defined above to take care of this
def df_to_iplot(df):
    '''
    Converting a Pandas DataFrame to the Plotly interface
    '''
    if isinstance(df.index, pd.DatetimeIndex):
        # Convert the index to MySQL DATETIME-like strings
        x = df.index.format()
        # Alternatively, use the index values directly, since a DatetimeIndex holds np.datetime64 values
        # see http://nbviewer.ipython.org/gist/cparmer/7721116
        # x = df.index.values.astype('datetime64[s]')
    else:
        x = df.index.values
    lines = {}
    for key in df:
        lines[key] = {}
        lines[key]["x"] = x
        lines[key]["y"] = df[key].values
        lines[key]["name"] = key
    # Appending all lines
    lines_plotly = [lines[key] for key in df]
    return lines_plotly
p.iplot(df_to_iplot(df2))
from IPython.display import HTML
import requests
Styling the IPython notebook. Stylesheet courtesy of Cam Davidson-Pilon and his book Bayesian Methods for Hackers
styles = requests.get("https://raw.github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/master/styles/custom.css")
HTML(styles.text)