This library binds the power of plotly with the flexibility of pandas for easy plotting.
The library is available at https://github.com/santosjorge/cufflinks
This tutorial assumes that the plotly user credentials have already been configured as described in the getting started guide.
import pandas as pd
import cufflinks as cf
import numpy as np
%reload_ext autoreload
%autoreload 2
We make all charts public and set a global theme
cf.set_config_file(world_readable=True,theme='pearl')
We create a set of timeseries
df=pd.DataFrame(np.random.randn(100,5),index=pd.date_range('1/1/15',periods=100),
columns=['IBM','MSFT','GOOG','VERZ','APPL'])
df=df.cumsum()
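Because np.random.randn draws fresh values on every run, the charts will differ between executions. The following is a sketch of a reproducible variant; it assumes NumPy's Generator API and an arbitrary seed of 42, neither of which is part of the original tutorial, and the name df_seeded is illustrative.

```python
import numpy as np
import pandas as pd

# Seeded generator so the same random walk is produced on every run
# (the seed value 42 is an arbitrary assumption).
rng = np.random.default_rng(42)

columns = ['IBM', 'MSFT', 'GOOG', 'VERZ', 'APPL']
index = pd.date_range('2015-01-01', periods=100)

# Daily random steps, accumulated into five random-walk time series.
steps = pd.DataFrame(rng.standard_normal((100, 5)), index=index, columns=columns)
df_seeded = steps.cumsum()
```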
iplot can be used on any DataFrame to plot on a plotly chart. If no filename is specified then a generic Plotly Playground file is created.
All the charts are created as private by default. To make them public you can use world_readable=True
Let's look at the available parameters
help(df.iplot)
Help on method _iplot in module cufflinks.plotlytools:

_iplot(self, data=None, layout=None, filename='', world_readable=None, kind='scatter', title='', xTitle='', yTitle='', zTitle='', theme=None, colors=None, colorscale=None, fill=False, width=None, mode='lines', symbol='dot', size=12, barmode='', sortbars=False, bargap=None, bargroupgap=None, bins=None, histnorm='', histfunc='count', orientation='v', boxpoints=False, annotations=None, keys=False, bestfit=False, bestfit_colors=None, categories='', x='', y='', z='', text='', gridcolor=None, zerolinecolor=None, margin=None, subplots=False, shape=None, asFrame=False, asDates=False, asFigure=False, asImage=False, dimensions=(1116, 587), asPlot=False, asUrl=False, **kwargs) method of pandas.core.frame.DataFrame instance
    Returns a plotly chart either as inline chart, image or Figure object

    Parameters:
    -----------
        data : Data
            Plotly Data Object.
            If not entered then the Data object will be automatically
            generated from the DataFrame.
        layout : Layout
            Plotly layout Object
            If not entered then the Layout object will be automatically
            generated from the DataFrame.
        filename : string
            Filename to be saved as in plotly account
        world_readable : bool
            If False then it will be saved as a private file
        kind : string
            Kind of chart
                scatter
                bar
                box
                spread
                ratio
                heatmap
                surface
                histogram
                bubble
                bubble3d
                scatter3d
        title : string
            Chart Title
        xTitle : string
            X Axis Title
        yTitle : string
            Y Axis Title
        zTitle : string
            Z Axis Title
            Applicable only for 3d charts
        theme : string
            Layout Theme
                solar
                pearl
                white
            see cufflinks.getThemes() for all
            available themes
        colors : list or dict
            {key:color} to specify the color for each column
            [colors] to use the colors in the defined order
        colorscale : str
            Color scale name
            If the color name is preceded by a minus (-)
            then the scale is inversed
            Only valid if 'colors' is null
            See cufflinks.colors.scales() for available scales
        fill : bool
            Filled Traces
        width : int
            Line width
        mode : string
            Plotting mode for scatter trace
                lines
                markers
                lines+markers
                lines+text
                markers+text
                lines+markers+text
        symbol : string
            The symbol that is drawn on the plot for each marker
            Valid only when mode includes markers
                dot
                cross
                diamond
                square
                triangle-down
                triangle-left
                triangle-right
                triangle-up
                x
        size : string or int
            Size of marker
            Valid only if marker in mode
        barmode : string
            Mode when displaying bars
                group
                stack
                overlay
            * Only valid when kind='bar'
        sortbars : bool
            Sort bars in descending order
            * Only valid when kind='bar'
        bargap : float
            Sets the gap between bars
                [0,1)
            * Only valid when kind is 'histogram' or 'bar'
        bargroupgap : float
            Set the gap between groups
                [0,1)
            * Only valid when kind is 'histogram' or 'bar'
        bins : int
            Specifies the number of bins
            * Only valid when kind='histogram'
        histnorm : string
                '' (frequency)
                percent
                probability
                density
                probability density
            Sets the type of normalization for an histogram trace. By default
            the height of each bar displays the frequency of occurrence, i.e.,
            the number of times this value was found in the
            corresponding bin. If set to 'percent', the height of each bar
            displays the percentage of total occurrences found within the
            corresponding bin. If set to 'probability', the height of each bar
            displays the probability that an event will fall into the
            corresponding bin. If set to 'density', the height of each bar is
            equal to the number of occurrences in a bin divided by the size of
            the bin interval such that summing the area of all bins will yield
            the total number of occurrences. If set to 'probability density',
            the height of each bar is equal to the number of probability that an
            event will fall into the corresponding bin divided by the size of
            the bin interval such that summing the area of all bins will yield 1.
            * Only valid when kind='histogram'
        histfunc : string
                count
                sum
                avg
                min
                max
            Sets the binning function used for an histogram trace.
            * Only valid when kind='histogram'
        orientation : string
                h
                v
            Sets the orientation of the bars. If set to 'v', the length of each
            bar will run vertically. If set to 'h', the length of each bar will
            run horizontally
            * Only valid when kind is 'histogram','bar' or 'box'
        boxpoints : string
            Displays data points in a box plot
                outliers
                all
                suspectedoutliers
                False
        annotations : dictionary
            Dictionary of annotations
            {x_point : text}
        keys : list of columns
            List of columns to chart.
            Also can be used for custom sorting.
        bestfit : boolean or list
            If True then a best fit line will be generated for
            all columns.
            If list then a best fit line will be generated for
            each key on the list.
        bestfit_colors : list or dict
            {key:color} to specify the color for each column
            [colors] to use the colors in the defined order
        categories : string
            Name of the column that contains the categories
        x : string
            Name of the column that contains the x axis values
        y : string
            Name of the column that contains the y axis values
        z : string
            Name of the column that contains the z axis values
        text : string
            Name of the column that contains the text values
        gridcolor : string
            Grid color
        zerolinecolor : string
            Zero line color
        margin : dict or tuple
            Dictionary (l,r,b,t) or
            Tuple containing the left,
            right, bottom and top margins
        subplots : bool
            If true then each trace is placed in
            subplot layout
        shape : (rows,cols)
            Tuple indicating the size of rows and columns
            If omitted then the layout is automatically set
            * Only valid when subplots=True
        asFrame : bool
            If true then the data component of Figure will
            be of Pandas form (Series) otherwise they will
            be index values
        asDates : bool
            If true it truncates times from a DatetimeIndex
        asFigure : bool
            If True returns plotly Figure
        asImage : bool
            If True it returns Image
            * Only valid when asImage=True
        dimensions : tuple(int,int)
            Dimensions for image
                (width,height)
        asPlot : bool
            If True the chart opens in browser
        asUrl : bool
            If True the chart url is returned. No chart is displayed.
df.iplot(filename='Tutorial 1')
We can pass a theme to the iplot function. Three themes are available (solar, pearl and white), but you can also create your own.
df[['APPL','IBM','VERZ']].iplot(theme='white',filename='Tutorial White')
We can also pass common metadata for the chart
df.iplot(theme='pearl',filename='Tutorial Metadata',title='Stock Returns',xTitle='Dates',yTitle='Returns')
We can easily add a best fit line to any Series with bestfit=True.
This automatically adds a best fit approximation and shows its equation in the legend.
df['IBM'].iplot(filename='IBM Returns',bestfit=True)
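For intuition, a best fit line of this kind can be approximated with an ordinary least-squares fit of the values against their position in the index. The sketch below uses np.polyfit on a synthetic series; it illustrates the general technique and is not a copy of how cufflinks computes its line. The names series, slope, intercept and label are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic daily series that lies exactly on y = 0.5*x + 2 (illustrative data).
index = pd.date_range('2015-01-01', periods=50)
series = pd.Series(0.5 * np.arange(50) + 2.0, index=index)

# Fit a degree-1 polynomial against the ordinal position of each point.
x = np.arange(len(series))
slope, intercept = np.polyfit(x, series.values, 1)

# Evaluate the fitted line on the same index, ready to plot next to the data.
fitted = pd.Series(slope * x + intercept, index=series.index, name='bestfit')

# A legend-style label describing the fitted equation.
label = '{:.2f}*x+{:.2f}'.format(slope, intercept)
```

With noisy data such as the random walks in this tutorial, the fitted line summarises the overall drift rather than passing through every point.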
We can pass any color, specified as Hex, RGB or Text*.
*Text values are defined in the cufflinks.colors module.
df['IBM'].iplot(filename='IBM Returns - colors',bestfit=True,colors=['pink'],bestfit_colors=['blue'])
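The Hex and RGB notations describe the same color in different forms. As a minimal sketch of how they relate, the helper below converts a six-digit hex string into an rgba string; hex_to_rgba is a hypothetical function written for this illustration and is not the conversion routine shipped with cufflinks.

```python
def hex_to_rgba(hex_color, alpha=1.0):
    """Convert '#RRGGBB' to an 'rgba(r, g, b, a)' string (hypothetical helper)."""
    value = hex_color.lstrip('#')
    if len(value) != 6:
        raise ValueError('expected a 6-digit hex color, got %r' % hex_color)
    # Each pair of hex digits encodes one channel in the range 0-255.
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 'rgba({}, {}, {}, {})'.format(r, g, b, alpha)


orange = hex_to_rgba('#ff9933')
```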
We can add a fill to a trace with fill=True
df['IBM'].iplot(filename='Tutorial Microsoft',fill=True,colors=['green'])
We can easily create a bar chart with the parameter kind
df.sum().iplot(kind='bar',filename='Tutorial Barchart')
Bars can also be stacked by a given dimension. Here the series are first aggregated to monthly means.
df.resample('M').mean().iplot(kind='bar',barmode='stack',filename='Tutorial Bar Stacked')
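The stacked chart draws one bar per month, so the daily data has to be reduced to one value per column per month before plotting. The following is a minimal sketch of that aggregation step, assuming the monthly mean is the intended statistic. It groups by calendar month with to_period instead of resample purely to make the grouping explicit; frame and monthly are illustrative names.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the tutorial DataFrame (values are illustrative).
index = pd.date_range('2015-01-01', periods=100)
frame = pd.DataFrame({'A': np.arange(100.0), 'B': np.ones(100)}, index=index)

# One row per calendar month, holding the mean of each column.
monthly = frame.groupby(frame.index.to_period('M')).mean()
```

The 100 daily rows span January to early April 2015, so monthly has four rows, each of which becomes one stacked bar.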
We can also create spread and ratio charts on the fly with kind='spread' and kind='ratio'
df[['VERZ','IBM']].iplot(filename='Tutorial Spread',kind='spread')
(df[['GOOG','MSFT']]+20).iplot(filename='Tutorial Ratio',kind='ratio',colors=['green','red'])
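As a rough guide to what these two chart kinds display, a spread is the difference between two series and a ratio is their quotient. The sketch below computes both with plain pandas on made-up numbers; it does not reproduce the exact traces cufflinks draws. A ratio is unstable when the denominator is close to zero, which is presumably why the example above shifts both series by 20.

```python
import pandas as pd

# Made-up values for two series (illustrative only).
prices = pd.DataFrame({'GOOG': [22.0, 24.0, 21.0],
                       'MSFT': [20.0, 16.0, 21.0]})

# Spread: element-wise difference between the two columns.
spread = prices['GOOG'] - prices['MSFT']

# Ratio: element-wise quotient of the two columns.
ratio = prices['GOOG'] / prices['MSFT']
```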
Annotations can be added to the chart and are positioned automatically.
They should be specified as a dictionary of the form {x_point: text}.
annotations={'2015-01-15':'Dividends','2015-03-31':'Split Announced'}
df['MSFT'].iplot(filename='Tutorial Annotations',annotations=annotations)
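To see what automatic positioning involves, note that each dictionary key supplies only the x value, so the y value has to be looked up from the series at that date. The sketch below resolves each annotation to an (x, y, text) triple with plain pandas; the triple structure and the name points are assumptions made for illustration, not the format cufflinks uses internally.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for one column of the tutorial data (illustrative values).
index = pd.date_range('2015-01-01', periods=100)
series = pd.Series(np.arange(100.0), index=index, name='MSFT')

annotations = {'2015-01-15': 'Dividends', '2015-03-31': 'Split Announced'}

# Look up the series value at each annotated date to obtain the y position.
points = [(date, series.loc[pd.Timestamp(date)], text)
          for date, text in annotations.items()]
```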
A chart can also be rendered as an image.
For this we can use asImage=True
The image size can optionally be set with dimensions=(width,height)
df[['VERZ','MSFT']].iplot(filename='Tutorial Image',theme='white',colors=['pink','blue'],asImage=True,dimensions=(800,500))
It is also possible to get the Plotly Figure as an output to tweak it manually
We can achieve this with asFigure=True
df['GOOG'].iplot(asFigure=True)
{'data': [{'line': {'color': 'rgba(255, 153, 51, 1.0)', 'width': '1.3'}, 'mode': 'lines', 'name': 'GOOG', 'type': u'scatter', 'x': ['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09', '2015-01-10', '2015-01-11', '2015-01-12', '2015-01-13', '2015-01-14', '2015-01-15', '2015-01-16', '2015-01-17', '2015-01-18', '2015-01-19', '2015-01-20', '2015-01-21', '2015-01-22', '2015-01-23', '2015-01-24', '2015-01-25', '2015-01-26', '2015-01-27', '2015-01-28', '2015-01-29', '2015-01-30', '2015-01-31', '2015-02-01', '2015-02-02', '2015-02-03', '2015-02-04', '2015-02-05', '2015-02-06', '2015-02-07', '2015-02-08', '2015-02-09', '2015-02-10', '2015-02-11', '2015-02-12', '2015-02-13', '2015-02-14', '2015-02-15', '2015-02-16', '2015-02-17', '2015-02-18', '2015-02-19', '2015-02-20', '2015-02-21', '2015-02-22', '2015-02-23', '2015-02-24', '2015-02-25', '2015-02-26', '2015-02-27', '2015-02-28', '2015-03-01', '2015-03-02', '2015-03-03', '2015-03-04', '2015-03-05', '2015-03-06', '2015-03-07', '2015-03-08', '2015-03-09', '2015-03-10', '2015-03-11', '2015-03-12', '2015-03-13', '2015-03-14', '2015-03-15', '2015-03-16', '2015-03-17', '2015-03-18', '2015-03-19', '2015-03-20', '2015-03-21', '2015-03-22', '2015-03-23', '2015-03-24', '2015-03-25', '2015-03-26', '2015-03-27', '2015-03-28', '2015-03-29', '2015-03-30', '2015-03-31', '2015-04-01', '2015-04-02', '2015-04-03', '2015-04-04', '2015-04-05', '2015-04-06', '2015-04-07', '2015-04-08', '2015-04-09', '2015-04-10'], 'y': array([ 1.15972096, 2.61814088, 3.88137325, 5.29048469, 5.45267539, 3.81069438, 2.56234866, 2.48937755, 0.8885047 , 1.42449396, 0.71398967, 2.23910212, 0.54126357, 0.73594014, 2.40198747, 3.19970865, 2.52258531, 2.70797982, 2.54410633, 2.48464073, 5.1813573 , 2.59804433, 4.05577506, 4.08723678, 2.76233312, 5.48137562, 5.1817834 , 4.33260202, 6.34032271, 5.77686686, 5.36107125, 5.3328116 , 5.18925194, 4.66230224, 3.13874977, 3.17508182, 3.1986122 , 
3.93469004, 5.25255624, 7.19535909, 7.34507408, 5.84095987, 4.86338622, 4.84782583, 5.67626555, 6.05159475, 6.42908244, 7.44653376, 7.5657167 , 8.51323216, 8.8481517 , 8.55037342, 8.81314153, 7.87584403, 6.87637759, 7.95065448, 10.02877143, 9.40492693, 9.97178043, 10.33123802, 9.64544183, 8.26611953, 8.27434462, 7.27165506, 5.6709917 , 5.45050085, 5.96900983, 6.60072447, 7.78534843, 7.84492484, 5.55131151, 5.572 , 3.13828897, 2.72750849, 2.11072395, 1.32132751, 0.54197202, 0.50989272, 1.49837952, 2.85919443, 4.4573978 , 3.09151662, 4.18383466, 5.24095248, 5.8356141 , 5.63447151, 6.37295694, 4.82640324, 5.18567817, 4.75585821, 3.16939133, 4.46429576, 4.47270732, 4.14814115, 5.31533815, 5.80154658, 6.86509013, 6.66442417, 6.52294608, 7.33802256])}], 'layout': {'bargap': 0.01, 'legend': {'bgcolor': '#F5F6F9', 'font': {'color': '#4D5663'}}, 'paper_bgcolor': '#F5F6F9', 'plot_bgcolor': '#F5F6F9', 'xaxis': {'gridcolor': '#E1E5ED', 'tickfont': {'color': '#4D5663'}, 'title': '', 'titlefont': {'color': '#4D5663'}, 'zerolinecolor': '#E1E5ED'}, 'yaxis': {'gridcolor': '#E1E5ED', 'tickfont': {'color': '#4D5663'}, 'title': '', 'titlefont': {'color': '#4D5663'}, 'zeroline': False, 'zerolinecolor': '#E1E5ED'}}}
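As the output above shows, the returned Figure is organised like a nested dictionary with 'data' and 'layout' entries, so tweaking it amounts to editing keys. The sketch below uses a plain dict with abbreviated values as a stand-in so that it runs without plotly; the specific edits are only examples.

```python
import copy

# Abbreviated stand-in for the Figure printed above (not a real plotly object).
figure = {
    'data': [{'line': {'color': 'rgba(255, 153, 51, 1.0)', 'width': '1.3'},
              'mode': 'lines', 'name': 'GOOG', 'type': 'scatter',
              'x': ['2015-01-01', '2015-01-02'], 'y': [1.16, 2.62]}],
    'layout': {'xaxis': {'title': ''}, 'yaxis': {'title': ''}},
}

# Work on a deep copy so the original figure stays untouched.
tweaked = copy.deepcopy(figure)
tweaked['data'][0]['line']['width'] = 3          # thicker line
tweaked['data'][0]['mode'] = 'lines+markers'     # show markers as well
tweaked['layout']['yaxis']['title'] = 'Returns'  # label the y axis
```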
We can also get the Data object directly
data=df.to_iplot()
data[0]['name']='My Custom Name'
And pass this directly to iplot
df.iplot(data=data,filename='Tutorial Custom Name')