
Cufflinks¶

This library binds the power of plotly with the flexibility of pandas for easy plotting.

This library is available on https://github.com/santosjorge/cufflinks

This tutorial assumes that the plotly user credentials have already been configured as described in the getting started guide.

In [1]:

import pandas as pd
import cufflinks as cf
import numpy as np

In [2]:

%reload_ext autoreload
%autoreload 2

We make all charts public and set a global theme

In [3]:

cf.set_config_file(world_readable=True,theme='pearl')

We create a set of timeseries

In [26]:

df=pd.DataFrame(np.random.randn(100,5),index=pd.date_range('1/1/15',periods=100),
                columns=['IBM','MSFT','GOOG','VERZ','APPL'])
df=df.cumsum()
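
The cell above draws unseeded random numbers, so the values differ on every run. A minimal sketch of a reproducible variant follows, assuming numpy's `default_rng` is available; `make_series` and `df_seeded` are hypothetical names that do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

def make_series(seed=0, periods=100):
    """Build cumulative random walks for the five column names used in this tutorial."""
    rng = np.random.default_rng(seed)  # seeded generator: same draws on every run
    columns = ['IBM', 'MSFT', 'GOOG', 'VERZ', 'APPL']
    index = pd.date_range('2015-01-01', periods=periods)
    steps = rng.standard_normal((periods, len(columns)))
    # cumsum turns the independent daily steps into random-walk style series
    return pd.DataFrame(steps, index=index, columns=columns).cumsum()

df_seeded = make_series(seed=0)
print(df_seeded.shape)
print(df_seeded.index[0], df_seeded.index[-1])
```

Calling `make_series` twice with the same seed returns identical frames, which makes charts comparable between sessions.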

iplot can be called on any DataFrame to render it as a plotly chart. If no filename is specified, a generic Plotly Playground file is created.

Charts are private by default. To make a chart public, pass world_readable=True (the global setting above already does this for the whole session).

Let's look at the available parameters

In [5]:

help(df.iplot)

Help on method _iplot in module cufflinks.plotlytools:

_iplot(self, data=None, layout=None, filename='', world_readable=None, kind='scatter', title='', xTitle='', yTitle='', zTitle='', theme=None, colors=None, colorscale=None, fill=False, width=None, mode='lines', symbol='dot', size=12, barmode='', sortbars=False, bargap=None, bargroupgap=None, bins=None, histnorm='', histfunc='count', orientation='v', boxpoints=False, annotations=None, keys=False, bestfit=False, bestfit_colors=None, categories='', x='', y='', z='', text='', gridcolor=None, zerolinecolor=None, margin=None, subplots=False, shape=None, asFrame=False, asDates=False, asFigure=False, asImage=False, dimensions=(1116, 587), asPlot=False, asUrl=False, **kwargs) method of pandas.core.frame.DataFrame instance
           Returns a plotly chart either as inline chart, image or Figure object
    
           Parameters:
           -----------
                   data : Data
                           Plotly Data Object.
                           If not entered then the Data object will be automatically
                           generated from the DataFrame.
                   layout : Layout
                           Plotly layout Object
                           If not entered then the Layout object will be automatically
                           generated from the DataFrame.
                   filename : string
                           Filename to be saved as in plotly account
                   world_readable : bool
                           If False then it will be saved as a private file
                   kind : string
                           Kind of chart
                                   scatter
                                   bar
                                   box
                                   spread
                                   ratio
                                   heatmap
                                   surface
                                   histogram
                                   bubble
                                   bubble3d
                                   scatter3d               
                   title : string
                           Chart Title                             
                   xTitle : string
                           X Axis Title
                   yTitle : string
                           Y Axis Title
                   zTitle : string
                           Z Axis Title
                           Applicable only for 3d charts
                   theme : string
                           Layout Theme
                                   solar
                                   pearl
                                   white           
                           see cufflinks.getThemes() for all 
                           available themes
                   colors : list or dict
                           {key:color} to specify the color for each column
                           [colors] to use the colors in the defined order
                   colorscale : str 
                           Color scale name
                           If the color name is preceded by a minus (-) 
                           then the scale is inversed
                           Only valid if 'colors' is null
                           See cufflinks.colors.scales() for available scales
                   fill : bool
                           Filled Traces           
                   width : int
                           Line width      
                   mode : string
                           Plotting mode for scatter trace
                                   lines
                                   markers
                                   lines+markers
                                   lines+text
                                   markers+text
                                   lines+markers+text              
                   symbol : string
                           The symbol that is drawn on the plot for each marker
                           Valid only when mode includes markers
                                   dot
                                   cross
                                   diamond
                                   square
                                   triangle-down
                                   triangle-left
                                   triangle-right
                                   triangle-up
                                   x
                   size : string or int 
                           Size of marker 
                           Valid only if marker in mode
                   barmode : string
                           Mode when displaying bars
                                   group
                                   stack
                                   overlay
                           * Only valid when kind='bar'
                   sortbars : bool
                           Sort bars in descending order
                           * Only valid when kind='bar'
                   bargap : float
                           Sets the gap between bars
                                   [0,1)
                           * Only valid when kind is 'histogram' or 'bar'
                   bargroupgap : float
                           Set the gap between groups
                                   [0,1)
                           * Only valid when kind is 'histogram' or 'bar'          
                   bins : int
                           Specifies the number of bins 
                           * Only valid when kind='histogram'
                   histnorm : string
                                   '' (frequency)
                                   percent
                                   probability
                                   density
                                   probability density
                           Sets the type of normalization for an histogram trace. By default
                           the height of each bar displays the frequency of occurrence, i.e., 
                           the number of times this value was found in the
                           corresponding bin. If set to 'percent', the height of each bar
                           displays the percentage of total occurrences found within the
                           corresponding bin. If set to 'probability', the height of each bar
                           displays the probability that an event will fall into the
                           corresponding bin. If set to 'density', the height of each bar is
                           equal to the number of occurrences in a bin divided by the size of
                           the bin interval such that summing the area of all bins will yield
                           the total number of occurrences. If set to 'probability density',
                           the height of each bar is equal to the number of probability that an
                           event will fall into the corresponding bin divided by the size of
                           the bin interval such that summing the area of all bins will yield
                           1.
                           * Only valid when kind='histogram'
                   histfunc : string
                                   count
                                   sum
                                   avg
                                   min
                                   max
                           Sets the binning function used for a histogram trace.
                           * Only valid when kind='histogram'           
                   orientation : string
                                   h 
                                   v
                           Sets the orientation of the bars. If set to 'v', the length of each
                           bar will run vertically. If set to 'h', the length of each bar will
                           run horizontally
                           * Only valid when kind is 'histogram','bar' or 'box'
                   boxpoints : string
                           Displays data points in a box plot
                                   outliers
                                   all
                                   suspectedoutliers
                                   False
                   annotations : dictionary
                           Dictionary of annotations
                           {x_point : text}
                   keys : list of columns
                           List of columns to chart.
                           Can also be used for custom sorting.
                   bestfit : boolean or list
                           If True then a best fit line will be generated for
                           all columns.
                           If list then a best fit line will be generated for
                           each key on the list.
                   bestfit_colors : list or dict
                           {key:color} to specify the color for each column
                           [colors] to use the colors in the defined order 
                   categories : string
                           Name of the column that contains the categories
                   x : string
                           Name of the column that contains the x axis values              
                   y : string
                           Name of the column that contains the y axis values
                   z : string
                           Name of the column that contains the z axis values                                      
                   text : string
                           Name of the column that contains the text values        
                   gridcolor : string
                           Grid color      
                   zerolinecolor : string
                           Zero line color
                   margin : dict or tuple
                           Dictionary (l,r,b,t) or
                           Tuple containing the left,
                           right, bottom and top margins
                   subplots : bool
                           If true then each trace is placed in 
                           subplot layout
                   shape : (rows,cols)
                           Tuple indicating the size of rows and columns
                           If omitted then the layout is automatically set
                           * Only valid when subplots=True
                   asFrame : bool
                           If true then the data component of Figure will
                           be of Pandas form (Series) otherwise they will 
                           be index values
                   asDates : bool
                           If true it truncates times from a DatetimeIndex
                   asFigure : bool
                           If True returns plotly Figure
                   asImage : bool
                           If True it returns Image
                   dimensions : tuple(int,int)
                           Dimensions for image
                                   (width,height)
                           * Only valid when asImage=True
                   asPlot : bool
                           If True the chart opens in browser
                   asUrl : bool
                           If True the chart url is returned. No chart is displayed.

In [27]:

df.iplot(filename='Tutorial 1')

Out[27]:

Customizing Themes¶

We can pass a theme to the iplot function. Three themes are available, and you can also create your own:

Solar
Pearl (Default)
White

In [28]:

df[['APPL','IBM','VERZ']].iplot(theme='white',filename='Tutorial White')

Out[28]:

We can also pass common metadata for the chart

In [29]:

df.iplot(theme='pearl',filename='Tutorial Metadata',title='Stock Returns',xTitle='Dates',yTitle='Returns')

Out[29]:

Bestfit Lines¶

We can easily add a best-fit line to any Series.

This automatically adds a best-fit approximation to the chart and shows its equation in the legend.
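
For intuition about what a best-fit line is, here is a minimal sketch using `numpy.polyfit` on a toy Series. It assumes a straight-line least-squares fit against the row position; the source does not state which method cufflinks uses internally, and `linear_best_fit` is a hypothetical helper, not part of cufflinks.

```python
import numpy as np
import pandas as pd

def linear_best_fit(series):
    """Fit y = slope * x + intercept by least squares, with x = 0, 1, ..., n-1."""
    x = np.arange(len(series), dtype=float)
    y = series.to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # degree-1 polynomial: a straight line
    fitted = pd.Series(slope * x + intercept, index=series.index, name='best fit')
    return slope, intercept, fitted

# Toy data lying exactly on the line y = 2x + 1
toy = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0],
                index=pd.date_range('2015-01-01', periods=5), name='IBM')
slope, intercept, fitted = linear_best_fit(toy)
print('{:.2f}*x+{:.2f}'.format(slope, intercept))
```

The formatted string plays the role of the equation shown in the legend; the fitted Series shares the index of the input, so it could be drawn as a second trace.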

In [30]:

df['IBM'].iplot(filename='IBM Returns',bestfit=True)

Out[30]:

Customizing Colors¶

We can pass any color, specified as a hex code, an RGB value or a text name*

*Text names are defined in the cufflinks.colors module
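
Hex and RGB notations describe the same color in different forms. The sketch below converts a hex code into the `rgba(...)` string form that appears in the figure output later in this tutorial. `hex_to_rgba` is a hypothetical standalone helper written for illustration, not the cufflinks implementation.

```python
def hex_to_rgba(hex_color, alpha=1.0):
    """Convert '#RRGGBB' (or shorthand '#RGB') into an 'rgba(r, g, b, a)' string."""
    digits = hex_color.lstrip('#')
    if len(digits) == 3:
        # Expand shorthand such as 'f93' to 'ff9933'
        digits = ''.join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError('expected 3 or 6 hex digits, got {!r}'.format(hex_color))
    # Each pair of hex digits is one channel in the range 0..255
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return 'rgba({}, {}, {}, {})'.format(red, green, blue, alpha)

print(hex_to_rgba('#FF9933'))
print(hex_to_rgba('#f93', alpha=0.5))
```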

In [31]:

df['IBM'].iplot(filename='IBM Returns - colors',bestfit=True,colors=['pink'],bestfit_colors=['blue'])

Out[31]:

Filled Traces¶

We can add a fill to a trace with fill=True

In [32]:

df['IBM'].iplot(filename='Tutorial Microsoft',fill=True,colors=['green'])

Out[32]:

Bar Charts¶

We can easily create a bar chart with the parameter kind

In [33]:

df.sum().iplot(kind='bar',filename='Tutorial Barchart')

Out[33]:

Bars can also be stacked. Here the data is first aggregated by month.
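
A stacked bar chart needs one value per column and per period. The sketch below shows a monthly aggregation in plain pandas, assuming the mean is the intended aggregate (the source does not say which one is meant). `toy` and `monthly_mean` are hypothetical names, and grouping by calendar month is used here as one possible way to aggregate.

```python
import numpy as np
import pandas as pd

# Toy frame: 100 daily rows starting on 2015-01-01, with easily checked values
index = pd.date_range('2015-01-01', periods=100)
toy = pd.DataFrame({'IBM': np.arange(100.0), 'MSFT': 2 * np.arange(100.0)},
                   index=index)

def monthly_mean(frame):
    """Average every column within each calendar month of the DatetimeIndex."""
    return frame.groupby(frame.index.to_period('M')).mean()

monthly = monthly_mean(toy)
print(monthly)
```

The result has one row per month (January to April 2015 here), and each row would become one stacked bar with one segment per column.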

In [34]:

# Aggregate to monthly means first; valid barmode values are 'group', 'stack' and 'overlay'
df.resample('M').mean().iplot(kind='bar',barmode='stack',filename='Tutorial Bar Stacked')

Out[34]:

Spread and Ratio charts¶

We can also create spread and ratio charts on the fly with kind='spread' and kind='ratio'
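
To make the two chart kinds concrete, the sketch below computes both quantities by hand. It assumes the spread is the first column minus the second and the ratio is the first divided by the second; the source does not spell out the exact definition, and `spread_and_ratio` is a hypothetical helper.

```python
import pandas as pd

def spread_and_ratio(frame, first, second):
    """Return (first - second, first / second) as two named Series."""
    spread = (frame[first] - frame[second]).rename('spread')
    ratio = (frame[first] / frame[second]).rename('ratio')
    return spread, ratio

toy = pd.DataFrame({'VERZ': [10.0, 12.0, 9.0], 'IBM': [8.0, 6.0, 9.0]},
                   index=pd.date_range('2015-01-01', periods=3))
spread, ratio = spread_and_ratio(toy, 'VERZ', 'IBM')
print(spread.tolist())
print(ratio.tolist())
```

A ratio is unstable when the denominator is close to zero, which is presumably why the ratio cell below shifts both series by 20 before plotting.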

In [35]:

df[['VERZ','IBM']].iplot(filename='Tutorial Spread',kind='spread')

Out[35]:

In [55]:

(df[['GOOG','MSFT']]+20).iplot(filename='Tutorial Ratio',kind='ratio',colors=['green','red'])

Out[55]:

Annotations¶

Annotations can be added to a chart and are positioned automatically.

Annotations are specified as a dictionary of the form {x_point : text}
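
As an illustration of how such a dictionary relates to the data, the sketch below resolves each key to a point on a Series. It assumes each label is anchored at the series value on the given date, which the source only implies; `annotation_points` is a hypothetical helper and not a cufflinks function.

```python
import pandas as pd

def annotation_points(series, annotations):
    """Map {date string: text} to a list of (timestamp, y value, text) tuples."""
    points = []
    for date_str, text in annotations.items():
        stamp = pd.Timestamp(date_str)
        if stamp in series.index:  # skip dates that the series does not cover
            points.append((stamp, float(series.loc[stamp]), text))
    return points

# Toy series whose value equals the row position, so results are easy to check
toy = pd.Series(range(100), index=pd.date_range('2015-01-01', periods=100),
                name='MSFT', dtype=float)
notes = {'2015-01-15': 'Dividends', '2015-03-31': 'Split Announced',
         '2016-01-01': 'Outside the data'}
for stamp, y_value, text in annotation_points(toy, notes):
    print(stamp.date(), y_value, text)
```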

In [56]:

annotations={'2015-01-15':'Dividends','2015-03-31':'Split Announced'}
df['MSFT'].iplot(filename='Tutorial Annotations',annotations=annotations)

Out[56]:

Output as Image¶

A chart can also be output as an image.

For this we can use asImage=True

We can optionally set the image size with dimensions=(width,height)

In [61]:

df[['VERZ','MSFT']].iplot(filename='Tutorial Image',theme='white',colors=['pink','blue'],asImage=True,dimensions=(800,500))

Advanced Use¶

It is also possible to get the Plotly Figure as output and tweak it manually

We can achieve this with asFigure=True

In [62]:

df['GOOG'].iplot(asFigure=True)

Out[62]:

{'data': [{'line': {'color': 'rgba(255, 153, 51, 1.0)', 'width': '1.3'},
   'mode': 'lines',
   'name': 'GOOG',
   'type': u'scatter',
   'x': ['2015-01-01',
    '2015-01-02',
    '2015-01-03',
    '2015-01-04',
    '2015-01-05',
    '2015-01-06',
    '2015-01-07',
    '2015-01-08',
    '2015-01-09',
    '2015-01-10',
    '2015-01-11',
    '2015-01-12',
    '2015-01-13',
    '2015-01-14',
    '2015-01-15',
    '2015-01-16',
    '2015-01-17',
    '2015-01-18',
    '2015-01-19',
    '2015-01-20',
    '2015-01-21',
    '2015-01-22',
    '2015-01-23',
    '2015-01-24',
    '2015-01-25',
    '2015-01-26',
    '2015-01-27',
    '2015-01-28',
    '2015-01-29',
    '2015-01-30',
    '2015-01-31',
    '2015-02-01',
    '2015-02-02',
    '2015-02-03',
    '2015-02-04',
    '2015-02-05',
    '2015-02-06',
    '2015-02-07',
    '2015-02-08',
    '2015-02-09',
    '2015-02-10',
    '2015-02-11',
    '2015-02-12',
    '2015-02-13',
    '2015-02-14',
    '2015-02-15',
    '2015-02-16',
    '2015-02-17',
    '2015-02-18',
    '2015-02-19',
    '2015-02-20',
    '2015-02-21',
    '2015-02-22',
    '2015-02-23',
    '2015-02-24',
    '2015-02-25',
    '2015-02-26',
    '2015-02-27',
    '2015-02-28',
    '2015-03-01',
    '2015-03-02',
    '2015-03-03',
    '2015-03-04',
    '2015-03-05',
    '2015-03-06',
    '2015-03-07',
    '2015-03-08',
    '2015-03-09',
    '2015-03-10',
    '2015-03-11',
    '2015-03-12',
    '2015-03-13',
    '2015-03-14',
    '2015-03-15',
    '2015-03-16',
    '2015-03-17',
    '2015-03-18',
    '2015-03-19',
    '2015-03-20',
    '2015-03-21',
    '2015-03-22',
    '2015-03-23',
    '2015-03-24',
    '2015-03-25',
    '2015-03-26',
    '2015-03-27',
    '2015-03-28',
    '2015-03-29',
    '2015-03-30',
    '2015-03-31',
    '2015-04-01',
    '2015-04-02',
    '2015-04-03',
    '2015-04-04',
    '2015-04-05',
    '2015-04-06',
    '2015-04-07',
    '2015-04-08',
    '2015-04-09',
    '2015-04-10'],
   'y': array([  1.15972096,   2.61814088,   3.88137325,   5.29048469,
            5.45267539,   3.81069438,   2.56234866,   2.48937755,
            0.8885047 ,   1.42449396,   0.71398967,   2.23910212,
            0.54126357,   0.73594014,   2.40198747,   3.19970865,
            2.52258531,   2.70797982,   2.54410633,   2.48464073,
            5.1813573 ,   2.59804433,   4.05577506,   4.08723678,
            2.76233312,   5.48137562,   5.1817834 ,   4.33260202,
            6.34032271,   5.77686686,   5.36107125,   5.3328116 ,
            5.18925194,   4.66230224,   3.13874977,   3.17508182,
            3.1986122 ,   3.93469004,   5.25255624,   7.19535909,
            7.34507408,   5.84095987,   4.86338622,   4.84782583,
            5.67626555,   6.05159475,   6.42908244,   7.44653376,
            7.5657167 ,   8.51323216,   8.8481517 ,   8.55037342,
            8.81314153,   7.87584403,   6.87637759,   7.95065448,
           10.02877143,   9.40492693,   9.97178043,  10.33123802,
            9.64544183,   8.26611953,   8.27434462,   7.27165506,
            5.6709917 ,   5.45050085,   5.96900983,   6.60072447,
            7.78534843,   7.84492484,   5.55131151,   5.572     ,
            3.13828897,   2.72750849,   2.11072395,   1.32132751,
            0.54197202,   0.50989272,   1.49837952,   2.85919443,
            4.4573978 ,   3.09151662,   4.18383466,   5.24095248,
            5.8356141 ,   5.63447151,   6.37295694,   4.82640324,
            5.18567817,   4.75585821,   3.16939133,   4.46429576,
            4.47270732,   4.14814115,   5.31533815,   5.80154658,
            6.86509013,   6.66442417,   6.52294608,   7.33802256])}],
 'layout': {'bargap': 0.01,
  'legend': {'bgcolor': '#F5F6F9', 'font': {'color': '#4D5663'}},
  'paper_bgcolor': '#F5F6F9',
  'plot_bgcolor': '#F5F6F9',
  'xaxis': {'gridcolor': '#E1E5ED',
   'tickfont': {'color': '#4D5663'},
   'title': '',
   'titlefont': {'color': '#4D5663'},
   'zerolinecolor': '#E1E5ED'},
  'yaxis': {'gridcolor': '#E1E5ED',
   'tickfont': {'color': '#4D5663'},
   'title': '',
   'titlefont': {'color': '#4D5663'},
   'zeroline': False,
   'zerolinecolor': '#E1E5ED'}}}
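
Because the structure shown above is a nested mapping of `data` and `layout`, manual tweaks amount to ordinary key assignments. The sketch below uses a plain dict with the same keys as a stand-in (shortened to three points); it is not an actual cufflinks return value, and `tweak_figure` is a hypothetical helper.

```python
import copy

# Stand-in with the same keys as the output above, shortened to three points
figure = {
    'data': [{
        'line': {'color': 'rgba(255, 153, 51, 1.0)', 'width': '1.3'},
        'mode': 'lines',
        'name': 'GOOG',
        'type': 'scatter',
        'x': ['2015-01-01', '2015-01-02', '2015-01-03'],
        'y': [1.15972096, 2.61814088, 3.88137325],
    }],
    'layout': {
        'paper_bgcolor': '#F5F6F9',
        'xaxis': {'title': ''},
        'yaxis': {'title': '', 'zeroline': False},
    },
}

def tweak_figure(fig, title, y_title, line_width):
    """Return a modified deep copy so that the original figure stays untouched."""
    tweaked = copy.deepcopy(fig)
    tweaked['layout']['title'] = title
    tweaked['layout']['yaxis']['title'] = y_title
    for trace in tweaked['data']:
        trace['line']['width'] = line_width
    return tweaked

custom = tweak_figure(figure, 'GOOG Returns', 'Returns', 2.5)
print(custom['layout']['title'], custom['data'][0]['line']['width'])
```

Working on a deep copy keeps the original figure available for comparison after the tweak.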

We can also get the Data object directly

In [63]:

data=df.to_iplot()

In [64]:

data[0]['name']='My Custom Name'

And pass this directly to iplot

In [66]:

df.iplot(data=data,filename='Tutorial Custom Name')

Out[66]:
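
The same idea extends from one trace to many. The sketch below renames several traces at once, assuming the Data object behaves like a list of dicts that each carry a `name` key, as the indexing `data[0]['name']` above suggests. `rename_traces` and `data_stub` are hypothetical names used only for illustration.

```python
def rename_traces(traces, new_names):
    """Return copies of the traces, renamed wherever new_names has an entry."""
    renamed = []
    for trace in traces:
        updated = dict(trace)  # shallow copy so the input list is not modified
        updated['name'] = new_names.get(trace['name'], trace['name'])
        renamed.append(updated)
    return renamed

# Stand-in for the Data object: one dict per trace
data_stub = [{'name': 'IBM', 'type': 'scatter'},
             {'name': 'MSFT', 'type': 'scatter'}]
renamed = rename_traces(data_stub, {'IBM': 'My Custom Name'})
print([trace['name'] for trace in renamed])
```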
