#!/usr/bin/env python
# coding: utf-8

# # [matta](https://github.com/carnby/matta) - Introduction: Let's Scaffold a Barchart
#
# By [@carnby](https://twitter.com/carnby)
#
# ## Introduction
#
# You have probably seen the barchart example by [Mike Bostock](http://bost.ocks.org/mike/). It is [here](http://bl.ocks.org/mbostock/3885304). In this notebook I explain how to use [matta](https://github.com/carnby/matta) to implement this barchart.
#
# ### Why use matta?
#
# Having an example of a visualization is one thing; having a [reusable implementation](http://bost.ocks.org/mike/chart/) is another. Reusable implementations are not about having a specific function to draw. In my opinion, they are about an entire context where you can easily use your visualization with other datasets.
#
# ### How do we do it?
#
# In this notebook we see the basic _scaffolding_ done by matta to reproduce the example chart and visualize a pandas DataFrame. By being able to use a DataFrame, we can forget about converting the dataset to the specific layout the visualization designer had in mind and instead focus on converting it to a DataFrame (which will probably be very, very easy).
#
# Let's begin.

# ## Initial Setup

# Here we load matta.
#
# If you read the README, you will notice that you can install matta's javascript and css into your IPython profile. That way you do not need to issue an `init_javascript` call. It is here just for demonstration: if you use a core matta visualization and export the notebook to NBViewer, you will need to execute it so that your visitor's browser can load the required js/css files.
#
# If you installed matta into your profile, calling the function does no harm: it detects that matta was already loaded and does nothing.
# In[1]:


import matta

# we do this to load the required libraries when viewing on NBViewer
matta.init_javascript(path='https://rawgit.com/carnby/matta/master/matta/libs')


# ## Data
#
# Mike's example loads a TSV (Tab Separated Values) file with letter frequencies. We can load it directly into a pandas DataFrame.

# In[2]:


import pandas as pd

df = pd.read_csv('http://bl.ocks.org/mbostock/raw/3885304/964f9100166627a89c7e6c23ce8128f5aefd5510/data.tsv', delimiter='\t')
df.head()


# ## Sketching the Visualization
#
# First, let's sketch the visualization by defining its options and its code.
#
# The visualization options or arguments are contained in a dictionary. Note that the dictionary contains a subdictionary named `variables`. Those variables will be exposed as methods of the scaffolded visualization, and are available in code as `_variable_name`.
#
# Note also the `data` dictionary. It indicates that the visualization receives a pandas `DataFrame`. This dataframe is available internally as the `_data_dataframe` variable.
#
# The visualization code is almost a copy-and-paste version of the original example. We just renamed the variables to `_variable_name` and used other auxiliary variables like `_vis_width`, which are exposed by matta.
#
# Note that the code is not strictly javascript. The file is actually expected to be a [jinja2](http://jinja.pocoo.org/) template.
#
# We save this template as `barchart.js`, since the `visualization_js` entry of the configuration points to it.
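# The naming rule described above can be sketched in plain Python. This is only an illustration of the convention stated in the text (`variables` entries become `_variable_name`, `data` entries become `_data_<name>`); `template_names` and `example_config` are hypothetical names, not part of matta, and auxiliary names such as `_vis_width` are not covered.

```python
def template_names(config):
    """Return the template-side names implied by the naming rule in the text."""
    # Each entry under 'data' is assumed to become `_data_<key>`.
    data_names = ['_data_' + key for key in config.get('data', {})]
    # Each entry under 'variables' is assumed to become `_<key>`.
    variable_names = ['_' + key for key in config.get('variables', {})]
    return data_names + variable_names


# A trimmed, illustrative configuration (not the full one used in this notebook).
example_config = {
    'data': {'dataframe': None},
    'variables': {'width': 960, 'height': 500, 'x': 'x', 'y': 'y'},
}

print(template_names(example_config))
# → ['_data_dataframe', '_width', '_height', '_x', '_y']
```

# Reading the template with this list at hand makes it easier to tell which identifiers come from the configuration and which are ordinary javascript locals.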
# **`skeleton/__init__.py`**

# ```python
# VISUALIZATION_CONFIG = {
#     'requirements': ['d3'],
#     'visualization_name': 'barchart',
#     'visualization_js': './barchart.js',
#     'figure_id': None,
#     'container_type': 'svg',
#     'data': {
#         'dataframe': None,
#     },
#     'options': {
#         'background_color': None,
#         'x_axis': True,
#         'y_axis': True,
#     },
#     'variables': {
#         'width': 960,
#         'height': 500,
#         'padding': {'left': 30, 'top': 20, 'right': 30, 'bottom': 30},
#         'x': 'x',
#         'y': 'y',
#         'y_axis_ticks': 10,
#         'color': 'steelblue',
#         'y_label': None,
#         'rotate_label': True,
#     },
# }
# ```

# **`skeleton/template.js`**
#
# ```javascript
# var x = d3.scale.ordinal()
#     .rangeRoundBands([0, _vis_width], .1);
#
# var y = d3.scale.linear()
#     .range([_vis_height, 0]);
#
# if (_y_label == null) {
#     _y_label = _y;
# }
#
# x.domain(_data_dataframe.map(function(d) { return d[_x]; }));
# y.domain([0, d3.max(_data_dataframe, function(d) { return d[_y]; })]);
#
# {% if options.x_axis %}
# var xAxis = d3.svg.axis()
#     .scale(x)
#     .orient("bottom");
#
# container.append("g")
#     .attr("class", "x axis")
#     .attr("transform", "translate(0," + _vis_height + ")")
#     .call(xAxis);
# {% endif %}
#
# {% if options.y_axis %}
# var yAxis = d3.svg.axis()
#     .scale(y)
#     .orient("left");
#
# if (_y_axis_ticks != null) {
#     yAxis.ticks(_y_axis_ticks);
# }
#
# var y_label = container.append("g")
#     .attr("class", "y axis")
#     .call(yAxis)
#     .append("text");
#
# if (_rotate_label) {
#     y_label.attr("transform", "rotate(-90)")
#         .attr("y", 6)
#         .attr("dy", ".71em")
#         .style("text-anchor", "end");
# } else {
#     y_label
#         .attr("y", 6)
#         .attr('x', 12)
#         .attr("dy", ".71em")
#         .style("text-anchor", "start");
# }
#
# y_label.text(_y_label);
# {% endif %}
#
# // NOTE: this is needed for the internal color scale manager.
# _bar_color_update_scale_func(_data_dataframe);
#
# var bar = container.selectAll(".bar")
#     .data(_data_dataframe);
#
# bar.enter().append('rect').classed('bar', true);
#
# bar.exit().remove();
#
# bar.attr("x", function(d) { return x(d[_x]); })
#     .attr("width", x.rangeBand())
#     .attr("y", function(d) { return y(d[_y]); })
#     .attr("height", function(d) { return _vis_height - y(d[_y]); })
#     .attr('fill', _bar_color);
# ```

# This is the actual matta code to display the visualization in the notebook.

# ## Importing the Visualization

# In[3]:


barchart = matta.import_visualization('skeleton')


# In[4]:


barchart(dataframe=df, x='letter', y='frequency', rotate_label=False, bar_color='purple')


# Note that the keyword arguments are keys from the `VISUALIZATION_CONFIG` dictionary. If you use a keyword argument not present in the dictionary, an `Exception` will be raised.
#
# Remember that in the visualization configuration we had a "colorables" section. The colorable `bar_color` was specified as "purple" in the previous chart, but we can also make it dynamic by specifying a source column from the dataframe, a color palette and a scale type:

# In[7]:


barchart(dataframe=df, x='letter', y='frequency', rotate_label=False,
         bar_color={'value': 'letter', 'palette': 'cubehelix', 'n_colors': df.shape[0], 'scale': 'ordinal'})


# ## Visualization Scaffolding
#
# The next step is to scaffold a reusable visualization. The code is very similar:

# In[6]:


barchart(x='letter', y='frequency').scaffold(filename='./scaffolded_barchart.js')


# This creates a file named `scaffolded_barchart.js` which contains a reusable visualization. All variables declared in the arguments dictionary are available as property methods. The values specified when defining the arguments or when scaffolding serve as defaults, but everything is changeable. Note that we did not specify a DataFrame this time!
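# The layering of defaults described above (configuration defaults, then values given when scaffolding, then later changes) and the rejection of unknown keyword arguments can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not matta's implementation; `resolve_variables` and `example_defaults` are hypothetical names.

```python
import copy


def resolve_variables(defaults, **overrides):
    """Return a copy of `defaults` updated with `overrides`, rejecting unknown keys."""
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        # Mirrors the behaviour described in the text: unknown arguments raise.
        raise Exception('Unknown arguments: {}'.format(', '.join(unknown)))
    resolved = copy.deepcopy(defaults)  # leave the original defaults untouched
    resolved.update(overrides)
    return resolved


# A trimmed, illustrative set of defaults (not the full configuration).
example_defaults = {'x': 'x', 'y': 'y', 'rotate_label': True, 'width': 960}

# Values given when scaffolding become the new defaults...
scaffold_defaults = resolve_variables(example_defaults, x='letter', y='frequency')

# ...and can still be changed later, one variable at a time.
final = resolve_variables(scaffold_defaults, width=600)
print(final)
```

# Copying the defaults before updating them keeps each layer independent, so changing a value on one chart does not leak into the configuration or into other charts.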
# ## Testing the Visualization
#
# To test the visualization, we will serialize the DataFrame and then display an IFrame with the visualization using a very simple template (which we, again, copied from the original source by Mike).
#
# matta includes a `dump_data` function that calls a JSON serializer under the hood. This serializer is able to handle DataFrames and other typical Python data structures.

# In[ ]:


from matta import dump_data

dump_data(df, './data.json')


# Now let's write the HTML file:
#
# ```html
#
#
#
#
#
#
# ```
#
# Note that we include `d3-legend` because it is required by the core `matta` library, which is required under the hood by our barchart.

# I uploaded the result to the [following gist](https://gist.github.com/carnby/00f7a94ea97aa9cc0be2). You can see it on [bl.ocks.org](http://bl.ocks.org/carnby/00f7a94ea97aa9cc0be2).

# ## Conclusions
#
# That's it! :)
#
# We implemented a barchart mostly by copy-and-paste. The cool thing is that we didn't have to worry about data formats, since we knew the data was a DataFrame. We also didn't have to worry about dependencies like loading `d3.js`, or about making the visualization reusable, because matta does all that.
#
# With matta you can have _readymade_ visualizations :)