This lab is adapted from the fantastic D3 Tutorial by Scott Murray. We’ll work through a quick version of his tutorial, and then build a simple interactive visualization dashboard for a sample dataset. If you're interested in a more thorough introduction to D3, take a look at his book Interactive Data Visualization for the Web, which is available for free online. Additionally, if you're thinking of working with D3 for your project, have a look at this fantastic gallery for some really good starting points!
We'll use this lab to learn and practice D3 fundamentals, so that the core concepts are solid before we move on to fancier visualizations.
Don't forget to fill out the response form!
HTML - By now, hopefully you’ve heard about HTML. HTML, invented by lazy physicists, is the language of the web, and we’ll use it for laying out and displaying documents - in this case, our visualizations.
CSS - In modern webpages, HTML is used for layout of elements, while CSS is used to style the visual presentation of HTML, doing things like setting fonts and line widths, etc. We'll make heavy use of CSS to style our visualizations.
DOM - A DOM (Document Object Model) is an object model used to represent HTML documents as well as documents in other markup formats. A DOM can be thought of as a tree of nodes, where each node represents an element in the document, with the tree rooted at the document itself. A visualization of the DOM for a simple table looks like this:
JavaScript - A client-side scripting language embedded in most web browsers. It integrates closely with the current page's DOM and can manipulate it.
SVG - Scalable Vector Graphics, a format for describing vector graphics in XML. It is very low level - it allows for the description of basic shapes. A green rectangle with a black border in SVG looks like this:
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" width="200" height="200">
<rect width="150" height="150" fill="rgb(0, 255, 0)" stroke-width="1" stroke="rgb(0, 0, 0)" transform="translate(5,5)"/>
</svg>
Which, rendered by your web browser, looks like this:
Since it’s XML - we can load it as a DOM! And we can manipulate it with Javascript!
D3 - A javascript library to generate visualizations. Works on things other than SVG, but SVG enables us to do things we can’t with raster images - including interactivity.
The key idea behind D3 is that we can programmatically manipulate SVG elements in a webpage to create great visualizations, and with the help of a little javascript, we can make them interactive.
Since an iPython notebook is just a webpage, we can actually modify elements inside it while we're running. Let's take a look at what we mean - create a new html element in the middle of this webpage by running the following snippet.
%%html
<div id="d3-lab"></div>
What did we just do? We created an empty element (a div) with the ID 'd3-lab' in the middle of this very web page.
Hmm... so what can we do with it? Well, let's execute some JavaScript inline as well.
%%javascript
require.config({paths: {d3: "/files/d3.v3.min"}});
require(["d3"], function(d3) {
d3.select("#d3-lab").append("b").text("My insanely cool visualization.");
});
You should see some text right beneath your html block above. OK! This is a building block for creating visualizations.
Ignoring the boilerplate - let's decompose the line that really matters - d3.select("#d3-lab").append("b").text("My insanely cool visualization."); What is this thing doing?
First - d3 is an object which contains a number of methods. One of them is select, which allows us to select DOM elements.
One of the key things in D3 is that almost all methods return a selection. Thus, we can chain together any methods that are available on a selection.
In this example, once we've selected the element with the ID we care about (d3-lab), we then append a b element (b is bold in HTML). The append method returns another selection - namely, the element it just created. We then chain the text method onto the append, which sets the text inside its selection.
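To make the chaining idea concrete, here is a minimal sketch of a chainable object. It is a toy stand-in, not D3's implementation: the Selection constructor and the plain-object "nodes" are hypothetical, and only mimic the way append hands back a new selection while text hands back the one it was called on.

```javascript
// Toy stand-in for a D3 selection; "nodes" are plain objects, not real DOM elements.
function Selection(nodes) {
  this.nodes = nodes;
}

// append() creates one child per node and returns a NEW selection of those children.
Selection.prototype.append = function (tag) {
  var children = this.nodes.map(function (parent) {
    var child = { tag: tag, text: "", children: [] };
    parent.children.push(child);
    return child;
  });
  return new Selection(children);
};

// text() sets the text of every node and returns the SAME selection, so chaining continues.
Selection.prototype.text = function (value) {
  this.nodes.forEach(function (node) { node.text = value; });
  return this;
};

// Mirrors d3.select("#d3-lab").append("b").text(...) on a fake div.
var lab = { tag: "div", text: "", children: [] };
var bold = new Selection([lab]).append("b").text("My insanely cool visualization.");
console.log(lab.children[0].tag, "-", lab.children[0].text);
```

Because every method returns a selection, the order of the chain matters: text here acts on the new b element, not on the div.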
We now have the building blocks we need to build real visualizations with D3.
One of the central concepts in D3 is binding data to visual elements. This comes from the concept that a visualization is a transformation of data elements to some visual representation of that data. This might be transforming a bunch of numbers into a bar chart, or using a statistical function to extract important words from Shakespeare's text and projecting them onto blades of a fancy chandelier above a bar in a museum.
First, let's clear up some terminology:
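Data Elements are the values in our dataset. DOM Nodes are the page elements that represent them; for a bar chart, these are SVG rect's - one for each bar.
Binding Data Elements to DOM Nodes is the process of associating Data Elements with a selection of DOM Nodes in a one-to-one mapping. You can think of this like a join in a database system.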
Let's look at a concrete example:
%%html
<div id="d3-bind"></div>
%%javascript
require.config({paths: {d3: "/files/d3.v3.min"}});
require(["d3"], function(d3) {
//Setup local variables.
var w = 500;
var h = 100;
var barPadding = 1;
//Set up our dataset.
var dataset = [15,12,21,42,12,10,1];
//Create an svg element that's w x h in size.
var svg = d3.select("#d3-bind")
.append("svg")
.attr("width", w)
.attr("height", h);
//Bind our data to SVG and create a rectangle for each one
//This is where the magic happens!
var bars = svg.selectAll("rect")
.data(dataset)
.enter()
.append("rect");
//For each bar, set its attributes as a function of its position
//in the dataset and its value.
bars.attr("x", function(d, i) { return i * (w / dataset.length); })
.attr("y", function(d) { return h - (d * 4); })
.attr("width", w / dataset.length - barPadding)
.attr("height", function(d) { return d * 4; })
.attr("fill", "teal");
});
Alright, this function is a bit more substantial. What did we do here?
First, we set up some local variables - width and height of the SVG element that we're going to manipulate. Then - we create our dataset. Here, it's just a list of numbers.
Next, we bind our dataset to our SVG - more specifically, we bind each element in our dataset to a rect element inside our SVG. But - we don't actually have any rect elements in our SVG - so won't that selection be empty? This is where the mysterious function enter comes in. The selection it returns acts as a placeholder for any items that don't exist yet. This picture from the D3 paper you read for class summarizes what this function does pretty well.
That is, we can think of the binding process as a database full outer join: enter handles the items in the result where the data is not bound to an element yet, update handles the items where the data is bound to an element, and exit handles the elements that aren't bound to any data. Another way to think of this is with a Constructor/Update/Destructor pattern from Object Oriented Programming. The first time a data item comes in, we run enter on it, and when it is deleted we run exit.
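The three groups of the join can be sketched in plain JavaScript. This is a simplified, index-based illustration under our own assumptions, not D3's actual implementation: the join function is hypothetical, and strings stand in for the rect elements already on the page.

```javascript
// Simplified, index-based sketch of a data join; not D3's actual implementation.
// "nodes" stands in for the elements already on the page.
function join(nodes, data) {
  var shared = Math.min(nodes.length, data.length);
  return {
    // data items with no matching node yet: these need new elements
    enter: data.slice(shared),
    // nodes paired with a data item: these get their bound data refreshed
    update: nodes.slice(0, shared).map(function (node, i) {
      return { node: node, datum: data[i] };
    }),
    // nodes left without a data item: these would be removed
    exit: nodes.slice(shared)
  };
}

// Empty SVG, three data items: everything lands in enter.
var first = join([], [15, 12, 21]);
// Three existing nodes, two data items: one node lands in exit.
var later = join(["rect0", "rect1", "rect2"], [11, 12]);
console.log(first.enter.length, later.update.length, later.exit);
```

In our bar chart the SVG starts out empty, which is the first case: every data item is in enter, so append("rect") creates one bar per item.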
Finally, we set some visual attributes of each rect item in our dataset. These attributes: x and y position of each bar, height and width of each bar, and fill color - are defined either as values (as in the case of width and fill), functions of each data item (as in the case of height and y), or functions of the data item and its index in the dataset (as in the case of x). D3 figures out how to set the attribute based on the type of the second argument you pass to the attr method.
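Here is a rough sketch of that dispatch on the second argument. The setAttr helper and the shape of the "node" objects are hypothetical; the real attr method works on DOM elements, but the value-versus-function decision looks much like this.

```javascript
// Sketch of how an attr-like helper could dispatch on its second argument.
// Each "node" is a plain object holding its bound datum and an attrs map (hypothetical shape).
function setAttr(nodes, name, value) {
  nodes.forEach(function (node, i) {
    node.attrs[name] = (typeof value === "function")
      ? value(node.datum, i)   // function: called with the datum and its index
      : value;                 // constant: the same value for every node
  });
  return nodes;
}

var h = 100;
var nodes = [15, 12, 21].map(function (d) { return { datum: d, attrs: {} }; });
setAttr(nodes, "fill", "teal");                                   // plain value
setAttr(nodes, "height", function (d) { return d * 4; });         // function of the datum
setAttr(nodes, "y", function (d) { return h - d * 4; });          // function of the datum
setAttr(nodes, "x", function (d, i) { return i * 50; });          // function of the index
console.log(nodes[2].attrs);
```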
width and x attribute definitions).
At this point, you've seen how to manipulate D3 selections, bind data to SVG elements, and update visual attributes based on our data. So far the result is a simple bar chart, but we're reasoning about everything in pixel space, and everything seems pretty low level.
Fortunately, D3 provides a bunch of tools to help us plot data on different scales, with different axes, or with a different layout. We'll have a look at each of these now.
"Scales are functions that map from an input domain to an output range." - Mike Bostock
You can think about scales as the things that let you load up data and manipulate it in one unit - inches, feet, tons - and translate it into some desired output unit (often pixels on the screen of our web browser).
D3 provides a set of functionality to create these functions. Let's take a look at a simple linear scale. D3 contains other kinds of scales as well - logarithmic, power, quantile, etc.
A scale can be constructed with code that looks like the following:
var scale = d3.scale.linear().domain([x1,x2]).range([y1,y2]);
After executing this code, the value of scale is a function which translates values from the domain [x1,x2] to the range [y1,y2] with a linear transformation.
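The linear transformation itself is short enough to write by hand. The sketch below is our own hypothetical linearScale helper, not D3's code, and it skips extras such as clamping; it only shows the arithmetic a linear scale performs.

```javascript
// Hand-rolled linear scale: maps a value from the input domain to the output range.
// A sketch of the arithmetic only; not D3's implementation.
function linearScale(domain, range) {
  return function (x) {
    // t is how far x sits between the two ends of the domain (0 at the start, 1 at the end).
    var t = (x - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

// Data values 0..85 mapped to pixel rows 350..50 (the range is flipped because
// SVG's y coordinate grows downward).
var yScale = linearScale([0, 85], [350, 50]);
console.log(yScale(0), yScale(42.5), yScale(85));
```

Note that the range may run from a larger number to a smaller one; that is how a scale turns "bigger value" into "higher on the screen".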
In the code below, we've created a few example scales - designed to help with sizing of the bars, as well as positioning them - these scales take care of adding in some padding on either side of the chart.
%%html
<div id="d3-scale"></div>
%%javascript
require.config({paths: {d3: "/files/d3.v3.min"}});
require(["d3"], function(d3) {
//Setup local variables.
var w = 500;
var h = 400;
var barPadding = 20;
var padding = 50;
//Set up our dataset.
var dataset = [15,12,21,42,12,10,5,85];
//Define your scales here
var xScale = d3.scale.linear()
.domain([0, dataset.length])
.range([padding,(w-padding)]);
var widthScale = d3.scale.linear()
.domain([0, dataset.length])
.range([0, (w-2*padding)]);
var yScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([h-padding,padding]);
var heightScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([0,h-2*padding]);
//Create an svg element that's w x h in size.
var svg = d3.select("#d3-scale")
.append("svg")
.attr("width", w)
.attr("height", h);
//Bind our data to SVG and create a rectangle for each one
//This is where the magic happens!
var bars = svg.selectAll("rect")
.data(dataset)
.enter()
.append("rect");
//For each bar, set its attributes as a function of its position
//in the dataset and its value.
bars.attr("x", function(d, i) { return xScale(i); })
.attr("y", function(d) { return yScale(d); })
.attr("width", widthScale(0.8))
.attr("height", function(d) { return heightScale(d); })
.attr("fill", "teal");
});
Linear scales may seem trivial - but constructing these translations automatically can be super powerful when you're working on more complicated visualizations.
Axes are functions which generate the visual elements of an axis - the x-axis, y-axis, gridlines, etc. you see in many visualizations.
In order to create an axis, we should have a scale for that axis. Luckily, we've already defined a few.
Creating a new axis looks something like this:
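//Describe the axis: which scale it follows, which side its ticks sit on, and how many ticks we'd like.
var yAxis = d3.svg.axis()
.scale(yScale)
.orient("left")
.ticks(5);
//Add a group element, move it to the left edge of the chart, and draw the axis into it.
svg.append("g")
.attr("class", "axis")
.attr("transform", "translate(" + xScale(0) + ",0)")
.call(yAxis);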
In that code, first we create a new axis, then we assign it a scale (in this case, the yScale) and orient it so that its ticks and labels sit to the left. Next, we request the number of ticks that we want the axis to display (by default, evenly spaced). Finally we append a g element to the svg element we've already selected, translate it to the right place, and draw the axis into it. The call piece of the last bit of code reflects the fact that yAxis is a function which generates the axis elements inside the selection it is called on.
We could style this axis by adding some CSS to the page that would control the line width, what the tick marks look like, etc.
So far, we've looked at static visualizations. But the web is an incredibly dynamic medium, and we've already seen that we can change it. D3 provides some nice tools for describing what to do when data changes, and ways to control movement in our visualizations - let's make our charts move!
In this section, we'll work with our original bar chart again, but with an additional element inserted to act like a button.
%%html
<div id="d3-update"></div>
%%javascript
require.config({paths: {d3: "/files/d3.v3.min"}});
require(["d3"], function(d3) {
//Setup local variables.
var w = 500;
var h = 400;
var barPadding = 20;
var padding = 50;
//Set up our dataset.
var dataset = [ 15, 12, 21, 42, 12, 10, 5, 85 ];
//Define your scales here
var xScale = d3.scale.linear()
.domain([0, dataset.length])
.range([padding,(w-padding)]);
var widthScale = d3.scale.linear()
.domain([0, dataset.length])
.range([0, (w-2*padding)]);
var yScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([h-padding,padding]);
var heightScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([0,h-2*padding]);
//Clear out old content, then create an svg element that's w x h in size.
d3.select("#d3-update").selectAll("*").remove();
var svg = d3.select("#d3-update")
.append("svg")
.attr("width", w)
.attr("height", h);
//Bind our data to SVG and create a rectangle for each one
//This is where the magic happens!
var bars = svg.selectAll("rect")
.data(dataset)
.enter()
.append("rect");
//For each bar, set its attributes as a function of its position
//in the dataset and its value.
bars.attr("x", function(d, i) { return xScale(i); })
.attr("y", function(d) { return yScale(d); })
.attr("width", widthScale(0.8))
.attr("height", function(d) { return heightScale(d); })
.attr("fill", "teal");
//Add a text element to our div
d3.select("#d3-update").append("p").text("Click here to update our data.");
//On click, update with new data.
d3.select("#d3-update").select("p")
.on("click", function() {
//New values for dataset
dataset = [ 11, 12, 62, 20, 18, 17, 16, 18 ];
//Update all rects
svg.selectAll("rect")
.data(dataset)
.attr("y", function(d) {
return yScale(d);
})
.attr("height", function(d) {
return heightScale(d);
});
});
});
What did we do here? We simply added a new paragraph element to the end of our div, then we bound a javascript listener to that paragraph.
When we bind a listener to the element, it means that we specify a function to be executed when the listener is triggered - in this case when the text in the paragraph element is clicked.
In this function, we define a new dataset, and then bind the data to all rect objects in our svg. That is - we overwrite the old data values with the new ones, and then update the y and height attributes of the bars accordingly.
Ok, so we can make the data change, but that change was kind of abrupt.
D3 has some black magic built in that lets us animate the data.
Try adding the following two lines after ".data(dataset)" in the "update all the rects" box.
.transition()
.duration(2000)
Did you see that? Two extremely simple lines of code and D3 animated our chart for us.
What's more - you can change the functions it uses to animate. Try adding:
.ease("bounce")
Below duration(2000)
D3 provides several easing functions, and transitions can animate not just shape changes but color changes, too.
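To get a feel for what an easing function does, here is a small sketch in plain JavaScript. The names linear, quadIn and tween are our own, and these are not D3's implementations; the point is only that easing reshapes time before the attribute value is interpolated.

```javascript
// Easing functions map elapsed time t in [0, 1] to animation progress in [0, 1].
function linear(t) { return t; }
function quadIn(t) { return t * t; }   // starts slowly, then speeds up

// Value of an attribute at time t when animating from "start" to "end".
function tween(start, end, ease, t) {
  return start + (end - start) * ease(t);
}

// Halfway through a transition from 100 to 300:
console.log(tween(100, 300, linear, 0.5), tween(100, 300, quadIn, 0.5));
```

Both easings start and end at the same values; they differ only in how the bar moves in between, which is why swapping the easing changes the feel of the animation without changing the final chart.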
Try other values for ease: available options include "linear", "circle", "elastic". Can you describe how these are different from "bounce"?
Add .delay(function (d,i) { return i*200; }) beneath transition above. What happens?
Now that we know how to make static visualizations, and we have an understanding of how we can handle updates and changes to our charts - let's put that knowledge together to make interactive visualizations.
Let's make a final version of our chart in which we'll add some interactivity. The effect we're creating is the following:
We've actually already seen a little bit of interactivity - when we clicked the paragraph text the chart updated, and we're going to use a very similar trick to handle the mouse interaction.
%%html
<div id="d3-interactive"></div>
%%javascript
require.config({paths: {d3: "/files/d3.v3.min"}});
require(["d3"], function(d3) {
//Setup local variables.
var w = 500;
var h = 400;
var barPadding = 20;
var padding = 50;
//Set up our dataset.
var dataset = [ 15, 12, 21, 42, 12, 10, 5, 85 ];
//Define your scales here
var xScale = d3.scale.linear()
.domain([0, dataset.length])
.range([padding,(w-padding)]);
var widthScale = d3.scale.linear()
.domain([0, dataset.length])
.range([0, (w-2*padding)]);
var yScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([h-padding,padding]);
var heightScale = d3.scale.linear()
.domain([0, d3.max(dataset)])
.range([0,h-2*padding]);
//Clear out old content, then create an svg element that's w x h in size.
d3.select("#d3-interactive").selectAll("*").remove();
var svg = d3.select("#d3-interactive")
.append("svg")
.attr("width", w)
.attr("height", h);
//Bind our data to SVG and create a rectangle for each one
//This is where the magic happens!
var bars = svg.selectAll("rect")
.data(dataset)
.enter()
.append("rect");
//For each bar, set its attributes as a function of its position
//in the dataset and its value.
bars.attr("x", function(d, i) { return xScale(i); })
.attr("y", function(d) { return yScale(d); })
.attr("width", widthScale(0.8))
.attr("height", function(d) { return heightScale(d); })
.attr("fill", "teal");
//Create a tooltip.
var tip = d3.select("#d3-interactive")
.append("div")
.attr("id", "tooltip");
tip.append("p")
.attr("id", "value")
.style("text-align", "center");
//Style the tooltip and keep it hidden until a bar is hovered.
tip.style("width", "30px")
.style("background-color", "white")
.style("position", "absolute")
.style("border", "2px solid")
.style("display", "none");
//Your code for the DIY goes here.
});
The code should look pretty familiar - with one catch - we've added a visual element called "tooltip" that is hidden.
We can bind events to a set of elements by making a selection, which we've already seen before. Here, the elements we want are the rect objects in the SVG.
The code for handling mouse ins is not too difficult. It should look something like this:
.on("mouseover", function(d,i) {
tip
.style("left", (padding+xScale(i))+"px")
.style("top", yScale(d/2)+"px")
.style("display", null)
.select("#value")
.text(d);
d3.select(this).attr("fill", "red");
})
Basically, we update the tooltip's horizontal and vertical position, make it visible by setting its "display" style to null (which removes any inline "display" value, so the tooltip is no longer hidden), put the bar's value in its text, and set the color of the current bar to "red".
With the mouse outs, we'll simply hide the tooltip and then make the bar teal again. The event for a mouseout is called "mouseout".
So, we've seen some of the features that D3 provides out of the box, but there's much more - here are a few other things D3 can help you do.
Contrary to what the name implies, a layout does not actually handle how visual elements are laid out on the screen. Instead, layouts can be thought of as helper functions that take your input data and transform it into new data that is easier for certain visualizations to work with.
Example layouts include functions to translate your data (bar height) into information for a pie chart (width of wedge in degrees and offset), a layout for stacked bar charts, a layout for graph data (nodes and edges), and many cartographic layouts for drawing maps.
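As a rough idea of the kind of transformation a pie layout performs, here is a hand-written sketch. The pieAngles function is hypothetical and is not D3's pie layout; it keeps the input order, and it reports angles in radians rather than degrees.

```javascript
// Sketch of a pie-style layout: turns raw values into wedge angles (in radians).
// Not D3's pie layout; wedges are kept in input order.
function pieAngles(values) {
  var total = values.reduce(function (sum, v) { return sum + v; }, 0);
  var start = 0;
  return values.map(function (v) {
    // Each wedge sweeps a share of the full circle proportional to its value.
    var sweep = (v / total) * 2 * Math.PI;
    var wedge = { value: v, startAngle: start, endAngle: start + sweep };
    start += sweep;
    return wedge;
  });
}

var wedges = pieAngles([1, 1, 2]);
console.log(wedges);
```

Notice that nothing is drawn here: the output is just new data (one object per wedge) that a drawing step can then bind to shapes.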
D3 also includes tools to load up CSV and JSON files from a URL. In this case, your data becomes a record per row of your file, rather than a list of numbers. Handling this is slightly more complicated, but it's very similar to what we did above.
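To picture what "a record per row" means, here is a small sketch that turns CSV text into an array of objects. The parseSimpleCsv helper and the column names name and value are made up for illustration, it assumes a simple file with no quoted fields, and it is not D3's parser.

```javascript
// Sketch: turn simple CSV text (no quoted fields) into one record per row.
// Hypothetical helper for illustration; not D3's CSV parser.
function parseSimpleCsv(text) {
  var lines = text.trim().split("\n");
  var header = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var record = {};
    header.forEach(function (name, i) { record[name] = cells[i]; });
    return record;
  });
}

var rows = parseSimpleCsv("name,value\nA,15\nB,12\n");
// Every cell arrives as a string, so convert to a number before doing math with it.
var values = rows.map(function (row) { return +row.value; });
console.log(rows, values);
```

With records instead of bare numbers, the attribute functions from earlier read a field from each record rather than using the data item directly.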
To handle shapes beyond basic circles and lines, SVG supports the concept of paths. A path is like a line drawn with a digital pen. You put your pen down on a canvas, move to some other point, and continue until you pick your pen up. Optionally you might choose to fill in the area enclosed by your curve.
A triangle in SVG looks like this:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg width="4cm" height="4cm" viewBox="0 0 400 400"
xmlns="http://www.w3.org/2000/svg" version="1.1">
<title>Example triangle01- simple example of a 'path'</title>
<desc>A path that draws a triangle</desc>
<rect x="1" y="1" width="398" height="398"
fill="none" stroke="blue" />
<path d="M 100 100 L 300 100 L 200 300 z"
fill="red" stroke="blue" stroke-width="3" />
</svg>
As you can imagine, writing out complicated curves like this by hand would be tricky, so D3 provides a line method to help you draw lines more easily.
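To see what such a helper has to produce, here is a minimal sketch that builds the d attribute of a path from a list of points. The pathData function is our own hypothetical helper, not D3's line generator, and it only handles straight segments.

```javascript
// Sketch: build the "d" attribute of an SVG path from [x, y] points.
// Straight segments only; a hypothetical helper, not D3's line generator.
function pathData(points, closed) {
  var d = points.map(function (p, i) {
    // "M" moves the pen to the first point; "L" draws a line to each later point.
    return (i === 0 ? "M " : "L ") + p[0] + " " + p[1];
  }).join(" ");
  // "z" closes the shape by drawing back to the starting point.
  return closed ? d + " z" : d;
}

// The three corners of the triangle from the SVG example above.
var triangle = pathData([[100, 100], [300, 100], [200, 300]], true);
console.log(triangle);
```

The result is the same string used in the path element of the triangle example, which is the pen metaphor written down: move, line, line, close.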
Ok - so D3 is pretty cool, but manipulating shapes and positioning them on the canvas just so feels pretty low-level. You might be asking yourself "Aren't there libraries like Matplotlib where I can say, 'here's my data, give me a bar chart.'?" The answer is, YES!
Some options:
Figure objects to D3 visualizations. It has iPython notebook integration built in.
ggplot2 package. ggplot2's inventor, Hadley Wickham, now actively contributes to Vega.
So - try making those your first stop when designing an interactive visualization - it may be that you can create what you want in just a few lines of code, rather than spending hours fiddling with Javascript and SVG.