This notebook requires IPython Notebook 2.0 or later to run the inline d3.js JavaScript. As of this writing (May 11, 2014), Anaconda still installs IPython 1.x, even though IPython 2.x was released on April 1, 2014. After installing Anaconda, you can update it to IPython 2.x by typing the following at a command prompt:

conda update conda
conda update ipython

On Windows at least, the update may remove the convenient launch icon for IPython Notebook. If so, you can launch it manually from a command prompt with:

ipython notebook

On Windows, install the following GnuWin32 packages and add C:\Program Files (x86)\GnuWin32\bin to your PATH. The notebook shells out to wget, unzip, and rm:

http://gnuwin32.sourceforge.net/packages/wget.htm
http://gnuwin32.sourceforge.net/packages/unzip.htm
http://gnuwin32.sourceforge.net/packages/coreutils.htm
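
Before running the rest of the notebook, it can help to confirm that the command-line tools are actually reachable through PATH. The following is a minimal sketch, assuming Python 3.3 or later (for `shutil.which`); `missing_tools` is a hypothetical helper name, not part of the original notebook.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# The download loop further down shells out to wget, unzip and rm.
print(missing_tools(["wget", "unzip", "rm"]))
```

An empty list means all three tools were found; any name that is printed still needs to be installed or added to PATH.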

In [584]:
import math
import os
import datetime
import numpy
import pandas
import matplotlib.pyplot as plt
%matplotlib inline
In [585]:
def tofloat(x):
    try:
        return float(x)
    except ValueError:
        return None
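
As a quick illustration of what `tofloat` does with values that are not numbers, here is a small standalone sketch. The sample strings are made up for illustration and are not taken from the downloaded data.

```python
def tofloat(x):
    """Convert x to a float, or return None if it does not parse as a number."""
    try:
        return float(x)
    except ValueError:
        return None


# Hypothetical raw field values: two readings, a non-numeric marker, an empty field.
samples = ["29.92", " 30.01 ", "M", ""]
print([tofloat(s) for s in samples])
# → [29.92, 30.01, None, None]
```

Unparseable entries come back as `None` rather than raising, so the processing loop can filter them out afterwards instead of failing on the first bad field.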

Cities were hand-selected. Each city's WBAN station identifier was looked up manually at http://cdo.ncdc.noaa.gov/qclcd/QCLCD?prior=N, and the INCITS code of its county was looked up manually at http://en.wikipedia.org/wiki/List_of_United_States_counties_and_county_equivalents#Table (the TopoJSON map file also identifies counties by INCITS code).

In [586]:
dfcities = pandas.DataFrame([{'City':'Centennial', 'WBAN':93067, 'INCITS':8005},
                             {'City':'San Diego', 'WBAN':3131, 'INCITS':6073},
                             {'City':'Washington, DC', 'WBAN':13743, 'INCITS':11001},
                             {'City':'San Francisco', 'WBAN':23234, 'INCITS':6075},
                             {'City':'New York City', 'WBAN':94728, 'INCITS':36061},
                             {'City':'Atlanta', 'WBAN':13874, 'INCITS':13121},
                             {'City':'Phoenix', 'WBAN':23183, 'INCITS':4013},
                             {'City':'Dallas', 'WBAN':3927, 'INCITS':48113},
                             {'City':'Seattle', 'WBAN':24233, 'INCITS':53033},
                             {'City':'Kansas City', 'WBAN':3947, 'INCITS':29165},
                             {'City':'Minneapolis', 'WBAN':14922, 'INCITS':27053},
                             {'City':'New Orleans', 'WBAN':12916, 'INCITS':22051},
                             {'City':'Chicago', 'WBAN':94846, 'INCITS':17031},
                             {'City':'Anchorage', 'WBAN':26451, 'INCITS':2020},
                             {'City':'Honolulu', 'WBAN':22521, 'INCITS':15003},
                             {'City':'Boston', 'WBAN':14739, 'INCITS':25025},
                             {'City':'Miami', 'WBAN':12839, 'INCITS':12086},
                             {'City':'Detroit', 'WBAN':94847, 'INCITS':26163},
                             {'City':'Pittsburgh', 'WBAN':94823, 'INCITS':42003},
                             {'City':'Las Vegas', 'WBAN':23169, 'INCITS':32003},
                             {'City':'Houston', 'WBAN':12960, 'INCITS':48201}])
In [587]:
os.mkdir("TempBarometerFiles")
os.chdir("TempBarometerFiles")
In [588]:
processingyear = datetime.date.today().year
processingmonth = datetime.date.today().month
dfdiff=pandas.DataFrame(numpy.zeros(0,dtype=[('INCITS', 'a10'),('Range', 'f8')]))
for x in range(0, 12):
    dt = datetime.datetime(processingyear, processingmonth, 1) - datetime.timedelta(days=1)
    processingyear = dt.year
    processingmonth = dt.month
    os.system("wget -q http://cdo.ncdc.noaa.gov/qclcd_ascii/QCLCD" + str(processingyear) + str(processingmonth).zfill(2) + ".zip")
    os.system("unzip QCLCD" + str(processingyear) + str(processingmonth).zfill(2) + ".zip")
    df = pandas.read_csv(str(processingyear) + str(processingmonth).zfill(2) + "hourly.txt",low_memory=False)
    dfsp = df.merge(dfcities, on="WBAN").ix[:,("INCITS", "Date", "StationPressure")]
    dfsp["StationPressureFloat"] = dfsp["StationPressure"].apply(lambda x: tofloat(x))
    del dfsp["StationPressure"]
    dfsp = dfsp.ix[dfsp["StationPressureFloat"].apply(lambda x: not math.isnan(x))]
    gb = dfsp.groupby(["INCITS","Date"])
    dfminmax = gb.min().join(gb.max(), lsuffix="Min", rsuffix="Max")
    dfdiffcur = pandas.DataFrame(dfminmax["StationPressureFloatMax"] - dfminmax["StationPressureFloatMin"], columns=["Range"])
    dfdiffcur.reset_index(level=0, inplace=True)
    dfdiff = dfdiff.append(dfdiffcur)
    os.system("rm *.txt")
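
The loop above steps backwards one month at a time by taking the first day of the current month and subtracting one day, which always lands in the previous month. The sketch below isolates just that date arithmetic so the sequence of months is easy to inspect; `previous_months` is a hypothetical helper name, and the start date is the one mentioned at the top of this notebook.

```python
import datetime


def previous_months(today, count=12):
    """Return (year, month) pairs for the `count` months before today's month."""
    year, month = today.year, today.month
    result = []
    for _ in range(count):
        # The first of the month minus one day is the last day of the previous month.
        dt = datetime.datetime(year, month, 1) - datetime.timedelta(days=1)
        year, month = dt.year, dt.month
        result.append((year, month))
    return result


months = previous_months(datetime.date(2014, 5, 11))
print(months[0], months[-1])
# → (2014, 4) (2013, 5)

# File name stem built the same way as in the loop, for the most recent month:
print("QCLCD" + str(months[0][0]) + str(months[0][1]).zfill(2))
# → QCLCD201404
```

Note that the current, incomplete month is skipped: the first month processed is the one before today's.
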
In [589]:
arrhist = []
for incits in dfcities["INCITS"].values:
    arrhist.append({'id':incits, 'hist':numpy.histogram(dfdiff.ix[dfdiff["INCITS"]==incits,"Range"],range=(0,0.6))[0]})

Since about January 2014, d3.js cooperates with an AMD loader if one is present. IPython Notebook 2.0 ships with one (require.js), so the notebook has to load d3.js through require.js instead of directly.

In [590]:
%%html
<style type="text/css">
.land {
  fill: silver;
}
.states {
  fill: none;
  stroke: black;
  stroke-linejoin: round;
}
</style>

<div id="county_map" style="height:600px; width:100%"></div>

GeoSparkGrams of Daily Barometric Volatility

Daily variation of barometric pressure (maximum minus minimum for each day) in inches of mercury, for the past 12 months. For each of the hand-picked major cities, the roughly 365 daily ranges for that city are histogrammed.

Each histogram has 10 bins covering daily ranges from 0.00 to 0.60 inches of mercury (horizontal axis). The vertical axis runs from 0 to 150 days.
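
To make the binning concrete, here is a pure-Python sketch of the same idea: compute each day's range as maximum minus minimum, then count the ranges into 10 equal-width bins over 0.00 to 0.60. The readings are invented for illustration, and `histogram10` is a hypothetical stand-in for the `numpy.histogram` call the notebook actually uses.

```python
def histogram10(values, low=0.0, high=0.6, bins=10):
    """Count values into equal-width bins over [low, high]; values outside are ignored."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        if v < low or v > high:
            continue
        # The top edge belongs to the last bin, as in numpy.histogram.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical station pressure readings (inches of mercury), keyed by date.
hourly = {
    "20140401": [29.90, 29.95, 30.01],
    "20140402": [29.80, 30.05],
}
ranges = [max(v) - min(v) for v in hourly.values()]
print(histogram10(ranges))
# → [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
```

Each bin is 0.06 inches wide, so a calm day with a range of about 0.11 falls in the second bin and a more volatile day with a range of about 0.25 falls in the fifth.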

In [591]:
from IPython.core.display import Javascript, display
display(Javascript("var histdata = eval('" + pandas.DataFrame(arrhist).to_json(orient='records') + "');" + """
// https://github.com/mbostock/d3/issues/1693
require.config({
  paths: {
    d3: "http://d3js.org/d3.v3.min",
    topojson: "http://d3js.org/topojson.v1.min"
  }
});

require(["d3", "topojson"], function(d3, topojson) {

    var square = 40;
    var ydaysmax = 150;
    
    var path = d3.geo.path();

    var svg = d3.select('#county_map').append("svg")
        .attr("width", 960)
        .attr("height", 500);

    d3.json("http://mashupguide.net/wwod14/us.json", function(error, us) {
    
        var countylist = histdata.map(function(x) {return x.id});

        svg.insert("path", ".graticule")
           .datum(topojson.feature(us, us.objects.land))
           .attr("class", "land")
           .attr("d", path);

        svg.append("path")
           .datum(topojson.mesh(us, us.objects.states, function(a, b) { return a !== b; }))
           .attr("class", "states")
           .attr("d", path);

        var percountysvg = svg.append("g")
           .selectAll("svg")
           .data(topojson.feature(us,us.objects.counties).features.filter(function(x) {return countylist.indexOf(x.id) >= 0;}))
           .enter()
           .append("svg")
           .attr("x", function(d) {return d3.geo.path().centroid(d)[0] - square/2})
           .attr("y", function(d) {return d3.geo.path().centroid(d)[1] - square/2})           
        
        percountysvg.append("rect")
           .attr("width", square)
           .attr("height", square)
           .attr("fill", "white")
           .attr("stroke", "black")

        var xscale = d3.scale.linear().domain([0, histdata[0].hist.length]).range([0, square]);
        var yscale = d3.scale.linear().domain([0, ydaysmax]).range([square, 0]);

        var areapath = d3.svg.area()
           .x(function(d) { return xscale(d.x); })
           .y0(square)
           .y1(function(d) { return yscale(d.y); })
           .interpolate("linear");
  
        percountysvg.append("path")
           .attr("d", function(d) {return areapath($.grep(histdata, function(x){ return x.id==d.id; })
                                                    [0].hist.map(function(y,i) { return {x:i,y:y}; }));})
           .attr("fill", "blue");
    });
});
"""))
In [592]:
os.chdir("..")
In [593]:
!rm -rf TempBarometerFiles