Matplotlib/Basemap and netcdf4-python example

Using netcdf4-python to read some NOMADS openDAP ensemble forecast data, use numpy to do some rudimentary analysis, and have matplotlib and Basemap plot the results.

In [2]:
# Make the output of plotting commands be displayed inline within the notebook,
%matplotlib inline 
from mpl_toolkits.basemap import Basemap  # import Basemap matplotlib toolkit
import numpy as np
import matplotlib.pyplot as plt
from netCDF4 import Dataset  # netcdf4-python module

Open an a NOMADS dataset containing NOAA/NCEP global ensemble forecast output with the netcdf4-python module. Only the last week of forecasts are kept online, so you'll probably have to change the date in this URL to make it work. I have saved the data from this openDAP url here, so you can also download that file and change the URL to a path that points to the downloaded file.

In [3]:
gfs_fcst = Dataset('')
#gfs_fcst = Dataset('') # use local netcdf file instead
print gfs_fcst # get some summary information about the dataset
<type 'netCDF4.Dataset'>
root group (NETCDF3_CLASSIC file format):
    title: GENS all members fcst starting from 00Z23jan2014, downloaded Jan 23 05:52 UTC
    Conventions: COARDS
    dataType: Grid
    history: Sun Jan 26 02:24:37 EST 2014 : imported by GrADS Data Server 2.0
    dimensions: ens, lat, lev, lon, time
    variables: ens, time, lev, lat, lon, absvprs, no4lftxsfc, no5wava500mb, no5wavh500mb, acpcpsfc, albdosfc, apcpsfc, capesfc, cape180_0mb, cfrzrsfc, cicepsfc, cinsfc, cin180_0mb, clwmrprs, cpratsfc, crainsfc, csnowsfc, cwatclm, cworkclm, dlwrfsfc, dpt2m, dswrfsfc, gfluxsfc, gpa1000mb, gpa500mb, hgtsfc, hgtprs, hgt2pv, hgtneg2pv, hgttop0c, hgt0c, hgtmwl, hgttrop, hpblsfc, icecsfc, landsfc, lftxsfc, lhtflsfc, o3mrprs, pevprsfc, potsig995, pratesfc, preslclb, preslclt, presmclb, presmclt, preshclb, preshclt, pressfc, pres2pv, presneg2pv, prescclb, prescclt, presmwl, prestrop, prmslmsl, pvort320k, pwatclm, rhprs, rh2m, rhsg330_1000, rhsg440_1000, rhsg720_940, rhsg440_720, rhsig995, rh30_0mb, rhclm, rhtop0c, rh0c, shtflsfc, snodsfc, soilw0_10cm, soilw10_40cm, soilw40_100cm, soilw100_200cm, spfhprs, spfh2m, spfh30_0mb, sunsdsfc, tcdcclm, tcdcblcll, tcdclcll, tcdcmcll, tcdchcll, tcdcccll, tmax2m, tmin2m, tmplclt, tmpmclt, tmphclt, tmpsfc, tmpprs, tmp_1829m, tmp_2743m, tmp_3658m, tmp2m, tmpsig995, tmp0_10cm, tmp10_40cm, tmp40_100cm, tmp100_200cm, tmp30_0mb, tmp2pv, tmpneg2pv, tmpmwl, tmptrop, tozneclm, ugwdsfc, uflxsfc, ugrdprs, ugrd_1829m, ugrd_2743m, ugrd_3658m, ugrd10m, ugrdsig995, ugrd30_0mb, ugrd2pv, ugrdneg2pv, ugrdmwl, ugrdtrop, ulwrfsfc, ulwrftoa, uswrfsfc, uswrftoa, vgwdsfc, vflxsfc, vgrdprs, vgrd_1829m, vgrd_2743m, vgrd_3658m, vgrd10m, vgrdsig995, vgrd30_0mb, vgrd2pv, vgrdneg2pv, vgrdmwl, vgrdtrop, vvelprs, vvelsig995, vwsh2pv, vwshneg2pv, vwshtrop, watrsfc, weasdsfc

Get the available forecast times, convert to datetime instances.

In [4]:
time = gfs_fcst.variables['time']
print time
from netCDF4 import num2date
valid_dates = num2date(time[:], time.units).tolist()
print [d.strftime('%Y%m%d%H') for d in valid_dates]
<type 'netCDF4.Variable'>
float64 time(time)
    grads_dim: t
    grads_mapping: linear
    grads_size: 65
    grads_min: 00z23jan2014
    grads_step: 6hr
    units: days since 1-1-1 00:00:0.0
    long_name: time
    minimum: 00z23jan2014
    maximum: 00z08feb2014
    resolution: 0.25
unlimited dimensions: 
current shape = (65,)

['2014012300', '2014012306', '2014012312', '2014012318', '2014012400', '2014012406', '2014012412', '2014012418', '2014012500', '2014012506', '2014012512', '2014012518', '2014012600', '2014012606', '2014012612', '2014012618', '2014012700', '2014012706', '2014012712', '2014012718', '2014012800', '2014012806', '2014012812', '2014012818', '2014012900', '2014012906', '2014012912', '2014012918', '2014013000', '2014013006', '2014013012', '2014013018', '2014013100', '2014013106', '2014013112', '2014013118', '2014020100', '2014020106', '2014020112', '2014020118', '2014020200', '2014020206', '2014020212', '2014020218', '2014020300', '2014020306', '2014020312', '2014020318', '2014020400', '2014020406', '2014020412', '2014020418', '2014020500', '2014020506', '2014020512', '2014020518', '2014020600', '2014020606', '2014020612', '2014020618', '2014020700', '2014020706', '2014020712', '2014020718', '2014020800']

Find the time index that is closest to the Super Bowl start time (00UTC 3 Feb) using the netcdf4-python num2date function. If you changed the URL above, you may have to change the date here so that it is in the forecast period.

In [5]:
from datetime import datetime
nt_superbowl = valid_dates.index(datetime(2014, 2, 3, 0, 0))
print valid_dates[nt_superbowl]
2014-02-03 00:00:00

Read the 2-meter temperature global ensemble forecast valid at the time, compute the ensemble mean. Also read the latitude and longitude values for the grid (needed for plotting).

In [6]:
t2mens = gfs_fcst.variables['tmp2m'][:,nt_superbowl,:,:]
t2m = t2mens.mean(axis=0)
print t2m.shape, t2m.min(), t2m.max()
lats = gfs_fcst.variables['lat'][:]; lons = gfs_fcst.variables['lon'][:]
lons, lats = np.meshgrid(lons, lats)  # make lats/lons into 2D arrays (needed for plotting)
(181, 360) 225.459 308.804

Plot the ensemble mean on a stereographic basemap centered on the NYC metro area, with a width and height of 6000 km.

In [10]:
fig = plt.figure(figsize=(8,8))
m = Basemap(projection='stere',lon_0=-74,lat_0=41,width=4.e6,height=4.e6)
x,y = m(lons, lats) # convert lats/lons to map projection coordinates
cs = m.contourf(x,y,t2m,np.linspace(230,300,41),  # color-filled contours
cb = m.colorbar(cs)  # draw colorbar
parallels = m.drawparallels(np.arange(10,90,10),labels=[1,0,0,0])  # draw parallels, label on left
meridians = m.drawmeridians(np.arange(-10,-100,-10),labels=[0,0,0,1]) # label meridians on bottom
t = plt.title('ens mean t2m fcst from %s for %s' % (valid_dates[0],valid_dates[nt_superbowl]),fontweight='bold')

Plot every ensemble member to get an idea of the forecast uncertainty using the matplotlib.pyplot subplot function.

In [11]:
fig = plt.figure(figsize=(16,14))
for nens in range(1,21):
    ax = plt.subplot(4,5,nens)
    cs = m.contourf(x,y,t2mens[nens],np.linspace(230,300,41),
    t = plt.title('ens member %s' % nens)

What's the probability it will be below freezing during the game? Try changing the threshold from 273K to see what happens...

In [13]:
freezeprob = 100.*((t2mens < 273).sum(axis=0))/t2mens.shape[0]
print freezeprob.min(), freezeprob.max(), freezeprob.shape
from mpl_toolkits.basemap import cm as basemapcm
cs = m.contourf(x,y,freezeprob,np.linspace(0,100.,41),cmap=basemapcm.GMT_haxby_r)
cb = m.colorbar()
xpt, ypt = m(-74,41)
nyc = m.plot([xpt],[ypt],'ro') # plot red dot near the Meadowlands.
t = plt.title('ens prob t2m < 273 (go Broncos!)')
0.0 100.0 (181, 360)