# Mapping Canadian Center for Climate Modeling and Analysis (CCCma) NetCDF files with Basemap¶

There are plenty of tutorials about Basemap using NetCDF (.nc) files, but CCCma files are in a special format. It took me a while to get everything working, so I thought I'd share my code in case anyone else is in the same situation.

Note that the only datasets I've tried this on are from the Fourth Generation Canadian Regional Climate Model (CanRCM4) > Second Generation Earth System Model (CanESM2) > RCP 8.5 category. I do not know if other models have their NetCDF files and data within organized the same way; tweaking may be necessary.

Note that although this data is from Environment Canada, climate models from around the world can be found.

The sample file used in this notebook is snc_NAM-44_CCCma-CanESM2_rcp85_r1i1p1_CCCma-CanRCM4_r2_day_20410101-20451231.nc

It contains percent snow cover for most of North America for every day from the years 2041 to 2045. Environment Canada has files projecting many different kind of atmospheric, land and oceanic data.

### 2. Choose what file and day you wish to use¶

This will be the only cell you need to enter values into. All the other cells can just be run, they will reference these values.

In [1]:
my_filename = ('snc_NAM-44_CCCma-CanESM2_rcp85_r1i1p1_'
'CCCma-CanRCM4_r2_day_20410101-20451231.nc')
# for this demo, we will see where it will be a white christmas in 2041.
my_year, my_month, my_day = (2041, 12, 25)

In [2]:
# calculations based on the above values:

# to increase compatibility with Python 3.x:
from __future__ import print_function
import re

# extracts variable name from beginning of filename:
my_variable = my_filename[:my_filename.find('_')]

print("my_variable = {}".format(my_variable))

first_year = int(re.search(('([12][90][0-9]{2})[01][0-9]'
'[0-3][0-9]\-[12][90][0-9]{2}[01][0-9][0-3][0-9]\.nc'),
my_filename).group(1))
print("first year in .nc file = {}".format(first_year))

month_lengths = (0,31,28,31,30,31,30,31,31,30,31,30)
month_cumulative = []
counter = 0
for length in month_lengths:
counter += length
month_cumulative.append(counter)
print("month_cumulative = {}".format(month_cumulative))

days_elapsed = (365 * (my_year - first_year) +
month_cumulative[my_month - 1] + my_day)
print(("{} days elapsed between Jan. 1, {} and specified date"
"of {}-{}-{}").format(days_elapsed, first_year,
my_year, my_month, my_day))
print("(leap days are not counted)")

my_variable = snc
first year in .nc file = 2041
month_cumulative = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
359 days elapsed between Jan. 1, 2041 and specified dateof 2041-12-25
(leap days are not counted)


### 3. Extract data from .nc file¶

In [3]:
import netCDF4
# netCDF4 is not part of standard Python,
import numpy as np

ncfile = netCDF4.Dataset(my_filename, 'r')

# Let's have a look at this file.
print("Attributes:")
print(ncfile.ncattrs())
print("\nVariables:")
print(ncfile.variables)

Attributes:
[u'title', u'institution', u'institute_id', u'contact', u'Conventions', u'experiment', u'experiment_id', u'driving_experiment', u'driving_model_id', u'driving_model_ensemble_member', u'driving_experiment_name', u'forcing', u'model_id', u'creation_date', u'rcm_version_id', u'project_id', u'CORDEX_domain', u'frequency', u'product', u'CCCma_runid', u'references', u'history', u'data_licence']

Variables:
OrderedDict([(u'time', <netCDF4.Variable object at 0x0000000003B986A8>), (u'rlon', <netCDF4.Variable object at 0x0000000003B98730>), (u'rlat', <netCDF4.Variable object at 0x0000000003B987B8>), (u'lon', <netCDF4.Variable object at 0x0000000003B98840>), (u'lat', <netCDF4.Variable object at 0x0000000003B988C8>), (u'rotated_pole', <netCDF4.Variable object at 0x0000000003B98950>), (u'snc', <netCDF4.Variable object at 0x0000000003B989D8>)])


We are interested in the following variables:

• lon: an array of longitudes for a 130 x 155 grid

• lat: an array of latitudes for a 130 x 155 grid

Note: the grid is made up of cells approximately 0.44 degrees on a side, but the grid is not exactly aligned north-south or east-west; that is why there are 130x155 each of longitude and latitude

snc: a 1825x130x155 grid of snow cover from 0 to 100 percent.

The first dimension is time, measured in days since December 1, 1949 NOT INCLUDING LEAP DAYS. In other words, there are exactly 36500 days between December 1, 1949 and December 1, 2049.

In the 'time' variable, which we do not need to use, the first value is 33246.5 days (half a day is added to make the time noon, which is unambiguously in the middle of the day instead of at the cusp between two days). We will verify this in the next cell.

In [4]:
nc_lons = np.array(ncfile.variables['lon'])
nc_lats = np.array(ncfile.variables['lat'])
nc_vars = np.array(ncfile.variables[my_variable])

print("Shape of nc_lons: {}".format(nc_lons.shape))
print("Shape of nc_lons: {}".format(nc_lats.shape))
print("Shape of nc_vars before specifying day: {}".\
format(nc_vars.shape))
nc_vars = nc_vars[days_elapsed]
print("Shape of nc_vars after specifying day: {}".\
format(nc_vars.shape))
print("Starting day of file: {}".format(\
ncfile.variables['time'][0]))

Shape of nc_lons: (130L, 155L)
Shape of nc_lons: (130L, 155L)
Shape of nc_vars before specifying day: (1825L, 130L, 155L)
Shape of nc_vars after specifying day: (130L, 155L)
Starting day of file: 33246.5

In [5]:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
%matplotlib inline

fig = plt.figure(figsize=(8,8))
m = Basemap(width=6000000,height=4000000,
resolution='l',projection='stere',\
lon_0=-93,lat_0=41, area_thresh=1000)
lakes=True)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
parallels = np.arange(0., 90, 10.)
m.drawparallels(parallels,labels=[1,0,0,0],fontsize=10)
meridians = np.arange(180.,360.,10.)
m.drawmeridians(meridians,labels=[0,0,0,1],fontsize=10)
x, y = m(nc_lons, nc_lats)
clevs = range(1,101)
cs = m.contourf(x,y,nc_vars,clevs,cmap=plt.cm.Greens_r)