Brian E. J. Rose, University at Albany
I use CESM in my own research. We are going to be using CESM in this course. Everyone should visit the website and learn about it.
see http://www.cesm.ucar.edu/models/cesm1.2/ for more info
The software is somewhat modular, so different submodels can be combined together depending on the nature of the scientific problem at hand and the available computer power.
Our experiments will use CESM in the so-called Slab Ocean Model mode, in which the ocean is represented by a static layer of water with some fixed heat capacity but no motion.
Recall that we saw this schematic of different ways to represent the ocean in climate models:
Using the slab ocean greatly reduced the time required for the model to reach equilibrium.
The net effect heat transport by ocean currents is prescribed through a so-called q-flux, which really just means we prescribe sources and sinks of heat at different locations.
For (lots of) details, see http://www2.cesm.ucar.edu/working-groups/pwg/documentation/cesm1-paleo-toolkit/ocean/som
The key is that we allow the sea surface temperature to change, but we fix (prescribe) the net effect of ocean currents on the transport of energy.
Why do this?
Why should we believe the results of the slab ocean model?
But experience with coupled models (meaning interactive ocean circulation) has shown that the circulation does not change radically under $2\times CO_2$.
So the slab ocean model gives us a decent first guess at climate sensitivity. And it makes it possible to do a lot of experimentation that we wouldn’t be able to do otherwise.
Model is given realistic atmospheric composition, realistic solar radiation, etc.
We perform a control run to get a baseline simulation, and take averages of several years (because the model has internal variability – every year is a little bit different)
We then change something, e.g. $2\times CO_2$!
And allow the model to adjust to a new equilibrium, just as we did with the toy energy balance model.
Once it has gotten close to its new equilibrium, we run it for several more years again to get the new climatology.
Then we can look at the differences in the climatologies before and after the perturbation.
First, let's take a look at some of the ingredients that go into the control run. All of the necessary data will be served up by a special data server sitting in the department, so you should be able to run this code to interact with the data on any computer that is connected to the internet.
You can browse the available data through a web interface here:
http://ramadda.atmos.albany.edu:8080/repository/entry/show/Top/Users/BrianRose/CESM_runs
Within this folder called CESM runs
, you will find another folder called som_input
which contains all the input files.
The data are all stored in NetCDF
files, a standard file format for self-describing gridded data.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
We are going to use a package called xarray (abbreviated here as xr
) to work with the datasets.
Here we are going to load the input topography file and take a look at what's inside.
In this case we are passing it a URL to our online dataserver. We'll put the URL in a string variable called datapath
to simplify things later on.
datapath = "http://ramadda.atmos.albany.edu:8080/repository/opendap/latest/Top/Users/BrianRose/CESM_runs/"
endstr = "/entry.das"
# Notice that in Python we can easily concatenate strings together just by `adding` them
fullURL = datapath + 'som_input/USGS-gtopo30_1.9x2.5_remap_c050602.nc' + endstr
print( fullURL)
http://ramadda.atmos.albany.edu:8080/repository/opendap/latest/Top/Users/BrianRose/CESM_runs/som_input/USGS-gtopo30_1.9x2.5_remap_c050602.nc/entry.das
# Now we actually open the dataset
topo = xr.open_dataset( fullURL )
print(topo)
<xarray.Dataset> Dimensions: (lat: 96, lon: 144) Coordinates: * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 ... Data variables: PHIS (lat, lon) float64 ... SGH (lat, lon) float64 ... SGH30 (lat, lon) float64 ... LANDFRAC (lat, lon) float64 ... LANDM_COSLAT (lat, lon) float64 ... Attributes: history: Written on date: 20050602\ndefinesurf -remap -t /fs/cgd/csm/i... make_ross: true topofile: /fs/cgd/csm/inputdata/atm/cam/topo/USGS-gtopo30_10min_c050419.nc gridfile: /fs/cgd/csm/inputdata/atm/cam/coords/fv_1.9x2.5.nc landmask: /fs/cgd/csm/inputdata/atm/cam2/hrtopo/landm_coslat.nc
The Dataset
object has several important attributes. Much of this should look familiar if you have worked with netCDF
data before. The xarray
package gives a very powerful and easy to use interface to the data.
We can access individual variables within the xarray.Dataset
object as follows:
topo.PHIS
<xarray.DataArray 'PHIS' (lat: 96, lon: 144)> array([[ 2.796465e+04, 2.796465e+04, 2.796465e+04, ..., 2.796465e+04, 2.796465e+04, 2.796465e+04], [ 2.718626e+04, 2.724515e+04, 2.730367e+04, ..., 2.699761e+04, 2.706319e+04, 2.712589e+04], [ 2.562017e+04, 2.579349e+04, 2.595301e+04, ..., 2.497548e+04, 2.521368e+04, 2.542803e+04], ..., [ 1.720216e-02, 7.134989e-03, 2.748055e-03, ..., 1.569097e-01, 8.055879e-02, 3.857289e-02], [ 5.444020e-02, 3.951677e-02, 2.799566e-02, ..., 1.231596e-01, 9.609957e-02, 7.320690e-02], [ 0.000000e+00, 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00, 0.000000e+00]]) Coordinates: * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: long_name: surface geopotential units: m2/s2 from_hires: true filter: remap
We will now read the geopotential and make a plot of the topography of the Earth's surface as represented on the 2º grid. The code below makes a colorful plot of the topography. We also use the land-sea mask in order to plot nothing at grid points that are entirely ocean-covered.
Execute this code exactly as written first, and then play around with it to see how you might customize the graph.
Note that the function pcolormesh
does most of the work here. It's a function that makes color 2D plots of an array.
g = 9.8 # gravity in m/s2
meters_per_kilometer = 1E3
height = topo.PHIS / g / meters_per_kilometer # in kilometers
# Note that we have just created a new xarray.DataArray object that preserves the axis labels
# Let's go ahead and give it some useful metadata:
height.attrs['units'] = 'km'
height.name = 'height'
height
<xarray.DataArray 'height' (lat: 96, lon: 144)> array([[ 2.853536e+00, 2.853536e+00, 2.853536e+00, ..., 2.853536e+00, 2.853536e+00, 2.853536e+00], [ 2.774108e+00, 2.780118e+00, 2.786089e+00, ..., 2.754858e+00, 2.761550e+00, 2.767948e+00], [ 2.614303e+00, 2.631989e+00, 2.648266e+00, ..., 2.548518e+00, 2.572824e+00, 2.594697e+00], ..., [ 1.755323e-06, 7.280601e-07, 2.804138e-07, ..., 1.601120e-05, 8.220284e-06, 3.936009e-06], [ 5.555123e-06, 4.032323e-06, 2.856700e-06, ..., 1.256730e-05, 9.806078e-06, 7.470092e-06], [ 0.000000e+00, 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00, 0.000000e+00]]) Coordinates: * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: units: km
Let's make a plot! xarray
is able to automatically generate labeled plots. This is very handy for "quick and dirty" investigation of the data:
height.plot()
<matplotlib.collections.QuadMesh at 0x1136ff6d8>
If we want more control over the appearance of the plot, we can use features of matplotlib
# A filled contour plot of topography with contours every 500 m
lev = np.arange(0., 6., 0.5)
fig1, ax1 = plt.subplots(figsize=(8,4))
# Here we are masking the data to exclude points where the land fraction is zero (water only)
cax1 = ax1.contourf( height.lon, height.lat,
height.where(topo.LANDFRAC>0), levels=lev)
ax1.set_title('Topography (km) and land-sea mask in CESM')
ax1.set_xlabel('Longitude')
ax1.set_ylabel('Latitude')
cbar1 = fig1.colorbar(cax1)
Note that at 2º resolution we can see many smaller features (e.g. Pacific islands). The model is given a fractional land cover for each grid point.
Here let's plot the land-sea mask itself so we can see where there is at least "some" water:
fig2, ax2 = plt.subplots()
cax2 = ax2.pcolormesh( topo.lon, topo.lat, topo.LANDFRAC )
ax2.set_title('Ocean mask in CESM')
ax2.set_xlabel('Longitude'); ax2.set_ylabel('Latitude')
cbar2 = fig2.colorbar(cax2);
Notice that to make these plots we've just plotted the lat-lon array without using any map projection.
There are nice tools available to make better maps. We'll leave that as a topic for another day. But if you're keen to read ahead, check out:
Another important input file contains information about the slab ocean. You can see this file in the data catalog here:
Let's load it and take a look.
som_input = xr.open_dataset( datapath + 'som_input/pop_frc.1x1d.090130.nc' + endstr, decode_times=False )
print(som_input)
<xarray.Dataset> Dimensions: (lat: 180, lon: 360, time: 12) Coordinates: * time (time) float32 14.0 46.0 74.0 105.0 135.0 166.0 196.0 227.0 ... Dimensions without coordinates: lat, lon Data variables: area (lat, lon) float64 ... mask (lat, lon) int32 ... yc (lat) float32 ... xc (lon) float32 ... S (time, lat, lon) float64 ... T (time, lat, lon) float64 ... U (time, lat, lon) float64 ... V (time, lat, lon) float64 ... dhdx (time, lat, lon) float64 ... dhdy (time, lat, lon) float64 ... hblt (time, lat, lon) float64 ... qdp (time, lat, lon) float64 ... Attributes: creation_date: Fri Jan 30 10:22:53 MST 2009 comment: This data is on a standard 1x1d grid. calendar: standard author: D. Bailey note3: qdp is computed from depth summed ocean column note2: all fields interpolated to T-grid note1: fields computed from 20-yr monthly means from pop description: Input data for DOCN7 mixed layer model from b40.999 source: pop_frc.ncl conventions: CCSM data model domain description title: Monthly averaged ocean forcing from POP output
The ocean / sea ice models exist on different grids than the atmosphere (1º instead of 2º resolution).
Now we are going to look at the annual mean heat flux out of the ocean, which is the prescribed 'q-flux' that we give to the slab ocean model.
It is stored in the field qdp
in the input file.
The sign convention in CESM is that qdp > 0
where heat is going IN to the ocean. We will change the sign to plot heat going OUT of the ocean INTO the atmosphere (a more atmosphere-centric viewpoint).
som_input.qdp
<xarray.DataArray 'qdp' (time: 12, lat: 180, lon: 360)> [777600 values with dtype=float64] Coordinates: * time (time) float32 14.0 46.0 74.0 105.0 135.0 166.0 196.0 227.0 ... Dimensions without coordinates: lat, lon Attributes: spatial_op: Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...
Unfortunately, here is a case in which the metadata are not very useful. There is no text description of what variable qdp
actually is, or what its units are. (It is actually in units of W/m2)
We can see that there are 12 x 180 x 360 data points. One 180 x 360 grid for each calendar month!
Now we are going to take the average over the year at each point.
We will use the power of xarray
here to take the average over the time dimension, leaving us with a single grid on 180 latitude points by 360 longitude points:
(-som_input.qdp.mean(dim='time')).plot()
<matplotlib.collections.QuadMesh at 0x113c74438>
Now make a nice plot of the annual mean q-flux.
# We can always set a non-standard size for our figure window
fig3, ax3 = plt.subplots(figsize=(10, 6))
lev = np.arange(-700., 750., 50.)
cax3 = ax3.contourf(som_input.xc, som_input.yc,
-som_input.qdp.mean(dim='time'),
levels=lev, cmap=plt.cm.bwr)
cbar3 = fig3.colorbar(cax3)
ax3.set_title( 'CESM: Prescribed heat flux out of ocean (W m$^{-2}$), annual mean',
fontsize=14 )
ax3.set_xlabel('Longitude', fontsize=14)
ax3.set_ylabel('Latitude', fontsize=14)
ax3.text(65, 50, 'Annual', fontsize=16 )
ax3.contour(topo.lon, topo.lat, topo.LANDFRAC, levels=[0.5], colors='k');
Notice all the spatial structure here:
All this structure is determined by ocean circulation, which we are not modeling here. Instead, we are prescribing these heat flux patterns as an input to the atmosphere.
This pattern changes throughout the year. Recall that we just averaged over all months to make this plot. We might want to look at just one month:
# select by month index (0 through 11)
som_input.qdp.isel(time=0)
<xarray.DataArray 'qdp' (lat: 180, lon: 360)> array([[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [-5.75126 , -5.742314, -5.734764, ..., -5.792866, -5.774798, -5.761612], [-5.533446, -5.537023, -5.540624, ..., -5.522859, -5.526363, -5.529893], [-5.31631 , -5.314115, -5.311888, ..., -5.322703, -5.320605, -5.318473]]) Coordinates: time float32 14.0 Dimensions without coordinates: lat, lon Attributes: spatial_op: Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...
# select by array slicing (but for this you have to know the axis order!)
som_input.qdp[0,:,:]
<xarray.DataArray 'qdp' (lat: 180, lon: 360)> array([[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [-5.75126 , -5.742314, -5.734764, ..., -5.792866, -5.774798, -5.761612], [-5.533446, -5.537023, -5.540624, ..., -5.522859, -5.526363, -5.529893], [-5.31631 , -5.314115, -5.311888, ..., -5.322703, -5.320605, -5.318473]]) Coordinates: time float32 14.0 Dimensions without coordinates: lat, lon Attributes: spatial_op: Bilinear remapping: 1st order: destarea: NCL: ./map_gx1v5_to...
Here we got just the first month (January) by specifying [0,:,:]
after the variable name. This is called slicing or indexing an array. We are saying "give me everything for month number 0". Now make the plot:
fig4, ax4 = plt.subplots(figsize=(10,4))
cax4 = ax4.contourf( som_input.xc, som_input.yc,
-som_input.qdp.isel(time=0),
levels=lev, cmap=plt.cm.bwr)
cbar4 = plt.colorbar(cax4)
ax4.set_title( 'CESM: Prescribed heat flux out of ocean (W m$^{-2}$)',
fontsize=14 )
ax3.set_xlabel('Longitude', fontsize=14)
ax3.set_ylabel('Latitude', fontsize=14)
ax4.text(65, 50, 'January', fontsize=12 );
ax4.contour(topo.lon, topo.lat, topo.LANDFRAC, levels=[0.5], colors='k');
IPython
provides some really neat and easy-to-use tools to set up interactive graphics in your notebook.
Here we're going to create a figure with a slider that lets of step through each month of the q-flux data.
# A list of text labels for each month
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug',
'Sep', 'Oct', 'Nov', 'Dec']
# an example of slicing this list:
months[-2:]
['Nov', 'Dec']
# A function that takes a month index (0 - 11) and creates a plot just like above
def sh(month):
fig, ax = plt.subplots(figsize=(10,4))
cax = ax.contourf( som_input.xc, som_input.yc,
-som_input.qdp.isel(time=month),
levels=lev, cmap=plt.cm.bwr)
cbar = plt.colorbar(cax)
ax.set_title( 'CESM: Prescribed heat flux out of ocean (W m$^{-2}$)',
fontsize=14 )
ax.set_xlabel('Longitude', fontsize=14)
ax.set_ylabel('Latitude', fontsize=14)
ax.text(65, 50, months[month], fontsize=12 );
ax.contour(topo.lon, topo.lat, topo.LANDFRAC, levels=[0.5], colors='k');
# Calling this function with a single month index gives us a single plot:
sh(6)
When you execute the next cell, you should get a figure with a slider above it. Go ahead and play with it.
from ipywidgets import interact
interact(sh, month=(0,11,1));
A Jupyter Widget
Our control run is set up to simulate the climate of the "pre-industrial era", meaning before significant human-induced changes to the composition of the atmosphere, nominally the year 1850.
Output from the control run is available on the same data server as above. Look in the folder called som_1850_f19
(Here som
stands for "slab ocean model", 1850 indicated pre-industrial conditions, and f19
is a code for the horizontal grid resolution).
There are climatology files for each active model component:
I created these files by averaging over the last 10 years of the simulation. Let's take a look at the atmosphere file. The file is called
som_1850_f19.cam.h0.clim.nc
(the file extension .nc
is used to indicate NetCDF format).
atm_control = xr.open_dataset( datapath + 'som_1850_f19/som_1850_f19.cam.h0.clim.nc' + endstr, decode_times=False)
print(atm_control)
<xarray.Dataset> Dimensions: (ilev: 27, lat: 96, lev: 26, lon: 144, nbnd: 2, slat: 95, slon: 144, time: 12) Coordinates: * lev (lev) float64 3.545 7.389 13.97 23.94 37.23 53.11 70.06 ... * ilev (ilev) float64 2.194 4.895 9.882 18.05 29.84 44.62 61.61 ... * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 ... * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 ... * slat (slat) float64 -89.05 -87.16 -85.26 -83.37 -81.47 -79.58 ... * slon (slon) float64 -1.25 1.25 3.75 6.25 8.75 11.25 13.75 16.25 ... Dimensions without coordinates: nbnd Data variables: hyam (lev) float64 ... hybm (lev) float64 ... hyai (ilev) float64 ... hybi (ilev) float64 ... P0 float64 ... date (time) int32 ... datesec (time) int32 ... w_stag (slat) float64 ... time_bnds (time, nbnd) float64 ... date_written (time) |S64 ... time_written (time) |S64 ... ntrm int32 ... ntrn int32 ... ntrk int32 ... ndbase int32 ... nsbase int32 ... nbdate int32 ... nbsec int32 ... mdt int32 ... nlon (lat) int32 ... wnummax (lat) int32 ... gw (lat) float64 ... ndcur (time) int32 ... nscur (time) int32 ... co2vmr (time) float64 ... ch4vmr (time) float64 ... n2ovmr (time) float64 ... f11vmr (time) float64 ... f12vmr (time) float64 ... sol_tsi (time) float64 ... nsteph (time) int32 ... AEROD_v (time, lat, lon) float64 ... CLDHGH (time, lat, lon) float32 ... CLDICE (time, lev, lat, lon) float32 ... CLDLIQ (time, lev, lat, lon) float32 ... CLDLOW (time, lat, lon) float32 ... CLDMED (time, lat, lon) float32 ... CLDTOT (time, lat, lon) float32 ... CLOUD (time, lev, lat, lon) float32 ... CONCLD (time, lev, lat, lon) float32 ... DCQ (time, lev, lat, lon) float32 ... DTCOND (time, lev, lat, lon) float32 ... DTV (time, lev, lat, lon) float32 ... EMIS (time, lev, lat, lon) float32 ... FICE (time, lev, lat, lon) float32 ... FLDS (time, lat, lon) float32 ... FLDSC (time, lat, lon) float32 ... FLNS (time, lat, lon) float32 ... FLNSC (time, lat, lon) float32 ... FLNT (time, lat, lon) float32 ... FLNTC (time, lat, lon) float32 ... FLUT (time, lat, lon) float32 ... FLUTC (time, lat, lon) float32 ... FSDS (time, lat, lon) float32 ... FSDSC (time, lat, lon) float32 ... FSDTOA (time, lat, lon) float32 ... FSNS (time, lat, lon) float32 ... FSNSC (time, lat, lon) float32 ... FSNT (time, lat, lon) float32 ... FSNTC (time, lat, lon) float32 ... FSNTOA (time, lat, lon) float32 ... FSNTOAC (time, lat, lon) float32 ... FSUTOA (time, lat, lon) float32 ... ICEFRAC (time, lat, lon) float32 ... ICIMR (time, lev, lat, lon) float32 ... ICWMR (time, lev, lat, lon) float32 ... LANDFRAC (time, lat, lon) float32 ... LHFLX (time, lat, lon) float32 ... LWCF (time, lat, lon) float32 ... MSKtem (time, lat, lon) float32 ... OCNFRAC (time, lat, lon) float32 ... OMEGA (time, lev, lat, lon) float32 ... OMEGAT (time, lev, lat, lon) float32 ... PBLH (time, lat, lon) float32 ... PHIS (time, lat, lon) float32 ... PRECC (time, lat, lon) float32 ... PRECL (time, lat, lon) float32 ... PRECSC (time, lat, lon) float32 ... PRECSL (time, lat, lon) float32 ... PS (time, lat, lon) float32 ... PSL (time, lat, lon) float32 ... Q (time, lev, lat, lon) float32 ... QFLX (time, lat, lon) float32 ... QREFHT (time, lat, lon) float32 ... QRL (time, lev, lat, lon) float32 ... QRS (time, lev, lat, lon) float32 ... RELHUM (time, lev, lat, lon) float32 ... SFCLDICE (time, lat, lon) float32 ... SFCLDLIQ (time, lat, lon) float32 ... SHFLX (time, lat, lon) float32 ... SNOWHICE (time, lat, lon) float32 ... SNOWHLND (time, lat, lon) float32 ... SOLIN (time, lat, lon) float32 ... SWCF (time, lat, lon) float32 ... T (time, lev, lat, lon) float32 ... TAUX (time, lat, lon) float32 ... TAUY (time, lat, lon) float32 ... TGCLDCWP (time, lat, lon) float32 ... TGCLDIWP (time, lat, lon) float32 ... TGCLDLWP (time, lat, lon) float32 ... TH (time, ilev, lat, lon) float32 ... TH2d (time, lat, lon) float32 ... TMQ (time, lat, lon) float32 ... TREFHT (time, lat, lon) float32 ... TS (time, lat, lon) float32 ... TSMN (time, lat, lon) float32 ... TSMX (time, lat, lon) float32 ... U (time, lev, lat, lon) float32 ... U10 (time, lat, lon) float32 ... U2d (time, lat, lon) float32 ... UTGWORO (time, lev, lat, lon) float32 ... UU (time, lev, lat, lon) float32 ... UV2d (time, lat, lon) float32 ... UV3d (time, ilev, lat, lon) float32 ... UW2d (time, lat, lon) float32 ... UW3d (time, ilev, lat, lon) float32 ... V (time, lev, lat, lon) float32 ... V2d (time, lat, lon) float32 ... VD01 (time, lev, lat, lon) float32 ... VQ (time, lev, lat, lon) float32 ... VT (time, lev, lat, lon) float32 ... VTH2d (time, lat, lon) float32 ... VTH3d (time, ilev, lat, lon) float32 ... VU (time, lev, lat, lon) float32 ... VV (time, lev, lat, lon) float32 ... W2d (time, lat, lon) float32 ... WTH3d (time, ilev, lat, lon) float32 ... Z3 (time, lev, lat, lon) float32 ... Attributes: Conventions: CF-1.0 source: CAM case: som_1850_f19 title: UNSET logname: br546577 host: snow-08.rit.alba Version: $Name$ revision_Id: $Id$ initial_file: /data/rose_scr/cesm_inputdata/atm/cam/in... topography_file: /data/rose_scr/cesm_inputdata/atm/cam/to... history: Tue May 27 11:27:43 2014: ncrcat /networ... nco_openmp_thread_number: 1 DODS.strlen: 8 DODS.dimName: chars DODS_EXTRA.Unlimited_Dimension: time
Here's the vertical coordinate -- pressure levels in units of hPa:
atm_control.lev
<xarray.DataArray 'lev' (lev: 26)> array([ 3.544638, 7.388814, 13.967214, 23.944625, 37.23029 , 53.114605, 70.05915 , 85.439115, 100.514695, 118.250335, 139.115395, 163.66207 , 192.539935, 226.513265, 266.481155, 313.501265, 368.81798 , 433.895225, 510.455255, 600.5242 , 696.79629 , 787.70206 , 867.16076 , 929.648875, 970.55483 , 992.5561 ]) Coordinates: * lev (lev) float64 3.545 7.389 13.97 23.94 37.23 53.11 70.06 85.44 ... Attributes: long_name: hybrid level at midpoints (1000*(A+B)) units: level positive: down standard_name: atmosphere_hybrid_sigma_pressure_coordinate formula_terms: a: hyam b: hybm p0: P0 ps: PS
Lots of different stuff! These are all the different quantities that are calculated as part of the model simulation. Every quantity represents a long-term average for a particular month.
Want to get more information about a particular variable?
atm_control.co2vmr
<xarray.DataArray 'co2vmr' (time: 12)> array([ 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285, 0.000285]) Coordinates: * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ... Attributes: long_name: co2 volume mixing ratio cell_methods: time: mean
This is the (prescribed) amount of CO2 in the atmosphere (about 285 parts per million by volume).
One nice thing about xarray.DataArray
objects is that we can do simple arithmetic with them (already seen several examples of this in the notes above). For example, change the units of CO2 amount to ppm:
atm_control.co2vmr * 1E6
<xarray.DataArray 'co2vmr' (time: 12)> array([ 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7, 284.7]) Coordinates: * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ...
Here's another variable:
atm_control.SOLIN
<xarray.DataArray 'SOLIN' (time: 12, lat: 96, lon: 144)> [165888 values with dtype=float32] Coordinates: * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ... * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: Sampling_Sequence: rad_lwsw units: W/m2 long_name: Solar insolation cell_methods: time: mean time: mean
Apparently this is the incoming solar radiation or insolation, with shape (12,96,144) meaning it's got 12 months, 96 latitude points and 144 longitude points.
Make two well-labeled plots of the insolation:
First, note that we can't make a map of a three-dimensional dataset! We need to either select a single month, or average over months, to reduce the dimensionality of the data down to 2 (lat/lon).
# First, the explicit way. We loop over every month and add them up
num_months = len(atm_control.SOLIN.time)
average = 0. * atm_control.SOLIN[0,:,:]
for n in range(0,num_months):
average += atm_control.SOLIN[n,:,:]
average /= num_months
print(average)
<xarray.DataArray 'SOLIN' (lat: 96, lon: 144)> array([[ 172.817383, 172.817383, 172.817383, ..., 172.817383, 172.817383, 172.817383], [ 173.020493, 173.02037 , 173.020142, ..., 173.019943, 173.02002 , 173.020279], [ 173.62915 , 173.62886 , 173.628769, ..., 173.629318, 173.628906, 173.629028], ..., [ 172.122559, 172.123032, 172.123047, ..., 172.122635, 172.12294 , 172.122757], [ 171.51123 , 171.511337, 171.51149 , ..., 171.511581, 171.511597, 171.511536], [ 171.308182, 171.308182, 171.308182, ..., 171.308182, 171.308182, 171.308182]], dtype=float32) Coordinates: time float64 14.0 * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ...
# Second, the automatic way using xarray's .mean() method
average2 = atm_control.SOLIN.mean(dim='time')
print(average2)
<xarray.DataArray 'SOLIN' (lat: 96, lon: 144)> array([[ 172.817383, 172.817383, 172.817383, ..., 172.817383, 172.817383, 172.817383], [ 173.020493, 173.02037 , 173.020142, ..., 173.019943, 173.02002 , 173.020279], [ 173.62915 , 173.62886 , 173.628769, ..., 173.629318, 173.628906, 173.629028], ..., [ 172.122559, 172.123032, 172.123047, ..., 172.122635, 172.12294 , 172.122757], [ 171.51123 , 171.511337, 171.51149 , ..., 171.511581, 171.511597, 171.511536], [ 171.308182, 171.308182, 171.308182, ..., 171.308182, 171.308182, 171.308182]], dtype=float32) Coordinates: * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ...
# Are they the same?
average == average2
<xarray.DataArray 'SOLIN' (lat: 96, lon: 144)> array([[ True, True, True, ..., True, True, True], [ True, True, True, ..., True, True, True], [ True, True, True, ..., True, True, True], ..., [ True, True, True, ..., True, True, True], [ True, True, True, ..., True, True, True], [ True, True, True, ..., True, True, True]], dtype=bool) Coordinates: time float64 14.0 * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ...
# Looks like it. But this will test if every single element is the same, using numpy.all():
np.all(average==average2)
True
# For the insolation, a "quick and dirty" plot will suffice!
ax = average.plot()
This is the annual average map of insolation, or incoming solar radiation.
It is kind of a boring map because insolation is the same at every longitude point (once you average over a single day).
But this figure shows that the earth receives greater than 400 W/m2 sunlight at the equator, and less than 200 W/m2 annually at the poles.
# June is the 6th month, which makes it index number 5 here
insolation_june = atm_control.SOLIN.isel(time=5)
# And december is index 11
insolation_dec = atm_control.SOLIN.isel(time=11)
(insolation_june - insolation_dec).plot()
<matplotlib.collections.QuadMesh at 0x31d36d668>
Now we are looking at the difference between the two solstice seasons. These differences are obviously largest at the poles. We'll look at the distribution of insolation in more detail later in the course.
Recall that our investigations so far have been guided by this figure of the observed annual, global mean energy budget:
In order to compare these numbers with the control run, we need to take global averages of the data.
A global average must be weighted by the area of each grid cell. We cannot simply average over each data point on a latitude-longitude grid.
WHY?
The global average needs to weighted by the cosine of latitude (do you understand why?)
We can implement this in xarray
as follows:
# functions available in xarray for standard mathematical operations
# (but still preserving DataArray attributes and axes)
from xarray.ufuncs import cos, deg2rad
# Take the cosine of latitude (first converting to radians)
coslat = cos(deg2rad(atm_control.lat))
# And divide by its mean value
weight_factor = coslat / coslat.mean(dim='lat')
# Want to see what we just created?
weight_factor
<xarray.DataArray 'lat' (lat: 96)> array([ 9.720485e-17, 5.248730e-02, 1.049172e-01, 1.572324e-01, 2.093756e-01, 2.612899e-01, 3.129185e-01, 3.642049e-01, 4.150931e-01, 4.655273e-01, 5.154525e-01, 5.648141e-01, 6.135580e-01, 6.616311e-01, 7.089806e-01, 7.555549e-01, 8.013030e-01, 8.461749e-01, 8.901215e-01, 9.330948e-01, 9.750478e-01, 1.015935e+00, 1.055710e+00, 1.094332e+00, 1.131757e+00, 1.167944e+00, 1.202854e+00, 1.236449e+00, 1.268692e+00, 1.299547e+00, 1.328981e+00, 1.356963e+00, 1.383460e+00, 1.408445e+00, 1.431889e+00, 1.453768e+00, 1.474057e+00, 1.492734e+00, 1.509779e+00, 1.525173e+00, 1.538899e+00, 1.550943e+00, 1.561290e+00, 1.569931e+00, 1.576854e+00, 1.582054e+00, 1.585523e+00, 1.587259e+00, 1.587259e+00, 1.585523e+00, 1.582054e+00, 1.576854e+00, 1.569931e+00, 1.561290e+00, 1.550943e+00, 1.538899e+00, 1.525173e+00, 1.509779e+00, 1.492734e+00, 1.474057e+00, 1.453768e+00, 1.431889e+00, 1.408445e+00, 1.383460e+00, 1.356963e+00, 1.328981e+00, 1.299547e+00, 1.268692e+00, 1.236449e+00, 1.202854e+00, 1.167944e+00, 1.131757e+00, 1.094332e+00, 1.055710e+00, 1.015935e+00, 9.750478e-01, 9.330948e-01, 8.901215e-01, 8.461749e-01, 8.013030e-01, 7.555549e-01, 7.089806e-01, 6.616311e-01, 6.135580e-01, 5.648141e-01, 5.154525e-01, 4.655273e-01, 4.150931e-01, 3.642049e-01, 3.129185e-01, 2.612899e-01, 2.093756e-01, 1.572324e-01, 1.049172e-01, 5.248730e-02, 9.720485e-17]) Coordinates: * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ...
You will find that many gridded datasets already provide a field that gives accurate area weighting.
In the case of the CESM output, the field is called gw
weight_factor2 = atm_control.gw / atm_control.gw.mean(dim='lat')
# Compare our two weights
# The field FLNT is actually the net longwave radiation flux at the top of the atmosphere --
# What we have called the OLR in this course
print( (atm_control.FLNT * weight_factor).mean(dim=('time', 'lon', 'lat')))
print( (atm_control.FLNT * weight_factor2).mean(dim=('time', 'lon', 'lat')))
# Notice how we can average simultaneously over multiple dimensions here!
<xarray.DataArray ()> array(234.14572580253756) <xarray.DataArray ()> array(234.13552628165561)
These numbers should be very close to each other.
Make sure you can take a global average, testing on surface temperature in the control run.
Surface temperature is called 'TS'
in the control run data file.
Calculate annual, global average 'TS'
Verify that you get something close to 288 K
If you don't, try to find and fix the errors. ____________________________
First, calculate the global average directly as above:
# Take the area-weighted global average
(atm_control.TS * weight_factor2).mean(dim=('time', 'lon', 'lat'))
<xarray.DataArray ()> array(287.6232949136381)
Since we want to repeat this operation, why not wrap it in a reusable function!
def global_average(field, weight=weight_factor2):
return (field * weight).mean(dim=('time', 'lon', 'lat'))
Look carefully at the function definition and make sure you understand it.
# Of course this produces the same result as above
global_average(atm_control.TS)
<xarray.DataArray ()> array(287.6232949136381)
# If we want to display the result as a plain floating-point number:
float(global_average(atm_control.TS))
287.6232949136381
Now that you have a working function to take global averages, we can compare some energy budget values against observations.
The model output contains lots of diagnostics about the radiative fluxes. Here some CESM naming conventions to help you find the appropriate output fields:
'F'
are an energy flux of some kind.'FLNT'
'FL'
means longwave flux (i.e. terrestrial)'FS'
means shortwave flux (i.e. solar)'U'
= up'D'
= down'N'
= net'T'
= top of atmosphere'S'
= surface'FLNT'
means 'net longwave flux at the top of atmosphere', i.e. the outgoing longwave radiation.You wil see that these are all 12 x 96 x 144 -- i.e. a two-dimensional grid for every calendar month.
atm_control.FLNT
<xarray.DataArray 'FLNT' (time: 12, lat: 96, lon: 144)> [165888 values with dtype=float32] Coordinates: * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ... * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: Sampling_Sequence: rad_lwsw units: W/m2 long_name: Net longwave flux at top of model cell_methods: time: mean time: mean
Compute annual, global averages of the following quantities.
Compare your results briefly to the observations. ____________________________
# Insolation
Q = global_average(atm_control.SOLIN)
print(Q)
<xarray.DataArray ()> array(340.3058626212777)
# Absorbed shortwave radiation
ASR = global_average(atm_control.FSNT)
print(ASR)
<xarray.DataArray ()> array(234.03991728200748)
# Outgoing longwave radiation
OLR = global_average(atm_control.FLNT)
print(OLR)
<xarray.DataArray ()> array(234.13552628165561)
Comparing to the observations, it looks like both the OLR and ASR are a little lower than observed.
But, what about the net imbalance at the top of the atmosphere?
print(ASR - OLR)
<xarray.DataArray ()> array(-0.0956089996481353)
The net imbalance is very small (less than 0.1 W/m2, or about 10x smaller than the observed imbalance).
This means that the model is very close to global energy balance so we should not expect the climate to change if we keep running this simulation forward.
The negative sign here means that the model is actually losing energy (OLR is greater than ASR) -- but again, it is small.
Feel free to keep exploring the data!
Many other fields are four-dimensional (time, level, latitude, longitude).
For example, here is the shape of the array that hold the air temperature at every point and every month:
atm_control['T'].shape
(12, 26, 96, 144)
xarray
¶Normally we can access a variable with the notation Dataset.variable_name
But in xarray
(and also in numpy
and other packages), the notation object.T
actually represents the transpose operator:
print( atm_control.FLNT.shape)
print( atm_control.FLNT.T.shape)
(12, 96, 144) (144, 96, 12)
Here there is a name conflict because the air temperature variable is called T
. One solution is to access it through the dictionary method, as I did above:
atm_control['T']
<xarray.DataArray 'T' (time: 12, lev: 26, lat: 96, lon: 144)> [4313088 values with dtype=float32] Coordinates: * lev (lev) float64 3.545 7.389 13.97 23.94 37.23 53.11 70.06 85.44 ... * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ... * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: mdims: 1 units: K long_name: Temperature cell_methods: time: mean time: mean
Another solution is just to rename the variable (here going from T
to Ta
):
atm_control.rename({'T': 'Ta'}, inplace=True)
atm_control.Ta
<xarray.DataArray 'Ta' (time: 12, lev: 26, lat: 96, lon: 144)> [4313088 values with dtype=float32] Coordinates: * lev (lev) float64 3.545 7.389 13.97 23.94 37.23 53.11 70.06 85.44 ... * time (time) float64 14.0 45.0 73.0 104.0 134.0 165.0 195.0 226.0 ... * lat (lat) float64 -90.0 -88.11 -86.21 -84.32 -82.42 -80.53 -78.63 ... * lon (lon) float64 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 22.5 ... Attributes: mdims: 1 units: K long_name: Temperature cell_methods: time: mean time: mean
And here is some code to plot the average sounding (temperature as function of pressure) at a particular point in the month of January.
plt.plot( atm_control.Ta[0,:,70,115], atm_control.lev )
plt.gca().invert_yaxis()
plt.ylabel('Pressure (hPa)')
plt.xlabel('Temperature (K)')
<matplotlib.text.Text at 0x31ca83898>
What was the location we just used for that plot? Let's check by indexing the latitude and longitude arrays:
print( atm_control.lat[70].values)
print( atm_control.lon[115].values)
42.63157894736841 287.5
These are actually the coordinates of the Albany area (read longitude in degrees east).
Thanks for playing!