The main goal of this workflow is to calculate the mean change in ocean heat uptake (OHU) associated with the transient climate response (TCR) for CMIP6. TCR is defined as the change in global mean surface temperature at the time of CO$_2$ doubling in a climate model run with a 1% increase in CO$_2$ per year. The amount and pattern of heat uptake into the oceans are important in determining the strength of radiative feedbacks and thus climate sensitivity. See Xie (2020) for an overview.
In order to use as many models as possible, we will need to load the model output in its native grid, then regrid to a common grid (here 1°x1° lat-lon) using xESMF. From there, we can take the average across models and either plot the result or save it as a netCDF file for later use.
Concepts | Importance | Notes |
---|---|---|
Intro to Xarray | Necessary | |
Computations and Masks with Xarray | Necessary | |
Load CMIP6 Data with Intake-ESM | Necessary | |
Intro to Cartopy | Helpful | |
Understanding of NetCDF | Helpful | |
Familiarity with CMIP6 | Helpful |
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import numpy as np
import pandas as pd
import xarray as xr
import intake
import xesmf as xe
from cartopy import crs as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter
First, we will open and search the Pangeo CMIP6 catalog for monthly hfds
(downward heat flux at the sea surface) for the control (piControl
) and 1%/year CO$_2$ (1pctCO2
) runs for all available models on their native grids. The argument require_all_on='source_id'
ensures that each model used has both experiments required for this analysis.
cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)
query = dict(experiment_id=['1pctCO2', 'piControl'], table_id='Omon', variable_id='hfds',
grid_label='gn', member_id='r1i1p1f1', require_all_on='source_id')
cat = col.search(**query)
cat.df
Conveniently, NCAR contributed some data to CMIP6 that has already been regridded to a 1x1 lat-lon grid, which is the resolution I am interested in for the ensemble mean. We will use the coordinates from this Dataset when we create the xESMF regridder.
rg_query = dict(source_id='CESM2', experiment_id='piControl', table_id='Omon', variable_id='hfds',
grid_label='gr', member_id='r1i1p1f1', require_all_on=['source_id'])
rg_cat = col.search(**rg_query)
rg_cat.df
Now, make the dictionaries with the data:
dset_dict = cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})
list(dset_dict.keys())
rg_dset_dict = rg_cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})
list(rg_dset_dict.keys())
First, let's make a function to get the diagnostic of interest: the change in ocean heat uptake at the time of transient CO$_2$ doubling compared to the pre-industrial control:
def get_tcr(ctrl_key, expr_key):
ds_1pct = dset_dict[expr_key].squeeze()
ds_piCl = dset_dict[ctrl_key].squeeze()
ds_tcr = ds_1pct.isel(time=slice(12*59, 12*80)).mean(dim='time') - ds_piCl.isel(time=slice(12*59, 12*80)).mean(dim='time')
return ds_tcr
Note that the time slice is 20 years centered around year 70, which is when CO$_2$ doubles in a 1pctCO2 experiment ($1.01^{70}\approx 2$). Just for convenience, we will also define a function that creates the xESMF regridder and performs the regridding. The regridder is specific to the input (ds_in
) and output (regrid_to
) grids, so it must be redefined for each model.
def regrid(ds_in, regrid_to, method='bilinear'):
regridder = xe.Regridder(ds_in, regrid_to, method=method, periodic=True, ignore_degenerate=True)
ds_out = regridder(ds_in)
return ds_out
Finally, the following function takes the list of keys generated by Intake-ESM and splits them into two sorted lists of keys: one for the piControl experiment and another for 1pctCO2. This will work nicely with the get_tcr()
function.
def sorted_split_list(a_list):
c_list = []
e_list = []
for item in a_list:
if 'piControl' in item:
c_list.append(item)
elif '1pctCO2' in item:
e_list.append(item)
else: print('Could not find experiment name in key:'+item)
return sorted(c_list), sorted(e_list)
Let's make the lists and look at them to make sure they are properly sorted:
ctrl_keys, expr_keys = sorted_split_list(list(dset_dict.keys()))
for i in range(len(ctrl_keys)):
print(ctrl_keys[i]+'\t\t'+expr_keys[i])
If you look at the hfds
anomaly for SAM0-UNICON, you will see negative values around the North Atlantic, especially in the Labrador Sea and Denmark Strait. These are areas of deep water formation and ocean heat uptake. By the CMIP convention, as described in the hfds
attributes, a negative hfds
indicates an upward heat flux from the ocean to the atmosphere, so by physical reasoning, this data should have the opposite sign. We could do this manually, but for simplicity, let's just remove the model from our analysis.
dset_dict['CMIP.SNU.SAM0-UNICON.1pctCO2.Omon.gn']
get_tcr('CMIP.SNU.SAM0-UNICON.piControl.Omon.gn', 'CMIP.SNU.SAM0-UNICON.1pctCO2.Omon.gn').hfds.plot()
ctrl_keys.pop(-3)
expr_keys.pop(-3)
We will also remove AWI-CM because it raises a MemoryError
that causes this notebook to fail to execute via binderbot. Feel free to add it back if this notebook is being run locally.
ctrl_keys.pop(1)
expr_keys.pop(1)
First, we will define the output grid. It does not matter what the data actually is, since we just want the structure of the Dataset.
rg_ds = rg_dset_dict['CMIP.NCAR.CESM2.piControl.Omon.gr'].isel(time=0).squeeze()
rg_ds
Here we create a new dictionary to store our regridded data. The for-loop goes through the two sorted lists of keys and tries to regrid each model. This allows us to avoid removing a model and rerunning the code every time there is an error.
To summarize,
ds_regrid_dict = dict()
success_count = 0
model_count = 0
for ctrl_key, expr_key in zip(ctrl_keys, expr_keys):
model = ctrl_key.split('.')[2]
try:
ds_tcr = get_tcr(ctrl_key=ctrl_key, expr_key=expr_key)
ds_tcr_hfds_regridded = regrid(ds_tcr, rg_ds, method='nearest_s2d').hfds
except Exception as e:
print('Failed to regrid '+model+': '+str(e))
else:
ds_regrid_dict[model] = ds_tcr_hfds_regridded
print(model+' regridded and added to dictionary')
success_count += 1
finally:
model_count += 1
print('-'*40+'\n| '+str(success_count)+'/'+str(model_count)+' models successfully regridded! |\n'+'-'*40)
CESM2-FV2 fails because of some issue with the dimensions of the coordinates. If we remove ignore_degenerate=True
from the regridder defined in regrid()
, there may be a few more failures because of a degenerate element: a cell that has corners close enough that the cell collapses to a line or point.
Now we concat the results into a single DataArray:
ds = list(ds_regrid_dict.values())
coord = list(ds_regrid_dict.keys())
ds_out_regrid = xr.concat(objs=ds, dim=coord, coords='all').rename({'concat_dim':'model'})
ds_out_regrid
The following function extends lon
by one grid point, giving it the value of the first point. This fixes a bug/feature of Cartopy where a vertical white line will appear at the "seam" of the plot. For example, if you have a dataset with longitudes [-179.5, 179.5] and make a plot centered on the Pacific, there will likely be a white line at 180. This is only for improving the look of the plot, so if you are doing further analysis or exporting to netCDF, skip this.
def add_cyclic_point(xarray_obj, dim, period=None):
if period is None:
period = xarray_obj.sizes[dim] / xarray_obj.coords[dim][:2].diff(dim).item()
first_point = xarray_obj.isel({dim: slice(1)})
first_point.coords[dim] = first_point.coords[dim]+period
return xr.concat([xarray_obj, first_point], dim=dim)
Now we can take the ensemble mean and plot. Thanks to the work leading up to this point, it's as simple as using Xarray's .mean()
.
cmip6em_ohutcr = add_cyclic_point(ds_out_regrid.mean(dim='model'), 'lon', period=360)
# cmip6em_ohutcr.to_netcdf('cmip6_ohutcr.nc') # remove add_cyclic_point() and uncomment to save
cmip6em_ohutcr
fig = plt.figure(1, figsize=(12, 5), dpi=130)
ax_mean = plt.subplot(projection=ccrs.PlateCarree(central_longitude=-150))
mean_plot = ax_mean.contourf(cmip6em_ohutcr.lon, cmip6em_ohutcr.lat, cmip6em_ohutcr, transform=ccrs.PlateCarree(),
cmap='RdBu_r', levels=np.linspace(-35, 35, 15), extend='both')
ax_mean.set_title('CMIP6 ensemble-mean $\Delta\mathrm{OHUTCR}$')
ax_mean.coastlines()
ax_mean.set_xticks([-120, -60, 0, 60, 120, 180], crs=ccrs.PlateCarree())
ax_mean.set_yticks([-90, -60, -30, 0, 30, 60, 90], crs=ccrs.PlateCarree())
ax_mean.xaxis.set_major_formatter(LongitudeFormatter(zero_direction_label=True))
ax_mean.yaxis.set_major_formatter(LatitudeFormatter())
plt.colorbar(mean_plot, orientation='vertical', label='W m$^{-2}$')
Notice how the heat uptake is highest in the subpolar oceans, especially the North Atlantic. From this multi-model ensemble mean, we can see that this is a robust feature of climate models (and likely the climate system itself) in response to a CO$_2$ forcing. For more background and motivation, see Hu et al. (2020).
This notebook demonstrates the use of xESMF to regrid the CMIP6 data hosted in Pangeo's Google cloud storage. The regridded data allows us to use Xarray to take a multi-model mean, in this case, of changes in ocean heat uptake associated with each model's transient climate response.
Other example workflows using this CMIP6 cloud data.
Hu, S., Xie, S.-P., & Liu, W. (2020). Global Pattern Formation of Net Ocean Surface Heat Flux Response to Greenhouse Warming. Journal of Climate, 33(17), 7503–7522. https://doi.org/10.1175/JCLI-D-19-0642.1
Xie, S.-P. (2020). Ocean warming pattern effect on global and regional climate change. AGU Advances, 1, e2019AV000130. https://doi.org/10.1029/2019AV000130
Parts of this workflow were taken from a similar workflow in this notebook by NordicESMhub.