Notebook

Regridding with xESMF and calculating a multi-model mean¶

Overview¶

The main goal of this workflow is to calculate the mean change in ocean heat uptake (OHU) associated with the transient climate response (TCR) for CMIP6. TCR is defined as the change in global mean surface temperature at the time of CO$_2$ doubling in a climate model run with a 1% increase in CO$_2$ per year. The amount and pattern of heat uptake into the oceans are important in determining the strength of radiative feedbacks and thus climate sensitivity. See Xie (2020) for an overview.

In order to use as many models as possible, we will need to load the model output in its native grid, then regrid to a common grid (here 1°x1° lat-lon) using xESMF. From there, we can take the average across models and either plot the result or save it as a netCDF file for later use.

Prerequisites¶

Concepts	Importance	Notes
Intro to Xarray	Necessary
Computations and Masks with Xarray	Necessary
Load CMIP6 Data with Intake-ESM	Necessary
Intro to Cartopy	Helpful
Understanding of NetCDF	Helpful
Familiarity with CMIP6	Helpful

Time to learn: 30 minutes

Imports¶

In [ ]:

import matplotlib.pyplot as plt
import matplotlib.colors as colors
import numpy as np
import pandas as pd
import xarray as xr
import intake
import xesmf as xe
from cartopy import crs as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter

Access the data¶

First, we will open and search the Pangeo CMIP6 catalog for monthly hfds (downward heat flux at the sea surface) for the control (piControl) and 1%/year CO$_2$ (1pctCO2) runs for all available models on their native grids. The argument require_all_on='source_id' ensures that each model used has both experiments required for this analysis.

In [ ]:

cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)

In [ ]:

query = dict(experiment_id=['1pctCO2', 'piControl'], table_id='Omon', variable_id='hfds', 
             grid_label='gn', member_id='r1i1p1f1', require_all_on='source_id')

cat = col.search(**query)
cat.df

Conveniently, NCAR contributed some data to CMIP6 that has already been regridded to a 1x1 lat-lon grid, which is the resolution I am interested in for the ensemble mean. We will use the coordinates from this Dataset when we create the xESMF regridder.

In [ ]:

rg_query = dict(source_id='CESM2', experiment_id='piControl', table_id='Omon', variable_id='hfds', 
             grid_label='gr', member_id='r1i1p1f1', require_all_on=['source_id'])

rg_cat = col.search(**rg_query)
rg_cat.df

Now, make the dictionaries with the data:

In [ ]:

dset_dict = cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})
list(dset_dict.keys())

In [ ]:

rg_dset_dict = rg_cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})
list(rg_dset_dict.keys())

Define some functions and organize¶

First, let's make a function to get the diagnostic of interest: the change in ocean heat uptake at the time of transient CO$_2$ doubling compared to the pre-industrial control:

In [ ]:

def get_tcr(ctrl_key, expr_key):
    ds_1pct = dset_dict[expr_key].squeeze()
    ds_piCl = dset_dict[ctrl_key].squeeze()
    ds_tcr = ds_1pct.isel(time=slice(12*59, 12*80)).mean(dim='time') - ds_piCl.isel(time=slice(12*59, 12*80)).mean(dim='time')
    return ds_tcr

Note that the time slice is 20 years centered around year 70, which is when CO$_2$ doubles in a 1pctCO2 experiment ($1.01^{70}\approx 2$). Just for convenience, we will also define a function that creates the xESMF regridder and performs the regridding. The regridder is specific to the input (ds_in) and output (regrid_to) grids, so it must be redefined for each model.

In [ ]:

def regrid(ds_in, regrid_to, method='bilinear'):
    regridder = xe.Regridder(ds_in, regrid_to, method=method, periodic=True, ignore_degenerate=True)
    ds_out = regridder(ds_in)
    return ds_out

Finally, the following function takes the list of keys generated by Intake-ESM and splits them into two sorted lists of keys: one for the piControl experiment and another for 1pctCO2. This will work nicely with the get_tcr() function.

In [ ]:

def sorted_split_list(a_list):
    c_list = []
    e_list = []
    for item in a_list:
        if 'piControl' in item:
            c_list.append(item)
        elif '1pctCO2' in item:
            e_list.append(item)
        else: print('Could not find experiment name in key:'+item)
    return sorted(c_list), sorted(e_list)

Let's make the lists and look at them to make sure they are properly sorted:

In [ ]:

ctrl_keys, expr_keys = sorted_split_list(list(dset_dict.keys()))

In [ ]:

for i in range(len(ctrl_keys)):
    print(ctrl_keys[i]+'\t\t'+expr_keys[i])

Note¶

If you look at the hfds anomaly for SAM0-UNICON, you will see negative values around the North Atlantic, especially in the Labrador Sea and Denmark Strait. These are areas of deep water formation and ocean heat uptake. By the CMIP convention, as described in the hfds attributes, a negative hfds indicates an upward heat flux from the ocean to the atmosphere, so by physical reasoning, this data should have the opposite sign. We could do this manually, but for simplicity, let's just remove the model from our analysis.

In [ ]:

dset_dict['CMIP.SNU.SAM0-UNICON.1pctCO2.Omon.gn']

In [ ]:

get_tcr('CMIP.SNU.SAM0-UNICON.piControl.Omon.gn', 'CMIP.SNU.SAM0-UNICON.1pctCO2.Omon.gn').hfds.plot()

In [ ]:

ctrl_keys.pop(-3)
expr_keys.pop(-3)

We will also remove AWI-CM because it raises a MemoryError that causes this notebook to fail to execute via binderbot. Feel free to add it back if this notebook is being run locally.

In [ ]:

ctrl_keys.pop(1)
expr_keys.pop(1)

Regrid the data¶

First, we will define the output grid. It does not matter what the data actually is, since we just want the structure of the Dataset.

In [ ]:

rg_ds = rg_dset_dict['CMIP.NCAR.CESM2.piControl.Omon.gr'].isel(time=0).squeeze()
rg_ds

Here we create a new dictionary to store our regridded data. The for-loop goes through the two sorted lists of keys and tries to regrid each model. This allows us to avoid removing a model and rerunning the code every time there is an error.

To summarize,

Get the diagnostic of interest and try to regrid to a 1x1 lat-lon grid
- If that fails for any reason, print the error
- If the regridding is successful, add it to the new dictionary
Repeat for all models

In [ ]:

ds_regrid_dict = dict()
success_count = 0
model_count = 0

for ctrl_key, expr_key in zip(ctrl_keys, expr_keys):
    model = ctrl_key.split('.')[2]
    try:
        ds_tcr = get_tcr(ctrl_key=ctrl_key, expr_key=expr_key)
        ds_tcr_hfds_regridded = regrid(ds_tcr, rg_ds, method='nearest_s2d').hfds
    except Exception as e:
        print('Failed to regrid '+model+': '+str(e))
    else: 
        ds_regrid_dict[model] = ds_tcr_hfds_regridded
        print(model+' regridded and added to dictionary')
        success_count += 1
    finally:
        model_count += 1
        
print('-'*40+'\n| '+str(success_count)+'/'+str(model_count)+' models successfully regridded! |\n'+'-'*40)

CESM2-FV2 fails because of some issue with the dimensions of the coordinates. If we remove ignore_degenerate=True from the regridder defined in regrid(), there may be a few more failures because of a degenerate element: a cell that has corners close enough that the cell collapses to a line or point.

Now we concat the results into a single DataArray:

In [ ]:

ds = list(ds_regrid_dict.values())
coord = list(ds_regrid_dict.keys())
ds_out_regrid = xr.concat(objs=ds, dim=coord, coords='all').rename({'concat_dim':'model'})
ds_out_regrid

Plot or save the data¶

The following function extends lon by one grid point, giving it the value of the first point. This fixes a bug/feature of Cartopy where a vertical white line will appear at the "seam" of the plot. For example, if you have a dataset with longitudes [-179.5, 179.5] and make a plot centered on the Pacific, there will likely be a white line at 180. This is only for improving the look of the plot, so if you are doing further analysis or exporting to netCDF, skip this.

In [ ]:

def add_cyclic_point(xarray_obj, dim, period=None):
    if period is None:
        period = xarray_obj.sizes[dim] / xarray_obj.coords[dim][:2].diff(dim).item()
    first_point = xarray_obj.isel({dim: slice(1)})
    first_point.coords[dim] = first_point.coords[dim]+period
    return xr.concat([xarray_obj, first_point], dim=dim)

Now we can take the ensemble mean and plot. Thanks to the work leading up to this point, it's as simple as using Xarray's .mean().

In [ ]:

cmip6em_ohutcr = add_cyclic_point(ds_out_regrid.mean(dim='model'), 'lon', period=360)
# cmip6em_ohutcr.to_netcdf('cmip6_ohutcr.nc') # remove add_cyclic_point() and uncomment to save
cmip6em_ohutcr

In [ ]:

fig = plt.figure(1, figsize=(12, 5), dpi=130)
ax_mean = plt.subplot(projection=ccrs.PlateCarree(central_longitude=-150))
mean_plot = ax_mean.contourf(cmip6em_ohutcr.lon, cmip6em_ohutcr.lat, cmip6em_ohutcr, transform=ccrs.PlateCarree(), 
                             cmap='RdBu_r', levels=np.linspace(-35, 35, 15), extend='both')
ax_mean.set_title('CMIP6 ensemble-mean $\Delta\mathrm{OHUTCR}$')
ax_mean.coastlines()
ax_mean.set_xticks([-120, -60, 0, 60, 120, 180], crs=ccrs.PlateCarree())
ax_mean.set_yticks([-90, -60, -30, 0, 30, 60, 90], crs=ccrs.PlateCarree())
ax_mean.xaxis.set_major_formatter(LongitudeFormatter(zero_direction_label=True))
ax_mean.yaxis.set_major_formatter(LatitudeFormatter())
plt.colorbar(mean_plot, orientation='vertical', label='W m$^{-2}$')

Notice how the heat uptake is highest in the subpolar oceans, especially the North Atlantic. From this multi-model ensemble mean, we can see that this is a robust feature of climate models (and likely the climate system itself) in response to a CO$_2$ forcing. For more background and motivation, see Hu et al. (2020).

Summary¶

This notebook demonstrates the use of xESMF to regrid the CMIP6 data hosted in Pangeo's Google cloud storage. The regridded data allows us to use Xarray to take a multi-model mean, in this case, of changes in ocean heat uptake associated with each model's transient climate response.

What's next?¶

Other example workflows using this CMIP6 cloud data.

Resources and references¶

Hu, S., Xie, S.-P., & Liu, W. (2020). Global Pattern Formation of Net Ocean Surface Heat Flux Response to Greenhouse Warming. Journal of Climate, 33(17), 7503–7522. https://doi.org/10.1175/JCLI-D-19-0642.1

Xie, S.-P. (2020). Ocean warming pattern effect on global and regional climate change. AGU Advances, 1, e2019AV000130. https://doi.org/10.1029/2019AV000130

Parts of this workflow were taken from a similar workflow in this notebook by NordicESMhub.