This notebook shows how to load and analyze ocean data from an out-of-the-box MOM6/CESM G-case simulation (coupled ocean ocean/sea ice).
NOTE: MOM6/CESM is not ready to be used for research.
%matplotlib inline
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import holoviews as hv
import datashader
from holoviews.operation.datashader import regrid, shade, datashade
hv.extension('bokeh', width=100)
This will launch a cluster of virtual machines in the cloud.
from dask.distributed import Client, progress
from dask_kubernetes import KubeCluster
cluster = KubeCluster(n_workers=10)
cluster
👆 Don't forget to click this link to get the cluster dashboard
client = Client(cluster)
client
This data is stored in xarray-zarr format in Google Cloud Storage. This format is optimized for parallel distributed reads from within the cloud environment.
It may take up to a minute to initialize the dataset when you run this cell.
# Load with Cloud object storage
import gcsfs
gcsmap = gcsfs.mapping.GCSMap('pangeo-data/MOM6.cesm/')
ds = xr.open_zarr(gcsmap, decode_times=False)
# Print dataset
print(ds)
The cells below show how to interactively explore the dataset.
sst_ds = hv.Dataset(ds['SST'], kdims=['time', 'geolon', 'geolat'])
sst = sst_ds.to(hv.QuadMesh, kdims=["geolon", "geolat"], dynamic=True)
%opts RGB [width=900 height=600]
datashade(sst, precompute=True, cmap=plt.cm.RdBu_r)
sss_ds = hv.Dataset(ds['SSS'], kdims=['time', 'geolon', 'geolat'])
sss = sst_ds.to(hv.QuadMesh, kdims=["geolon", "geolat"], dynamic=True)
%opts RGB [width=900 height=600]
datashade(sss, precompute=True, cmap=plt.cm.Spectral_r)
Here we make a data reduction by taking the time of SST and SSS. This demonstrates how the cluster distributes the reads from storage.
SST_mean = ds.SST.mean(dim=('time'))
SST_mean
SSS_mean = ds.SSS.mean(dim=('time'))
SSS_mean
%time SST_mean.load()
# plot mean SST
qm = hv.QuadMesh((ds.geolon.values, ds.geolat.values, SST_mean))
datashade(qm, precompute=True, cmap=plt.cm.RdBu_r)
%time SSS_mean.load()
# plot mean SSS
qm = hv.QuadMesh((ds.geolon.values, ds.geolat.values, SSS_mean))
datashade(qm, precompute=True, cmap=plt.cm.Spectral_r)