Compute the maximum water level during Hurricane Ike on a 9 million node triangular mesh storm surge model. Plot the results with Datashader.
%matplotlib inline
from dask.distributed import Client, progress
from dask_gateway import Gateway
import xarray as xr
import numpy as np
gateway = Gateway()
cluster = gateway.new_cluster()
cluster.scale(25)
cluster
client = Client(cluster)
pangeo = 'PANGEO-GCS-S3'
if pangeo=='ESIP-AWS-S3':
import s3fs
fs = s3fs.S3FileSystem(anon=True)
map = s3fs.S3Map('rsignell/adcirc_test01', s3=fs)
if pangeo=='PANGEO-GCS-S3':
import gcsfs
fs = gcsfs.GCSFileSystem(project='pangeo-181919')
map = gcsfs.mapping.GCSMap('pangeo-data/rsignell/adcirc_test01', gcs=fs, check=False, create=False)
ds = xr.open_zarr(map)
ds
ds['zeta']
Note that unfortunately Xarray does not yet understand that x
and y
are coordinate variables, becuase we have a triangular mesh. Hopefully Xarray will gain some capabilities in this area in the future.
How many GB of sea surface height data do we have?
ds['zeta'].nbytes/1.e9
We want to calculate the maximum water level at every cell over all time steps
ds['zeta'].max(dim='time')
Dask will read data only when necessary, so nothing has been loaded yet. In the next cell, however, we use persist
to persist the data to the Dask worker nodes, and watch the progress
max_var = ds['zeta'].max(dim='time').persist()
progress(max_var)
import numpy as np
import datashader as dshade
import holoviews as hv
import geoviews as gv
import cartopy.crs as ccrs
from holoviews.operation.datashader import datashade, rasterize
datashade.precompute = True
hv.extension('bokeh')
%opts Image RGB VectorField [width=800 height=600]
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
v = np.vstack((ds['x'], ds['y'], max_var)).T
verts = pd.DataFrame(v, columns=['x','y','vmax'])
points = gv.operation.project_points(gv.Points(verts, vdims=['vmax']))
tris = pd.DataFrame(ds['element'].values.astype('int')-1, columns=['v0','v1','v2'])
tiles = gv.WMTS('https://maps.wikimedia.org/osm-intl/{Z}/{X}/{Y}@2x.png')
value = 'max water level'
label = '{} (m)'.format(value)
trimesh = gv.TriMesh((tris, points), label=label)
%%opts Image [colorbar=True] (cmap='rainbow')
meshes = rasterize(trimesh,aggregator=dshade.mean('vmax'))
tiles * meshes
Because Xarray does not yet understand that x
and y
are coordinate variables on this triangular mesh, we create our own simple function to find the closest point. If we had a lot of these, we could use a more fancy tree algorithm.
# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x,y,xi,yi):
ind = np.ones(len(xi),dtype=int)
for i in range(len(xi)):
dist = np.sqrt((x-xi[i])**2+(y-yi[i])**2)
ind[i] = dist.argmin()
return ind
#just offshore of Galveston
lat = 29.2329856
lon = -95.1535041
ind = nearxy(ds['x'].values,ds['y'].values,[lon], [lat])
%%time
ds['zeta'][:,ind].plot()