Written by Philipp Rudiger
Created: May 15, 2019
Last updated: July 21, 2021
Many natural phenomena exhibit wide differences in the amount they vary across space, with properties varying quickly in some regions of space, and very slowly in others. These differences can make it challenging to make feasible, accurate models using a fixed sampling grid. For instance, for hydrology modelling, areas that are largely flat and uniform can be approximated with just a few samples, while areas of sharp elevation change need many samples per unit area to have a faithful representation of how water will flow. Datashader's support for irregular triangular meshes allows datasets from such simulations to be rendered onscreen efficiently. This notebook shows an example of rendering a dataset from the Chesapeake and Delaware Bay off the US Eastern coast, using data downloaded from the Datashader examples. Once Datashader and the dataset are installed, you can run this notebook yourself to get a live version with interactive zooming for the plots that support it.
First, let's load the data file and take a look at it:
import datashader as ds, datashader.transfer_functions as tf, datashader.utils as du, pandas as pd
fpath = './data/Chesapeake_and_Delaware_Bays.3dm'
df = pd.read_csv(fpath, delim_whitespace=True, header=None, skiprows=1,
names=('row_type', 'cmp1', 'cmp2', 'cmp3', 'val'), index_col=1)
print(len(df))
tf.Images(df.head(), df.tail())
Here we have 1.6 million rows of data, some marked 'ND' (vertices defined as lon,lat,elevation) and others marked 'E3T' (triangles specified as indexes into the provided vertices, in order starting with 1 (i.e. like Matlab or Fortran, not typical Python)).
We can extract the separate triangle and vertex arrays we need for Datashader:
e3t = df[df['row_type'] == 'E3T'][['cmp1', 'cmp2', 'cmp3']].values.astype(int) - 1
nd = df[df['row_type'] == 'ND' ][['cmp1', 'cmp2', 'cmp3']].values.astype(float)
nd[:, 2] *= -1 # Make depth increasing
verts = pd.DataFrame(nd, columns=['x', 'y', 'z'])
tris = pd.DataFrame(e3t, columns=['v0', 'v1', 'v2'])
print('vertices:', len(verts), 'triangles:', len(tris))
We'll also precompute the combined mesh data structure, which doesn't much matter for this 1-million-triangle mesh, but would save time for plotting for much larger meshes:
%time mesh = du.mesh(verts,tris)
We can now visualize the average depth at each location covered by this mesh (with darker colors indicating deeper areas (higher z values, since we inverted the z values above)):
cvs = ds.Canvas(plot_height=900, plot_width=900)
%time agg = cvs.trimesh(verts, tris, mesh=mesh)
tf.shade(agg)
When working with irregular grids, it's important to understand and optimize the properties of the mesh, not just the final rendered data. Datashader makes it easy to see these properties using different aggregation functions:
cvs = ds.Canvas(plot_height=400, plot_width=400)
tf.Images(tf.shade(cvs.trimesh(verts, tris, mesh=mesh, agg=ds.mean('z')), name="mean"),
tf.shade(cvs.trimesh(verts, tris, mesh=mesh, agg=ds.any()), name="any"),
tf.shade(cvs.trimesh(verts, tris, mesh=mesh, agg=ds.count()), name="count", how='linear'),
tf.shade(cvs.trimesh(verts, tris, mesh=mesh, agg=ds.std('z')), name="std")).cols(2)
Here "any" shows the areas of the plane that are covered by the mesh (for constructing a raster mask), "count" shows how many triangles are used in the calculation of each pixel (with few triangles in the offshore area in this case, and more around the complex inland geometry), and "std" showing how much the data varies in each pixel (highlighting regions of steep change). "max" and "min" can also be useful for finding unexpected areas in the mesh or simulation results, e.g. deep troughs too thin to show up in the plot directly.
Note that before calling tf.shade()
, the result of cvs.trimesh
is just an xarray array, and so you can run any algorithm you want on the aggregate to do automatic checking or transformation as part of a processing pipeline.
By default, the results are bilinearly interpolated between values at each vertex, but if interpolation is turned off, the average of the values at each vertex is used for the entire triangle:
cvs = ds.Canvas(plot_height=420, plot_width=420, x_range=(-76.56, -76.46), y_range=(38.78, 38.902))
from colorcet import bmy as c
tf.Images(tf.shade(cvs.trimesh(verts, tris, mesh=mesh, interp=True), cmap=c, name="Interpolated"),
tf.shade(cvs.trimesh(verts, tris, mesh=mesh, interp=False), cmap=c, name="Raw triangles"))
import holoviews as hv
import geoviews as gv
from holoviews import opts
from holoviews.operation.datashader import datashade
hv.extension("bokeh")
opts.defaults(
opts.Image(width=450, height=450),
opts.RGB(width=450, height=450))
wireframe = datashade(hv.TriMesh((tris,verts), label="Wireframe"))
trimesh = datashade(hv.TriMesh((tris,hv.Points(verts, vdims='z')), label="TriMesh"), aggregator=ds.mean('z'))
wireframe + trimesh
Here the underlying wireframe and triangles will be revealed if you enable the wheel-zoom tool and zoom in to either plot.
As you can see, HoloViews will reveal the lon,lat coordinates associated with the trimesh. However, HoloViews does not know how to reproject the data into another space, which is crucial if you want to overlay it on a geographic map in a different coordinate system. GeoViews provides this projection capability if you need it:
opts.defaults(opts.WMTS(width=500, height=500))
tiles = gv.WMTS('https://maps.wikimedia.org/osm-intl/{Z}/{X}/{Y}@2x.png')
%time points = gv.operation.project_points(gv.Points(verts, vdims=['z']))
tiles * datashade(hv.TriMesh((tris, points)), aggregator=ds.mean('z'), precompute=True)
Here you can enable the wheel-zoom tool in the toolbar, then use scrolling and your pointer to zoom and pan in the plot. As always with datashader, the data is provided to the browser in only one resolution to start with, and it will be updated when you zoom in only if you have a running Python process, and are not just viewing this on a static web page.