The Disappearing Walker Lake

While the loss of the Aral Sea in Kazakhstan and Lake Urmia in Iran has received a lot of attention over the last few decades, this trend is a global phenomenon. Recently a number of papers have been published on the subject, including one focusing on the Decline of the world's saline lakes. Many of these lakes have lost the majority of their volume over the last century, including Walker Lake (Nevada, USA), which has lost 90 percent of its volume over the last 100 years.

The following example is intended to replicate the typical processing required in change detection studies similar to the Decline of the world's saline lakes.

In [ ]:
import intake
import numpy as np
import xarray as xr
import holoviews as hv
import geoviews as gv
import datashader as ds
import cartopy.crs as ccrs
import pandas as pd
import glob
from holoviews.operation.datashader import rasterize, regrid, shade

hv.extension('bokeh', width=80)
In [ ]:
# Arbitrarily choose a small memory limit (4GB) to stress the
# out-of-core processing infrastructure
from dask.distributed import Client
client = Client(memory_limit=4e9, processes=False)
client

Landsat Image Data

To replicate this study, we first have to obtain the data from primary sources. The conventional way to obtain Landsat image data is to download it through USGS's EarthExplorer or NASA's Giovanni, but to facilitate this example two images have been downloaded from EarthExplorer and cached.

The two images used by the original study are LT05_L1TP_042033_19881022_20161001_01_T1 and LC08_L1TP_042033_20171022_20171107_01_T1, from 1988/10/22 and 2017/10/22 respectively. Both are Landsat Surface Reflectance Level-2 Science Products.

Loading into xarray via intake

In the next cell, we load the Landsat-5 files into a single xarray DataArray using intake. Data sources and caching parameters are specified in a catalog file. Intake is optional, since any other method of creating an xarray.DataArray object would work here as well, but it makes it simpler to work with remote datasets while caching them locally.
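For reference, here is a minimal sketch of the kind of DataArray construction that intake performs behind the scenes. It assumes the GeoTIFF band files have already been downloaded to hypothetical local paths, and it uses rioxarray (not otherwise needed in this notebook) to open each band lazily:

import rioxarray  # assumption: only needed if bypassing intake

# Hypothetical local paths to cached Landsat 5 surface reflectance band files
paths = ['landsat5_band1.tif', 'landsat5_band2.tif']

# Open each band lazily (dask-chunked), drop the length-1 band dimension,
# then stack the files along a new 'band' dimension
bands = [rioxarray.open_rasterio(p, chunks={'x': 256, 'y': 256}).squeeze('band', drop=True)
         for p in paths]
manual_img = xr.concat(bands, dim='band')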

In [ ]:
cat = intake.open_catalog('catalog.yml')
l5 = cat.l5()
l5.cache[0].clear_all()
In [ ]:
L5_img = l5.read_chunked()
print(L5_img)
In [ ]:
L5_img = L5_img.squeeze(dim='band', drop=True)
L5_img
In [ ]:
l8 = cat.l8()
L8_img = l8.read_chunked()
L8_img = L8_img.squeeze(dim='band', drop=True)
print(L8_img)

Now let us view this DataArray, along with some of its related metadata:

In [ ]:
print("The shape of the DataArray is :", L5_img.shape)
print("With attributes:\n ", '\n  '.join('%s=%s'%(k,v) for k,v in L5_img.attrs.items()))

We can use the EPSG value shown above under the crs key to create a cartopy coordinate reference system that we will use later in this notebook:

In [ ]:
crs = ccrs.epsg(32611)
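
If you prefer not to hardcode the EPSG code, it can usually be parsed out of the image metadata instead. The sketch below assumes the crs attribute shown above has a form like '+init=epsg:32611':

# Hedged alternative: derive the EPSG code from the image attributes
# (assumes L5_img.attrs['crs'] ends with 'epsg:32611')
epsg_code = int(str(L5_img.attrs['crs']).split('epsg:')[-1])
crs = ccrs.epsg(epsg_code)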

Computing the NDVI (1988)

Now let us compute the NDVI (normalized difference vegetation index, defined as (NIR − Red) / (NIR + Red)) for the 1988 image. Note that we need to promote the DataArray format as returned by rasterio to an xarray Dataset. This restriction should be lifted in the future (see geoviews issue 209).

In [ ]:
L5_img = L5_img.where(L5_img != -9999)  # Mask the -9999 nodata values as NaN
ndvi5_array = (L5_img[4]-L5_img[3])/(L5_img[4]+L5_img[3])
ndvi5 = ndvi5_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]

Computing the NDVI (2017)

Now we can do this for the Landsat 8 files for the 2017 image:

In [ ]:
L8_img = L8_img.where(L8_img != -9999)  # Mask the -9999 nodata values as NaN
ndvi8_array = (L8_img[4]-L8_img[3])/(L8_img[4]+L8_img[3])
ndvi8 = ndvi8_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]
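Since the same masking and band arithmetic is applied to both images, this logic could be factored into a small helper. The sketch below is just an organizational alternative, with the band indices and nodata value taken from the cells above:

def compute_ndvi(img, nir=4, red=3, nodata=-9999):
    # Mask nodata values, then compute (NIR - Red) / (NIR + Red)
    img = img.where(img != nodata)
    ndvi = (img[nir] - img[red]) / (img[nir] + img[red])
    return ndvi.to_dataset(name='ndvi')[['x', 'y', 'ndvi']]

# ndvi5 = compute_ndvi(L5_img)
# ndvi8 = compute_ndvi(L8_img)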

Viewing change via dropdown

Using datashader together with geoviews, we can now easily build an interactive visualization where we select between the 1988 and 2017 images. The use of datashader allows these images to be dynamically updated according to zoom level (Note: it can take datashader a minute to 'warm up' before it becomes fully interactive). For more information on how the dropdown widget was created using HoloMap, please refer to the HoloMap reference.

In [ ]:
%opts Image (cmap='viridis') [width=500 height=500 tools=['hover'] colorbar=True]
In [ ]:
hmap = hv.HoloMap({'1988':gv.Image(ndvi5, crs=crs, vdims=['ndvi']), 
                   '2017':gv.Image(ndvi8, crs=crs, vdims=['ndvi'])}, 
                  kdims=['Year']).redim(x='lon', y='lat') # Mapping 'x' and 'y' from rasterio to 'lon' and 'lat'
rasterize(hmap)

Computing statistics and projecting display

The rest of the notebook shows how statistical operations can reduce the dimensionality of the data, computing new features that may be used as part of an ML pipeline.
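
As a purely illustrative sketch of what such features might look like, a few scalar summaries could be pulled out of the NDVI arrays computed earlier (the feature names and choice of statistics below are assumptions, not part of the original study):

# Illustrative scalar features derived from the NDVI Datasets computed above
features = {
    'ndvi_mean_1988': float(ndvi5.ndvi.mean()),
    'ndvi_mean_2017': float(ndvi8.ndvi.mean()),
    'ndvi_std_1988': float(ndvi5.ndvi.std()),
    'ndvi_std_2017': float(ndvi8.ndvi.std()),
}
print(features)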

The mean and sum over the two time points

The next plot (which may take a minute to compute) shows the mean of the two NDVI images next to their sum:

In [ ]:
mean_avg = hmap.collapse(dimensions=['Year'], function=np.mean)
mean_img = gv.Image(mean_avg.data, crs=ccrs.epsg(32611), 
                    kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Mean over Year')

summed = hmap.collapse(dimensions=['Year'], function=np.sum)
summed_image = gv.Image(summed.data, crs=ccrs.epsg(32611), 
                        kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Sum over Year')

rasterize(mean_img) + rasterize(summed_image)

Difference in NDVI between 1988 and 2017

The change in Walker Lake as viewed using the NDVI can be shown by subtracting the NDVI recorded in 1988 from the NDVI recorded in 2017:

In [ ]:
difference = gv.Image(np.subtract(hmap['2017'].data, hmap['1988'].data), crs=ccrs.epsg(32611),
                      kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Difference in NDVI')
rasterize(difference.redim(ndvi='delta_ndvi'))

You can see a large change (positive delta) in the areas where the lake has receded, indicating a reduction in the size of the lake over this time period.
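
To put a rough number on that change, one could count the pixels whose delta NDVI exceeds some threshold. The 0.1 threshold and the 30 m Landsat pixel size below are assumptions chosen for illustration, and this presumes the two grids line up as they did for the subtraction above:

# Rough, illustrative area estimate (assumes 30 m pixels and a 0.1 threshold)
delta = difference.data['ndvi']
changed_pixels = int((delta > 0.1).sum())
print("Pixels with delta NDVI > 0.1:", changed_pixels)
print("Approximate area: %.1f km^2" % (changed_pixels * 30 * 30 / 1e6))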

Slicing across lon and lat

As a final example, we can use the sample method to slice across the difference in NDVI along (roughly) the midpoint of the latitude and the midpoint of the longitude. To do this, we define the following helper function to convert latitude/longitude into the corresponding projected coordinate values used by the Dataset:

In [ ]:
def from_lon_lat(x, y):
    # Convert longitude/latitude (degrees) to easting/northing in EPSG:32611
    return ccrs.epsg(32611).transform_point(x, y, ccrs.PlateCarree())
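
As a quick sanity check of the helper, the coordinates below (an approximate location for Walker Lake, chosen only for illustration) should map to easting/northing values in metres:

# Approximate centre of Walker Lake; returns (easting, northing) in EPSG:32611
from_lon_lat(-118.7, 38.7)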
In [ ]:
%%opts Curve [width=600 tools=['hover']]
easting, northing = from_lon_lat(-118, 39) # Longitude of -118 and latitude of 39
(difference.sample(lat=northing) + difference.sample(lon=easting)).cols(1)