While the loss of the Aral Sea in Kazakhstan and Lake Urmia in Iran has received a lot of attention over the last few decades, this trend is a global phenomenon. Recently, a number of papers have been published on the topic, including one focusing on the Decline of the world's saline lakes. Many of these lakes have lost the majority of their volume over the last century, including Walker Lake (Nevada, USA), which has lost 90 percent of its volume over the last 100 years.
The following example is intended to replicate the typical processing required in change detection studies similar to the Decline of the world's saline lakes.
import intake
import numpy as np
import xarray as xr
import holoviews as hv
import geoviews as gv
import datashader as ds
import cartopy.crs as ccrs
import pandas as pd
import glob
from holoviews.operation.datashader import rasterize
from holoviews.operation.datashader import regrid, shade
hv.extension('bokeh', width=80)
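# Choose a memory limit for the Dask workers. A small limit (originally 6e9
# bytes) stresses the out-of-core processing infrastructure; the value set
# below (10e10 bytes, about 100 GB) is far more permissive.
from dask.distributed import Client
client = Client(memory_limit=10e10, processes=False) # Note: was 6e9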
client
To replicate this study, we first have to obtain the data from primary sources. The conventional way to obtain Landsat image data is to download it through USGS's EarthExplorer or NASA's Giovanni, but to simplify the example, two images have been downloaded from EarthExplorer and cached.
The two images used by the original study are LT05_L1TP_042033_19881022_20161001_01_T1 and LC08_L1TP_042033_20171022_20171107_01_T1, from 1988/10/22 and 2017/10/22 respectively. These images contain Landsat Surface Reflectance Level-2 Science Product images.
intake
In the next cell, we load the Landsat-5 files into a single xarray DataArray using intake. Data sources and caching parameters are specified in a catalog file. Intake is optional, since any other method of creating an xarray.DataArray object would work here as well, but it makes it simpler to work with remote datasets while caching them locally.
cat = intake.open_catalog('catalog.yml')
l5 = cat.l5()
l5.cache[0].clear_all()
L5_img = l5.read_chunked()
print(L5_img)
L5_img = L5_img.squeeze(dim='band', drop=True)
L5_img
l8 = cat.l8()
L8_img = l8.read_chunked()
L8_img = L8_img.squeeze(dim='band', drop=True)
print(L8_img)
Now let us view this DataArray:
And some of the related metadata:
print("The shape of the DataArray is :", L5_img.shape)
print("With attributes:\n ", '\n '.join('%s=%s'%(k,v) for k,v in L5_img.attrs.items()))
We can use the EPSG value shown above under the crs key to create a cartopy coordinate reference system that we will use later in this notebook:
crs=ccrs.epsg(32611)
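As a quick sanity check on that EPSG code, the sketch below derives the WGS 84 / UTM code from a longitude and latitude near Walker Lake. The helper name `utm_epsg_from_lon_lat` is hypothetical (it is not part of cartopy), and it ignores the irregular UTM zones around Norway and Svalbard.

```python
import math


def utm_epsg_from_lon_lat(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat).

    Hypothetical helper for illustration only; it ignores the special
    zones around Norway and Svalbard.
    """
    # UTM zones are 6 degrees wide and numbered 1..60 starting at 180 W.
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate location used later in this notebook (lon=-118, lat=39).
print(utm_epsg_from_lon_lat(-118, 39))  # prints 32611
```

The result matches the 32611 passed to `ccrs.epsg` above, which corresponds to UTM zone 11N.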
L5_img.data[L5_img.data==-9999] = np.nan # Replace the -9999 nodata value with NaN
ndvi5_array = (L5_img[4]-L5_img[3])/(L5_img[4]+L5_img[3])
ndvi5 = ndvi5_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]
Now we can do this for the Landsat 8 files for the 2017 image:
L8_img.data[L8_img.data==-9999] = np.nan # Replace the -9999 nodata value with NaN
ndvi8_array = (L8_img[4]-L8_img[3])/(L8_img[4]+L8_img[3])
ndvi8 = ndvi8_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]
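The two cells above apply the same recipe: mask the -9999 fill value, then compute the normalized difference (NIR - Red) / (NIR + Red). The following is a minimal sketch of that recipe on a tiny synthetic array, using plain NumPy rather than xarray. The function name `ndvi_from_bands` and the sample values are made up for illustration, and which array index corresponds to the red and near-infrared bands depends on how the catalog loads the files.

```python
import numpy as np

NODATA = -9999  # fill value masked in the cells above


def ndvi_from_bands(nir, red, nodata=NODATA):
    """Compute (NIR - Red) / (NIR + Red), treating `nodata` as missing."""
    # Convert to float so that NaN can represent missing pixels.
    nir = np.where(nir == nodata, np.nan, nir.astype("float64"))
    red = np.where(red == nodata, np.nan, red.astype("float64"))
    # Suppress warnings for pixels where NIR + Red is zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (nir - red) / (nir + red)


# Synthetic 2x2 reflectance values (not real Landsat data).
nir = np.array([[4000, 500], [NODATA, 3000]])
red = np.array([[1000, 1500], [800, 3000]])

ndvi = ndvi_from_bands(nir, red)
print(ndvi)
```

Pixels where either band is missing come out as NaN, and the remaining values fall between -1 and 1.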
Using datashader together with geoviews, we can now easily build an interactive visualization where we select between the 1988 and 2017 images. The use of datashader allows these images to be dynamically updated according to zoom level (Note: it can take datashader a minute to 'warm up' before it becomes fully interactive). For more information on how the dropdown widget was created using HoloMap, please refer to the HoloMap reference.
%opts Image (cmap='viridis') [width=500 height=500 tools=['hover'] colorbar=True]
hmap = hv.HoloMap({'1988':gv.Image(ndvi5, crs=crs, vdims=['ndvi']),
                   '2017':gv.Image(ndvi8, crs=crs, vdims=['ndvi'])},
                  kdims=['Year']).redim(x='lon', y='lat') # Mapping 'x' and 'y' from rasterio to 'lon' and 'lat'
rasterize(hmap)
The rest of the notebook shows how statistical operations can reduce the dimensionality of the data, producing new features that could be used as part of an ML pipeline.
The next plot (which may take a minute to compute) shows the mean of the two NDVI images next to their sum:
mean_avg = hmap.collapse(dimensions=['Year'], function=np.mean)
mean_img = gv.Image(mean_avg.data, crs=ccrs.epsg(32611),
                    kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Mean over Year')
summed = hmap.collapse(dimensions=['Year'], function=np.sum)
summed_image = gv.Image(summed.data, crs=ccrs.epsg(32611),
                        kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Sum over Year')
rasterize(mean_img) + rasterize(summed_image)
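Conceptually, collapsing over Year reduces a stack of images to a single image, pixel by pixel. The sketch below illustrates that idea with plain NumPy on synthetic values. It is not how HoloViews implements `collapse`, and the array names are made up for illustration.

```python
import numpy as np

# Synthetic 2x2 NDVI grids standing in for the two years (not real data).
ndvi_1988 = np.array([[0.2, -0.4], [0.6, np.nan]])
ndvi_2017 = np.array([[0.4, 0.0], [0.2, 0.1]])

# Stack along a new leading "Year" axis: shape (year, y, x).
stack = np.stack([ndvi_1988, ndvi_2017], axis=0)

# Reducing over axis 0 removes the Year dimension.
mean_over_year = np.mean(stack, axis=0)
sum_over_year = np.sum(stack, axis=0)

print(mean_over_year)
print(sum_over_year)
```

Note that `np.mean` and `np.sum` propagate NaN, so a pixel missing in either year is missing in the result; `np.nanmean` and `np.nansum` would ignore the missing value instead.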
The change in Walker Lake as viewed using the NDVI can be shown by subtracting the NDVI recorded in 2017 from the NDVI recorded in 1988:
difference = gv.Image(np.subtract(hmap['1988'].data, hmap['2017'].data), crs=ccrs.epsg(32611),
                      kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Difference in NDVI')
rasterize(difference.redim(ndvi='delta_ndvi'))
You can see a large change (positive delta) in the areas where there is water, indicating a reduction in the size of the lake over this time period.
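Because the sign of the difference depends on the order of the operands, it can help to check it on a toy example. The sketch below uses three synthetic pixel values (not taken from the Landsat images) and the same operand order as the `np.subtract` call above, that is, 1988 minus 2017.

```python
import numpy as np

# Synthetic NDVI values for three pixels (illustrative only).
ndvi_1988 = np.array([0.30, -0.20, 0.10])
ndvi_2017 = np.array([0.30, 0.10, -0.25])

# Same operand order as the cell above: first argument minus second.
delta = np.subtract(ndvi_1988, ndvi_2017)

# A positive delta means the NDVI was higher in 1988 than in 2017.
for before, after, d in zip(ndvi_1988, ndvi_2017, delta):
    print(f"1988={before:+.2f}  2017={after:+.2f}  delta={d:+.2f}")
```

Swapping the operands flips the sign of every pixel, so any interpretation of "positive" or "negative" change should be read against the order used in the code.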
lon and lat
As a final example, we can use the sample method to slice across the difference in NDVI along (roughly) the midpoint of the latitude and the midpoint of the longitude. To do this, we define the following helper function to convert latitude/longitude into the appropriate coordinate value used by the Dataset:
def from_lon_lat(x,y):
    return ccrs.epsg(32611).transform_point(x,y, ccrs.PlateCarree())
%%opts Curve [width=600 tools=['hover']]
lon_y, lat_x = from_lon_lat(-118, 39) # Longitude of -118 and Latitude of 39
(difference.sample(lat=lat_x) + difference.sample(lon=lon_y)).cols(1)
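To make the slicing step more concrete, here is a rough NumPy sketch of taking one row or one column from a gridded array at the coordinate nearest to a requested value. The helper `sample_nearest` is hypothetical, and nearest-coordinate matching is an assumption made for illustration; the actual behavior of the HoloViews `sample` method may differ in detail.

```python
import numpy as np


def sample_nearest(values, ys, xs, y=None, x=None):
    """Return the row nearest to `y` or the column nearest to `x`.

    Hypothetical helper: `values` has shape (len(ys), len(xs)).
    """
    if (y is None) == (x is None):
        raise ValueError("pass exactly one of y or x")
    if y is not None:
        # Fix the y coordinate and keep every x: a horizontal transect.
        return values[np.abs(ys - y).argmin(), :]
    # Fix the x coordinate and keep every y: a vertical transect.
    return values[:, np.abs(xs - x).argmin()]


# Synthetic grid: rows follow ys, columns follow xs (illustrative values).
xs = np.array([0.0, 10.0, 20.0])
ys = np.array([100.0, 110.0])
values = np.arange(6.0).reshape(2, 3)

row = sample_nearest(values, ys, xs, y=108.0)  # nearest y is 110
col = sample_nearest(values, ys, xs, x=4.0)    # nearest x is 0

print(row)
print(col)
```

In the notebook, the projected coordinates returned by `from_lon_lat` play the role of the `y` and `x` arguments here.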