Data Client

This notebook shows how to use the dataclient Python library to retrieve timeseries data from the time series service in a very fast manner.

For developers whishing to implement their own web service calls, you can find the REST documentation here: https://proba-v-mep.esa.int/api/timeseries/apidocs/

Defining area of interest

We first need to have a polygon that defines our area of interest. We'll read if from a GeoJson file and show it on the map.

Note that this map may not be visible in the exported version of this notebook!

In [16]:
import geopandas as gpd
import pandas as pd
polygons = gpd.read_file("../../datasets/sentinel2/reserves.geojson")
test_area = polygons[polygons.Reservaat == "De Maat"]
test_area = test_area.iloc(0)[0]
In [6]:
from ipyleaflet import Map,DrawControl,GeoJSON
from shapely.geometry import mapping, shape

m = Map(center=[test_area.geometry.centroid.y,test_area.geometry.centroid.x], zoom=11)
dc = DrawControl()
m.add_control(dc)

m.add_layer(GeoJSON(data=mapping(test_area.geometry)))

m

This notebook uses the MEP dataclient library. It is available at https://bitbucket.org/vitotap/mep-dataclient.

Now we want to retrieve a timeseries for our polygon. Before we can do this, we need to know which 'layer' to query. This list of layers can also be retrieved from the web service:

In [7]:
import requests
response = requests.get("https://proba-v-mep.esa.int/api/timeseries/v1.0/ts/")
if response.status_code == 200:
    layerlist = response.json()['layers']
    layers = [layer['name'] for layer in layerlist if layer['name'].startswith('S2_') or layer['name'].startswith('PROBA') or layer['name'].startswith('BIOPAR')]
else:
    raise IOError(response.text)
layers
Out[7]:
['BIOPAR_LAI_V2_GLOBAL',
 'BIOPAR_FAPAR_V1_GLOBAL',
 'PROBAV_L3_S10_TOC_NDVI_333M',
 'S2_MARKERMEER',
 'BIOPAR_FCOVER300_V1_GLOBAL',
 'BIOPAR_FCOVER_V1_GLOBAL',
 'BIOPAR_LAI300_V1_GLOBAL',
 'S2_FAPAR_CLOUDCOVER',
 'BIOPAR_FAPAR300_V1_GLOBAL',
 'BIOPAR_FAPAR_V2_GLOBAL',
 'BIOPAR_FCOVER_V2_GLOBAL',
 'S2_NDVI',
 'S2_LAI',
 'S2_FAPAR',
 'BIOPAR_LAI_V1_GLOBAL',
 'BIOPAR_NDVI300_V1_GLOBAL',
 'BIOPAR_NDVI_V2_GLOBAL',
 'BIOPAR_WB300_V1_GLOBAL',
 'S2_FCOVER',
 'BIOPAR_BA300_V1_GLOBAL']

All of these layers can be queried, so let's go multitemporal, and query a PROBA-V and Sentinel 2 timeseries for the same area.

In [10]:
start = "2016-01-01"
end = "2018-12-01"
import dataclient
sentinel2_data = dataclient.get_timeseries(test_area,"S2_NDVI","2016-01-01","2018-12-01")
probav_data = dataclient.get_timeseries(test_area,"PROBAV_L3_S10_TOC_NDVI_333M","2016-01-01","2018-12-01")

The result is a Pandas Series object, which can be plotted immediately:

In [11]:
%matplotlib inline
import matplotlib.pyplot as plt
sentinel2_data.name="S2 NDVI"
probav_data.name = "PROBA-V NDVI"
sentinel2_data.dropna().plot(figsize=(12,4))
probav_data.dropna().plot(figsize=(12,4))
plt.legend()
plt.show()

Due to it's low temporal resolution, the Sentinel 2 data is heavily affected by clouds. We can also request smoothed curves:

In [12]:
import dataclient
sentinel2_smooth = dataclient.get_timeseries(test_area,"S2_NDVI",start,end,endpoint="http://pep.vgt.vito.be:8080/TimeseriesSmooth/v1.0/ts/")
probav_smooth = dataclient.get_timeseries(test_area,"PROBAV_L3_S10_TOC_NDVI_333M",start,end,endpoint="http://pep.vgt.vito.be:8080/TimeseriesSmooth/v1.0/ts/")
In [13]:
sentinel2_smooth.name="S2 NDVI Smooth"
probav_smooth.name = "PROBA-V NDVI Smooth"
sentinel2_smooth.dropna().plot(figsize=(10,4))
probav_smooth.dropna().plot(figsize=(10,4))
plt.legend()
plt.show()

As we can see, the Sentinel 2 data needs more filtering to reduce the effects of clouds.

Timeseries dissection

Up to this point, we have only used the average pixel value for our timeseries analysis. However, we may want to derive more detailed statistics such as percentiles. To help you out, we added support for histograms to our web service and Python client.

The result of the 'dataclient.get_histogram' function is a time series of histograms, represented by a Pandas Dataframe.

In [18]:
import dataclient
avg_timeseries = dataclient.get_timeseries(test_area,"S2_NDVI", start, end)
histogram = dataclient.get_histogram(test_area,"S2_NDVI", start, end)
histogram.head()
Out[18]:
-0.08 -0.076 -0.07200000000000001 -0.068 -0.064 -0.06 -0.056 -0.052000000000000005 -0.048 -0.044 ... 0.884 0.888 0.892 0.896 0.9 0.904 0.908 0.912 0.916 0.92
2016-01-02 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2016-01-12 2645.0 46238.0 1507.0 4271.0 1005.0 4271.0 6031.0 4146.0 6784.0 4774.0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2016-01-19 832.0 NaN 2547.0 NaN NaN 1030.0 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2016-01-29 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2016-02-11 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 251 columns

One way to plot this timeseries, is using boxplots. This is not so trivial because we already start from histograms, while most plotting libraries compute their own statistics to create the boxplots. This is a first attempt, which has some clear issues with labeling. Suggestions are welcome!

In [19]:
%matplotlib inline
import numpy as np
def binsToBoxPlot(row):
    cumsum = row.cumsum()
    return {
        "label":pd.to_datetime(row.name),
        "whislo":cumsum.index[np.searchsorted(cumsum,cumsum.quantile(0.05))[0]],
        "q1":cumsum.index[np.searchsorted(cumsum,cumsum.quantile(0.25))[0]],
        "med":cumsum.index[np.searchsorted(cumsum,cumsum.quantile(0.5))[0]],
        "q3":cumsum.index[np.searchsorted(cumsum,cumsum.quantile(0.75))[0]],
        "whishi":cumsum.index[np.searchsorted(cumsum,cumsum.quantile(0.95))[0]],
        "fliers":[]
    }
boxplots = histogram.apply(binsToBoxPlot, axis=1)
for boxplot in boxplots[::2]:
    del boxplot["label"]
for boxplot in boxplots[::3]:
    if 'label' in boxplot:
        del boxplot["label"]
for boxplot in boxplots[::5]:
    if 'label' in boxplot:
        del boxplot["label"]
fig, axes = plt.subplots(1, 1,figsize=(17, 6))
axes.bxp(boxplots)
#for tick in axes.get_xticklabels():
#    tick.set_rotation(90)
plt.gcf().autofmt_xdate()
plt.show()

An different approach is to plot individual timeseries for percentiles, this is fairly easy

In [20]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
def quantile(row,quantile):
    cumsum = row.cumsum()
    if len(row.dropna()) == 0:
        return np.NaN
    return cumsum.index[np.searchsorted(cumsum,cumsum.quantile(quantile))[0]]
    
median = histogram.apply(lambda row:quantile(row,0.5),axis=1)
high = histogram.apply(lambda row:quantile(row,0.95),axis=1)
low = histogram.apply(lambda row:quantile(row,0.05),axis=1)

fig, axes = plt.subplots(1, 1, figsize=(16,7))

plt.plot(median.dropna(),label="median")
plt.plot(high.dropna(),label="0.95 percentile")
plt.plot(low.dropna(),label="0.05 percentile")
#plt.plot(avg_timeseries.dropna(),label="average")
plt.title("S2 NDVI")
plt.legend()
plt.show()

Timeseries timelapse

A final way to visualize the timeseries in full detail, is to retrieve the actual images, so we can plot them in an animated timelapse. To access the web service (WCS) that allows us to download these images, we need to provide our username and password. (The same as used for https://www.vito-eodata.be or https://terrascope.be )

In [ ]:
import getpass
import ipywidgets as widgets

username = widgets.Text(
    value='',
    placeholder='',
    description='Username:',
    disabled=False
)
username
In [ ]:
password = getpass.getpass()
credentials=(username.value, password)

With the following call, we download a set of images based on the bounding box of out test area. They are saved into the specified output directory. ('de_maat_rgb') We try to download an image for each available date. For some dates the image may be fully clouded, and for other dates the bounding box may not intersect the available data, which will result in an error that can be ignored.

In [ ]:
from dateutil.parser import parse
import pandas as pd
import dataclient
dataclient.MepLayer(credentials, "CGS__CGS_S2_10M_BANDS", "CGS", "https://sentineldata.vito.be/ows") \
.download_bbox("de_maat_rgb",test_area.geometry,startdate=start,enddate=end,verbose=False)

To filter our retrieved images by cloudcover, we retrieve a timeseries that contains a cloud percentage.

In [28]:
clouds = dataclient.get_timeseries(test_area,"S2_FAPAR_CLOUDCOVER",start,end)

The retrieved images contain 4 10m bands, with reflectance values from 0 to 10000, we define a function that plots this as RGB.

In [25]:
from rasterio.plot import show
def show_sentinel2_rgb(dataset,axis=None):
    band1 = (dataset.read(3,masked=True)-200)/1600.
    band2 = (dataset.read(2,masked=True)-200)/1600.
    band3 = (dataset.read(1,masked=True)-200)/1600.
    
    rgb = np.dstack([band1,band2,band3])
    return axis.imshow(rgb,animated=True,vmin=0.0,vmax=1.0)

Now we can convert this into a timelapse, which shows the image timeseries combined with the average timeseries.

In [29]:
%matplotlib inline

import rasterio
from IPython.display import HTML
from os import listdir
from os.path import isfile, join
import re
import matplotlib.animation as anim

#https://tomroelandts.com/articles/how-to-create-animated-gifs-with-python

image_paths = [f for f in listdir("de_maat_rgb") if f.endswith("tiff") and isfile(join("de_maat_rgb", f))]
image_paths.sort()

size=(800, 300)
fig = plt.figure(figsize=(14, 7))
#fig.set_size_inches(size[0] / 100, size[1] / 100)
ax1=plt.subplot2grid((4, 1), (0, 0), rowspan=3)
ax2=plt.subplot2grid((4, 1), (3, 0))
ax1.set_xticks([])
ax1.set_yticks([])

images = []
prog = re.compile(".*([0-9]{6}).*")
i = 0
for f in image_paths:
    with rasterio.open(join("de_maat_rgb", f)) as dataset:
        fapar = dataset.read(1,masked=True)
        #print(fapar.count())
        if fapar.count() > 10000:                        
            date = pd.to_datetime(prog.match(f).group(1),yearfirst=True)                        
            if date in sentinel2_smooth.dropna().index and clouds[date]<0.2:
                plt_txt = ax1.text(0, 0, date.strftime("%Y-%m-%d"), color='red',fontsize=12)
                plt_im = show_sentinel2_rgb(dataset,ax1)
                graph, = ax2.plot(sentinel2_smooth.dropna())
                #point, = ax2.plot(date, filtered[date], 'g*')
                line =ax2.axvline(x=date,animated=True)
                images.append([plt_im,graph,line,plt_txt])
            #else:
            #    images.append([plt_im,graph])
            
animation = anim.ArtistAnimation(fig, images)
HTML(animation.to_jshtml())
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[29]:


Once Loop Reflect