This notebook gives an overview of the resampling methods available in Pyresample.
Set some environment variables for performance tweaking
import os
os.environ['PYTROLL_CHUNK_SIZE'] = "1024"
os.environ['DASK_NUM_WORKERS'] = "4"
os.environ['OMP_NUM_THREADS'] = "1"
We'll use Satpy to read some data.
from satpy import Scene
from satpy.resample import get_area_def
import glob
fnames = glob.glob("/home/lahtinep/data/satellite/new/*201909031245*")
scn = Scene(reader='seviri_l1b_hrit', filenames=fnames)
scn.load([10.8])
data = scn[10.8]
Get the source and target area definitions.
source_adef = data.attrs['area']
euro4_adef = get_area_def('euro4')
Nearest neighbour is the most basic resampler Pyresample offers. As the name suggests, it picks the value from the closest source location and places it in the target area.
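The idea can be illustrated with a small NumPy sketch. This is a hypothetical 1-D example, not Pyresample's implementation: each target point takes the value of its closest source sample, and targets with no source sample within a cutoff radius are left empty.

```python
import numpy as np

# Toy 1-D data: source samples at irregular coordinates, and a target grid to fill
source_x = np.array([0.0, 1.0, 2.5, 4.0])
source_values = np.array([10.0, 20.0, 30.0, 40.0])
target_x = np.array([0.2, 2.4, 3.9, 8.0])

# Distance from every target point to every source point
dists = np.abs(target_x[:, None] - source_x[None, :])
nearest = dists.argmin(axis=1)
nearest_dist = dists[np.arange(target_x.size), nearest]

# Analogue of radius_of_influence: targets with no source close enough get NaN
radius = 1.0
resampled = np.where(nearest_dist <= radius, source_values[nearest], np.nan)
print(resampled)  # [10. 30. 40. nan]
```

The last target point (8.0) is farther than the radius from any source sample, so it stays empty; in the satellite case this is what happens at the sparse northern edge of the disk without a large enough `radius_of_influence`.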
Create the resampler, calculate the indices needed for resampling, and resample the data to the target area. The radius_of_influence keyword is used so that there are no gaps in the northern areas due to the sparsity of the source data.
from pyresample.kd_tree import XArrayResamplerNN
resampler = XArrayResamplerNN(source_adef, euro4_adef, radius_of_influence=50e3)
resampler.get_neighbour_info()
res = resampler.get_sample_from_neighbour_info(data)
Compute the resampling and show the resulting image.
res.plot.imshow()
And a small section from the north:
res[:100, 500:600].plot.imshow()
The nearest neighbour interpolation creates rough features near the edges of the geostationary disk. This can be counteracted by using bilinear interpolation, which produces smoother results.
NOTE: This is much slower than nearest neighbour resampling.
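The reason bilinear results are smoother can be sketched with a toy example (again hypothetical values, not Pyresample's implementation): instead of copying the single closest source value, the target point gets a distance-weighted mix of the four surrounding source values.

```python
import numpy as np

# Values at the four corners of one source cell:
# rows are y (top, bottom), columns are x (left, right)
corners = np.array([[0.0, 10.0],
                    [20.0, 30.0]])

def bilinear(fy, fx):
    """Interpolate at fractional position (fy, fx) inside the cell."""
    top = corners[0, 0] * (1 - fx) + corners[0, 1] * fx
    bottom = corners[1, 0] * (1 - fx) + corners[1, 1] * fx
    return top * (1 - fy) + bottom * fy

print(bilinear(0.0, 0.0))  # 0.0  (exactly at the top-left corner)
print(bilinear(0.5, 0.5))  # 15.0 (cell centre: mean of the four corners)
```

Because the output varies continuously as the target location moves across the cell, there are no abrupt jumps between neighbouring target pixels.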
from pyresample.bilinear.xarr import XArrayResamplerBilinear
resampler = XArrayResamplerBilinear(source_adef, euro4_adef,
radius_of_influence=50e3)
resampler.get_bil_info()
res = resampler.get_sample_from_bil_info(data)
res.plot.imshow()
And the same crop we had for nearest neighbour:
res[:100, 500:600].plot.imshow()
As we see, the result is much smoother.
Bucket resampling collects data into the closest target-area pixels, or bins ("buckets"). Each source pixel can end up in only one target pixel; fractional binning hasn't been implemented (yet).
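The mechanics can be sketched in plain NumPy (a toy example with made-up bin assignments, not Pyresample's implementation): every source pixel is assigned to exactly one target bin, and the bins are then reduced with a count or a sum.

```python
import numpy as np

# Target bin index for each source pixel, and the pixel values themselves
bin_idx = np.array([0, 0, 1, 3, 3, 3])
values = np.array([1.0, 3.0, 5.0, 2.0, 4.0, 6.0])
n_bins = 5

# How many source pixels landed in each bin, and their per-bin sums
counts = np.bincount(bin_idx, minlength=n_bins)
sums = np.bincount(bin_idx, weights=values, minlength=n_bins)
print(counts)  # [2 1 0 3 0] -> bins 2 and 4 received no source pixels
print(sums)    # [ 4.  5.  0. 12.  0.]
```

These two reductions correspond to the `get_count()` and `get_sum()` methods used below.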
Load the higher resolution HRV channel for better coverage.
scn = Scene(reader='seviri_l1b_hrit', filenames=fnames)
scn.load(['HRV'])
data = scn['HRV']
source_adef = data.attrs['area']
Initialize the resampler with target area and source coordinates.
from pyresample.bucket import BucketResampler
lons, lats = source_adef.get_lonlats(chunks=1024)
resampler = BucketResampler(euro4_adef, lons, lats)
First, find out how many source pixels fall in each of the target pixels. This can be used, for example, to create a density map of lightning data. The data in this example isn't ideally suited to demonstrate the usage.
counts = resampler.get_count()
import matplotlib.pyplot as plt
plt.imshow(counts, cmap='gray')
plt.colorbar()
plt.show()
The next example shows how to get the sum of all values in each of the target locations. One use case would be to create the above-mentioned lightning density plot from already aggregated data that is in some other projection and needs to be resampled.
sums = resampler.get_sum(data)
plt.imshow(sums, cmap='gray')
plt.colorbar()
plt.show()
A more useful example for the data we are handling is to average the values in each target location. This could in principle be done by dividing the two results above, but it is better to use the built-in version, which does some additional error checking.
The northern parts show nicely how the sparsity of the data at the edge of the SEVIRI imaging area results in plenty of pixels without any data in them.
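One kind of guard such an averaging step needs can be sketched as follows (a toy example continuing the made-up bins above, not Pyresample's code): a naive sum/count division produces warnings and infinities for empty bins, so the division is only done where a bin actually received data.

```python
import numpy as np

# Per-bin sums and counts; bins 2 and 4 received no source pixels
sums = np.array([4.0, 5.0, 0.0, 12.0, 0.0])
counts = np.array([2, 1, 0, 3, 0])

# Divide only where a bin has data; empty bins become NaN
average = np.full_like(sums, np.nan)
np.divide(sums, counts, out=average, where=counts > 0)
print(average)  # [ 2.  5. nan  4. nan]
```

The NaN-filled bins are exactly the "pixels without any data" visible in the north of the averaged image.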
average = resampler.get_average(data)
plt.imshow(average, cmap='gray')
plt.colorbar()
plt.show()
The next bucket resampler method calculates the fractional occurrences of categorical (integer) data in each bin. A value of 1.0 means all the values are from the category in question, while 0.0 means there were no hits. If there were no hits in any category (e.g. in the north in this and the previous examples), the value is np.nan.
Create categorical data from the HRV channel and calculate the fractions for a subset of the categories. The full per-category fractions can be retrieved by omitting the categories keyword argument or by setting it to None. Omitting categories requires searching the array for the unique values, which slows down the processing, so it's better to always define them when known.
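What a per-category fraction means can be sketched with a small NumPy example (toy bin assignments and categories, not Pyresample's implementation): for each target bin, count how many of its source pixels belong to each category and divide by the bin's total count.

```python
import numpy as np

# Target bin of each source pixel, and the pixel's category label
bin_idx = np.array([0, 0, 0, 1, 1, 2])
cats = np.array([60, 80, 60, 100, 100, 60])
n_bins = 3

counts = np.bincount(bin_idx, minlength=n_bins).astype(float)
fractions = {}
for cat in (60, 80, 100):
    # Count only the pixels of this category in each bin
    hits = np.bincount(bin_idx, weights=(cats == cat), minlength=n_bins)
    fractions[cat] = hits / counts  # empty bins would need a NaN guard here
print(fractions[60])  # bin 0: 2/3, bin 1: 0.0, bin 2: 1.0
```

In bin 0, two of the three pixels belong to category 60, giving a fraction of 2/3; the fractions over all categories sum to 1.0 in every non-empty bin.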
import numpy as np
# Digitize the data into 20-reflectance-unit bins
data_int = 20. * (data / 20.).astype(np.uint8)
# Compute the fractions for categories 60, 80 and 100
fractions = resampler.get_fractions(data_int, categories=[60, 80, 100])
# Show which categories are available
fractions.keys()
# Compute only 60 % reflectance fractions for viewing
fraction = fractions[60].compute()
plt.imshow(fraction)
plt.colorbar()
plt.show()