The first step is to find the satellite data. Normally, we would browse over to http://thredds.ucar.edu/thredds/ and find the top-level THREDDS Data Server (TDS) catalog. From there we can drill down to find satellite data products.
For this tutorial, we want to utilize the new GOES-16 imagery. This imagery is still considered experimental and not for operational use, so it is found on the test dataserver at http://thredds-test.unidata.ucar.edu/. Navigate to the Test Datasets
directory, then GOES-16 Products
and GOES-16
. Here you'll find a folder for each day of the archive. In the daily folders there are subfolders for the CONUS, full disk, and mesoscale sector images. In each of these there is a folder for each channel. As you can see, there is a massive amount of data coming down from GOES-16!
We could download the files to our computers from here, but that can become tedious for downloading many files, requires us to store them on our computer, and does not lend itself to automation.
We can use Unidata's Siphon package to parse the catalog from the TDS. This provides us a nice programmatic way of accessing the data. We start by importing the TDSCatalog
class from siphon and giving it the URL to the catalog we just surfed to manually. Note: Instead of giving it the link to the HTML catalog, we change the extension to XML, which asks the TDS for the XML version of the catalog. This is much better to work with in code.
Next we'll build a URL to get to the current day's data for a specified channel:
from datetime import datetime
from siphon.catalog import TDSCatalog
date = datetime.strftime(datetime.utcnow(), '%Y%m%d')
channel = 8
cat = TDSCatalog('http://thredds-test.unidata.ucar.edu/thredds/catalog/satellite/goes16/GOES16/'
'{}/CONUS/Channel{:02d}/catalog.xml'.format(date, channel))
We now have a TDSCatalog
object called cat
that we can examine and use to get handles to work with the data. To find the latest file, we can look at the cat.datasets
attribute. This is a Python dictionary, mapping the name of the dataset to a Python Dataset object (which came from more XML supplied by the TDS). Since this is a dictionary, we can look at a list of keys and see what datasets are available. Let’s look at the last five keys:
list(cat.datasets)[-5:]
['GOES16_20170407_015224_6.19_2km_32.3N_91.4W.nc4', 'GOES16_20170407_015724_6.19_2km_32.3N_91.4W.nc4', 'GOES16_20170407_020224_6.19_2km_32.3N_91.4W.nc4', 'GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'GOES16_20170407_021224_6.19_2km_32.3N_91.4W.nc4']
What we really want is the most recent data, which is the last item in the list. We can pull that out, and use its name to get the actual Python Dataset
object:
dataset_name = list(cat.datasets)[-2]
dataset = cat.datasets[dataset_name]
print(dataset)
<siphon.catalog.Dataset object at 0x106e51ac8>
The catalog.Dataset
class provides access to a lot of information about a dataset, like metadata (e.g. time range, spatial extent). What we want most however, is to know how to access the data. This is provided by the dataset.access_urls
attribute:
dataset.access_urls
{'CdmRemote': 'http://thredds-test.unidata.ucar.edu/thredds/cdmremote/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'CdmrFeature': 'http://thredds-test.unidata.ucar.edu/thredds/cdmrfeature/grid/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'DAP4': 'http://thredds-test.unidata.ucar.edu/thredds/dap4/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'HTTPServer': 'http://thredds-test.unidata.ucar.edu/thredds/fileServer/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'ISO': 'http://thredds-test.unidata.ucar.edu/thredds/iso/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'NCML': 'http://thredds-test.unidata.ucar.edu/thredds/ncml/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'NetcdfSubset': 'http://thredds-test.unidata.ucar.edu/thredds/ncss/grid/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'OPENDAP': 'http://thredds-test.unidata.ucar.edu/thredds/dodsC/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'UDDC': 'http://thredds-test.unidata.ucar.edu/thredds/uddc/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'WCS': 'http://thredds-test.unidata.ucar.edu/thredds/wcs/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4', 'WMS': 'http://thredds-test.unidata.ucar.edu/thredds/wms/satellite/goes16/GOES16/20170407/CONUS/Channel08/GOES16_20170407_020724_6.19_2km_32.3N_91.4W.nc4'}
These different URLs provide access to the data in different ways: some support different protocols (like OPeNDAP or CDMRemote), others allow harvesting metadata (e.g. ISO). We will use the netCDF4 package to read the data directly from the server using OPeNDAP. You could of course download the files locally if you needed to archive them for research, but there's no need to do that to make a 'real-time' display.
from netCDF4 import Dataset
ds = Dataset(dataset.access_urls['OPENDAP'])
We have a netCDF4-python dataset object now that contains all of the data for a given satellite image. We need to explore the variables available in the file and pull out the useful parts that we need to make a map.
list(ds.variables)
['y', 'x', 'lambert_projection', 'Sectorized_CMI']
Our goal is to plot the imagery, so we'll be using the Sectorized_CMI
variable from the .variables
dictionary.
Rather than just giving back the raw array of data, this gives back a Variable
object; from here not only
can we get the raw data values, but there is useful metadata as well. We can see just what additional information
is present by printing out the Variable
object:
data_var = ds.variables['Sectorized_CMI']
print(data_var)
<class 'netCDF4._netCDF4.Variable'> int16 Sectorized_CMI(y, x) _FillValue: 0 standard_name: brightness_temperature units: kelvin grid_mapping: lambert_projection add_offset: 173.15 scale_factor: 0.0393162 valid_min: 0 valid_max: 4095 _ChunkSizes: [1024 1024] unlimited dimensions: current shape = (2560, 3072) filling off
This reveals several useful pieces of information (such as a longer description of the variable), but we're going to focus on the grid_mapping
data. This attribute is defined by the NetCDF Climate and Forecast (CF) Metadata Conventions and specifies a variable that contains information about the grid's projection.
We also need to get the x and y coordinates of the image from the dataset object. We use an empty slice ([:]
) to copy the actual numeric values out of the variables (for easier use with matplotlib and cartopy).
x = ds.variables['x'][:]
y = ds.variables['y'][:]
We will also grab the variable corresponding to the grid_mapping
attribute so that we can have a look at the projection information. Rather than hard coding the name of the variable (in this case Lambert_Conformal
), we just directly pass the grid_mapping
attribute to the .variables
dictionary; this makes it easier to re-use the code in the future with different data.
proj_var = ds.variables[data_var.grid_mapping]
print(proj_var)
<class 'netCDF4._netCDF4.Variable'> int32 lambert_projection() grid_mapping_name: lambert_conformal_conic standard_parallel: 25.0 longitude_of_central_meridian: -95.0 latitude_of_projection_origin: 25.0 false_easting: 0.0 false_northing: 0.0 semi_major: 6371200.0 semi_minor: 6371200.0 unlimited dimensions: current shape = () filling off
This shows that the projection is Lambert conformal conic. The variable also includes a few parameters (such as the latitude and longitude of the origin) needed to properly set up the projection to match what was used to create the image. This variable also has information about the assumed shape of the earth, which in this case is spherical with a radius of 6371.2 km.
We are finally ready to plot the satellite data! Using imshow()
we can get an image, but there is a lot of work to do here. The data need to be projected to a meaningful representation, map outlines added, and annotations added.
# Make sure the notebook puts figures inline
%matplotlib inline
import matplotlib.pyplot as plt
plt.imshow(data_var[:])
plt.show()
The mapping will be accomplished with a combination of Cartopy and Matplotlib. To begin, we will use Cartopy’s crs
(Coordinate Reference System) module. With the crs
, we will create a Globe
object that contains information about the assumed planet shape used in the projection.
proj_var
<class 'netCDF4._netCDF4.Variable'> int32 lambert_projection() grid_mapping_name: lambert_conformal_conic standard_parallel: 25.0 longitude_of_central_meridian: -95.0 latitude_of_projection_origin: 25.0 false_easting: 0.0 false_northing: 0.0 semi_major: 6371200.0 semi_minor: 6371200.0 unlimited dimensions: current shape = () filling off
import cartopy.crs as ccrs
# Create a Globe specifying a spherical earth with the correct radius
globe = ccrs.Globe(ellipse='sphere', semimajor_axis=proj_var.semi_major,
semiminor_axis=proj_var.semi_minor)
Now that we have a globe of the appropriate shape, we need to create our projection. Knowing that that data are using the Lambert conformal conic projection, we will use the LambertConformal
class. This class uses the globe model we just created, along with the projection information from the file, namely the central latitude, central longitude, and standard parallels.
proj = ccrs.LambertConformal(central_longitude=proj_var.longitude_of_central_meridian,
central_latitude=proj_var.latitude_of_projection_origin,
standard_parallels=[proj_var.standard_parallel],
globe=globe)
Now that we know how to properly reference the imagery data, we can plot
the data. CartoPy's projections are designed to interface with matplotlib, so they can just be passed as the projection
keyword argument when creating an Axes
using the add_subplot
method. Since the x and y coordinates, as well as the image data, are referenced in the lambert conformal projection, we can pass all of them directly to plotting methods (such as imshow
) with no additional information. The extent
keyword argument to imshow
is used to specify the bounds of the image data being plotted. It is especially important to specify that the origin
is at the the upper left of the image (standard practice in imagery). If your forget this, your image will be flipped. Try it!
print(x[0], x[-1], y[0], y[-1])
-2779168.60944 3460374.02806 3417470.07785 -1781810.15965
# Create a new figure with size 10" by 10"
fig = plt.figure(figsize=(10, 10))
# Put a single axes on this figure; set the projection for the axes to be our
# Lambert conformal projection
ax = fig.add_subplot(1, 1, 1, projection=proj)
# Plot the data with mostly the defaults
# Note, we save the image returned by imshow for later...
im = ax.imshow(data_var[:], extent=(x.min(), x.max(), y.min(), y.max()), origin='upper')
# Add high-resolution coastlines to the plot
ax.coastlines(resolution='50m', color='black')
<cartopy.mpl.feature_artist.FeatureArtist at 0x114db6f98>
This is a nice start, but it would be nice to have better geographic references for the image. For example, where are the states? Cartopy's feature
module has support for adding geographic features to plots ,with many features are built in. The BORDERS
built-in feature contains country borders. There is also support for creating "custom" features from the Natural Earth set of free vector and raster map data (CartoPy will automatically download the necessary data and cache it locally). Here we create a feature for states/provinces.
import cartopy.feature as cfeat
# Add country borders with a thick line.
ax.add_feature(cfeat.BORDERS, linewidth='2', edgecolor='black')
# Set up a feature for the state/province lines. Tell cartopy not to fill in the polygons
state_boundaries = cfeat.NaturalEarthFeature(category='cultural',
name='admin_1_states_provinces_lakes',
scale='50m', facecolor='none')
# Add the feature with dotted lines, denoted by ':'
ax.add_feature(state_boundaries, linestyle=':', edgecolor='black')
# Redisplay modified figure
fig
The map is much improved now, but it would look much better with a different color scheme.
Colormapping in matplotlib (which backs CartoPy) is handled through two pieces:
Let's start by setting the colormap to be black and white and normalizing the data to get the best contrast. As with many satellite data fields, the water vapor values range from 0-255. We clip that range down to enhance contrasts in the data. We'll make a histogram to see the most common values in the data. Note: cmap
and norm
can also be set during the imshow
call as keyword arguments.
That's great, but all of the zero values don't let us really see the range of values. Do they stop at 200? Maybe, or maybe there are just very few data points there. It's best to remove the masked elements of the array and plot the histogram again.
plt.hist(data_var[:].compressed().flatten(), bins=255);
# Set colormap
im.set_cmap('Greys_r')
# Set norm
im.set_norm(plt.Normalize(200,255))
# Show figure again
fig
In meteorology, we have many ‘standard’ colortables that have been used for certain types of data. We have included these in Metpy in the metpy.plots.ctables
module. By importing the ColortableRegistry
we gain access to the colortables, as well as handy normalizations to go with them. We can see the colortables available by converting the dictionary to a list.
from metpy.plots.ctables import registry
print(list(registry))
['Carbone42', 'ir_bd', 'ir_drgb', 'ir_rgbv', 'ir_tpc', 'ir_tv1', 'NWS8bitVel', 'NWSReflectivity', 'NWSReflectivityExpanded', 'NWSSpectrumWidth', 'NWSStormClearReflectivity', 'NWSVelocity', 'rainbow', 'test', 'viridis', 'wv_tpc', 'WVCIMSS']
Let’s use the WVCIMSS
colormap, a direct conversion of the GEMPAK colormap. The code below asks for the colormap, as well as a normalization that starts at 0 and increases by a value of 1 for each color in the table. We then apply it to the existing image we have been working with.
wv_norm, wv_cmap = registry.get_with_range('WVCIMSS', 170, 255)
im.set_cmap(wv_cmap)
im.set_norm(wv_norm)
fig
One more thing that would be nice is putting the date and time on the image, so let's do that. First grab the start_date_time
attribute from the dataset object:
ds.start_date_time
'2017097020724'
This timestamp contains the year, Julian day, hour, minute, and second that the image was collected. We will use datetime.strptime
to parse this out into a Python datetime
object.
timestamp = datetime.strptime(ds.start_date_time, '%Y%j%H%M%S')
print(timestamp)
2017-04-07 02:07:24
A Python datetime
object is easy to work with. Let's add it to our plot.
We use the text
method to draw text on our plot. In this case, we call it with a transform
keyword argument, which allows us to tell matplotlib how to interpret the x and y coordinates. In this case, we set the transform to ax.transAxes
, which means "interpret x and y as being in axes space". The axes space has x and y in the range [0, 1] across the entire plotting area (e.g. (0, 0) is lower left, (1, 1) is upper right). Using this, we can put text in the lower right corner (x = 0.99
, y = 0.01
) regardless of the range of x and y (or longitude and latitude) in the plot. We also need to make sure to right-align the text so that the text ends at the specified point.
We use the strftime
method to format the datetime as a string. The details of that format string are described here.
The code below uses matplotlib's path effects to make the text have an outline effect as well. We won't go into detail on that here, but details can be found in the documentation if you want to know more.
We also need to add an annotation telling us what channel is being plotted, as well as a statement that the data is experimental only.
For completeness, the code below replicates the entirety of the plotting code from above.
# Same as before, except we call imshow with our colormap and norm.
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(1, 1, 1, projection=proj)
im = ax.imshow(data_var[:], extent=(x[0], x[-1], y[-1], y[0]), origin='upper',
cmap=wv_cmap, norm=wv_norm)
ax.coastlines(resolution='50m', color='black')
ax.add_feature(state_boundaries, linestyle=':', edgecolor='black')
ax.add_feature(cfeat.BORDERS, linewidth='2', edgecolor='black')
# Add text (aligned to the right); save the returned object so we can manipulate it.
text_time = ax.text(0.99, 0.01, timestamp.strftime('%d %B %Y %H%MZ'),
horizontalalignment='right', transform=ax.transAxes,
color='white', fontsize='x-large', weight='bold')
text_channel = ax.text(0.5, 0.97, 'Experimental GOES-16 Ch.{}'.format(channel),
horizontalalignment='center', transform=ax.transAxes,
color='white', fontsize='large', weight='bold')
# Make the text stand out even better using matplotlib's path effects
from matplotlib import patheffects
text_time.set_path_effects([patheffects.Stroke(linewidth=2, foreground='black'),
patheffects.Normal()])
text_channel.set_path_effects([patheffects.Stroke(linewidth=2, foreground='black'),
patheffects.Normal()])
ax.set_extent([240., 290., 20., 49.])
NOTE: This is just a quick taste of producing an animation using matplotlib. The animation support in matplotlib is robust, but sometimes installation of the underlying tools (mencoder/ffmpeg) can be a little tricky. In order to make sure we get don't get bogged down, this is really more of a demo than something expected to work out of the box.
For windows builds, you might try:
On OSX and linux, conda-forge has packages, so it may be as easy as:
#!conda install -y -n unidata-workshop -c http://conda.anaconda.org/conda-forge ffmpeg
First we'll import the animation support from matplotlib. We also tell it that we want it to render the animations to HTML using the HTML5 video tag:
import os.path
import sys
from matplotlib import rcParams
from IPython.display import HTML
from matplotlib.animation import ArtistAnimation
Then we create an empty figure which will serve as the basis for all of the frames of the animation:
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(1, 1, 1, projection=proj)
ax.coastlines(zorder=2)
ax.coastlines(resolution='50m', color='black')
ax.add_feature(state_boundaries, linestyle=':', edgecolor='black')
ax.add_feature(cfeat.BORDERS, linewidth='2', edgecolor='black')
<cartopy.mpl.feature_artist.FeatureArtist at 0x114eca160>
Then we loop over a bunch of the datasets. For each one we pull out the data and plot both the timestamp and the image. The ArtistAnimation
class takes the Figure
instance and a list as required arguments. The contents of this list is a collection of matplotlib artists for each frame of the animation. In the loop below, we populate this list with the Text
instance created when adding the timestamp as well as the image that results from plotting the data.
# List used to store the contents of all frames. Each item in the list is a tuple of
# (image, text)
artists = []
# How much to downsample
downsample = 2
# Loop over the last 20 satellite images in the catalog
for ds in list(cat.datasets.values())[-20:]:
# Open the data using the HTTPServer access url and pass to GINIFile. Then convert
# it to a netcdf-like dataset
nc = Dataset(ds.access_urls['OPENDAP'])
# Pull out the image data, x and y coordinates, and the time. Also go ahead and
# convert the time to a python datetime
x = nc.variables['x'][::downsample]
y = nc.variables['y'][::downsample]
timestamp = datetime.strptime(nc.start_date_time, '%Y%j%H%M%S')
img_data = nc.variables['Sectorized_CMI'][::downsample]
# Plot the image and the timestamp. We save the results of these plotting functions
# so that we can tell the animation that these two things should be drawn as one
# frame in the animation
im = ax.imshow(img_data, extent=(x.min(), x.max(), y.min(), y.max()), origin='upper',
cmap=wv_cmap, norm=wv_norm)
# Add text (aligned to the right); save the returned object so we can manipulate it.
text_time = ax.text(0.99, 0.01, timestamp.strftime('%d %B %Y %H%MZ'),
horizontalalignment='right', transform=ax.transAxes,
color='white', fontsize='x-large', weight='bold', animated=True)
text_channel = ax.text(0.5, 0.97, 'Experimental GOES-16 Ch.{}'.format(channel),
horizontalalignment='center', transform=ax.transAxes,
color='white', fontsize='large', weight='bold', animated=True)
# Make the text stand out even better using matplotlib's path effects
from matplotlib import patheffects
text_time.set_path_effects([patheffects.Stroke(linewidth=2, foreground='black'),
patheffects.Normal()])
text_channel.set_path_effects([patheffects.Stroke(linewidth=2, foreground='black'),
patheffects.Normal()])
ax.set_extent([240., 290., 20., 49.])
# Stuff them in a tuple and add to the list of things to animate
artists.append((im, text_time, text_channel))
# Create the animation--in addition to the required args, we also state that each
# frame should last 200 milliseconds
anim = ArtistAnimation(fig, artists, interval=200., blit=False)
HTML(anim.to_html5_video())