As atmospheric scientists, you view, analyze, and synthesize a multitude of datasets every day. These data and model output are collected by numerous entities and are stored in disparate locatons. Luckily, we have access to the THREDDS catalog which nicely aggregates and organizes that data for us. Using MetPy, we can programmatically access, analyze, and display those data for our specific needs on demand.
Today, you will explore a few of the many uses of MetPy by creating a multi-layer plot like this one. Plots like these are ubiquitous in research, forecasting, and education. You may have seen a plot like this in a textbook!
Since the plot you will be creating today includes three layers (satellite imagery, model output, and surface observations), you will repeat this process three times. With each iteration, you will discover nuances of each data product, what the product looks like on its own, and how to prepare it for a final production-quality plot.
from datetime import datetime
import cartopy.crs as ccrs
import cartopy.feature as cfeature
# Notice that we will use "plt" to access matplotlib
import matplotlib.pyplot as plt
import metpy.calc as mpcalc
import metpy.plots as mpplots
import numpy as np
from matplotlib.patheffects import withStroke
from metpy.io import parse_metar_file
from metpy.units import pandas_dataframe_to_unit_arrays
# Here is where we import the TDSCatalog class from siphon for obtaining our data
from siphon.catalog import TDSCatalog
Breakout room activity
The THREDDS Data Server provides us with coherent access to a large collection of real-time and archived datasets from a variety of environmental data sources at a number of distributed server sites. You can browse the TDS in your web browser using this link:
Instructions
In your breakout rooms, take a few moments to browse the catalog in a new tab of your web browser. Then, as a group, find the following information in the cell below. Determine one representative from each room to log the information in the cell below and share with the larger group at the end of the activity.
Answer: netCDF-4
Answer: GRIB-2
The first layer we will obtain is satellite imagery. To do this, we will create a THREDDS Data Server (TDS) catalog object, then extract a single image with its metadata, and visualize the result in a simple plot. For these tasks, we will use xarray to store the information in memory.
# Create TDS catalog object
# Notice the URL ends with .xml rather than .html
# This is because while your web browser uses html
# to show you the data in a nice, human-friendly
# webpage, siphon requires an xml document to read
# the data properly.
satcat = TDSCatalog(
"https://thredds.ucar.edu/thredds/catalog/satellite/goes/east/products/CloudAndMoistureImagery/CONUS/Channel02/current/catalog.xml"
)
# satcat includes many images as a part of the dataset
# choose the first (most recent) image in the data set, index = 0
satdata = satcat.datasets[0].remote_access(use_xarray=True)
# extract data array with geographic information
cmi = satdata.metpy.parse_cf("Sectorized_CMI")
# extract the date and time of the most recent array
dt = datetime.strptime(satdata.attrs["start_date_time"], "%Y%j%H%M%S")
Now that we have the data ready, it's time to plot. Incrementally plotting data is considered a best practice for understanding the data you are working with and determining if you are on your way to creating your intended product. To plot this image, we will use matplotlib to display our data. Let's first make sure we have the information we need:
- What is the name of the dataarray variable we need to plot?
- What is the matplotlib function to use?
- What is the correct syntax for using the function?
# ACTIVITY: Plot CMI
# Create your plot code below
## INSTRUCTOR'S ANSWER KEY ## S
plt.imshow(cmi)
<matplotlib.image.AxesImage at 0x7fe09e797070>
# CHALLENGE: Plot CMI
# Using the matplotlib documentation as a reference,
# change the color map of your above plot to grayscale
# https://matplotlib.org/stable/tutorials/colors/colormaps.html
## INSTRUCTOR'S ANSWER KEY ##
plt.imshow(cmi, cmap="Greys_r")
<matplotlib.image.AxesImage at 0x7fe07e3d3520>
Now we grab data from the Real-time Meso-analysis (RTMA), which gives us a gridded estimate of the realtime conditions. Our goal is to use MetPy to calculate equivalent potential temperature (θe), using the fields we have available in the RTMA.
This time we will also add some additional detail to our intermediate plot. To provide geographic reference, we will plot with a geographic projection and add coastlines.
# Again using Siphon, get the full collection of RTMA
# data opened using xarray.
rtma_cat = TDSCatalog('https://thredds.ucar.edu/thredds/catalog/grib/NCEP/RTMA/CONUS_2p5km/catalog.xml')
rtma_data = rtma_cat.datasets['Full Collection Dataset'].remote_access(use_xarray=True)
rtma_data = rtma_data.metpy.parse_cf()
Before we calculate any derived variables, we need to look up the required input variables. Looking at the metpy.calc documentation equivalent potential temperature, we can see that this function requires three variables: surface pressure, temperature, and dewpoint. Each of these variables are contained in our rtma_data Dataset, but we need to know what variable name each are stored under.
# ACTIVITY:
# rtma_data is an xarray Dataset containing several different variables.
# Use the space below to programmatically display the list of all
# variables in the Dataset.
## INSTRUCTOR'S ANSWER KEY ##
# Recall from day 1
# Have learner share in chat window
# without using print, errors come up accessing one specific variable before the variable list is displayed
print(rtma_data)
<xarray.Dataset> Dimensions: (: 2, altitude_above_msl: 1, height_above_ground: 1, height_above_ground1: 1, time: 736, time1: 732, x: 2145, y: 1377) Coordinates: * time1 (time1) datetime64[ns] ... reftime1 (time1) datetime64[ns] ... * x (x) float32 ... * y (y) float32 ... * altitude_above_msl (altitude_above_msl) float32 ... metpy_crs object ... * time (time) datetime64[ns] ... reftime (time) datetime64[ns] ... * height_above_ground1 (height_above_ground1) float32 ... * height_above_ground (height_above_ground) float32 ... Dimensions without coordinates: Data variables: (12/22) LambertConformal_Projection >i4 0 time1_bounds (time1, ) datetime64[ns] ... Total_precipitation_Forecast_altitude_above_msl_1_Hour_Accumulation (time1, altitude_above_msl, y, x) float32 ... Total_cloud_cover_Analysis_entire_atmosphere_single_layer (time, y, x) float32 ... Pressure_error_surface (time, y, x) float32 ... Wind_direction_from_which_blowing_error_height_above_ground (time, height_above_ground1, y, x) float32 ... ... ... Wind_direction_from_which_blowing_Analysis_height_above_ground (time, height_above_ground1, y, x) float32 ... Wind_speed_Analysis_height_above_ground (time, height_above_ground1, y, x) float32 ... u-component_of_wind_Analysis_height_above_ground (time, height_above_ground1, y, x) float32 ... v-component_of_wind_Analysis_height_above_ground (time, height_above_ground1, y, x) float32 ... Geopotential_height_Analysis_surface (time, y, x) float32 ... Dewpoint_temperature_Analysis_height_above_ground (time, height_above_ground, y, x) float32 ... Attributes: Originating_or_generating_Center: ... Originating_or_generating_Subcenter: ... GRIB_table_version: ... Type_of_generating_process: ... Analysis_or_forecast_generating_process_identifier_defined_by_originating... file_format: ... Conventions: ... history: ... featureType: ... _CoordSysBuilder: ...
Now we pull out the fields (variables) we need for the θe calculation: surface pressure, surface temperature, and surface dewpoint. Here we are making use of xarray and MetPy to select the particular time we want.
Notice the use of the
squeeze()
method. This helps eliminate any stray dimensions for the vertical since we want a 2D array as output (some values are at a particular height, like 2 m, rather than the surface).
# From the list of variables in rtma_data, we will pull
# "Pressure_Analysis_surface" out of the dataset for our
# surface pressure.
pres = rtma_data.Pressure_Analysis_surface.metpy.sel(time=dt, method='nearest').squeeze()
# ACTIVITY:
# Using the information above, create a temp and dewp
# variable that represent the surface temperature and
# dewpoint, respectively.
# temp =
# dewp =
## INSTRUCTOR'S ANSWER KEY ##
temp = rtma_data.Temperature_Analysis_height_above_ground.metpy.sel(time=dt, method='nearest').squeeze()
dewp = rtma_data.Dewpoint_temperature_Analysis_height_above_ground.metpy.sel(time=dt, method='nearest').squeeze()
Now calculate equivalent potential temperature using the metpy.calc documentation
# ACTIVITY: Calculate theta_e
# Create a variable called theta_e to represent the
# equivalent potential temperature field
# theta_e =
## INSTRUCTOR'S ANSWER KEY ##
theta_e = mpcalc.equivalent_potential_temperature(pres, temp, dewp)
# CHALLENGE: Calculate another variable
# Locate another derived variable in the metpy.calc documentation
# that takes in any combination of surface pressure, temperature,
# or dewpoint, then calculate it as a new variable. Plot it using
# any plotting tool to visualize the output.
## INSTRUCTOR'S ANSWER KEY ##
# one of many possibilities
rh = mpcalc.relative_humidity_from_dewpoint(temp, dewp)
plt.imshow(rh, origin="lower")
<matplotlib.image.AxesImage at 0x7fe07e3c6ac0>
# Smooth the theta_e array using a gaussian filter.
# This will improve the readability of the final plot.
theta_e = mpcalc.smooth_gaussian(theta_e, n=50)
Recall using cartopy's
crs
(cartographic reference systems) projections from Day 1 of this workshop.
We can use MetPy shortcuts to create one of these objects for our data. This will help us during plotting, and can be used to change the geographic projection of our data on demand. This is the final preparation step we will take before plotting our data.
# Create the crs object for the theta_e array
# RTMA uses a LambertConformal projection
rtma_crs = theta_e.metpy.cartopy_crs
# Create axes, then use contourf to create filled contours of the theta_e field
ax = plt.axes(projection=ccrs.Robinson())
ax.contourf(theta_e['x'], theta_e['y'], theta_e, transform=rtma_crs)
# ACTIVITY: Plot coastlines
# What single line of code is needed to add
# coastlines to the above plot?
## INSTRUCTOR'S ANSWER KEY ##
ax.coastlines()
<cartopy.mpl.feature_artist.FeatureArtist at 0x7fe07e301c40>
TIP: If you are working with xarray dataarrays parsed via
metpy.parse_cf
, and you aren't sure what your horizontal variables (e.g.x
orlon
orlatitude
) are named, or you just want a shortcut, you can access these quickly withxarray.metpy.x
andxarray.metpy.y
For example, these two lines of code are equivalent for dataarrays parsed with metpy.parse.cf:
ax.contourf(theta_e['x'], theta_e['y'], theta_e, transform=rtma_crs)
ax.contourf(theta_e.metpy.x, theta_e.metpy.y, theta_e, transform=rtma_crs)
# ACTIVITY: Contour plot
# Our final plot will use the LambertConformal projection,
# the same projection the RTMA dataset uses. We also will
# not use filled contours, since they must be plotted
# overtop of a satellite image.
#
# Re-plot the map above using the same projection as the
# RTMA dataset, use the contour() plot type, and include
# coastlines.
#
# Matplotlib contour():
# https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contour.html
#
# CHALLENGE: Single color contours
# Create the contour plot where all contours are a
# single color of your choosing.
## INSTRUCTOR'S ANSWER KEY ##
#ax = plt.axes(projection=rtma_crs)
#ax.contour(theta_e.metpy.x, theta_e.metpy.y, theta_e, transform=rtma_crs)
#ax.coastlines()
# CHALLENGE ANSWER:
ax = plt.axes(projection=rtma_crs)
ax.contour(theta_e.metpy.x, theta_e.metpy.y, theta_e, transform=rtma_crs, colors='blue')
ax.coastlines()
<cartopy.mpl.feature_artist.FeatureArtist at 0x7fe07e2e73d0>
Finally, we will add station plots to our map. Creating a map of station plots is a bit different from our previous two map layers. We're no longer looking at 2D arrays of data, and instead we're looking at a collection of points containing several observations. Despite those differences, though, the same general workflow applies: obtain, prepare, and plot the data. However, the details of each will differ greatly from the satellite and θe plots.
# METAR information is stored in a different
# location from the previous THREDDS catalog,
# notice the change in URL.
metar_cat = TDSCatalog('https://thredds-test.unidata.ucar.edu/thredds/catalog/noaaport/text/metar/catalog.xml')
# Open the metar file that contains data
# closest to the satellite image time, dt
metar_text = metar_cat.datasets.filter_time_nearest(dt).remote_open(mode='t')
metar_text
now holds an object containing our METAR data in ascii text. Now we will prepare the data for plotting by converting it into a pandas DataFrame object usingparse_metar_file()
. A pandas DataFrame is similar to a spreadsheet in that the data are stored into rows and columns.
Example:
This structure allows us to more easily apply some quality control. Using Pandas, we will remove stations that do not have a useful lat/lon value (empty or null), as well as set missing current weather observations to an empty string.
# parse_metar_file() outputs a pandas DataFrame
sfc_data = parse_metar_file(metar_text, year=dt.year, month=dt.month)
# Save the units for all columns to a new variable
sfc_units = sfc_data.units
# Filter out missing lat/lon data
sfc_data = sfc_data[sfc_data.latitude.notna() & sfc_data.longitude.notna()]
# Set missing weather condition data to an empty string, ''
sfc_data['current_wx1'][sfc_data['current_wx1'].isna()] = ''
Now we convert the Pandas DataFrame object into a dictionary object, where each column is now a unique key. This structure is what MetPy requires for creating the final station plot.
# Create final data structure
sfc_data = pandas_dataframe_to_unit_arrays(sfc_data, sfc_units)
The next preparation step is to prepare the current weather conditions. METARs use text abbreviations to denote the current weather conditions, but MetPy requires that these codes be converted to the World Meteorological Organization (WMO) numerical code for plotting.
For example:
We will do this conversion by passing the
current_wx1
key from thesfc_data
dictionary through the wx_code_to_numeric() MetPy function.
# ACTIVITY: Convert wx code
# Convert weather (wx) abbreviation to WMO numeric wx code
# using the documentation linked above.
#
# sfc_data['wx_code'] =
## INSTRUCTOR'S ANSWER KEY ##
sfc_data['wx_code'] = mpplots.wx_code_to_numeric(sfc_data['current_wx1'])
# ACTIVITY: Calculate u and v wind
# The METAR reports wind as speed and direction,
# but MetPy needs wind data as u and v components
# for plotting.
# Review the wind_components documentation and/or
# a code sample for a similar plot type, then
# write the u wind component to sfc_data['u']
# and the v wind component to sfc_data['v']
# Documentation:
# https://unidata.github.io/MetPy/latest/api/generated/metpy.calc.wind_components.html
# Code sample:
# https://unidata.github.io/MetPy/latest/examples/plots/Mesonet_Stationplot.html
## INSTRUCTOR'S ANSWER KEY ##
sfc_data['u'], sfc_data['v'] = mpcalc.wind_components(sfc_data['wind_speed'], sfc_data['wind_direction'])
Now that we have completed all of our data preparation, we can start building the station plot. To do this, we need MetPy's
StationPlot
class. TheStationPlot
class acts as a template that we fill in with the surface observations stored in thesfc_data
variable. The station plot is broken up into regions surrounding a center point. The variables we'll be plotting are:
- air temperature (red, north west
NW
region)- dew point temperature (blue, south west
SW
region)- cloud coverage (white, center
C
region)- current weather symbol/wx code (blue, east
E
region)- wind speed and direction (wind barb).
Example:
NOTE: We are making a few changes from the standard station plot for accessibility and readability. Traditionally, the dewpoint value is displayed in green. However, we are making a modification to make our final plot more accessible to colorblindness. We are also moving the wx code to the east (as oppposed to west) to increase readability.
# Start by creating the matplotlib axes
ax = plt.axes(projection=rtma_crs)
# Create the station plot object, stn,
# from the StationPlot class, using the
# PlateCarree projection
stn = mpplots.StationPlot(ax, sfc_data['longitude'].m, sfc_data['latitude'].m, transform=ccrs.PlateCarree())
# Populate the temperature and dewpoint
stn.plot_parameter('NW', sfc_data['air_temperature'], color='red')
stn.plot_parameter('SW', sfc_data['dew_point_temperature'], color='blue')
# Populate the center circle cloud coverage and weather code
stn.plot_symbol('C', sfc_data['cloud_coverage'], mpplots.sky_cover)
stn.plot_symbol('E', sfc_data['wx_code'], mpplots.current_weather, color='blue')
# Populate the wind bard
stn.plot_barb(sfc_data['u'], sfc_data['v'])
# Add coastlines and set extent to the Southeast US
ax.coastlines()
ax.set_extent((-113, -70, 25, 45))
Oh. Oh no.
Because there are so many surface observations available, we ended up with an illegible plot. Let's now select only a subset of the stations to plot by creating a mask (filter) over our surface observations. The output should limit the number of stations to one station every 175 km.
# Create an array of station locations
locs = rtma_crs.transform_points(ccrs.PlateCarree(), sfc_data['longitude'].m, sfc_data['latitude'].m)
# Create the 1:175 km plot mask
plot_mask = mpcalc.reduce_point_density(locs[..., :2], 175000)
# Increase the size of the plot on the screen
fig = plt.figure(figsize=(20, 10))
# Create matplotlib axes in the Plate Carree projection
ax = fig.add_subplot(projection=rtma_crs)
# Create the StationPlot class with the plot_mask applied
# (append [plot_mask] to the end of each variable)
stn = mpplots.StationPlot(ax, sfc_data['longitude'][plot_mask].m, sfc_data['latitude'][plot_mask].m, transform=ccrs.PlateCarree(), clip_on=True)
# ACTIVITY: Apply plot_mask
# Populate all variables as previous, but apply the plot_mask
# this time. The temperature field is completed for you.
# Continue with dewpoint, cloud coverage, wx_code, and the
# wind barb.
stn.plot_parameter('NW', sfc_data['air_temperature'][plot_mask], color='red')
## INSTRUCTOR'S ANSWER KEY ##
stn.plot_parameter('SW', sfc_data['dew_point_temperature'][plot_mask], color='blue')
stn.plot_symbol('C', sfc_data['cloud_coverage'][plot_mask], mpplots.sky_cover)
stn.plot_symbol('W', sfc_data['wx_code'][plot_mask], mpplots.current_weather, color='blue')
stn.plot_barb(sfc_data['u'][plot_mask], sfc_data['v'][plot_mask])
# Finishing touches
ax.coastlines()
ax.set_extent((-113, -70, 25, 45))
Now that we have obtained, prepared, and visualized all of the individual datasets, we can now put them all together in a multi-layer plot. Many of the visualization tools you previously used will be again applied here, with the addition of a few other tips to help with the readability of the final map.
# Create axes
fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(1, 1, 1, projection=rtma_crs)
# Plot satellite image:
# regrid_shape controls the detail of the reprojection
# of the satellite data; a smaller number indicates
# faster plotting at the expense of detail.
# Start with 500 to get a general sense of the data,
# and increase to ~6000 for better quality plots.
image_extent = (cmi.metpy.x[0], cmi.metpy.x[-1], cmi.metpy.y[0], cmi.metpy.y[-1])
ax.imshow(cmi, extent=image_extent, origin='lower', cmap='Greys_r',
regrid_shape=6000, transform=cmi.metpy.cartopy_crs)
# Plot theta_e as contours:
ax.contour(theta_e.metpy.x, theta_e.metpy.y, theta_e,
levels=range(240, 400, 8), colors='tab:blue',
transform=theta_e.metpy.cartopy_crs)
# Plot surface observations as a station plot:
# Here we are also adding several readability modifications.
# path_effects - used to draw the text with a thin
# outline to help it show up better.
# RGB colors - used to lighten up the default red
# and blue to make text stand out.
stn = mpplots.StationPlot(ax, sfc_data['longitude'][plot_mask].m, sfc_data['latitude'][plot_mask].m,
transform=ccrs.PlateCarree(), fontsize=11, zorder=10, clip_on=True)
stn.plot_parameter('NW', sfc_data['air_temperature'][plot_mask], color=[1.0,0.3,0.3],
path_effects=[withStroke(linewidth=1, foreground='black')])
stn.plot_parameter('SW', sfc_data['dew_point_temperature'][plot_mask], color=[0.6,0.6,1.0],
path_effects=[withStroke(linewidth=1, foreground='black')])
stn.plot_symbol('C', sfc_data['cloud_coverage'][plot_mask], mpplots.sky_cover, color='white')
stn.plot_symbol('E', sfc_data['wx_code'][plot_mask], mpplots.current_weather, color=[0.6,0.6,1.0],
path_effects=[withStroke(linewidth=1, foreground='black')])
stn.plot_barb(sfc_data['u'][plot_mask], sfc_data['v'][plot_mask], color='white')
# Finishing touches, including logo and title
ax.add_feature(cfeature.BORDERS, color='yellow')
ax.add_feature(cfeature.COASTLINE, color='yellow')
ax.set_extent((-113, -70, 25, 45))
ax.set_title(r'GOES-16 Ch. 2, RTMA $\theta_e$, Surface Obs at {:%Y%m%dT%H%M}'.format(dt))
mpplots.add_metpy_logo(fig)
<matplotlib.image.FigureImage at 0x7fe07cbc7070>
/home/dcamron/miniconda3/envs/pyaos-lesson/lib/python3.8/site-packages/cartopy/mpl/style.py:90: UserWarning: facecolor will have no effect as it has been defined as "never". warnings.warn('facecolor will have no effect as it has been '
# ACTIVITY: Make the plot your own
# Use the remaining time to stylize your plot to you liking.
# Use the space below to experiment with your plot.
# Some ideas:
# - Stylize the title of the plot, changing the size
# - Change the colors of the contours or station plots
# - Add additional variables to the station plots
# - Swap the theta_e contours with another RTMA variable
# Challenges:
# - Make this a 3-panel plot with 3 separate axes
# - Add labels to your theta_e contours
# - Review the MetPy example gallery and add elements to your plot
# https://unidata.github.io/MetPy/latest/examples/index.html