In this notebook, we'll cover opening, inspecting, subsetting, and plotting a TDS dataset using Siphon's remote_access method.
Using remote_access to open a TDS dataset
Before beginning, let's import the packages to be used throughout this training:
import matplotlib.pyplot as plt
import numpy as np
from siphon.catalog import TDSCatalog
Before we use remote_access, we need to find a dataset that we'd like to access.
As an example, we'll use this dataset from the Unidata THREDDS test catalog.
To access a dataset, we need to know two things:
The dataset name can be found on the dataset HTML page, e.g. "GFS_Global_0p5deg_20170825_1800.grib2".
The catalog URL is the URL of the dataset page up to ".html", replacing ".html" with ".xml".
catUrl = "https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.xml"
datasetName = "GFS_Global_0p5deg_20170825_1800.grib2"
If you have another TDS dataset in mind, you can replace the catalog URL and dataset name above to point to that dataset instead.
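If you are starting from a dataset's HTML page, the URL rule described above can be sketched as a small helper. This is only an illustration: html_to_catalog_url is a hypothetical function, not part of Siphon, and the page URL below is assumed by reversing the rule on the catalog URL used in this notebook.

```python
def html_to_catalog_url(html_url: str) -> str:
    """Keep the URL up to '.html' and append '.xml' (hypothetical helper)."""
    head, marker, _tail = html_url.partition(".html")
    if not marker:
        raise ValueError(f"no '.html' found in URL: {html_url}")
    # Anything after '.html' (for example a query string) is dropped.
    return head + ".xml"


# Assumed HTML page URL, obtained by reversing the rule on the catalog URL.
page_url = (
    "https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/"
    "model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html"
)
derived_url = html_to_catalog_url(page_url)
print(derived_url)
```

The result can then be passed to TDSCatalog in place of a hand-edited URL.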
Next, we access the catalog using the catalog URL:
catalog = TDSCatalog(catUrl)
And then select our dataset using the dataset name:
ds = catalog.datasets[datasetName]
ds.name
We can now view the access protocols available for our dataset.
list(ds.access_urls)
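Since the available services vary from server to server, it can help to pick one programmatically. The sketch below assumes that access_urls behaves like a mapping from service name to URL. pick_service is a hypothetical helper, and the example dictionary holds made-up placeholder URLs rather than output from the test server.

```python
def pick_service(access_urls, preferred=("CdmRemote", "OPENDAP")):
    """Return the first preferred service name present in access_urls."""
    for name in preferred:
        if name in access_urls:
            return name
    raise ValueError(f"none of {preferred} offered; available: {list(access_urls)}")


# Placeholder mapping standing in for ds.access_urls (values are made up).
example_urls = {
    "OPENDAP": "https://example.invalid/thredds/dodsC/data.grib2",
    "HTTPServer": "https://example.invalid/thredds/fileServer/data.grib2",
}
service = pick_service(example_urls)
print(service)
```

With a real dataset, the chosen name could be passed along as ds.remote_access(service).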
remote_access
Now that we have our dataset and know its access protocols, we can access the remote dataset.
If the name of the service is not provided, remote_access defaults to using the CdmRemote service.
dataset = ds.remote_access()
The call to ds.remote_access opens the remote dataset and returns a netCDF4-like dataset object, which provides access to the metadata.
# list attributes
list(dataset.ncattrs())
# list variables
list(dataset.variables)
We can also use remote_access to open the dataset via OPENDAP.
dataset = ds.remote_access('OPENDAP')
The returned netCDF4-like dataset object contains the same metadata as that returned by access via CdmRemote.
list(dataset.ncattrs())
list(dataset.variables)
Other than possible reordering of the listed attributes and variables, users should see no difference between the object returned by remote_access using OPENDAP and the one returned using CdmRemote. To read more about the two services, see the resource links.
We can access variables by name using the dataset's variables dictionary.
var = dataset.variables['Precipitable_water_entire_atmosphere_single_layer']
And view the variable's metadata:
print(var.shape)
print(var.dimensions)
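The two tuples are easier to read when paired up, so that each dimension name sits next to its length. The sketch below uses a stand-in object with illustrative values. The dimension names and sizes are assumptions, not the output of the cells above.

```python
from types import SimpleNamespace

# Stand-in for a remote variable; names and sizes are illustrative only.
fake_var = SimpleNamespace(
    dimensions=("time", "lat", "lon"),
    shape=(4, 361, 720),
)

# Pair each dimension name with its length.
sizes = dict(zip(fake_var.dimensions, fake_var.shape))
print(sizes)
```

The same expression, dict(zip(var.dimensions, var.shape)), should work on the real variable, since it only relies on the two attributes printed above.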
Now we can start plotting our data. Let's plot our variable, Precipitable_water_entire_atmosphere_single_layer, for all lat and lon at time=0. First, we need to access the lat and lon variables.
lat = dataset.variables['lat']
lon = dataset.variables['lon']
Note: At this point, no data have been transferred over the network. Data will not be transferred until a variable is sliced, and only data corresponding to the slice are downloaded.
v = np.squeeze(var[0, :, :])  # precipitable water data are subsetted and downloaded here
# plot precipitable water
plt.pcolormesh(lon[:], lat[:], v, shading='auto')  # lat and lon data are subsetted and downloaded here
plt.title(var.name);
Data are finally downloaded when we slice our variables to plot the data. Try changing the indices to request a different subset of data.
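To make the "download on slice" behaviour concrete, here is a toy model that runs without a network connection. LazyVariable is a hypothetical stand-in, not Siphon's implementation: it wraps a local NumPy array and counts how many values each slice would have transferred.

```python
import numpy as np


class LazyVariable:
    """Toy stand-in for a remote variable; counts values 'transferred' per slice."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self.shape = self._data.shape  # metadata is available without any transfer
        self.values_transferred = 0

    def __getitem__(self, key):
        subset = self._data[key]
        self.values_transferred += np.size(subset)  # only the slice is counted
        return subset


remote = LazyVariable(np.arange(2 * 3 * 4).reshape(2, 3, 4))
print(remote.shape, remote.values_transferred)  # nothing transferred yet

first_time = np.squeeze(remote[0, :, :])  # full grid at the first time index
corner = remote[1, :2, :2]                # a smaller subset at the second time index
print(first_time.shape, corner.shape, remote.values_transferred)
```

Requesting a smaller index range adds fewer values to the counter, which mirrors why narrowing the slice reduces the amount of data pulled from the server.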
For more information on Siphon and remote_access, see the Siphon docs.
You may also be interested in reading more about OPENDAP and CDM Remote.