The Storm EVent ImagRy (SEVIR) dataset is a collection of temporally and spatially aligned images containing weather events captured by satellite and radar. This dataset was created using publically available datasets distributed by NOAA, including the GOES-16 geostationary satellite, and data derived from NEXRAD weather radars, both of which are available on the Registry of Open Data on AWS. This tutorial provides information about the SEVIR dataset as well as code samples that can be used to access the data.
The animation below shows one of thousands of samples in the SEVIR dataset. Each of these "events" consists of 4 hours of data in 5 minute time increments over a 384 km x 384 km patch sampled somewhere over the US. Each event is SEVIR is captured by up to 5 sensors, or image types, described below.
This tutorial includes the following:
There is no shortage of weather data available to the public: Weather radar, GEO & LEO satellite, surface observations, numerical weather prediction models, balloons, ocean buoys, aircraft measurements, lightning sensing, radio occultation, and others are being measured every second of every day to aid in our ability to analyze and forecast the weather. As a result, both the size and complexity of all of this data can make it difficult to work with. Many datasets, such as radar and satellite, are too large to process in bulk for those without access to massive compute and storage. Some datasets also require a level of domain expertise to decode, understand and process. These are some of the barriers for scientists and researchers who want to understand and use this data for their applications.
To help alleviate these issues, the SEVIR dataset was constructed to enable faster R&D in weather sensing, avoidance, short-term forecasting and other related applications. SEVIR combines and aligns multiple weather sensing modalities into a single, accessible dataset accessible for free on the cloud that can be used by meteorologists, data scientists and other researchers.
SEVIR is a collection of thousands of "events", which are 4-hour sequences of weather captured by up to 5 sensors. These 5 sensing modalities are summarized in the table below:
Sensor | Data key | Description | Spatial Resolution | Patch Size | Time step | Sample Colorized Image |
---|---|---|---|---|---|---|
GOES-16 C02 0.64 μm | vis |
Visible satellite imagery | 0.5 km | 768x768 | 5 minute | ![]() |
GOES-16 C09 6.9 μm | ir069 |
Infrared Satellite imagery (Water Vapor) | 2 km | 192 x 192 | 5 minutes | ![]() |
GOES-16 C13 10.7 μm | ir107 |
Infrared Satellite imagery (Window) | 2 km | 192 x 192 | 5 minutes | ![]() |
Vertically Integrated Liquid (VIL) | vil |
NEXRAD radar mosaic of VIL | 1 km | 384 x 384 | 5 minutes | ![]() |
GOES-16 GLM flashes | lght |
Detections of inter cloud and cloud to ground lightning events | 8 km | N/A | Continuous | ![]() |
Four of the five sensor types listed above (vis
,ir069
,ir107
& vil
) are raster (image) types. Their data is represented as a sequence of single channel (gray scale) images. Each 4 hour event corresponding to these these data types are represented as a L x L x 49
tensor, where L
is the image size given in the Patch Size column above. The last dimension of length 49 represents the time dimension of the data, which is consistent across all samples. Additional information on working with these images is given below.
The lightning type lght
is not stored as an image, but rather as a collection of discrete events. This lightning data for an event is represented by an Nx5
matrix, that contains information about flash time and location. Additional details are provided below.
Many of the events in SEVIR can augmented using data collected by the National Weather Service (NWS). Approximately 20% of the cases in SEVIR were selected based on tagged "Storm Events" logged by the National Weather Service (NWS) which are archived here: https://www.ncdc.noaa.gov/stormevents/. This database includes additional information including:
For example, below is one entry from the NWS Storm Events database linked to the event depicted in Figure 1. This represents one row from a CSV file whose columns are described here
StormEvents_details-ftp_v1.0_d2019_c20191217.csv:
201904,30,1452,201904,30,1454,137733,827433,"ILLINOIS",17,2019,"April","Tornado","C",135,"MONTGOMERY","LSX","30-APR-19 14:52:00","CST-6","30-APR-19 14:54:00","0","0","0","0","0.00K","0.00K","Storm Chaser",,,,,"EF0","1.17","25",,,,,"1","N","DONNELLSON","2","NE","DONNELLSON","39.0456","-89.4682","39.0517","-89.4478","As a warm front lifted northward through the region, widespread showers and thunderstorms developed. Some of the storms produced severe weather including large hail and a couple of tornadoes. The activity persisted into the early morning hours of May 1st, but no reports of severe weather occurred with this event on May 1st.","Several storm chasers witnessed a weak short-lived tornado from different vantage points. The tornado lifted dust from a field and small branches and leaves from a few trees. The location is estimated from a combination of radar and chaser reports. The tornado was rated EF0 with a path length of 1.17 miles and a max path width of 25 yards. There were no deaths or injuries reported with this tornado.","CSV"
As can be seen, this text data includes descriptions of time and location of a tornado, and a summary of the event.
Each event in SEVIR is assigned a unique id
, e.g. S827433
. All SEVIR id
's beginning with an S
are associated with an NWS Storm Event. The number following S
in the SEVIR id is the Event ID used in the NWS Storm Event database. The remaining events in SEVIR that are not directly linked to NWS storm events were selected randomly from throughout the Continental US. These randomly selected cases have id
's that begin with R
, e.g. R18032414177862
.
This section provides descriptions and code samples of how to access, visualize and work with SEVIR data.
SEVIR contains two major components:
We described each component separately below.
The data catalog is contained in the file named CATALOG.csv
. Each row of this file represents a single event captured by one of the sensor types listed in Table 1. The columns of the catalog are as follows:
Column | Description |
---|---|
id |
Unique id given to each event in SEVIR. Note that up to 5 rows of the catalog may posses this ID, since each event may be captured by up to 5 of the sensor types listed in Table 1. |
file_name |
Name of the HDF5 file containing the image data. |
file_index |
File index within file_name where the data is located |
img_type |
Image or sensor type from the "Data key" column in Table 1 |
time_utc |
UTC Timestamp of the event, which typically corresponds to the middle frame in the event |
minute_offsets |
Colon separated values denoting time offset in minutes of each frame relative to time_utc |
episode_id |
Storm Event EPISODE_ID associated to the SEVIR event (NWS Storm Events only) (see Note 1) |
event_id |
Storm Event EVENT_ID associated to the SEVIR event (NWS Storm Events only) (see Note 1) |
llcrnrlat |
Latitude of the lower left corner |
llcrnrlon |
Longitude of the lower left corner |
urcrnrlat |
Latitude of the upper right corner |
urcrnrlon |
Longitude of the upper right corner |
proj |
Proj4 string describing the map projection of the image (See Note 2) |
size_x |
X Size of the images in pixels |
size_y |
Y Size of the images in pixels |
height_m |
X Size of the images in meters |
width_m |
Y Size of the images in meters |
data_min |
Minimum data value across all frames of the event |
data_max |
Maximum data value across all frames of the event |
pct_missing |
Percentage of missing values across all the frames |
Notes:
There may additional Storm Event IDs that overlap with the 4 hour event in SEVIR. The catalog will only provide one such IDs. Others can be found by comparing the time and locations of the Storm Event with the metadata of the SEVIR event.
All projections (currently) in SEVIR are Lambert Azmuthial Equal Area. This column exists incase future data use a different projection.
The image data in SEVIR is stored in a collection of several HDF5 files. These files are organized in the following directory structure
data
+-- img_type
| +-- year
The filenames of the data are formatted as follows:
SEVIR_<IMG_TYPE>_<EVENT_TYPE>_<YEAR>_<START>_<END>.h5
Where
<TYPE>
is the image or sensor type (vil
, vis
, ir069
, etc..),
<EVENT_TYPE>
is either STORMEVENTS
, if the events are linked with NWS storm events, RANDOMEVENTS
, if the events are randomly sampled, or ALLEVENTS
if the two types are mixed into the same file.
<YEAR>_<START>_<END>
Denotes the year, start date and end date of the events in the file.
Each file contains the id
's of the events, as well as the event data. For example a summary of one such file is:
$ h5ls SEVIR_VIL_STORMEVENTS_2018_0101_0630.h5
id Dataset {793}
vil Dataset {793, 384, 384, 49}
All raster types in SEVIR (vis
, ir069
, ir107
& vil
) store the data 4D tensors with shapes N x L x W x T
,
where N
is the number of events in the file, LxL
is the image size (see patch size column in Table 1), and T
is the number of time steps in the image sequence.
For efficient storage, data is saved in HDF files as an integer type. Depending on the sensor type, these integers can be decoded into floating type so they represent the actual values captured by each sensor type. This decoding is done using either a simple linear scaling, or using an exponential transformation, as described below.
The satellite images (vis
,ir069
and ir107
) use a linear scaling to encode their values. To convert the integer data stored in the file to floating types, apply the formula
decoded_data = encoded_data * SCALING_FACTOR
where SCALING FACTOR
is provided in the following table, along with the units of the decoded data:
Type | SCALING_FACTOR |
Decoded units |
---|---|---|
vis |
1e-4 | Reflectance factor |
ir069 |
1e-2 | Degrees C |
ir107 |
1e-2 | Degrees C |
The vil
images in SEVIR are stored as integers in the range 0-254, and uses the value 255 in regions where data is missing. It is often convenient to work directly with these encoded units. To convert these into units of kg/m2, which are the true units of vertically integrated liquid, apply the following rule:
0 if X <= 5
(X - 2) / 90.66 if 5 < X <= 18
exp(( X - 83.9 ) / 38.9) if 18 < X <= 254
Where X
is the integer value stored in the HDF file. This non-linear scaling rule was developed to make better use of the range 0-254 for storing VIL. The reason for this is that VIL data exhibits histograms that quite skewed, and the non-linear encoding provides better precision and lessens the skewness.
The lght
data is the only non-raster type in SEVIR (currently). This data is stored in the HDF using the SEVIR id
as a key. Associated to each id
is an N x 5
matrix describing each 4 hour event. Each row of this matrix represents a single lightning flash identified by the GOES-16 GLM sensor. The columns of this matrix are described in the following table:
Column index | Meaning |
---|---|
0 | Time of flash in seconds relative to time_utc column in the catalog. |
1 | Reported latitude of flash in degrees |
2 | Reported longitude of flash in degrees |
3 | Flash X coordinate when converting to raster |
4 | Flash Y coordinate when converting to raster |
The X
and Y
coordinates of the flash are used when converting the flash data into raster format (see more information below).
This section includes code samples for working with SEVIR data. SEVIR data can downloaded and accessed from local storage, or through Amazon's S3 cloud storage. This tutorial will assume SEVIR data was downloaded onto a local or network file system. The examples below require the following variables to be set:
# Enter path to the SEVIR data location
DATA_PATH = '../data'
CATALOG_PATH = '../CATALOG.csv'
# On some Linux systems setting file locking to false is also necessary:
import os
os.environ["HDF5_USE_FILE_LOCKING"]='FALSE'
The following code shows how to open a data file directly without using the catalog to visualize frames of an event captured by a single type using the h5py
module:
import os
import h5py # needs conda/pip install h5py
import matplotlib.pyplot as plt
file_index = 0
with h5py.File('%s/vil/2018/SEVIR_VIL_STORMEVENTS_2018_0101_0630.h5' % DATA_PATH,'r') as hf:
event_id = hf['id'][file_index]
vil = hf['vil'][file_index]
print('Event ID:',event_id)
print('Image shape:',vil.shape)
fig,axs=plt.subplots(1,4,figsize=(10,5))
axs[0].imshow(vil[:,:,10])
axs[1].imshow(vil[:,:,20])
axs[2].imshow(vil[:,:,30])
axs[3].imshow(vil[:,:,40])
plt.show()
Event ID: b'S728503' Image shape: (384, 384, 49)
The catalog can be used to easily obtain event data captured by multiple sensor types. The catalog can be grouped on the id
column to locate all available sensors for each event. Since that some events in SEVIR are not covered by all sensors, so it is also necessary to filter out events that aren't covered by all requested image types.
The code below shows how to extract an event observed by four image types: vis
, vil
, ir069
and ir107
import h5py # needs conda/pip install h5py
import matplotlib.pyplot as plt
import pandas as pd
# Read catalog
catalog = pd.read_csv(CATALOG_PATH,parse_dates=['time_utc'],low_memory=False)
# Desired image types
img_types = set(['vis','ir069','ir107','vil'])
# Group by event id, and filter to only events that have all desired img_types
events = catalog.groupby('id').filter(lambda x: img_types.issubset(set(x['img_type']))).groupby('id')
event_ids = list(events.groups.keys())
print('Found %d events matching' % len(event_ids),img_types)
# Grab a sample event and view catalog entries
sample_event = events.get_group( event_ids[-1])
print('Sample Event:',event_ids[-1])
sample_event
Found 12739 events matching {'ir069', 'ir107', 'vis', 'vil'} Sample Event: S858968
id | file_name | file_index | img_type | time_utc | minute_offsets | episode_id | event_id | event_type | llcrnrlat | ... | urcrnrlat | urcrnrlon | proj | size_x | size_y | height_m | width_m | data_min | data_max | pct_missing | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
31740 | S858968 | vis/2019/SEVIR_VIS_STORMEVENTS_2019_0901_0930.h5 | 75 | vis | 2019-09-17 19:54:00 | -120:-115:-110:-105:-100:-95:-90:-85:-80:-75:-... | 142807.0 | 858968.0 | Thunderstorm Wind | 46.167805 | ... | 49.388668 | -89.263121 | +proj=laea +lat_0=38 +lon_0=-98 +units=m +a=63... | 768 | 768 | 384000.0 | 384000.0 | 0.033034 | 1.115368 | 0.0 |
45328 | S858968 | ir107/2019/SEVIR_IR107_STORMEVENTS_2019_0701_1... | 543 | ir107 | 2019-09-17 19:54:00 | -120:-115:-110:-105:-100:-95:-90:-85:-80:-75:-... | 142807.0 | 858968.0 | Thunderstorm Wind | 46.167805 | ... | 49.388668 | -89.263121 | +proj=laea +lat_0=38 +lon_0=-98 +units=m +a=63... | 192 | 192 | 384000.0 | 384000.0 | -68.275558 | 20.554308 | 0.0 |
46244 | S858968 | ir069/2019/SEVIR_IR069_STORMEVENTS_2019_0701_1... | 544 | ir069 | 2019-09-17 19:54:00 | -120:-115:-110:-105:-100:-95:-90:-85:-80:-75:-... | 142807.0 | 858968.0 | Thunderstorm Wind | 46.167805 | ... | 49.388668 | -89.263121 | +proj=laea +lat_0=38 +lon_0=-98 +units=m +a=63... | 192 | 192 | 384000.0 | 384000.0 | -68.739433 | -23.089878 | 0.0 |
50085 | S858968 | vil/2019/SEVIR_VIL_STORMEVENTS_2019_0701_1231.h5 | 421 | vil | 2019-09-17 19:54:00 | -119:-114:-109:-104:-99:-94:-89:-84:-79:-74:-6... | 142807.0 | 858968.0 | Thunderstorm Wind | 46.167805 | ... | 49.388668 | -89.263121 | +proj=laea +lat_0=38 +lon_0=-98 +units=m +a=63... | 384 | 384 | 384000.0 | 384000.0 | 0.000000 | 254.000000 | 0.0 |
74782 | S858968 | lght/2019/SEVIR_LGHT_ALLEVENTS_2019_0901_1001.h5 | 0 | lght | 2019-09-17 19:54:00 | NaN | 142807.0 | 858968.0 | Thunderstorm Wind | 46.167805 | ... | 49.388668 | -89.263121 | +proj=laea +lat_0=38 +lon_0=-98 +units=m +a=63... | 48 | 48 | 384000.0 | 384000.0 | 0.000000 | 453187.000000 | 0.0 |
5 rows × 21 columns
This data frame contains rows for each of the image types in SEVIR corresponding to the listed id
. The following code uses rows of sample_event
to extract image data for each type:
def read_data( sample_event, img_type, data_path=DATA_PATH ):
"""
Reads single SEVIR event for a given image type.
Parameters
----------
sample_event pd.DataFrame
SEVIR catalog rows matching a single ID
img_type str
SEVIR image type
data_path str
Location of SEVIR data
Returns
-------
np.array
LxLx49 tensor containing event data
"""
fn = sample_event[sample_event.img_type==img_type].squeeze().file_name
fi = sample_event[sample_event.img_type==img_type].squeeze().file_index
with h5py.File(data_path + '/' + fn,'r') as hf:
data=hf[img_type][fi]
return data
vis = read_data(sample_event, 'vis')
ir069 = read_data(sample_event, 'ir069')
ir107 = read_data(sample_event, 'ir107')
vil = read_data(sample_event, 'vil')
# plot a frame from each img_type
fig,axs = plt.subplots(1,4,figsize=(10,5))
frame_idx = 30
axs[0].imshow(vis[:,:,frame_idx]), axs[0].set_title('VIS')
axs[1].imshow(ir069[:,:,frame_idx]), axs[1].set_title('IR 6.9')
axs[2].imshow(ir107[:,:,frame_idx]), axs[2].set_title('IR 10.7')
axs[3].imshow(vil[:,:,frame_idx]), axs[3].set_title('VIL')
(<matplotlib.image.AxesImage at 0x7f6a11ab79e8>, Text(0.5, 1.0, 'VIL'))
To include lightning in the data gathered in Example 2, we first need to convert it to a raster type to display it along with the other image types. The following code achieves this by converting the flash data into flash counts per pixel per 5 minute frame:
import numpy as np
def lght_to_grid(data):
"""
Converts SEVIR lightning data stored in Nx5 matrix to an LxLx49 tensor representing
flash counts per pixel per frame
Parameters
----------
data np.array
SEVIR lightning event (Nx5 matrix)
Returns
-------
np.array
LxLx49 tensor containing pixel counts
"""
FRAME_TIMES = np.arange(-120.0,125.0,5) * 60 # in seconds
out_size = (48,48,len(FRAME_TIMES))
if data.shape[0]==0:
return np.zeros(out_size,dtype=np.float32)
# filter out points outside the grid
x,y=data[:,3],data[:,4]
m=np.logical_and.reduce( [x>=0,x<out_size[0],y>=0,y<out_size[1]] )
data=data[m,:]
if data.shape[0]==0:
return np.zeros(out_size,dtype=np.float32)
# Filter/separate times
# compute z coodinate based on bin locaiton times
t=data[:,0]
z=np.digitize(t,FRAME_TIMES)-1
z[z==-1]=0 # special case: frame 0 uses lght from frame 1
x=data[:,3].astype(np.int64)
y=data[:,4].astype(np.int64)
k=np.ravel_multi_index(np.array([y,x,z]),out_size)
n = np.bincount(k,minlength=np.prod(out_size))
return np.reshape(n,out_size).astype(np.float32)
def read_lght_data( sample_event, data_path=DATA_PATH ):
"""
Reads lght data from SEVIR and maps flash counts onto a grid
Parameters
----------
sample_event pd.DataFrame
SEVIR catalog rows matching a single ID
data_path str
Location of SEVIR data
Returns
-------
np.array
LxLx49 tensor containing pixel counts for selected event
"""
fn = sample_event[sample_event.img_type=='lght'].squeeze().file_name
id = sample_event[sample_event.img_type=='lght'].squeeze().id
with h5py.File(data_path + '/' + fn,'r') as hf:
data = hf[id][:]
return lght_to_grid(data)
lght = read_lght_data(sample_event)
# include lightning counts in plot
fig,axs = plt.subplots(1,5,figsize=(14,5))
frame_idx = 30
axs[0].imshow(vis[:,:,frame_idx]), axs[0].set_title('VIS')
axs[1].imshow(ir069[:,:,frame_idx]), axs[1].set_title('IR 6.9')
axs[2].imshow(ir107[:,:,frame_idx]), axs[2].set_title('IR 10.7')
axs[3].imshow(vil[:,:,frame_idx]), axs[3].set_title('VIL')
axs[4].imshow(lght[:,:,frame_idx]), axs[4].set_title('Lightning')
(<matplotlib.image.AxesImage at 0x7f69e00c8470>, Text(0.5, 1.0, 'Lightning'))
Example 3 used the default colormap to display the data. To display colorized images similar to those seen in Figure 1, colormaps are available in the repo located here: https://github.com/MIT-AI-Accelerator/eie-sevir. This code requires the module sevir
found in this repo to be in your path:
import sys
sys.path.append('../') # add sevir module to path
from sevir.display import get_cmap
# Get colormaps for encoded types
vis_cmap,vis_norm,vis_vmin,vis_vmax = get_cmap('vis',encoded=True)
ir069_cmap,ir069_norm,ir069_vmin,ir069_vmax = get_cmap('ir069',encoded=True)
ir107_cmap,ir107_norm,ir107_vmin,ir107_vmax = get_cmap('ir107',encoded=True)
vil_cmap,vil_norm,vil_vmin,vil_vmax = get_cmap('vil',encoded=True)
lght_cmap,lght_norm,lght_vmin,lght_vmax = get_cmap('lght',encoded=True)
fig,axs = plt.subplots(1,5,figsize=(14,5))
frame_idx = 30
axs[0].imshow(vis[:,:,frame_idx],cmap=vis_cmap,norm=vis_norm,vmin=vis_vmin,vmax=vis_vmax), axs[0].set_title('VIS')
axs[1].imshow(ir069[:,:,frame_idx],cmap=ir069_cmap,norm=ir069_norm,vmin=ir069_vmin,vmax=ir069_vmax), axs[1].set_title('IR 6.9')
axs[2].imshow(ir107[:,:,frame_idx],cmap=ir107_cmap,norm=ir107_norm,vmin=ir107_vmin,vmax=ir107_vmax), axs[2].set_title('IR 10.7')
axs[3].imshow(vil[:,:,frame_idx],cmap=vil_cmap,norm=vil_norm,vmin=vil_vmin,vmax=vil_vmax), axs[3].set_title('VIL')
axs[4].imshow(lght[:,:,frame_idx],cmap=lght_cmap,norm=lght_norm,vmin=lght_vmin,vmax=lght_vmax), axs[4].set_title('Lightning')
(<matplotlib.image.AxesImage at 0x7f6a12427208>, Text(0.5, 1.0, 'Lightning'))
In many applications it's necessary to know the lat/lon positions of pixels in a SEVIR patch. The catalog provides the required information to perform the georeferencing the data.
Currently, all patches in SEVIR use the same map projection, a Lambert Equal Area projection centered at (lat,lon)=(38,-98). This specifics of the projection are stored in the proj
column:
sample_event.proj.values[0]
'+proj=laea +lat_0=38 +lon_0=-98 +units=m +a=6370997.0 +ellps=sphere '
This string uses the Proj4 syntax for defining a map projection.
The class below encodes the formulas for this particular projection. Using this, it's possible to convert between pixel and lat/lon coordinates for each event in SEVIR.
import re
import numpy as np
class LaeaProjection():
"""
Translates between pixel coordinates (X,Y) and Lat lon coordinates for a given SEVIR event
projected in laea.
Parameters
----------
event pd.Series
One row from SEVIR catalog
"""
def __init__(self,event):
self.proj = event.proj
self.lat0 = float(re.compile('\+lat_0=([+-]?\d+)').search(self.proj).groups()[0])
self.lon0 = float(re.compile('\+lon_0=([+-]?\d+)').search(self.proj).groups()[0])
self.R = float(re.compile('\+a=(\d+)').search(self.proj).groups()[0])
self.llcrnlat = event.llcrnrlat
self.llcrnlon = event.llcrnrlon
self.refX, self.refY = self.forward(self.llcrnlon,self.llcrnlat,pixel=False)
self.binX = event.width_m / event.size_x
self.binY = event.height_m / event.size_y
def forward(self,lon,lat,pixel=True):
"""
Maps lat/lon to pixel x,y. For projection coordinates instead of pixel, set pixel=False.
"""
sind = lambda t: np.sin(t*np.pi/180)
cosd = lambda t: np.cos(t*np.pi/180)
k = self.R * np.sqrt(2/(1+sind(self.lat0)*sind(lat)+cosd(self.lat0)*cosd(lat)*cosd(lon-self.lon0)))
x = k*cosd(lat)*sind(lon-self.lon0)
y = k*(cosd(self.lat0)*sind(lat) - sind(self.lat0)*cosd(lat)*cosd(lon-self.lon0))
if pixel:
x = (x-self.refX) / self.binX
y = (y-self.refY) / self.binY
return x,y
def inverse(self,x,y,pixel=True):
"""
Maps pixel coordinates to (lon,lat) position. If passing projection corrdinates, set pixel=False.
"""
if pixel:
x = x*self.binX + self.refX
y = y*self.binY + self.refY
x/=self.R
y/=self.R
sind = lambda t: np.sin(t*np.pi/180)
cosd = lambda t: np.cos(t*np.pi/180)
rho = np.sqrt(x*x+y*y)
c = 2*np.arcsin(0.5*rho)
sinc = np.sin(c)
cosc = np.cos(c)
lat = 180/np.pi*np.arcsin(cosc*sind(self.lat0)+y*sinc*cosd(self.lat0)/rho)
lon = self.lon0+180/np.pi*np.arctan(x*sinc/(rho*cosd(self.lat0)*cosc - y*sind(self.lat0)*sinc))
return lon,lat
Using this class, we can get a grid of latitudes and longitudes for the grids in Examples 2 & Example 3. Note that when we plot the images in this cell, we set origin='lower'
so that the positive y dimension corresponds with increasing latitudes (i.e. northward)
proj = LaeaProjection( sample_event[sample_event.img_type=='vil'].squeeze() )
X,Y = np.meshgrid(np.arange(vil.shape[0]),np.arange(vil.shape[0]))
lons,lats = proj.inverse(X,Y)
# Plot with origin='lower' so up corresponds to north.
fig,ax=plt.subplots(1,3,figsize=(15,5))
ax[0].imshow(vil[:,:,20],origin='lower')
la = ax[1].imshow(lats,origin='lower')
fig.colorbar(la,ax=ax[1])
ax[1].set_title('Pixel Latitudes')
lo = ax[2].imshow(lons,origin='lower')
fig.colorbar(lo,ax=ax[2])
ax[2].set_title('Pixel Longitudes')
Text(0.5, 1.0, 'Pixel Longitudes')
This particular domain overlaps with north eastern Minnesota. We can add markers for particular locations by converting from lat/lon coordinates into the image pixel coordinates:
lat,lon = 46.7867, -92.1005 # Duluth, MN
x,y=proj.forward(lon,lat)
print('x=%f,y=%f' % (x,y))
# Plot with origin='lower' so up corresponds to north.
fig,ax=plt.subplots(1,1,figsize=(5,5))
ax.imshow(vil[:,:,20],origin='lower')
ax.plot(x,y,linestyle='none', marker="o", markersize=16, alpha=0.6, c="red")
ax.text(x-30,y-30,'Duluth, MN',color='r')
x=200.021527,y=79.057562
Text(170.02152736199426, 49.057561772525304, 'Duluth, MN')
There are a number of python modules that compute these conversions for you, and allow you to visualize the georeferenced result with features like coast lines, state borders and other nice map features. pyproj
is a popular module for computing geospatial transformations. Basemap
and Cartopy
are nice plotting libraries. The cell below shows how perform the georeferencing done in Example 4, but instead using the Basemap
library:
# Note: Requires basemap module to run
import warnings
warnings.filterwarnings('ignore')
from mpl_toolkits.basemap import Basemap # conda/pip install basemap basemap-data-hires
# Create Basemap Object
s=sample_event[sample_event.img_type=='vil'].squeeze()
fig,ax=plt.subplots(1,1,figsize=(7,7))
m = Basemap(llcrnrlat=s.llcrnrlat, llcrnrlon=s.llcrnrlon,
urcrnrlat=s.urcrnrlat,urcrnrlon=s.urcrnrlon,
width=s.width_m, height=s.height_m,
lat_0=38, lon_0=-98,
projection='laea',
resolution='i',
ax=ax)
m.drawstates(color='w')
m.drawcoastlines(color='w')
m.drawcountries(color='w')
m.imshow(vil[:,:,24])
lat,lon = 46.7867, -92.1005 # Duluth, MN
x,y=m(lon,lat) # will be in projection coordinates
m.plot(x,y,linestyle='none', marker="o", markersize=16, alpha=0.6, c="red")
ax.text(x-30e3,y-30e3,'Duluth, MN',color='r')
Text(170021.5273619945, 49057.56177252438, 'Duluth, MN')