This notebook provides a Python pipeline of detecting wildfire from GOES-16 imagery using a robust principal component analysis (RPCA) based algorithm. The detail of this algorithm is described in Xu, Kong and Asgharzadeh (2021, link) which was presented in the 2021 IEEE International Symposium on Geoscience and Remote Sensing (IGARSS).
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd
The Geostationary Operational Environmental Satellites (GOES)-16 [1], launched in 2017, is one of the newest weather satellites op-erated by NASA and the National Oceanic and AtmosphericAdministration (NOAA) to provide high frequency Earth imagery and atmospheric measurements. It scans the continen-tal U.S. (CONUS) every 5 minutes through 16 spectral bands covering visible and infrared wavelengths. The 2 km spatialresolution and high temporal resolution of GOES-16 imagery makes it suitable for detecting wildfires online.
Real-time feed and historical archive of GOES-16 Imagery are publicly accessible through Amazon S3. The goes2go
package provides function to download and visualize GOES imagery.
from goes2go.data import goes_timerange, goes_latest
import cartopy.crs as ccrs
import xarray as xr
from datetime import datetime
As an example, we can download the latest image and load it into an xarray
.
dir = 'GOES/'
G_ABI = goes_latest(satellite = 'G16',
domain = 'C',
product = 'ABI',
return_as = 'xarray',
save_dir = dir)
_______________________________ | Satellite: noaa-goes16 | | Product: ABI-L2-MCMIPC | | Domain: C | 📦 Finished downloading [1] files to [GOES\noaa-goes16\ABI-L2-MCMIPC]. 📚 Finished reading [1] files into xarray.Dataset.
scan_start = datetime.strptime(str(G_ABI.time_coverage_start.data), '%Y-%m-%dT%H:%M:%S.%fZ')
The raw data are hyperspectral images across 16 bands. We can project them to a natural color RGB for visualization. We can also overlay the map with various features provided by cartopy
. Here we specify coastlines and state boundaries.
For a full list of features, see http://www.naturalearthdata.com/downloads/
fig = plt.figure(figsize=(15, 12))
ax = plt.subplot(projection=G_ABI.rgb.crs)
ax.imshow(G_ABI.rgb.NaturalColor(), **G_ABI.rgb.imshow_kwargs)
ax.coastlines()
ax.add_feature(ccrs.cartopy.feature.STATES, linewidth=0.25)
plt.title('GOES-16 True Color', loc='left', fontweight='bold', fontsize=20)
plt.title('{}'.format(scan_start.strftime('%d %B %Y %H:%M UTC ')), loc='right', fontsize=15)
plt.show()
The coordinates variables of GOES images are x an y which represent horizontal and vertical scan angles in radians.
Source: link
They can be used to reproject the images to Lat/Lon coordinates using the code below.
#Source: https://github.com/blaylockbk/goes2go/blob/master/latlon_grids/create_standard_GOES_latlon_grids.ipynb
p = G_ABI['goes_imager_projection']
# The projection x and y coordinates equals the scanning angle (in radians)
# multiplied by the satellite height See details here:
# https://proj4.org/operations/projections/geos.html?highlight=geostationary
x = G_ABI['x'][:] * p.perspective_point_height
y = G_ABI['y'][:] * p.perspective_point_height
######################################################################
# The geostationary projection is the easiest way to plot the image on a
# map. Essentially, we are stretching the image across a map with the same
# projection and dimensions as the data.
try:
print('first try...', end='')
globe = ccrs.Globe(semimajor_axis=p.semi_major_axis,
semiminor_axis=p.semi_minor_axis,
inverse_flattening=p.inverse_flattening)
geos = ccrs.Geostationary(central_longitude=p.longitude_of_projection_origin,
satellite_height=p.perspective_point_height,
sweep_axis=p.sweep_angle_axis,
globe=globe)
except:
print('second try')
globe = ccrs.Globe(semimajor_axis=p.semi_major_axis[0],
semiminor_axis=p.semi_minor_axis[0],
inverse_flattening=p.inverse_flattening[0])
geos = ccrs.Geostationary(central_longitude=p.longitude_of_projection_origin[0],
satellite_height=p.perspective_point_height[0],
sweep_axis=p.sweep_angle_axis,
globe=globe)
first try...
X, Y = np.meshgrid(x,y)
a = ccrs.PlateCarree().transform_points(geos, X, Y)
lons, lats = a[:,:,0], a[:,:,1]
lats[np.isinf(lats)] = np.nan
lons[np.isinf(lons)] = np.nan
Now, Lat/Lon coordinates of (x,y) location can be obtained by
lat,lon = lats[x,y],lons[x,y]
We can also store the projected coordinates for future use.
gll = xr.Dataset(dict(latitude=(('y', 'x'), lats),
longitude=(('y', 'x'), lons)))
gll.latitude.attrs = dict(standard_name='latitude', long_name='latitude', units='degrees_north')
gll.longitude.attrs = dict(standard_name='longitude', long_name='longitude', units='degrees_east')
gll.coords['latitude'] = gll.latitude
gll.coords['longitude'] = gll.longitude
gll = gll.where(~np.isinf(gll.latitude))
gll = gll.where(~np.isinf(gll.longitude))
ds.attrs['name'] = f'lat/lon grid for GOES-16 CONUS'
ds = ds.where(~np.isinf(ds.latitude))
ds = ds.where(~np.isinf(ds.longitude))
name = f'./latlon_GOES-16_CONUS.nc'
ds.to_netcdf(name)
ds
The Kincade fire was a wildfire that burned in Sonoma County, California in the United States. The fire started northeast of Geyserville in The Geysers on 9:24 p.m. on October 23, 2019 and subsequently burned 77,758 acres until the fire was fully contained on November 6, 2019. (source: wiki)
Fire perimeter are GIS data that map the approximated location of fire. They serve as groundtruth for validating detection results. The National Interagency Fire Center provides fire perimeters for most wildfires that took place in 2019.
import shapefile as shp
shpFile = r"Historic_GeoMAC_Perimeters_2019\US_HIST_FIRE_PERIM_2019_dd83"
sf = shp.Reader(shpFile)
The variable sf
is a list of shapeRecords, each shapeRecord stores the points (vertices) that form the fire perimeter.
# This index was found manually
kincade_perimeter = sf.shapeRecords()[702].shape.points
lon_list,lat_list = [],[]
for lon,lat in kincade_perimeter:
lon_list.append(lon)
lat_list.append(lat)
plt.plot(lon_list,lat_list,'.');
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.title("Perimeter of Kincade Fire")
plt.show()
We can convert the fire perimeter to a Polygon which is a spatial geometry. We can also visualize the Polygon on a map using the folium
package.
import geopandas as gpd
from shapely.geometry.polygon import Polygon
from shapely.geometry import Point
import folium
from folium import plugins
# Function to construct polygon and folium map given perimeter
def firePolygon(fire_name,lon_list,lat_list):
polygon = Polygon(zip(lon_list,lat_list))
mean_lon, mean_lat = np.mean(lon_list), np.mean(lat_list)
polygon_df = gpd.GeoDataFrame(index=[0],geometry=[polygon])
polygon_df = polygon_df.set_crs(epsg=4326)
m = folium.Map([mean_lat,mean_lon], zoom_start=10, tiles='Stamen Terrain')
for _, r in polygon_df.iterrows():
# Without simplifying the representation of each borough,
# the map might not be displayed
sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=0.001)
geo_j = sim_geo.to_json()
geo_j = folium.GeoJson(data=geo_j,
style_function=lambda x: {'stroke':False,'fillColor': 'red'})
folium.Popup(fire_name).add_to(geo_j)
geo_j.add_to(m)
return polygon, polygon_df, m
kincade_polygon, kincade_gdf, kincade_map = firePolygon('Kincade Fire',lon_list,lat_list)
kincade_gdf.plot(figsize=(3,3))
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.title("Polygon of Kincade Fire")
plt.show()
We can check if a coordinate, represented by a Point object, falls within the Polygon.
kincade_gdf.plot(figsize=(3,3))
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.title("Polygon of Kincade Fire")
plt.plot(-122.8, 38.6,'r.')
plt.show()
point = Point(-122.8, 38.6)
if kincade_polygon.contains(point):
print("The point with longitude %s and latitude %s is within the Polygon."%(point.x,point.y))
The point with longitude -122.8 and latitude 38.6 is within the Polygon.
kincade_map
To detect Kincade Fire, we use GOES-16 ABI Radiance images from October 28.
start = '2019-10-28 0:00'
end = '2019-10-28 0:05'
g_rad = goes_timerange(start, end,
satellite = 'goes16',
product = 'ABI-L1b-Rad',
domain = 'C',
return_as = 'filelist',
save_dir=dir
)
_______________________________ | Satellite: noaa-goes16 | | Product: ABI-L1b-RadC | | Domain: C | 📦 Finished downloading [16] files to [GOES\noaa-goes16\ABI-L1b-RadC].
g_rad
file | start | end | creation | |
---|---|---|---|---|
0 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:18.200 |
1 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:16.200 |
2 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:17.500 |
3 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:16.700 |
4 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:22.600 |
5 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:12.400 | 2019-10-28 00:04:26.100 |
6 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:13.000 | 2019-10-28 00:04:22.000 |
7 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:20.600 |
8 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:12.400 | 2019-10-28 00:04:21.300 |
9 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:13.000 | 2019-10-28 00:04:19.700 |
10 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:23.600 |
11 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:12.400 | 2019-10-28 00:04:18.900 |
12 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:13.000 | 2019-10-28 00:04:24.200 |
13 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:11.800 | 2019-10-28 00:04:25.500 |
14 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:12.400 | 2019-10-28 00:04:26.500 |
15 | noaa-goes16/ABI-L1b-RadC/2019/301/00/OR_ABI-L1... | 2019-10-28 00:01:34.500 | 2019-10-28 00:04:13.000 | 2019-10-28 00:04:24.800 |
Our method only requires 3.9 μm (IR shortwave window) and 12.3 μm ("Dirty" longwave window) band.
g_sw = xr.open_dataset(dir+g_rad['file'][6])
g_lw = xr.open_dataset(dir+g_rad['file'][14])
rad_sw, rad_lw = g_sw['Rad'], g_lw['Rad']
Each pixel in GOES-16 image represents a 2 km × 2 km region. For our analysis, we crop a 100 km × 100 km region in Northern California.
xstart, ystart = 450, 100
height, width = 50, 50
xend, yend = xstart+height, ystart+width
rad_sw = rad_sw[xstart:xend,ystart:yend].data
rad_lw = rad_lw[xstart:xend,ystart:yend].data
We convert radiance data to brightness temperature (BT) data using the function BT=fk2bc2[log(fk1Lλ+1)−bc1] where Lλ is the radiance value and fk1,fk2,bc1,bc2 are wavelength dependent coefficients. Table of coefficients for different bands can be found in the GOES-R document.
def r2bt(rad,fk1,fk2,bc1,bc2):
bt = fk2 / (bc2*(np.log(fk1/rad+1)-bc1))
return bt
bt_sw = r2bt(rad_sw,2.00774e+05,3.68909e+03,0.50777,0.99929)
bt_lw = r2bt(rad_lw,6.40146e+03,1.16980e+03,0.27049,0.99894)
Our detection statistic is the brightness temperature difference (BTD).
btd = bt_sw - bt_lw
scan_start = datetime.strptime(g_sw.time_coverage_start, '%Y-%m-%dT%H:%M:%S.%fZ')
fig,ax = plt.subplots(1,figsize=(3,3),dpi=100)
plt.imshow(btd)
plt.axis("off")
plt.title('BTD', loc='left', fontweight='bold', fontsize=10)
plt.title('{}'.format(scan_start.strftime('%d %B %Y %H:%M UTC ')), loc='right', fontsize=8)
plt.show()
The BTD image can be seen as superposition of a spatially correlated background matrix and a sparse outlier matrix, where the outliers are potential fire pixels.
Robust Principal Component Analysis (RPCA) [2] decomposes an observation matrix into low-rank and sparse components through a two-objective optimization problem, minL,S ‖L‖∗+λ‖S‖1 s.t. Y=L+S where ‖L‖∗ is the sum of eigenvalues ∑iσi(L), and ‖S‖1 is the ℓ1-norm. The penalty coefficient λ controls the sparsity of S. It has been proved that under surprisingly broad conditions, the choice of penalty coefficient λ=1√max(m,n) can exactly recover the low-rank and sparse components [2].
We use a python implementation of RPCA (source: https://github.com/dganguli/robust-pca).
from r_pca import *
rpca = R_pca(btd,lmbda_coef = 1)
L_btd, S_btd = rpca.fit(max_iter=10000, iter_print=10000)
iteration: 1, error: 2.4605280979669835 iteration: 702, error: 3.491573954010701e-05
btd_min, btd_max = np.min(btd), np.max(btd)
fig,ax = plt.subplots(1,3,figsize = (9,3), dpi=100)
for i, image in enumerate(zip(['Original','Low-rank','Sparse'],[btd,L_btd,S_btd])):
name, mat = image
ax[i].imshow(mat, vmin=btd_min, vmax=btd_max)
ax[i].title.set_text(name)
ax[i].axis('off')
plt.show()
The sparse matrix may contain cloud pixels which are false positives. Water vapor absorbs atmospheric energy at 12.3 μm, therefore clouds are significantly cooler than the background in the 12.3 μm band. We propose to identify a pixel in S as a cloud pixel if its radiance at the 12.3 μm band, R12.3, is significantly low.
The T-point algorithm [3] is an automatic image thresholding method based on the pixel density function. It assumes that the pixel density can be approximated by a unimodal histogram with a heavy upper tail. The histogram can be decomposed into three parts: a steep rising slope, a steep descending slope, and a slow descending slope. It then fits a piece-wise linear regression to the steep descending and the low descending slope. The optimal cutoff value is set equivalent to the knot that minimizes the fitting error, such as the ℓd-norm, d∈{1,2}, of the regression. Let r denote a vector that stores the sorted negative pixel R12.3 of an m by n image in ascending order. If cloud pixels contribute to the lower tail of the R12.3 density, then they will contribute to the higher tail of the −R12.3 density. Let r(i), i∈[1,mn] be the −R12.3 of the i-th entry. If we assume the peak of −R12.3 histogram has been identified at index p, then the optimal cutoff index t∈[p+1,mn−1] is found by solving the optimization problem t=arg min{t∑i=p|r(i)−ˆr(i)|d+mn∑i=t|r(i)−ˆr(i)|d} where ˆr are the estimated −R12.3 by the fitted regression lines. Following Equation (1), we will identify a pixel as wildfire only if its R12.3 is above −r(t).
def Tshape_threshold(image_data):
fig,ax = plt.subplots(1,figsize=(4,2),dpi=200)
#Create pixel histogram
hist_norm=plt.hist(image_data.ravel(), bins='fd', density=True);
hist_values=hist_norm[0].tolist()
hist_axis = hist_norm[1].tolist()
#Get index M corrensponding to peak value vM
vM=max(hist_values)
M=hist_values.index(vM)
#Get index L corrensponding to the last non-empty bin
ne_list = [i for i, v in enumerate(hist_values) if v != 0]
L=ne_list[-1]
#print(M,L)
T=M+1
MSE=999
for k in range(M+1,L-1):
#define the two lines Lsteep and Lslow
Lsteep_m=(vM-hist_values[k])/(hist_axis[M]-hist_axis[k])
Lsteep_q=vM - Lsteep_m * hist_axis[M]
Lslow_m=(hist_values[k] - hist_values[L])/(hist_axis[k]-hist_axis[L])
Lslow_q=hist_values[k] - Lslow_m * hist_axis[k]
#Compute sum of MSE
MSE_k=0
for i in range(M,k):
line = hist_axis[i]*Lsteep_m+Lsteep_q
MSE_k += (line - hist_values[i])*(line - hist_values[i])
for i in range(k,L):
line = hist_axis[i]*Lslow_m+Lslow_q
MSE_k += (line - hist_values[i])*(line - hist_values[i])
#print(k, MSE_k, MSE, T)
MSE_k=MSE_k/(hist_axis[L]-hist_axis[M])
#Update threshold
if MSE_k < MSE:
MSE=MSE_k
T=k
#plot
x1, y1 = [hist_axis[M], hist_axis[T]], [vM, hist_values[T]]
x2, y2 = [hist_axis[T], hist_axis[L]], [hist_values[T], hist_values[L]]
plt.plot(x1, y1, x2, y2, marker = 'o')
plt.vlines(x = x1[1], ymin = 0, ymax = y1[1], linestyles='--', color='k')
plt.xlabel("$-R_{12.3}$",fontsize=15)
plt.show()
#plt.arrow(x1[1], y1[1]/3, 3, 0, head_width = 0.002,head_length = 1.5, color='k', alpha = 0.8)
return hist_axis[T]
T = -Tshape_threshold(-rad_lw)
fig,ax = plt.subplots(1,2,figsize = (6,3), dpi=100)
ax[0].imshow(S_btd, vmin=btd_min, vmax=btd_max)
ax[0].title.set_text("Before Cloud Masking")
ax[0].axis('off')
S_btd[rad_lw < T] = 0
ax[1].imshow(S_btd, vmin=btd_min, vmax=btd_max)
ax[1].title.set_text("After Cloud Masking")
ax[1].axis('off')
plt.show()
The sparse matrix can also contain noise due to sun glints and RPCA algorithm itself. We apply a low-pass filter and rule-based threshold to attenuate high-frequency noise.
from scipy import ndimage
S_btd_g = ndimage.gaussian_filter(S_btd, sigma=0.5)
fig,ax = plt.subplots(1,figsize = (3,3), dpi=100)
plt.imshow(S_btd_g, vmin=btd_min, vmax=btd_max)
plt.title("Blurred Image")
plt.axis('off')
plt.show()
A threshold value between 5 and 8 generally performs well.
TT = 6
fig,ax = plt.subplots(1,2,figsize = (6,3), dpi=100)
ax[0].imshow(S_btd, vmin=btd_min, vmax=btd_max)
ax[0].title.set_text("Before Noise Reduction")
ax[0].axis('off')
S_btd[S_btd_g < TT] = 0
ax[1].imshow(S_btd, vmin=btd_min, vmax=btd_max)
ax[1].title.set_text("After Noise Reduction")
ax[1].axis('off')
plt.show()
We can overlay our detection on the map we created before.
fire_points = []
for i in range(height):
for j in range(width):
if S_btd[i,j]>0:
x, y = xstart+i, ystart+j
lat, lon = lats[x,y], lons[x,y]
fire_points.append([lat,lon,S_btd[i,j]])
kincade_map.add_child(plugins.HeatMap(fire_points,min_opacity=0.5))
kincade_map.fit_bounds(kincade_map.get_bounds())
kincade_map
Our method detected two wildfires, the left one is the Kincade Fire as confirmed by the fire perimeter, the right one is actually the Grizzly Island Fire which started on Oct 28th.
GOES-16 has a built-in fire detection algorithm [4]. Its results can be extracted similar like other products.
g_fdc = goes_timerange(start, end,
satellite = 'goes16',
product = 'ABI-L2-FDCC',
domain = 'C',
return_as = 'xarray',
save_dir=dir
)
_______________________________ | Satellite: noaa-goes16 | | Product: ABI-L2-FDCC | | Domain: C | 📦 Finished downloading [1] files to [GOES\noaa-goes16\ABI-L2-FDCC]. 📚 Finished reading [1] files into xarray.Dataset.
g_fdc_mask = g_fdc['Mask']
g_fdc_mask_ncal = g_fdc_mask[xstart:xend,ystart:yend].data
fig,ax = plt.subplots(1,figsize=(3,3),dpi=100)
plt.imshow(g_fdc_mask_ncal)
plt.title('FDC', loc='left', fontweight='bold', fontsize=10)
plt.title('{}'.format(scan_start.strftime('%d %B %Y %H:%M UTC ')), loc='right', fontsize=8)
plt.axis("off")
plt.show()
The FDC algorithm assigns each pixel a mask code. Among them, 10-15 and 30-35 correspond to active fire (source: link).
fire_codes = set([i for i in range(10,16)]+[i for i in range(30,36)])
fire_points = []
for i in range(height):
for j in range(width):
if g_fdc_mask_ncal[i,j] in fire_codes:
x, y = xstart+i, ystart+j
lat, lon = lats[x,y], lons[x,y]
fire_points.append([lat,lon])
_, _, kincade_map = firePolygon('Kincade Fire',lon_list,lat_list)
kincade_map.add_child(plugins.HeatMap(fire_points,min_opacity=0.5))
kincade_map.fit_bounds(kincade_map.get_bounds())
kincade_map
We see that although FDC successfully identified the two wildfires, it is over-optimistic about the severity of Kincade Fire and yields significantly smaller coverage. In contrast, our method yields a more accurate characterization of the current status of the fire and can help stakeholders better estimate the risk and resources needed to suppress the fire.
[1] GOES-R Calibration Working Group and GOES-R Series Program, (2017): NOAA GOES-R Series Advanced Baseline Imager (ABI) Level 1b Radiances. NOAA National Centers for Environmental Information. doi:10.7289/V5BV7DSR.
[2] Emmanuel J Candès, Xiaodong Li, Yi Ma, and John Wright, "Robust principal component analysis?," Journal of the ACM (JACM), vol. 58, no. 3, pp. 1–37, 2011.
[3] Nicolas Coudray, Jean-Luc Buessler, and Jean-Philippe Urban, "Robust threshold estimation for images with uni-modal histograms," Pattern Recognition Letters, vol. 31,no. 9, pp. 1010–1019, 2010.
[4] Christopher C. Schmidt, Jay Hoffman, Elaine Prins, and Scott Lindstrom (2013): GOES-R Advanced Baseline Imager (ABI) Algorithm Theoretical Basis Document For Fire / Hot Spot Characterization. NOAA NESDIS Center for Satellite Applications and Research.