Good morning! Today we will handle some rasters in Python.
The modules used below are:
gdal
*: bindings to the GDAL library which is part of the osgeo librarynumpy
for array calculations.These two libraries are the base of all raster processing in Python!
There are additional libraries that offer wrapper functions for gdal
and provide additional raster procssesing functionality:
rasterio
, see a NDVI tutorialrios
, a set of Python modules which makes it easy to write raster processing code in PythonMany important processing methods are available through additional libraries:
otb
), e.g. segmentationrsgislib
e.g. segmentationscikit-image
for image interpretationscikit-learn
for machine learningLet's get started and we will go through the following steps:
# import modules
from osgeo import gdal
from osgeo.gdalconst import GA_ReadOnly, GDT_Float32
import numpy as np
# open file and print info about the file
# the "./" refers to the parent directory of my working directory
filename = './data/ospy_data5/aster.img'
dataSource = gdal.Open(filename, GA_ReadOnly)
print("\nInformation about " + filename)
print("Driver: ", dataSource.GetDriver().ShortName,"/", \
dataSource.GetDriver().LongName)
print("Size is ",dataSource.RasterXSize,"x",dataSource.RasterYSize, \
'x',dataSource.RasterCount)
print('\nProjection is: ', dataSource.GetProjection())
print("\nInformation about the location of the image and the pixel size:")
geotransform = dataSource.GetGeoTransform()
if not geotransform is None:
print('Origin = (',geotransform[0], ',',geotransform[3],')')
print('Pixel Size = (',geotransform[1], ',',geotransform[5],')')
Information about ./data/ospy_data5/aster.img Driver: HFA / Erdas Imagine Images (.img) Size is 5665 x 5033 x 3 Projection is: PROJCS["UTM Zone 12, Northern Hemisphere",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],TOWGS84[0,0,0,-0,-0,-0,0],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-111],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["Meter",1],AUTHORITY["EPSG","32612"]] Information about the location of the image and the pixel size: Origin = ( 419976.5 , 4662422.5 ) Pixel Size = ( 15.0 , -15.0 )
# Read data into an array
band2Arr = dataSource.GetRasterBand(2).ReadAsArray(0,0,dataSource.RasterXSize, dataSource.RasterYSize)
band3Arr = dataSource.GetRasterBand(3).ReadAsArray(0,0,dataSource.RasterXSize, dataSource.RasterYSize)
print(type(band2Arr))
# set the data type
band2Arr=band2Arr.astype(np.float32)
band3Arr=band3Arr.astype(np.float32)
# Derive the NDVI
mask = np.greater(band3Arr+band2Arr,0)
# set np.errstate to avoid warning of invalid values (i.e. NaN values) in the divide
with np.errstate(invalid='ignore'):
ndvi = np.choose(mask,(-99,(band3Arr-band2Arr)/(band3Arr+band2Arr)))
print("NDVI min and max values", ndvi.min(), ndvi.max())
# Check the real minimum value
print(ndvi[ndvi>-99].min())
<class 'numpy.ndarray'> NDVI min and max values -99.0 1.0 -1.0
# Write the result to disk
driver = gdal.GetDriverByName('GTiff')
outDataSet=driver.Create('./data/ndvi.tif', dataSource.RasterXSize, dataSource.RasterYSize, 1, GDT_Float32)
outBand = outDataSet.GetRasterBand(1)
outBand.WriteArray(ndvi,0,0)
outBand.SetNoDataValue(-99)
# set the projection and extent information of the dataset
outDataSet.SetProjection(dataSource.GetProjection())
outDataSet.SetGeoTransform(dataSource.GetGeoTransform())
# Finally let's save it... or like in the OGR example flush it
outBand.FlushCache()
outDataSet.FlushCache()
Let's check the resulting file via Bash (all commands in the current document with a ! in front are run via the system shell, which is Bash). Below we call the gdalinfo
command of the GDAL C++ library directly. If you want to learn more about gdalinfo
go to: http://www.gdal.org/gdalinfo.html
!gdalinfo ./data/ndvi.tif
Driver: GTiff/GeoTIFF Files: ./data/ndvi.tif Size is 5665, 5033 Coordinate System is: PROJCS["WGS 84 / UTM zone 12N", GEOGCS["WGS 84", DATUM["WGS_1984", SPHEROID["WGS 84",6378137,298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9122"]], AUTHORITY["EPSG","4326"]], PROJECTION["Transverse_Mercator"], PARAMETER["latitude_of_origin",0], PARAMETER["central_meridian",-111], PARAMETER["scale_factor",0.9996], PARAMETER["false_easting",500000], PARAMETER["false_northing",0], UNIT["metre",1, AUTHORITY["EPSG","9001"]], AXIS["Easting",EAST], AXIS["Northing",NORTH], AUTHORITY["EPSG","32612"]] Origin = (419976.500000000000000,4662422.500000000000000) Pixel Size = (15.000000000000000,-15.000000000000000) Metadata: AREA_OR_POINT=Area Image Structure Metadata: INTERLEAVE=BAND Corner Coordinates: Upper Left ( 419976.500, 4662422.500) (111d58' 4.52"W, 42d 6'35.34"N) Lower Left ( 419976.500, 4586927.500) (111d57'27.92"W, 41d25'47.74"N) Upper Right ( 504951.500, 4662422.500) (110d56'24.38"W, 42d 6'49.98"N) Lower Right ( 504951.500, 4586927.500) (110d56'26.64"W, 41d26' 2.04"N) Center ( 462464.000, 4624675.000) (111d27' 5.89"W, 41d46'22.91"N) Band 1 Block=5665x1 Type=Float32, ColorInterp=Gray NoData Value=-99
Important here is that the gdalinfo
contains information about the projection (via SetProjection
) and also about the corner coordinates (via SetGeoTransform
).
Reprojecting a raster image can be done by
gdalwarp
function of the GDAL C++ library directlyPython
and the gdal
modulegdalwarp
¶The easiest way to reproject a raster file is to use GDAL's gdalwarp tool. As an example, we will reproject the above NDVI image derived earlier into latitude/longitude (WGS84).
-t_srs "EPSG:4326"
is the CRS to reproject to, i.e. lat/long, known by its EPSG code. The CRS to reproject from is already specified in the source data set, see above.data/ndvi.tif
is the input dataset.data/ndvi_ll.ti
f is the output dataset (the output format is here automatically set by the extension i.e. GeoTIFF).Question 1: What is the CRS of the image before it is reprojected?
# via the Shell
!gdalwarp -t_srs "EPSG:4326" ./data/ndvi.tif ./data/ndvi_ll.tif
Creating output file that is 6334P x 4215L. Processing input file ./data/ndvi.tif. Using internal nodata values (e.g. -99) for image ./data/ndvi.tif. Copying nodata values from source ./data/ndvi.tif to destination ./data/ndvi_ll.tif. 0...10...20...30...40...50...60...70...80...90...100 - done.
# Let's check what the result is
!gdalinfo ./data/ndvi_ll.tif
Driver: GTiff/GeoTIFF Files: ./data/ndvi_ll.tif Size is 6334, 4215 Coordinate System is: GEOGCS["WGS 84", DATUM["WGS_1984", SPHEROID["WGS 84",6378137,298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433], AUTHORITY["EPSG","4326"]] Origin = (-111.967923047226961,42.113899650756700) Pixel Size = (0.000162266608994,-0.000162266608994) Metadata: AREA_OR_POINT=Area Image Structure Metadata: INTERLEAVE=BAND Corner Coordinates: Upper Left (-111.9679230, 42.1138997) (111d58' 4.52"W, 42d 6'50.04"N) Lower Left (-111.9679230, 41.4299459) (111d58' 4.52"W, 41d25'47.81"N) Upper Right (-110.9401263, 42.1138997) (110d56'24.45"W, 42d 6'50.04"N) Lower Right (-110.9401263, 41.4299459) (110d56'24.45"W, 41d25'47.81"N) Center (-111.4540247, 41.7719228) (111d27'14.49"W, 41d46'18.92"N) Band 1 Block=6334x1 Type=Float32, ColorInterp=Gray NoData Value=-99
To reproject a raster from within Python is not as straight-forward... Have a look at the following links, and use the internet to find a way!
Let's see the output of the reprojected NDVI image.
# Notebook magic to select the plotting method
# Change to inline to plot within this notebook
#%matplotlib inline
%matplotlib inline
from osgeo import gdal
import matplotlib.pyplot as plt
# Open image
dsll = gdal.Open("./data/ndvi_ll.tif")
# Read raster data
ndvi = dsll.ReadAsArray(0, 0, dsll.RasterXSize, dsll.RasterYSize)
# Now plot the raster data using gist_earth palette
plt.imshow(ndvi, interpolation='nearest', vmin=0, cmap=plt.cm.gist_earth)
plt.show()
dsll = None
R
¶We can quickly check what we have done with R (go now to your Rstudio, or R terminal) and try the following code:
library(raster)
a <- brick('./data/ospy_data5/aster.img')
plotRGB(a, 3, 2, 1, stretch='hist')
b <- raster('./data/ndvi.tif')
extent(b)
projection(b)
hist(b, 1, maxpixels=500, plot=TRUE)
For this scripting challenge I have downloaded a Landsat image with all bands processed to surface reflectance (Level 1T). You can download it from here. Unzip the file and you will see that it contains all the invidual bands.
Now, write a well-structured and documented script where you define *a function to derive the Normalised difference water index (NDWI)* and derive it from the landsat image and *reproject the image to Lat/Long WGS84* (hint: use GDAL from Bash in your notebook).
NDWI = band 3 - band 5 / band 3 + band 5
(more info about NDWI
)
A clean and well structured script is critical here.
*** Create an Ipython Notebook file (.ipynb) with Jupyter Notebook and put it on GitLab!***