Unsupervised classification of imagery using scikit-learn

This example shows how to classify imagery (for example, from Landsat) using scikit-learn. There are many classification methods available, but for this example we will use K-Means because it is simple and fast. For imagery, I grabbed the North Carolina sample raster dataset, and I'm using the red, green and blue bands of the Landsat 7 imagery in that package. I'm using rasterio to read in the data.

In [4]:
import numpy as np
import matplotlib.pyplot as plt

import rasterio
import sklearn.cluster
In [5]:
red_path = r"C:\projects\quick_scripts\gis_se\landsat\ncrast\lsat7_2002_30.tif"
green_path = r"C:\projects\quick_scripts\gis_se\landsat\ncrast\lsat7_2002_20.tif"
blue_path = r"C:\projects\quick_scripts\gis_se\landsat\ncrast\lsat7_2002_10.tif"
In [6]:
with rasterio.open(red_path) as red, rasterio.open(green_path) as green, rasterio.open(blue_path) as blue:
    data = np.array([red.read(1), green.read(1), blue.read(1)])

data.shape  # Three bands stacked band-first, so the array is 3-dimensional: (bands, rows, columns).
Out[6]:
(3L, 475L, 527L)
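
The shape above is band-first, (bands, rows, columns), while matplotlib's imshow expects colour images as (rows, columns, bands). Here is a minimal sketch of that layout change. It uses a tiny made-up array (the name `bands` and its values are assumptions for illustration, not the Landsat data) so it runs without the raster files.

```python
import numpy as np

# Hypothetical stand-in for the three bands: shape (bands, rows, cols).
bands = np.zeros((3, 4, 5), dtype=np.uint8)
bands[0] = 10  # "red"
bands[1] = 20  # "green"
bands[2] = 30  # "blue"

# np.dstack walks the first axis and stacks the 2-D bands along a new
# last axis, giving the (rows, cols, bands) layout that imshow expects.
rgb = np.dstack(bands)

# np.moveaxis expresses the same reordering explicitly.
rgb_alt = np.moveaxis(bands, 0, -1)

print(bands.shape, "->", rgb.shape)
```

Each pixel of `rgb` then holds its three band values together, which is also the natural unit for a per-pixel classifier.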
In [7]:
plt.figure(figsize=(12, 12))
plt.imshow(np.dstack(data))
Out[7]:
<matplotlib.image.AxesImage at 0x155ef898>
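
As a rough sketch of where this is heading, a band-first array can be handed to K-Means by treating every pixel as one sample with one feature per band. The example below runs on synthetic data rather than the Landsat bands; the array `synthetic`, the choice of `n_clusters=2`, and the `random_state` and `n_init` values are all assumptions made for illustration, not settings taken from this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic band-first image: a dark left half and a bright right half.
rng = np.random.default_rng(0)
rows, cols = 20, 30
synthetic = np.empty((3, rows, cols), dtype=np.float64)
synthetic[:, :, :15] = 30 + rng.normal(0, 2, size=(3, rows, 15))
synthetic[:, :, 15:] = 200 + rng.normal(0, 2, size=(3, rows, 15))

# scikit-learn expects (n_samples, n_features): one row per pixel,
# one column per band.
pixels = synthetic.reshape(3, -1).T

# Cluster the pixels; the number of clusters is an arbitrary choice here.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(pixels)

# Fold the flat label vector back into the image grid.
classified = labels.reshape(rows, cols)
print(classified.shape)
```

The cluster numbers themselves carry no meaning (which half gets label 0 is arbitrary), so a classified image like this is usually inspected visually, for example with `plt.imshow(classified)`.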