from glob import glob
import os
from datetime import datetime, timedelta
from random import sample

import boto3
import ipywidgets as widgets
import matplotlib.pyplot as plt
import pandas as pd
import requests
import s3fs
from rasterio.plot import show
from retrying import retry

%matplotlib inline

wstyle = {'description_width': 'initial'}
collectFootprint = None
collectDate = None
os.environ['CURL_CA_BUNDLE'] = '/etc/ssl/certs/ca-certificates.crt'
🚨WARNING🚨: The operations in this notebook interact with Planet Labs via a programmatic interface and therefore count against your Planet Quota! Be sure you know what this means for your individual Planet Labs access.
This notebook describes ordering imagery from Planet Labs that overlaps an area of interest. We'll make use of the `porder` command-line tool.

This step requires an API key from Planet Labs. To obtain one, check out the Planet Education and Research Program.
⚠️Note⚠️: To run the commands in this notebook, `porder` requires a `.planet.json` file in your home directory with the following format:
{
"key" : "<planet API key>"
}
If you have the Planet CLI installed, you can run `planet init` to place this file in the correct location for `porder`.
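As a convenience, this file can also be written directly from Python. This is a minimal sketch; the `write_planet_config` helper is illustrative, not part of `porder`:

```python
import json
import os

def write_planet_config(api_key, path=None):
    """Write the .planet.json credentials file that porder reads."""
    if path is None:
        path = os.path.expanduser("~/.planet.json")
    with open(path, "w") as f:
        json.dump({"key": api_key}, f)
    return path
```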
The primary utility of this notebook is to download Planet Labs imagery to accompany the ASO snow mask for training a fresh ML model. However, you can also use this notebook to download Planet imagery for any area of interest, for model evaluation or dataset production. Simply specify a different GeoJSON file than the default ASO footprint, as outlined below.
If you're just using this notebook to download Planet imagery to run through the model, you can skip the next few steps.
In the space below, provide a directory to search for ASO collect `.tif` files. This is the same place where the Step 1 notebook stored the raw downloaded ASO Collects. We'll use this information to find the GeoJSON footprint of the given collect.
imgDir = widgets.Text(description="ASO Location", style = wstyle)
imgDir
Now select an ASO collect:
if not imgDir.value:
    imgDir.value = "/tmp/work"
collect = widgets.Select(
    options = glob(imgDir.value + "/*.tif")
)
collect
if not collect.value:
    collect.value = os.path.join(imgDir.value, 'ASO_3M_SD_USCOGE_20180524.tif')
We'll check to be sure that this selected image has a corresponding GeoJSON file.
⚠️ If the below step fails, verify in Step 1 that the "Preprocess for Tiling" step completed without errors.
collectFootprint = os.path.splitext(collect.value)[0]+'.geojson'
collectDateStr = os.path.basename(collect.value).split('_')[-1].split(".")[0]
collectDate = datetime.strptime(collectDateStr, "%Y%m%d")
assert os.path.exists(collectFootprint), "corresponding geojson file not found!"
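To make the filename convention concrete, here's the same parsing applied to the example collect name (the path is hypothetical):

```python
import os
from datetime import datetime

example = "/tmp/work/ASO_3M_SD_USCOGE_20180524.tif"  # hypothetical path

# footprint: same basename, .geojson extension
footprint = os.path.splitext(example)[0] + '.geojson'
# date: the last underscore-delimited token, minus the extension
dateStr = os.path.basename(example).split('_')[-1].split('.')[0]
date = datetime.strptime(dateStr, "%Y%m%d")

print(footprint)     # /tmp/work/ASO_3M_SD_USCOGE_20180524.geojson
print(date.date())   # 2018-05-24
```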
If you're using this notebook independently of model training, **here** is where you'll specify the GeoJSON for your area of interest.
Here we'll specify the Area of Interest (AOI) in GeoJSON format that Planet will use to search for imagery on our behalf. If you've already done the above steps, you'll see the below text box is already filled out.
AOIPath = widgets.Text(description="AOI Path", value = collectFootprint, style = wstyle)
AOIPath
Here you can choose some parameters around the temporal search window for your Planet imagery. All you need to do is select a central date and a window size (days), and we'll compute the start and end dates for the search query.
If you're using an ASO collect, the central date will be filled for you.
centralDate = widgets.DatePicker(
    description='Central Date',
    disabled=False,
    value = collectDate
)
dateWindow = widgets.IntText(description = 'Window Size', value = 1)
widgets.VBox([centralDate, dateWindow], style = wstyle)
searchBuffer = timedelta(days = dateWindow.value)
outputFormat = "%Y-%m-%d"
searchStartDate = datetime.strftime(centralDate.value - searchBuffer, outputFormat)
searchEndDate = datetime.strftime(centralDate.value + searchBuffer, outputFormat)
print(searchStartDate, centralDate.value, searchEndDate, sep='\n')
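As a worked example, a central date of 2018-05-24 with a 1-day window yields the following search bounds:

```python
from datetime import datetime, timedelta

central = datetime(2018, 5, 24)
buffer = timedelta(days=1)
fmt = "%Y-%m-%d"

start = datetime.strftime(central - buffer, fmt)
end = datetime.strftime(central + buffer, fmt)
print(start, end)  # 2018-05-23 2018-05-25
```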
We can now use `porder`'s `idlist` function to search for Planet asset IDs matching our specifications: `PSScene4Band` items with the `analytic_sr` asset that overlap our AOI within the search dates.
%%bash -s "$AOIPath.value" "$searchStartDate" "$searchEndDate" "$imgDir.value"
porder idlist --input $1 --start $2 --end $3 --asset analytic_sr --item PSScene4Band --outfile $4/planet_ids.csv
There's now a file in the directory specified above called planet_ids.csv
containing IDs of images found to satisfy these constraints.
idListFile = os.path.join(imgDir.value, 'planet_ids.csv')
pd.read_csv(idListFile, header=None)
Now that these image IDs have been identified, we can submit them to the Planet Orders API to be clipped to our specified area of interest (the ASO collect footprint) and delivered. For this analysis we use the direct AWS S3 delivery option, though it is optional: without it, you'll be given a link to check when the clipping operation is finished, at which point you can download the new assets. The `email` option of `porder` will also send an e-mail when the order is complete.
🚨FINAL WARNING🚨: any orders submitted this way will count against your quota. Check your Planet quotas via `porder quota`.
In order to use the AWS functionality, you must create a credentials file in the following way:
amazon_s3:
bucket: "<bucket name, e.g. planet-orders>"
aws_region: "<region name, e.g. us-west-2>"
aws_access_key_id: "<AWS Access key>"
aws_secret_access_key: "<AWS Secret>"
path_prefix: "<bucket prefix>"
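If you prefer to generate this file from Python, here's a minimal sketch; the `write_aws_creds` helper is illustrative, and all values passed in are placeholders to substitute with your own:

```python
import textwrap

def write_aws_creds(path, bucket, region, key_id, secret, prefix=""):
    """Write the YAML credentials file expected by porder's --aws option."""
    body = textwrap.dedent("""\
        amazon_s3:
          bucket: "{bucket}"
          aws_region: "{region}"
          aws_access_key_id: "{key_id}"
          aws_secret_access_key: "{secret}"
          path_prefix: "{prefix}"
        """).format(bucket=bucket, region=region, key_id=key_id,
                    secret=secret, prefix=prefix)
    with open(path, "w") as f:
        f.write(body)
    return path
```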
AWSCredFile = "/home/ubuntu/aws-cred.yml"
To order the imagery, we use the `porder order` functionality. This requires an order name, which we generate from the ASO collect filename and today's date.
orderName = "ORDER-{}-{}".format(os.path.basename(collect.value).split('.')[0], datetime.strftime(datetime.now(), "%Y%m%d-%H%M"))
orderName
%%bash -s "$orderName" "$AOIPath.value" "$idListFile" "$AWSCredFile"
porder order --name $1 --item PSScene4Band --bundle analytic_sr --boundary $2 --idlist $3 --aws $4 --op clip email aws
The URL printed above is the order reference, and can be queried for status (if the `email` operation was chosen, an e-mail will also be sent when the operation finishes).
To query the URL endpoint, provide your Planet username and password below, and run the following cells. (These credentials exist only in your computer's memory and are discarded when Python stops running.)
username = widgets.Text(description = "username")
password = widgets.Password(description = "password")
orderUrl = widgets.Text(description = "Order URL")
widgets.VBox([username, password, orderUrl], box_style = 'info')
# retrying's delays are in milliseconds: poll every 10 s for up to 20 hours
@retry(stop_max_delay = 20 * 60 * 60 * 1000, wait_fixed = 10000)
def checkOrderStatus():
    with requests.Session() as session:
        session.auth = (username.value, password.value)
        r = session.get(orderUrl.value).json()
        if not r['_links'].get('results'):
            # raising here triggers a retry until the order has delivered results
            raise ValueError("order not yet fulfilled")
        return r
When the above function returns `results`, we've got images.
dataFiles = checkOrderStatus()['_links']['results']
dataFiles
From these we extract the image files, which we'll turn into tiles.
At this point, since we specified the AWS delivery option, the raw Planet Assets are stored in S3. Next, we'll tile these images to that same bucket in S3, which we'll have to specify again below.
For each image requested from the Planet Orders API, three objects are returned:

- image metadata (`*.json`)
- the usable data mask (`*_DN_udm_clip.tif`)
- the clipped analytic image (`*_SR_clip.tif`)

We'll get the paths for the clipped images (`*_SR_clip.tif`) and send them to our `preprocess` module for tiling.
images = [_i['name'] for _i in dataFiles if _i['name'].endswith('_SR_clip.tif')]
images
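To illustrate the filter, here it is applied to some hypothetical delivery object names (these names are made up for illustration, not real Orders API output):

```python
# hypothetical object names as returned by the Orders API
names = [
    "20180524_181156_1025_3B_AnalyticMS_SR_clip.tif",
    "20180524_181156_1025_3B_AnalyticMS_DN_udm_clip.tif",
    "20180524_181156_1025_metadata.json",
]

# keep only the clipped surface-reflectance images
sr_images = [n for n in names if n.endswith('_SR_clip.tif')]
print(sr_images)
```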
Here, specify the AWS S3 bucket you'd like the image tiles stored in (usually, it's the same bucket the raw images are currently stored in).
s3Bucket = widgets.Text(description="S3 Bucket")
s3Bucket
os.chdir("/home/ubuntu/planet-snowcover/")

for image in images:
    imageLocal = os.path.basename(image)
    ! aws s3 cp --profile esip {'s3://planet-snowcover-imagery/'+image} /tmp/{imageLocal}
    ! /home/ubuntu/anaconda3/envs/pytorch_p36/bin/python -m preprocess tile --zoom 15 --indexes 1 2 3 4 --quant 10000 --aws_profile esip --skip-blanks s3://{s3Bucket.value} /tmp/{imageLocal}
    ! rm /tmp/{imageLocal}
Let's see how that panned out.
## Check some tiles
import rasterio as rio
fs = s3fs.S3FileSystem(session= boto3.Session(profile_name = 'esip'))
for image in images:
    imageId = os.path.basename(image).split('.')[0]
    tiles = fs.walk('planet-snowcover-imagery/{}'.format(imageId))
    print(len(tiles))
    sampTiles = sample(tiles, 10)
    fig = plt.figure(figsize = (10, 10), dpi = 100)
    grid = plt.GridSpec(5, 3)
    plt.suptitle(imageId)
    with rio.Env(profile_name='esip'):
        for i, t in enumerate(sampTiles):
            tile = '/'.join(t.split('/')[-3:])
            ax = plt.subplot(grid[i])
            ax.set_title(tile)
            ax.axis('equal')
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)
            show(rio.open('s3://' + t).read(4), cmap='binary_r', ax = ax)
We've started with an ASO collect, identified relevant Planet assets, ordered those clipped assets from Planet, and processed them into tiles that live on Amazon S3. Below are the S3 folders that contain these tiled assets for reference during model training. We'll put them in a file.
assets =["s3://planet-snowcover-imagery/{}/".format(os.path.basename(image).split('.')[0]) for image in images]
assets
pd.Series(assets).to_csv('/tmp/work/assets.csv', index = False)
!cat /tmp/work/assets.csv
As a reminder, these images are overlapping with the following ASO collect:
collect.value
Running the cell below will determine the corresponding ASO tile locations, if any.
collectTiles = 'planet-snowcover-snow/{}_binary'.format(os.path.splitext(os.path.basename(collect.value))[0])
if fs.exists(collectTiles):
    print("Tiles located at \"{}\"".format(collectTiles))
else:
    print("tiles not present on S3")
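For reference, here's the tile-location naming convention applied to the example collect (the path is hypothetical):

```python
import os

c = "/tmp/work/ASO_3M_SD_USCOGE_20180524.tif"  # hypothetical collect path
tiles = 'planet-snowcover-snow/{}_binary'.format(
    os.path.splitext(os.path.basename(c))[0])
print(tiles)  # planet-snowcover-snow/ASO_3M_SD_USCOGE_20180524_binary
```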
This ASO tile location, paired with the above imagery tile locations, is sufficient information to train the snow identification model.
by Tony Cannistra.
© University of Washington, 2019.
Support from the National Science Foundation, NASA THP, Earth Science Information partners, and the UW eScience Institute.