Water Survey of Canada Usage

This notebook demonstrates methods for accessing and processing data extracted from the HYDAT database.

Initialization

In [1]:
# Display graphics inline with the notebook
%matplotlib inline

# Standard Python modules
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Modules to display images and data tables
from IPython.display import display, Image

# Module to make HTTP requests (used here to call the Google Maps web API)
import requests

# Module to manipulate dates for historical comparisons
import datetime

import seaborn as sns
sns.set_context('talk')

Read Cached Data Files

In [2]:
WSC_STATIONS = pd.read_pickle('../data/WSC_STATIONS')
WSC_LEVELS = pd.read_pickle('../data/WSC_LEVELS')
WSC_FLOWS = pd.read_pickle('../data/WSC_FLOWS')

display(WSC_STATIONS['STATION_NAME'])
STATION_NUMBER
05PA001                     KETTLE RIVER ABOVE KETTLE FALLS
05PA003                 NAMAKAN LAKE ABOVE KETTLE FALLS DAM
05PA005                       NORTHERN LIGHT LAKE AT OUTLET
05PA006             NAMAKAN RIVER AT OUTLET OF LAC LA CROIX
05PA007                     CROOKED LAKE NEAR CURTAIN FALLS
05PA010                           FRENCH LAKE NEAR ATIKOKAN
05PA011                     LAC LA CROIX AT CAMPBELL'S CAMP
05PA012                          BASSWOOD RIVER NEAR WINTON
05PA013                     NAMAKAN LAKE AT SQUIRREL ISLAND
05PB001                           SEINE RIVER NEAR LA SEINE
05PB002                 LITTLE TURTLE LAKE NEAR MINE CENTRE
05PB003                 MANITOU RIVER ABOVE DEVIL'S CASCADE
05PB004                 FOOTPRINT RIVER AT RAINY LAKE FALLS
05PB007                        RAINY LAKE NEAR FORT FRANCES
05PB009    SEINE RIVER AT STURGEON FALLS GENERATING STATION
05PB012                 LAC DES MILLE LACS ABOVE OUTLET DAM
05PB014                       TURTLE RIVER NEAR MINE CENTRE
05PB015                    PIPESTONE RIVER ABOVE RAINY LAKE
05PB018                          ATIKOKAN RIVER AT ATIKOKAN
05PB019    NORTHEAST TRIBUTARY TO DASHWA LAKE NEAR ATIKOKAN
05PB020      EASTERN TRIBUTARY TO DASHWA LAKE NEAR ATIKOKAN
05PB021      EYE RIVER NEAR HARDTACK LAKE NORTH OF ATIKOKAN
05PB022       EYE RIVER NEAR COULSON LAKE NORTH OF ATIKOKAN
05PB023                         RAINY LAKE AT NORTHWEST BAY
05PB024                           RAINY LAKE NEAR BEAR PASS
05PB025                            RAINY LAKE AT STOKES BAY
05PC009                        LA VALLEE RIVER AT LA VALLEE
05PC010                         STURGEON RIVER NEAR BARWICK
05PC016                         LA VALLEE RIVER NEAR DEVLIN
05PC018                       RAINY RIVER AT MANITOU RAPIDS
05PC019                         RAINY RIVER AT FORT FRANCES
05PC022                        LA VALLEE RIVER NEAR BURRISS
05PC024              RAINY RIVER AT PITHERS POINT SITE NO.1
05PC025              RAINY RIVER AT PITHERS POINT SITE NO.2
Name: STATION_NAME, dtype: object

Mapping WSC Stations in the Rainy River Watershed

The following cell creates a pandas dataframe of monitoring stations from the STATIONS.csv table extracted from the HYDAT database. The extraction searches for all stations within a specified region bounded by latitude and longitude.
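
As a rough illustration of a latitude/longitude bounding-box search, the sketch below filters a small station table. The helper name stations_in_box, the bounds, and the 'ZZ_OUTSIDE' row are hypothetical and exist only for the example; the three real coordinates are copied from the station listings shown later in this notebook. It is not the extraction code used to build the cached files.

```python
import pandas as pd

# Miniature stand-in for the STATIONS table. The first three rows use
# coordinates shown elsewhere in this notebook; 'ZZ_OUTSIDE' is a made-up
# row placed far outside the region so the filter has something to reject.
stations = pd.DataFrame(
    {
        "LATITUDE": [48.6345, 48.5, 48.6085, 60.0],
        "LONGITUDE": [-93.9134, -92.6389, -93.4034, -110.0],
    },
    index=pd.Index(
        ["05PC018", "05PA003", "05PC019", "ZZ_OUTSIDE"], name="STATION_NUMBER"
    ),
)


def stations_in_box(df, lat_min, lat_max, lon_min, lon_max):
    """Return the rows of df whose coordinates fall inside the box (inclusive)."""
    in_lat = df["LATITUDE"].between(lat_min, lat_max)
    in_lon = df["LONGITUDE"].between(lon_min, lon_max)
    return df[in_lat & in_lon]


# Illustrative bounds only; they are not the bounds used for the cached data.
selected = stations_in_box(stations, 47.5, 49.5, -95.0, -90.0)
print(selected.index.tolist())
```

Both comparisons are inclusive, so a station lying exactly on the boundary is kept.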

For reference, this is a map of the Rainy River drainage basin available from the International Joint Commission.

The following function maps a list of stations identified by their station numbers. It extracts the latitude and longitude of each station from the STATIONS table, then calls the Google Maps web API to create a map image.

In [3]:
def mapWSC(stationList,zoom=8):
    # Returns a .png map image of the listed stations.
    # Marker colors: red = level only, green = flow only, blue = both.

    def coords(stations):
        # Format each station as "latitude,longitude", separated by "|"
        return "|".join(["{0},{1}".format(WSC_STATIONS.loc[s,'LATITUDE'],
                                          WSC_STATIONS.loc[s,'LONGITUDE'])
                         for s in stations])

    # .loc replaces the .ix indexer, which has been removed from pandas
    flows = [s for s in stationList if WSC_STATIONS.loc[s,'Flow'] == True]
    levels = [s for s in stationList if WSC_STATIONS.loc[s,'Level'] == True]

    rSet = set(levels).difference(set(flows))
    gSet = set(flows).difference(set(levels))
    bSet = set(levels).intersection(set(flows))

    google_maps_url = \
        "https://maps.googleapis.com/maps/api/staticmap?" + \
        "size=640x400" + \
        "&zoom={:d}".format(zoom) + \
        "&maptype=terrain" + \
        "&markers=color:red%7Csize:mid%7C" + coords(rSet) + \
        "&markers=color:green%7Csize:mid%7C" + coords(gSet) + \
        "&markers=color:blue%7Csize:mid%7C" + coords(bSet)

    return Image(requests.get(google_maps_url).content)


display(mapWSC(WSC_STATIONS.index))

Viewing Station Data

The HYDAT database is a collection of data associated with monitoring stations located throughout Canada. The STATIONS table contains a list of stations and their attributes, including the latitude and longitude of each station. As an example, here we list the attributes for 05PC018, a station monitoring the level and flow of the Rainy River at Manitou Rapids, Ontario.

In [4]:
display(WSC_STATIONS.loc['05PC018'])
STATION_NAME            RAINY RIVER AT MANITOU RAPIDS
PROV_TERR_STATE_LOC                                ON
REGIONAL_OFFICE_ID                                  5
HYD_STATUS                                          A
SED_STATUS                                        NaN
LATITUDE                                      48.6345
LONGITUDE                                    -93.9134
DRAINAGE_AREA_GROSS                             50200
DRAINAGE_AREA_EFFECT                              NaN
RHBN                                                0
REAL_TIME                                           0
CONTRIBUTOR_ID                                    647
OPERATOR_ID                                       600
DATUM_ID                                           10
Level                                            True
Flow                                             True
Name: 05PC018, dtype: object

Plotting Level Data

In [5]:
s = '05PA003'
display(WSC_STATIONS.loc[s])
display(mapWSC([s]))

plt.figure(figsize=(10,4))
WSC_LEVELS[s].plot(lw=1)
yrA = str(WSC_LEVELS[s].dropna().index[0].year)
yrB = str(WSC_LEVELS[s].dropna().index[-1].year)
plt.title(WSC_STATIONS.loc[s,'STATION_NAME'] + ' ' + yrA + '-' + yrB)
plt.ylabel('Meters')
STATION_NAME            NAMAKAN LAKE ABOVE KETTLE FALLS DAM
PROV_TERR_STATE_LOC                                      ON
REGIONAL_OFFICE_ID                                        5
HYD_STATUS                                                D
SED_STATUS                                              NaN
LATITUDE                                               48.5
LONGITUDE                                          -92.6389
DRAINAGE_AREA_GROSS                                     NaN
DRAINAGE_AREA_EFFECT                                    NaN
RHBN                                                      0
REAL_TIME                                                 0
CONTRIBUTOR_ID                                          647
OPERATOR_ID                                             647
DATUM_ID                                                100
Level                                                  True
Flow                                                       
Name: 05PA003, dtype: object
Out[5]:
<matplotlib.text.Text at 0x115d137b8>

Plotting Flow Data

In [6]:
s = '05PC019'
display(WSC_STATIONS.loc[s])

display(mapWSC([s],12))
plt.figure(figsize=(10,4))
WSC_FLOWS[s].dropna().plot(lw=1)
yrA = str(WSC_FLOWS[s].dropna().index[0].year)
yrB = str(WSC_FLOWS[s].dropna().index[-1].year)
plt.title(WSC_STATIONS.loc[s,'STATION_NAME'] + ' ' + yrA + '-' + yrB)
plt.ylabel('Cubic Meters per Second')
STATION_NAME            RAINY RIVER AT FORT FRANCES
PROV_TERR_STATE_LOC                              ON
REGIONAL_OFFICE_ID                                5
HYD_STATUS                                        A
SED_STATUS                                      NaN
LATITUDE                                    48.6085
LONGITUDE                                  -93.4034
DRAINAGE_AREA_GROSS                           38600
DRAINAGE_AREA_EFFECT                            NaN
RHBN                                              0
REAL_TIME                                         0
CONTRIBUTOR_ID                                  647
OPERATOR_ID                                       5
DATUM_ID                                        100
Level                                              
Flow                                           True
Name: 05PC019, dtype: object
Out[6]:
<matplotlib.text.Text at 0x11961c6a0>

Example: Comparing Levels on Rainy and Namakan Lakes

This example plots the recorded history of lake levels for Rainy and Namakan Lakes on a single set of axes.

In [7]:
plt.figure(figsize=(10,6))
WSC_LEVELS['05PB007'].plot(color='blue',lw=1)   # RL at Fort Frances
WSC_LEVELS['05PA003'].plot(color='green',lw=1)  # NL at Kettle Falls
WSC_LEVELS['05PA013'].plot(color='green',lw=1)  # NL at Squirrel Island

plt.legend(['Rainy Lake','Namakan Reservoir']);
plt.title('History of Rainy Lake and Namakan Lake Levels, 1911-')
plt.grid()

Example: What were the highest water events on Rainy Lake?

In [8]:
s = '05PB007'
LEVELS = WSC_LEVELS[s]

high_levels = np.asarray([[yr,r.max()] for (yr,r) in LEVELS.groupby(LEVELS.index.year)])
yr,h = high_levels.transpose()

plt.figure(figsize=(10,6))
plt.plot(yr,h)
plt.xlabel('Year')
plt.ylabel('Level [meters]')
plt.title('High Water Mark by Year: ' + WSC_STATIONS.loc[s,'STATION_NAME'])
Out[8]:
<matplotlib.text.Text at 0x11b03d710>
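
The annual high water marks can also be computed and ranked directly with pandas. The sketch below is a minimal illustration on a synthetic daily series (not HYDAT data) with one peak planted on a known date; the variable names are assumptions made for the example. With the cached data, the same calls would apply to WSC_LEVELS['05PB007'].

```python
import numpy as np
import pandas as pd

# Synthetic daily level record, for illustration only (NOT real HYDAT data)
idx = pd.date_range("2000-01-01", "2002-12-31", freq="D")
rng = np.random.default_rng(0)
levels = pd.Series(337.0 + 0.1 * rng.standard_normal(len(idx)), index=idx)

# Plant an obvious high water event on a known date
levels.loc[pd.Timestamp("2001-06-15")] = 339.0

# Highest level in each calendar year, indexed by year
annual_max = levels.groupby(levels.index.year).max()

# Date on which each annual maximum occurred
annual_peak_date = levels.groupby(levels.index.year).idxmax()

# Rank the years by their high water mark, highest first
top_years = annual_max.nlargest(2)
print(top_years)
print(annual_peak_date.loc[top_years.index[0]])
```

Grouping on levels.index.year keeps the result as a Series indexed by year, so it can be sorted, plotted, or joined with the peak dates without converting through a NumPy array.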

Example: Distribution of Inflows on Rainy River

This example plots the flow records for two stations on the Rainy River, then compares the distributions of those flows with histograms.

In [9]:
RR = WSC_FLOWS['05PC019']
RM = WSC_FLOWS['05PC018']
In [10]:
plt.figure(figsize=(10,8))
plt.subplot(2,1,1)

RR.plot(lw=1)
RM.plot(lw=1)
plt.legend([WSC_STATIONS.loc[RR.name,'STATION_NAME'],WSC_STATIONS.loc[RM.name,'STATION_NAME']])

plt.subplot(2,1,2)

RR.hist(bins=100,lw=1,alpha=0.5)
RM.hist(bins=100,lw=1,alpha=0.5)

plt.legend([WSC_STATIONS.loc[RR.name,'STATION_NAME'],WSC_STATIONS.loc[RM.name,'STATION_NAME']])
Out[10]:
<matplotlib.legend.Legend at 0x119649710>

Ungaged Inflows to Rainy River

In [11]:
A = '05PC019'   # Rainy River at Ft. Frances
B = '05PC018'   # Rainy River at Manitou Rapids

FLOW = (WSC_FLOWS[B] - WSC_FLOWS[A]).dropna()

plt.figure(figsize=(10,6))
FLOW['1970':].plot(lw=1)

plt.xlabel('Year')
plt.ylabel('Flow [cubic meters/second]')
plt.title('Difference in Flow on Rainy River between Manitou Rapids and Ft. Frances')
Out[11]:
<matplotlib.text.Text at 0x11b9e5160>
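
The difference above relies on pandas aligning the two flow series on their date index before subtracting. The sketch below shows that behavior on two short, made-up series (the values are invented for illustration): any date missing from either record yields NaN, which dropna() then removes.

```python
import numpy as np
import pandas as pd

days = pd.date_range("2000-01-01", periods=5, freq="D")

# Made-up upstream record with one missing observation
upstream = pd.Series([100.0, 110.0, np.nan, 130.0, 140.0], index=days)

# Made-up downstream record that ends one day earlier
downstream = pd.Series([150.0, 155.0, 160.0, 170.0], index=days[:4])

# Subtraction aligns on the date index; dates absent from either
# series (or holding NaN) produce NaN and are dropped
local_inflow = (downstream - upstream).dropna()
print(local_inflow)
```

Only the three dates with valid observations at both stations survive, so the difference is never computed from mismatched days.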
In [12]:
plt.figure(figsize=(10,4))

# density=True replaces the normed argument, which has been removed from matplotlib
FLOW['1970':'1999'].hist(bins=100,density=True,alpha=0.5)
FLOW['2000':].hist(bins=100,density=True,alpha=0.5)
plt.ylim([0,.015])
plt.xlim([-100,800])

plt.title('Distribution of Ungaged Flows into Rainy River')
plt.xlabel('Flow [cubic meters/second]')
plt.legend(['1970-1999','2000-2014'])
Out[12]:
<matplotlib.legend.Legend at 0x119b829e8>
In [13]:
plt.figure(figsize=(10,6))

FLOW = WSC_FLOWS[B] - WSC_FLOWS[A]

# First period ends in 1999 so the two periods match the legend and do not overlap
hist,bins = np.histogram(FLOW['1970':'1999'].dropna(),bins = 100)
chist = np.cumsum(hist[::-1])[::-1]/float(sum(hist))
plt.semilogy(bins[1:],chist)

hist,bins = np.histogram(FLOW['2000':].dropna(),bins = 100)
chist = np.cumsum(hist[::-1])[::-1]/float(sum(hist))
plt.semilogy(bins[1:],chist)

plt.xlim([0,plt.xlim()[1]])

plt.legend(['1970-1999','2000-2014'])

plt.ylabel('Probability of Exceeding a Given Flowrate')
plt.xlabel('Flow [cubic meters/sec]')
plt.title('Frequency-Flow Diagram for Ungaged Inflows to Rainy River')
Out[13]:
<matplotlib.text.Text at 0x11b2ee5c0>
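
The exceedance probabilities plotted above come from a reversed cumulative sum of histogram counts. The sketch below wraps that calculation in a small function; the name exceedance_curve is hypothetical, and the sketch pairs each probability with the lower edge of its bin, so each value reads as the fraction of samples at or above that edge.

```python
import numpy as np


def exceedance_curve(values, bins=100):
    """Empirical probability that a sample is at or above each lower bin edge."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]        # ignore missing observations

    counts, edges = np.histogram(values, bins=bins)

    # Reverse cumulative sum: samples falling in this bin or any higher bin,
    # divided by the total number of samples
    exceed = np.cumsum(counts[::-1])[::-1] / counts.sum()

    return edges[:-1], exceed


# Tiny worked case: four samples, four bins, one missing value
x, p = exceedance_curve([1.0, 2.0, 3.0, 4.0, np.nan], bins=4)
print(x)
print(p)
```

Every valid sample is at or above the lowest edge, so the curve always starts at 1 and decreases from there.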

Example: Flow-Frequency for State-of-Nature Streams in the Rainy River Watershed

In [14]:
stationList = ['05PA006','05PA012','05PB001','05PB003','05PB004','05PB009',
               '05PB014','05PB015','05PB018','05PB019','05PB020','05PB021',
               '05PB022','05PC009','05PC010','05PC016','05PC022']

for s in stationList:
    print(s,end='')
    print("{0:50s}".format(WSC_STATIONS.loc[s]['STATION_NAME']),end='')
    print(WSC_FLOWS[s].dropna().index[0].year,end='')
    print(WSC_FLOWS[s].dropna().index[-1].year)
    
stationList = ['05PA006','05PA012','05PB014','05PB018']
display(mapWSC(stationList))

for s in stationList:
    print(s,end='')
    print("{0:50s}".format(WSC_STATIONS.loc[s]['STATION_NAME']),end='')
    print(WSC_FLOWS[s].dropna().index[0].year,end='')
    print(WSC_FLOWS[s].dropna().index[-1].year)

for s in stationList:
    plt.figure(figsize=(10,4))
    WSC_FLOWS[s].plot(lw=1)
    plt.title(s + ': ' + WSC_STATIONS.loc[s]['STATION_NAME'])
05PA006NAMAKAN RIVER AT OUTLET OF LAC LA CROIX           19212014
05PA012BASSWOOD RIVER NEAR WINTON                        19242010
05PB001SEINE RIVER NEAR LA S