NumPy Demo

We'll be using a CSV file of tide level data for this demo. The data are from Battery Park, New York City on October 29-30, 2012, during Hurricane Sandy. The CSV is a reformatted and cleaned version of data published by NOAA.

In [1]:
import numpy as np


Basic read with numpy.genfromtxt. Returns a 2d array of floats.

In [82]:
!head -n 5 BatteryParkTideData.csv

TimeOffsetHours,Pred6,Backup,Acoustc
0.0,1.5900000000000001,4.6799999999999997,4.6500000000000004
0.10000000000000001,1.5,4.5499999999999998,4.54
0.20000000000000001,1.3999999999999999,4.46,4.4400000000000004
0.29999999999999999,1.3100000000000001,4.3600000000000003,4.3300000000000001

In [2]:
data = np.genfromtxt('BatteryParkTideData.csv', delimiter=',', skip_header=1, missing_values='NA')

In [3]:
data

Out[3]:
array([[  0.  ,   1.59,   4.68,   4.65],
[  0.1 ,   1.5 ,   4.55,   4.54],
[  0.2 ,   1.4 ,   4.46,   4.44],
...,
[ 47.7 ,   3.25,   4.32,   4.5 ],
[ 47.8 ,   3.14,   4.22,   4.39],
[ 47.9 ,   3.03,   4.12,   4.28]])
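To experiment with genfromtxt without the tide file, here is a minimal sketch that reads a small in-memory CSV. The sample text and the names csv_text and small are made up for illustration. It uses missing_values, the keyword current NumPy versions accept for marking missing entries; a missing entry in a float column comes back as nan.

```python
import io
import numpy as np

# Hypothetical three-row sample laid out like the tide file.
csv_text = """TimeOffsetHours,Pred6,Backup,Acoustc
0.0,1.59,4.68,4.65
0.1,1.50,NA,4.54
0.2,1.40,4.46,4.44
"""

small = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                      skip_header=1, missing_values='NA')

print(small.shape)   # (3, 4)
print(small[1])      # the 'NA' entry is read as nan
```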

Array Properties

In [4]:
print('Shape: ', data.shape)
print('Size: ', data.size)
print('Number of dimensions: ', data.ndim)
print('Data type: ', data.dtype)

Shape:  (480, 4)
Size:  1920
Number of dimensions:  2
Data type:  float64

In [5]:
data[0]

Out[5]:
array([ 0.  ,  1.59,  4.68,  4.65])
In [6]:
data[0, 1]

Out[6]:
1.5900000000000001
In [7]:
data[:, 1]

Out[7]:
array([ 1.59,  1.5 ,  1.4 ,  1.31,  1.22,  1.13,  1.04,  0.95,  0.87,
0.78,  0.7 ,  0.62,  0.55,  0.48,  0.41,  0.34,  0.28,  0.23,
0.18,  0.14,  0.1 ,  0.08,  0.06,  0.05,  0.05,  0.06,  0.08,
0.11,  0.14,  0.19,  0.25,  0.31,  0.39,  0.47,  0.56,  0.66,
0.76,  0.87,  0.98,  1.1 ,  1.22,  1.35,  1.48,  1.61,  1.74,
1.88,  2.01,  2.15,  2.28,  2.41,  2.55,  2.68,  2.81,  2.94,
3.06,  3.19,  3.31,  3.43,  3.55,  3.66,  3.78,  3.89,  3.99,
4.1 ,  4.2 ,  4.3 ,  4.39,  4.48,  4.57,  4.66,  4.73,  4.81,
4.89,  4.95,  5.02,  5.08,  5.13,  5.18,  5.23,  5.27,  5.3 ,
5.33,  5.35,  5.37,  5.38,  5.38,  5.38,  5.37,  5.36,  5.34,
5.31,  5.28,  5.24,  5.19,  5.13,  5.07,  5.  ,  4.93,  4.85,
4.76,  4.67,  4.57,  4.47,  4.37,  4.26,  4.15,  4.03,  3.92,
3.8 ,  3.68,  3.56,  3.44,  3.32,  3.21,  3.09,  2.97,  2.86,
2.74,  2.63,  2.52,  2.41,  2.3 ,  2.2 ,  2.09,  1.98,  1.88,
1.78,  1.68,  1.57,  1.47,  1.37,  1.27,  1.17,  1.08,  0.98,
0.89,  0.8 ,  0.71,  0.62,  0.54,  0.46,  0.39,  0.32,  0.26,
0.21,  0.16,  0.12,  0.09,  0.07,  0.06,  0.05,  0.06,  0.08,
0.1 ,  0.13,  0.18,  0.23,  0.29,  0.36,  0.44,  0.52,  0.61,
0.71,  0.81,  0.91,  1.02,  1.13,  1.25,  1.36,  1.49,  1.6 ,
1.73,  1.85,  1.97,  2.09,  2.21,  2.33,  2.45,  2.57,  2.68,
2.8 ,  2.91,  3.02,  3.13,  3.23,  3.33,  3.44,  3.53,  3.63,
3.72,  3.81,  3.9 ,  3.98,  4.06,  4.14,  4.21,  4.28,  4.34,
4.41,  4.46,  4.51,  4.56,  4.6 ,  4.64,  4.67,  4.69,  4.71,
4.73,  4.74,  4.74,  4.74,  4.72,  4.71,  4.69,  4.65,  4.62,
4.57,  4.52,  4.46,  4.4 ,  4.33,  4.25,  4.17,  4.08,  3.99,
3.89,  3.79,  3.68,  3.57,  3.46,  3.35,  3.23,  3.11,  3.  ,
2.88,  2.76,  2.65,  2.53,  2.42,  2.31,  2.2 ,  2.1 ,  1.99,
1.89,  1.79,  1.7 ,  1.6 ,  1.51,  1.42,  1.33,  1.25,  1.16,
1.08,  1.  ,  0.92,  0.85,  0.77,  0.7 ,  0.63,  0.56,  0.5 ,
0.44,  0.38,  0.33,  0.29,  0.24,  0.21,  0.18,  0.16,  0.15,
0.14,  0.14,  0.16,  0.18,  0.21,  0.25,  0.3 ,  0.36,  0.43,
0.5 ,  0.59,  0.68,  0.78,  0.89,  1.  ,  1.12,  1.24,  1.37,
1.5 ,  1.63,  1.77,  1.91,  2.05,  2.19,  2.32,  2.46,  2.6 ,
2.73,  2.87,  3.  ,  3.13,  3.25,  3.38,  3.5 ,  3.62,  3.73,
3.84,  3.95,  4.06,  4.16,  4.25,  4.35,  4.44,  4.52,  4.6 ,
4.68,  4.76,  4.83,  4.89,  4.95,  5.01,  5.07,  5.11,  5.16,
5.2 ,  5.23,  5.26,  5.29,  5.31,  5.32,  5.33,  5.33,  5.33,
5.32,  5.31,  5.29,  5.26,  5.23,  5.19,  5.14,  5.1 ,  5.04,
4.97,  4.9 ,  4.83,  4.75,  4.66,  4.57,  4.47,  4.37,  4.27,
4.16,  4.05,  3.93,  3.81,  3.69,  3.57,  3.45,  3.33,  3.21,
3.09,  2.97,  2.86,  2.74,  2.63,  2.51,  2.4 ,  2.29,  2.19,
2.08,  1.98,  1.88,  1.78,  1.68,  1.58,  1.49,  1.39,  1.3 ,
1.21,  1.12,  1.03,  0.94,  0.85,  0.77,  0.69,  0.6 ,  0.53,
0.46,  0.39,  0.33,  0.27,  0.22,  0.18,  0.14,  0.11,  0.09,
0.08,  0.08,  0.09,  0.1 ,  0.13,  0.16,  0.2 ,  0.26,  0.32,
0.39,  0.47,  0.55,  0.64,  0.74,  0.84,  0.95,  1.06,  1.17,
1.29,  1.41,  1.53,  1.66,  1.78,  1.9 ,  2.02,  2.15,  2.27,
2.39,  2.5 ,  2.62,  2.73,  2.84,  2.95,  3.05,  3.16,  3.26,
3.36,  3.45,  3.54,  3.63,  3.71,  3.8 ,  3.87,  3.95,  4.02,
4.09,  4.15,  4.22,  4.27,  4.32,  4.37,  4.42,  4.46,  4.49,
4.52,  4.55,  4.57,  4.58,  4.6 ,  4.6 ,  4.6 ,  4.59,  4.58,
4.56,  4.54,  4.51,  4.48,  4.43,  4.38,  4.33,  4.27,  4.2 ,
4.13,  4.05,  3.96,  3.87,  3.78,  3.68,  3.58,  3.47,  3.37,
3.25,  3.14,  3.03])

recarrays

Have genfromtxt grab column names from the header and it returns a structured array (a close relative of the recarray). The structured array can be indexed numerically to get row data or with a column name to get a 1d array of data for that column.

In [8]:
data = np.genfromtxt('BatteryParkTideData.csv', delimiter=',', names=True, missing_values='NA')

In [9]:
data

Out[9]:
array([(0.0, 1.59, 4.68, 4.65), (0.1, 1.5, 4.55, 4.54),
(0.2, 1.4, 4.46, 4.44), (0.3, 1.31, 4.36, 4.33),
(0.4, 1.22, 4.28, 4.26), (0.5, 1.13, 4.21, 4.18),
(0.6, 1.04, 4.15, 4.12), (0.7, 0.95, 4.08, 4.06),
(0.8, 0.87, 3.99, 3.97), (0.9, 0.78, 3.92, 3.89),
(1.0, 0.7, 3.87, 3.85), (1.1, 0.62, 3.86, 3.83),
(1.2, 0.55, 3.8, 3.78), (1.3, 0.48, 3.74, 3.73),
(1.4, 0.41, 3.68, 3.66), (1.5, 0.34, 3.63, 3.62),
(1.6, 0.28, 3.59, 3.58), (1.7, 0.23, 3.55, 3.53),
(1.8, 0.18, 3.5, 3.48), (1.9, 0.14, 3.45, 3.42),
(2.0, 0.1, 3.39, 3.35), (2.1, 0.08, 3.37, 3.34),
(2.2, 0.06, 3.33, 3.31), (2.3, 0.05, 3.31, 3.3),
(2.4, 0.05, 3.29, 3.26), (2.5, 0.06, 3.25, 3.22),
(2.6, 0.08, 3.21, 3.19), (2.7, 0.11, 3.19, 3.17),
(2.8, 0.14, 3.19, 3.17), (2.9, 0.19, 3.2, 3.18),
(3.0, 0.25, 3.23, 3.23), (3.1, 0.31, 3.29, 3.26),
(3.2, 0.39, 3.31, 3.29), (3.3, 0.47, 3.34, 3.32),
(3.4, 0.56, 3.39, 3.37), (3.5, 0.66, 3.44, 3.41),
(3.6, 0.76, 3.49, 3.46), (3.7, 0.87, 3.57, 3.54),
(3.8, 0.98, 3.67, 3.65), (3.9, 1.1, 3.78, 3.76),
(4.0, 1.22, 3.87, 3.85), (4.1, 1.35, 3.96, 3.95),
(4.2, 1.48, 4.09, 4.06), (4.3, 1.61, 4.19, 4.18),
(4.4, 1.74, 4.36, 4.33), (4.5, 1.88, 4.49, 4.46),
(4.6, 2.01, 4.59, 4.57), (4.7, 2.15, 4.68, 4.66),
(4.8, 2.28, 4.79, 4.77), (4.9, 2.41, 4.9, 4.88),
(5.0, 2.55, 5.01, 4.98), (5.1, 2.68, 5.12, 5.11),
(5.2, 2.81, 5.26, 5.24), (5.3, 2.94, 5.38, 5.35),
(5.4, 3.06, 5.52, 5.5), (5.5, 3.19, 5.69, 5.67),
(5.6, 3.31, 5.84, 5.83), (5.7, 3.43, 5.97, 5.96),
(5.8, 3.55, 6.13, 6.11), (5.9, 3.66, 6.26, 6.24),
(6.0, 3.78, 6.39, 6.38), (6.1, 3.89, 6.53, 6.51),
(6.2, 3.99, 6.68, 6.66), (6.3, 4.1, 6.84, 6.82),
(6.4, 4.2, 7.0, 6.98), (6.5, 4.3, 7.13, 7.1),
(6.6, 4.39, 7.25, 7.23), (6.7, 4.48, 7.36, 7.32),
(6.8, 4.57, 7.46, 7.43), (6.9, 4.66, 7.56, 7.53),
(7.0, 4.73, 7.65, 7.62), (7.1, 4.81, 7.71, 7.71),
(7.2, 4.89, 7.8, 7.78), (7.3, 4.95, 7.9, 7.88),
(7.4, 5.02, 8.02, 7.98), (7.5, 5.08, 8.07, 8.04),
(7.6, 5.13, 8.12, 8.1), (7.7, 5.18, 8.26, 8.23),
(7.8, 5.23, 8.36, 8.31), (7.9, 5.27, 8.47, 8.35),
(8.0, 5.3, 8.53, 8.35), (8.1, 5.33, 8.58, 8.35),
(8.2, 5.35, 8.58, 8.35), (8.3, 5.37, 8.6, 8.35),
(8.4, 5.38, 8.67, 8.36), (8.5, 5.38, 8.69, 8.35),
(8.6, 5.38, 8.67, 8.34), (8.7, 5.37, 8.71, 8.34),
(8.8, 5.36, 8.76, 8.34), (8.9, 5.34, 8.79, 8.35),
(9.0, 5.31, 8.8, 8.32), (9.1, 5.28, 8.85, 8.34),
(9.2, 5.24, 8.85, 8.34), (9.3, 5.19, 8.85, 8.29),
(9.4, 5.13, 8.85, 8.32), (9.5, 5.07, 8.81, 8.3),
(9.6, 5.0, 8.78, 8.33), (9.7, 4.93, 8.69, 8.34),
(9.8, 4.85, 8.6, 8.29), (9.9, 4.76, 8.58, 8.28),
(10.0, 4.67, 8.53, 8.25), (10.1, 4.57, 8.5, 8.26),
(10.2, 4.47, nan, nan), (10.3, 4.37, nan, nan),
(10.4, 4.26, nan, nan), (10.5, 4.15, nan, nan),
(10.6, 4.03, nan, nan), (10.7, 3.92, 8.1, 8.06),
(10.8, 3.8, 8.05, 7.99), (10.9, 3.68, 7.94, 7.88),
(11.0, 3.56, 7.83, 7.81), (11.1, 3.44, 7.77, 7.75),
(11.2, 3.32, 7.68, 7.64), (11.3, 3.21, 7.56, 7.53),
(11.4, 3.09, 7.45, 7.41), (11.5, 2.97, 7.33, 7.3),
(11.6, 2.86, 7.22, 7.18), (11.7, 2.74, 7.09, 7.06),
(11.8, 2.63, 6.96, 6.92), (11.9, 2.52, 6.84, 6.81),
(12.0, 2.41, 6.77, 6.73), (12.1, 2.3, 6.65, 6.63),
(12.2, 2.2, 6.58, 6.55), (12.3, 2.09, 6.5, 6.47),
(12.4, 1.98, 6.44, 6.41), (12.5, 1.88, 6.33, 6.3),
(12.6, 1.78, 6.24, 6.21), (12.7, 1.68, 6.16, 6.15),
(12.8, 1.57, 6.1, 6.09), (12.9, 1.47, 6.02, 6.0),
(13.0, 1.37, 5.91, 5.9), (13.1, 1.27, 5.87, 5.83),
(13.2, 1.17, 5.79, 5.76), (13.3, 1.08, 5.69, 5.67),
(13.4, 0.98, 5.61, 5.59), (13.5, 0.89, 5.55, 5.53),
(13.6, 0.8, 5.46, 5.44), (13.7, 0.71, 5.42, 5.39),
(13.8, 0.62, 5.37, 5.34), (13.9, 0.54, 5.33, 5.3),
(14.0, 0.46, 5.3, 5.26), (14.1, 0.39, 5.27, 5.24),
(14.2, 0.32, 5.26, 5.23), (14.3, 0.26, nan, 5.2),
(14.4, 0.21, 5.27, 5.24), (14.5, 0.16, 5.28, 5.25),
(14.6, 0.12, 5.29, 5.27), (14.7, 0.09, 5.38, 5.35),
(14.8, 0.07, 5.43, 5.4), (14.9, 0.06, 5.52, 5.5),
(15.0, 0.05, 5.6, 5.58), (15.1, 0.06, 5.69, 5.69),
(15.2, 0.08, 5.79, 5.77), (15.3, 0.1, 5.97, 5.97),
(15.4, 0.13, 6.11, 6.1), (15.5, 0.18, 6.24, 6.21),
(15.6, 0.23, 6.4, 6.36), (15.7, 0.29, 6.5, 6.46),
(15.8, 0.36, 6.61, 6.56), (15.9, 0.44, 6.7, 6.67),
(16.0, 0.52, 6.85, 6.82), (16.1, 0.61, 7.02, 6.97),
(16.2, 0.71, 7.15, 7.09), (16.3, 0.81, 7.28, 7.23),
(16.4, 0.91, 7.45, 7.4), (16.5, 1.02, 7.57, 7.51),
(16.6, 1.13, 7.75, 7.7), (16.7, 1.25, 7.91, 7.87),
(16.8, 1.36, 8.03, 8.0), (16.9, 1.49, 8.18, 8.14),
(17.0, 1.6, 8.27, 8.23), (17.1, 1.73, 8.38, 8.3),
(17.2, 1.85, 8.48, 8.33), (17.3, 1.97, 8.63, 8.32),
(17.4, 2.09, 8.77, 8.33), (17.5, 2.21, 8.9, 8.31),
(17.6, 2.33, 9.03, 8.32), (17.7, 2.45, 9.19, 8.29),
(17.8, 2.57, 9.33, 8.31), (17.9, 2.68, 9.48, 8.3),
(18.0, 2.8, 9.62, 7.85), (18.1, 2.91, 9.74, 7.83),
(18.2, 3.02, 9.95, 7.07), (18.3, 3.13, 10.1, 6.1),
(18.4, 3.23, 10.22, 6.09), (18.5, 3.33, 10.39, 6.11),
(18.6, 3.44, 10.55, 6.11), (18.7, 3.53, 10.69, 6.12),
(18.8, 3.63, 10.88, 6.6), (18.9, 3.72, 11.07, 6.91),
(19.0, 3.81, 11.25, 7.27), (19.1, 3.9, 11.41, 7.16),
(19.2, 3.98, 11.62, 7.07), (19.3, 4.06, 11.87, 7.31),
(19.4, 4.14, 12.09, 7.06), (19.5, 4.21, 12.33, 7.06),
(19.6, 4.28, 12.54, 7.24), (19.7, 4.34, 12.75, 7.13),
(19.8, 4.41, 12.93, 7.16), (19.9, 4.46, 13.04, 7.09),
(20.0, 4.51, 13.15, 7.16), (20.1, 4.56, 13.2, 7.16),
(20.2, 4.6, 13.26, 7.11), (20.3, 4.64, 13.34, 7.15),
(20.4, 4.67, 13.4, 7.26), (20.5, 4.69, 13.46, 7.13),
(20.6, 4.71, 13.54, 7.0), (20.7, 4.73, 13.65, 6.68),
(20.8, 4.74, 13.72, 6.85), (20.9, 4.74, 13.78, 7.12),
(21.0, 4.74, 13.81, 7.07), (21.1, 4.72, 13.85, 7.3),
(21.2, 4.71, 13.87, 7.3), (21.3, 4.69, 13.87, 7.32),
(21.4, 4.65, 13.88, 7.19), (21.5, 4.62, 13.79, 7.14),
(21.6, 4.57, 13.72, 7.18), (21.7, 4.52, 13.63, 7.03),
(21.8, 4.46, 13.54, 7.32), (21.9, 4.4, 13.41, 7.04),
(22.0, 4.33, 13.3, 7.23), (22.1, 4.25, 13.15, 6.88),
(22.2, 4.17, 12.99, 6.97), (22.3, 4.08, 12.86, 7.19),
(22.4, 3.99, 12.69, 7.1), (22.5, 3.89, 12.5, 7.18),
(22.6, 3.79, 12.27, 7.18), (22.7, 3.68, 12.07, 7.4),
(22.8, 3.57, 11.87, 7.09), (22.9, 3.46, 11.61, 7.03),
(23.0, 3.35, 11.32, 7.18), (23.1, 3.23, 11.04, 7.26),
(23.2, 3.11, 10.78, 7.24), (23.3, 3.0, 10.47, 6.43),
(23.4, 2.88, 10.15, 6.8), (23.5, 2.76, 9.81, 7.67),
(23.6, 2.65, 9.54, 8.14), (23.7, 2.53, 9.22, 8.31),
(23.8, 2.42, 8.92, 8.32), (23.9, 2.31, 8.62, 8.29),
(24.0, 2.2, 8.35, 8.17), (24.1, 2.1, 8.03, 7.94),
(24.2, 1.99, 7.77, 7.71), (24.3, 1.89, 7.58, 7.53),
(24.4, 1.79, 7.42, 7.36), (24.5, 1.7, 7.2, 7.19),
(24.6, 1.6, 7.06, 7.0), (24.7, 1.51, 6.87, 6.84),
(24.8, 1.42, 6.71, 6.68), (24.9, 1.33, 6.54, 6.51),
(25.0, 1.25, 6.4, 6.36), (25.1, 1.16, 6.26, 6.22),
(25.2, 1.08, 6.13, 6.09), (25.3, 1.0, 5.99, 5.96),
(25.4, 0.92, 5.85, 5.84), (25.5, 0.85, 5.77, 5.74),
(25.6, 0.77, 5.64, 5.62), (25.7, 0.7, 5.52, 5.51),
(25.8, 0.63, 5.36, 5.35), (25.9, 0.56, 5.21, 5.2),
(26.0, 0.5, 5.08, 5.08), (26.1, 0.44, 4.94, 4.94),
(26.2, 0.38, 4.8, 4.8), (26.3, 0.33, 4.7, 4.67),
(26.4, 0.29, 4.55, 4.53), (26.5, 0.24, 4.44, 4.41),
(26.6, 0.21, 4.32, 4.31), (26.7, 0.18, 4.21, 4.2),
(26.8, 0.16, 4.1, 4.08), (26.9, 0.15, 4.0, 3.96),
(27.0, 0.14, 3.91, 3.87), (27.1, 0.14, 3.81, 3.79),
(27.2, 0.16, 3.77, 3.75), (27.3, 0.18, 3.71, 3.68),
(27.4, 0.21, 3.67, 3.63), (27.5, 0.25, 3.64, 3.61),
(27.6, 0.3, 3.63, 3.61), (27.7, 0.36, 3.67, 3.64),
(27.8, 0.43, 3.69, 3.67), (27.9, 0.5, 3.72, 3.71),
(28.0, 0.59, 3.81, 3.79), (28.1, 0.68, 3.88, 3.87),
(28.2, 0.78, 4.0, 3.99), (28.3, 0.89, 4.07, 4.05),
(28.4, 1.0, 4.14, 4.14), (28.5, 1.12, 4.21, 4.2),
(28.6, 1.24, 4.3, 4.3), (28.7, 1.37, 4.41, 4.41),
(28.8, 1.5, 4.52, 4.53), (28.9, 1.63, 4.65, 4.66),
(29.0, 1.77, 4.79, 4.81), (29.1, 1.91, 4.95, 4.97),
(29.2, 2.05, 5.08, 5.13), (29.3, 2.19, 5.21, 5.28),
(29.4, 2.32, 5.36, 5.44), (29.5, 2.46, 5.5, 5.57),
(29.6, 2.6, 5.6, 5.71), (29.7, 2.73, 5.75, 5.87),
(29.8, 2.87, 5.89, 6.0), (29.9, 3.0, 5.99, 6.11),
(30.0, 3.13, 6.09, 6.21), (30.1, 3.25, 6.17, 6.27),
(30.2, 3.38, 6.27, 6.4), (30.3, 3.5, 6.37, 6.5),
(30.4, 3.62, 6.43, 6.54), (30.5, 3.73, 6.51, 6.62),
(30.6, 3.84, 6.56, 6.7), (30.7, 3.95, 6.61, 6.75),
(30.8, 4.06, 6.7, 6.8), (30.9, 4.16, 6.74, 6.86),
(31.0, 4.25, 6.81, 6.93), (31.1, 4.35, 6.84, 6.98),
(31.2, 4.44, 6.92, 7.04), (31.3, 4.52, 6.93, 7.07),
(31.4, 4.6, 6.93, 7.09), (31.5, 4.68, 6.96, 7.09),
(31.6, 4.76, 6.95, 7.1), (31.7, 4.83, 6.96, 7.12),
(31.8, 4.89, 6.97, 7.12), (31.9, 4.95, 7.01, 7.15),
(32.0, 5.01, 7.02, 7.15), (32.1, 5.07, 7.02, 7.15),
(32.2, 5.11, 7.03, 7.16), (32.3, 5.16, 7.02, 7.16),
(32.4, 5.2, 7.01, 7.15), (32.5, 5.23, 7.03, 7.17),
(32.6, 5.26, 7.05, 7.21), (32.7, 5.29, 7.13, 7.28),
(32.8, 5.31, 7.18, 7.32), (32.9, 5.32, 7.19, 7.34),
(33.0, 5.33, 7.17, 7.31), (33.1, 5.33, 7.13, 7.26),
(33.2, 5.33, 7.11, 7.24), (33.3, 5.32, 7.12, 7.26),
(33.4, 5.31, 7.12, 7.26), (33.5, 5.29, 7.11, 7.25),
(33.6, 5.26, 7.12, 7.25), (33.7, 5.23, 7.16, 7.3),
(33.8, 5.19, 7.2, 7.36), (33.9, 5.14, 7.24, 7.38),
(34.0, 5.1, 7.28, 7.41), (34.1, 5.04, 7.29, 7.44),
(34.2, 4.97, 7.33, 7.48), (34.3, 4.9, 7.31, 7.47),
(34.4, 4.83, 7.27, 7.42), (34.5, 4.75, 7.24, 7.4),
(34.6, 4.66, 7.19, 7.34), (34.7, 4.57, 7.11, 7.25),
(34.8, 4.47, 6.99, 7.14), (34.9, 4.37, 6.87, 7.01),
(35.0, 4.27, 6.72, 6.88), (35.1, 4.16, 6.64, 6.79),
(35.2, 4.05, 6.56, 6.69), (35.3, 3.93, 6.46, 6.61),
(35.4, 3.81, 6.37, 6.53), (35.5, 3.69, 6.25, 6.41),
(35.6, 3.57, 6.11, 6.28), (35.7, 3.45, 6.01, 6.18),
(35.8, 3.33, 5.91, 6.07), (35.9, 3.21, 5.78, 5.98),
(36.0, 3.09, 5.66, 5.83), (36.1, 2.97, 5.49, 5.64),
(36.2, 2.86, 5.31, 5.49), (36.3, 2.74, 5.15, 5.29),
(36.4, 2.63, 5.03, 5.16), (36.5, 2.51, 4.91, 5.02),
(36.6, 2.4, 4.78, 4.9), (36.7, 2.29, 4.64, 4.74),
(36.8, 2.19, 4.51, 4.59), (36.9, 2.08, 4.36, 4.44),
(37.0, 1.98, 4.21, 4.29), (37.1, 1.88, 4.05, 4.11),
(37.2, 1.78, 3.91, 3.96), (37.3, 1.68, 3.76, 3.81),
(37.4, 1.58, 3.63, 3.68), (37.5, 1.49, 3.51, 3.56),
(37.6, 1.39, 3.4, 3.43), (37.7, 1.3, 3.3, 3.34),
(37.8, 1.21, 3.21, 3.23), (37.9, 1.12, 3.08, 3.09),
(38.0, 1.03, 2.94, 2.96), (38.1, 0.94, 2.83, 2.83),
(38.2, 0.85, 2.73, 2.72), (38.3, 0.77, 2.62, 2.61),
(38.4, 0.69, 2.51, 2.49), (38.5, 0.6, 2.4, 2.39),
(38.6, 0.53, 2.34, 2.31), (38.7, 0.46, 2.23, 2.21),
(38.8, 0.39, 2.14, 2.11), (38.9, 0.33, 2.03, 2.0),
(39.0, 0.27, 1.92, 1.89), (39.1, 0.22, 1.84, 1.83),
(39.2, 0.18, 1.77, 1.77), (39.3, 0.14, 1.71, 1.69),
(39.4, 0.11, 1.67, 1.65), (39.5, 0.09, 1.64, 1.61),
(39.6, 0.08, 1.6, 1.58), (39.7, 0.08, 1.58, 1.56),
(39.8, 0.09, 1.58, 1.54), (39.9, 0.1, 1.56, 1.52),
(40.0, 0.13, 1.53, 1.51), (40.1, 0.16, 1.52, 1.51),
(40.2, 0.2, 1.53, 1.52), (40.3, 0.26, 1.55, 1.53),
(40.4, 0.32, 1.54, 1.53), (40.5, 0.39, 1.58, 1.57),
(40.6, 0.47, 1.64, 1.64), (40.7, 0.55, 1.7, 1.7),
(40.8, 0.64, 1.83, 1.8), (40.9, 0.74, 1.94, 1.93),
(41.0, 0.84, 2.03, 2.02), (41.1, 0.95, 2.14, 2.13),
(41.2, 1.06, 2.26, 2.26), (41.3, 1.17, 2.4, 2.41),
(41.4, 1.29, 2.57, 2.58), (41.5, 1.41, 2.73, 2.73),
(41.6, 1.53, 2.89, 2.89), (41.7, 1.66, 3.03, 3.04),
(41.8, 1.78, 3.17, 3.2), (41.9, 1.9, 3.33, 3.37),
(42.0, 2.02, 3.5, 3.54), (42.1, 2.15, 3.66, 3.71),
(42.2, 2.27, 3.83, 3.9), (42.3, 2.39, 4.01, 4.08),
(42.4, 2.5, 4.17, 4.25), (42.5, 2.62, 4.29, 4.4),
(42.6, 2.73, 4.44, 4.55), (42.7, 2.84, 4.6, 4.71),
(42.8, 2.95, 4.78, 4.88), (42.9, 3.05, 4.92, 5.03),
(43.0, 3.16, 4.99, 5.13), (43.1, 3.26, 5.1, 5.26),
(43.2, 3.36, 5.22, 5.38), (43.3, 3.45, 5.35, 5.5),
(43.4, 3.54, 5.41, 5.58), (43.5, 3.63, 5.5, 5.69),
(43.6, 3.71, 5.58, 5.76), (43.7, 3.8, 5.64, 5.83),
(43.8, 3.87, 5.68, 5.87), (43.9, 3.95, 5.73, 5.93),
(44.0, 4.02, 5.79, 5.98), (44.1, 4.09, 5.79, 6.0),
(44.2, 4.15, 5.81, 6.02), (44.3, 4.22, 5.83, 6.03),
(44.4, 4.27, 5.86, 6.06), (44.5, 4.32, 5.88, 6.08),
(44.6, 4.37, 5.89, 6.09), (44.7, 4.42, 5.88, 6.08),
(44.8, 4.46, 5.88, 6.07), (44.9, 4.49, 5.88, 6.07),
(45.0, 4.52, 5.88, 6.08), (45.1, 4.55, 5.91, 6.1),
(45.2, 4.57, 5.91, 6.1), (45.3, 4.58, 5.9, 6.1),
(45.4, 4.6, 5.9, 6.1), (45.5, 4.6, 5.87, 6.07),
(45.6, 4.6, 5.84, 6.05), (45.7, 4.59, 5.8, 6.01),
(45.8, 4.58, 5.79, 6.0), (45.9, 4.56, 5.77, 5.97),
(46.0, 4.54, 5.72, 5.93), (46.1, 4.51, 5.68, 5.89),
(46.2, 4.48, 5.64, 5.84), (46.3, 4.43, 5.58, 5.78),
(46.4, 4.38, 5.51, 5.71), (46.5, 4.33, 5.44, 5.63),
(46.6, 4.27, 5.38, 5.57), (46.7, 4.2, 5.32, 5.5),
(46.8, 4.13, 5.23, 5.41), (46.9, 4.05, 5.15, 5.32),
(47.0, 3.96, 5.07, 5.24), (47.1, 3.87, 4.96, 5.13),
(47.2, 3.78, 4.86, 5.03), (47.3, 3.68, 4.75, 4.92),
(47.4, 3.58, 4.63, 4.8), (47.5, 3.47, 4.52, 4.69),
(47.6, 3.37, 4.42, 4.6), (47.7, 3.25, 4.32, 4.5),
(47.8, 3.14, 4.22, 4.39), (47.9, 3.03, 4.12, 4.28)],
dtype=[('TimeOffsetHours', '<f8'), ('Pred6', '<f8'), ('Backup', '<f8'), ('Acoustc', '<f8')])
In [10]:
data[0]

Out[10]:
(0.0, 1.59, 4.68, 4.65)
In [11]:
data['Pred6']

Out[11]:
array([ 1.59,  1.5 ,  1.4 ,  1.31,  1.22,  1.13,  1.04,  0.95,  0.87,
0.78,  0.7 ,  0.62,  0.55,  0.48,  0.41,  0.34,  0.28,  0.23,
0.18,  0.14,  0.1 ,  0.08,  0.06,  0.05,  0.05,  0.06,  0.08,
0.11,  0.14,  0.19,  0.25,  0.31,  0.39,  0.47,  0.56,  0.66,
0.76,  0.87,  0.98,  1.1 ,  1.22,  1.35,  1.48,  1.61,  1.74,
1.88,  2.01,  2.15,  2.28,  2.41,  2.55,  2.68,  2.81,  2.94,
3.06,  3.19,  3.31,  3.43,  3.55,  3.66,  3.78,  3.89,  3.99,
4.1 ,  4.2 ,  4.3 ,  4.39,  4.48,  4.57,  4.66,  4.73,  4.81,
4.89,  4.95,  5.02,  5.08,  5.13,  5.18,  5.23,  5.27,  5.3 ,
5.33,  5.35,  5.37,  5.38,  5.38,  5.38,  5.37,  5.36,  5.34,
5.31,  5.28,  5.24,  5.19,  5.13,  5.07,  5.  ,  4.93,  4.85,
4.76,  4.67,  4.57,  4.47,  4.37,  4.26,  4.15,  4.03,  3.92,
3.8 ,  3.68,  3.56,  3.44,  3.32,  3.21,  3.09,  2.97,  2.86,
2.74,  2.63,  2.52,  2.41,  2.3 ,  2.2 ,  2.09,  1.98,  1.88,
1.78,  1.68,  1.57,  1.47,  1.37,  1.27,  1.17,  1.08,  0.98,
0.89,  0.8 ,  0.71,  0.62,  0.54,  0.46,  0.39,  0.32,  0.26,
0.21,  0.16,  0.12,  0.09,  0.07,  0.06,  0.05,  0.06,  0.08,
0.1 ,  0.13,  0.18,  0.23,  0.29,  0.36,  0.44,  0.52,  0.61,
0.71,  0.81,  0.91,  1.02,  1.13,  1.25,  1.36,  1.49,  1.6 ,
1.73,  1.85,  1.97,  2.09,  2.21,  2.33,  2.45,  2.57,  2.68,
2.8 ,  2.91,  3.02,  3.13,  3.23,  3.33,  3.44,  3.53,  3.63,
3.72,  3.81,  3.9 ,  3.98,  4.06,  4.14,  4.21,  4.28,  4.34,
4.41,  4.46,  4.51,  4.56,  4.6 ,  4.64,  4.67,  4.69,  4.71,
4.73,  4.74,  4.74,  4.74,  4.72,  4.71,  4.69,  4.65,  4.62,
4.57,  4.52,  4.46,  4.4 ,  4.33,  4.25,  4.17,  4.08,  3.99,
3.89,  3.79,  3.68,  3.57,  3.46,  3.35,  3.23,  3.11,  3.  ,
2.88,  2.76,  2.65,  2.53,  2.42,  2.31,  2.2 ,  2.1 ,  1.99,
1.89,  1.79,  1.7 ,  1.6 ,  1.51,  1.42,  1.33,  1.25,  1.16,
1.08,  1.  ,  0.92,  0.85,  0.77,  0.7 ,  0.63,  0.56,  0.5 ,
0.44,  0.38,  0.33,  0.29,  0.24,  0.21,  0.18,  0.16,  0.15,
0.14,  0.14,  0.16,  0.18,  0.21,  0.25,  0.3 ,  0.36,  0.43,
0.5 ,  0.59,  0.68,  0.78,  0.89,  1.  ,  1.12,  1.24,  1.37,
1.5 ,  1.63,  1.77,  1.91,  2.05,  2.19,  2.32,  2.46,  2.6 ,
2.73,  2.87,  3.  ,  3.13,  3.25,  3.38,  3.5 ,  3.62,  3.73,
3.84,  3.95,  4.06,  4.16,  4.25,  4.35,  4.44,  4.52,  4.6 ,
4.68,  4.76,  4.83,  4.89,  4.95,  5.01,  5.07,  5.11,  5.16,
5.2 ,  5.23,  5.26,  5.29,  5.31,  5.32,  5.33,  5.33,  5.33,
5.32,  5.31,  5.29,  5.26,  5.23,  5.19,  5.14,  5.1 ,  5.04,
4.97,  4.9 ,  4.83,  4.75,  4.66,  4.57,  4.47,  4.37,  4.27,
4.16,  4.05,  3.93,  3.81,  3.69,  3.57,  3.45,  3.33,  3.21,
3.09,  2.97,  2.86,  2.74,  2.63,  2.51,  2.4 ,  2.29,  2.19,
2.08,  1.98,  1.88,  1.78,  1.68,  1.58,  1.49,  1.39,  1.3 ,
1.21,  1.12,  1.03,  0.94,  0.85,  0.77,  0.69,  0.6 ,  0.53,
0.46,  0.39,  0.33,  0.27,  0.22,  0.18,  0.14,  0.11,  0.09,
0.08,  0.08,  0.09,  0.1 ,  0.13,  0.16,  0.2 ,  0.26,  0.32,
0.39,  0.47,  0.55,  0.64,  0.74,  0.84,  0.95,  1.06,  1.17,
1.29,  1.41,  1.53,  1.66,  1.78,  1.9 ,  2.02,  2.15,  2.27,
2.39,  2.5 ,  2.62,  2.73,  2.84,  2.95,  3.05,  3.16,  3.26,
3.36,  3.45,  3.54,  3.63,  3.71,  3.8 ,  3.87,  3.95,  4.02,
4.09,  4.15,  4.22,  4.27,  4.32,  4.37,  4.42,  4.46,  4.49,
4.52,  4.55,  4.57,  4.58,  4.6 ,  4.6 ,  4.6 ,  4.59,  4.58,
4.56,  4.54,  4.51,  4.48,  4.43,  4.38,  4.33,  4.27,  4.2 ,
4.13,  4.05,  3.96,  3.87,  3.78,  3.68,  3.58,  3.47,  3.37,
3.25,  3.14,  3.03])
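Here is a small sketch of the same idea on a made-up two-row table, so the named fields are easier to see. The sample text and the names csv_text, table, and stacked are assumptions for illustration only.

```python
import io
import numpy as np

# Hypothetical sample with a header row.
csv_text = """TimeOffsetHours,Pred6,Backup
0.0,1.59,4.68
0.1,1.50,4.55
"""

table = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)

print(table.dtype.names)            # column names taken from the header
print(table['Pred6'])               # one column as a 1d float array
print(table[['Pred6', 'Backup']])   # several columns at once

# The fields can be stacked by hand into a plain 2d float array.
stacked = np.column_stack([table[name] for name in table.dtype.names])
```

Each row of table is a single record, so table itself is one-dimensional; stacked has one column per field.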

Unpacking the Columns

You can also have genfromtxt unpack the columns into separate arrays.

In [12]:
time, pred, backup, accoustic = np.genfromtxt('BatteryParkTideData.csv', delimiter=',', skip_header=1, missing_values='NA', unpack=True)

In [13]:
pred

Out[13]:
array([ 1.59,  1.5 ,  1.4 ,  1.31,  1.22,  1.13,  1.04,  0.95,  0.87,
0.78,  0.7 ,  0.62,  0.55,  0.48,  0.41,  0.34,  0.28,  0.23,
0.18,  0.14,  0.1 ,  0.08,  0.06,  0.05,  0.05,  0.06,  0.08,
0.11,  0.14,  0.19,  0.25,  0.31,  0.39,  0.47,  0.56,  0.66,
0.76,  0.87,  0.98,  1.1 ,  1.22,  1.35,  1.48,  1.61,  1.74,
1.88,  2.01,  2.15,  2.28,  2.41,  2.55,  2.68,  2.81,  2.94,
3.06,  3.19,  3.31,  3.43,  3.55,  3.66,  3.78,  3.89,  3.99,
4.1 ,  4.2 ,  4.3 ,  4.39,  4.48,  4.57,  4.66,  4.73,  4.81,
4.89,  4.95,  5.02,  5.08,  5.13,  5.18,  5.23,  5.27,  5.3 ,
5.33,  5.35,  5.37,  5.38,  5.38,  5.38,  5.37,  5.36,  5.34,
5.31,  5.28,  5.24,  5.19,  5.13,  5.07,  5.  ,  4.93,  4.85,
4.76,  4.67,  4.57,  4.47,  4.37,  4.26,  4.15,  4.03,  3.92,
3.8 ,  3.68,  3.56,  3.44,  3.32,  3.21,  3.09,  2.97,  2.86,
2.74,  2.63,  2.52,  2.41,  2.3 ,  2.2 ,  2.09,  1.98,  1.88,
1.78,  1.68,  1.57,  1.47,  1.37,  1.27,  1.17,  1.08,  0.98,
0.89,  0.8 ,  0.71,  0.62,  0.54,  0.46,  0.39,  0.32,  0.26,
0.21,  0.16,  0.12,  0.09,  0.07,  0.06,  0.05,  0.06,  0.08,
0.1 ,  0.13,  0.18,  0.23,  0.29,  0.36,  0.44,  0.52,  0.61,
0.71,  0.81,  0.91,  1.02,  1.13,  1.25,  1.36,  1.49,  1.6 ,
1.73,  1.85,  1.97,  2.09,  2.21,  2.33,  2.45,  2.57,  2.68,
2.8 ,  2.91,  3.02,  3.13,  3.23,  3.33,  3.44,  3.53,  3.63,
3.72,  3.81,  3.9 ,  3.98,  4.06,  4.14,  4.21,  4.28,  4.34,
4.41,  4.46,  4.51,  4.56,  4.6 ,  4.64,  4.67,  4.69,  4.71,
4.73,  4.74,  4.74,  4.74,  4.72,  4.71,  4.69,  4.65,  4.62,
4.57,  4.52,  4.46,  4.4 ,  4.33,  4.25,  4.17,  4.08,  3.99,
3.89,  3.79,  3.68,  3.57,  3.46,  3.35,  3.23,  3.11,  3.  ,
2.88,  2.76,  2.65,  2.53,  2.42,  2.31,  2.2 ,  2.1 ,  1.99,
1.89,  1.79,  1.7 ,  1.6 ,  1.51,  1.42,  1.33,  1.25,  1.16,
1.08,  1.  ,  0.92,  0.85,  0.77,  0.7 ,  0.63,  0.56,  0.5 ,
0.44,  0.38,  0.33,  0.29,  0.24,  0.21,  0.18,  0.16,  0.15,
0.14,  0.14,  0.16,  0.18,  0.21,  0.25,  0.3 ,  0.36,  0.43,
0.5 ,  0.59,  0.68,  0.78,  0.89,  1.  ,  1.12,  1.24,  1.37,
1.5 ,  1.63,  1.77,  1.91,  2.05,  2.19,  2.32,  2.46,  2.6 ,
2.73,  2.87,  3.  ,  3.13,  3.25,  3.38,  3.5 ,  3.62,  3.73,
3.84,  3.95,  4.06,  4.16,  4.25,  4.35,  4.44,  4.52,  4.6 ,
4.68,  4.76,  4.83,  4.89,  4.95,  5.01,  5.07,  5.11,  5.16,
5.2 ,  5.23,  5.26,  5.29,  5.31,  5.32,  5.33,  5.33,  5.33,
5.32,  5.31,  5.29,  5.26,  5.23,  5.19,  5.14,  5.1 ,  5.04,
4.97,  4.9 ,  4.83,  4.75,  4.66,  4.57,  4.47,  4.37,  4.27,
4.16,  4.05,  3.93,  3.81,  3.69,  3.57,  3.45,  3.33,  3.21,
3.09,  2.97,  2.86,  2.74,  2.63,  2.51,  2.4 ,  2.29,  2.19,
2.08,  1.98,  1.88,  1.78,  1.68,  1.58,  1.49,  1.39,  1.3 ,
1.21,  1.12,  1.03,  0.94,  0.85,  0.77,  0.69,  0.6 ,  0.53,
0.46,  0.39,  0.33,  0.27,  0.22,  0.18,  0.14,  0.11,  0.09,
0.08,  0.08,  0.09,  0.1 ,  0.13,  0.16,  0.2 ,  0.26,  0.32,
0.39,  0.47,  0.55,  0.64,  0.74,  0.84,  0.95,  1.06,  1.17,
1.29,  1.41,  1.53,  1.66,  1.78,  1.9 ,  2.02,  2.15,  2.27,
2.39,  2.5 ,  2.62,  2.73,  2.84,  2.95,  3.05,  3.16,  3.26,
3.36,  3.45,  3.54,  3.63,  3.71,  3.8 ,  3.87,  3.95,  4.02,
4.09,  4.15,  4.22,  4.27,  4.32,  4.37,  4.42,  4.46,  4.49,
4.52,  4.55,  4.57,  4.58,  4.6 ,  4.6 ,  4.6 ,  4.59,  4.58,
4.56,  4.54,  4.51,  4.48,  4.43,  4.38,  4.33,  4.27,  4.2 ,
4.13,  4.05,  3.96,  3.87,  3.78,  3.68,  3.58,  3.47,  3.37,
3.25,  3.14,  3.03])
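For a plain 2d read, unpack=True amounts to handing back the transpose, which is why the columns can be assigned to separate names. A minimal sketch on made-up data (csv_text, hours, level, and table are illustrative names):

```python
import io
import numpy as np

csv_text = "0.0,1.59\n0.1,1.50\n0.2,1.40\n"   # hypothetical two-column sample

# With unpack=True each column comes back as its own 1d array.
hours, level = np.genfromtxt(io.StringIO(csv_text), delimiter=',', unpack=True)

# Without it, the same columns are slices of one 2d array.
table = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
same_hours, same_level = table.T
```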

(Aside: Manually Creating Arrays)

As an aside, you can manually create arrays too. A common situation is to want to create an array from a list or other sequence. Easy:

In [14]:
np.array([2.3, 42, 5.6])

Out[14]:
array([  2.3,  42. ,   5.6])

NumPy also has routines for creating arrays of ones and zeros:

In [15]:
np.ones(10)

Out[15]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
In [16]:
np.zeros((2, 2))

Out[16]:
array([[ 0.,  0.],
[ 0.,  0.]])

And for creating arrays over ranges, either using a step size or a set number of points:

In [17]:
np.arange(10, 20, 1.6)

Out[17]:
array([ 10. ,  11.6,  13.2,  14.8,  16.4,  18. ,  19.6])
In [18]:
np.linspace(10, 20, 16)

Out[18]:
array([ 10.        ,  10.66666667,  11.33333333,  12.        ,
12.66666667,  13.33333333,  14.        ,  14.66666667,
15.33333333,  16.        ,  16.66666667,  17.33333333,
18.        ,  18.66666667,  19.33333333,  20.        ])
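The difference between the two is easiest to see side by side. In this sketch (the names stepped, spaced, and step are made up), arange leaves out the stop value while linspace includes it by default:

```python
import numpy as np

# arange: fixed step size, stop value excluded.
stepped = np.arange(0, 1, 0.25)

# linspace: fixed number of points, stop value included by default.
# retstep=True also returns the spacing that was used.
spaced, step = np.linspace(0, 1, 5, retstep=True)

print(stepped)   # four values, ending at 0.75
print(spaced)    # five values, ending at 1.0
print(step)      # 0.25
```

With a non-integer step, arange's length can be hard to predict because of floating point rounding, so linspace is often the safer choice when the number of points matters.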

And for getting random numbers:

In [19]:
np.random.random((2, 2))

Out[19]:
array([[ 0.76633572,  0.81414299],
[ 0.81736843,  0.27763528]])
In [20]:
np.random.standard_normal((2, 2))

Out[20]:
array([[-1.34426401, -0.8267184 ],
[-0.61234629, -0.9110464 ]])
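If you want random draws that can be reproduced, one option is a seeded generator. This is a sketch that assumes NumPy 1.17 or later, where np.random.default_rng is available; the seed 42 and the variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)          # 42 is an arbitrary seed
uniform = rng.random((2, 2))             # uniform on [0, 1)
normal = rng.standard_normal((2, 2))     # mean 0, standard deviation 1

# Re-creating the generator with the same seed repeats the draws.
again = np.random.default_rng(42).random((2, 2))
print(np.array_equal(uniform, again))    # True
```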

Array Stats

Back to the task at hand! We have these time, pred, backup, and accoustic arrays. What can we learn about them? There are a number of stats methods:

In [21]:
print(pred.min())
print(pred.max())
print(pred.mean())
print(pred.std())
print(np.median(pred))

0.05
5.38
2.65491666667
1.75053229801
2.74

In [22]:
# peak-to-peak (np.ptp also works on NumPy versions without the ptp method)
print(np.ptp(pred))

5.33

In [23]:
backup.max()

Out[23]:
nan

The backup array contains nan values in places where data was missing. nan combined with anything else gives nan so all of our stats methods return nan. To get our data without nan we'll need to do some fancy indexing.
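If all you need is the statistics, NumPy's nan-aware functions are another route. A minimal sketch on a made-up array (levels is not part of the tide data):

```python
import numpy as np

levels = np.array([4.68, np.nan, 4.46, 4.36])   # hypothetical sample

print(levels.max())         # nan, because nan propagates
print(np.nanmax(levels))    # 4.68, nan entries are skipped
print(np.nanmean(levels))   # mean of the three finite values
```

These functions skip the nan entries but leave them in the array, so indexing is still the way to go when the missing rows need to be removed.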

Indexing

One of the powerful features of NumPy arrays is the many ways they can be indexed. You can, for example, use a list or array of integers to grab specific elements from an array. The list can contain indices in any order and can even contain repeated indices:

In [24]:
pred[[100, 5, 1, 5, 100]]

Out[24]:
array([ 4.67,  1.13,  1.5 ,  1.13,  4.67])

(Note: The result has the same shape as the list or array used to index, and every index must be within the bounds of the array being indexed.)

It's also possible to index arrays using boolean expressions, similar to an if statement:

In [25]:
pred[pred > 5]

Out[25]:
array([ 5.02,  5.08,  5.13,  5.18,  5.23,  5.27,  5.3 ,  5.33,  5.35,
5.37,  5.38,  5.38,  5.38,  5.37,  5.36,  5.34,  5.31,  5.28,
5.24,  5.19,  5.13,  5.07,  5.01,  5.07,  5.11,  5.16,  5.2 ,
5.23,  5.26,  5.29,  5.31,  5.32,  5.33,  5.33,  5.33,  5.32,
5.31,  5.29,  5.26,  5.23,  5.19,  5.14,  5.1 ,  5.04])
In [26]:
pred[(pred > 5) | (pred < 0.5)]

Out[26]:
array([ 0.48,  0.41,  0.34,  0.28,  0.23,  0.18,  0.14,  0.1 ,  0.08,
0.06,  0.05,  0.05,  0.06,  0.08,  0.11,  0.14,  0.19,  0.25,
0.31,  0.39,  0.47,  5.02,  5.08,  5.13,  5.18,  5.23,  5.27,
5.3 ,  5.33,  5.35,  5.37,  5.38,  5.38,  5.38,  5.37,  5.36,
5.34,  5.31,  5.28,  5.24,  5.19,  5.13,  5.07,  0.46,  0.39,
0.32,  0.26,  0.21,  0.16,  0.12,  0.09,  0.07,  0.06,  0.05,
0.06,  0.08,  0.1 ,  0.13,  0.18,  0.23,  0.29,  0.36,  0.44,
0.44,  0.38,  0.33,  0.29,  0.24,  0.21,  0.18,  0.16,  0.15,
0.14,  0.14,  0.16,  0.18,  0.21,  0.25,  0.3 ,  0.36,  0.43,
5.01,  5.07,  5.11,  5.16,  5.2 ,  5.23,  5.26,  5.29,  5.31,
5.32,  5.33,  5.33,  5.33,  5.32,  5.31,  5.29,  5.26,  5.23,
5.19,  5.14,  5.1 ,  5.04,  0.46,  0.39,  0.33,  0.27,  0.22,
0.18,  0.14,  0.11,  0.09,  0.08,  0.08,  0.09,  0.1 ,  0.13,
0.16,  0.2 ,  0.26,  0.32,  0.39,  0.47])

How exactly does this work? Comparison expressions on arrays produce another array: an array of boolean values with the same shape as the original array, with True where the expression is true and False elsewhere. Let's see how that looks on a small array:

In [27]:
np.arange(10) > 5

Out[27]:
array([False, False, False, False, False, False,  True,  True,  True,  True], dtype=bool)

These boolean arrays can be saved in their own variables, combined logically with other boolean arrays, and used to index any array with the same shape.
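A short sketch of saving and combining masks, using made-up arrays (hours, level, early, and high are illustrative names). The operators are & for and, | for or, and ~ for not; when a comparison is written inline, it needs its own parentheses.

```python
import numpy as np

hours = np.array([0.0, 0.1, 0.2, 0.3, 0.4])        # hypothetical sample
level = np.array([1.59, 1.50, 1.40, 1.31, 1.22])

early = hours < 0.3    # boolean mask saved in a variable
high = level > 1.45

both = level[early & high]      # both conditions true
either = hours[early | high]    # at least one condition true
late = level[~early]            # first condition false

print(both)
print(either)
print(late)
```

Note that a mask built from one array (hours) can index a different array (level) because the two have the same shape.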

To get the indices where a condition is true use the numpy.where function:

In [28]:
np.where(np.arange(10) > 5)

Out[28]:
(array([6, 7, 8, 9]),)
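np.where returns a tuple with one index array per dimension, which is why the output above is wrapped in parentheses. A sketch on a made-up array (values, idx, and capped are illustrative names):

```python
import numpy as np

values = np.array([3, 8, 1, 9, 4])   # hypothetical sample

(idx,) = np.where(values > 3)        # unpack the one-element tuple
print(idx)                           # positions 1, 3 and 4
print(values[idx])                   # same result as values[values > 3]

# With three arguments, where picks elementwise between two choices.
capped = np.where(values > 5, 5, values)
print(capped)
```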

How does this help with the nan issue? Much like Python's standard library has a math.isnan function that works on floats, there is a numpy.isnan function that works on arrays. (In fact, NumPy has array equivalents to most of the functions in the math module.) Here's a small example of np.isnan:

In [29]:
a = np.array([1, 2, np.nan, 4, 5, np.nan])
np.isnan(a)

Out[29]:
array([False, False,  True, False, False,  True], dtype=bool)

np.isnan returns an array of booleans just like the logical expressions up above, so that looks promising! Let's try it:

In [30]:
a[np.isnan(a)]

Out[30]:
array([ nan,  nan])

Of course, that grabbed the nan values because np.isnan gives True where a has nan values. One thing to do is perform a logical flip on the boolean array using the ~ operator:

In [31]:
a[~np.isnan(a)]

Out[31]:
array([ 1.,  2.,  4.,  5.])

Or we could see what other kinds of logical functions there are in NumPy. One is numpy.isfinite:

In [32]:
a[np.isfinite(a)]

Out[32]:
array([ 1.,  2.,  4.,  5.])

We can use these functions, along with the any or all functions/methods on arrays, to test for the presence of missing data:

In [33]:
# are there any nan values?
print('time:', np.isnan(time).any())
print('backup:', np.isnan(backup).any())

time: False
backup: True

In [34]:
# are all of the values finite?
print('pred:', np.isfinite(pred).all())
print('accoustic:', np.isfinite(accoustic).all())

pred: True
accoustic: False


Both backup and accoustic have missing data. (These are the two columns of actual instrument measurements so it shouldn't be too shocking to see missing data.) We can use logical comparisons with arrays to make a boolean array of where backup and accoustic are both good:

In [35]:
not_nan = np.isfinite(backup) & np.isfinite(accoustic)


And then use this to make new copies of time, pred, backup, and accoustic without the rows where backup and accoustic are missing data:

In [36]:
time = time[not_nan]
pred = pred[not_nan]
backup = backup[not_nan]
accoustic = accoustic[not_nan]


How many rows did we lose?

In [37]:
not_nan.size - time.size

Out[37]:
6

Not too bad. Now we can get down to business!
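If the data were still in a single 2d array, like the one from the first genfromtxt call, the same filtering could be done in one step by testing whole rows. A sketch on a made-up table (table, rows_ok, and clean are illustrative names):

```python
import numpy as np

# Hypothetical sample: two of the four rows have a missing measurement.
table = np.array([[0.0, 1.59, 4.68, 4.65],
                  [0.1, 1.50, np.nan, 4.54],
                  [0.2, 1.40, 4.46, np.nan],
                  [0.3, 1.31, 4.36, 4.33]])

rows_ok = np.isfinite(table).all(axis=1)   # True where the whole row is finite
clean = table[rows_ok]                     # keep only the complete rows

print(rows_ok.size - rows_ok.sum())        # number of rows dropped
```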

Data Analysis (and Plots)

So what are we actually looking at here?

• time: time in hours since the first measurement in the file
• pred: predicted water level
• accoustic: a measured water level
• backup: another measured water level

All water levels are in feet above Mean Lower Low Water. Let's take a quick look:

In [38]:
print(time[:5])
print(pred[:5])
print(accoustic[:5])
print(backup[:5])

[ 0.   0.1  0.2  0.3  0.4]
[ 1.59  1.5   1.4   1.31  1.22]
[ 4.65  4.54  4.44  4.33  4.26]
[ 4.68  4.55  4.46  4.36  4.28]


Honestly, looking at a ton of numbers isn't a great way to get a feel for things. We could compare maxima:

In [39]:
backup.max() - pred.max()

Out[39]:
8.5

But are those at the same time? We can use the argmax method to get the index of the maxima of one and use that index in the other for a more apples-to-apples comparison:

In [40]:
m = backup.argmax()
print(m)
backup[m] - pred[m]

208

Out[40]:
9.2300000000000004

So at least according to the backup measurements the water level was more than 9 feet higher than predicted at one point during Hurricane Sandy!

But that's just one data point. To really see trends you want a plot. We'll start by turning on the IPython Notebook's inline plotting mode so that plots show up right here in our notebook:

In [41]:
%pylab inline

Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline].


Then configure plots to display as SVG (default is PNG):

In [42]:
%config InlineBackend.figure_format = 'svg'


Then we'll import matplotlib:

In [43]:
import matplotlib.pyplot as plt


And make a basic plot of our data, using time along the x-axis and plotting the predicted and measured levels as three separate lines:

In [44]:
fig, ax = plt.subplots()
ax.plot(time, pred)
ax.plot(time, accoustic)
ax.plot(time, backup)
ax.set_ylabel('Feet above MLLW')
ax.set_xlabel('Hours Since First Measurement')

Out[44]:
<matplotlib.text.Text at 0x1128fba90>

And to keep the lines straight we can throw in a legend:

In [45]:
fig, ax = plt.subplots()
ax.plot(time, pred, label='Predicted')
ax.plot(time, accoustic, label='Acoustic')
ax.plot(time, backup, label='Backup')
ax.set_ylabel('Feet above MLLW')
ax.set_xlabel('Hours Since First Measurement')
ax.legend(loc='upper right')

Out[45]:
<matplotlib.legend.Legend at 0x11297ca50>

Great! We can see that maybe something got a bit weird with the acoustic measurements and maybe we should trust the backup measurements more. Now maybe we'd like to quantify and plot the difference between the measured and predicted tide levels. Piece of cake:

In [46]:
obs_minus_pred = backup - pred

In [47]:
fig, ax = plt.subplots()
ax.plot(time, pred, label='Predicted')
ax.plot(time, backup, label='Backup')
ax.plot(time, obs_minus_pred, label='Difference')
ax.set_ylabel('Feet above MLLW')
ax.set_xlabel('Hours Since First Measurement')
ax.legend(loc='upper right')

Out[47]:
<matplotlib.legend.Legend at 0x1129b5e50>

Wait a minute, what did we just do there? We just used a minus sign to do an element-wise subtraction of two arrays! Pretty handy. Let's look at array arithmetic for a bit.

Array Math

Arrays can be used in arithmetic expressions using the same binary operators we use for numbers: +, -, *, /, **, etc. These expressions return new arrays in which the mathematical operation has been applied elementwise. It's easiest to see this when combining an array and a scalar:

In [61]:
a = np.arange(5, dtype=float) # float to avoid integer surprises
print(a)

[ 0.  1.  2.  3.  4.]

In [62]:
a + 5

Out[62]:
array([ 5.,  6.,  7.,  8.,  9.])
In [63]:
a * 5

Out[63]:
array([  0.,   5.,  10.,  15.,  20.])
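For comparison, here is a sketch of the same kind of arithmetic done on a plain Python list and on an array. The names values, shifted_list, shifted_array, and repeated are made up for illustration.

```python
import numpy as np

values = [0.0, 1.0, 2.0, 3.0, 4.0]

# Plain list: build a new list element by element.
shifted_list = [v + 5 for v in values]

# Array: the scalar is applied to every element in one expression.
shifted_array = np.array(values) + 5

# Careful: on a list, * means repetition, not multiplication.
repeated = values * 2

print(shifted_list)
print(shifted_array)
print(len(repeated))   # 10
```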

To get this same effect with lists you'd need to use a loop or a list comprehension. With arrays it's as simple as a + 5. When combining arrays and scalars the same operation is applied to every element of the array. When combining two arrays it's slightly different:

In [64]:
b = np.arange(10, 20, 2, dtype=float)
print(b)

[ 10.  12.  14.  16.  18.]

In [65]:
b - a

Out[65]:
array([ 10.,  11.,  12.,  13.,  14.])
In [66]:
a / b

Out[66]:
array([ 0.        ,  0.08333333,  0.14285714,  0.1875    ,  0.22222222])

In these cases the first element of a operates with the first element of b, and so on. So long as the two arrays are the same size and shape you will see this behavior. Arrays with different shapes can sometimes be combined with binary operators via an implicit resizing/reshaping called broadcasting, but that's a topic for another day.
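Just to give a flavor of broadcasting without going into the rules, here is a tiny sketch with made-up arrays (column, row, and grid are illustrative names). A column of shape (3, 1) and a row of shape (4,) combine into a (3, 4) result:

```python
import numpy as np

column = np.array([[0.0], [10.0], [20.0]])   # shape (3, 1)
row = np.array([1.0, 2.0, 3.0, 4.0])         # shape (4,)

# Each element of the column is paired with each element of the row.
grid = column + row

print(grid.shape)   # (3, 4)
print(grid)
```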

More Data Analysis

There might be more we could do with this data. When was the peak water height?

In [81]:
time[backup.argmax()]

Out[81]:
21.399999999999999

These are hours past midnight on Oct. 29, so the peak water height was sometime around 9:24 PM on Oct. 29. The biggest difference between the predicted and observed tide heights was at the same time:

In [79]:
time[obs_minus_pred.argmax()]

Out[79]:
21.399999999999999

Which is one reason NYC had such bad flooding.
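The conversion from an hour offset to a clock time can be checked with the standard library. This sketch assumes the offsets count from midnight on October 29, 2012, as stated above; the time zone is left out because the data file does not carry one, and the names start and peak are made up.

```python
from datetime import datetime, timedelta

start = datetime(2012, 10, 29)          # midnight at the start of the record
peak = start + timedelta(hours=21.4)    # offset reported by time[backup.argmax()]

print(peak.strftime('%Y-%m-%d %H:%M'))
```

Twenty-one hours is 9 PM and 0.4 of an hour is 24 minutes, which gives the 9:24 PM quoted above.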