Primer

Unidata Python Workshop

Overview:

Teaching: 20 minutes
Exercises: 10 minutes

Questions

What are arrays?
How can arrays be manipulated effectively in Python?

Objectives

Create an array of ‘data’.
Perform basic calculations on this data using Python math functions.
Slice and index the array.
Perform a meteorological calculation on an array of data using MetPy.

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

a powerful N-dimensional array object
sophisticated (broadcasting) functions
useful linear algebra, Fourier transform, and random number capabilities

The NumPy array object is the common interface for working with typed arrays of data across a wide variety of scientific Python packages. NumPy also features a C-API, which enables interfacing existing Fortran/C/C++ libraries with Python and NumPy.

1. Create an array of 'data' and slice/index it

The NumPy array represents a contiguous block of memory, holding entries of a given type (and hence fixed size). The entries are laid out in memory according to the shape, or list of dimension sizes.
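As a rough illustration of that memory layout, the sketch below inspects a small array. The name grid is only illustrative; shape, itemsize, nbytes, and strides are standard ndarray attributes.

```python
import numpy as np

# A 2 x 3 array of 64-bit floats: 6 entries of 8 bytes each
grid = np.zeros((2, 3), dtype=np.float64)

print(grid.shape)     # list of dimension sizes
print(grid.itemsize)  # bytes per entry, fixed by the dtype
print(grid.nbytes)    # total bytes = number of entries * itemsize
print(grid.strides)   # bytes to step along each dimension
```

The strides show how the shape maps onto the flat block of memory: moving one row down skips a full row of three 8-byte entries, while moving one column across skips a single entry.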

In [1]:

# Convention for import to get shortened namespace
import numpy as np

In [2]:

# Create a simple array from a list of integers
a = np.array([1, 2, 3])
a

Out[2]:

array([1, 2, 3])

In [3]:

# Print out the shape attribute
a.shape

Out[3]:

(3,)

In [4]:

# Print out the data type attribute
a.dtype

Out[4]:

dtype('int64')

In [5]:

# This time use a list of floats
a = np.array([1., 2., 3., 4., 5.])
a

Out[5]:

array([ 1.,  2.,  3.,  4.,  5.])

In [6]:

# Print out the shape attribute
a.shape

Out[6]:

(5,)

In [7]:

# Print out the data type attribute
a.dtype

Out[7]:

dtype('float64')

NumPy also provides helper functions that generate regularly spaced data, so you don't have to type the values out.

arange(start, stop, step) creates values in the half-open interval [start, stop), spaced step apart.
linspace(start, stop, num) creates num evenly spaced values over the closed interval [start, stop].

arange

In [8]:

a = np.arange(5)
print(a)

[0 1 2 3 4]

In [9]:

a = np.arange(3, 11)
print(a)

[ 3  4  5  6  7  8  9 10]

In [10]:

a = np.arange(1, 10, 2)
print(a)

[1 3 5 7 9]

linspace

In [11]:

b = np.linspace(5, 15, 5)
print(b)

[  5.    7.5  10.   12.5  15. ]

In [12]:

b = np.linspace(2.5, 10.25, 11)
print(b)

[  2.5     3.275   4.05    4.825   5.6     6.375   7.15    7.925   8.7
   9.475  10.25 ]
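The difference in how the two helpers treat the stop value can be seen side by side. This is a small sketch with made-up variable names; endpoint and retstep are optional keyword arguments of np.linspace.

```python
import numpy as np

# arange excludes the stop value ...
stepped = np.arange(0, 10, 2.5)

# ... while linspace includes it by default
closed = np.linspace(0, 10, 5)

# endpoint=False makes linspace half-open; retstep=True also returns the spacing
half_open, spacing = np.linspace(0, 10, 4, endpoint=False, retstep=True)

print(stepped)
print(closed)
print(half_open, spacing)
```

As a rule of thumb, arange is convenient when the step is known, and linspace when the number of points is known.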

2. Perform basic calculations with Python

Basic math

In core Python, adding two lists or arrays element by element has to be written out explicitly, like this:

In [13]:

a = np.arange(5, 10)
b = np.linspace(3, 4.5, 5)

In [14]:

print([x + y for x, y in zip(a,b)])

[8.0, 9.375, 10.75, 12.125, 13.5]

That is very verbose and not very intuitive. Using NumPy this becomes:

In [15]:

a + b

Out[15]:

array([  8.   ,   9.375,  10.75 ,  12.125,  13.5  ])

The four basic arithmetic operations all work the same way: each performs an element-by-element calculation on the two arrays. The two arrays must have the same shape, or shapes that NumPy can broadcast together.

In [16]:

a * b

Out[16]:

array([ 15.  ,  20.25,  26.25,  33.  ,  40.5 ])
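A minimal sketch of how the shape requirement plays out, including the broadcasting mentioned in the introduction. The variable names here are illustrative only.

```python
import numpy as np

a = np.arange(5, 10)

# A scalar is broadcast against every element
doubled = a * 2

# A (3, 1) column and a (4,) row broadcast to a (3, 4) result
column = np.arange(3).reshape(3, 1)
row = np.arange(4)
table = column + row

# Shapes that cannot be broadcast together raise a ValueError
shapes_clash = False
try:
    a + np.arange(3)
except ValueError as err:
    shapes_clash = True
    print(err)
```

Broadcasting only stretches dimensions of size 1 (or missing leading dimensions), which is why a length-5 array and a length-3 array cannot be combined.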

Constants

NumPy provides us access to some useful constants as well. Remember, you should never be typing these in manually! Other libraries such as SciPy and MetPy have their own sets of constants that are more domain specific.

In [17]:

np.pi

Out[17]:

3.141592653589793

In [18]:

np.e

Out[18]:

2.718281828459045

In [19]:

# This makes working with radians effortless!
t = np.arange(0, 2 * np.pi + np.pi / 4, np.pi / 4)
t

Out[19]:

array([ 0.        ,  0.78539816,  1.57079633,  2.35619449,  3.14159265,
        3.92699082,  4.71238898,  5.49778714,  6.28318531])

Array math functions

NumPy also has math functions that operate on whole arrays. As with the arithmetic operators, they make the code both simpler and faster than looping in Python. Be sure to check out the listing of mathematical functions in the NumPy documentation.

In [20]:

# Calculate the sine function
sin_t = np.sin(t)
print(sin_t)

[  0.00000000e+00   7.07106781e-01   1.00000000e+00   7.07106781e-01
   1.22464680e-16  -7.07106781e-01  -1.00000000e+00  -7.07106781e-01
  -2.44929360e-16]

In [21]:

# Round to three decimal places
print(np.round(sin_t, 3))

[ 0.     0.707  1.     0.707  0.    -0.707 -1.    -0.707 -0.   ]

In [22]:

# Calculate the cosine function
cos_t = np.cos(t)
print(cos_t)

[  1.00000000e+00   7.07106781e-01   6.12323400e-17  -7.07106781e-01
  -1.00000000e+00  -7.07106781e-01  -1.83697020e-16   7.07106781e-01
   1.00000000e+00]

In [23]:

# Convert radians to degrees
degrees = np.rad2deg(t)
print(degrees)

[   0.   45.   90.  135.  180.  225.  270.  315.  360.]

In [24]:

# Integrate the sine function with the trapezoidal rule
# (np.trapz is named np.trapezoid in NumPy 2.0 and later)
sine_integral = np.trapz(sin_t, t)
print(np.round(sine_integral, 3))

-0.0
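To see what the trapezoidal rule computes, the sketch below writes it out by hand and compares it with the built-in. It integrates the sine over half a period so the result is not trivially zero. The getattr fallback is an assumption made so the example runs on both older and newer NumPy releases.

```python
import numpy as np

t = np.linspace(0, np.pi, 9)
y = np.sin(t)

# Trapezoidal rule: average neighbouring samples, multiply by the
# width of each interval, and sum the pieces
manual = np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))

# Built-in: np.trapezoid in NumPy 2.0+, np.trapz in older releases
trapezoid = getattr(np, "trapezoid", None) or np.trapz
builtin = trapezoid(y, t)

print(manual, builtin)  # both approximate the exact integral, which is 2
```

With only nine sample points the estimate falls slightly short of 2; using more points brings it closer.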

In [25]:

# Sum the values of the cosine
cos_sum = np.sum(cos_t)
print(cos_sum)

1.0

In [26]:

# Calculate the cumulative sum of the cosine
cos_csum = np.cumsum(cos_t)
print(cos_csum)

[  1.00000000e+00   1.70710678e+00   1.70710678e+00   1.00000000e+00
   0.00000000e+00  -7.07106781e-01  -7.07106781e-01  -4.44089210e-16
   1.00000000e+00]

3. Slice and index the array

Indexing is how we pull individual data items out of an array. Slicing extends this process to pulling out a regular set of the items.

In [27]:

# Create an array for testing
a = np.arange(12).reshape(3, 4)

In [28]:

a

Out[28]:

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Indexing in Python is 0-based, so the command below looks for the 2nd item along the first dimension and the 3rd along the second dimension.

In [29]:

a[1, 2]

Out[29]:

6

We can also index along just the first dimension, which returns a whole row:

In [30]:

a[2]

Out[30]:

array([ 8,  9, 10, 11])

Negative indices are also allowed, which permit indexing relative to the end of the array.

In [31]:

a[0, -1]

Out[31]:

3

Slicing syntax is written as start:stop[:step], where all numbers are optional.

Defaults:
- start = 0
- stop = len(dim)
- step = 1

The second colon is also optional if no step is used.

Note that stop is one past the last item selected; you can think of the slice as the half-open interval [start, stop).
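The defaults can be checked directly by spelling them out. The sketch below also shows that a basic slice is a view onto the original data rather than a copy. Variable names are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Omitted parts fall back to start=0, stop=length, step=1
same_rows = np.array_equal(a[:2], a[0:2:1])
same_all = np.array_equal(a[:], a[0:a.shape[0]:1])

# stop is exclusive: a[1:3] selects the rows with index 1 and 2
rows = a[1:3]

# Basic slices are views, so writing through one changes the original
view = a[:, 1:3]
view[0, 0] = 99
print(a[0, 1])
```

If an independent copy is needed, call .copy() on the slice before modifying it.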

In [32]:

# Get the 2nd and 3rd rows
a[1:3]

Out[32]:

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [33]:

# All rows and 3rd column
a[:, 2]

Out[33]:

array([ 2,  6, 10])

In [34]:

# ... can be used to replace one or more full slices
a[..., 2]

Out[34]:

array([ 2,  6, 10])

In [35]:

# Slice every other row
a[::2]

Out[35]:

array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])

In [36]:

# Slice out every other column
a[:, ::2]

Out[36]:

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

In [37]:

# Slice every other item along each dimension -- how would we do this?

Out[37]:

array([[ 0,  2],
       [ 8, 10]])
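One possible way to get this result, offered as a sketch rather than the only answer, is to combine the two previous slices by giving a step of 2 for each dimension:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Step of 2 along the rows and along the columns
every_other = a[::2, ::2]
print(every_other)
```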

4. Perform a meteorological calculation on an array of data using MetPy

In [38]:

u = np.random.randint(0, 45, 10)
v = np.random.randint(0, 45, 10)

In [39]:

print(u)
print(v)

[35 15 32 26 42 33 30  5 10 37]
[44  0 44 12 27 12  5 40  0 27]

In [40]:

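import metpy.calc as mpcalc

# Wind speed and direction from the u and v components
# (newer MetPy releases name these functions wind_speed and wind_direction)
speed = mpcalc.get_wind_speed(u, v)
direction = mpcalc.get_wind_dir(u, v)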

In [41]:

print(speed)
print(np.rad2deg(direction))

[ 56.22277119  15.          54.40588203  28.63564213  49.92995093
  35.11409973  30.41381265  40.31128874  10.          45.80392996]
[ 218.50065372  270.          216.02737339  245.22485943  237.26477373  250.01689348  260.53767779  187.12501635  270.          233.88065915] degree

In [42]:

print(np.mean(speed))

36.5837377368

In [43]:

print(np.mean(np.rad2deg(direction)))
print(np.std(np.rad2deg(direction)))

238.85779070340064 degree
24.927870963856943 degree
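For intuition about what these functions return, the same quantities can be sketched with plain NumPy, using the first, second, and eighth u and v pairs printed above. This assumes the usual meteorological convention that direction is where the wind blows from, measured in degrees clockwise from north. It ignores units and special cases such as calm winds, which MetPy handles for you.

```python
import numpy as np

u = np.array([35, 15, 5])
v = np.array([44, 0, 40])

# Speed is the magnitude of the (u, v) vector
speed = np.hypot(u, v)

# Direction the wind blows FROM, in degrees clockwise from north
direction = (270 - np.rad2deg(np.arctan2(v, u))) % 360

print(speed)
print(direction)
```

These values agree with the corresponding entries in the MetPy output above.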

5. Use units from MetPy with calculations

In [47]:

# Import MetPy's units registry
from metpy.units import units

In [48]:

length = 8 * units.feet
print(length * length)

64 foot ** 2

In [49]:

distance = 10 * units.mile
time = 15 * units.minute
avg_speed = distance / time
print(avg_speed)
print(avg_speed.to_base_units())
print(avg_speed.to('mph'))

0.6666666666666666 mile / minute
17.8816 meter / second
40.0 mph
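The conversions that the units registry performs here can be verified by hand. The sketch below uses plain Python and the standard definitions of the mile, minute, and hour; the constant names are made up for this example.

```python
# Conversion factors (exact by definition)
METERS_PER_MILE = 1609.344
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3600

distance_m = 10 * METERS_PER_MILE
time_s = 15 * SECONDS_PER_MINUTE

# Base units: meters per second
speed_ms = distance_m / time_s

# Back to miles per hour
speed_mph = speed_ms * SECONDS_PER_HOUR / METERS_PER_MILE

print(speed_ms, speed_mph)
```

Doing this manually works, but every factor is a chance for a mistake, which is the motivation for letting the units library track them.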

In [50]:

# Let's use MetPy to calculate the dewpoint from the temperature and relative humidity
# (newer MetPy releases name this function dewpoint_from_relative_humidity)
import metpy.calc as mpcalc
mpcalc.dewpoint_rh(25 * units.degC, 0.75)

Out[50]:

20.264799097790046 degC

In [52]:

# Thanks to units, this can work with Fahrenheit as well
td = mpcalc.dewpoint_rh(77 * units.degF, 0.75)
td

Out[52]:

20.26479888333684 degC

In [53]:

# And you can convert it back to Fahrenheit with
td.to('degF')

Out[53]:

68.47663839000624 degF
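As a rough check on the dewpoint values, the calculation can be sketched without MetPy. The sketch assumes the Bolton (1980) approximation for saturation vapor pressure, which reproduces the numbers above closely. The function name dewpoint_from_rh is hypothetical and is not part of MetPy.

```python
import numpy as np

def dewpoint_from_rh(temp_c, rh):
    """Dewpoint in degC from temperature (degC) and relative humidity (0-1)."""
    # Saturation vapor pressure in hPa (Bolton 1980 approximation, assumed here)
    e_sat = 6.112 * np.exp(17.67 * temp_c / (temp_c + 243.5))
    # Actual vapor pressure
    e = rh * e_sat
    # Invert the saturation formula to find the temperature at which e saturates
    ln_ratio = np.log(e / 6.112)
    return 243.5 * ln_ratio / (17.67 - ln_ratio)

td_c = dewpoint_from_rh(25.0, 0.75)
td_f = td_c * 9 / 5 + 32  # manual Celsius to Fahrenheit conversion

print(td_c, td_f)
```

Unlike the MetPy call, this version only accepts Celsius, so a Fahrenheit input would have to be converted by hand first.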

Exercises

Using the function pressure_to_height_std, can you calculate the height of the 700 millibar level assuming a standard atmosphere?

In [ ]:

What is the windchill when the temperature is 263K and the winds are blowing at 20 m/s? (Bonus points: find it in Fahrenheit)

In [ ]:

Resources

The goal of this tutorial is to provide an overview of the use of the NumPy library. It tries to hit all of the important parts, but it is by no means comprehensive. For more information, try looking at the: