Now we know what these are, let's just import them now. We will be adding to this "essential modules" list as we go through the course.

In [1]:
import numpy as np
import matplotlib.pyplot as plt 
%matplotlib inline 
from IPython.display import Image # just ignore this one for now....  next lecture, you will learn about it.

Lecture 7:

  • Learn more about NumPy and matplotlib
  • Learn more about NumPy arrays.

NumPy and N-dimensional arrays

We briefly mentioned arrays in the last lecture but quickly moved into plotting (because that is more fun). But arrays are essential to our computational happiness, so we need to bite the bullet and learn about them now.

Arrays in Numpy are somewhat similar to lists but there are important differences with advantages and disadvantages. Unlike lists, arrays are usually all of the same data type (dtype), usually numbers (integers or floats) and at times characters. A "feature" of arrays is that the size, shape and type are fixed when it's created.

Remember, we can define a list:

L=[ ]

then append to it as desired using the L.append( ) method. It is more complicated (but still possible) to extend arrays.

Why use arrays when you can use lists? Arrays are far more computationally efficient than lists, particularly for things like matrix math. You can perform calculations on the entire array in one go instead of looping through element by element as for lists.

To make things a little confusing, there are several different data objects that are loosely called arrays, e.g., arrays, character arrays and matrices. These are all subclasses of ndarray (N-dimensional array). We will just worry about arrays in this course.

Apart from reading in a data file with NumPy, as we did in the last lecture, there are many different ways of creating arrays. Here are a few examples:

In [2]:
# define the values with the function array( ). For example a 3x3 array
A= np.array([[1, 2, 3],[4,2,0],[1,1,2]])
print (A)

# notice how there are no commas in arrays  
[[1 2 3]
 [4 2 0]
 [1 1 2]]

As we learned in the last lecture, NumPy can also generate an array using the np.arange( ) function which works in a manner similar to range( ) but creates an array with floats or integers. range( ) makes a list generator (in Python 3).

This is just a reminder from Lecture 6:

In [3]:
# use list(range( )) to generate a one-dimensional (1D) list that ranges 
# from the first arguement up to (but not including) the second, that
# increments by the third:
#we learned that range( ) creates a list generator for integers
B=list(range(10)) 
print ("List made by 'range': ",B)
B_integers=np.arange(0,10,1) #arange( ) is an np function that creates an array of integers
print ("Array made by np.arange( ): ", B_integers)
B_real=np.arange(0,10,.2) #  and with floats
print ("Array with real numbers: \n",B_real) # notice the "\n"? that creates a new line in the text string?
# Notice that while "range" makes a list of integers, arange makes an array of integers 
#   or real numbers.  
List made by 'range':  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Array made by np.arange( ):  [0 1 2 3 4 5 6 7 8 9]
Array with real numbers: 
 [0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8 2.  2.2 2.4 2.6 2.8 3.  3.2 3.4
 3.6 3.8 4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8 6.  6.2 6.4 6.6 6.8 7.
 7.2 7.4 7.6 7.8 8.  8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]

There are several ways to create special arrays, for example, arrays initialized by zeroes, ones, or any other value:

In [4]:
D=np.zeros((2,3)) # Notice the size is specified by a tuple of numbers of rows and columns.
print (D)
[[0. 0. 0.]
 [0. 0. 0.]]
In [5]:
E=np.ones((2,3))
print (E)
[[1. 1. 1.]
 [1. 1. 1.]]

To get any other value, just multiply your "ones" array by whatever number you want:

In [6]:
print (E*42)
[[42. 42. 42.]
 [42. 42. 42.]]

As you might have guessed, np.arange(start, end, step) generates numbers between two endpoints (start and up to but not including end) that are spaced by step.

At times, it is useful to have N numbers equally spaced between two endpoints. For this, we use the function np.linspace(start,end,N) which generates an array starting with $start$, going up to (and including!) $end$ with $N$ linearly spaced elements:

In [7]:
F=np.linspace(0,10,14) # give me 14 numbers between 0 and 10, including 0 and 10.
print  (F)
print (len(F))
[ 0.          0.76923077  1.53846154  2.30769231  3.07692308  3.84615385
  4.61538462  5.38461538  6.15384615  6.92307692  7.69230769  8.46153846
  9.23076923 10.        ]
14

To summarize:

np.linspace( ) creates an array with $N$ evenly spaced elements starting at $start$ and including the $end$ value, while np.arange( ) creates an array with elements at $step$ intervals between $start$ up to but NOT including the $end$ value.

Another trick for creating arrays, is to use the np.loadtext( ) function, which you encountered in Lecture 6. It reads in a data file into an array. This example uses a 'pathname' which we learned about in Lecture 1.

In [8]:
print (np.loadtxt('Datasets/RecentEarthquakes/earthquakeLocations.txt'))
[[  66.07 -149.03]
 [  38.83 -122.8 ]
 [  32.92 -115.47]
 [  47.63 -122.51]
 [  60.97 -151.24]
 [  37.65 -118.85]
 [  53.2  -161.3 ]
 [  19.41 -155.28]
 [  38.84 -122.83]
 [  36.88  -89.12]]

A few words about array types

In the last example, NumPy figured out what array type was required - it decided to make $B\_integers$ an integer array and $B\_real$, a floating point array without our having to specify the type. But what if we wanted a floating point array with numbers from 0. to 9. instead?

There are a few solutions to this. First, we could use floating points in the np.arange( ) call:

In [2]:
np.arange(0.,10.,1.)
Out[2]:
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

Or, we could specify the array type with the dtype argument, where dtype can be int, float, str, among others.

In [3]:
print (np.arange(0,10,1,dtype='float'))
print (np.arange(0,10,1,dtype='int'))
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
[0 1 2 3 4 5 6 7 8 9]

So, what is an object array? That would be an array that allows different data types:

In [11]:
np.array([[1, 2, 3],[4,2,0],['Xiao Long','Jill','Jose']],dtype='object')
Out[11]:
array([[1, 2, 3],
       [4, 2, 0],
       ['Xiao Long', 'Jill', 'Jose']], dtype=object)

But object arrays have their own limitations, e.g., you can't multiply the array by anything.

So, what happens if we define an array without initializing it? Let's make a 2x2 array of the dtype float.

In [12]:
G=np.ndarray(shape=(2,2), dtype=float) 
print  (G)   
[[5.e-324 5.e-324]
 [5.e-324 0.e+000]]

So... the array was initialized with teeny tiny numbers but not necessarily zeros.

Array attributes

Like other Python objects we have already encountered, arrays also have attributes and methods. As before, attributes do not have parentheses while methods do.

We will start by looking at array attributes which report on the state of the array.

As an example of the use of an attribute, we can find out what the data type of an array is with the attribute array.dtype:

In [13]:
D.dtype
Out[13]:
dtype('float64')

As you may have already figured out, arrays have dimensions and shape. Dimensions define the number of axes, as in the illustration below.

Rember our first array, $A$? It had two dimensions (axis 0 and 1). We can use the attribute ndim to find find this out:

In [2]:
Image(filename='Figures/ndim.jpg') # just ignore this - i want to show you the pretty picture.
Out[2]:
In [15]:
A= np.array([[1,2,3],[4,2,0],[1,1,2]]) # just to remind you
print ("the dimensions of A are: ",A.ndim)
the dimensions of A are:  2

Notice how np.zeros( ), np.ones( ) and np.ndarray( ) used a shape tuple in order to define the arrays in the examples above. The shape of an array tells us how many elements are along each axis. Python returns a tuple with the shape information if we use the shape attribute:

In [16]:
A.shape
Out[16]:
(3, 3)

Array methods

Arrays, like lists, have a bunch of methods, but the methods are different than the methods we learned about for lists. For example, you can append to an array, but the results may surprise you.

In [17]:
print (np.append(D,[2,2,2]))
[0. 0. 0. 0. 0. 0. 2. 2. 2.]

See how we now have a 1-D array? Not exactly what you expected? We can deal with that problem by reshaping the array, as we shall see. But first, you can also concatenate arrays which may be a simpler way to extend your array:

In [18]:
print (np.concatenate((D,E)))
[[0. 0. 0.]
 [0. 0. 0.]
 [1. 1. 1.]
 [1. 1. 1.]]

To solve the shape problem (2D versus 1D), you can take a 1D array and re-arrange it into a 2D array (as long as the total number of elements is the same). To do that, we use the array.reshape( ) method:

In [19]:
# we can take a 1D array with 50 elements and reshape it into, say a 5 X 10 2-D array:
B_real_2D=B_real.reshape((5,10))
print (B_real)
print (B_real_2D)
[0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8 2.  2.2 2.4 2.6 2.8 3.  3.2 3.4
 3.6 3.8 4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8 6.  6.2 6.4 6.6 6.8 7.
 7.2 7.4 7.6 7.8 8.  8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]
[[0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8]
 [2.  2.2 2.4 2.6 2.8 3.  3.2 3.4 3.6 3.8]
 [4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8]
 [6.  6.2 6.4 6.6 6.8 7.  7.2 7.4 7.6 7.8]
 [8.  8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]]

You can go the other way, taking a 2D (or more) array and turning it into one long 1D array using array.flatten( ).

In [20]:
B_real_1D=B_real_2D.flatten()
print (B_real_1D)
[0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8 2.  2.2 2.4 2.6 2.8 3.  3.2 3.4
 3.6 3.8 4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8 6.  6.2 6.4 6.6 6.8 7.
 7.2 7.4 7.6 7.8 8.  8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]

Another super useful array method is array.transpose() which swaps rows and columns:

In [21]:
print (B_real_2D)
print (B_real_2D.transpose())
[[0.  0.2 0.4 0.6 0.8 1.  1.2 1.4 1.6 1.8]
 [2.  2.2 2.4 2.6 2.8 3.  3.2 3.4 3.6 3.8]
 [4.  4.2 4.4 4.6 4.8 5.  5.2 5.4 5.6 5.8]
 [6.  6.2 6.4 6.6 6.8 7.  7.2 7.4 7.6 7.8]
 [8.  8.2 8.4 8.6 8.8 9.  9.2 9.4 9.6 9.8]]
[[0.  2.  4.  6.  8. ]
 [0.2 2.2 4.2 6.2 8.2]
 [0.4 2.4 4.4 6.4 8.4]
 [0.6 2.6 4.6 6.6 8.6]
 [0.8 2.8 4.8 6.8 8.8]
 [1.  3.  5.  7.  9. ]
 [1.2 3.2 5.2 7.2 9.2]
 [1.4 3.4 5.4 7.4 9.4]
 [1.6 3.6 5.6 7.6 9.6]
 [1.8 3.8 5.8 7.8 9.8]]

Slicing and indexing ndarrays

The syntax for slicing an array is similar to that for a list:

In [22]:
B=A[0:2] # access the top two lines  of matrix A 
print (B)
[[1 2 3]
 [4 2 0]]

For many more methods and attributes of ndarrays, visit the NumPy Reference website: http://docs.scipy.org/doc/numpy/reference/.

Converting between Data Structures

We can convert from an array to a list:

In [23]:
L=A.tolist()
print ("Original array: \t", type(A))
print ("List form: \t\t", type(L))
print (A)
print (L)

# notice the commas, the array turned into  a list of three lists
Original array: 	 <class 'numpy.ndarray'>
List form: 		 <class 'list'>
[[1 2 3]
 [4 2 0]
 [1 1 2]]
[[1, 2, 3], [4, 2, 0], [1, 1, 2]]

From a list to an array:

In [24]:
AfromL=np.array(L)# from a list
print ('AfromL: ')
print (AfromL)
AfromL: 
[[1 2 3]
 [4 2 0]
 [1 1 2]]

Or from a tuple to an array:

In [25]:
AfromT=np.array((4,2)) # from a tuple 
print ('AfromT: ')
print (AfromT)
AfromT: 
[4 2]

Saving NumPy arrays as text files

Having created, sliced and diced an array, it is often handy to save the data to a file for later use. We can do that with the command np.savetxt( ).

Let's save our A array to a file called A.txt.

In [26]:
np.savetxt('A.txt',A)
In [27]:
#clean up
!rm A.txt