Since the focus of this notebook is math, we note only one difference between Python 2 and 3: in Python 3, dividing two integers with / does not truncate or round down:
1/2
0.5
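For comparison, here is a small sketch of the two division operators in Python 3. The variable names are only illustrative.

```python
# True division: always returns a float, even when both operands are integers.
true_div = 7 / 2          # 3.5

# Floor division: rounds down (toward negative infinity),
# which is what / did for integers in Python 2.
floor_div = 7 // 2        # 3
neg_floor_div = -7 // 2   # -4, not -3

print(true_div, floor_div, neg_floor_div)
```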
OK, that's it for Python 3!
NumPy is the standard package for all kinds of operations on multidimensional arrays. Be aware that it originated as a merge of two packages and has evolved a lot over time, so it contains a lot of things that you shouldn't necessarily use. For example, there is a matrix class, but the general consensus is that you should not use it.
The array class (that is, the one that you should use) is called ndarray. Every array has a shape, which is a tuple of nonnegative sizes, one for each dimension of the array. For example, a 2x3 matrix has shape (2,3).
Dimensions in NumPy are also called axes. Note that they are not the same as the dimensions of a vector space in linear algebra.
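To make the distinction concrete, here is a minimal sketch (the names v and m are just for illustration). A vector in three-dimensional space is stored as an array with a single axis of size 3.

```python
import numpy as np

v = np.array([1., 2., 3.])   # a vector in R^3
m = np.zeros((2, 3))         # a 2x3 matrix

# v lives in a 3-dimensional vector space, but as an array it has only one axis.
print(v.ndim, v.shape)   # 1 (3,)
print(m.ndim, m.shape)   # 2 (2, 3)

# size is the total number of elements, i.e. the product of the shape.
print(m.size)            # 6
```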
import numpy as np
np.zeros((2, 3))
array([[ 0., 0., 0.], [ 0., 0., 0.]])
np.random.uniform(0, 1, (2, 3))
array([[ 0.41841411, 0.18258245, 0.62852128], [ 0.34715359, 0.61273193, 0.08825056]])
np.array([[0., 1., 2.], [3., 4., 5.]]) # default is row-major order
array([[ 0., 1., 2.], [ 3., 4., 5.]])
a = np.array([[0., 1., 2.], [3., 4., 5.]])
a[1, 2]
5.0
a[0]
array([ 0., 1., 2.])
a[0,:] # synonymous with a[0]
array([ 0., 1., 2.])
a[:,0]
array([ 0., 3.])
Note that unlike slices of lists, slices of arrays "point" back to the original array:
b = a[0]
b[0] = 6.
a
array([[ 6., 1., 2.], [ 3., 4., 5.]])
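The contrast with lists can be sketched as follows. If you want an independent array, one option is to call copy() on the slice; np.shares_memory is used here only to check which arrays share data.

```python
import numpy as np

# Slicing a list makes a copy, so the original list is unaffected.
lst = [0., 1., 2.]
lst_slice = lst[0:2]
lst_slice[0] = 6.
print(lst)                # [0.0, 1.0, 2.0]

a = np.array([[0., 1., 2.], [3., 4., 5.]])

# An explicit copy does not point back to a.
independent = a[0].copy()
independent[0] = 6.
print(a[0, 0])            # 0.0

# A plain slice is a view: writing to it changes a.
view = a[0]
view[0] = 7.
print(a[0, 0])            # 7.0

print(np.shares_memory(a, view), np.shares_memory(a, independent))
```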
a = np.array([[0., 1., 2.], [3., 4., 5.]])
b = np.array([[6., 7., 8.], [9., 10., 11.]])
a + b
array([[ 6., 8., 10.], [ 12., 14., 16.]])
a - b
array([[-6., -6., -6.], [-6., -6., -6.]])
a * b
array([[ 0., 7., 16.], [ 27., 40., 55.]])
a / b
array([[ 0. , 0.14285714, 0.25 ], [ 0.33333333, 0.4 , 0.45454545]])
np.exp(a)
array([[ 1. , 2.71828183, 7.3890561 ], [ 20.08553692, 54.59815003, 148.4131591 ]])
np.log(b) # natural log
array([[ 1.79175947, 1.94591015, 2.07944154], [ 2.19722458, 2.30258509, 2.39789527]])
np.tanh(a) # yes, we'll actually use this
array([[ 0. , 0.76159416, 0.96402758], [ 0.99505475, 0.9993293 , 0.9999092 ]])
In some cases, it's possible to apply a binary elementwise operation (like +) to two arrays with different shapes. Namely, if an axis has size 1, it can be "broadcast" to any size. This is easier to demonstrate by example.
a = np.zeros((2, 3))
b = np.array([[1., 2., 3.]])
b.shape
(1, 3)
a + b
array([[ 1., 2., 3.], [ 1., 2., 3.]])
b = np.array([[1.],[2.]])
b.shape
(2, 1)
a + b
array([[ 1., 1., 1.], [ 2., 2., 2.]])
If one array has fewer axes than the other, its shape is left padded with ones:
b = np.array([1., 2., 3.])
b.shape
(3,)
a + b
array([[ 1., 2., 3.], [ 1., 2., 3.]])
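Because the padding happens on the left, an array of shape (3,) lines up with the columns of a 2x3 matrix, but an array of shape (2,) does not line up with its rows. The following sketch, which uses subtraction of means purely as an illustration, shows both the failure and one way around it.

```python
import numpy as np

a = np.array([[0., 1., 2.], [3., 4., 5.]])   # shape (2, 3)

# Column means have shape (3,), which is padded to (1, 3) and broadcast to (2, 3).
col_means = a.mean(axis=0)
centered_cols = a - col_means

# Row means have shape (2,), which is padded to (1, 2); that does not match (2, 3).
row_means = a.mean(axis=1)
try:
    a - row_means
    failed = False
except ValueError:
    failed = True
print(failed)   # True

# Adding a trailing axis of size 1 gives shape (2, 1), which broadcasts to (2, 3).
centered_rows = a - np.expand_dims(row_means, 1)
print(centered_rows)
```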
Reductions perform some operation along an axis: for example, on all the rows or all the columns of a matrix. (If you don't specify an axis, the operation will be performed on the entire array.)
a
array([[ 0., 0., 0.], [ 0., 0., 0.]])
np.sum(a, axis=0) # all the columns
array([ 0., 0., 0.])
np.sum(a, axis=1) # all the rows
array([ 0., 0.])
np.max(a, axis=0)
array([ 0., 0., 0.])
np.min(a, axis=0)
array([ 0., 0., 0.])
np.argmax(a, axis=0) # which element is the max?
array([0, 0, 0])
np.argmin(a, axis=0) # which element is the min?
array([0, 0, 0])
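Since an array of zeros makes these results hard to tell apart, here is the same set of reductions sketched on a small array with distinct values (chosen arbitrarily for illustration).

```python
import numpy as np

a = np.array([[0., 5., 2.],
              [3., 4., 1.]])

col_sums = np.sum(a, axis=0)        # one sum per column: [3., 9., 3.]
row_sums = np.sum(a, axis=1)        # one sum per row: [7., 8.]
total = np.sum(a)                   # no axis: sum over the entire array, 15.0

col_max = np.max(a, axis=0)         # largest value in each column: [3., 5., 2.]
col_argmax = np.argmax(a, axis=0)   # row index of that value: [1, 0, 0]
row_argmin = np.argmin(a, axis=1)   # column index of each row's minimum: [0, 2]

print(col_sums, row_sums, total)
print(col_max, col_argmax, row_argmin)
```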
a.T
array([[ 0., 0.], [ 0., 0.], [ 0., 0.]])
Transpose operations don't create new arrays; they create views.
b = a.T
b[0,1] = 7.
a
array([[ 0., 0., 0.], [ 7., 0., 0.]])
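As a quick check of the index correspondence, the sketch below confirms that element [i, j] of the transpose is element [j, i] of the original, and that both share the same underlying data.

```python
import numpy as np

a = np.array([[0., 1., 2.], [3., 4., 5.]])   # shape (2, 3)
t = a.T                                      # shape (3, 2)

print(t.shape)                    # (3, 2)
print(t[2, 0] == a[0, 2])         # True
print(np.shares_memory(a, t))     # True: t is a view of a

# To get a transposed array that is independent of a, copy it explicitly.
t_copy = a.T.copy()
t_copy[0, 1] = 7.
print(a[1, 0])                    # 3.0, unchanged
```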
I hope you don't need it, but more complex rearrangements of axes are possible too:
c = np.zeros((2,3,4,5))
d = np.moveaxis(c, 1, 2) # move axis 1 to become axis 2
d.shape
(2, 4, 3, 5)
Arrays can be reshaped arbitrarily, but I imagine that the only reshaping you'll ever need is adding or removing axes of size one, in order to make broadcasting work the way you want.
e = np.expand_dims(c, 0) # add new axis 0 (with size 1)
e.shape
(1, 2, 3, 4, 5)
f = np.squeeze(e, 0)
f.shape
(2, 3, 4, 5)
Again, these operations don't create new arrays; they create views.
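Here is a minimal sketch of adding an axis of size one to steer broadcasting. The arrays x and y and the "addition table" are made up for illustration.

```python
import numpy as np

x = np.array([1., 2., 3.])    # shape (3,)
y = np.array([10., 20.])      # shape (2,)

# x + y would fail, because shapes (3,) and (2,) cannot be broadcast together.
# Turning y into a column of shape (2, 1) lets it broadcast against x.
col = np.expand_dims(y, 1)
table = col + x               # shapes (2, 1) and (3,) broadcast to (2, 3)
print(table)                  # rows: [11, 12, 13] and [21, 22, 23]

# col is a view of y, so writing to it changes y as well.
col[0, 0] = 100.
print(y)
```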
To do matrix multiplication, don't use *; instead use:
a = np.random.uniform(0, 1, (2,3))
b = np.random.uniform(0, 1, (3,4))
a @ b # requires Python 3.5+ and NumPy 1.10+
array([[ 0.35177723, 0.29972786, 0.50640218, 0.4781246 ], [ 0.77974407, 0.44074346, 0.57848922, 0.84393576]])
np.dot(a, b) # All versions
array([[ 0.35177723, 0.29972786, 0.50640218, 0.4781246 ], [ 0.77974407, 0.44074346, 0.57848922, 0.84393576]])
These two functions (@ calls np.matmul) behave differently for arrays with more than 2 axes. Hopefully, you will not need to know the difference for this class.
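In case you are curious, the difference shows up in the result shapes. In this small sketch (the shapes are chosen arbitrarily), @ treats the leading axis as a batch of matrices, whereas np.dot combines every matrix in the first array with every matrix in the second.

```python
import numpy as np

a = np.zeros((2, 3, 4))
b = np.zeros((2, 4, 5))

# matmul: two (3, 4) @ (4, 5) products, paired up along the leading axis.
matmul_shape = (a @ b).shape
print(matmul_shape)        # (2, 3, 5)

# dot: sums over the last axis of a and the second-to-last axis of b,
# keeping all of the remaining axes of both arrays.
dot_shape = np.dot(a, b).shape
print(dot_shape)           # (2, 3, 2, 5)
```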
The same operator/function works for vector-vector dot (inner) products, matrix-vector products, and (row) vector-matrix products.
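The three cases can be sketched as follows, with small made-up values so that the results are easy to verify by hand.

```python
import numpy as np

M = np.array([[1., 2., 3.],
              [4., 5., 6.]])     # shape (2, 3)
u = np.array([1., 0., 2.])       # shape (3,)
w = np.array([1., 1.])           # shape (2,)

inner = u @ u    # vector-vector: a scalar, 1 + 0 + 4 = 5.0
Mu = M @ u       # matrix-vector: shape (2,), [7., 16.]
wM = w @ M       # (row) vector-matrix: shape (3,), [5., 7., 9.]

print(inner, Mu, wM)
```

Note that a one-axis array is treated as a column vector when it appears on the right and as a row vector when it appears on the left, so no explicit transpose is needed.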