See https://docs.scipy.org/doc/numpy/user/quickstart.html for similar material.
(a,) = [1]
c, d = [1, 2]
begin, *middle, end = range(5)
print(a)
print(begin)
print(middle)
Numpy arrays are iterable, so we'll use this a few times in the upcoming material. You may also have seen it in my tests, since it's a handy way to get values from an iterable while also enforcing the length. This is also the basis for the swap idiom in Python:
c, d = d, c
Numpy is the most important numerical library available for Python. Several pieces of Python syntax were added explicitly to support Numpy, such as the Ellipsis and the matrix multiplication operator.
The most common way to import numpy is as follows:
import numpy as np
The numpy library is build around the array. Arrays are different from Python lists:
.shape
dtype
) onlyIf you have used Matlab before, note that arrays in Python support 0 and 1 dimensions besides the normal 2+, and that operations are elementwise.
Arrays can be created from lists and tuples (not all iterables, though - Numpy looks explicitly for nested lists and tuples to make multiple dimensions)
arr = np.array([1, 2, 3, 4])
arr
The shape of an array is a tuple, with one element per dimension
arr.shape
You can also access the dtype (type of the data):
arr.dtype
Numpy tries to pick the best dtype based on your input data; it will pick int if there are no floats:
print(np.array([1, 1]).dtype)
print(np.array([1, 1.0]).dtype)
print(np.array([1, 1], dtype=np.float64).dtype)
The good news: dtype is automatically upgraded if needed unless you are doing in-place operations
Operations on an array are defined as elementwise (with the exceptions of the logic operators - numpy uses bitwise operators instead).
arr ** 2
If you want to use logic operators, use the bitwise operators instead, and lots of parenthesis:
(arr == 1) | (arr == 2)
You can produce new arrays many different ways:
np.zeros(3)
np.zeros(3, dtype=np.int8)
np.ones([2, 2])
Multidiminsional arrays are supported, too. (If you know about array ordering, Numpy defaults to C order, and can do Fortran order also)
np.eye(3)
np.random.rand(2, 2)
np.random.randint(2, 5, (3, 4))
Numpy also provides UFuncs; functions that apply element-wise to an array. They do so without a Python loop, making them very fast. There are also a collection of other useful functions that do things like sum
without Python loops.
Technically, UFuncs must produce the same size output array. Ones that produce different output array sizes are called Generalized UFuncs, but that distinction will not matter to us for now.
np.sin(arr)
np.exp(arr)
np.sum(arr)
We can take an array and reshape it:
a = np.eye(4)
a
a.reshape(2, 8)
a.reshape(2, -1)
a.reshape(1, 16)
a.reshape(16)
a.flatten()
We can also add new empty axis:
b = np.array([1, 2, 3])
b[:, np.newaxis]
b[np.newaxis, :]
np.uint8(1)
np.float16(12.123456789123456789123456789)
Numpy provides very rich indexing tools. Let's see a few.
arr = np.arange(16).reshape(4, 4)
arr
Unlike matlab, a single index returns a row (or the remaining N-1 dimensions):
arr[0]
You could nest calls (arr[0][0]
), but Numpy uses Python's rich index support to make this simpler and faster:
arr[0, 0]
Slices are supported:
arr[:, 0]
Just like normal Python, a slice retains a dimension, while a single number does not.
Unless you work with large numbers of dimensions, this probably won't be useful, but ...
means "all the remaining dimensions":
...
for what it was intended for!arr[..., 0]
Boolean arrays also can index! (You can also uses lists, like I am doing in this example - but usually you will use a np.array, and it does not work with tuples)
np.arange(4)[[True, False, True, False]]
More realistic example: (note that due to the selection, output is always 1D)
arr[arr < 8]
Remember that arr[...] =
was one of the (few) allowed assignment expressions!
arr[arr < 8] = 1
arr
Remember I told you assignment operations were special? What do you think this might do? And why did I try adding the [:]
?
b = arr[arr < 8]
b[:] = 2
arr
The expression a[...] = b
turns into
a.__setitem__(..., b)
While a = b
unceremoniously replaces b
, no methods called. And, for b = a[...]
, a.__getitem__(...)
is called.
Note for the curious: the stuff in the
...
part gets turned into a tuple, unless it already is one. This is why tuples behave different than lists in indexing.
If you try to perform an operation, and a dimension is incompatible, if one of the incompatible values is length 1, it is replicated as necessary!
a = np.arange(4).reshape(2, 2)
b = np.array([1, 2])[:, np.newaxis]
print(a)
print(b)
print(a * b)
Suggestion: don't try to mix arrays that don't have matching dimension: broadcasting is not clear in intent there. The one exception: scalars (0D)
a + 2
x, y = np.mgrid[:3, :3]
# np.meshgrid(np.arange(3), np.arange(3))
x * 2 + y
x, y = np.ogrid[:3, :3]
x * 2 + y
This example is a famous fractal. It is quite simple to construct, but can be quite beautiful. The algorithm is:
Make a grid of complex values. Compute the value
$$ z_{n+1}=z_{n}^{2}+c $$While $z_n$ is bounded, save that iteration number.
We start by setting up the data:
import matplotlib.pyplot as plt
import numpy as np
# Size of the grid to evaluate on
size = (500, 500)
xs = np.linspace(-2, 2, size[0])
ys = np.linspace(-2, 2, size[1])
# We need to make a grid where values x = real and y = imag
x, y = np.meshgrid(xs, ys)
# We could also write: x, y = np.mgrid[-2:2:100j, -2:2:100j]
# Or use open grids and broadcasting
c = x + y * 1j
# Verify that we have correctly constructed the real and imaginary parts
fig, axs = plt.subplots(1, 2)
axs[0].imshow(c.real)
axs[1].imshow(c.imag)
z = np.zeros(size, dtype=np.complex)
it_matrix = np.zeros(size, dtype=np.int)
for n in range(30):
z = z ** 2 + c
it_matrix[np.abs(z) < 2] = n
plt.figure(figsize=(10, 10))
plt.imshow(it_matrix)
Let's rewrite this to continiously color the figure by the value when it diverged. We'll also be a bit more clever about avoiding warnings from numpy.
z = np.zeros(size, dtype=np.complex)
it_matrix = np.zeros(size, dtype=np.double)
for n in range(50):
z[it_matrix == 0] = z[it_matrix == 0] ** 2 + c[it_matrix == 0]
filt = (it_matrix == 0) & (np.abs(z) > 2)
it_matrix[filt] = n + 1 - np.log(np.log(np.abs(z[filt]))) / np.log(2)
plt.figure(figsize=(10, 10))
plt.imshow(it_matrix)
See also https://www.ibm.com/developerworks/community/blogs/jfp/entry/How_To_Compute_Mandelbrodt_Set_Quickly?lang=en for way too many ways to do mandelbrots in Python, along with performance measurements.