You can read an overview of this Numerical Linear Algebra course in this blog post. The course was originally taught in the University of San Francisco MS in Analytics graduate program. Course lecture videos are available on YouTube (note that the notebook numbers and video numbers do not line up, since some notebooks took longer than 1 video to cover).
You can ask questions about the course on our fast.ai forums.
The term broadcasting describes how arrays with different shapes are treated during arithmetic operations. The term broadcasting was first used by Numpy, although is now used in other libraries such as Tensorflow and Matlab; the rules can vary by library.
From the Numpy Documentation:
Broadcasting provides a means of vectorizing array operations so that looping
occurs in C instead of Python. It does this without making needless copies of data
and usually leads to efficient algorithm implementations.
The simplest example of broadcasting occurs when multiplying an array by a scalar.
a = np.array([1.0, 2.0, 3.0])
b = 2.0
a * b
array([ 2., 4., 6.])
v=np.array([1,2,3])
print(v, v.shape)
[1 2 3] (3,)
m=np.array([v,v*2,v*3]); m, m.shape
(array([[1, 2, 3], [2, 4, 6], [3, 6, 9]]), (3, 3))
n = np.array([m*1, m*5])
n
array([[[ 1, 2, 3], [ 2, 4, 6], [ 3, 6, 9]], [[ 5, 10, 15], [10, 20, 30], [15, 30, 45]]])
n.shape, m.shape
((2, 3, 3), (3, 3))
We can use broadcasting to add a matrix and an array:
m+v
array([[ 2, 4, 6], [ 3, 6, 9], [ 4, 8, 12]])
Notice what happens if we transpose the array:
v1=np.expand_dims(v,-1); v1, v1.shape
(array([[1], [2], [3]]), (3, 1))
m+v1
array([[ 2, 3, 4], [ 4, 6, 8], [ 6, 9, 12]])
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions, and works its way forward. Two dimensions are compatible when
Arrays do not need to have the same number of dimensions. For example, if you have a $256 \times 256 \times 3$ array of RGB values, and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values. Lining up the sizes of the trailing axes of these arrays according to the broadcast rules, shows that they are compatible:
Image (3d array): 256 x 256 x 3
Scale (1d array): 3
Result (3d array): 256 x 256 x 3
v = np.array([1,2,3,4])
m = np.array([v,v*2,v*3])
A = np.array([5*m, -1*m])
v.shape, m.shape, A.shape
((4,), (3, 4), (2, 3, 4))
Will the following operations work?
A
array([[[ 5, 10, 15], [10, 20, 30], [15, 30, 45]], [[-1, -2, -3], [-2, -4, -6], [-3, -6, -9]]])
A + v
array([[[ 6, 12, 18], [11, 22, 33], [16, 32, 48]], [[ 0, 0, 0], [-1, -2, -3], [-2, -4, -6]]])
A
array([[[ 5, 10, 15, 20], [ 10, 20, 30, 40], [ 15, 30, 45, 60]], [[ -1, -2, -3, -4], [ -2, -4, -6, -8], [ -3, -6, -9, -12]]])
A.T.shape
(4, 3, 2)
A.T
array([[[ 5, -1], [ 10, -2], [ 15, -3]], [[ 10, -2], [ 20, -4], [ 30, -6]], [[ 15, -3], [ 30, -6], [ 45, -9]], [[ 20, -4], [ 40, -8], [ 60, -12]]])
A matrix with lots of zeros is called sparse (the opposite of sparse is dense). For sparse matrices, you can save a lot of memory by only storing the non-zero values.
Another example of a large, sparse matrix:
[Source](https://commons.wikimedia.org/w/index.php?curid=2245335)There are the most common sparse storage formats:
Let's walk through these examples
There are actually many more formats as well.
A class of matrices (e.g, diagonal) is generally called sparse if the number of non-zero elements is proportional to the number of rows (or columns) instead of being proportional to the product rows x columns.
Scipy Implementation
From the Scipy Sparse Matrix Documentation
"Can Maths really save your life? Of course it can!!" (lovely article)
(CAT and CT scan refer to the same procedure. CT scan is the more modern term)
This lesson is based off the Scikit-Learn example Compressive sensing: tomography reconstruction with L1 prior (Lasso)
Take the readings from a CT scan and construct what the original looks like.
For each x-ray (at a particular position and particular angle), we get a single measurement. We need to construct the original picture just from these measurements. Also, we don't want the patient to experience a ton of radiation, so we are gathering less data than the area of the picture.
In the previous lesson, we used Robust PCA for background removal of a surveillance video. We saw that this could be written as the optimization problem:
$$ minimize\; \lVert L \rVert_* + \lambda\lVert S \rVert_1 \\ subject\;to\; L + S = M$$Question: Do you remember what is special about the L1 norm?
We will see that:
Resources: Compressed Sensing
%matplotlib inline
import numpy as np, matplotlib.pyplot as plt, math
from scipy import ndimage, sparse
np.set_printoptions(suppress=True)
We will use generated data today (not real CT scans). There is some interesting numpy and linear algebra involved in generating the data, and we will return to that later.
Code is from this Scikit-Learn example Compressive sensing: tomography reconstruction with L1 prior (Lasso)
def generate_synthetic_data():
rs = np.random.RandomState(0)
n_pts = 36
x, y = np.ogrid[0:l, 0:l]
mask_outer = (x - l / 2) ** 2 + (y - l / 2) ** 2 < (l / 2) ** 2
mx,my = rs.randint(0, l, (2,n_pts))
mask = np.zeros((l, l))
mask[mx,my] = 1
mask = ndimage.gaussian_filter(mask, sigma=l / n_pts)
res = (mask > mask.mean()) & mask_outer
return res ^ ndimage.binary_erosion(res)
l = 128
data = generate_synthetic_data()
plt.figure(figsize=(5,5))
plt.imshow(data, cmap=plt.cm.gray);
l=8; n_pts=5
rs = np.random.RandomState(0)
x, y = np.ogrid[0:l, 0:l]; x,y
(array([[0], [1], [2], [3], [4], [5], [6], [7]]), array([[0, 1, 2, 3, 4, 5, 6, 7]]))
x + y
array([[ 0, 1, 2, 3, 4, 5, 6, 7], [ 1, 2, 3, 4, 5, 6, 7, 8], [ 2, 3, 4, 5, 6, 7, 8, 9], [ 3, 4, 5, 6, 7, 8, 9, 10], [ 4, 5, 6, 7, 8, 9, 10, 11], [ 5, 6, 7, 8, 9, 10, 11, 12], [ 6, 7, 8, 9, 10, 11, 12, 13], [ 7, 8, 9, 10, 11, 12, 13, 14]])
(x - l/2) ** 2
array([[ 16.], [ 9.], [ 4.], [ 1.], [ 0.], [ 1.], [ 4.], [ 9.]])
(x - l/2) ** 2 + (y - l/2) ** 2
array([[ 32., 25., 20., 17., 16., 17., 20., 25.], [ 25., 18., 13., 10., 9., 10., 13., 18.], [ 20., 13., 8., 5., 4., 5., 8., 13.], [ 17., 10., 5., 2., 1., 2., 5., 10.], [ 16., 9., 4., 1., 0., 1., 4., 9.], [ 17., 10., 5., 2., 1., 2., 5., 10.], [ 20., 13., 8., 5., 4., 5., 8., 13.], [ 25., 18., 13., 10., 9., 10., 13., 18.]])
mask_outer = (x - l/2) ** 2 + (y - l/2) ** 2 < (l/2) ** 2; mask_outer
array([[False, False, False, False, False, False, False, False], [False, False, True, True, True, True, True, False], [False, True, True, True, True, True, True, True], [False, True, True, True, True, True, True, True], [False, True, True, True, True, True, True, True], [False, True, True, True, True, True, True, True], [False, True, True, True, True, True, True, True], [False, False, True, True, True, True, True, False]], dtype=bool)
plt.imshow(mask_outer, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd9303278>
mask = np.zeros((l, l))
mx,my = rs.randint(0, l, (2,n_pts))
mask[mx,my] = 1; mask
array([[ 0., 1., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0., 0.], [ 0., 0., 0., 0., 0., 0., 0., 1.], [ 0., 0., 0., 0., 0., 0., 0., 0.], [ 0., 0., 0., 1., 0., 0., 0., 0.]])
plt.imshow(mask, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd9293940>
mask = ndimage.gaussian_filter(mask, sigma=l / n_pts)
plt.imshow(mask, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd922c0b8>
res = np.logical_and(mask > mask.mean(), mask_outer)
plt.imshow(res, cmap='gray');
plt.imshow(ndimage.binary_erosion(res), cmap='gray');
plt.imshow(res ^ ndimage.binary_erosion(res), cmap='gray');
def _weights(x, dx=1, orig=0):
x = np.ravel(x)
floor_x = np.floor((x - orig) / dx)
alpha = (x - orig - floor_x * dx) / dx
return np.hstack((floor_x, floor_x + 1)), np.hstack((1 - alpha, alpha))
def _generate_center_coordinates(l_x):
X, Y = np.mgrid[:l_x, :l_x].astype(np.float64)
center = l_x / 2.
X += 0.5 - center
Y += 0.5 - center
return X, Y
def build_projection_operator(l_x, n_dir):
X, Y = _generate_center_coordinates(l_x)
angles = np.linspace(0, np.pi, n_dir, endpoint=False)
data_inds, weights, camera_inds = [], [], []
data_unravel_indices = np.arange(l_x ** 2)
data_unravel_indices = np.hstack((data_unravel_indices,
data_unravel_indices))
for i, angle in enumerate(angles):
Xrot = np.cos(angle) * X - np.sin(angle) * Y
inds, w = _weights(Xrot, dx=1, orig=X.min())
mask = (inds >= 0) & (inds < l_x)
weights += list(w[mask])
camera_inds += list(inds[mask] + i * l_x)
data_inds += list(data_unravel_indices[mask])
proj_operator = sparse.coo_matrix((weights, (camera_inds, data_inds)))
return proj_operator
l = 128
proj_operator = build_projection_operator(l, l // 7)
proj_operator
<2304x16384 sparse matrix of type '<class 'numpy.float64'>' with 555378 stored elements in COOrdinate format>
dimensions: angles (l//7), positions (l), image for each (l x l)
proj_t = np.reshape(proj_operator.todense().A, (l//7,l,l,l))
The first coordinate refers to the angle of the line, and the second coordinate refers to the location of the line.
The lines for the angle indexed with 3:
plt.imshow(proj_t[3,0], cmap='gray');
plt.imshow(proj_t[3,1], cmap='gray');
plt.imshow(proj_t[3,2], cmap='gray');
plt.imshow(proj_t[3,40], cmap='gray');
Other lines at vertical location 40:
plt.imshow(proj_t[4,40], cmap='gray');
plt.imshow(proj_t[15,40], cmap='gray');
plt.imshow(proj_t[17,40], cmap='gray');
Next, we want to see how the line intersects with our data. Remember, this is what the data looks like:
plt.figure(figsize=(5,5))
plt.imshow(data, cmap=plt.cm.gray)
plt.axis('off')
plt.savefig("images/data.png")
proj = proj_operator @ data.ravel()[:, np.newaxis]
An x-ray at angle 17, location 40 passing through the data:
plt.figure(figsize=(5,5))
plt.imshow(data + proj_t[17,40], cmap=plt.cm.gray)
plt.axis('off')
plt.savefig("images/data_xray.png")
Where they intersect:
both = data + proj_t[17,40]
plt.imshow((both > 1.1).astype(int), cmap=plt.cm.gray);
The intensity of an x-ray at angle 17, location 40 passing through the data:
np.resize(proj, (l//7,l))[17,40]
6.4384498372605989
The intensity of an x-ray at angle 3, location 14 passing through the data:
plt.imshow(data + proj_t[3,14], cmap=plt.cm.gray);
Where they intersect:
both = data + proj_t[3,14]
plt.imshow((both > 1.1).astype(int), cmap=plt.cm.gray);
The measurement from the CT scan would be a small number here:
np.resize(proj, (l//7,l))[3,14]
2.1374953737965541
proj += 0.15 * np.random.randn(*proj.shape)
a = [1,2,3]
b= [4,5,6]
c = list(zip(a, b))
c
[(1, 4), (2, 5), (3, 6)]
list(zip(*c))
[(1, 2, 3), (4, 5, 6)]
plt.figure(figsize=(7,7))
plt.imshow(np.resize(proj, (l//7,l)), cmap='gray')
plt.axis('off')
plt.savefig("images/proj.png")
Now we will try to recover the data just from the projections (the measurements of the CT scan)
Our matrix $A$ is the projection operator. This was our 4d matrix above (angle, location, x, y) of the different x-rays:
plt.figure(figsize=(12,12))
plt.title("A: Projection Operator")
plt.imshow(proj_operator.todense().A, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd4199a20>
We are solving for $x$, the original data. We (un)ravel the 2D data into a single column.
plt.figure(figsize=(5,5))
plt.title("x: Image")
plt.imshow(data, cmap='gray')
plt.figure(figsize=(4,12))
# I am tiling the column so that it's easier to see
plt.imshow(np.tile(data.ravel(), (80,1)).T, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd3903be0>
Our vector $b$ is the (un)raveled matrix of measurements:
plt.figure(figsize=(8,8))
plt.imshow(np.resize(proj, (l//7,l)), cmap='gray')
plt.figure(figsize=(10,10))
plt.imshow(np.tile(proj.ravel(), (20,1)).T, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd3bebf28>
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
# Reconstruction with L2 (Ridge) penalization
rgr_ridge = Ridge(alpha=0.2)
rgr_ridge.fit(proj_operator, proj.ravel())
rec_l2 = rgr_ridge.coef_.reshape(l, l)
plt.imshow(rec_l2, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd453d5c0>
18*128
2304
18 x 128 x 128 x 128
proj_operator.shape
(2304, 16384)
# Reconstruction with L1 (Lasso) penalization
# the best value of alpha was determined using cross validation
# with LassoCV
rgr_lasso = Lasso(alpha=0.001)
rgr_lasso.fit(proj_operator, proj.ravel())
rec_l1 = rgr_lasso.coef_.reshape(l, l)
plt.imshow(rec_l1, cmap='gray')
<matplotlib.image.AxesImage at 0x7efcd4919cf8>
The L1 penalty works significantly better than the L2 penalty here!