Matrix decomposition is a powerful tool for many machine learning problems and which has been widely used in data compression, dimensionality reduction, and sparsity learning, to name but a few. In many cases, for purposes of approximating a data matrix by a low-rank structure, Singular Value Decomposition (SVD) is often verified as the best choice. However, the accurate and efficient SVD of large data matrices (e.g., 8k-by-10k matrix) is computationally challenging. To resolve the SVD in this situation, there are many algorithms have been developed by applying randomized linear algebra methods. One of the most important algorithms is randomized SVD, which is competitively efficient for decomposing any large matrix with a relatively low rank.
This post will introduce the preliminary and essential idea of the randomized SVD. To help readers gain a better understanding of randomized SVD, we also provide the corresponding Python implementation in this post.
For reproducing this notebook, please clone or download the tensor-learning repository (https://github.com/xinychen/tensor-learning) on your computer first.
We start by recalling the concept of SVD. As you may already know, SVD is one of the most important decomposition formula in linear algebra. For any given matrix $\boldsymbol{A}$, SVD has the form of \begin{equation} \boldsymbol{A}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top \end{equation} where the matrices $\boldsymbol{U}$ and $\boldsymbol{V}$ consist of left and right singular vectors, respectively. The diagonal entries of $\boldsymbol{\Sigma}$ are singular values.
We will give a sufficiently detailed understanding with a small worked example. The problem is a simple SVD of 3-by-3 matrix, i.e., $$\boldsymbol{A}=\left(\begin{array}{cccc} 1 & 3 & 2 \\ 5 & 3 & 1 \\ 3 & 4 & 5 \\ \end{array}\right)\in\mathbb{R}^{3\times 3}.$$
Take this 3-by-3 matrix for example, we can compute the SVD by using numpy.linalg.svd()
in Python. Let us take a look:
import numpy as np
A = np.array([[1, 3, 2],
[5, 3, 1],
[3, 4, 5]])
u, s, v = np.linalg.svd(A, full_matrices = 0)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()
Left singular vectors: [[-0.37421754 0.28475648 -0.88253894] [-0.56470638 -0.82485997 -0.02669705] [-0.7355732 0.48838486 0.46948087]] Singular values: [9.34265841 3.24497827 1.08850813] Right singular vectors: [[-0.57847229 -0.61642675 -0.53421706] [-0.73171177 0.10269066 0.67383419] [ 0.36051032 -0.78068732 0.51045041]]
In this case, the singular values are 9.3427, 3.2450, and 1.0885.
Randomized SVD can be broken into three steps. For any given $m$-by-$n$ matrix $\boldsymbol{A}$, if we impose a target rank $k$ with $k < \min\{m, n\}$, then the first step is to
Note that the first step needs to return the $m$-by-$k$ matrix $\boldsymbol{Q}$.
Then, the second step as shown in Figure 3 is to
Note that the singular values (i.e., $\boldsymbol{\Sigma}$)and right singular vectors (i.e., $\boldsymbol{V}$) of the matrix $\boldsymbol{B}$ are also the singular values and right singular vectors of the matrix $\boldsymbol{A}$.
As shown in Figure 3, if we combine the matrix $\boldsymbol{Q}$ derived in the first step with the left singular vectors of $\boldsymbol{B}$, we can get the left singular vectors (i.e., $\boldsymbol{U}$) of the matrix $\boldsymbol{A}$ in the third step.
Even though we have learned the essential idea of randomized SVD in above, it would not be really clear if there is no intuitive example. To this end, we follow the aforementioned small matrix SVD.
First, let us try to write a Python function for randomized SVD. Here, we will use two Numpy functions, i.e., np.linalg.qr()
and np.linalg.svd()
.
import numpy as np
def rsvd(A, Omega):
Y = A @ Omega
Q, _ = np.linalg.qr(Y)
B = Q.T @ A
u_tilde, s, v = np.linalg.svd(B, full_matrices = 0)
u = Q @ u_tilde
return u, s, v
Now, let us test it with 3-by-3 matrix (rank = 2
for indicating $k$ with $k < \min(m, n)$):
np.random.seed(1000)
A = np.array([[1, 3, 2],
[5, 3, 1],
[3, 4, 5]])
rank = 2
Omega = np.random.randn(A.shape[1], rank)
u, s, v = rsvd(A, Omega)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()
Left singular vectors: [[ 0.38070859 0.60505354] [ 0.56830191 -0.74963644] [ 0.72944767 0.26824507]] Singular values: [9.34224023 3.02039888] Right singular vectors: [[ 0.57915029 0.61707064 0.53273704] [-0.77420021 0.21163814 0.59650929]]
Recall that the singular values of this matrix are 9.3427, 3.2450, and 1.0885. In this case, randomized SVD has the first two singular values as 9.3422 and 3.0204.
We can see that the first singular values computed by these two SVD algorithms are extremely close. However, the second singular value of randomized SVD has a slight bias. Is there any other method to improve this result? And how?
The answer is yes!
To improve the quality of randomized SVD, power iteration method can be used directly. For more detail about power iteration, please see the page 39 in [1] and there is also a Matlab implementation in the page 40.
In the following Python codes, power_iteration()
is the function for computing the $m$-by-$k$ matrix $\boldsymbol{Y}$ iteratively (the default power_iter
is 3) and then derive the $m$-by-$k$ matrix $\boldsymbol{Q}$ by QR decomposition.
import numpy as np
def power_iteration(A, Omega, power_iter = 3):
Y = A @ Omega
for q in range(power_iter):
Y = A @ (A.T @ Y)
Q, _ = np.linalg.qr(Y)
return Q
def rsvd(A, Omega):
Q = power_iteration(A, Omega)
B = Q.T @ A
u_tilde, s, v = np.linalg.svd(B, full_matrices = 0)
u = Q @ u_tilde
return u, s, v
Let us test our new rsvd()
function:
np.random.seed(1000)
A = np.array([[1, 3, 2],
[5, 3, 1],
[3, 4, 5]])
rank = 2
Omega = np.random.randn(A.shape[1], rank)
u, s, v = rsvd(A, Omega)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()
Left singular vectors: [[ 0.37421757 0.28528579] [ 0.56470638 -0.82484381] [ 0.73557319 0.48810317]] Singular values: [9.34265841 3.24497775] Right singular vectors: [[ 0.57847229 0.61642675 0.53421706] [-0.73178429 0.10284774 0.67373147]]
Recall that:
As you can see, the randomized SVD with power iteration provides extremely accurate singular values.
As mentioned above, it is possible to compress (low-rank) signal matrix using the SVD or randomized SVD. In fact, the way to compress an image using the SVD is rather simple: taking the SVD of the image directly and only keeping the dominant singular values and left/right singular vectors. In terms of randomized SVD, we can predefine the number of dominant singular values first, and then obtain the singular values and left/right singular vectors by the randomized SVD.
For our evaluation, we choose the color image of Lena as our data. The size of this image is $256\times 256\times 3$. Here, we build a matrix $\boldsymbol{A}$ of size $256\times 256$ by only selecting the green chanel.
import numpy as np
import imageio
image = imageio.imread('../datasets/color-images/lena.bmp')
A = image[:, :, 1]
%time u, s, v = np.linalg.svd(A, full_matrices = 0)
CPU times: user 28.3 ms, sys: 3.24 ms, total: 31.5 ms Wall time: 16.3 ms
import matplotlib.pyplot as plt
%matplotlib inline
plt.subplots(1, 2, figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(A)
plt.title('The original Lena at the green chanel.')
plt.axis('off')
plt.subplot(1, 2, 2)
rank = 50
plt.imshow(u[:, : rank] @ np.diag(s[: rank]) @ v[: rank, :])
plt.title('The rank-50 Lena at the green chanel.')
plt.axis('off')
plt.show()
plt.subplots(1, 2, figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(u[:, : rank] @ np.diag(s[: rank]) @ v[: rank, :])
plt.title('The rank-50 Lena using SVD.')
plt.axis('off')
rank = 50
Omega = np.random.randn(A.shape[1], rank)
%time u, s, v = rsvd(A, Omega)
plt.subplot(1, 2, 2)
plt.imshow(u[:, : rank] @ np.diag(s[: rank]) @ v[: rank, :])
plt.title('The rank-50 Lena using randomized SVD.')
plt.axis('off')
plt.show()
CPU times: user 10.5 ms, sys: 1.05 ms, total: 11.6 ms Wall time: 6.17 ms
We can see from this image compression experiment that:
rank = 50
).rank = 50
, the total CPU times of the randomized SVD is 11.6 ms, while the total CPU times of the SVD is 31.5 ms.In this post, you discovered the randomized linear algebra method for SVD. Specifically, you learned:
Do you have any question? Ask your question by creating an issue at the tensor-learning repository (https://github.com/xinychen/tensor-learning) and I will do my best to answer. If you find these codes useful, please star (★) this repository.
[1] Steven L. Brunton, J. Nathan Kutz (2019). Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Page 37–41.
[2] N. Benjamin Erichson, Sergey Voronin, Steven L. Brunton, J. Nathan Kutz (2016). Randomized Matrix Decompositions Using R. arXiv:1608.02148. [PDF]