The SVD APIs in numpy, tensorflow and pytorch differ in several respects, which I demonstrate briefly here.
import numpy as np
import tensorflow as tf
import torch
import platform, sys
platform.platform(), sys.version # OS, python system info
('Linux-4.15.0-43-generic-x86_64-with-Ubuntu-16.04-xenial', '3.5.2 (default, Nov 12 2018, 13:43:14) \n[GCC 5.4.0 20160609]')
np.__version__, tf.__version__, torch.__version__ # library version info
('1.16.2', '1.13.1', '1.0.1.post2')
tf.enable_eager_execution()
General SVD theory: $A = U S V^H$, where $H$ denotes the conjugate transpose and $U$, $V$ are unitary matrices. For $A$ of shape $n \times m$, the factors $U$, $S$, $V^H$ have shapes $n \times n$, $n \times m$, $m \times m$ respectively. $S$ has nonzero elements only on its main diagonal; these are called the singular values.
Furthermore, assuming $n > m$ without loss of generality, we can recast the decomposition as $A_{n \times m} = U'_{n \times m} S'_{m \times m} V^H_{m \times m}$, where $U'$ consists of the first $m$ columns of $U$ and $S'$ of the first $m$ rows of $S$. Some libraries return this reduced decomposition by default; the behavior is usually controlled by the full_matrices flag.
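The relation between the full and reduced forms can be checked numerically. The following is a minimal sketch using only NumPy; the 5×3 shape, the fixed seed and all variable names are arbitrary choices for illustration and are not part of the demo that follows.

```python
import numpy as np

# Illustrative matrix with n=5 rows and m=3 columns (n > m); the seed is arbitrary
A = np.random.RandomState(0).rand(5, 3)
n, m = A.shape

# Full decomposition: U is n x n, s holds the m singular values, Vh is m x m
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Build the n x m matrix S with the singular values on its main diagonal
S = np.zeros((n, m))
S[:m, :m] = np.diag(s)

# Reduced factors: first m columns of U and first m rows of S
U_r = U[:, :m]
S_r = S[:m, :]

full_ok = np.allclose(U @ S @ Vh, A)           # full form reconstructs A
reduced_ok = np.allclose(U_r @ S_r @ Vh, A)    # reduced form reconstructs A too
```

The columns of U dropped in the reduced form only ever multiply the zero rows of S, which is why both products agree.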
a = np.random.rand(4,3) # the matrix to be SVD decomposed in this work
a
array([[0.05845521, 0.45514259, 0.84723484], [0.20908514, 0.26327405, 0.40786231], [0.48486814, 0.04937339, 0.69909792], [0.00927412, 0.52368466, 0.05372105]])
u_np, s_np, vh_np = np.linalg.svd(a, full_matrices=True, compute_uv=True) # default parameters for svd in np
u_np.shape, s_np.shape, vh_np.shape
((4, 4), (3,), (3, 3))
u_np, s_np, vh_np = np.linalg.svd(a, full_matrices=False, compute_uv=True)
u_np.shape, s_np.shape, vh_np.shape
((4, 3), (3,), (3, 3))
np.isclose((u_np*s_np)@vh_np, a), np.isclose((u_np*s_np)@vh_np.T, a) ## the given v matrix is already vh
(array([[ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]), array([[False, False, False], [False, False, False], [False, False, False], [False, False, False]]))
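The reconstruction (u_np*s_np)@vh_np relies on broadcasting: multiplying the reduced u by the 1-D array s scales column j of u by s[j], which is the same as multiplying by diag(s) on the right. A small check of that equivalence, as a sketch with a freshly seeded matrix and illustrative variable names:

```python
import numpy as np

# Seeded 4x3 matrix, used only for this check
a_demo = np.random.RandomState(0).rand(4, 3)
u, s, vh = np.linalg.svd(a_demo, full_matrices=False)

via_broadcast = (u * s) @ vh         # scales each column of u by the matching singular value
via_diag = u @ np.diag(s) @ vh       # explicit diagonal matrix

same = np.allclose(via_broadcast, via_diag)
reconstructs = np.allclose(via_broadcast, a_demo)
descending = bool(np.all(np.diff(s) <= 0))   # NumPy returns singular values in descending order
```

Note that the broadcasting shortcut needs the reduced u; with full_matrices=True, u has 4 columns and cannot be broadcast against the 3 singular values.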
np.linalg.svd(a, compute_uv=False) # gives only the singular values, no zero placeholders for the u and v matrices
array([1.51996147, 0.55862926, 0.07539824])
np.svd ## no alias
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-66-e5798aee4f5f> in <module>() ----> 1 np.svd AttributeError: module 'numpy' has no attribute 'svd'
s_tf, u_tf, v_tf = tf.linalg.svd(a, full_matrices=False, compute_uv=True)
# default parameters for svd in tf; the default of full_matrices differs from the numpy API
# the order of the returned values (s, u, v) is also different
s_tf, u_tf, v_tf = s_tf.numpy(), u_tf.numpy(), v_tf.numpy()
s_tf.shape, u_tf.shape, v_tf.shape
((3,), (4, 3), (3, 3))
np.isclose((u_tf*s_tf)@v_tf, a), np.isclose((u_tf*s_tf)@v_tf.T, a) ## the given v matrix is not vh
(array([[False, False, False], [False, False, False], [False, False, False], [False, False, False]]), array([[ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]))
tf.linalg.svd(a, compute_uv=False) ## returns only the singular values when compute_uv is False
<tf.Tensor: id=16, shape=(3,), dtype=float64, numpy=array([1.34306156, 0.57709753, 0.2727537 ])>
tf.svd ## alias exists
<function tensorflow.python.ops.linalg_ops.svd(tensor, full_matrices=False, compute_uv=True, name=None)>
u_pt, s_pt, v_pt = torch.svd(torch.tensor(a), some=True, compute_uv=True)
# default parameters in pytorch
# some is the exact opposite of full_matrices above, i.e. with some=True (the default), the returned u is reduced
# besides, the input must be a torch.tensor; a numpy.array raises an error, which is not the case in tensorflow
u_pt, s_pt, v_pt = u_pt.numpy(), s_pt.numpy(), v_pt.numpy()
u_pt.shape, s_pt.shape, v_pt.shape ## reduced form due to some = True
((4, 3), (3,), (3, 3))
np.isclose((u_pt*s_pt)@v_pt, a), np.isclose((u_pt*s_pt)@v_pt.T, a) ## the given v matrix is not vh
(array([[False, False, False], [False, False, False], [False, False, False], [False, False, False]]), array([[ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]]))
torch.svd(torch.tensor(a), compute_uv=False) ## the u, v matrices are still returned, filled with zeros
# some is ignored here, since u is always n times n
(tensor([[0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.]], dtype=torch.float64), tensor([1.3431, 0.5771, 0.2728], dtype=torch.float64), tensor([[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]], dtype=torch.float64))
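Since tensorflow and pytorch return v while numpy returns vh, one way to keep downstream code uniform is to convert everything to a single convention. The sketch below uses only NumPy so that it runs without the other libraries: to_numpy_convention is a hypothetical helper, and the v-style input is emulated from NumPy's own result rather than taken from tf.linalg.svd or torch.svd. It also illustrates that singular vectors are only determined up to sign, so results from different libraries are better compared through the singular values and the reconstruction than element by element.

```python
import numpy as np

def to_numpy_convention(u, s, v):
    """Hypothetical helper: given (u, s, v) with a = u @ diag(s) @ v^H,
    return NumPy's (u, s, vh) convention."""
    return u, s, np.conj(v).T

a_demo = np.random.RandomState(0).rand(4, 3)
u_ref, s_ref, vh_ref = np.linalg.svd(a_demo, full_matrices=False)

# Emulate a library that returns v instead of vh, with the sign of the
# first singular-vector pair flipped (this is still a valid SVD of a_demo).
signs = np.array([-1.0, 1.0, 1.0])
u_other = u_ref * signs
v_other = vh_ref.conj().T * signs

u, s, vh = to_numpy_convention(u_other, s_ref, v_other)

same_values = np.allclose(s, s_ref)               # singular values agree
same_product = np.allclose((u * s) @ vh, a_demo)  # reconstruction agrees
same_vectors = np.allclose(u, u_ref)              # fails: first column differs in sign
```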