About this Notebook

In this notebook, we provide a tensor factorization implementation based on iterative Alternating Least Squares (ALS), which is a good starting point for understanding tensor factorization.

In [1]:
import numpy as np
from numpy.linalg import inv as inv

Part 1: Matrix Computation Concepts

1) Kronecker product

  • Definition:

Given two matrices $A\in\mathbb{R}^{m_1\times n_1}$ and $B\in\mathbb{R}^{m_2\times n_2}$, then, the Kronecker product between these two matrices is defined as

$$A\otimes B=\left[ \begin{array}{cccc} a_{11}B & a_{12}B & \cdots & a_{1n_1}B \\ a_{21}B & a_{22}B & \cdots & a_{2n_1}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m_11}B & a_{m_12}B & \cdots & a_{m_1n_1}B \\ \end{array} \right]$$

where the symbol $\otimes$ denotes the Kronecker product, and the size of the resulting $A\otimes B$ is $(m_1m_2)\times (n_1n_2)$ (i.e., $m_1m_2$ rows and $n_1n_2$ columns).

  • Example:

If $A=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]$ and $B=\left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10 \\ \end{array} \right]$, then, we have

$$A\otimes B=\left[ \begin{array}{cc} 1\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] & 2\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] \\ 3\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] & 4\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] \\ \end{array} \right]$$$$=\left[ \begin{array}{cccccc} 5 & 6 & 7 & 10 & 12 & 14 \\ 8 & 9 & 10 & 16 & 18 & 20 \\ 15 & 18 & 21 & 20 & 24 & 28 \\ 24 & 27 & 30 & 32 & 36 & 40 \\ \end{array} \right]\in\mathbb{R}^{4\times 6}.$$
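
As a quick check (for reference only; np.kron is not used elsewhere in this notebook), numpy's built-in Kronecker product reproduces this example:

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6, 7], [8, 9, 10]])
print(np.kron(A, B))          # the 4 x 6 matrix shown above
print(np.kron(A, B).shape)    # (4, 6)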

2) Khatri-Rao product (kr_prod)

  • Definition:

Given two matrices $A=\left( \boldsymbol{a}_1,\boldsymbol{a}_2,...,\boldsymbol{a}_r \right)\in\mathbb{R}^{m\times r}$ and $B=\left( \boldsymbol{b}_1,\boldsymbol{b}_2,...,\boldsymbol{b}_r \right)\in\mathbb{R}^{n\times r}$ with the same number of columns, the Khatri-Rao product (or column-wise Kronecker product) of $A$ and $B$ is defined as

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2,...,\boldsymbol{a}_r\otimes \boldsymbol{b}_r \right)\in\mathbb{R}^{(mn)\times r},$$

where the symbol $\odot$ denotes the Khatri-Rao product and $\otimes$ denotes the Kronecker product.

  • Example:

If $A=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]=\left( \boldsymbol{a}_1,\boldsymbol{a}_2 \right) $ and $B=\left[ \begin{array}{cc} 5 & 6 \\ 7 & 8 \\ 9 & 10 \\ \end{array} \right]=\left( \boldsymbol{b}_1,\boldsymbol{b}_2 \right) $, then, we have

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2 \right) $$$$=\left[ \begin{array}{cc} \left[ \begin{array}{c} 1 \\ 3 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 5 \\ 7 \\ 9 \\ \end{array} \right] & \left[ \begin{array}{c} 2 \\ 4 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 6 \\ 8 \\ 10 \\ \end{array} \right] \\ \end{array} \right]$$$$=\left[ \begin{array}{cc} 5 & 12 \\ 7 & 16 \\ 9 & 20 \\ 15 & 24 \\ 21 & 32 \\ 27 & 40 \\ \end{array} \right]\in\mathbb{R}^{6\times 2}.$$
In [2]:
def kr_prod(a, b):
    return np.einsum('ir, jr -> ijr', a, b).reshape(a.shape[0] * b.shape[0], -1)
In [3]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8], [9, 10]])
print(kr_prod(A, B))
[[ 5 12]
 [ 7 16]
 [ 9 20]
 [15 24]
 [21 32]
 [27 40]]
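
Equivalently, the Khatri-Rao product can be assembled column by column from Kronecker products of the corresponding columns; this small consistency check (added here for illustration) should print True:

kr_ref = np.column_stack([np.kron(A[:, s], B[:, s]) for s in range(A.shape[1])])
print(np.array_equal(kr_prod(A, B), kr_ref))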

3) CP decomposition

CP combination (cp_combine)

  • Definition:

The CP decomposition factorizes a tensor into a sum of outer products of vectors. For example, for a third-order tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$, the CP decomposition can be written as

$$\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s},$$

or element-wise,

$$\hat{y}_{ijt}=\sum_{s=1}^{r}u_{is}v_{js}x_{ts},\forall (i,j,t),$$

where vectors $\boldsymbol{u}_{s}\in\mathbb{R}^{m},\boldsymbol{v}_{s}\in\mathbb{R}^{n},\boldsymbol{x}_{s}\in\mathbb{R}^{f}$ are columns of factor matrices $U\in\mathbb{R}^{m\times r},V\in\mathbb{R}^{n\times r},X\in\mathbb{R}^{f\times r}$, respectively. The symbol $\circ$ denotes vector outer product.

  • Example:

Given matrices $U=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]\in\mathbb{R}^{2\times 2}$, $V=\left[ \begin{array}{cc} 1 & 3 \\ 2 & 4 \\ 5 & 6 \\ \end{array} \right]\in\mathbb{R}^{3\times 2}$ and $X=\left[ \begin{array}{cc} 1 & 5 \\ 2 & 6 \\ 3 & 7 \\ 4 & 8 \\ \end{array} \right]\in\mathbb{R}^{4\times 2}$, then, with $\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s}$, we have

$$\hat{Y}_1=\hat{\mathcal{Y}}(:,:,1)=\left[ \begin{array}{ccc} 31 & 42 & 65 \\ 63 & 86 & 135 \\ \end{array} \right],$$$$\hat{Y}_2=\hat{\mathcal{Y}}(:,:,2)=\left[ \begin{array}{ccc} 38 & 52 & 82 \\ 78 & 108 & 174 \\ \end{array} \right],$$$$\hat{Y}_3=\hat{\mathcal{Y}}(:,:,3)=\left[ \begin{array}{ccc} 45 & 62 & 99 \\ 93 & 130 & 213 \\ \end{array} \right],$$$$\hat{Y}_4=\hat{\mathcal{Y}}(:,:,4)=\left[ \begin{array}{ccc} 52 & 72 & 116 \\ 108 & 152 & 252 \\ \end{array} \right].$$
In [4]:
def cp_combine(U, V, X):
    return np.einsum('is, js, ts -> ijt', U, V, X)
In [5]:
U = np.array([[1, 2], [3, 4]])
V = np.array([[1, 3], [2, 4], [5, 6]])
X = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
print(cp_combine(U, V, X))
print()
print('tensor size:')
print(cp_combine(U, V, X).shape)
[[[ 31  38  45  52]
  [ 42  52  62  72]
  [ 65  82  99 116]]

 [[ 63  78  93 108]
  [ 86 108 130 152]
  [135 174 213 252]]]

tensor size:
(2, 3, 4)
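
To connect this result with the element-wise definition $\hat{y}_{ijt}=\sum_{s}u_{is}v_{js}x_{ts}$, the same tensor can be rebuilt entry by entry (a brute-force check, for illustration only):

Y_check = np.zeros((U.shape[0], V.shape[0], X.shape[0]))
for i in range(U.shape[0]):
    for j in range(V.shape[0]):
        for t in range(X.shape[0]):
            Y_check[i, j, t] = np.sum(U[i, :] * V[j, :] * X[t, :])
print(np.allclose(Y_check, cp_combine(U, V, X)))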

4) Tensor Unfolding (ten2mat)

Using numpy's reshape to perform the mode-$k$ unfolding of a third-order tensor. [link]

In [6]:
def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')
In [7]:
X = np.array([[[1, 2, 3, 4], [3, 4, 5, 6]], 
              [[5, 6, 7, 8], [7, 8, 9, 10]], 
              [[9, 10, 11, 12], [11, 12, 13, 14]]])
print('tensor size:')
print(X.shape)
print('original tensor:')
print(X)
print()
print('(1) mode-1 tensor unfolding:')
print(ten2mat(X, 0))
print()
print('(2) mode-2 tensor unfolding:')
print(ten2mat(X, 1))
print()
print('(3) mode-3 tensor unfolding:')
print(ten2mat(X, 2))
tensor size:
(3, 2, 4)
original tensor:
[[[ 1  2  3  4]
  [ 3  4  5  6]]

 [[ 5  6  7  8]
  [ 7  8  9 10]]

 [[ 9 10 11 12]
  [11 12 13 14]]]

(1) mode-1 tensor unfolding:
[[ 1  3  2  4  3  5  4  6]
 [ 5  7  6  8  7  9  8 10]
 [ 9 11 10 12 11 13 12 14]]

(2) mode-2 tensor unfolding:
[[ 1  5  9  2  6 10  3  7 11  4  8 12]
 [ 3  7 11  4  8 12  5  9 13  6 10 14]]

(3) mode-3 tensor unfolding:
[[ 1  5  9  3  7 11]
 [ 2  6 10  4  8 12]
 [ 3  7 11  5  9 13]
 [ 4  8 12  6 10 14]]
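
For completeness, the unfolding can be inverted. The helper below (named mat2ten here; it is not part of this notebook and is shown only to illustrate that ten2mat loses no information) folds a mode-$k$ unfolding back into a tensor of the given size:

def mat2ten(mat, tensor_size, mode):
    # Inverse of ten2mat: reshape in Fortran order, then move the mode axis back into place.
    dims = [tensor_size[mode]] + [tensor_size[i] for i in range(len(tensor_size)) if i != mode]
    return np.moveaxis(np.reshape(mat, dims, order = 'F'), 0, mode)

print(np.array_equal(X, mat2ten(ten2mat(X, 1), X.shape, 1)))  # should print True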

Part 2: Tensor CP Factorization using ALS (TF-ALS)

Regarding CP factorization as a machine learning problem, we can perform the learning task by minimizing a loss function over the factor matrices, that is,

$$\min _{U, V, X} \sum_{(i, j, t) \in \Omega}\left(y_{i j t}-\sum_{r=1}^{R}u_{ir}v_{jr}x_{tr}\right)^{2}.$$

Within this optimization problem, the multiplication among the three factor matrices (acting as parameters) makes the problem difficult to solve directly. Instead, we apply the ALS algorithm for CP factorization.

In particular, the optimization problem for each row $\boldsymbol{u}_{i}\in\mathbb{R}^{R},\forall i\in\left\{1,2,...,M\right\}$ of factor matrix $U\in\mathbb{R}^{M\times R}$ is given by

$$\min _{\boldsymbol{u}_{i}} \sum_{j,t:(i, j, t) \in \Omega}\left(y_{i j t}-\boldsymbol{u}_{i}^\top\left(\boldsymbol{x}_{t}\circledast\boldsymbol{v}_{j}\right)\right)^{2},$$

where $\circledast$ denotes the Hadamard (element-wise) product, so that $\boldsymbol{u}_{i}^\top\left(\boldsymbol{x}_{t}\circledast\boldsymbol{v}_{j}\right)=\sum_{r=1}^{R}u_{ir}v_{jr}x_{tr}$. The least-squares solution of this subproblem is

$$\boldsymbol{u}_{i} \Leftarrow\left(\sum_{j, t:(i, j, t) \in \Omega} \left(\boldsymbol{x}_{t} \circledast \boldsymbol{v}_{j}\right)\left(\boldsymbol{x}_{t} \circledast \boldsymbol{v}_{j}\right)^{\top}\right)^{-1}\left(\sum_{j, t:(i, j, t) \in \Omega} y_{i j t} \left(\boldsymbol{x}_{t} \circledast \boldsymbol{v}_{j}\right)\right), \forall i \in\{1,2, \ldots, M\}.$$

The corresponding alternating least-squares updates for the rows of $V\in\mathbb{R}^{N\times R}$ and $X\in\mathbb{R}^{T\times R}$ are

$$\boldsymbol{v}_{j}\Leftarrow\left(\sum_{i,t:(i,j,t)\in\Omega}\left(\boldsymbol{x}_{t}\circledast\boldsymbol{u}_{i}\right)\left(\boldsymbol{x}_{t}\circledast\boldsymbol{u}_{i}\right)^\top\right)^{-1}\left(\sum_{i,t:(i,j,t)\in\Omega}y_{ijt}\left(\boldsymbol{x}_{t}\circledast\boldsymbol{u}_{i}\right)\right),\forall j\in\left\{1,2,...,N\right\},$$$$\boldsymbol{x}_{t}\Leftarrow\left(\sum_{i,j:(i,j,t)\in\Omega}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{u}_{i}\right)\left(\boldsymbol{v}_{j}\circledast\boldsymbol{u}_{i}\right)^\top\right)^{-1}\left(\sum_{i,j:(i,j,t)\in\Omega}y_{ijt}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{u}_{i}\right)\right),\forall t\in\left\{1,2,...,T\right\}.$$
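
To make these updates concrete, here is a minimal, unvectorized sketch of the $\boldsymbol{u}_{i}$ update (for illustration only; the name update_row_u, the loop structure, and the small regularizer reg are choices made here, mirroring the tiny ridge term used in CP_ALS below):

def update_row_u(i, sparse_tensor, V, X, rank, reg = 1e-11):
    # Accumulate the normal equations over the observed entries (i, j, t).
    lhs = np.zeros((rank, rank))
    rhs = np.zeros(rank)
    js, ts = np.where(sparse_tensor[i, :, :] != 0)
    for j, t in zip(js, ts):
        w = X[t, :] * V[j, :]                 # Hadamard product x_t * v_j
        lhs += np.outer(w, w)
        rhs += sparse_tensor[i, j, t] * w
    # The small ridge term keeps the system invertible when few entries are observed.
    return np.linalg.solve(lhs + reg * np.eye(rank), rhs)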
In [8]:
def CP_ALS(sparse_tensor, rank, maxiter):
    dim1, dim2, dim3 = sparse_tensor.shape
    dim = np.array([dim1, dim2, dim3])
    
    # Random initialization of the factor matrices.
    U = 0.1 * np.random.rand(dim1, rank)
    V = 0.1 * np.random.rand(dim2, rank)
    X = 0.1 * np.random.rand(dim3, rank)
    
    # Binary tensor marking the observed (nonzero) entries.
    pos = np.where(sparse_tensor != 0)
    binary_tensor = np.zeros((dim1, dim2, dim3))
    binary_tensor[pos] = 1
    tensor_hat = np.zeros((dim1, dim2, dim3))
    
    for iters in range(maxiter):
        for order in range(dim.shape[0]):
            # Khatri-Rao product of the two factor matrices other than the one being updated.
            if order == 0:
                var1 = kr_prod(X, V).T
            elif order == 1:
                var1 = kr_prod(X, U).T
            else:
                var1 = kr_prod(V, U).T
            var2 = kr_prod(var1, var1)
            # var3[:, :, i] is the normal-equation matrix restricted to the observed entries of slice i;
            # var4[:, i] is the corresponding right-hand side.
            var3 = np.matmul(var2, ten2mat(binary_tensor, order).T).reshape([rank, rank, dim[order]])
            var4 = np.matmul(var1, ten2mat(sparse_tensor, order).T)
            for i in range(dim[order]):
                var_Lambda = var3[ :, :, i]
                # Symmetrize and add a small ridge term (10e-12 == 1e-11) for numerical stability.
                inv_var_Lambda = inv((var_Lambda + var_Lambda.T)/2 + 10e-12 * np.eye(rank))
                vec = np.matmul(inv_var_Lambda, var4[:, i])
                if order == 0:
                    U[i, :] = vec.copy()
                elif order == 1:
                    V[i, :] = vec.copy()
                else:
                    X[i, :] = vec.copy()

        tensor_hat = cp_combine(U, V, X)
        mape = np.sum(np.abs(sparse_tensor[pos] - tensor_hat[pos])/sparse_tensor[pos])/sparse_tensor[pos].shape[0]
        rmse = np.sqrt(np.sum((sparse_tensor[pos] - tensor_hat[pos]) ** 2)/sparse_tensor[pos].shape[0])
        
        if (iters + 1) % 100 == 0:
            print('Iter: {}'.format(iters + 1))
            print('Training MAPE: {:.6}'.format(mape))
            print('Training RMSE: {:.6}'.format(rmse))
            print()
    
    return tensor_hat, U, V, X
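
The training errors printed every 100 iterations are computed over the observed entries $\Omega$ only; in the notation above, the mape and rmse lines in CP_ALS evaluate

$$\text{MAPE}=\frac{1}{|\Omega|}\sum_{(i,j,t)\in\Omega}\frac{\left|y_{ijt}-\hat{y}_{ijt}\right|}{y_{ijt}},\qquad\text{RMSE}=\sqrt{\frac{1}{|\Omega|}\sum_{(i,j,t)\in\Omega}\left(y_{ijt}-\hat{y}_{ijt}\right)^{2}}.$$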

Part 3: Data Organization

1) Matrix Structure

We consider a dataset of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{f},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We express the spatio-temporal dataset as a matrix $Y\in\mathbb{R}^{m\times f}$ with $m$ rows (e.g., locations) and $f$ columns (e.g., discrete time intervals),

$$Y=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mf} \\ \end{array} \right]\in\mathbb{R}^{m\times f}.$$

2) Tensor Structure

We consider a dataset of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{nf},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We partition each time series into intervals of predefined length $f$. We express each partitioned time series as a matrix $Y_{i}$ with $n$ rows (e.g., days) and $f$ columns (e.g., discrete time intervals per day),

$$Y_{i}=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nf} \\ \end{array} \right]\in\mathbb{R}^{n\times f},i=1,2,...,m,$$

therefore, the resulting structure is a tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$.

How do we transform a data set into something we can use for time series imputation?
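
A minimal sketch of this transformation, assuming the raw data is already arranged as an $m\times (nf)$ matrix (rows are locations, columns are time points) with zeros marking missing values; the dimensions below are placeholders rather than a specific data set:

# Hypothetical dimensions: m locations, n days, f time intervals per day.
m, n, f = 214, 61, 144
Y_mat = np.random.rand(m, n * f)       # stand-in for a real (m, n*f) data matrix
Y_tensor = Y_mat.reshape([m, n, f])    # each row is cut into n day-blocks of length f
print(Y_tensor.shape)                  # (214, 61, 144)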

Part 4: Experiments on Guangzhou Data Set

In [9]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.2

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario:
# binary_tensor = np.zeros(dense_tensor.shape)
# for i1 in range(dense_tensor.shape[0]):
#     for i2 in range(dense_tensor.shape[1]):
#         binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
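
As an aside on the masking rule: since random_tensor holds uniform random numbers in $[0,1)$, np.round(u + 0.5 - missing_rate) equals 1 when $u>$ missing_rate and 0 otherwise, so roughly a missing_rate fraction of the observed entries is hidden. A small synthetic illustration (not part of the experiment):

u = np.random.rand(1000000)
mask = np.round(u + 0.5 - 0.2)   # 20% missing rate on synthetic uniform noise
print(mask.mean())               # close to 0.8, i.e. about 20% of entries hidden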

Question: Given only the partially observed data $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$, how can we impute the unknown missing values?

The main influential factors for such an imputation model are:

  • rank.

  • maxiter.

In [10]:
import time
start = time.time()
rank = 80
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.0809251
Training RMSE: 3.47736

Iter: 200
Training MAPE: 0.0805399
Training RMSE: 3.46261

Iter: 300
Training MAPE: 0.0803688
Training RMSE: 3.45631

Iter: 400
Training MAPE: 0.0802661
Training RMSE: 3.45266

Iter: 500
Training MAPE: 0.0801768
Training RMSE: 3.44986

Iter: 600
Training MAPE: 0.0800948
Training RMSE: 3.44755

Iter: 700
Training MAPE: 0.0800266
Training RMSE: 3.4456

Iter: 800
Training MAPE: 0.0799675
Training RMSE: 3.44365

Iter: 900
Training MAPE: 0.07992
Training RMSE: 3.4419

Iter: 1000
Training MAPE: 0.079885
Training RMSE: 3.44058

Final Imputation MAPE: 0.0833307
Final Imputation RMSE: 3.59283

Running time: 2908 seconds

Experiment results of missing data imputation using TF-ALS:

scenario    rank    maxiter    mape      rmse
--------    ----    -------    ------    ------
20%, RM     80      1000       0.0833    3.5928
40%, RM     80      1000       0.0837    3.6190
20%, NM     10      1000       0.1027    4.2960
40%, NM     10      1000       0.1028    4.3274

Part 5: Experiments on Birmingham Data Set

In [31]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Birmingham-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']
random_tensor = scipy.io.loadmat('../datasets/Birmingham-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.3

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario:
# binary_tensor = np.zeros(dense_tensor.shape)
# for i1 in range(dense_tensor.shape[0]):
#     for i2 in range(dense_tensor.shape[1]):
#         binary_tensor[i1, i2, :] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [32]:
import time
start = time.time()
rank = 30
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.0509401
Training RMSE: 15.3163

Iter: 200
Training MAPE: 0.0498774
Training RMSE: 14.9599

Iter: 300
Training MAPE: 0.0490062
Training RMSE: 14.768

Iter: 400
Training MAPE: 0.0481006
Training RMSE: 14.6343

Iter: 500
Training MAPE: 0.0474233
Training RMSE: 14.5365

Iter: 600
Training MAPE: 0.0470442
Training RMSE: 14.4642

Iter: 700
Training MAPE: 0.0469617
Training RMSE: 14.4082

Iter: 800
Training MAPE: 0.0470459
Training RMSE: 14.3623

Iter: 900
Training MAPE: 0.0472333
Training RMSE: 14.3235

Iter: 1000
Training MAPE: 0.047408
Training RMSE: 14.2898

Final Imputation MAPE: 0.0583358
Final Imputation RMSE: 18.9148

Running time: 38 seconds

Experiment results of missing data imputation using TF-ALS:

scenario    rank    maxiter    mape      rmse
--------    ----    -------    ------    -------
10%, RM     30      1000       0.0615    18.5005
30%, RM     30      1000       0.0583    18.9148
10%, NM     10      1000       0.1447    41.6710
30%, NM     10      1000       0.1765    63.8465

Part 6: Experiments on Hangzhou Data Set

In [39]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')
dense_tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']
random_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']

missing_rate = 0.4

# =============================================================================
### Random missing (RM) scenario:
binary_tensor = np.round(random_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario:
# binary_tensor = np.zeros(dense_tensor.shape)
# for i1 in range(dense_tensor.shape[0]):
#     for i2 in range(dense_tensor.shape[1]):
#         binary_tensor[i1, i2, :] = np.round(random_matrix[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [40]:
import time
start = time.time()
rank = 50
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.176548
Training RMSE: 17.0263

Iter: 200
Training MAPE: 0.174888
Training RMSE: 16.8609

Iter: 300
Training MAPE: 0.175056
Training RMSE: 16.7835

Iter: 400
Training MAPE: 0.174988
Training RMSE: 16.7323

Iter: 500
Training MAPE: 0.175013
Training RMSE: 16.6942

Iter: 600
Training MAPE: 0.174928
Training RMSE: 16.6654

Iter: 700
Training MAPE: 0.174722
Training RMSE: 16.6441

Iter: 800
Training MAPE: 0.174565
Training RMSE: 16.6284

Iter: 900
Training MAPE: 0.174454
Training RMSE: 16.6159

Iter: 1000
Training MAPE: 0.174409
Training RMSE: 16.6054

Final Imputation MAPE: 0.209776
Final Imputation RMSE: 100.315

Running time: 279 seconds

Experiment results of missing data imputation using TF-ALS:

scenario    rank    maxiter    mape      rmse
--------    ----    -------    ------    -------
20%, RM     50      1000       0.1991    111.303
40%, RM     50      1000       0.2098    100.315
20%, NM     5       1000       0.2837    42.6136
40%, NM     5       1000       0.2811    38.4201

Part 7: Experiments on New York Data Set

In [9]:
import scipy.io

tensor = scipy.io.loadmat('../datasets/NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../datasets/NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../datasets/NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.1

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
# binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [10]:
import time
start = time.time()
rank = 30
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.511739
Training RMSE: 4.07981

Iter: 200
Training MAPE: 0.501094
Training RMSE: 4.0612

Iter: 300
Training MAPE: 0.504264
Training RMSE: 4.05578

Iter: 400
Training MAPE: 0.507211
Training RMSE: 4.05119

Iter: 500
Training MAPE: 0.509956
Training RMSE: 4.04623

Iter: 600
Training MAPE: 0.51046
Training RMSE: 4.04129

Iter: 700
Training MAPE: 0.509797
Training RMSE: 4.03294

Iter: 800
Training MAPE: 0.509531
Training RMSE: 4.02976

Iter: 900
Training MAPE: 0.509265
Training RMSE: 4.02861

Iter: 1000
Training MAPE: 0.508873
Training RMSE: 4.02796

Final Imputation MAPE: 0.540363
Final Imputation RMSE: 5.66633

Running time: 742 seconds

Experiment results of missing data imputation using TF-ALS:

scenario    rank    maxiter    mape      rmse
--------    ----    -------    ------    ------
10%, RM     30      1000       0.5262    6.2444
30%, RM     30      1000       0.5488    6.8968
10%, NM     30      1000       0.5170    5.9863
30%, NM     30      100        -         -

Part 8: Experiments on Seattle Data Set

In [9]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.2

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [10]:
import time
start = time.time()
rank = 50
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.0749497
Training RMSE: 4.47036

Iter: 200
Training MAPE: 0.0745197
Training RMSE: 4.44713

Iter: 300
Training MAPE: 0.0741685
Training RMSE: 4.43496

Iter: 400
Training MAPE: 0.0739049
Training RMSE: 4.42523

Iter: 500
Training MAPE: 0.0737243
Training RMSE: 4.41692

Iter: 600
Training MAPE: 0.0735726
Training RMSE: 4.40994

Iter: 700
Training MAPE: 0.073415
Training RMSE: 4.4034

Iter: 800
Training MAPE: 0.0732484
Training RMSE: 4.3976

Iter: 900
Training MAPE: 0.0731012
Training RMSE: 4.393

Iter: 1000
Training MAPE: 0.0729675
Training RMSE: 4.38843

Final Imputation MAPE: 0.0741996
Final Imputation RMSE: 4.49292

Running time: 2594 seconds
In [11]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
RM_mat = pd.read_csv('../datasets/Seattle-data-set/RM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
RM_mat = RM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])
RM_tensor = RM_mat.reshape([RM_mat.shape[0], 28, 288])

missing_rate = 0.4

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(RM_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [12]:
import time
start = time.time()
rank = 50
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.0747491
Training RMSE: 4.4544

Iter: 200
Training MAPE: 0.074195
Training RMSE: 4.42842

Iter: 300
Training MAPE: 0.0738864
Training RMSE: 4.4162

Iter: 400
Training MAPE: 0.0735955
Training RMSE: 4.40699

Iter: 500
Training MAPE: 0.0733999
Training RMSE: 4.40083

Iter: 600
Training MAPE: 0.0732636
Training RMSE: 4.3959

Iter: 700
Training MAPE: 0.0731835
Training RMSE: 4.39241

Iter: 800
Training MAPE: 0.0731367
Training RMSE: 4.3899

Iter: 900
Training MAPE: 0.0730982
Training RMSE: 4.38779

Iter: 1000
Training MAPE: 0.0730573
Training RMSE: 4.38571

Final Imputation MAPE: 0.0757934
Final Imputation RMSE: 4.5574

Running time: 2706 seconds
In [14]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.2

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [15]:
import time
start = time.time()
rank = 10
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.100191
Training RMSE: 5.60051

Iter: 200
Training MAPE: 0.0981896
Training RMSE: 5.51405

Iter: 300
Training MAPE: 0.0969386
Training RMSE: 5.46377

Iter: 400
Training MAPE: 0.0967974
Training RMSE: 5.45581

Iter: 500
Training MAPE: 0.0966243
Training RMSE: 5.44397

Iter: 600
Training MAPE: 0.0960368
Training RMSE: 5.42168

Iter: 700
Training MAPE: 0.0958292
Training RMSE: 5.41295

Iter: 800
Training MAPE: 0.0957371
Training RMSE: 5.40865

Iter: 900
Training MAPE: 0.0956582
Training RMSE: 5.40568

Iter: 1000
Training MAPE: 0.095595
Training RMSE: 5.40339

Final Imputation MAPE: 0.0994999
Final Imputation RMSE: 5.63311

Running time: 351 seconds
In [16]:
import pandas as pd

dense_mat = pd.read_csv('../datasets/Seattle-data-set/mat.csv', index_col = 0)
NM_mat = pd.read_csv('../datasets/Seattle-data-set/NM_mat.csv', index_col = 0)
dense_mat = dense_mat.values
NM_mat = NM_mat.values
dense_tensor = dense_mat.reshape([dense_mat.shape[0], 28, 288])

missing_rate = 0.4

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros((dense_mat.shape[0], 28, 288))
for i1 in range(binary_tensor.shape[0]):
    for i2 in range(binary_tensor.shape[1]):
        binary_tensor[i1, i2, :] = np.round(NM_mat[i1, i2] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [17]:
import time
start = time.time()
rank = 10
maxiter = 1000
tensor_hat, U, V, X = CP_ALS(sparse_tensor, rank, maxiter)
pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))
final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
print('Final Imputation MAPE: {:.6}'.format(final_mape))
print('Final Imputation RMSE: {:.6}'.format(final_rmse))
print()
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Training MAPE: 0.0996282
Training RMSE: 5.55963

Iter: 200
Training MAPE: 0.0992568
Training RMSE: 5.53825

Iter: 300
Training MAPE: 0.0986723
Training RMSE: 5.51806

Iter: 400
Training MAPE: 0.0967838
Training RMSE: 5.46447

Iter: 500
Training MAPE: 0.0962312
Training RMSE: 5.44762

Iter: 600
Training MAPE: 0.0961017
Training RMSE: 5.44322

Iter: 700
Training MAPE: 0.0959531
Training RMSE: 5.43927

Iter: 800
Training MAPE: 0.0958815
Training RMSE: 5.43619

Iter: 900
Training MAPE: 0.0958781
Training RMSE: 5.4344

Iter: 1000
Training MAPE: 0.0958921
Training RMSE: 5.43266

Final Imputation MAPE: 0.10038
Final Imputation RMSE: 5.7034

Running time: 304 seconds

Experiment results of missing data imputation using TF-ALS:

scenario    rank    maxiter    mape      rmse
--------    ----    -------    ------    ------
20%, RM     50      1000       0.0742    4.4929
40%, RM     50      1000       0.0758    4.5574
20%, NM     10      1000       0.0995    5.6331
40%, NM     10      1000       0.1004    5.7034