Part 1: What Does TRTF Look Like?

Tensor factorizationfor any given tensor $\mathcal{Y}\in\mathbb{R}^{M\times N\times T}$ with rank $R$: $$y_{ijt}\approx\sum_{r=1}^{R}u_{ir}v_{jr}x_{tr}\\=\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}\boldsymbol{x}_{t}\\=\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{v}_{j}\\=\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{u}_{i}$$

Temporal regularized tensor factorization (TRMF): $$\min_{U,V,X}~~\sum_{(i,j,t)\in\Omega}\left(y_{ijt}-\sum_{r=1}^{R}u_{ir}v_{jr}x_{tr}\right)^2\\ +\lambda_{u}\sum_{i=1}^{M}\left\|\boldsymbol{u}_{i}\right\|_{2}^{2}+\lambda_{v}\sum_{j=1}^{N}\left\|\boldsymbol{v}_{j}\right\|_{2}^{2}+\lambda_{x}\sum_{t=1}^{T}\left\|\boldsymbol{x}_{t}\right\|_{2}^{2}\\ +\lambda_{ar}\sum_{t=h_d+1}^{T}\left\|\boldsymbol{x}_{t}-\sum_{k=1}^{d}\boldsymbol{\theta}_{k}\circledast\boldsymbol{x}_{t-h_k}\right\|_{2}^{2}+\lambda_{\theta}\sum_{l\in\mathcal{L}}\left\|\boldsymbol{\theta}_{l}\right\|_{2}^{2}$$

Part 2: Alternative Minimization for TRTF

Optimizing $\boldsymbol{u}_{i},i\in\left\{1,2,...,M\right\}$:

Optimization problem: $$\min_{\boldsymbol{u}_{i}}\sum_{j,t:(i,j,t)\in\Omega}\left(y_{ijt}-\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{u}_{i}\right)^{\top}\left(y_{ijt}-\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{u}_{i}\right)+\lambda_{u}\boldsymbol{u}_{i}^{\top}\boldsymbol{u}_{i}$$

Solution: $$\boldsymbol{u}_{i}\Leftarrow\left(\sum_{j,t:(i,j,t)\in\Omega}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)^{\top}+\lambda_{u}I_{R}\right)^{-1}\sum_{j,t:(i,j,t)\in\Omega}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)y_{ijt}$$

Optimizing $\boldsymbol{v}_{j},j\in\left\{1,2,...,N\right\}$:

Optimization problem: $$\min_{\boldsymbol{v}_{j}}\sum_{i,t:(i,j,t)\in\Omega}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{v}_{j}\right)^{\top}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)^{\top}\boldsymbol{v}_{j}\right)+\lambda_{v}\boldsymbol{v}_{j}^{\top}\boldsymbol{v}_{j}$$

Solution: $$\boldsymbol{v}_{j}\Leftarrow\left(\sum_{i,t:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)^{\top}+\lambda_{v}I_{R}\right)^{-1}\sum_{i,t:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{x}_{t}\right)y_{ijt}$$

Optimizing $\boldsymbol{x}_{t},t\in\left\{1,2,...,T\right\}$:

Case #1: $t\in\left\{1,2,...,h_d\right\}$

Optimization problem: $$\min_{\boldsymbol{x}_{t}}\sum_{i,j:(i,j,t)\in\Omega}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}\boldsymbol{x}_{t}\right)^{\top}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}\boldsymbol{x}_{t}\right)+\lambda_{x}\boldsymbol{x}_{t}^{\top}\boldsymbol{x}_{t}$$

Solution: $$\boldsymbol{x}_{t}\Leftarrow\left(\sum_{i,j:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}+\lambda_{x}I_{R}\right)^{-1}\sum_{i,j:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)y_{ijt}$$

Case #2: $t\in\left\{h_d+1,h_d+2,...,T\right\}$

Optimization problem: $$\min_{\boldsymbol{x}_{t}}\sum_{i,j:(i,j,t)\in\Omega}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}\boldsymbol{x}_{t}\right)^{\top}\left(y_{ijt}-\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}\boldsymbol{x}_{t}\right)+\lambda_{x}\boldsymbol{x}_{t}^{\top}\boldsymbol{x}_{t}\\ +\lambda_{ar}\sum_{k=1,t+h_{k}\leq T}^{d}\left(\boldsymbol{x}_{t+h_k}-\sum_{l=1}^{d}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t+h_{k}-h_l}\right)^{\top}\left(\boldsymbol{x}_{t+h_k}-\sum_{l=1}^{d}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t+h_{k}-h_l}\right)$$

Solution: $$\boldsymbol{x}_{t}\Leftarrow\left(\sum_{i,j:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)^{\top}+\lambda_{x}I_{R}+\lambda_{ar}\sum_{k=1,t+h_k\leq T}^{d}\text{diag}\left(\boldsymbol{\theta}_{k}\circledast\boldsymbol{\theta}_{k}\right)\right)^{-1}\\ \times\left(\sum_{i,j:(i,j,t)\in\Omega}\left(\boldsymbol{u}_{i}\circledast\boldsymbol{v}_{j}\right)y_{ijt}+\lambda_{ar}\sum_{k=1,t+h_k\leq T}^{d}\text{diag}\left(\boldsymbol{\theta}_{k}\right)\boldsymbol{\psi}_{t+h_k}\right)$$ where $$\boldsymbol{\psi}_{t+h_{k}}=\boldsymbol{x}_{t+h_k}-\sum_{l=1,l\neq k}^{d}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t+h_k-h_l}.$$

Part 3: Matrix/Tensor Computation Concepts

Khatri-Rao product (kr_prod)

  • Definition:

Given two matrices $A=\left( \boldsymbol{a}_1,\boldsymbol{a}_2,...,\boldsymbol{a}_r \right)\in\mathbb{R}^{m\times r}$ and $B=\left( \boldsymbol{b}_1,\boldsymbol{b}_2,...,\boldsymbol{b}_r \right)\in\mathbb{R}^{n\times r}$ with same number of columns, then, the Khatri-Rao product (or column-wise Kronecker product) between $A$ and $B$ is given as follows,

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2,...,\boldsymbol{a}_r\otimes \boldsymbol{b}_r \right)\in\mathbb{R}^{(mn)\times r}$$

where the symbol $\odot$ denotes Khatri-Rao product, and $\otimes$ denotes Kronecker product.

  • Example:

If $A=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]=\left( \boldsymbol{a}_1,\boldsymbol{a}_2 \right) $ and $B=\left[ \begin{array}{cc} 5 & 6 \\ 7 & 8 \\ 9 & 10 \\ \end{array} \right]=\left( \boldsymbol{b}_1,\boldsymbol{b}_2 \right) $, then, we have

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2 \right) $$$$=\left[ \begin{array}{cc} \left[ \begin{array}{c} 1 \\ 3 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 5 \\ 7 \\ 9 \\ \end{array} \right] & \left[ \begin{array}{c} 2 \\ 4 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 6 \\ 8 \\ 10 \\ \end{array} \right] \\ \end{array} \right]$$$$=\left[ \begin{array}{cc} 5 & 12 \\ 7 & 16 \\ 9 & 20 \\ 15 & 24 \\ 21 & 32 \\ 27 & 40 \\ \end{array} \right]\in\mathbb{R}^{6\times 2}.$$

CP combination (cp_combine)

  • Definition:

The CP decomposition factorizes a tensor into a sum of outer products of vectors. For example, for a third-order tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$, the CP decomposition can be written as

$$\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s},$$

or element-wise,

$$\hat{y}_{ijt}=\sum_{s=1}^{r}u_{is}v_{js}x_{ts},\forall (i,j,t),$$

where vectors $\boldsymbol{u}_{s}\in\mathbb{R}^{m},\boldsymbol{v}_{s}\in\mathbb{R}^{n},\boldsymbol{x}_{s}\in\mathbb{R}^{f}$ are columns of factor matrices $U\in\mathbb{R}^{m\times r},V\in\mathbb{R}^{n\times r},X\in\mathbb{R}^{f\times r}$, respectively. The symbol $\circ$ denotes vector outer product.

  • Example:

Given matrices $U=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]\in\mathbb{R}^{2\times 2}$, $V=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ \end{array} \right]\in\mathbb{R}^{3\times 2}$ and $X=\left[ \begin{array}{cc} 1 & 5 \\ 2 & 6 \\ 3 & 7 \\ 4 & 8 \\ \end{array} \right]\in\mathbb{R}^{4\times 2}$, then if $\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s}$, then, we have

$$\hat{Y}_1=\hat{\mathcal{Y}}(:,:,1)=\left[ \begin{array}{ccc} 31 & 42 & 65 \\ 63 & 86 & 135 \\ \end{array} \right],$$$$\hat{Y}_2=\hat{\mathcal{Y}}(:,:,2)=\left[ \begin{array}{ccc} 38 & 52 & 82 \\ 78 & 108 & 174 \\ \end{array} \right],$$$$\hat{Y}_3=\hat{\mathcal{Y}}(:,:,3)=\left[ \begin{array}{ccc} 45 & 62 & 99 \\ 93 & 130 & 213 \\ \end{array} \right],$$$$\hat{Y}_4=\hat{\mathcal{Y}}(:,:,4)=\left[ \begin{array}{ccc} 52 & 72 & 116 \\ 108 & 152 & 252 \\ \end{array} \right].$$

Tensor unfolding (ten2mat)

Using numpy reshape to perform 3rd rank tensor unfold operation. [link]

In [1]:
import numpy as np
from numpy.linalg import inv as inv
In [2]:
def kr_prod(a, b):
    return np.einsum('ir, jr -> ijr', a, b).reshape(a.shape[0] * b.shape[0], -1)
In [3]:
def cp_combine(U, V, X):
    return np.einsum('is, js, ts -> ijt', U, V, X)
In [4]:
def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')
In [5]:
def TRTF(dense_tensor, sparse_tensor, U, V, X, theta, time_lags, 
         lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter):
    dim1, dim2, dim3 = dense_tensor.shape
    binary_tensor = np.zeros((dim1, dim2, dim3))
    position = np.where(sparse_tensor > 0)
    binary_tensor[position] = 1
    pos = np.where((dense_tensor > 0) & (sparse_tensor == 0))
    d = len(time_lags)
    rank = U.shape[1]
    
    for iters in range(maxiter):
        var1 = kr_prod(X, V).T
        var2 = kr_prod(var1, var1)
        var3 = np.matmul(var2, ten2mat(binary_tensor, 0).T).reshape([rank, rank, 
                                                                     dim1]) + np.dstack([lambda_u * np.eye(rank)] * dim1)
        var4 = np.matmul(var1, ten2mat(sparse_tensor, 0).T)
        for i in range(dim1):
            var_Lambda1 = var3[ :, :, i]
            inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)
            U[i, :] = np.matmul(inv_var_Lambda1, var4[:, i])

        var1 = kr_prod(X, U).T
        var2 = kr_prod(var1, var1)
        var3 = np.matmul(var2, ten2mat(binary_tensor, 1).T).reshape([rank, rank, 
                                                                     dim2]) + np.dstack([lambda_v * np.eye(rank)] * dim2)
        var4 = np.matmul(var1, ten2mat(sparse_tensor, 1).T)
        for j in range(dim2):
            var_Lambda1 = var3[ :, :, j]
            inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)
            V[j, :] = np.matmul(inv_var_Lambda1, var4[:, j])

        var1 = kr_prod(V, U).T
        var2 = kr_prod(var1, var1)
        var3 = np.matmul(var2, ten2mat(binary_tensor, 2).T).reshape([rank, rank, dim3])
        var4 = np.matmul(var1, ten2mat(sparse_tensor, 2).T)
        for t in range(dim3):
            Mt = np.zeros((rank, rank))
            Nt = np.zeros(rank)
            if t < max(time_lags):
                Pt = np.zeros((rank, rank))
                Qt = np.zeros(rank)
            else:
                Pt = np.eye(rank)
                Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :])
            if t < dim3 - np.min(time_lags):
                if t >= np.max(time_lags) and t < dim3 - np.max(time_lags):
                    index = list(range(0, d))
                else:
                    index = list(np.where((t + time_lags >= np.max(time_lags)) & (t + time_lags < dim3)))[0]
                for k in index:
                    theta0 = theta.copy()
                    theta0[k, :] = 0
                    Mt = Mt + np.diag(theta[k, :] ** 2);
                    Nt = Nt + np.multiply(theta[k, :], (X[t + time_lags[k], :] 
                                                        - np.einsum('ij, ij -> j', theta0,
                                                                    X[t + time_lags[k] - time_lags, :])))
                X[t, :] = np.matmul(np.linalg.inv(var3[:, :, t]
                                                  + lambda_ar * Pt + lambda_ar * Mt + lambda_ar * eta * np.eye(rank)),
                                    (var4[:, t] + lambda_ar * Qt + lambda_ar * Nt))
            elif t >= dim3 - np.min(time_lags):
                X[t, :] = np.matmul(np.linalg.inv(var3[:, :, t]
                                                  + lambda_ar * Pt +
                                                  lambda_ar * eta * np.eye(rank)), (var4[:, t] + Qt))

        for k in range(d):
            var1 = X[np.max(time_lags) - time_lags[k] : dim3 - time_lags[k], :]
            var2 = np.linalg.inv(np.diag(np.einsum('ij, ij -> j', var1, var1))
                                 + (lambda_theta / lambda_ar) * np.eye(rank))
            var3 = np.zeros(rank)
            for t in range(np.max(time_lags) - time_lags[k], dim3 - time_lags[k]):
                var3 += np.multiply(X[t, :], (X[t + time_lags[k], :] 
                                              - np.einsum('ij, ij -> j', theta, X[t + time_lags[k] - time_lags, :])
                                              + np.multiply(theta[k, :], X[t, :])))
            theta[k, :] = np.matmul(var2, var3)
            
        tensor_hat = cp_combine(U, V, X)
        mape = np.sum(np.abs(dense_tensor[pos] - 
                             tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
        rmse = np.sqrt(np.sum((dense_tensor[pos] -
                               tensor_hat[pos])**2)/dense_tensor[pos].shape[0])
        
        if (iters + 1) % 100 == 0:
            print('Iter: {}'.format(iters + 1))
            print('MAPE: {:.6}'.format(mape))
            print('RMSE: {:.6}'.format(rmse))
            print()

    return U, V, X, theta
In [6]:
def OnlineTRTF(sparse_mat, init, lambda_x, time_lags):
    U = init["U"]
    V = init["V"]
    X = init["X"]
    theta = init["theta"]
    dim1, dim2 = sparse_mat.shape
    t, rank = X.shape
    sparse_tensor = np.zeros((dim1, dim2, 1))
    sparse_tensor[:, :, 0] = sparse_mat
    position = np.where(sparse_tensor != 0)
    binary_tensor = np.zeros(sparse_tensor.shape)
    binary_tensor[position] = 1
    
    xt_tilde = np.einsum('ij, ij -> j', theta, X[t - 1 - time_lags, :])
    var1 = kr_prod(V, U).T
    var2 = kr_prod(var1, var1)
    var_mu = np.matmul(var1, ten2mat(sparse_tensor, 2).reshape([dim1 * dim2])) + lambda_x * xt_tilde
    inv_var_Lambda = inv(np.matmul(var2, ten2mat(binary_tensor, 2).reshape([dim1 * dim2])).reshape([rank, rank]) + lambda_x * np.eye(rank))
    X[t - 1, :] = np.matmul(inv_var_Lambda, var_mu)    
    return X
In [7]:
def st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                  lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter):
    start_time = dense_tensor.shape[2] - pred_time_steps
    dense_tensor0 = dense_tensor[:, :, 0 : start_time]
    sparse_tensor0 = sparse_tensor[:, :, 0 : start_time]
    dim1, dim2, dim3 = sparse_tensor0.shape
    tensor_hat = np.zeros((dim1, dim2, pred_time_steps))
    
    for t in range(pred_time_steps):
        if t == 0:
            U0 = 0.1 * np.random.rand(dim1, rank)
            V0 = 0.1 * np.random.rand(dim2, rank)
            X0 = 0.1 * np.random.rand(dim3, rank)
            theta0 = 0.1 * np.random.rand(time_lags.shape[0], rank)
            U, V, X, theta = TRTF(dense_tensor0, sparse_tensor0, U0, V0, X0, theta0, time_lags, 
                                  lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
            U0 = U.copy()
            V0 = V.copy()
            X0 = np.zeros((dim3 + t + 1, rank))
            X0[0 : dim3 + t, :] = X.copy()
            X0[dim3 + t, :] = np.einsum('ij, ij -> j', theta, X0[dim3 + t - time_lags, :])
            theta0 = theta.copy()
            
        else:
            sparse_mat = sparse_tensor[:, :, start_time + t - 1]
            if np.where(sparse_mat > 0)[0].shape[0] > rank:
                init = {"U": U, "V": V, "X": X0[- np.max(time_lags) - 1 :, :], "theta": theta}
                X = OnlineTRTF(sparse_mat, init, lambda_ar/dim3, time_lags)
                X0 = np.zeros((np.max(time_lags) + 1, rank))
                X0[0 : np.max(time_lags), :] = X[1 :, :].copy()
                X0[np.max(time_lags), :] = np.einsum('ij, ij -> j', theta, X0[np.max(time_lags) - time_lags, :])
            else:
                X0 = np.zeros((np.max(time_lags) + 1, rank))
                X0[0 : np.max(time_lags), :] = X[1 :, :]
                X0[np.max(time_lags), :] = np.einsum('ij, ij -> j', theta, X0[np.max(time_lags) - time_lags, :])
        tensor_hat[:, :, t] = np.einsum('ir, jr, r -> ij', U, V, X0[-1, :])
        if (t + 1) % 40 == 0:
            print('Time step: {}'.format(t + 1))
            
    small_dense_tensor = dense_tensor[:, :, start_time : dense_tensor.shape[2]]
    pos = np.where(small_dense_tensor > 0)
    final_mape = np.sum(np.abs(small_dense_tensor[pos]
                               - tensor_hat[pos])/small_dense_tensor[pos])/small_dense_tensor[pos].shape[0]
    final_rmse = np.sqrt(np.sum((small_dense_tensor[pos]
                                 - tensor_hat[pos]) ** 2)/small_dense_tensor[pos].shape[0])
    print('Final Prediction MAPE: {:.6}'.format(final_mape))
    print('Final Prediction RMSE: {:.6}'.format(final_rmse))
    print()
    return tensor_hat
In [12]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.1

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
# binary_tensor = np.zeros(dense_tensor.shape)
# for i1 in range(dense_tensor.shape[0]):
#     for i2 in range(dense_tensor.shape[1]):
#         for i3 in range(61):
#             binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [17]:
import time
start = time.time()
pred_time_steps = 24 * 7
rank = 30
time_lags = np.array([1, 2, 24])
maxiter = 200
theta = 0.1 * np.random.rand(time_lags.shape[0], rank)
lambda_u = 500
lambda_v = 500
lambda_ar = 500
eta = 2e-2
lambda_theta = 100
tensor_hat = st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                        lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
MAPE: 0.506813
RMSE: 4.76706

Iter: 200
MAPE: 0.501324
RMSE: 4.77889

Time step: 40
Time step: 80
Time step: 120
Time step: 160
Final Prediction MAPE: 0.557681
Final Prediction RMSE: 5.78127

Running time: 236 seconds
In [8]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.3

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
# binary_tensor = np.zeros(dense_tensor.shape)
# for i1 in range(dense_tensor.shape[0]):
#     for i2 in range(dense_tensor.shape[1]):
#         for i3 in range(61):
#             binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [10]:
import time
start = time.time()
pred_time_steps = 24 * 7
rank = 30
time_lags = np.array([1, 2, 24])
maxiter = 200
theta = 0.1 * np.random.rand(time_lags.shape[0], rank)
lambda_u = 500
lambda_v = 500
lambda_ar = 500
eta = 2e-2
lambda_theta = 100
tensor_hat = st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                        lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
MAPE: 0.5329
RMSE: 4.87829

Iter: 200
MAPE: 0.518681
RMSE: 4.85493

Time step: 40
Time step: 80
Time step: 120
Time step: 160
Final Prediction MAPE: 0.590883
Final Prediction RMSE: 6.19747

Running time: 165 seconds
In [11]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.1

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
# binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [12]:
import time
start = time.time()
pred_time_steps = 24 * 7
rank = 30
time_lags = np.array([1, 2, 24])
maxiter = 200
theta = 0.1 * np.random.rand(time_lags.shape[0], rank)
lambda_u = 500
lambda_v = 500
lambda_ar = 500
eta = 2e-2
lambda_theta = 100
tensor_hat = st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                        lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
MAPE: 0.511527
RMSE: 4.92398

Iter: 200
MAPE: 0.507012
RMSE: 4.87647

Time step: 40
Time step: 80
Time step: 120
Time step: 160
Final Prediction MAPE: 0.555969
Final Prediction RMSE: 5.75592

Running time: 156 seconds
In [13]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.3

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
# binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [14]:
import time
start = time.time()
pred_time_steps = 24 * 7
rank = 30
time_lags = np.array([1, 2, 24])
maxiter = 200
theta = 0.1 * np.random.rand(time_lags.shape[0], rank)
lambda_u = 500
lambda_v = 500
lambda_ar = 500
eta = 2e-2
lambda_theta = 100
tensor_hat = st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                        lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
MAPE: 0.516335
RMSE: 5.08794

Iter: 200
MAPE: 0.513973
RMSE: 5.08438

Time step: 40
Time step: 80
Time step: 120
Time step: 160
Final Prediction MAPE: 0.579093
Final Prediction RMSE: 6.06689

Running time: 164 seconds
In [15]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.0

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
# binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [16]:
import time
start = time.time()
pred_time_steps = 24 * 7
rank = 30
time_lags = np.array([1, 2, 24])
maxiter = 200
theta = 0.1 * np.random.rand(time_lags.shape[0], rank)
lambda_u = 500
lambda_v = 500
lambda_ar = 500
eta = 2e-2
lambda_theta = 100
tensor_hat = st_prediction(dense_tensor, sparse_tensor, pred_time_steps, rank, time_lags, 
                        lambda_u, lambda_v, lambda_ar, eta, lambda_theta, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
/Users/xinyuchen/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:78: RuntimeWarning: invalid value encountered in double_scalars
/Users/xinyuchen/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:80: RuntimeWarning: invalid value encountered in double_scalars
Iter: 100
MAPE: nan
RMSE: nan

Iter: 200
MAPE: nan
RMSE: nan

Time step: 40
Time step: 80
Time step: 120
Time step: 160
Final Prediction MAPE: 0.586943
Final Prediction RMSE: 5.6758

Running time: 160 seconds

Experiment results of rolling prediction with missing values using TRTF:

scenario lambda_u lambda_v lambda_ar lambda_theta eta rank time_lags maxiter mape rmse
Original data 500 500 500 100 0.02 30 (1,2,24) 200 0.5869 5.68
10%, RM 500 500 500 100 0.02 30 (1,2,24) 200 0.5577 5.78
30%, RM 500 500 500 100 0.02 30 (1,2,24) 200 0.5909 6.20
10%, NM 500 500 500 100 0.02 30 (1,2,24) 200 0.5560 5.76
30%, NM 500 500 500 100 0.02 30 (1,2,24) 200 0.5791 6.07