About this Notebook


Bayesian probabilistic tensor factorization (BPTF for short) is a Bayesian tensor decomposition model that achieves state-of-the-art results on the challenging missing data imputation problem; in particular, it places a temporal smoothness (first-order random walk) prior on the temporal factor matrix. If you want to understand BPTF and its modeling tricks in detail, this paper is for you:

L. Xiong, X. Chen, T.-K. Huang, J. Schneider, J. G. Carbonell, 2010. Temporal collaborative filtering with Bayesian probabilistic tensor factorization. In: Proceedings of the 2010 SIAM International Conference on Data Mining (SDM 2010).

Quick Run

This notebook is publicly available for any usage as part of our data imputation project. Please click transdim.

In [1]:
import numpy as np
from numpy.random import multivariate_normal as mvnrnd
from scipy.stats import wishart
from numpy.linalg import inv as inv

Part 1: Matrix Computation Concepts

1) Kronecker product

  • Definition:

Given two matrices $A\in\mathbb{R}^{m_1\times n_1}$ and $B\in\mathbb{R}^{m_2\times n_2}$, the Kronecker product of these two matrices is defined as

$$A\otimes B=\left[ \begin{array}{cccc} a_{11}B & a_{12}B & \cdots & a_{1n_1}B \\ a_{21}B & a_{22}B & \cdots & a_{2n_1}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m_1 1}B & a_{m_1 2}B & \cdots & a_{m_1 n_1}B \\ \end{array} \right]$$

where the symbol $\otimes$ denotes the Kronecker product, and the resulting $A\otimes B$ has size $(m_1m_2)\times (n_1n_2)$ (i.e., $m_1m_2$ rows and $n_1n_2$ columns).

  • Example:

If $A=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]$ and $B=\left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10 \\ \end{array} \right]$, then, we have

$$A\otimes B=\left[ \begin{array}{cc} 1\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] & 2\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] \\ 3\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] & 4\times \left[ \begin{array}{ccc} 5 & 6 & 7\\ 8 & 9 & 10\\ \end{array} \right] \\ \end{array} \right]$$$$=\left[ \begin{array}{cccccc} 5 & 6 & 7 & 10 & 12 & 14 \\ 8 & 9 & 10 & 16 & 18 & 20 \\ 15 & 18 & 21 & 20 & 24 & 28 \\ 24 & 27 & 30 & 32 & 36 & 40 \\ \end{array} \right]\in\mathbb{R}^{4\times 6}.$$
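As a quick numerical check, NumPy's built-in np.kron reproduces the example above:

In [ ]:
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6, 7], [8, 9, 10]])
print(np.kron(A, B))
print('shape:', np.kron(A, B).shape)   # (4, 6) = (m1 * m2, n1 * n2)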

2) Khatri-Rao product (kr_prod)

  • Definition:

Given two matrices $A=\left( \boldsymbol{a}_1,\boldsymbol{a}_2,...,\boldsymbol{a}_r \right)\in\mathbb{R}^{m\times r}$ and $B=\left( \boldsymbol{b}_1,\boldsymbol{b}_2,...,\boldsymbol{b}_r \right)\in\mathbb{R}^{n\times r}$ with the same number of columns, the Khatri-Rao product (or column-wise Kronecker product) of $A$ and $B$ is given as follows,

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2,...,\boldsymbol{a}_r\otimes \boldsymbol{b}_r \right)\in\mathbb{R}^{(mn)\times r},$$

where the symbol $\odot$ denotes Khatri-Rao product, and $\otimes$ denotes Kronecker product.

  • Example:

If $A=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]=\left( \boldsymbol{a}_1,\boldsymbol{a}_2 \right) $ and $B=\left[ \begin{array}{cc} 5 & 6 \\ 7 & 8 \\ 9 & 10 \\ \end{array} \right]=\left( \boldsymbol{b}_1,\boldsymbol{b}_2 \right) $, then, we have

$$A\odot B=\left( \boldsymbol{a}_1\otimes \boldsymbol{b}_1,\boldsymbol{a}_2\otimes \boldsymbol{b}_2 \right) $$$$=\left[ \begin{array}{cc} \left[ \begin{array}{c} 1 \\ 3 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 5 \\ 7 \\ 9 \\ \end{array} \right] & \left[ \begin{array}{c} 2 \\ 4 \\ \end{array} \right]\otimes \left[ \begin{array}{c} 6 \\ 8 \\ 10 \\ \end{array} \right] \\ \end{array} \right]$$$$=\left[ \begin{array}{cc} 5 & 12 \\ 7 & 16 \\ 9 & 20 \\ 15 & 24 \\ 21 & 32 \\ 27 & 40 \\ \end{array} \right]\in\mathbb{R}^{6\times 2}.$$
In [2]:
def kr_prod(a, b):
    return np.einsum('ir, jr -> ijr', a, b).reshape(a.shape[0] * b.shape[0], -1)
In [3]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8], [9, 10]])
print(kr_prod(A, B))
[[ 5 12]
 [ 7 16]
 [ 9 20]
 [15 24]
 [21 32]
 [27 40]]

3) CP decomposition

CP Combination (cp_combination)

  • Definition:

The CP decomposition factorizes a tensor into a sum of outer products of vectors. For example, for a third-order tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$, the CP decomposition can be written as

$$\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s},$$

or element-wise,

$$\hat{y}_{ijt}=\sum_{s=1}^{r}u_{is}v_{js}x_{ts},\forall (i,j,t),$$

where vectors $\boldsymbol{u}_{s}\in\mathbb{R}^{m},\boldsymbol{v}_{s}\in\mathbb{R}^{n},\boldsymbol{x}_{s}\in\mathbb{R}^{f}$ are columns of factor matrices $U\in\mathbb{R}^{m\times r},V\in\mathbb{R}^{n\times r},X\in\mathbb{R}^{f\times r}$, respectively. The symbol $\circ$ denotes vector outer product.

  • Example:

Given matrices $U=\left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} \right]\in\mathbb{R}^{2\times 2}$, $V=\left[ \begin{array}{cc} 1 & 3 \\ 2 & 4 \\ 5 & 6 \\ \end{array} \right]\in\mathbb{R}^{3\times 2}$ and $X=\left[ \begin{array}{cc} 1 & 5 \\ 2 & 6 \\ 3 & 7 \\ 4 & 8 \\ \end{array} \right]\in\mathbb{R}^{4\times 2}$, and letting $\hat{\mathcal{Y}}=\sum_{s=1}^{r}\boldsymbol{u}_{s}\circ\boldsymbol{v}_{s}\circ\boldsymbol{x}_{s}$, we have

$$\hat{Y}_1=\hat{\mathcal{Y}}(:,:,1)=\left[ \begin{array}{ccc} 31 & 42 & 65 \\ 63 & 86 & 135 \\ \end{array} \right],$$$$\hat{Y}_2=\hat{\mathcal{Y}}(:,:,2)=\left[ \begin{array}{ccc} 38 & 52 & 82 \\ 78 & 108 & 174 \\ \end{array} \right],$$$$\hat{Y}_3=\hat{\mathcal{Y}}(:,:,3)=\left[ \begin{array}{ccc} 45 & 62 & 99 \\ 93 & 130 & 213 \\ \end{array} \right],$$$$\hat{Y}_4=\hat{\mathcal{Y}}(:,:,4)=\left[ \begin{array}{ccc} 52 & 72 & 116 \\ 108 & 152 & 252 \\ \end{array} \right].$$
In [4]:
def cp_combine(U, V, X):
    return np.einsum('is, js, ts -> ijt', U, V, X)
In [5]:
U = np.array([[1, 2], [3, 4]])
V = np.array([[1, 3], [2, 4], [5, 6]])
X = np.array([[1, 5], [2, 6], [3, 7], [4, 8]])
print(cp_combine(U, V, X))
print()
print('tensor size:')
print(cp_combine(U, V, X).shape)
[[[ 31  38  45  52]
  [ 42  52  62  72]
  [ 65  82  99 116]]

 [[ 63  78  93 108]
  [ 86 108 130 152]
  [135 174 213 252]]]

tensor size:
(2, 3, 4)

4) Tensor Unfolding (ten2mat)

Using numpy's moveaxis and reshape to perform the mode-$k$ unfolding of a third-order tensor. [link]

In [6]:
def ten2mat(tensor, mode):
    return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')
In [7]:
X = np.array([[[1, 2, 3, 4], [3, 4, 5, 6]], 
              [[5, 6, 7, 8], [7, 8, 9, 10]], 
              [[9, 10, 11, 12], [11, 12, 13, 14]]])
print('tensor size:')
print(X.shape)
print('original tensor:')
print(X)
print()
print('(1) mode-1 tensor unfolding:')
print(ten2mat(X, 0))
print()
print('(2) mode-2 tensor unfolding:')
print(ten2mat(X, 1))
print()
print('(3) mode-3 tensor unfolding:')
print(ten2mat(X, 2))
tensor size:
(3, 2, 4)
original tensor:
[[[ 1  2  3  4]
  [ 3  4  5  6]]

 [[ 5  6  7  8]
  [ 7  8  9 10]]

 [[ 9 10 11 12]
  [11 12 13 14]]]

(1) mode-1 tensor unfolding:
[[ 1  3  2  4  3  5  4  6]
 [ 5  7  6  8  7  9  8 10]
 [ 9 11 10 12 11 13 12 14]]

(2) mode-2 tensor unfolding:
[[ 1  5  9  2  6 10  3  7 11  4  8 12]
 [ 3  7 11  4  8 12  5  9 13  6 10 14]]

(3) mode-3 tensor unfolding:
[[ 1  5  9  3  7 11]
 [ 2  6 10  4  8 12]
 [ 3  7 11  5  9 13]
 [ 4  8 12  6 10 14]]

5) Computing Covariance Matrix (cov_mat)

For any matrix $X\in\mathbb{R}^{m\times n}$, cov_mat returns the $n\times n$ scatter matrix $\sum_{i=1}^{m}\left(\boldsymbol{x}_{i}-\bar{\boldsymbol{x}}\right)\left(\boldsymbol{x}_{i}-\bar{\boldsymbol{x}}\right)^{\top}$ (the unnormalized sample covariance of the rows), which is used in the Normal-Wishart hyperparameter updates below.

In [8]:
def cov_mat(mat):
    """Return the scatter matrix of the rows of mat: sum_i (x_i - x_bar)(x_i - x_bar)^T."""
    dim1, dim2 = mat.shape
    new_mat = np.zeros((dim2, dim2))
    mat_bar = np.mean(mat, axis = 0)
    for i in range(dim1):
        new_mat += np.einsum('i, j -> ij', mat[i, :] - mat_bar, mat[i, :] - mat_bar)
    return new_mat
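As a quick sanity check, the output of cov_mat should match the vectorized scatter-matrix computation (a minimal check on an arbitrary 5 × 3 random matrix):

In [ ]:
mat = np.random.rand(5, 3)
dev = mat - np.mean(mat, axis = 0)
print(np.allclose(cov_mat(mat), dev.T @ dev))   # True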

Part 2: Bayesian probabilistic tensor factorization (BPTF)

1) Model Description

Gaussian assumption

Given a third-order tensor $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$ with missing entries, factorization can be applied to reconstruct the missing values within $\mathcal{Y}$ by assuming

$$y_{ijt}\sim\mathcal{N}\left(\sum_{s=1}^{r}u_{is} v_{js} x_{ts},\tau^{-1}\right),\forall (i,j,t),$$

where vectors $\boldsymbol{u}_{s}\in\mathbb{R}^{m},\boldsymbol{v}_{s}\in\mathbb{R}^{n},\boldsymbol{x}_{s}\in\mathbb{R}^{f}$ are columns of latent factor matrices, and $u_{is},v_{js},x_{ts}$ are their elements. The precision term $\tau$ is the inverse of the Gaussian variance.

Bayesian framework

Based on the Gaussian assumption over tensor elements $y_{ijt},(i,j,t)\in\Omega$ (where $\Omega$ is the index set of observed tensor elements), the conjugate priors over the model parameters (i.e., latent factors and the precision term) and hyperparameters are given as

$$\boldsymbol{u}_{i}\sim\mathcal{N}\left(\boldsymbol{\mu}_{u},\Lambda_{u}^{-1}\right),\forall i,$$$$\boldsymbol{v}_{j}\sim\mathcal{N}\left(\boldsymbol{\mu}_{v},\Lambda_{v}^{-1}\right),\forall j,$$$$\boldsymbol{x}_{t}\sim\mathcal{N}\left(\boldsymbol{x}_{t-1},\Lambda_{x}^{-1}\right),t>1,$$$$\boldsymbol{x}_{t}\sim\mathcal{N}\left(\boldsymbol{\mu}_{x},\Lambda_{x}^{-1}\right),t=1,$$$$\tau\sim\text{Gamma}\left(a_0,b_0\right),$$$$\boldsymbol{\mu}_{u}\sim\mathcal{N}\left(\boldsymbol{\mu}_0,\left(\beta_0\Lambda_u\right)^{-1}\right),\Lambda_u\sim\mathcal{W}\left(W_0,\nu_0\right),$$$$\boldsymbol{\mu}_{v}\sim\mathcal{N}\left(\boldsymbol{\mu}_0,\left(\beta_0\Lambda_v\right)^{-1}\right),\Lambda_v\sim\mathcal{W}\left(W_0,\nu_0\right),$$$$\boldsymbol{\mu}_{x}\sim\mathcal{N}\left(\boldsymbol{\mu}_0,\left(\beta_0\Lambda_x\right)^{-1}\right),\Lambda_x\sim\mathcal{W}\left(W_0,\nu_0\right).$$
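To make the generative story concrete, here is a minimal sketch that draws a small synthetic tensor from the priors above; the dimensions, rank, and the Gamma hyperparameters a0, b0 are arbitrary illustrative choices, not the settings used later in this notebook.

In [ ]:
import numpy as np
from numpy.random import multivariate_normal as mvnrnd
from numpy.linalg import inv as inv
from scipy.stats import wishart

m, n, f, r = 5, 4, 6, 2          # illustrative (small) dimensions and rank
beta0, nu0 = 1, r
mu0, W0 = np.zeros(r), np.eye(r)
a0, b0 = 2.0, 2.0                # illustrative Gamma hyperparameters for tau

def sample_factor_matrix(dim):
    # Draw (Lambda, mu) from the Normal-Wishart hyperprior, then dim factor vectors.
    Lam = wishart(df = nu0, scale = W0).rvs()
    mu = mvnrnd(mu0, inv(beta0 * Lam))
    return np.vstack([mvnrnd(mu, inv(Lam)) for _ in range(dim)])

U = sample_factor_matrix(m)
V = sample_factor_matrix(n)

# Temporal factors follow a first-order random walk: x_t ~ N(x_{t-1}, Lambda_x^{-1}).
Lam_x = wishart(df = nu0, scale = W0).rvs()
X = np.zeros((f, r))
X[0, :] = mvnrnd(mvnrnd(mu0, inv(beta0 * Lam_x)), inv(Lam_x))
for t in range(1, f):
    X[t, :] = mvnrnd(X[t - 1, :], inv(Lam_x))

# Observation model: y_ijt ~ N(sum_s u_is v_js x_ts, 1/tau).
tau = np.random.gamma(a0, 1/b0)
Y = np.einsum('is, js, ts -> ijt', U, V, X) + np.random.normal(0, 1/np.sqrt(tau), (m, n, f))
print(Y.shape)   # (5, 4, 6)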

2) Posterior Inference

For the detailed posterior inference (Gibbs sampling) updates, please refer to Xiong et al. (2010), where the full derivation is presented.
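To connect the paper with the code below: conditioned on everything else, each spatial factor $\boldsymbol{u}_{i}$ has a Gaussian conditional posterior of the standard form (the updates for $\boldsymbol{v}_{j}$ are analogous, and those for $\boldsymbol{x}_{t}$ additionally involve the neighboring time steps through the random-walk prior),

$$\boldsymbol{u}_{i}\sim\mathcal{N}\left(\boldsymbol{\mu}_{i}^{*},\left(\Lambda_{i}^{*}\right)^{-1}\right),\quad \Lambda_{i}^{*}=\Lambda_{u}+\tau\sum_{(j,t):(i,j,t)\in\Omega}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)^{\top},\quad \boldsymbol{\mu}_{i}^{*}=\left(\Lambda_{i}^{*}\right)^{-1}\left(\tau\sum_{(j,t):(i,j,t)\in\Omega}y_{ijt}\left(\boldsymbol{v}_{j}\circledast\boldsymbol{x}_{t}\right)+\Lambda_{u}\boldsymbol{\mu}_{u}\right),$$

where $\circledast$ denotes the element-wise (Hadamard) product. In the implementation below, var3 assembles the precision matrices $\Lambda_{i}^{*}$ and var4 assembles the terms inside the parentheses of $\boldsymbol{\mu}_{i}^{*}$.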

In [9]:
def BPTF(dense_tensor, sparse_tensor, init, rank, maxiter1, maxiter2):
    """Bayesian probabilistic tensor factorization."""
    dim1, dim2, dim3 = sparse_tensor.shape
    binary_tensor = np.zeros((dim1, dim2, dim3))
    dim = np.array([dim1, dim2, dim3])
    pos = np.where((dense_tensor != 0) & (sparse_tensor == 0))   # held-out entries used for evaluation
    position = np.where(sparse_tensor != 0)                      # observed entries
    binary_tensor[position] = 1
    
    U = init["U"]
    V = init["V"]
    X = init["X"]

    # Hyperparameters of the Normal-Wishart priors on the factor means/precisions
    # and of the Gamma prior on the noise precision tau.
    beta0 = 1
    nu0 = rank
    mu0 = np.zeros((rank))
    W0 = np.eye(rank)
    tau = 1
    alpha = 1e-6
    beta = 1e-6
    rho = np.zeros((rank))   # prior mean of the initial temporal factor x_1
    
    U_plus = np.zeros((dim1, rank))
    V_plus = np.zeros((dim2, rank))
    X_plus = np.zeros((dim3, rank))
    tensor_hat_plus = np.zeros((dim1, dim2, dim3))
    for iters in range(maxiter1):
        # Sample the factor matrices U (order = 0) and V (order = 1), each with
        # its Normal-Wishart hyperparameters.
        for order in range(2):
            if order == 0:
                mat = U.copy()
            elif order == 1:
                mat = V.copy()
            mat_bar = np.mean(mat, axis = 0)
            var_mu_hyper = (dim[order] * mat_bar + beta0 * mu0)/(dim[order] + beta0)
            var_W_hyper = inv(inv(W0) + cov_mat(mat) + dim[order] * beta0/(dim[order] + beta0)
                              * np.outer(mat_bar - mu0, mat_bar - mu0))
            var_Lambda_hyper = wishart(df = dim[order] + nu0, scale = var_W_hyper, seed = None).rvs()
            var_mu_hyper = mvnrnd(var_mu_hyper, inv((dim[order] + beta0) * var_Lambda_hyper))

            if order == 0:
                var1 = kr_prod(X, V).T
            elif order == 1:
                var1 = kr_prod(X, U).T
            var2 = kr_prod(var1, var1)
            var3 = (tau * np.matmul(var2, ten2mat(binary_tensor, order).T).reshape([rank, rank, dim[order]])
                    + np.dstack([var_Lambda_hyper] * dim[order]))
            var4 = (tau * np.matmul(var1, ten2mat(sparse_tensor, order).T) 
                    + np.dstack([np.matmul(var_Lambda_hyper, var_mu_hyper)] * dim[order])[0, :, :])
            for i in range(dim[order]):
                var_Lambda = var3[ :, :, i]
                inv_var_Lambda = inv((var_Lambda + var_Lambda.T)/2)
                vec = mvnrnd(np.matmul(inv_var_Lambda, var4[:, i]), inv_var_Lambda)
                if order == 0:
                    U[i, :] = vec.copy()
                elif order == 1:
                    V[i, :] = vec.copy()
        
        # Sample the hyperparameters of the temporal factor matrix X (random-walk prior).
        var_mu_hyper = (beta0 * rho + X[0, :])/(beta0 + 1)
        var_W_hyper = inv(inv(W0) + np.matmul((X[1 : dim3, :] - X[0 : dim3 - 1, :]).T, 
                                              X[1 : dim3, :] - X[0 : dim3 - 1, :]) 
                          + (beta0 * np.outer(X[0, :] - rho, X[0, :] - rho))/(1 + beta0))
        var_Lambda_hyper = wishart(df = dim3 + nu0, scale = var_W_hyper, seed = None).rvs()
        var_mu_hyper = mvnrnd(var_mu_hyper, inv((1 + beta0) * var_Lambda_hyper))
        
        var1 = kr_prod(V, U).T
        var2 = kr_prod(var1, var1)
        var3 = (tau * np.matmul(var2, ten2mat(binary_tensor, 2).T).reshape([rank, rank, dim3])
               + np.dstack([var_Lambda_hyper] * dim3))
        var4 = tau * np.matmul(var1, ten2mat(sparse_tensor, 2).T)
        # Sample the temporal factors X[t, :] one time step at a time.
        for t in range(dim3):
            if t == 0:
                var_mu = (X[t + 1, :] + var_mu_hyper)/2
                var_Lambda = var_Lambda_hyper + var3[:, :, t]
                inv_var_Lambda = inv((var_Lambda + var_Lambda.T)/2)
            elif t == dim3 - 1:
                inv_var_Lambda = inv((var3[:, :, t] + var3[:, :, t].T)/2)
                var_mu = np.matmul(inv_var_Lambda, var4[:, t] + np.matmul(var_Lambda_hyper, X[t - 1, :]))
            else:
                var_Lambda = var_Lambda_hyper + var3[:, :, t]
                inv_var_Lambda = inv((var_Lambda + var_Lambda.T)/2)
                var_mu = np.matmul(inv_var_Lambda, var4[:, t] 
                                   + np.matmul(var_Lambda_hyper, X[t + 1, :] + X[t - 1, :]))
            X[t, :] = mvnrnd(var_mu, inv_var_Lambda)

        # After burn-in, accumulate samples for posterior averaging.
        if iters + 1 > maxiter1 - maxiter2:
            U_plus += U
            V_plus += V
            X_plus += X

        tensor_hat = cp_combine(U, V, X)
        if iters + 1 > maxiter1 - maxiter2:
            tensor_hat_plus += tensor_hat
        rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
        
        # Sample the noise precision tau from its Gamma conditional posterior.
        var_alpha = alpha + 0.5 * sparse_tensor[position].shape[0]
        error = sparse_tensor - tensor_hat
        var_beta = beta + 0.5 * np.sum(error[position] ** 2)
        tau = np.random.gamma(var_alpha, 1/var_beta)
        
        if (iters + 1) % 200 == 0 and iters < maxiter1 - maxiter2:
            print('Iter: {}'.format(iters + 1))
            print('RMSE: {:.6}'.format(rmse))
            print()
        
    # Average the samples collected during the last maxiter2 iterations.
    U = U_plus/maxiter2
    V = V_plus/maxiter2
    X = X_plus/maxiter2
    tensor_hat = tensor_hat_plus/maxiter2
    final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos])/dense_tensor[pos])/dense_tensor[pos].shape[0]
    final_rmse = np.sqrt(np.sum((dense_tensor[pos] - tensor_hat[pos]) ** 2)/dense_tensor[pos].shape[0])
    print('Final MAPE: {:.6}'.format(final_mape))
    print('Final RMSE: {:.6}'.format(final_rmse))
    print()

How to transform a data set into something we can use for time series imputation?

In [10]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.1

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
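A short sanity check of the RM mask (assuming rm_tensor holds uniform random values in [0, 1)): an entry is masked exactly when rm_tensor < missing_rate, so the realized missing rate should be close to the nominal one.

In [ ]:
print('nominal missing rate: {}'.format(missing_rate))
print('realized missing rate: {:.4f}'.format(1 - binary_tensor.mean()))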

Question: Given only the partially observed data $\mathcal{Y}\in\mathbb{R}^{m\times n\times f}$, how can we impute the unknown missing values?

The main influential factors for this imputation model are:

  • rank.

  • maxiter1.

  • maxiter2.

In [11]:
import time
start = time.time()
dim1, dim2, dim3 = sparse_tensor.shape
rank = 30
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
init = {"U": 0.1 * np.random.rand(dim1, rank),
       "V": 0.1 * np.random.rand(dim2, rank),
       "X": 0.1 * np.random.rand(dim3, rank)}
maxiter1 = 1100
maxiter2 = 100
BPTF(dense_tensor, sparse_tensor, init, rank, maxiter1, maxiter2)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 200
RMSE: 5.07312

Iter: 400
RMSE: 5.01909

Iter: 600
RMSE: 4.96905

Iter: 800
RMSE: 4.91407

Iter: 1000
RMSE: 4.91866

Final MAPE: 0.518978
Final RMSE: 4.68414

Running time: 1281 seconds
In [12]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.3

# =============================================================================
### Random missing (RM) scenario
### Set the RM scenario by:
binary_tensor = np.round(rm_tensor + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [13]:
import time
start = time.time()
dim1, dim2, dim3 = sparse_tensor.shape
rank = 30
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
init = {"U": 0.1 * np.random.rand(dim1, rank),
       "V": 0.1 * np.random.rand(dim2, rank),
       "X": 0.1 * np.random.rand(dim3, rank)}
maxiter1 = 1100
maxiter2 = 100
BPTF(dense_tensor, sparse_tensor, init, rank, maxiter1, maxiter2)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 200
RMSE: 5.13092

Iter: 400
RMSE: 5.07661

Iter: 600
RMSE: 5.05075

Iter: 800
RMSE: 5.03971

Iter: 1000
RMSE: 5.01414

Final MAPE: 0.52724
Final RMSE: 4.77209

Running time: 1393 seconds
In [14]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.1

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            # Keep or drop a whole day (24 hourly slots) at once.
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
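In the NM scenario missingness is decided at the day level: for each (row, column, day) triple, a single draw from nm_tensor keeps or drops the whole block of 24 hourly slots. A small check that the mask is indeed block-structured (this sketch assumes the third mode spans exactly 61 days × 24 hours):

In [ ]:
# Reshape the mask to (rows, columns, days, hours); each day should be all 0s or all 1s.
days = binary_tensor.reshape(binary_tensor.shape[0], binary_tensor.shape[1], 61, 24)
print(np.all(days.min(axis = -1) == days.max(axis = -1)))   # True: each day is kept or dropped as a block
print('realized missing rate: {:.4f}'.format(1 - binary_tensor.mean()))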
In [15]:
import time
start = time.time()
dim1, dim2, dim3 = sparse_tensor.shape
rank = 30
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
init = {"U": 0.1 * np.random.rand(dim1, rank),
       "V": 0.1 * np.random.rand(dim2, rank),
       "X": 0.1 * np.random.rand(dim3, rank)}
maxiter1 = 1100
maxiter2 = 100
BPTF(dense_tensor, sparse_tensor, init, rank, maxiter1, maxiter2)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 200
RMSE: 5.69116

Iter: 400
RMSE: 5.38608

Iter: 600
RMSE: 5.52483

Iter: 800
RMSE: 5.38108

Iter: 1000
RMSE: 5.39472

Final MAPE: 0.527338
Final RMSE: 5.00885

Running time: 1375 seconds
In [16]:
import scipy.io

tensor = scipy.io.loadmat('../NYC-data-set/tensor.mat')
dense_tensor = tensor['tensor']
rm_tensor = scipy.io.loadmat('../NYC-data-set/rm_tensor.mat')
rm_tensor = rm_tensor['rm_tensor']
nm_tensor = scipy.io.loadmat('../NYC-data-set/nm_tensor.mat')
nm_tensor = nm_tensor['nm_tensor']

missing_rate = 0.3

# =============================================================================
### Non-random missing (NM) scenario
### Set the NM scenario by:
binary_tensor = np.zeros(dense_tensor.shape)
for i1 in range(dense_tensor.shape[0]):
    for i2 in range(dense_tensor.shape[1]):
        for i3 in range(61):
            binary_tensor[i1, i2, i3 * 24 : (i3 + 1) * 24] = np.round(nm_tensor[i1, i2, i3] + 0.5 - missing_rate)
# =============================================================================

sparse_tensor = np.multiply(dense_tensor, binary_tensor)
In [17]:
import time
start = time.time()
dim1, dim2, dim3 = sparse_tensor.shape
rank = 30
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
init = {"U": 0.1 * np.random.rand(dim1, rank),
       "V": 0.1 * np.random.rand(dim2, rank),
       "X": 0.1 * np.random.rand(dim3, rank)}
maxiter1 = 1100
maxiter2 = 100
BPTF(dense_tensor, sparse_tensor, init, rank, maxiter1, maxiter2)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 200
RMSE: 5.79731

Iter: 400
RMSE: 5.70744

Iter: 600
RMSE: 5.69342

Iter: 800
RMSE: 5.65908

Iter: 1000
RMSE: 5.53753

Final MAPE: 0.525283
Final RMSE: 5.25305

Running time: 1285 seconds

Experiment results of missing data imputation using BPTF:

| scenario (missing rate, pattern) | rank | maxiter1 | maxiter2 | MAPE | RMSE |
|---|---|---|---|---|---|
| 0.1, RM | 30 | 1100 | 100 | 0.5190 | 4.6841 |
| 0.3, RM | 30 | 1100 | 100 | 0.5272 | 4.7721 |
| 0.1, NM | 30 | 1100 | 100 | 0.5273 | 5.0089 |
| 0.3, NM | 30 | 1100 | 100 | 0.5253 | 5.2531 |