Temporal Regularized Matrix Factorization (TRMF) is an effective tool for imputing missing data in multivariate time series and for forecasting time series with missing values. This approach was proposed in the following literature:
Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon, 2016. Temporal regularized matrix factorization for high-dimensional time series prediction. 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.
Acknowledgement: We would like to thank ... for providing helpful suggestions and discussion.
In this post, we consider a dataset of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{f},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We express this spatio-temporal dataset as a matrix $Y\in\mathbb{R}^{m\times f}$ with $m$ rows (e.g., locations) and $f$ columns (e.g., discrete time intervals),
$$Y=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mf} \\ \end{array} \right]\in\mathbb{R}^{m\times f}.$$Temporal Regularized Matrix Factorization (TRMF) is an approach that incorporates temporal dependencies into the commonly used matrix factorization model $Y\approx WX^\top$, where the rows $\boldsymbol{w}_{i}\in\mathbb{R}^{r}$ of $W$ are spatial factors and the rows $\boldsymbol{x}_{t}\in\mathbb{R}^{r}$ of $X$ are temporal factors. The temporal dependencies among $\left\{\boldsymbol{x}_{t}\right\}$ are described explicitly by an autoregressive model of the form:
$$\boldsymbol{x}_{t}\approx\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l},$$where this autoregressive (AR) model is specified by a lag set $\mathcal{L}=\left\{l_1,l_2,...,l_d\right\}$ (e.g., $\mathcal{L}=\left\{1,2,144\right\}$) and weight vectors $\boldsymbol{\theta}_{l}\in\mathbb{R}^{r},\forall l$, with $\circledast$ denoting the element-wise (Hadamard) product. We further define
$$\mathcal{R}_{AR}\left(X\mid \mathcal{L},\Theta,\eta\right)=\frac{1}{2}\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)+\frac{\eta}{2}\sum_{t=1}^{f}\boldsymbol{x}_{t}^\top\boldsymbol{x}_{t}.$$Thus, TRMF-AR is given by solving
$$\min_{W,X,\Theta}\frac{1}{2}\underbrace{\sum_{(i,t)\in\Omega}\left(y_{it}-\boldsymbol{w}_{i}^T\boldsymbol{x}_{t}\right)^2}_{\text{sum of squared residual errors}}+\lambda_{w}\underbrace{\mathcal{R}_{w}\left(W\right)}_{W-\text{regularizer}}+\lambda_{x}\underbrace{\mathcal{R}_{AR}\left(X\mid \mathcal{L},\Theta,\eta\right)}_{\text{AR-regularizer}}+\lambda_{\theta}\underbrace{\mathcal{R}_{\theta}\left(\Theta\right)}_{\Theta-\text{regularizer}}$$where $\mathcal{R}_{w}\left(W\right)=\frac{1}{2}\sum_{i=1}^{m}\boldsymbol{w}_{i}^\top\boldsymbol{w}_{i}$ and $\mathcal{R}_{\theta}\left(\Theta\right)=\frac{1}{2}\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}^\top\boldsymbol{\theta}_{l}$ are regularization terms.
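To make the objective concrete, here is a small sketch that evaluates it for given factors. The helper trmf_objective is hypothetical (it is not part of the implementation below), and it assumes, as the code below does, that zero entries of $Y$ mark missing values:
import numpy as np

def trmf_objective(Y, W, X, theta, time_lags, lambda_w, lambda_x, lambda_theta, eta):
    """Hypothetical sketch: evaluate the TRMF-AR objective for given factors."""
    pos = np.where(Y != 0)                                  # observed index set Omega
    resid = Y[pos] - np.einsum('ij, ij -> i', W[pos[0], :], X[pos[1], :])
    obj = 0.5 * np.sum(resid ** 2)                          # sum of squared residual errors
    ld = int(np.max(time_lags))                             # largest lag l_d
    ar = X[ld:, :] - sum(theta[k, :] * X[ld - time_lags[k] : X.shape[0] - time_lags[k], :]
                         for k in range(len(time_lags)))    # AR residuals for t > l_d
    obj += lambda_x * (0.5 * np.sum(ar ** 2) + 0.5 * eta * np.sum(X ** 2))
    obj += 0.5 * lambda_w * np.sum(W ** 2) + 0.5 * lambda_theta * np.sum(theta ** 2)
    return obj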
Numpy
Observing the optimization problem of the TRMF model above, we categorize the variables of this model into parameters (i.e., init_para in the TRMF function below) and hyperparameters (i.e., init_hyper).
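As a quick orientation, these two dictionaries are expected to look like the following sketch; the shapes follow the notation above ($W\in\mathbb{R}^{m\times r}$, $X\in\mathbb{R}^{f\times r}$, $\Theta\in\mathbb{R}^{d\times r}$), while the concrete sizes and values here are illustrative assumptions:
import numpy as np

m, f, rank = 50, 1000, 10            # illustrative sizes: series, time steps, latent rank
time_lags = np.array([1, 2, 144])    # example lag set L
d = time_lags.shape[0]
## Parameters: the factor matrices that TRMF estimates
init_para = {"W": 0.1 * np.random.rand(m, rank),      # spatial factors, one row per series
             "X": 0.1 * np.random.rand(f, rank),      # temporal factors, one row per time step
             "theta": 0.1 * np.random.rand(d, rank)}  # AR coefficients, one row per lag
## Hyperparameters: regularization weights, fixed before training
init_hyper = {"lambda_w": 500, "lambda_x": 500, "lambda_theta": 500, "eta": 0.03}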
We write Python code for updating the spatial factor matrix $W$ as follows,
for i in range(dim1):
    pos0 = np.where(sparse_mat[i, :] != 0)           # observed time indices for series i
    Xt = X[pos0[0], :]                               # temporal factors at those indices
    vec0 = Xt.T @ sparse_mat[i, pos0[0]]             # right-hand side: sum_t y_it * x_t
    mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank))  # inverse of the regularized Gram matrix
    W[i, :] = mat0 @ vec0                            # closed-form ridge update of w_i
To better understand this code, let us look at what happens in each line. Recall that the equation for updating $W$ is $$\boldsymbol{w}_{i} \Leftarrow\left(\sum_{t:(i, t) \in \Omega} \boldsymbol{x}_{t} \boldsymbol{x}_{t}^{T}+\lambda_{w} I\right)^{-1} \sum_{t:(i, t) \in \Omega} y_{i t} \boldsymbol{x}_{t}$$ which follows from the optimization problem: $$\min _{W} \frac{1}{2} \underbrace{\sum_{(i, t) \in \Omega}\left(y_{i t}-\boldsymbol{w}_{i}^{T} \boldsymbol{x}_{t}\right)^{2}}_{\text {sum of squared residual errors }}+\frac{1}{2} \lambda_{w} \underbrace{\sum_{i=1}^{m} \boldsymbol{w}_{i}^{T} \boldsymbol{w}_{i}}_{\text{sum of squared entries}}.$$
As can be seen,
vec0 = Xt.T @ sparse_mat[i, pos0[0]]
corresponds to $$\sum_{t:(i, t) \in \Omega} y_{i t} \boldsymbol{x}_{t}.$$
mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank))
corresponds to $$\left(\sum_{t:(i, t) \in \Omega} \boldsymbol{x}_{t} \boldsymbol{x}_{t}^{T}+\lambda_{w} I\right)^{-1}.$$
W[i, :] = mat0 @ vec0
corresponds to the closed-form update of $\boldsymbol{w}_{i}$.
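As a side note, forming the explicit inverse is acceptable here because the matrix is only rank × rank, but the same update can be computed with np.linalg.solve, which avoids the explicit inverse and is usually faster and more numerically stable. A sketch of the equivalent loop body, reusing the names from the loop above:
## Sketch: equivalent w_i update via a linear solve (no explicit inverse)
pos0 = np.where(sparse_mat[i, :] != 0)
Xt = X[pos0[0], :]
W[i, :] = np.linalg.solve(Xt.T @ Xt + lambda_w * np.eye(rank),
                          Xt.T @ sparse_mat[i, pos0[0]])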
We write Python code for updating the temporal factor matrix $X$ as follows,
for t in range(dim2):
    pos0 = np.where(sparse_mat[:, t] != 0)   # observed series at time t
    Wt = W[pos0[0], :]
    Mt = np.zeros((rank, rank))              # accumulates AR terms from future residuals
    Nt = np.zeros(rank)
    if t < np.max(time_lags):                # x_t is not yet covered by a complete AR residual
        Pt = np.zeros((rank, rank))
        Qt = np.zeros(rank)
    else:                                    # the AR residual at time t involves x_t directly
        Pt = np.eye(rank)
        Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :])
    if t < dim2 - np.min(time_lags):
        if t >= np.max(time_lags) and t < dim2 - np.max(time_lags):
            index = list(range(0, d))        # interior: every lag contributes
        else:                                # boundary: keep only lags with a valid t + h
            index = list(np.where((t + time_lags >= np.max(time_lags)) & (t + time_lags < dim2)))[0]
        for k in index:
            Ak = theta[k, :]
            Mt += np.diag(Ak ** 2)
            theta0 = theta.copy()
            theta0[k, :] = 0
            Nt += np.multiply(Ak, X[t + time_lags[k], :]
                              - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :]))
    vec0 = Wt.T @ sparse_mat[pos0[0], t] + lambda_x * Nt + lambda_x * Qt
    mat0 = inv(Wt.T @ Wt + lambda_x * Mt + lambda_x * Pt + lambda_x * eta * np.eye(rank))
    X[t, :] = mat0 @ vec0
This code looks complicated, so let us first look at the optimization problem that yields the closed-form update of $X$: $$\min_{X}\frac{1}{2}\underbrace{\sum_{(i,t)\in\Omega}\left(y_{it}-\boldsymbol{w}_{i}^T\boldsymbol{x}_{t}\right)^2}_{\text{sum of squared residual errors}}+\underbrace{\frac{1}{2}\lambda_{x}\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)+\frac{1}{2}\lambda_{x}\eta\sum_{t=1}^{f}\boldsymbol{x}_{t}^\top\boldsymbol{x}_{t}}_{\text{AR-term}}.$$
Then, as can be seen,
Mt += np.diag(Ak ** 2)
corresponds to $$\sum_{h\in\mathcal{L},\,t+h \leq f}\text{diag}(\boldsymbol{\theta}_{h}\circledast\boldsymbol{\theta}_{h}).$$
Nt += np.multiply(Ak, X[t + time_lags[k], :] - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :]))
corresponds to $$\sum_{h\in\mathcal{L},\,t+h \leq f}\boldsymbol{\theta}_{h}\circledast\boldsymbol{\psi}_{t+h},$$ where $\boldsymbol{\psi}_{t+h}=\boldsymbol{x}_{t+h}-\sum_{l\in\mathcal{L},l\neq h}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t+h-l}$ is the AR residual at time $t+h$ with the lag-$h$ term excluded.
Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :])
corresponds to $$\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}.$$
X[t, :] = mat0 @ vec0
corresponds to the closed-form update of $\boldsymbol{x}_{t}$.
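The trickiest part is the boundary handling: a lag $h$ contributes to the update of $\boldsymbol{x}_{t}$ only if the AR residual at time $t+h$ actually exists, i.e., $t+h\geq l_d$ and $t+h<f$ (0-based indexing). The toy sketch below, with assumed values unrelated to the experiments, prints which lags enter the update at a few time points:
import numpy as np

time_lags = np.array([1, 2, 144])   # toy lag set
dim2 = 300                          # toy horizon f
d = time_lags.shape[0]
for t in [0, 100, 150, 298, 299]:
    if t < dim2 - np.min(time_lags):
        if t >= np.max(time_lags) and t < dim2 - np.max(time_lags):
            index = list(range(0, d))          # interior: every lag contributes
        else:                                   # boundary: keep only valid t + h
            index = list(np.where((t + time_lags >= np.max(time_lags))
                                  & (t + time_lags < dim2)))[0]
    else:
        index = []                              # final step: no future residual involves x_t
    print(t, '->', [int(time_lags[k]) for k in index])
## prints: 0 -> [144], 100 -> [144], 150 -> [1, 2, 144], 298 -> [1], 299 -> []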
We write Python code for updating the AR coefficient matrix $\Theta$ as follows,
for k in range(d):
    theta0 = theta.copy()
    theta0[k, :] = 0                                # exclude lag h = time_lags[k] to form pi_t^h
    mat0 = np.zeros((dim2 - np.max(time_lags), rank))
    for L in range(d):
        mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L], :] @ np.diag(theta0[L, :])
    VarPi = X[np.max(time_lags) : dim2, :] - mat0   # pi_t^h stacked over t
    var1 = np.zeros((rank, rank))
    var2 = np.zeros(rank)
    for t in range(np.max(time_lags), dim2):
        B = X[t - time_lags[k], :]
        var1 += np.diag(np.multiply(B, B))          # diag(x_{t-h} * x_{t-h})
        var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :]
    theta[k, :] = inv(var1 + lambda_theta * np.eye(rank) / lambda_x) @ var2
To better understand this code, let us look at what happens in each line. Recall that the equation for updating $\Theta$ is $$\boldsymbol{\theta}_{h}\Leftarrow\left(\sum_{t=l_d+1}^{f}\text{diag}(\boldsymbol{x}_{t-h}\circledast \boldsymbol{x}_{t-h})+\frac{\lambda_{\theta}}{\lambda_x}I\right)^{-1}\left(\sum_{t=l_d+1}^{f}{\boldsymbol{\pi}_{t}^{h}}\circledast \boldsymbol{x}_{t-h}\right)$$ where $\boldsymbol{\pi}_{t}^{h}=\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L},l\neq h}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}$, which follows from the optimization problem: $$ \min_{\Theta}\frac{1}{2}\lambda_{x}\underbrace{\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)}_{\text{sum of squared residual errors}}+\frac{1}{2}\lambda_{\theta}\underbrace{\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}^\top\boldsymbol{\theta}_{l}}_{\text{sum of squared entries}} $$
As can be seen,
mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L], :] @ np.diag(theta0[L, :])
corresponds to $$\sum_{l\in\mathcal{L},l\neq h}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}.$$
var1 += np.diag(np.multiply(B, B))
corresponds to $$\sum_{t=l_d+1}^{f}\text{diag}(\boldsymbol{x}_{t-h}\circledast \boldsymbol{x}_{t-h}).$$
var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :]
corresponds to $$\sum_{t=l_d+1}^{f}{\boldsymbol{\pi}_{t}^{h}}\circledast \boldsymbol{x}_{t-h}.$$
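Since $\circledast$ is element-wise, var1 is diagonal, so the $\boldsymbol{\theta}_{h}$ update actually decouples into $r$ independent scalar ridge regressions and needs no matrix inverse at all. A sketch of this view, with a hypothetical helper update_theta_k that is equivalent to the loop above for the same inputs:
import numpy as np

def update_theta_k(X, theta, time_lags, k, lambda_x, lambda_theta):
    """Hypothetical sketch: the theta_h update as r scalar ridge regressions."""
    tmax = int(np.max(time_lags))
    f = X.shape[0]
    theta0 = theta.copy()
    theta0[k, :] = 0                       # exclude lag h = time_lags[k] from the AR sum
    # pi_t^h for all t = tmax..f-1 at once, one row per time step
    Pi = X[tmax:, :] - sum(theta0[L, :] * X[tmax - time_lags[L] : f - time_lags[L], :]
                           for L in range(len(time_lags)))
    B = X[tmax - time_lags[k] : f - time_lags[k], :]   # x_{t-h} stacked over t
    # var1 is diagonal, so the solve reduces to element-wise division
    return np.sum(B * Pi, axis=0) / (np.sum(B ** 2, axis=0) + lambda_theta / lambda_x)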
import numpy as np
from numpy.linalg import inv as inv

def TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter):
    """Temporal Regularized Matrix Factorization, TRMF."""

    ## Initialize parameters
    W = init_para["W"]
    X = init_para["X"]
    theta = init_para["theta"]

    ## Set hyperparameters
    lambda_w = init_hyper["lambda_w"]
    lambda_x = init_hyper["lambda_x"]
    lambda_theta = init_hyper["lambda_theta"]
    eta = init_hyper["eta"]

    dim1, dim2 = sparse_mat.shape
    pos_train = np.where(sparse_mat != 0)
    pos_test = np.where((dense_mat != 0) & (sparse_mat == 0))
    binary_mat = sparse_mat.copy()
    binary_mat[pos_train] = 1
    d, rank = theta.shape

    for it in range(maxiter):
        ## Update spatial matrix W
        for i in range(dim1):
            pos0 = np.where(sparse_mat[i, :] != 0)
            Xt = X[pos0[0], :]
            vec0 = Xt.T @ sparse_mat[i, pos0[0]]
            mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank))
            W[i, :] = mat0 @ vec0
        ## Update temporal matrix X
        for t in range(dim2):
            pos0 = np.where(sparse_mat[:, t] != 0)
            Wt = W[pos0[0], :]
            Mt = np.zeros((rank, rank))
            Nt = np.zeros(rank)
            if t < np.max(time_lags):
                Pt = np.zeros((rank, rank))
                Qt = np.zeros(rank)
            else:
                Pt = np.eye(rank)
                Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :])
            if t < dim2 - np.min(time_lags):
                if t >= np.max(time_lags) and t < dim2 - np.max(time_lags):
                    index = list(range(0, d))
                else:
                    index = list(np.where((t + time_lags >= np.max(time_lags)) & (t + time_lags < dim2)))[0]
                for k in index:
                    Ak = theta[k, :]
                    Mt += np.diag(Ak ** 2)
                    theta0 = theta.copy()
                    theta0[k, :] = 0
                    Nt += np.multiply(Ak, X[t + time_lags[k], :]
                                      - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :]))
            vec0 = Wt.T @ sparse_mat[pos0[0], t] + lambda_x * Nt + lambda_x * Qt
            mat0 = inv(Wt.T @ Wt + lambda_x * Mt + lambda_x * Pt + lambda_x * eta * np.eye(rank))
            X[t, :] = mat0 @ vec0
        ## Update AR coefficients theta
        for k in range(d):
            theta0 = theta.copy()
            theta0[k, :] = 0
            mat0 = np.zeros((dim2 - np.max(time_lags), rank))
            for L in range(d):
                mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L], :] @ np.diag(theta0[L, :])
            VarPi = X[np.max(time_lags) : dim2, :] - mat0
            var1 = np.zeros((rank, rank))
            var2 = np.zeros(rank)
            for t in range(np.max(time_lags), dim2):
                B = X[t - time_lags[k], :]
                var1 += np.diag(np.multiply(B, B))
                var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :]
            theta[k, :] = inv(var1 + lambda_theta * np.eye(rank) / lambda_x) @ var2

        mat_hat = W @ X.T
        mape = np.sum(np.abs(dense_mat[pos_test] - mat_hat[pos_test])
                      / dense_mat[pos_test]) / dense_mat[pos_test].shape[0]
        rmse = np.sqrt(np.sum((dense_mat[pos_test] - mat_hat[pos_test]) ** 2) / dense_mat[pos_test].shape[0])
        if (it + 1) % 100 == 0:
            print('Iter: {}'.format(it + 1))
            print('Imputation MAPE: {:.6}'.format(mape))
            print('Imputation RMSE: {:.6}'.format(rmse))
            print()

    return mat_hat
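Before applying TRMF to real data, it can be smoke-tested on a small synthetic low-rank matrix; the sizes, lags, and hyperparameters below are illustrative assumptions, not tuned values:
import numpy as np

np.random.seed(1000)
dim1, dim2, rank = 30, 400, 5                  # illustrative sizes
time_lags = np.array([1, 2, 10])
d = time_lags.shape[0]
## Synthetic positive low-rank data; roughly 20% of entries masked at random
dense_mat = np.random.rand(dim1, rank) @ np.random.rand(rank, dim2) + 1
sparse_mat = dense_mat * np.round(np.random.rand(dim1, dim2) + 0.5 - 0.2)
init_para = {"W": 0.1 * np.random.rand(dim1, rank),
             "X": 0.1 * np.random.rand(dim2, rank),
             "theta": 0.1 * np.random.rand(d, rank)}
init_hyper = {"lambda_w": 500, "lambda_x": 500, "lambda_theta": 500, "eta": 0.03}
mat_hat = TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, 100)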
In the following, we apply the TRMF function defined above to missing data imputation on several spatio-temporal multivariate time series data sets: Guangzhou, Hangzhou, Seattle, and London (see the scenario settings below).
The original data sets have been adapted for our experiments and are now available in the datasets folder.
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.4 # Non-random missing (NM)
## NM pattern: entire (road, day) blocks are dropped, since round(u + 0.5 - rate) is 0 when u < rate
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 10
time_lags = np.array([1, 2, 144])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.103494
Imputation RMSE: 4.35683

Iter: 200
Imputation MAPE: 0.103585
Imputation RMSE: 4.36487

Running time: 438 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.4 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 80
time_lags = np.array([1, 2, 144])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0782638
Imputation RMSE: 3.27797

Iter: 200
Imputation MAPE: 0.0776718
Imputation RMSE: 3.25724

Running time: 888 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.6 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 80
time_lags = np.array([1, 2, 144])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0867978
Imputation RMSE: 3.59856

Iter: 200
Imputation MAPE: 0.0847537
Imputation RMSE: 3.51839

Running time: 857 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.4 # Non-random missing (NM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 30
time_lags = np.array([1, 2, 108])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.257291
Imputation RMSE: 38.1766

Iter: 200
Imputation MAPE: 0.260397
Imputation RMSE: 37.864

Running time: 179 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.4 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 30
time_lags = np.array([1, 2, 108])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.232536
Imputation RMSE: 35.868

Iter: 200
Imputation MAPE: 0.232085
Imputation RMSE: 36.6294

Running time: 153 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor']
dim = dense_tensor.shape
missing_rate = 0.6 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 30
time_lags = np.array([1, 2, 108])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.242496
Imputation RMSE: 40.4021

Iter: 200
Imputation MAPE: 0.244181
Imputation RMSE: 41.1514

Running time: 138 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0']
dim = dense_tensor.shape
missing_rate = 0.4 # Non-random missing (NM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 10
time_lags = np.array([1, 2, 288])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0919508
Imputation RMSE: 5.29839

Iter: 200
Imputation MAPE: 0.0919374
Imputation RMSE: 5.29846

Running time: 391 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0']
dim = dense_tensor.shape
missing_rate = 0.4 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 50
time_lags = np.array([1, 2, 288])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0618055
Imputation RMSE: 3.79776

Iter: 200
Imputation MAPE: 0.0617049
Imputation RMSE: 3.79509

Running time: 615 seconds
Scenario setting:
import scipy.io
import numpy as np
np.random.seed(1000)
dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0']
dim = dense_tensor.shape
missing_rate = 0.6 # Random missing (RM)
sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate)
dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]])
sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]])
del dense_tensor, sparse_tensor
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 50
time_lags = np.array([1, 2, 288])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0651363
Imputation RMSE: 3.93597

Iter: 200
Imputation MAPE: 0.0647982
Imputation RMSE: 3.92264

Running time: 620 seconds
The London movement speed data set is a city-wide hourly traffic speed data set collected in London. The number of roads available at each minimum observation rate is summarized below:
| Observation rate | $>90\%$ | $>80\%$ | $>70\%$ | $>60\%$ | $>50\%$ |
| --- | --- | --- | --- | --- | --- |
| Number of roads | 17,666 | 27,148 | 35,912 | 44,352 | 52,727 |
If you want to test on the full data set, you could consider the following setting for masking observations as missing values.
import numpy as np
np.random.seed(1000)
mask_rate = 0.20
dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy')
pos_obs = np.where(dense_mat != 0)
num = len(pos_obs[0])
sample_ind = np.random.choice(num, size = int(mask_rate * num), replace = False)
sparse_mat = dense_mat.copy()
sparse_mat[pos_obs[0][sample_ind], pos_obs[1][sample_ind]] = 0
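A quick sanity check (a sketch) confirms that roughly mask_rate of the observed entries have been hidden:
## Fraction of observed entries that were masked; should be close to mask_rate
print('Masked fraction: {:.4f}'.format(np.sum((dense_mat != 0) & (sparse_mat == 0)) / num))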
Notably, you could also evaluate the model on a subset of the data (roads with observation rate above 70%) with the following setting.
import numpy as np
np.random.seed(1000)
missing_rate = 0.4
dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy')
binary_mat = dense_mat.copy()
binary_mat[binary_mat != 0] = 1
pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1])
dense_mat = dense_mat[pos[0], :]
## Non-random missing (NM)
binary_mat = np.zeros(dense_mat.shape)
random_mat = np.random.rand(dense_mat.shape[0], 30)
for i1 in range(dense_mat.shape[0]):
    for i2 in range(30):
        binary_mat[i1, i2 * 24 : (i2 + 1) * 24] = np.round(random_mat[i1, i2] + 0.5 - missing_rate)
sparse_mat = dense_mat * binary_mat
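The double loop above simply stretches each daily indicator across its 24 hourly slots, so the same binary mask can also be built in one vectorized step; a sketch, assuming the same 30-day horizon:
## Sketch: equivalent vectorized NM mask (each daily draw repeated over 24 hours)
binary_mat = np.repeat(np.round(random_mat + 0.5 - missing_rate), 24, axis = 1)
sparse_mat = dense_mat * binary_mat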
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 20
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0958622
Imputation RMSE: 2.3367

Iter: 200
Imputation MAPE: 0.0957049
Imputation RMSE: 2.33306

Running time: 960 seconds
import numpy as np
np.random.seed(1000)
missing_rate = 0.4
dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy')
binary_mat = dense_mat.copy()
binary_mat[binary_mat != 0] = 1
pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1])
dense_mat = dense_mat[pos[0], :]
## Random missing (RM)
random_mat = np.random.rand(dense_mat.shape[0], dense_mat.shape[1])
binary_mat = np.round(random_mat + 0.5 - missing_rate)
sparse_mat = dense_mat * binary_mat
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 20
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0920514
Imputation RMSE: 2.23467

Iter: 200
Imputation MAPE: 0.0919819
Imputation RMSE: 2.23439

Running time: 1123 seconds
import numpy as np
np.random.seed(1000)
missing_rate = 0.6
dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy')
binary_mat = dense_mat.copy()
binary_mat[binary_mat != 0] = 1
pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1])
dense_mat = dense_mat[pos[0], :]
## Random missing (RM)
random_mat = np.random.rand(dense_mat.shape[0], dense_mat.shape[1])
binary_mat = np.round(random_mat + 0.5 - missing_rate)
sparse_mat = dense_mat * binary_mat
Model setting:
import time
start = time.time()
dim1, dim2 = sparse_mat.shape
rank = 20
time_lags = np.array([1, 2, 24])
d = time_lags.shape[0]
## Initialize parameters
W = 0.1 * np.random.rand(dim1, rank)
X = 0.1 * np.random.rand(dim2, rank)
theta = 0.1 * np.random.rand(d, rank)
init_para = {"W": W, "X": X, "theta": theta}
## Set hyperparameters
lambda_w = 500
lambda_x = 500
lambda_theta = 500
eta = 0.03
init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta}
maxiter = 200
TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter)
end = time.time()
print('Running time: %d seconds'%(end - start))
Iter: 100
Imputation MAPE: 0.0936975
Imputation RMSE: 2.27472

Iter: 200
Imputation MAPE: 0.0936174
Imputation RMSE: 2.27413

Running time: 951 seconds