#!/usr/bin/env python # coding: utf-8 # # About this Notebook # # Temporal Regularized Matrix Factorization (TRMF) is an effective tool for imputing missing data within a given multivariate time series and forecasting time series with missing values. This approach is from the following literature: # # > Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon, 2016. [**Temporal regularized matrix factorization for high-dimensional time series prediction**](http://www.cs.utexas.edu/~rofuyu/papers/tr-mf-nips.pdf). 30th Conference on Neural Information Processing Systems (*NIPS 2016*), Barcelona, Spain. # # **Acknowledgement**: We would like to thank # # - Antony Masso Lussier (HEC Montreal) # # for providing helpful suggestion and discussion. Thank you! # ## Quick Run # # This notebook is publicly available for any usage at our data imputation project. Please click [**transdim**](https://github.com/xinychen/transdim). # # ## Data Organization: Matrix Structure # # In this post, we consider a dataset of $m$ discrete time series $\boldsymbol{y}_{i}\in\mathbb{R}^{f},i\in\left\{1,2,...,m\right\}$. The time series may have missing elements. We express spatio-temporal dataset as a matrix $Y\in\mathbb{R}^{m\times f}$ with $m$ rows (e.g., locations) and $f$ columns (e.g., discrete time intervals), # # $$Y=\left[ \begin{array}{cccc} y_{11} & y_{12} & \cdots & y_{1f} \\ y_{21} & y_{22} & \cdots & y_{2f} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mf} \\ \end{array} \right]\in\mathbb{R}^{m\times f}.$$ # # # ## TRMF model # # Temporal Regularized Matrix Factorization (TRMF) is an approach to incorporate temporal dependencies into commonly-used matrix factorization model. The temporal dependencies are described among ${\boldsymbol{x}_t}$ explicitly. Such approach takes the form: # # $$\boldsymbol{x}_{t}\approx\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l},$$ # where this autoregressive (AR) is specialized by a lag set $\mathcal{L}=\left\{l_1,l_2,...,l_d\right\}$ (e.g., $\mathcal{L}=\left\{1,2,144\right\}$) and weights $\boldsymbol{\theta}_{l}\in\mathbb{R}^{r},\forall l$, and we further define # # $$\mathcal{R}_{AR}\left(X\mid \mathcal{L},\Theta,\eta\right)=\frac{1}{2}\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)+\frac{\eta}{2}\sum_{t=1}^{f}\boldsymbol{x}_{t}^\top\boldsymbol{x}_{t}.$$ # # Thus, TRMF-AR is given by solving # # $$\min_{W,X,\Theta}\frac{1}{2}\underbrace{\sum_{(i,t)\in\Omega}\left(y_{it}-\boldsymbol{w}_{i}^T\boldsymbol{x}_{t}\right)^2}_{\text{sum of squared residual errors}}+\lambda_{w}\underbrace{\mathcal{R}_{w}\left(W\right)}_{W-\text{regularizer}}+\lambda_{x}\underbrace{\mathcal{R}_{AR}\left(X\mid \mathcal{L},\Theta,\eta\right)}_{\text{AR-regularizer}}+\lambda_{\theta}\underbrace{\mathcal{R}_{\theta}\left(\Theta\right)}_{\Theta-\text{regularizer}}$$ # # where $\mathcal{R}_{w}\left(W\right)=\frac{1}{2}\sum_{i=1}^{m}\boldsymbol{w}_{i}^\top\boldsymbol{w}_{i}$ and $\mathcal{R}_{\theta}\left(\Theta\right)=\frac{1}{2}\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}^\top\boldsymbol{\theta}_{l}$ are regularization terms. # ### Define TRMF model with `Numpy` # # Observing the optimization problem of TRMF model as mentioned above, we categorize the parameters within this model as **parameters** (i.e., `init_para` in the TRMF function) and **hyperparameters** (i.e., `init_hyper`). # # - **Parameters** include spatial matrix $W$, temporal matrix $X$, and AR coefficients $\Theta$. # - **Hyperparameters** include weight parameters on some regularizers, i.e., $\lambda_w$, $\lambda_x$, $\lambda_\theta$, and $\eta$. # ### How to understand Python code of TRMF? # # #### Update spatial matrix $W$ # # We write Python code for updating spatial matrix as follows, # # ```python # for i in range(dim1): # pos0 = np.where(sparse_mat[i, :] != 0) # Xt = X[pos0[0], :] # vec0 = Xt.T @ sparse_mat[i, pos0[0]] # mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank)) # W[i, :] = mat0 @ vec0 # ``` # # For your better understanding of these codes, let us see what happened in each line. Recall that the equation for updating $W$ is # $$\boldsymbol{w}_{i} \Leftarrow\left(\sum_{t:(i, t) \in \Omega} \boldsymbol{x}_{t} \boldsymbol{x}_{t}^{T}+\lambda_{w} I\right)^{-1} \sum_{t:(i, t) \in \Omega} y_{i t} \boldsymbol{x}_{t}$$ # from the optimizization problem: # $$\min _{W} \frac{1}{2} \underbrace{\sum_{(i, t) \in \Omega}\left(y_{i t}-\boldsymbol{w}_{i}^{T} \boldsymbol{x}_{t}\right)^{2}}_{\text {sum of squared residual errors }}+\frac{1}{2} \lambda_{w} \underbrace{\sum_{i=1}^{m} \boldsymbol{w}_{i}^{T} \boldsymbol{w}_{i}}_{\text{sum of squared entries}}.$$ # # As can be seen, # # - `vec0 = Xt.T @ sparse_mat[i, pos0[0]])` corresponds to $$\sum_{t:(i, t) \in \Omega} y_{i t} \boldsymbol{x}_{t}.$$ # # - `mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank))` corresponds to $$\left(\sum_{t:(i, t) \in \Omega} \boldsymbol{x}_{t} \boldsymbol{x}_{t}^{T}+\lambda_{w} I\right)^{-1}.$$ # # - `W[i, :] = mat0 @ vec0` corresponds to the update: # $$\boldsymbol{w}_{i} \Leftarrow\left(\sum_{t:(i, t) \in \Omega} \boldsymbol{x}_{t} \boldsymbol{x}_{t}^{T}+\lambda_{w} I\right)^{-1} \sum_{t:(i, t) \in \Omega} y_{i t} \boldsymbol{x}_{t}.$$ # #### Update temporal matrix $X$ # # We write Python code for updating temporal matrix as follows, # # ```python # for t in range(dim2): # pos0 = np.where(sparse_mat[:, t] != 0) # Wt = W[pos0[0], :] # Mt = np.zeros((rank, rank)) # Nt = np.zeros(rank) # if t < np.max(time_lags): # Pt = np.zeros((rank, rank)) # Qt = np.zeros(rank) # else: # Pt = np.eye(rank) # Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :]) # if t < dim2 - np.min(time_lags): # if t >= np.max(time_lags) and t < dim2 - np.max(time_lags): # index = list(range(0, d)) # else: # index = list(np.where((t + time_lags >= np.max(time_lags)) & (t + time_lags < dim2)))[0] # for k in index: # Ak = theta[k, :] # Mt += np.diag(Ak ** 2) # theta0 = theta.copy() # theta0[k, :] = 0 # Nt += np.multiply(Ak, X[t + time_lags[k], :] # - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :])) # vec0 = Wt.T @ sparse_mat[pos0[0], t] + lambda_x * Nt + lambda_x * Qt # mat0 = inv(Wt.T @ Wt + lambda_x * Mt + lambda_x * Pt + lambda_x * eta * np.eye(rank)) # X[t, :] = mat0 @ vec0 # ``` # These codes seem to be very complicated. Let us first see the optimization problem for getting a closed-form update of $X$: # $$\min_{W,X,\Theta}\frac{1}{2}\underbrace{\sum_{(i,t)\in\Omega}\left(y_{it}-\boldsymbol{w}_{i}^T\boldsymbol{x}_{t}\right)^2}_{\text{sum of squared residual errors}}+\underbrace{\frac{1}{2}\lambda_{x}\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)+\frac{1}{2}\lambda_{x}\eta\sum_{t=1}^{f}\boldsymbol{x}_{t}^\top\boldsymbol{x}_{t}}_{\text{AR-term}}+\underbrace{\frac{1}{2}\lambda_{\theta}\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}^\top\boldsymbol{\theta}_{l}}_{\Theta-\text{term}}.$$ # # - For $t=1,...,l_d$, update of $X$ is # $$\boldsymbol{x}_{t} \Leftarrow\left(\sum_{i:(i, t) \in \Omega} \boldsymbol{w}_{i} \boldsymbol{w}_{i}^{T}+\lambda_{x} \eta I\right)^{-1} \sum_{i:(i, t) \in \Omega} y_{i t} \boldsymbol{w}_{i}.$$ # - For $t=l_d+1,...,f$, update of $X$ is # $${\boldsymbol{x}_{t}\Leftarrow\left(\sum_{i:(i,t)\in\Omega}\boldsymbol{w}_{i}\boldsymbol{w}_{i}^{T}+\lambda_xI+\lambda_x\sum_{h\in\mathcal{L},t+h \leq T}\text{diag}(\boldsymbol{\theta}_{h}\circledast\boldsymbol{\theta}_{h})+\lambda_x\eta I\right)^{-1}}{\left(\sum_{i:(i,t)\in\Omega}y_{it}\boldsymbol{w}_{i}+\lambda_x\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}+\lambda_x\sum_{h\in\mathcal{L},t+h \leq T}\boldsymbol{\theta}_{h}\circledast\boldsymbol{\psi}_{t+h}\right)}.$$ # # Then, as can be seen, # # - `Mt += np.diag(Ak ** 2)` corresponds to $$\sum_{h\in\mathcal{L},t+h \leq T}\text{diag}(\boldsymbol{\theta}_{h}\circledast\boldsymbol{\theta}_{h}).$$ # # - `Nt += np.multiply(Ak, X[t + time_lags[k], :] - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :]))` corresponds to $$\sum_{h\in\mathcal{L},t+h \leq T}\boldsymbol{\theta}_{h}\circledast\boldsymbol{\psi}_{t+h}.$$ # # - `Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :])` corresponds to $$\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}.$$ # -`X[t, :] = mat0 @ vec0` corresponds to the update of $X$. # #### Update AR coefficients $\Theta$ # # We write Python code for updating temporal matrix as follows, # # ```python # for k in range(d): # theta0 = theta.copy() # theta0[k, :] = 0 # mat0 = np.zeros((dim2 - np.max(time_lags), rank)) # for L in range(d): # mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L] , :] @ np.diag(theta0[L, :]) # VarPi = X[np.max(time_lags) : dim2, :] - mat0 # var1 = np.zeros((rank, rank)) # var2 = np.zeros(rank) # for t in range(np.max(time_lags), dim2): # B = X[t - time_lags[k], :] # var1 += np.diag(np.multiply(B, B)) # var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :] # theta[k, :] = inv(var1 + lambda_theta * np.eye(rank) / lambda_x) @ var2 # ``` # # For your better understanding of these codes, let us see what happened in each line. Recall that the equation for updating $\theta$ is # $$ # \color{red} {\boldsymbol{\theta}_{h}\Leftarrow\left(\sum_{t=l_d+1}^{f}\text{diag}(\boldsymbol{x}_{t-h}\circledast \boldsymbol{x}_{t-h})+\frac{\lambda_{\theta}}{\lambda_x}I\right)^{-1}\left(\sum_{t=l_d+1}^{f}{\boldsymbol{\pi}_{t}^{h}}\circledast \boldsymbol{x}_{t-h}\right)} # $$ # where $\boldsymbol{\pi}_{t}^{h}=\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L},l\neq h}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}$ from the optimizization problem: # $$ # \min_{\Theta}\frac{1}{2}\lambda_{x}\underbrace{\sum_{t=l_d+1}^{f}\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)^\top\left(\boldsymbol{x}_{t}-\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}\right)}_{\text{sum of squared residual errors}}+\frac{1}{2}\lambda_{\theta}\underbrace{\sum_{l\in\mathcal{L}}\boldsymbol{\theta}_{l}^\top\boldsymbol{\theta}_{l}}_{\text{sum of squared entries}} # $$ # # As can be seen, # - `mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L] , :] @ np.diag(theta0[L, :])` corresponds to $$\sum_{l\in\mathcal{L},l\neq h}\boldsymbol{\theta}_{l}\circledast\boldsymbol{x}_{t-l}$$. # # - `var1 += np.diag(np.multiply(B, B))` corresponds to $$\sum_{t=l_d+1}^{f}\text{diag}(\boldsymbol{x}_{t-h}\circledast \boldsymbol{x}_{t-h}).$$ # # - `var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :]` corresponds to $$\sum_{t=l_d+1}^{f}{\boldsymbol{\pi}_{t}^{h}}\circledast \boldsymbol{x}_{t-h}.$$ # In[1]: import numpy as np from numpy.linalg import inv as inv def TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter): """Temporal Regularized Matrix Factorization, TRMF.""" ## Initialize parameters W = init_para["W"] X = init_para["X"] theta = init_para["theta"] ## Set hyperparameters lambda_w = init_hyper["lambda_w"] lambda_x = init_hyper["lambda_x"] lambda_theta = init_hyper["lambda_theta"] eta = init_hyper["eta"] dim1, dim2 = sparse_mat.shape pos_train = np.where(sparse_mat != 0) pos_test = np.where((dense_mat != 0) & (sparse_mat == 0)) binary_mat = sparse_mat.copy() binary_mat[pos_train] = 1 d, rank = theta.shape for it in range(maxiter): ## Update spatial matrix W for i in range(dim1): pos0 = np.where(sparse_mat[i, :] != 0) Xt = X[pos0[0], :] vec0 = Xt.T @ sparse_mat[i, pos0[0]] mat0 = inv(Xt.T @ Xt + lambda_w * np.eye(rank)) W[i, :] = mat0 @ vec0 ## Update temporal matrix X for t in range(dim2): pos0 = np.where(sparse_mat[:, t] != 0) Wt = W[pos0[0], :] Mt = np.zeros((rank, rank)) Nt = np.zeros(rank) if t < np.max(time_lags): Pt = np.zeros((rank, rank)) Qt = np.zeros(rank) else: Pt = np.eye(rank) Qt = np.einsum('ij, ij -> j', theta, X[t - time_lags, :]) if t < dim2 - np.min(time_lags): if t >= np.max(time_lags) and t < dim2 - np.max(time_lags): index = list(range(0, d)) else: index = list(np.where((t + time_lags >= np.max(time_lags)) & (t + time_lags < dim2)))[0] for k in index: Ak = theta[k, :] Mt += np.diag(Ak ** 2) theta0 = theta.copy() theta0[k, :] = 0 Nt += np.multiply(Ak, X[t + time_lags[k], :] - np.einsum('ij, ij -> j', theta0, X[t + time_lags[k] - time_lags, :])) vec0 = Wt.T @ sparse_mat[pos0[0], t] + lambda_x * Nt + lambda_x * Qt mat0 = inv(Wt.T @ Wt + lambda_x * Mt + lambda_x * Pt + lambda_x * eta * np.eye(rank)) X[t, :] = mat0 @ vec0 ## Update AR coefficients theta for k in range(d): theta0 = theta.copy() theta0[k, :] = 0 mat0 = np.zeros((dim2 - np.max(time_lags), rank)) for L in range(d): mat0 += X[np.max(time_lags) - time_lags[L] : dim2 - time_lags[L] , :] @ np.diag(theta0[L, :]) VarPi = X[np.max(time_lags) : dim2, :] - mat0 var1 = np.zeros((rank, rank)) var2 = np.zeros(rank) for t in range(np.max(time_lags), dim2): B = X[t - time_lags[k], :] var1 += np.diag(np.multiply(B, B)) var2 += np.diag(B) @ VarPi[t - np.max(time_lags), :] theta[k, :] = inv(var1 + lambda_theta * np.eye(rank) / lambda_x) @ var2 mat_hat = W @ X.T mape = np.sum(np.abs(dense_mat[pos_test] - mat_hat[pos_test]) / dense_mat[pos_test]) / dense_mat[pos_test].shape[0] rmse = np.sqrt(np.sum((dense_mat[pos_test] - mat_hat[pos_test]) ** 2)/dense_mat[pos_test].shape[0]) if (it + 1) % 100 == 0: print('Iter: {}'.format(it + 1)) print('Imputation MAPE: {:.6}'.format(mape)) print('Imputation RMSE: {:.6}'.format(rmse)) print() return mat_hat # ## Missing Data Imputation # # In the following, we apply the above defined TRMF function to the task of missing data imputation task on the following spatiotemporal multivariate time series datasets/matrices: # # - **Guangzhou data set**: [Guangzhou urban traffic speed data set](https://doi.org/10.5281/zenodo.1205228). # - **Birmingham data set**: [Birmingham parking data set](https://archive.ics.uci.edu/ml/datasets/Parking+Birmingham). # - **Hangzhou data set**: [Hangzhou metro passenger flow data set](https://doi.org/10.5281/zenodo.3145403). # - **Settle data set**: [Seattle freeway traffic speed data set](https://github.com/zhiyongc/Seattle-Loop-Data). # # The original data sets have been adapted into our experiments, and it is now available at the fold of `datasets`. # ## Evaluation on Guangzhou Speed Data # # **Scenario setting**: # # - Tensor size: $214\times 61\times 144$ (road segment, day, time of day) # - Non-random missing (NM) # - 40% missing rate # # In[2]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.4 # Non-random missing (NM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 10 # - Time lags: {1, 2, 144} # - The number of iterations: 200 # In[3]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 10 time_lags = np.array([1, 2, 144]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $214\times 61\times 144$ (road segment, day, time of day) # - Random missing (RM) # - 40% missing rate # # In[4]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.4 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 80 # - Time lags: {1, 2, 144} # - The number of iterations: 200 # In[5]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 80 time_lags = np.array([1, 2, 144]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $214\times 61\times 144$ (road segment, day, time of day) # - Random missing (RM) # - 60% missing rate # # In[6]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.6 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 80 # - Time lags: {1, 2, 144} # - The number of iterations: 200 # In[7]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 80 time_lags = np.array([1, 2, 144]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # ## Evaluation on Hangzhou Flow Data # # **Scenario setting**: # # - Tensor size: $80\times 25\times 108$ (metro station, day, time of day) # - Non-random missing (NM) # - 40% missing rate # # In[10]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.4 # Non-random missing (NM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 30 # - Time lags: {1, 2, 108} # - The number of iterations: 200 # In[11]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 30 time_lags = np.array([1, 2, 108]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $80\times 25\times 108$ (metro station, day, time of day) # - Random missing (RM) # - 40% missing rate # # In[12]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.4 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 30 # - Time lags: {1, 2, 108} # - The number of iterations: 200 # In[13]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 30 time_lags = np.array([1, 2, 108]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $80\times 25\times 108$ (metro station, day, time of day) # - Random missing (RM) # - 60% missing rate # # In[14]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'] dim = dense_tensor.shape missing_rate = 0.6 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 30 # - Time lags: {1, 2, 108} # - The number of iterations: 200 # In[15]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 30 time_lags = np.array([1, 2, 108]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # ## Evaluation on Seattle Speed Data # # **Scenario setting**: # # - Tensor size: $323\times 28\times 288$ (road segment, day, time of day) # - Non-random missing (NM) # - 40% missing rate # # In[16]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0'] dim = dense_tensor.shape missing_rate = 0.4 # Non-random missing (NM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1])[:, :, np.newaxis] + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 10 # - Time lags: {1, 2, 288} # - The number of iterations: 200 # In[17]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 10 time_lags = np.array([1, 2, 288]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $323\times 28\times 288$ (road segment, day, time of day) # - Random missing (RM) # - 40% missing rate # # In[18]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0'] dim = dense_tensor.shape missing_rate = 0.4 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 50 # - Time lags: {1, 2, 288} # - The number of iterations: 200 # In[19]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 50 time_lags = np.array([1, 2, 288]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # **Scenario setting**: # # - Tensor size: $323\times 28\times 288$ (road segment, day, time of day) # - Random missing (RM) # - 60% missing rate # # In[20]: import scipy.io import numpy as np np.random.seed(1000) dense_tensor = scipy.io.loadmat('../datasets/Seattle-data-set/tensor.npz')['arr_0'] dim = dense_tensor.shape missing_rate = 0.6 # Random missing (RM) sparse_tensor = dense_tensor * np.round(np.random.rand(dim[0], dim[1], dim[2]) + 0.5 - missing_rate) dense_mat = dense_tensor.reshape([dim[0], dim[1] * dim[2]]) sparse_mat = sparse_tensor.reshape([dim[0], dim[1] * dim[2]]) del dense_tensor, sparse_tensor # **Model setting**: # # - Low rank: 50 # - Time lags: {1, 2, 288} # - The number of iterations: 200 # In[21]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 50 time_lags = np.array([1, 2, 288]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # ## Evaluation on London Movement Speed Data # # London movement speed data set is is a city-wide hourly traffic speeddataset collected in London. # # - Collected from 200,000+ road segments. # - 720 time points in April 2019. # - 73% missing values in the original data. # # | Observation rate | $>90\%$ | $>80\%$ | $>70\%$ | $>60\%$ | $>50\%$ | # |:------------------|--------:|--------:|--------:|--------:|--------:| # |**Number of roads**| 17,666 | 27,148 | 35,912 | 44,352 | 52,727 | # # # If want to test on the full dataset, you could consider the following setting for masking observations as missing values. # # ```python # import numpy as np # np.random.seed(1000) # mask_rate = 0.20 # # dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy') # pos_obs = np.where(dense_mat != 0) # num = len(pos_obs[0]) # sample_ind = np.random.choice(num, size = int(mask_rate * num), replace = False) # sparse_mat = dense_mat.copy() # sparse_mat[pos_obs[0][sample_ind], pos_obs[1][sample_ind]] = 0 # ``` # # Notably, you could also consider to evaluate the model on a subset of the data with the following setting. # In[22]: import numpy as np np.random.seed(1000) missing_rate = 0.4 dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy') binary_mat = dense_mat.copy() binary_mat[binary_mat != 0] = 1 pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1]) dense_mat = dense_mat[pos[0], :] ## Non-random missing (NM) binary_mat = np.zeros(dense_mat.shape) random_mat = np.random.rand(dense_mat.shape[0], 30) for i1 in range(dense_mat.shape[0]): for i2 in range(30): binary_mat[i1, i2 * 24 : (i2 + 1) * 24] = np.round(random_mat[i1, i2] + 0.5 - missing_rate) sparse_mat = dense_mat * binary_mat # **Model setting**: # # - Low rank: 20 # - Time lags: {1, 2, 24} # - The number of iterations: 200 # In[23]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 20 time_lags = np.array([1, 2, 24]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # In[24]: import numpy as np np.random.seed(1000) missing_rate = 0.4 dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy') binary_mat = dense_mat.copy() binary_mat[binary_mat != 0] = 1 pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1]) dense_mat = dense_mat[pos[0], :] ## Random missing (RM) random_mat = np.random.rand(dense_mat.shape[0], dense_mat.shape[1]) binary_mat = np.round(random_mat + 0.5 - missing_rate) sparse_mat = dense_mat * binary_mat # **Model setting**: # # - Low rank: 20 # - Time lags: {1, 2, 24} # - The number of iterations: 200 # In[25]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 20 time_lags = np.array([1, 2, 24]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # In[26]: import numpy as np np.random.seed(1000) missing_rate = 0.6 dense_mat = np.load('../datasets/London-data-set/hourly_speed_mat.npy') binary_mat = dense_mat.copy() binary_mat[binary_mat != 0] = 1 pos = np.where(np.sum(binary_mat, axis = 1) > 0.7 * binary_mat.shape[1]) dense_mat = dense_mat[pos[0], :] ## Random missing (RM) random_mat = np.random.rand(dense_mat.shape[0], dense_mat.shape[1]) binary_mat = np.round(random_mat + 0.5 - missing_rate) sparse_mat = dense_mat * binary_mat # **Model setting**: # # - Low rank: 20 # - Time lags: {1, 2, 24} # - The number of iterations: 200 # In[27]: import time start = time.time() dim1, dim2 = sparse_mat.shape rank = 20 time_lags = np.array([1, 2, 24]) d = time_lags.shape[0] ## Initialize parameters W = 0.1 * np.random.rand(dim1, rank) X = 0.1 * np.random.rand(dim2, rank) theta = 0.1 * np.random.rand(d, rank) init_para = {"W": W, "X": X, "theta": theta} ## Set hyparameters lambda_w = 500 lambda_x = 500 lambda_theta = 500 eta = 0.03 init_hyper = {"lambda_w": lambda_w, "lambda_x": lambda_x, "lambda_theta": lambda_theta, "eta": eta} maxiter = 200 TRMF(dense_mat, sparse_mat, init_para, init_hyper, time_lags, maxiter) end = time.time() print('Running time: %d seconds'%(end - start)) # ### License # #
# This work is released under the MIT license. #