Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.7.3
IPython 7.6.1

torch 1.2.0
  • Runs on CPU or GPU (if available)

Basic Graph Neural Network with Spectral Graph Convolution on MNIST

Implementing a very basic graph neural network (GNN) using a spectral graph convolution.

Here, the 28x28 image of a digit in MNIST represents the graph, where each pixel (i.e., cell in the grid) represents a particular node. The feature of that node is simply the pixel intensity in range [0, 1].

Here, the adjacency matrix of the pixels is basically just determined by their neighborhood pixels. Using a Gaussian filter, we connect pixels based on their Euclidean distance in the grid.

In the related notebook, ./gnn-basic-1.ipynb, we used this adjacency matrix $A$ to compute the output of a layer as

$$X^{(l+1)}=A X^{(l)} W^{(l)}.$$

Here, $A$ is the $N \times N$ adjacency matrix, and $X$ is the $N \times C$ feature matrix (a 2D coordinate array, where $N$ is the total number of pixels -- $28 \times 28 = 784$ in MNIST). $W$ is the weight matrix of shape $N \times P$, where $P$ would represent the number of classes if we have only a single hidden layer.

In this notebook, we modify this code using spectral graph convolution, i.e.,

$$X^{(l+1)}=V\left(V^{T} X^{(l)} \odot V^{T} W_{\text {spectral }}^{(l)}\right).$$

Where $V$ are the eigenvectors of the graph Laplacian $L$, which we can compute from the adjacency matrix $A$. Here, $W_{\text {spectral }}$ represents the trainable weights (filters).

Imports

In [2]:
import time
import numpy as np
from scipy.spatial.distance import cdist
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.utils.data.dataset import Subset


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True
In [3]:
%matplotlib inline
import matplotlib.pyplot as plt

Settings and Dataset

In [4]:
##########################
### SETTINGS
##########################

# Device
DEVICE = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.05
NUM_EPOCHS = 50
BATCH_SIZE = 128
IMG_SIZE = 28

# Architecture
NUM_CLASSES = 10

MNIST Dataset

In [5]:
train_indices = torch.arange(0, 59000)
valid_indices = torch.arange(59000, 60000)

custom_transform = transforms.Compose([transforms.ToTensor()])


train_and_valid = datasets.MNIST(root='data', 
                                 train=True, 
                                 transform=custom_transform,
                                 download=True)

test_dataset = datasets.MNIST(root='data', 
                              train=False, 
                              transform=custom_transform,
                              download=True)

train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=4,
                          shuffle=True)

valid_loader = DataLoader(dataset=valid_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=4,
                          shuffle=False)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE,
                         num_workers=4,
                         shuffle=False)

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Image batch dimensions: torch.Size([128, 1, 28, 28])
Image label dimensions: torch.Size([128])

Model

In [6]:
def precompute_adjacency_matrix(img_size):
    col, row = np.meshgrid(np.arange(img_size), np.arange(img_size))
    
    # N = img_size^2
    # construct 2D coordinate array (shape N x 2) and normalize
    # in range [0, 1]
    coord = np.stack((col, row), axis=2).reshape(-1, 2) / img_size

    # compute pairwise distance matrix (N x N)
    dist = cdist(coord, coord, metric='euclidean')
    
    # Apply Gaussian filter
    sigma = 0.05 * np.pi
    A = np.exp(- dist / sigma ** 2)
    A[A < 0.01] = 0
    A = torch.from_numpy(A).float()
    
    return A

    """
    # Normalization as per (Kipf & Welling, ICLR 2017)
    D = A.sum(1)  # nodes degree (N,)
    D_hat = (D + 1e-5) ** (-0.5)
    A_hat = D_hat.view(-1, 1) * A * D_hat.view(1, -1)  # N,N
    
    return A_hat
    """


def get_graph_laplacian(A):
    # From https://towardsdatascience.com/spectral-graph-convolution-
    #   explained-and-implemented-step-by-step-2e495b57f801
    #
    # Computing the graph Laplacian
    # A is an adjacency matrix of some graph G
    N = A.shape[0] # number of nodes in a graph
    D = np.sum(A, 0) # node degrees
    D_hat = np.diag((D + 1e-5)**(-0.5)) # normalized node degrees
    L = np.identity(N) - np.dot(D_hat, A).dot(D_hat) # Laplacian
    return torch.from_numpy(L).float()
In [7]:
A = precompute_adjacency_matrix(28)
plt.imshow(A, vmin=0., vmax=1.)
plt.colorbar()
plt.show()
In [8]:
L = get_graph_laplacian(A.numpy())
plt.imshow(L, vmin=0., vmax=1.)
plt.colorbar()
plt.show()
In [9]:
##########################
### MODEL
##########################

from scipy.sparse.linalg import eigsh
        

class GraphNet(nn.Module):
    def __init__(self, img_size=28, num_filters=2, num_classes=10):
        super(GraphNet, self).__init__()
        
        n_rows = img_size**2
        self.fc = nn.Linear(n_rows*num_filters, num_classes, bias=False)

        A = precompute_adjacency_matrix(img_size)
        L = get_graph_laplacian(A.numpy())
        Λ,V = eigsh(L.numpy(), k=20, which='SM') # eigen-decomposition (i.e. find Λ,V)

        V = torch.from_numpy(V)
        
        # Weight matrix
        W_spectral = nn.Parameter(torch.ones((img_size**2, num_filters))).float()
        torch.nn.init.kaiming_uniform_(W_spectral)
        
        self.register_buffer('A', A)
        self.register_buffer('L', L)
        self.register_buffer('V', V)
        self.register_buffer('W_spectral', W_spectral)

        

    def forward(self, x):
        
        B = x.size(0) # Batch size

        ### Reshape eigenvectors
        # from [H*W, 20] to [B, H*W, 20]
        V_tensor = self.V.unsqueeze(0)
        V_tensor = self.V.expand(B, -1, -1)
        # from [H*W, 20] to [B, 20, H*W]
        V_tensor_T = self.V.T.unsqueeze(0)
        V_tensor_T = self.V.T.expand(B, -1, -1)
        
        ### Reshape inputs
        # [B, C, H, W] => [B, H*W, 1]
        x_reshape = x.view(B, -1, 1)
        
        ### Reshape spectral weights
        # to size [128, H*W, F]
        W_spectral_tensor = self.W_spectral.unsqueeze(0)
        W_spectral_tensor = self.W_spectral.expand(B, -1, -1)
        
        ### Spectral convolution on graphs
        # [B, 20, H*W] . [B, H*W, 1]  ==> [B, 20, 1]
        X_hat = V_tensor_T.bmm(x_reshape) # 20×1 node features in the "spectral" domain
        W_hat = V_tensor_T.bmm(W_spectral_tensor)  # 20×F filters in the "spectral" domain
        Y = V_tensor.bmm(X_hat * W_hat)  # N×F result of convolution

        ### Fully connected
        logits = self.fc(Y.reshape(B, -1))
        probas = F.softmax(logits, dim=1)
        return logits, probas
In [10]:
torch.manual_seed(RANDOM_SEED)
model = GraphNet(img_size=IMG_SIZE, num_classes=NUM_CLASSES)

model = model.to(DEVICE)

optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)  

Training

In [11]:
def compute_acc(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for features, targets in data_loader:
        features = features.to(device)
        targets = targets.to(device)
        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
    

start_time = time.time()

cost_list = []
train_acc_list, valid_acc_list = [], []


for epoch in range(NUM_EPOCHS):
    
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):
        
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        #################################################
        ### CODE ONLY FOR LOGGING BEYOND THIS POINT
        ################################################
        cost_list.append(cost.item())
        if not batch_idx % 150:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

        

    model.eval()
    with torch.set_grad_enabled(False): # save memory during inference
        
        train_acc = compute_acc(model, train_loader, device=DEVICE)
        valid_acc = compute_acc(model, valid_loader, device=DEVICE)
        
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d}\n'
              f'Train ACC: {train_acc:.2f} | Validation ACC: {valid_acc:.2f}')
        
        train_acc_list.append(train_acc)
        valid_acc_list.append(valid_acc)
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/050 | Batch 000/461 | Cost: 2.3133
Epoch: 001/050 | Batch 150/461 | Cost: 1.1899
Epoch: 001/050 | Batch 300/461 | Cost: 1.0481
Epoch: 001/050 | Batch 450/461 | Cost: 0.9287
Epoch: 001/050
Train ACC: 73.79 | Validation ACC: 78.10
Time elapsed: 0.07 min
Epoch: 002/050 | Batch 000/461 | Cost: 0.8224
Epoch: 002/050 | Batch 150/461 | Cost: 0.9684
Epoch: 002/050 | Batch 300/461 | Cost: 0.6952
Epoch: 002/050 | Batch 450/461 | Cost: 0.8158
Epoch: 002/050
Train ACC: 77.48 | Validation ACC: 82.20
Time elapsed: 0.14 min
Epoch: 003/050 | Batch 000/461 | Cost: 0.8203
Epoch: 003/050 | Batch 150/461 | Cost: 0.8409
Epoch: 003/050 | Batch 300/461 | Cost: 0.8602
Epoch: 003/050 | Batch 450/461 | Cost: 0.7012
Epoch: 003/050
Train ACC: 78.55 | Validation ACC: 83.40
Time elapsed: 0.21 min
Epoch: 004/050 | Batch 000/461 | Cost: 0.7919
Epoch: 004/050 | Batch 150/461 | Cost: 0.9010
Epoch: 004/050 | Batch 300/461 | Cost: 0.6895
Epoch: 004/050 | Batch 450/461 | Cost: 0.6981
Epoch: 004/050
Train ACC: 79.30 | Validation ACC: 84.10
Time elapsed: 0.28 min
Epoch: 005/050 | Batch 000/461 | Cost: 0.6080
Epoch: 005/050 | Batch 150/461 | Cost: 0.6627
Epoch: 005/050 | Batch 300/461 | Cost: 0.7620
Epoch: 005/050 | Batch 450/461 | Cost: 0.8047
Epoch: 005/050
Train ACC: 79.66 | Validation ACC: 84.50
Time elapsed: 0.35 min
Epoch: 006/050 | Batch 000/461 | Cost: 0.5992
Epoch: 006/050 | Batch 150/461 | Cost: 0.5546
Epoch: 006/050 | Batch 300/461 | Cost: 0.6459
Epoch: 006/050 | Batch 450/461 | Cost: 0.5968
Epoch: 006/050
Train ACC: 79.91 | Validation ACC: 85.10
Time elapsed: 0.42 min
Epoch: 007/050 | Batch 000/461 | Cost: 0.7909
Epoch: 007/050 | Batch 150/461 | Cost: 0.6488
Epoch: 007/050 | Batch 300/461 | Cost: 0.7580
Epoch: 007/050 | Batch 450/461 | Cost: 0.5646
Epoch: 007/050
Train ACC: 80.50 | Validation ACC: 85.00
Time elapsed: 0.48 min
Epoch: 008/050 | Batch 000/461 | Cost: 0.6147
Epoch: 008/050 | Batch 150/461 | Cost: 0.6998
Epoch: 008/050 | Batch 300/461 | Cost: 0.5563
Epoch: 008/050 | Batch 450/461 | Cost: 0.5611
Epoch: 008/050
Train ACC: 80.73 | Validation ACC: 85.60
Time elapsed: 0.56 min
Epoch: 009/050 | Batch 000/461 | Cost: 0.5629
Epoch: 009/050 | Batch 150/461 | Cost: 0.6245
Epoch: 009/050 | Batch 300/461 | Cost: 0.7393
Epoch: 009/050 | Batch 450/461 | Cost: 0.6670
Epoch: 009/050
Train ACC: 81.09 | Validation ACC: 85.70
Time elapsed: 0.62 min
Epoch: 010/050 | Batch 000/461 | Cost: 0.6582
Epoch: 010/050 | Batch 150/461 | Cost: 0.7550
Epoch: 010/050 | Batch 300/461 | Cost: 0.7028
Epoch: 010/050 | Batch 450/461 | Cost: 0.6558
Epoch: 010/050
Train ACC: 81.00 | Validation ACC: 85.70
Time elapsed: 0.69 min
Epoch: 011/050 | Batch 000/461 | Cost: 0.5472
Epoch: 011/050 | Batch 150/461 | Cost: 0.6051
Epoch: 011/050 | Batch 300/461 | Cost: 0.5875
Epoch: 011/050 | Batch 450/461 | Cost: 0.4688
Epoch: 011/050
Train ACC: 81.50 | Validation ACC: 85.90
Time elapsed: 0.76 min
Epoch: 012/050 | Batch 000/461 | Cost: 0.5227
Epoch: 012/050 | Batch 150/461 | Cost: 0.6252
Epoch: 012/050 | Batch 300/461 | Cost: 0.6359
Epoch: 012/050 | Batch 450/461 | Cost: 0.8590
Epoch: 012/050
Train ACC: 81.61 | Validation ACC: 86.50
Time elapsed: 0.83 min
Epoch: 013/050 | Batch 000/461 | Cost: 0.4933
Epoch: 013/050 | Batch 150/461 | Cost: 0.5844
Epoch: 013/050 | Batch 300/461 | Cost: 0.4684
Epoch: 013/050 | Batch 450/461 | Cost: 0.5275
Epoch: 013/050
Train ACC: 81.79 | Validation ACC: 86.50
Time elapsed: 0.90 min
Epoch: 014/050 | Batch 000/461 | Cost: 0.6382
Epoch: 014/050 | Batch 150/461 | Cost: 0.7612
Epoch: 014/050 | Batch 300/461 | Cost: 0.5378
Epoch: 014/050 | Batch 450/461 | Cost: 0.5651
Epoch: 014/050
Train ACC: 81.94 | Validation ACC: 86.50
Time elapsed: 0.97 min
Epoch: 015/050 | Batch 000/461 | Cost: 0.5122
Epoch: 015/050 | Batch 150/461 | Cost: 0.6347
Epoch: 015/050 | Batch 300/461 | Cost: 0.6239
Epoch: 015/050 | Batch 450/461 | Cost: 0.6026
Epoch: 015/050
Train ACC: 82.01 | Validation ACC: 87.00
Time elapsed: 1.03 min
Epoch: 016/050 | Batch 000/461 | Cost: 0.6380
Epoch: 016/050 | Batch 150/461 | Cost: 0.5865
Epoch: 016/050 | Batch 300/461 | Cost: 0.3510
Epoch: 016/050 | Batch 450/461 | Cost: 0.5859
Epoch: 016/050
Train ACC: 82.06 | Validation ACC: 86.50
Time elapsed: 1.10 min
Epoch: 017/050 | Batch 000/461 | Cost: 0.6827
Epoch: 017/050 | Batch 150/461 | Cost: 0.6415
Epoch: 017/050 | Batch 300/461 | Cost: 0.7186
Epoch: 017/050 | Batch 450/461 | Cost: 0.6067
Epoch: 017/050
Train ACC: 82.41 | Validation ACC: 87.70
Time elapsed: 1.17 min
Epoch: 018/050 | Batch 000/461 | Cost: 0.7209
Epoch: 018/050 | Batch 150/461 | Cost: 0.6981
Epoch: 018/050 | Batch 300/461 | Cost: 0.6810
Epoch: 018/050 | Batch 450/461 | Cost: 0.6180
Epoch: 018/050
Train ACC: 82.55 | Validation ACC: 87.50
Time elapsed: 1.24 min
Epoch: 019/050 | Batch 000/461 | Cost: 0.7285
Epoch: 019/050 | Batch 150/461 | Cost: 0.7734
Epoch: 019/050 | Batch 300/461 | Cost: 0.7189
Epoch: 019/050 | Batch 450/461 | Cost: 0.5652
Epoch: 019/050
Train ACC: 82.46 | Validation ACC: 87.30
Time elapsed: 1.31 min
Epoch: 020/050 | Batch 000/461 | Cost: 0.7076
Epoch: 020/050 | Batch 150/461 | Cost: 0.4096
Epoch: 020/050 | Batch 300/461 | Cost: 0.7485
Epoch: 020/050 | Batch 450/461 | Cost: 0.7334
Epoch: 020/050
Train ACC: 82.48 | Validation ACC: 87.30
Time elapsed: 1.38 min
Epoch: 021/050 | Batch 000/461 | Cost: 0.4686
Epoch: 021/050 | Batch 150/461 | Cost: 0.6241
Epoch: 021/050 | Batch 300/461 | Cost: 0.5736
Epoch: 021/050 | Batch 450/461 | Cost: 0.4948
Epoch: 021/050
Train ACC: 82.67 | Validation ACC: 88.00
Time elapsed: 1.45 min
Epoch: 022/050 | Batch 000/461 | Cost: 0.4657
Epoch: 022/050 | Batch 150/461 | Cost: 0.6718
Epoch: 022/050 | Batch 300/461 | Cost: 0.6647
Epoch: 022/050 | Batch 450/461 | Cost: 0.4913
Epoch: 022/050
Train ACC: 82.87 | Validation ACC: 87.90
Time elapsed: 1.52 min
Epoch: 023/050 | Batch 000/461 | Cost: 0.5567
Epoch: 023/050 | Batch 150/461 | Cost: 0.4976
Epoch: 023/050 | Batch 300/461 | Cost: 0.5911
Epoch: 023/050 | Batch 450/461 | Cost: 0.4014
Epoch: 023/050
Train ACC: 82.91 | Validation ACC: 87.80
Time elapsed: 1.59 min
Epoch: 024/050 | Batch 000/461 | Cost: 0.5728
Epoch: 024/050 | Batch 150/461 | Cost: 0.6313
Epoch: 024/050 | Batch 300/461 | Cost: 0.5825
Epoch: 024/050 | Batch 450/461 | Cost: 0.4720
Epoch: 024/050
Train ACC: 83.00 | Validation ACC: 87.90
Time elapsed: 1.66 min
Epoch: 025/050 | Batch 000/461 | Cost: 0.5128
Epoch: 025/050 | Batch 150/461 | Cost: 0.4793
Epoch: 025/050 | Batch 300/461 | Cost: 0.7191
Epoch: 025/050 | Batch 450/461 | Cost: 0.5402
Epoch: 025/050
Train ACC: 83.12 | Validation ACC: 88.30
Time elapsed: 1.72 min
Epoch: 026/050 | Batch 000/461 | Cost: 0.4961
Epoch: 026/050 | Batch 150/461 | Cost: 0.4546
Epoch: 026/050 | Batch 300/461 | Cost: 0.5333
Epoch: 026/050 | Batch 450/461 | Cost: 0.5073
Epoch: 026/050
Train ACC: 82.98 | Validation ACC: 87.90
Time elapsed: 1.79 min
Epoch: 027/050 | Batch 000/461 | Cost: 0.7034
Epoch: 027/050 | Batch 150/461 | Cost: 0.5373
Epoch: 027/050 | Batch 300/461 | Cost: 0.5158
Epoch: 027/050 | Batch 450/461 | Cost: 0.5705
Epoch: 027/050
Train ACC: 83.15 | Validation ACC: 88.00
Time elapsed: 1.86 min
Epoch: 028/050 | Batch 000/461 | Cost: 0.4614
Epoch: 028/050 | Batch 150/461 | Cost: 0.4124
Epoch: 028/050 | Batch 300/461 | Cost: 0.7368
Epoch: 028/050 | Batch 450/461 | Cost: 0.5744
Epoch: 028/050
Train ACC: 82.85 | Validation ACC: 87.60
Time elapsed: 1.93 min
Epoch: 029/050 | Batch 000/461 | Cost: 0.5026
Epoch: 029/050 | Batch 150/461 | Cost: 0.6048
Epoch: 029/050 | Batch 300/461 | Cost: 0.6400
Epoch: 029/050 | Batch 450/461 | Cost: 0.4906
Epoch: 029/050
Train ACC: 83.26 | Validation ACC: 88.10
Time elapsed: 2.00 min
Epoch: 030/050 | Batch 000/461 | Cost: 0.6298
Epoch: 030/050 | Batch 150/461 | Cost: 0.5472
Epoch: 030/050 | Batch 300/461 | Cost: 0.5469
Epoch: 030/050 | Batch 450/461 | Cost: 0.4819
Epoch: 030/050
Train ACC: 83.30 | Validation ACC: 88.70
Time elapsed: 2.07 min
Epoch: 031/050 | Batch 000/461 | Cost: 0.6101
Epoch: 031/050 | Batch 150/461 | Cost: 0.5150
Epoch: 031/050 | Batch 300/461 | Cost: 0.5505
Epoch: 031/050 | Batch 450/461 | Cost: 0.5634
Epoch: 031/050
Train ACC: 83.28 | Validation ACC: 88.60
Time elapsed: 2.13 min
Epoch: 032/050 | Batch 000/461 | Cost: 0.5655
Epoch: 032/050 | Batch 150/461 | Cost: 0.6567
Epoch: 032/050 | Batch 300/461 | Cost: 0.5758
Epoch: 032/050 | Batch 450/461 | Cost: 0.5306
Epoch: 032/050
Train ACC: 83.31 | Validation ACC: 88.20
Time elapsed: 2.20 min
Epoch: 033/050 | Batch 000/461 | Cost: 0.6677
Epoch: 033/050 | Batch 150/461 | Cost: 0.7450
Epoch: 033/050 | Batch 300/461 | Cost: 0.5538
Epoch: 033/050 | Batch 450/461 | Cost: 0.5642
Epoch: 033/050
Train ACC: 83.33 | Validation ACC: 88.40
Time elapsed: 2.27 min
Epoch: 034/050 | Batch 000/461 | Cost: 0.6287
Epoch: 034/050 | Batch 150/461 | Cost: 0.4752
Epoch: 034/050 | Batch 300/461 | Cost: 0.5957
Epoch: 034/050 | Batch 450/461 | Cost: 0.4531
Epoch: 034/050
Train ACC: 83.50 | Validation ACC: 88.70
Time elapsed: 2.34 min
Epoch: 035/050 | Batch 000/461 | Cost: 0.5368
Epoch: 035/050 | Batch 150/461 | Cost: 0.5658
Epoch: 035/050 | Batch 300/461 | Cost: 0.6598
Epoch: 035/050 | Batch 450/461 | Cost: 0.5858
Epoch: 035/050
Train ACC: 83.59 | Validation ACC: 88.50
Time elapsed: 2.41 min
Epoch: 036/050 | Batch 000/461 | Cost: 0.5557
Epoch: 036/050 | Batch 150/461 | Cost: 0.4680
Epoch: 036/050 | Batch 300/461 | Cost: 0.4905
Epoch: 036/050 | Batch 450/461 | Cost: 0.9074
Epoch: 036/050
Train ACC: 83.67 | Validation ACC: 88.50
Time elapsed: 2.48 min
Epoch: 037/050 | Batch 000/461 | Cost: 0.6120
Epoch: 037/050 | Batch 150/461 | Cost: 0.4668
Epoch: 037/050 | Batch 300/461 | Cost: 0.5836
Epoch: 037/050 | Batch 450/461 | Cost: 0.4536
Epoch: 037/050
Train ACC: 83.35 | Validation ACC: 88.80
Time elapsed: 2.55 min
Epoch: 038/050 | Batch 000/461 | Cost: 0.5380
Epoch: 038/050 | Batch 150/461 | Cost: 0.4491
Epoch: 038/050 | Batch 300/461 | Cost: 0.4500
Epoch: 038/050 | Batch 450/461 | Cost: 0.6041
Epoch: 038/050
Train ACC: 83.69 | Validation ACC: 88.80
Time elapsed: 2.61 min
Epoch: 039/050 | Batch 000/461 | Cost: 0.4863
Epoch: 039/050 | Batch 150/461 | Cost: 0.5673
Epoch: 039/050 | Batch 300/461 | Cost: 0.4037
Epoch: 039/050 | Batch 450/461 | Cost: 0.6392
Epoch: 039/050
Train ACC: 83.71 | Validation ACC: 88.70
Time elapsed: 2.68 min
Epoch: 040/050 | Batch 000/461 | Cost: 0.6707
Epoch: 040/050 | Batch 150/461 | Cost: 0.5601
Epoch: 040/050 | Batch 300/461 | Cost: 0.5265
Epoch: 040/050 | Batch 450/461 | Cost: 0.4867
Epoch: 040/050
Train ACC: 83.76 | Validation ACC: 88.90
Time elapsed: 2.75 min
Epoch: 041/050 | Batch 000/461 | Cost: 0.5379
Epoch: 041/050 | Batch 150/461 | Cost: 0.4588
Epoch: 041/050 | Batch 300/461 | Cost: 0.5684
Epoch: 041/050 | Batch 450/461 | Cost: 0.5547
Epoch: 041/050
Train ACC: 83.75 | Validation ACC: 88.60
Time elapsed: 2.82 min
Epoch: 042/050 | Batch 000/461 | Cost: 0.5714
Epoch: 042/050 | Batch 150/461 | Cost: 0.3863
Epoch: 042/050 | Batch 300/461 | Cost: 0.5142
Epoch: 042/050 | Batch 450/461 | Cost: 0.6219
Epoch: 042/050
Train ACC: 83.79 | Validation ACC: 89.20
Time elapsed: 2.89 min
Epoch: 043/050 | Batch 000/461 | Cost: 0.5385
Epoch: 043/050 | Batch 150/461 | Cost: 0.4801
Epoch: 043/050 | Batch 300/461 | Cost: 0.6064
Epoch: 043/050 | Batch 450/461 | Cost: 0.4959
Epoch: 043/050
Train ACC: 83.89 | Validation ACC: 88.80
Time elapsed: 2.96 min
Epoch: 044/050 | Batch 000/461 | Cost: 0.6742
Epoch: 044/050 | Batch 150/461 | Cost: 0.5746
Epoch: 044/050 | Batch 300/461 | Cost: 0.6846
Epoch: 044/050 | Batch 450/461 | Cost: 0.6283
Epoch: 044/050
Train ACC: 83.91 | Validation ACC: 89.00
Time elapsed: 3.03 min
Epoch: 045/050 | Batch 000/461 | Cost: 0.5646
Epoch: 045/050 | Batch 150/461 | Cost: 0.3776
Epoch: 045/050 | Batch 300/461 | Cost: 0.5457
Epoch: 045/050 | Batch 450/461 | Cost: 0.4897
Epoch: 045/050
Train ACC: 83.87 | Validation ACC: 89.10
Time elapsed: 3.10 min
Epoch: 046/050 | Batch 000/461 | Cost: 0.5300
Epoch: 046/050 | Batch 150/461 | Cost: 0.6787
Epoch: 046/050 | Batch 300/461 | Cost: 0.4310
Epoch: 046/050 | Batch 450/461 | Cost: 0.5758
Epoch: 046/050
Train ACC: 84.01 | Validation ACC: 89.10
Time elapsed: 3.17 min
Epoch: 047/050 | Batch 000/461 | Cost: 0.6111
Epoch: 047/050 | Batch 150/461 | Cost: 0.5679
Epoch: 047/050 | Batch 300/461 | Cost: 0.6306
Epoch: 047/050 | Batch 450/461 | Cost: 0.7292
Epoch: 047/050
Train ACC: 84.03 | Validation ACC: 89.20
Time elapsed: 3.24 min
Epoch: 048/050 | Batch 000/461 | Cost: 0.5925
Epoch: 048/050 | Batch 150/461 | Cost: 0.6623
Epoch: 048/050 | Batch 300/461 | Cost: 0.4188
Epoch: 048/050 | Batch 450/461 | Cost: 0.3433
Epoch: 048/050
Train ACC: 83.89 | Validation ACC: 89.10
Time elapsed: 3.31 min
Epoch: 049/050 | Batch 000/461 | Cost: 0.4881
Epoch: 049/050 | Batch 150/461 | Cost: 0.5040
Epoch: 049/050 | Batch 300/461 | Cost: 0.5655
Epoch: 049/050 | Batch 450/461 | Cost: 0.5264
Epoch: 049/050
Train ACC: 83.83 | Validation ACC: 88.60
Time elapsed: 3.38 min
Epoch: 050/050 | Batch 000/461 | Cost: 0.5284
Epoch: 050/050 | Batch 150/461 | Cost: 0.6253
Epoch: 050/050 | Batch 300/461 | Cost: 0.3891
Epoch: 050/050 | Batch 450/461 | Cost: 0.4316
Epoch: 050/050
Train ACC: 83.90 | Validation ACC: 88.70
Time elapsed: 3.45 min
Total Training Time: 3.45 min

Evaluation

In [12]:
plt.plot(cost_list, label='Minibatch cost')
plt.plot(np.convolve(cost_list, 
                     np.ones(200,)/200, mode='valid'), 
         label='Running average')

plt.ylabel('Cross Entropy')
plt.xlabel('Iteration')
plt.legend()
plt.show()
In [13]:
plt.plot(np.arange(1, NUM_EPOCHS+1), train_acc_list, label='Training')
plt.plot(np.arange(1, NUM_EPOCHS+1), valid_acc_list, label='Validation')

plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
In [14]:
with torch.set_grad_enabled(False):
    test_acc = compute_acc(model=model,
                           data_loader=test_loader,
                           device=DEVICE)
    
    valid_acc = compute_acc(model=model,
                            data_loader=valid_loader,
                            device=DEVICE)
    

print(f'Validation ACC: {valid_acc:.2f}%')
print(f'Test ACC: {test_acc:.2f}%')
Validation ACC: 88.70%
Test ACC: 84.55%
In [15]:
%watermark -iv
numpy       1.16.4
torch       1.2.0
matplotlib  3.1.0
torchvision 0.4.0a0+6b959ee