Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.3.0

Filter Response Normalization for Network-in-Network CIFAR-10 Classifier

The CNN architecture is based on

  • Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).

This notebook implements Filter Response Normalization as a drop-in replacement for BatchNorm, based on the paper:

  • S. Singh and S. Krishnan (2019). Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks. https://arxiv.org/abs/1911.09737
In [2]:
import torch
import torch.nn as nn


class FilterResponseNormalization(nn.Module):
    def __init__(self, num_features, eps=1e-6):
        super(FilterResponseNormalization, self).__init__()
        
        self.register_parameter('beta', 
                                torch.nn.Parameter(
                                        torch.empty([1, num_features, 1, 1]).normal_()))
    
        self.register_parameter('gamma', 
                                torch.nn.Parameter(
                                        torch.empty([1, num_features, 1, 1]).normal_()))
        
        self.register_parameter('tau', 
                                torch.nn.Parameter(
                                        torch.empty([1, num_features, 1, 1]).normal_()))
        
        self.eps = torch.Tensor([eps])

    def forward(self, x):
        # forward function based on
        # https://github.com/gupta-abhay/pytorch-frn/blob/master/frn.py
        n, c, h, w = x.size()
        
        self.eps = self.eps.to(self.tau.device)

        nu2 = torch.mean(x.pow(2), (2, 3), keepdims=True)
        x = x * torch.rsqrt(nu2 + torch.abs(self.eps))
        return torch.max(self.gamma*x + self.beta, self.tau)

Additional Imports

In [3]:
import os
import time

import numpy as np
import pandas as pd

import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import Subset

from torchvision import datasets
from torchvision import transforms

import matplotlib.pyplot as plt
from PIL import Image


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True

Model Settings

In [4]:
##########################
### SETTINGS
##########################

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.00005
BATCH_SIZE = 256
NUM_EPOCHS = 100

# Architecture
NUM_CLASSES = 10

# Other
DEVICE = "cuda:0"
GRAYSCALE = False
In [5]:
##########################
### CIFAR-10 Dataset
##########################


# Note transforms.ToTensor() scales input images
# to 0-1 range


train_indices = torch.arange(0, 49000)
valid_indices = torch.arange(49000, 50000)


train_and_valid = datasets.CIFAR10(root='data', 
                                   train=True, 
                                   transform=transforms.ToTensor(),
                                   download=True)

train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)


test_dataset = datasets.CIFAR10(root='data', 
                                train=False, 
                                transform=transforms.ToTensor())


#####################################################
### Data Loaders
#####################################################

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=True)

valid_loader = DataLoader(dataset=valid_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=False)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE,
                         num_workers=8,
                         shuffle=False)

#####################################################

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break

for images, labels in test_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
    
for images, labels in valid_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Files already downloaded and verified
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])

Filter Response Normalization

In [6]:
##########################
### MODEL
##########################


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.classifier = nn.Sequential(
                nn.Conv2d(3, 192, kernel_size=5, stride=1, padding=2, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.Conv2d(192, 160, kernel_size=1, stride=1, padding=0, bias=False),
                FilterResponseNormalization(160),
                #nn.ReLU(inplace=True),
                nn.Conv2d(160,  96, kernel_size=1, stride=1, padding=0, bias=False),
                FilterResponseNormalization(96),
                #nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(96, 192, kernel_size=5, stride=1, padding=2, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                FilterResponseNormalization(192),
                #nn.ReLU(inplace=True),
                nn.Conv2d(192,  10, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=8, stride=1, padding=0),

                )

    def forward(self, x):
        x = self.classifier(x)
        logits = x.view(x.size(0), self.num_classes)
        probas = torch.softmax(logits, dim=1)
        return logits, probas
In [ ]:
 
In [7]:
torch.manual_seed(RANDOM_SEED)

model = NiN(NUM_CLASSES)
model.to(DEVICE)

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  
In [8]:
def compute_accuracy(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for i, (features, targets) in enumerate(data_loader):
            
        features = features.to(device)
        targets = targets.to(device)

        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
    

start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/192 | Cost: 2.3138
Epoch: 001/100 | Batch 120/192 | Cost: 2.1962
Epoch: 001/100 Train Acc.: 22.35% | Validation Acc.: 24.80%
Time elapsed: 0.60 min
Epoch: 002/100 | Batch 000/192 | Cost: 2.1328
Epoch: 002/100 | Batch 120/192 | Cost: 2.0764
Epoch: 002/100 Train Acc.: 24.54% | Validation Acc.: 26.70%
Time elapsed: 1.21 min
Epoch: 003/100 | Batch 000/192 | Cost: 2.0738
Epoch: 003/100 | Batch 120/192 | Cost: 1.9631
Epoch: 003/100 Train Acc.: 28.22% | Validation Acc.: 32.10%
Time elapsed: 1.80 min
Epoch: 004/100 | Batch 000/192 | Cost: 2.0147
Epoch: 004/100 | Batch 120/192 | Cost: 1.9606
Epoch: 004/100 Train Acc.: 29.99% | Validation Acc.: 31.10%
Time elapsed: 2.40 min
Epoch: 005/100 | Batch 000/192 | Cost: 1.9880
Epoch: 005/100 | Batch 120/192 | Cost: 1.9411
Epoch: 005/100 Train Acc.: 31.99% | Validation Acc.: 33.40%
Time elapsed: 3.00 min
Epoch: 006/100 | Batch 000/192 | Cost: 1.9271
Epoch: 006/100 | Batch 120/192 | Cost: 1.9257
Epoch: 006/100 Train Acc.: 36.23% | Validation Acc.: 36.90%
Time elapsed: 3.60 min
Epoch: 007/100 | Batch 000/192 | Cost: 1.9154
Epoch: 007/100 | Batch 120/192 | Cost: 1.9543
Epoch: 007/100 Train Acc.: 37.08% | Validation Acc.: 39.60%
Time elapsed: 4.19 min
Epoch: 008/100 | Batch 000/192 | Cost: 1.7458
Epoch: 008/100 | Batch 120/192 | Cost: 1.7728
Epoch: 008/100 Train Acc.: 40.16% | Validation Acc.: 41.90%
Time elapsed: 4.79 min
Epoch: 009/100 | Batch 000/192 | Cost: 1.7029
Epoch: 009/100 | Batch 120/192 | Cost: 1.5784
Epoch: 009/100 Train Acc.: 42.81% | Validation Acc.: 42.80%
Time elapsed: 5.39 min
Epoch: 010/100 | Batch 000/192 | Cost: 1.6909
Epoch: 010/100 | Batch 120/192 | Cost: 1.6210
Epoch: 010/100 Train Acc.: 44.37% | Validation Acc.: 44.80%
Time elapsed: 5.98 min
Epoch: 011/100 | Batch 000/192 | Cost: 1.5235
Epoch: 011/100 | Batch 120/192 | Cost: 1.7265
Epoch: 011/100 Train Acc.: 43.96% | Validation Acc.: 41.70%
Time elapsed: 6.58 min
Epoch: 012/100 | Batch 000/192 | Cost: 1.6440
Epoch: 012/100 | Batch 120/192 | Cost: 1.6264
Epoch: 012/100 Train Acc.: 45.15% | Validation Acc.: 47.90%
Time elapsed: 7.18 min
Epoch: 013/100 | Batch 000/192 | Cost: 1.5785
Epoch: 013/100 | Batch 120/192 | Cost: 1.4762
Epoch: 013/100 Train Acc.: 47.02% | Validation Acc.: 47.70%
Time elapsed: 7.78 min
Epoch: 014/100 | Batch 000/192 | Cost: 1.5027
Epoch: 014/100 | Batch 120/192 | Cost: 1.4684
Epoch: 014/100 Train Acc.: 49.24% | Validation Acc.: 51.60%
Time elapsed: 8.37 min
Epoch: 015/100 | Batch 000/192 | Cost: 1.4039
Epoch: 015/100 | Batch 120/192 | Cost: 1.3735
Epoch: 015/100 Train Acc.: 50.82% | Validation Acc.: 49.60%
Time elapsed: 8.97 min
Epoch: 016/100 | Batch 000/192 | Cost: 1.3997
Epoch: 016/100 | Batch 120/192 | Cost: 1.2700
Epoch: 016/100 Train Acc.: 53.31% | Validation Acc.: 52.50%
Time elapsed: 9.57 min
Epoch: 017/100 | Batch 000/192 | Cost: 1.2491
Epoch: 017/100 | Batch 120/192 | Cost: 1.3126
Epoch: 017/100 Train Acc.: 53.79% | Validation Acc.: 54.50%
Time elapsed: 10.17 min
Epoch: 018/100 | Batch 000/192 | Cost: 1.1533
Epoch: 018/100 | Batch 120/192 | Cost: 1.2444
Epoch: 018/100 Train Acc.: 54.08% | Validation Acc.: 54.70%
Time elapsed: 10.76 min
Epoch: 019/100 | Batch 000/192 | Cost: 1.3304
Epoch: 019/100 | Batch 120/192 | Cost: 1.2897
Epoch: 019/100 Train Acc.: 50.59% | Validation Acc.: 49.70%
Time elapsed: 11.36 min
Epoch: 020/100 | Batch 000/192 | Cost: 1.3537
Epoch: 020/100 | Batch 120/192 | Cost: 1.3673
Epoch: 020/100 Train Acc.: 56.56% | Validation Acc.: 58.20%
Time elapsed: 11.96 min
Epoch: 021/100 | Batch 000/192 | Cost: 1.2397
Epoch: 021/100 | Batch 120/192 | Cost: 1.2370
Epoch: 021/100 Train Acc.: 56.75% | Validation Acc.: 55.90%
Time elapsed: 12.56 min
Epoch: 022/100 | Batch 000/192 | Cost: 1.1461
Epoch: 022/100 | Batch 120/192 | Cost: 1.2558
Epoch: 022/100 Train Acc.: 57.97% | Validation Acc.: 56.80%
Time elapsed: 13.15 min
Epoch: 023/100 | Batch 000/192 | Cost: 1.2667
Epoch: 023/100 | Batch 120/192 | Cost: 1.1730
Epoch: 023/100 Train Acc.: 57.39% | Validation Acc.: 58.30%
Time elapsed: 13.75 min
Epoch: 024/100 | Batch 000/192 | Cost: 1.1891
Epoch: 024/100 | Batch 120/192 | Cost: 1.1483
Epoch: 024/100 Train Acc.: 58.32% | Validation Acc.: 59.20%
Time elapsed: 14.35 min
Epoch: 025/100 | Batch 000/192 | Cost: 1.2057
Epoch: 025/100 | Batch 120/192 | Cost: 1.1603
Epoch: 025/100 Train Acc.: 59.83% | Validation Acc.: 58.80%
Time elapsed: 14.94 min
Epoch: 026/100 | Batch 000/192 | Cost: 1.0880
Epoch: 026/100 | Batch 120/192 | Cost: 1.2100
Epoch: 026/100 Train Acc.: 60.84% | Validation Acc.: 59.70%
Time elapsed: 15.54 min
Epoch: 027/100 | Batch 000/192 | Cost: 1.0250
Epoch: 027/100 | Batch 120/192 | Cost: 1.2468
Epoch: 027/100 Train Acc.: 60.00% | Validation Acc.: 58.70%
Time elapsed: 16.14 min
Epoch: 028/100 | Batch 000/192 | Cost: 1.1053
Epoch: 028/100 | Batch 120/192 | Cost: 1.1635
Epoch: 028/100 Train Acc.: 60.78% | Validation Acc.: 58.50%
Time elapsed: 16.74 min
Epoch: 029/100 | Batch 000/192 | Cost: 1.0719
Epoch: 029/100 | Batch 120/192 | Cost: 1.1288
Epoch: 029/100 Train Acc.: 61.81% | Validation Acc.: 60.30%
Time elapsed: 17.33 min
Epoch: 030/100 | Batch 000/192 | Cost: 1.1443
Epoch: 030/100 | Batch 120/192 | Cost: 1.0943
Epoch: 030/100 Train Acc.: 60.08% | Validation Acc.: 56.70%
Time elapsed: 17.93 min
Epoch: 031/100 | Batch 000/192 | Cost: 1.0960
Epoch: 031/100 | Batch 120/192 | Cost: 1.1090
Epoch: 031/100 Train Acc.: 62.24% | Validation Acc.: 61.20%
Time elapsed: 18.52 min
Epoch: 032/100 | Batch 000/192 | Cost: 1.0046
Epoch: 032/100 | Batch 120/192 | Cost: 1.2110
Epoch: 032/100 Train Acc.: 60.71% | Validation Acc.: 62.30%
Time elapsed: 19.12 min
Epoch: 033/100 | Batch 000/192 | Cost: 1.0136
Epoch: 033/100 | Batch 120/192 | Cost: 1.0387
Epoch: 033/100 Train Acc.: 62.54% | Validation Acc.: 60.90%
Time elapsed: 19.72 min
Epoch: 034/100 | Batch 000/192 | Cost: 1.1746
Epoch: 034/100 | Batch 120/192 | Cost: 1.1025
Epoch: 034/100 Train Acc.: 64.18% | Validation Acc.: 64.10%
Time elapsed: 20.32 min
Epoch: 035/100 | Batch 000/192 | Cost: 0.9348
Epoch: 035/100 | Batch 120/192 | Cost: 1.0161
Epoch: 035/100 Train Acc.: 64.49% | Validation Acc.: 62.80%
Time elapsed: 20.91 min
Epoch: 036/100 | Batch 000/192 | Cost: 0.9798
Epoch: 036/100 | Batch 120/192 | Cost: 1.0668
Epoch: 036/100 Train Acc.: 63.90% | Validation Acc.: 63.00%
Time elapsed: 21.51 min
Epoch: 037/100 | Batch 000/192 | Cost: 0.9383
Epoch: 037/100 | Batch 120/192 | Cost: 1.0818
Epoch: 037/100 Train Acc.: 62.82% | Validation Acc.: 61.30%
Time elapsed: 22.11 min
Epoch: 038/100 | Batch 000/192 | Cost: 0.9943
Epoch: 038/100 | Batch 120/192 | Cost: 0.9690
Epoch: 038/100 Train Acc.: 64.09% | Validation Acc.: 62.90%
Time elapsed: 22.71 min
Epoch: 039/100 | Batch 000/192 | Cost: 1.0250
Epoch: 039/100 | Batch 120/192 | Cost: 1.0926
Epoch: 039/100 Train Acc.: 64.78% | Validation Acc.: 63.20%
Time elapsed: 23.31 min
Epoch: 040/100 | Batch 000/192 | Cost: 0.9488
Epoch: 040/100 | Batch 120/192 | Cost: 1.0708
Epoch: 040/100 Train Acc.: 64.24% | Validation Acc.: 61.30%
Time elapsed: 23.91 min
Epoch: 041/100 | Batch 000/192 | Cost: 1.0845
Epoch: 041/100 | Batch 120/192 | Cost: 1.0342
Epoch: 041/100 Train Acc.: 65.33% | Validation Acc.: 65.30%
Time elapsed: 24.50 min
Epoch: 042/100 | Batch 000/192 | Cost: 0.9705
Epoch: 042/100 | Batch 120/192 | Cost: 0.9504
Epoch: 042/100 Train Acc.: 65.32% | Validation Acc.: 63.70%
Time elapsed: 25.10 min
Epoch: 043/100 | Batch 000/192 | Cost: 0.9865
Epoch: 043/100 | Batch 120/192 | Cost: 0.9781
Epoch: 043/100 Train Acc.: 65.94% | Validation Acc.: 67.00%
Time elapsed: 25.70 min
Epoch: 044/100 | Batch 000/192 | Cost: 0.9381
Epoch: 044/100 | Batch 120/192 | Cost: 0.8644
Epoch: 044/100 Train Acc.: 65.39% | Validation Acc.: 63.10%
Time elapsed: 26.30 min
Epoch: 045/100 | Batch 000/192 | Cost: 0.9364
Epoch: 045/100 | Batch 120/192 | Cost: 0.9044
Epoch: 045/100 Train Acc.: 66.59% | Validation Acc.: 63.90%
Time elapsed: 26.90 min
Epoch: 046/100 | Batch 000/192 | Cost: 1.0168
Epoch: 046/100 | Batch 120/192 | Cost: 1.0457
Epoch: 046/100 Train Acc.: 67.00% | Validation Acc.: 64.60%
Time elapsed: 27.49 min
Epoch: 047/100 | Batch 000/192 | Cost: 0.9355
Epoch: 047/100 | Batch 120/192 | Cost: 0.8651
Epoch: 047/100 Train Acc.: 66.29% | Validation Acc.: 63.80%
Time elapsed: 28.10 min
Epoch: 048/100 | Batch 000/192 | Cost: 0.8658
Epoch: 048/100 | Batch 120/192 | Cost: 0.9858
Epoch: 048/100 Train Acc.: 67.62% | Validation Acc.: 66.30%
Time elapsed: 28.69 min
Epoch: 049/100 | Batch 000/192 | Cost: 0.9450
Epoch: 049/100 | Batch 120/192 | Cost: 0.8909
Epoch: 049/100 Train Acc.: 65.02% | Validation Acc.: 62.30%
Time elapsed: 29.29 min
Epoch: 050/100 | Batch 000/192 | Cost: 1.0726
Epoch: 050/100 | Batch 120/192 | Cost: 0.8521
Epoch: 050/100 Train Acc.: 65.28% | Validation Acc.: 61.80%
Time elapsed: 29.89 min
Epoch: 051/100 | Batch 000/192 | Cost: 1.0874
Epoch: 051/100 | Batch 120/192 | Cost: 0.8756
Epoch: 051/100 Train Acc.: 67.28% | Validation Acc.: 66.20%
Time elapsed: 30.49 min
Epoch: 052/100 | Batch 000/192 | Cost: 0.8616
Epoch: 052/100 | Batch 120/192 | Cost: 0.8323
Epoch: 052/100 Train Acc.: 67.98% | Validation Acc.: 66.70%
Time elapsed: 31.09 min
Epoch: 053/100 | Batch 000/192 | Cost: 0.8961
Epoch: 053/100 | Batch 120/192 | Cost: 0.8221
Epoch: 053/100 Train Acc.: 67.47% | Validation Acc.: 67.00%
Time elapsed: 31.68 min
Epoch: 054/100 | Batch 000/192 | Cost: 1.0299
Epoch: 054/100 | Batch 120/192 | Cost: 0.9036
Epoch: 054/100 Train Acc.: 68.66% | Validation Acc.: 66.30%
Time elapsed: 32.28 min
Epoch: 055/100 | Batch 000/192 | Cost: 0.9415
Epoch: 055/100 | Batch 120/192 | Cost: 0.9244
Epoch: 055/100 Train Acc.: 69.47% | Validation Acc.: 67.60%
Time elapsed: 32.88 min
Epoch: 056/100 | Batch 000/192 | Cost: 0.9208
Epoch: 056/100 | Batch 120/192 | Cost: 0.8804
Epoch: 056/100 Train Acc.: 68.89% | Validation Acc.: 65.20%
Time elapsed: 33.48 min
Epoch: 057/100 | Batch 000/192 | Cost: 0.8637
Epoch: 057/100 | Batch 120/192 | Cost: 0.8408
Epoch: 057/100 Train Acc.: 69.76% | Validation Acc.: 65.50%
Time elapsed: 34.08 min
Epoch: 058/100 | Batch 000/192 | Cost: 0.8583
Epoch: 058/100 | Batch 120/192 | Cost: 0.8561
Epoch: 058/100 Train Acc.: 68.19% | Validation Acc.: 67.10%
Time elapsed: 34.68 min
Epoch: 059/100 | Batch 000/192 | Cost: 1.0149
Epoch: 059/100 | Batch 120/192 | Cost: 0.8976
Epoch: 059/100 Train Acc.: 69.50% | Validation Acc.: 67.40%
Time elapsed: 35.28 min
Epoch: 060/100 | Batch 000/192 | Cost: 0.8902
Epoch: 060/100 | Batch 120/192 | Cost: 0.9281
Epoch: 060/100 Train Acc.: 70.08% | Validation Acc.: 67.60%
Time elapsed: 35.87 min
Epoch: 061/100 | Batch 000/192 | Cost: 0.8918
Epoch: 061/100 | Batch 120/192 | Cost: 0.8235
Epoch: 061/100 Train Acc.: 71.08% | Validation Acc.: 66.80%
Time elapsed: 36.47 min
Epoch: 062/100 | Batch 000/192 | Cost: 0.8430
Epoch: 062/100 | Batch 120/192 | Cost: 0.9288
Epoch: 062/100 Train Acc.: 69.94% | Validation Acc.: 67.90%
Time elapsed: 37.07 min
Epoch: 063/100 | Batch 000/192 | Cost: 0.8885
Epoch: 063/100 | Batch 120/192 | Cost: 0.9188
Epoch: 063/100 Train Acc.: 69.76% | Validation Acc.: 67.50%
Time elapsed: 37.66 min
Epoch: 064/100 | Batch 000/192 | Cost: 0.7734
Epoch: 064/100 | Batch 120/192 | Cost: 0.8158
Epoch: 064/100 Train Acc.: 68.91% | Validation Acc.: 67.60%
Time elapsed: 38.26 min
Epoch: 065/100 | Batch 000/192 | Cost: 0.8541
Epoch: 065/100 | Batch 120/192 | Cost: 0.8135
Epoch: 065/100 Train Acc.: 70.94% | Validation Acc.: 67.80%
Time elapsed: 38.86 min
Epoch: 066/100 | Batch 000/192 | Cost: 0.8057
Epoch: 066/100 | Batch 120/192 | Cost: 0.9248
Epoch: 066/100 Train Acc.: 71.24% | Validation Acc.: 68.10%
Time elapsed: 39.46 min
Epoch: 067/100 | Batch 000/192 | Cost: 0.7940
Epoch: 067/100 | Batch 120/192 | Cost: 0.8346
Epoch: 067/100 Train Acc.: 71.26% | Validation Acc.: 68.00%
Time elapsed: 40.05 min
Epoch: 068/100 | Batch 000/192 | Cost: 0.7323
Epoch: 068/100 | Batch 120/192 | Cost: 0.8404
Epoch: 068/100 Train Acc.: 71.13% | Validation Acc.: 69.20%
Time elapsed: 40.65 min
Epoch: 069/100 | Batch 000/192 | Cost: 0.8699
Epoch: 069/100 | Batch 120/192 | Cost: 0.9162
Epoch: 069/100 Train Acc.: 68.77% | Validation Acc.: 65.40%
Time elapsed: 41.25 min
Epoch: 070/100 | Batch 000/192 | Cost: 0.8281
Epoch: 070/100 | Batch 120/192 | Cost: 0.7858
Epoch: 070/100 Train Acc.: 72.29% | Validation Acc.: 68.50%
Time elapsed: 41.85 min
Epoch: 071/100 | Batch 000/192 | Cost: 0.8033
Epoch: 071/100 | Batch 120/192 | Cost: 0.7622
Epoch: 071/100 Train Acc.: 70.89% | Validation Acc.: 68.20%
Time elapsed: 42.44 min
Epoch: 072/100 | Batch 000/192 | Cost: 0.8036
Epoch: 072/100 | Batch 120/192 | Cost: 0.7706
Epoch: 072/100 Train Acc.: 71.73% | Validation Acc.: 69.30%
Time elapsed: 43.04 min
Epoch: 073/100 | Batch 000/192 | Cost: 0.8105
Epoch: 073/100 | Batch 120/192 | Cost: 0.7701
Epoch: 073/100 Train Acc.: 72.28% | Validation Acc.: 68.50%
Time elapsed: 43.64 min
Epoch: 074/100 | Batch 000/192 | Cost: 0.8407
Epoch: 074/100 | Batch 120/192 | Cost: 0.7073
Epoch: 074/100 Train Acc.: 71.31% | Validation Acc.: 66.80%
Time elapsed: 44.23 min
Epoch: 075/100 | Batch 000/192 | Cost: 0.8167
Epoch: 075/100 | Batch 120/192 | Cost: 0.8992
Epoch: 075/100 Train Acc.: 70.88% | Validation Acc.: 67.70%
Time elapsed: 44.83 min
Epoch: 076/100 | Batch 000/192 | Cost: 0.8213
Epoch: 076/100 | Batch 120/192 | Cost: 0.7992
Epoch: 076/100 Train Acc.: 73.29% | Validation Acc.: 69.70%
Time elapsed: 45.43 min
Epoch: 077/100 | Batch 000/192 | Cost: 0.7490
Epoch: 077/100 | Batch 120/192 | Cost: 0.8266
Epoch: 077/100 Train Acc.: 71.30% | Validation Acc.: 69.50%
Time elapsed: 46.03 min
Epoch: 078/100 | Batch 000/192 | Cost: 0.8243
Epoch: 078/100 | Batch 120/192 | Cost: 0.8135
Epoch: 078/100 Train Acc.: 72.05% | Validation Acc.: 69.60%
Time elapsed: 46.62 min
Epoch: 079/100 | Batch 000/192 | Cost: 0.8013
Epoch: 079/100 | Batch 120/192 | Cost: 0.7685
Epoch: 079/100 Train Acc.: 71.57% | Validation Acc.: 69.10%
Time elapsed: 47.22 min
Epoch: 080/100 | Batch 000/192 | Cost: 0.7319
Epoch: 080/100 | Batch 120/192 | Cost: 0.8254
Epoch: 080/100 Train Acc.: 74.19% | Validation Acc.: 71.50%
Time elapsed: 47.82 min
Epoch: 081/100 | Batch 000/192 | Cost: 0.6880
Epoch: 081/100 | Batch 120/192 | Cost: 0.8099
Epoch: 081/100 Train Acc.: 73.16% | Validation Acc.: 70.70%
Time elapsed: 48.42 min
Epoch: 082/100 | Batch 000/192 | Cost: 0.6497
Epoch: 082/100 | Batch 120/192 | Cost: 0.8208
Epoch: 082/100 Train Acc.: 72.91% | Validation Acc.: 71.60%
Time elapsed: 49.01 min
Epoch: 083/100 | Batch 000/192 | Cost: 0.6912
Epoch: 083/100 | Batch 120/192 | Cost: 0.7313
Epoch: 083/100 Train Acc.: 74.04% | Validation Acc.: 70.10%
Time elapsed: 49.61 min
Epoch: 084/100 | Batch 000/192 | Cost: 0.7115
Epoch: 084/100 | Batch 120/192 | Cost: 0.7145
Epoch: 084/100 Train Acc.: 74.19% | Validation Acc.: 71.40%
Time elapsed: 50.21 min
Epoch: 085/100 | Batch 000/192 | Cost: 0.7889
Epoch: 085/100 | Batch 120/192 | Cost: 0.6902
Epoch: 085/100 Train Acc.: 73.75% | Validation Acc.: 69.70%
Time elapsed: 50.80 min
Epoch: 086/100 | Batch 000/192 | Cost: 0.7790
Epoch: 086/100 | Batch 120/192 | Cost: 0.7807
Epoch: 086/100 Train Acc.: 74.25% | Validation Acc.: 70.50%
Time elapsed: 51.40 min
Epoch: 087/100 | Batch 000/192 | Cost: 0.6930
Epoch: 087/100 | Batch 120/192 | Cost: 0.7695
Epoch: 087/100 Train Acc.: 73.66% | Validation Acc.: 71.30%
Time elapsed: 52.00 min
Epoch: 088/100 | Batch 000/192 | Cost: 0.7779
Epoch: 088/100 | Batch 120/192 | Cost: 0.7066
Epoch: 088/100 Train Acc.: 74.70% | Validation Acc.: 70.70%
Time elapsed: 52.60 min
Epoch: 089/100 | Batch 000/192 | Cost: 0.7385
Epoch: 089/100 | Batch 120/192 | Cost: 0.7996
Epoch: 089/100 Train Acc.: 73.74% | Validation Acc.: 69.80%
Time elapsed: 53.20 min
Epoch: 090/100 | Batch 000/192 | Cost: 0.7230
Epoch: 090/100 | Batch 120/192 | Cost: 0.7889
Epoch: 090/100 Train Acc.: 75.10% | Validation Acc.: 71.70%
Time elapsed: 53.80 min
Epoch: 091/100 | Batch 000/192 | Cost: 0.7388
Epoch: 091/100 | Batch 120/192 | Cost: 0.7569
Epoch: 091/100 Train Acc.: 75.22% | Validation Acc.: 71.90%
Time elapsed: 54.39 min
Epoch: 092/100 | Batch 000/192 | Cost: 0.6327
Epoch: 092/100 | Batch 120/192 | Cost: 0.6937
Epoch: 092/100 Train Acc.: 75.02% | Validation Acc.: 72.40%
Time elapsed: 54.99 min
Epoch: 093/100 | Batch 000/192 | Cost: 0.7830
Epoch: 093/100 | Batch 120/192 | Cost: 0.6699
Epoch: 093/100 Train Acc.: 75.35% | Validation Acc.: 71.20%
Time elapsed: 55.59 min
Epoch: 094/100 | Batch 000/192 | Cost: 0.7633
Epoch: 094/100 | Batch 120/192 | Cost: 0.7601
Epoch: 094/100 Train Acc.: 74.38% | Validation Acc.: 71.20%
Time elapsed: 56.19 min
Epoch: 095/100 | Batch 000/192 | Cost: 0.6946
Epoch: 095/100 | Batch 120/192 | Cost: 0.8123
Epoch: 095/100 Train Acc.: 75.07% | Validation Acc.: 70.90%
Time elapsed: 56.79 min
Epoch: 096/100 | Batch 000/192 | Cost: 0.6740
Epoch: 096/100 | Batch 120/192 | Cost: 0.6667
Epoch: 096/100 Train Acc.: 75.40% | Validation Acc.: 73.30%
Time elapsed: 57.38 min
Epoch: 097/100 | Batch 000/192 | Cost: 0.6683
Epoch: 097/100 | Batch 120/192 | Cost: 0.6792
Epoch: 097/100 Train Acc.: 73.43% | Validation Acc.: 68.00%
Time elapsed: 57.98 min
Epoch: 098/100 | Batch 000/192 | Cost: 0.8212
Epoch: 098/100 | Batch 120/192 | Cost: 0.8133
Epoch: 098/100 Train Acc.: 75.72% | Validation Acc.: 72.80%
Time elapsed: 58.58 min
Epoch: 099/100 | Batch 000/192 | Cost: 0.7529
Epoch: 099/100 | Batch 120/192 | Cost: 0.7609
Epoch: 099/100 Train Acc.: 73.84% | Validation Acc.: 70.60%
Time elapsed: 59.18 min
Epoch: 100/100 | Batch 000/192 | Cost: 0.8095
Epoch: 100/100 | Batch 120/192 | Cost: 0.7173
Epoch: 100/100 Train Acc.: 75.09% | Validation Acc.: 72.10%
Time elapsed: 59.78 min
Total Training Time: 59.78 min

Batch Normalization (for comparison)

In [9]:
##########################
### MODEL
##########################


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.classifier = nn.Sequential(
                nn.Conv2d(3, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 160, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(160),
                nn.ReLU(inplace=True),
                nn.Conv2d(160,  96, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(96),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(96, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192,  10, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=8, stride=1, padding=0),

                )

    def forward(self, x):
        x = self.classifier(x)
        logits = x.view(x.size(0), self.num_classes)
        probas = torch.softmax(logits, dim=1)
        return logits, probas
In [10]:
start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/192 | Cost: 0.6825
Epoch: 001/100 | Batch 120/192 | Cost: 0.7027
Epoch: 001/100 Train Acc.: 76.08% | Validation Acc.: 72.60%
Time elapsed: 0.60 min
Epoch: 002/100 | Batch 000/192 | Cost: 0.6714
Epoch: 002/100 | Batch 120/192 | Cost: 0.6445
Epoch: 002/100 Train Acc.: 76.35% | Validation Acc.: 73.40%
Time elapsed: 1.20 min
Epoch: 003/100 | Batch 000/192 | Cost: 0.7066
Epoch: 003/100 | Batch 120/192 | Cost: 0.6379
Epoch: 003/100 Train Acc.: 76.63% | Validation Acc.: 73.80%
Time elapsed: 1.80 min
Epoch: 004/100 | Batch 000/192 | Cost: 0.7371
Epoch: 004/100 | Batch 120/192 | Cost: 0.7181
Epoch: 004/100 Train Acc.: 75.50% | Validation Acc.: 70.80%
Time elapsed: 2.40 min
Epoch: 005/100 | Batch 000/192 | Cost: 0.7634
Epoch: 005/100 | Batch 120/192 | Cost: 0.6887
Epoch: 005/100 Train Acc.: 77.08% | Validation Acc.: 73.50%
Time elapsed: 3.00 min
Epoch: 006/100 | Batch 000/192 | Cost: 0.6509
Epoch: 006/100 | Batch 120/192 | Cost: 0.7140
Epoch: 006/100 Train Acc.: 76.52% | Validation Acc.: 73.70%
Time elapsed: 3.60 min
Epoch: 007/100 | Batch 000/192 | Cost: 0.7971
Epoch: 007/100 | Batch 120/192 | Cost: 0.6890
Epoch: 007/100 Train Acc.: 76.42% | Validation Acc.: 72.60%
Time elapsed: 4.20 min
Epoch: 008/100 | Batch 000/192 | Cost: 0.6466
Epoch: 008/100 | Batch 120/192 | Cost: 0.5831
Epoch: 008/100 Train Acc.: 76.12% | Validation Acc.: 72.60%
Time elapsed: 4.80 min
Epoch: 009/100 | Batch 000/192 | Cost: 0.6249
Epoch: 009/100 | Batch 120/192 | Cost: 0.7722
Epoch: 009/100 Train Acc.: 76.25% | Validation Acc.: 73.00%
Time elapsed: 5.40 min
Epoch: 010/100 | Batch 000/192 | Cost: 0.6890
Epoch: 010/100 | Batch 120/192 | Cost: 0.6893
Epoch: 010/100 Train Acc.: 76.44% | Validation Acc.: 72.30%
Time elapsed: 6.00 min
Epoch: 011/100 | Batch 000/192 | Cost: 0.5889
Epoch: 011/100 | Batch 120/192 | Cost: 0.6722
Epoch: 011/100 Train Acc.: 75.29% | Validation Acc.: 71.20%
Time elapsed: 6.60 min
Epoch: 012/100 | Batch 000/192 | Cost: 0.7587
Epoch: 012/100 | Batch 120/192 | Cost: 0.6418
Epoch: 012/100 Train Acc.: 74.49% | Validation Acc.: 70.20%
Time elapsed: 7.20 min
Epoch: 013/100 | Batch 000/192 | Cost: 0.7850
Epoch: 013/100 | Batch 120/192 | Cost: 0.6407
Epoch: 013/100 Train Acc.: 76.05% | Validation Acc.: 71.00%
Time elapsed: 7.80 min
Epoch: 014/100 | Batch 000/192 | Cost: 0.8101
Epoch: 014/100 | Batch 120/192 | Cost: 0.6613
Epoch: 014/100 Train Acc.: 78.08% | Validation Acc.: 75.60%
Time elapsed: 8.40 min
Epoch: 015/100 | Batch 000/192 | Cost: 0.6271
Epoch: 015/100 | Batch 120/192 | Cost: 0.6263
Epoch: 015/100 Train Acc.: 77.98% | Validation Acc.: 73.80%
Time elapsed: 9.00 min
Epoch: 016/100 | Batch 000/192 | Cost: 0.6320
Epoch: 016/100 | Batch 120/192 | Cost: 0.7301
Epoch: 016/100 Train Acc.: 78.47% | Validation Acc.: 74.30%
Time elapsed: 9.59 min
Epoch: 017/100 | Batch 000/192 | Cost: 0.6161
Epoch: 017/100 | Batch 120/192 | Cost: 0.5296
Epoch: 017/100 Train Acc.: 75.61% | Validation Acc.: 71.90%
Time elapsed: 10.19 min
Epoch: 018/100 | Batch 000/192 | Cost: 0.7155
Epoch: 018/100 | Batch 120/192 | Cost: 0.6337
Epoch: 018/100 Train Acc.: 78.82% | Validation Acc.: 73.20%
Time elapsed: 10.79 min
Epoch: 019/100 | Batch 000/192 | Cost: 0.5872
Epoch: 019/100 | Batch 120/192 | Cost: 0.7798
Epoch: 019/100 Train Acc.: 77.89% | Validation Acc.: 71.50%
Time elapsed: 11.39 min
Epoch: 020/100 | Batch 000/192 | Cost: 0.6479
Epoch: 020/100 | Batch 120/192 | Cost: 0.7479
Epoch: 020/100 Train Acc.: 77.54% | Validation Acc.: 73.90%
Time elapsed: 11.99 min
Epoch: 021/100 | Batch 000/192 | Cost: 0.6184
Epoch: 021/100 | Batch 120/192 | Cost: 0.5969
Epoch: 021/100 Train Acc.: 78.09% | Validation Acc.: 73.10%
Time elapsed: 12.59 min
Epoch: 022/100 | Batch 000/192 | Cost: 0.6987
Epoch: 022/100 | Batch 120/192 | Cost: 0.6107
Epoch: 022/100 Train Acc.: 79.20% | Validation Acc.: 75.40%
Time elapsed: 13.19 min
Epoch: 023/100 | Batch 000/192 | Cost: 0.6745
Epoch: 023/100 | Batch 120/192 | Cost: 0.6630
Epoch: 023/100 Train Acc.: 77.09% | Validation Acc.: 73.50%
Time elapsed: 13.79 min
Epoch: 024/100 | Batch 000/192 | Cost: 0.5587
Epoch: 024/100 | Batch 120/192 | Cost: 0.7128
Epoch: 024/100 Train Acc.: 78.96% | Validation Acc.: 75.10%
Time elapsed: 14.39 min
Epoch: 025/100 | Batch 000/192 | Cost: 0.5177
Epoch: 025/100 | Batch 120/192 | Cost: 0.5650
Epoch: 025/100 Train Acc.: 78.93% | Validation Acc.: 74.90%
Time elapsed: 14.99 min
Epoch: 026/100 | Batch 000/192 | Cost: 0.5755
Epoch: 026/100 | Batch 120/192 | Cost: 0.5524
Epoch: 026/100 Train Acc.: 78.21% | Validation Acc.: 72.10%
Time elapsed: 15.59 min
Epoch: 027/100 | Batch 000/192 | Cost: 0.6927
Epoch: 027/100 | Batch 120/192 | Cost: 0.6578
Epoch: 027/100 Train Acc.: 77.63% | Validation Acc.: 74.20%
Time elapsed: 16.19 min
Epoch: 028/100 | Batch 000/192 | Cost: 0.5719
Epoch: 028/100 | Batch 120/192 | Cost: 0.5817
Epoch: 028/100 Train Acc.: 79.22% | Validation Acc.: 75.80%
Time elapsed: 16.78 min
Epoch: 029/100 | Batch 000/192 | Cost: 0.7459
Epoch: 029/100 | Batch 120/192 | Cost: 0.5700
Epoch: 029/100 Train Acc.: 78.72% | Validation Acc.: 73.20%
Time elapsed: 17.38 min
Epoch: 030/100 | Batch 000/192 | Cost: 0.6114
Epoch: 030/100 | Batch 120/192 | Cost: 0.6975
Epoch: 030/100 Train Acc.: 79.26% | Validation Acc.: 74.50%
Time elapsed: 17.98 min
Epoch: 031/100 | Batch 000/192 | Cost: 0.6240
Epoch: 031/100 | Batch 120/192 | Cost: 0.5819
Epoch: 031/100 Train Acc.: 78.59% | Validation Acc.: 73.50%
Time elapsed: 18.58 min
Epoch: 032/100 | Batch 000/192 | Cost: 0.6015
Epoch: 032/100 | Batch 120/192 | Cost: 0.6598
Epoch: 032/100 Train Acc.: 79.42% | Validation Acc.: 74.90%
Time elapsed: 19.18 min
Epoch: 033/100 | Batch 000/192 | Cost: 0.5757
Epoch: 033/100 | Batch 120/192 | Cost: 0.5761
Epoch: 033/100 Train Acc.: 79.55% | Validation Acc.: 73.30%
Time elapsed: 19.78 min
Epoch: 034/100 | Batch 000/192 | Cost: 0.5700
Epoch: 034/100 | Batch 120/192 | Cost: 0.5877
Epoch: 034/100 Train Acc.: 78.55% | Validation Acc.: 74.30%
Time elapsed: 20.38 min
Epoch: 035/100 | Batch 000/192 | Cost: 0.7023
Epoch: 035/100 | Batch 120/192 | Cost: 0.6592
Epoch: 035/100 Train Acc.: 79.70% | Validation Acc.: 74.50%
Time elapsed: 20.98 min
Epoch: 036/100 | Batch 000/192 | Cost: 0.6325
Epoch: 036/100 | Batch 120/192 | Cost: 0.5737
Epoch: 036/100 Train Acc.: 78.72% | Validation Acc.: 74.00%
Time elapsed: 21.58 min
Epoch: 037/100 | Batch 000/192 | Cost: 0.6329
Epoch: 037/100 | Batch 120/192 | Cost: 0.6539
Epoch: 037/100 Train Acc.: 79.14% | Validation Acc.: 75.80%
Time elapsed: 22.18 min
Epoch: 038/100 | Batch 000/192 | Cost: 0.6387
Epoch: 038/100 | Batch 120/192 | Cost: 0.7099
Epoch: 038/100 Train Acc.: 79.74% | Validation Acc.: 75.80%
Time elapsed: 22.78 min
Epoch: 039/100 | Batch 000/192 | Cost: 0.5981
Epoch: 039/100 | Batch 120/192 | Cost: 0.6197
Epoch: 039/100 Train Acc.: 79.78% | Validation Acc.: 75.20%
Time elapsed: 23.38 min
Epoch: 040/100 | Batch 000/192 | Cost: 0.5893
Epoch: 040/100 | Batch 120/192 | Cost: 0.6424
Epoch: 040/100 Train Acc.: 79.92% | Validation Acc.: 75.40%
Time elapsed: 23.97 min
Epoch: 041/100 | Batch 000/192 | Cost: 0.5046
Epoch: 041/100 | Batch 120/192 | Cost: 0.6603
Epoch: 041/100 Train Acc.: 80.10% | Validation Acc.: 76.90%
Time elapsed: 24.57 min
Epoch: 042/100 | Batch 000/192 | Cost: 0.5880
Epoch: 042/100 | Batch 120/192 | Cost: 0.6180
Epoch: 042/100 Train Acc.: 80.03% | Validation Acc.: 75.80%
Time elapsed: 25.17 min
Epoch: 043/100 | Batch 000/192 | Cost: 0.5328
Epoch: 043/100 | Batch 120/192 | Cost: 0.5885
Epoch: 043/100 Train Acc.: 80.61% | Validation Acc.: 75.30%
Time elapsed: 25.77 min
Epoch: 044/100 | Batch 000/192 | Cost: 0.5354
Epoch: 044/100 | Batch 120/192 | Cost: 0.6034
Epoch: 044/100 Train Acc.: 79.90% | Validation Acc.: 75.40%
Time elapsed: 26.37 min
Epoch: 045/100 | Batch 000/192 | Cost: 0.5417
Epoch: 045/100 | Batch 120/192 | Cost: 0.5952
Epoch: 045/100 Train Acc.: 80.90% | Validation Acc.: 76.70%
Time elapsed: 26.97 min
Epoch: 046/100 | Batch 000/192 | Cost: 0.5907
Epoch: 046/100 | Batch 120/192 | Cost: 0.5139
Epoch: 046/100 Train Acc.: 80.51% | Validation Acc.: 75.30%
Time elapsed: 27.57 min
Epoch: 047/100 | Batch 000/192 | Cost: 0.5880
Epoch: 047/100 | Batch 120/192 | Cost: 0.6240
Epoch: 047/100 Train Acc.: 79.33% | Validation Acc.: 75.70%
Time elapsed: 28.17 min
Epoch: 048/100 | Batch 000/192 | Cost: 0.5946
Epoch: 048/100 | Batch 120/192 | Cost: 0.4510
Epoch: 048/100 Train Acc.: 78.83% | Validation Acc.: 74.50%
Time elapsed: 28.76 min
Epoch: 049/100 | Batch 000/192 | Cost: 0.4876
Epoch: 049/100 | Batch 120/192 | Cost: 0.5322
Epoch: 049/100 Train Acc.: 79.87% | Validation Acc.: 73.60%
Time elapsed: 29.37 min
Epoch: 050/100 | Batch 000/192 | Cost: 0.5297
Epoch: 050/100 | Batch 120/192 | Cost: 0.5505
Epoch: 050/100 Train Acc.: 80.06% | Validation Acc.: 74.70%
Time elapsed: 29.96 min
Epoch: 051/100 | Batch 000/192 | Cost: 0.5628
Epoch: 051/100 | Batch 120/192 | Cost: 0.6086
Epoch: 051/100 Train Acc.: 79.69% | Validation Acc.: 74.50%
Time elapsed: 30.56 min
Epoch: 052/100 | Batch 000/192 | Cost: 0.5363
Epoch: 052/100 | Batch 120/192 | Cost: 0.5686
Epoch: 052/100 Train Acc.: 79.82% | Validation Acc.: 75.40%
Time elapsed: 31.17 min
Epoch: 053/100 | Batch 000/192 | Cost: 0.5480
Epoch: 053/100 | Batch 120/192 | Cost: 0.6160
Epoch: 053/100 Train Acc.: 81.01% | Validation Acc.: 74.90%
Time elapsed: 31.77 min
Epoch: 054/100 | Batch 000/192 | Cost: 0.6155
Epoch: 054/100 | Batch 120/192 | Cost: 0.5478
Epoch: 054/100 Train Acc.: 80.68% | Validation Acc.: 76.30%
Time elapsed: 32.37 min
Epoch: 055/100 | Batch 000/192 | Cost: 0.5578
Epoch: 055/100 | Batch 120/192 | Cost: 0.5159
Epoch: 055/100 Train Acc.: 81.08% | Validation Acc.: 76.00%
Time elapsed: 32.97 min
Epoch: 056/100 | Batch 000/192 | Cost: 0.5334
Epoch: 056/100 | Batch 120/192 | Cost: 0.4940
Epoch: 056/100 Train Acc.: 80.93% | Validation Acc.: 74.80%
Time elapsed: 33.56 min
Epoch: 057/100 | Batch 000/192 | Cost: 0.4510
Epoch: 057/100 | Batch 120/192 | Cost: 0.4787
Epoch: 057/100 Train Acc.: 80.99% | Validation Acc.: 77.60%
Time elapsed: 34.16 min
Epoch: 058/100 | Batch 000/192 | Cost: 0.5937
Epoch: 058/100 | Batch 120/192 | Cost: 0.5977
Epoch: 058/100 Train Acc.: 81.55% | Validation Acc.: 76.60%
Time elapsed: 34.76 min
Epoch: 059/100 | Batch 000/192 | Cost: 0.5205
Epoch: 059/100 | Batch 120/192 | Cost: 0.5913
Epoch: 059/100 Train Acc.: 81.43% | Validation Acc.: 75.50%
Time elapsed: 35.36 min
Epoch: 060/100 | Batch 000/192 | Cost: 0.5559
Epoch: 060/100 | Batch 120/192 | Cost: 0.5472
Epoch: 060/100 Train Acc.: 81.28% | Validation Acc.: 76.00%
Time elapsed: 35.96 min
Epoch: 061/100 | Batch 000/192 | Cost: 0.5187
Epoch: 061/100 | Batch 120/192 | Cost: 0.5881
Epoch: 061/100 Train Acc.: 80.71% | Validation Acc.: 75.30%
Time elapsed: 36.56 min
Epoch: 062/100 | Batch 000/192 | Cost: 0.5069
Epoch: 062/100 | Batch 120/192 | Cost: 0.4619
Epoch: 062/100 Train Acc.: 79.42% | Validation Acc.: 74.50%
Time elapsed: 37.16 min
Epoch: 063/100 | Batch 000/192 | Cost: 0.5226
Epoch: 063/100 | Batch 120/192 | Cost: 0.4582
Epoch: 063/100 Train Acc.: 82.28% | Validation Acc.: 76.50%
Time elapsed: 37.76 min
Epoch: 064/100 | Batch 000/192 | Cost: 0.4503
Epoch: 064/100 | Batch 120/192 | Cost: 0.6131
Epoch: 064/100 Train Acc.: 81.77% | Validation Acc.: 77.10%
Time elapsed: 38.36 min
Epoch: 065/100 | Batch 000/192 | Cost: 0.5485
Epoch: 065/100 | Batch 120/192 | Cost: 0.4611
Epoch: 065/100 Train Acc.: 82.12% | Validation Acc.: 76.90%
Time elapsed: 38.96 min
Epoch: 066/100 | Batch 000/192 | Cost: 0.5312
Epoch: 066/100 | Batch 120/192 | Cost: 0.5280
Epoch: 066/100 Train Acc.: 80.55% | Validation Acc.: 75.70%
Time elapsed: 39.56 min
Epoch: 067/100 | Batch 000/192 | Cost: 0.5594
Epoch: 067/100 | Batch 120/192 | Cost: 0.5339
Epoch: 067/100 Train Acc.: 82.44% | Validation Acc.: 75.90%
Time elapsed: 40.16 min
Epoch: 068/100 | Batch 000/192 | Cost: 0.5033
Epoch: 068/100 | Batch 120/192 | Cost: 0.5830
Epoch: 068/100 Train Acc.: 80.29% | Validation Acc.: 77.00%
Time elapsed: 40.76 min
Epoch: 069/100 | Batch 000/192 | Cost: 0.5595
Epoch: 069/100 | Batch 120/192 | Cost: 0.4824
Epoch: 069/100 Train Acc.: 82.34% | Validation Acc.: 76.60%
Time elapsed: 41.36 min
Epoch: 070/100 | Batch 000/192 | Cost: 0.4599
Epoch: 070/100 | Batch 120/192 | Cost: 0.5537
Epoch: 070/100 Train Acc.: 80.60% | Validation Acc.: 76.40%
Time elapsed: 41.96 min
Epoch: 071/100 | Batch 000/192 | Cost: 0.4914
Epoch: 071/100 | Batch 120/192 | Cost: 0.4687
Epoch: 071/100 Train Acc.: 81.68% | Validation Acc.: 78.70%
Time elapsed: 42.56 min
Epoch: 072/100 | Batch 000/192 | Cost: 0.5055
Epoch: 072/100 | Batch 120/192 | Cost: 0.5267
Epoch: 072/100 Train Acc.: 82.67% | Validation Acc.: 78.50%
Time elapsed: 43.16 min
Epoch: 073/100 | Batch 000/192 | Cost: 0.4431
Epoch: 073/100 | Batch 120/192 | Cost: 0.5050
Epoch: 073/100 Train Acc.: 80.47% | Validation Acc.: 75.50%
Time elapsed: 43.75 min
Epoch: 074/100 | Batch 000/192 | Cost: 0.6785
Epoch: 074/100 | Batch 120/192 | Cost: 0.4457
Epoch: 074/100 Train Acc.: 83.49% | Validation Acc.: 78.10%
Time elapsed: 44.35 min
Epoch: 075/100 | Batch 000/192 | Cost: 0.4841
Epoch: 075/100 | Batch 120/192 | Cost: 0.5260
Epoch: 075/100 Train Acc.: 82.56% | Validation Acc.: 76.60%
Time elapsed: 44.95 min
Epoch: 076/100 | Batch 000/192 | Cost: 0.4382
Epoch: 076/100 | Batch 120/192 | Cost: 0.5470
Epoch: 076/100 Train Acc.: 80.99% | Validation Acc.: 76.40%
Time elapsed: 45.55 min
Epoch: 077/100 | Batch 000/192 | Cost: 0.5573
Epoch: 077/100 | Batch 120/192 | Cost: 0.5162
Epoch: 077/100 Train Acc.: 81.44% | Validation Acc.: 75.30%
Time elapsed: 46.15 min
Epoch: 078/100 | Batch 000/192 | Cost: 0.6098
Epoch: 078/100 | Batch 120/192 | Cost: 0.5280
Epoch: 078/100 Train Acc.: 81.36% | Validation Acc.: 77.40%
Time elapsed: 46.75 min
Epoch: 079/100 | Batch 000/192 | Cost: 0.5197
Epoch: 079/100 | Batch 120/192 | Cost: 0.5272
Epoch: 079/100 Train Acc.: 83.33% | Validation Acc.: 79.00%
Time elapsed: 47.35 min
Epoch: 080/100 | Batch 000/192 | Cost: 0.4899
Epoch: 080/100 | Batch 120/192 | Cost: 0.4799
Epoch: 080/100 Train Acc.: 82.94% | Validation Acc.: 77.20%
Time elapsed: 47.95 min
Epoch: 081/100 | Batch 000/192 | Cost: 0.5248
Epoch: 081/100 | Batch 120/192 | Cost: 0.5740
Epoch: 081/100 Train Acc.: 82.80% | Validation Acc.: 76.40%
Time elapsed: 48.54 min
Epoch: 082/100 | Batch 000/192 | Cost: 0.5918
Epoch: 082/100 | Batch 120/192 | Cost: 0.5965
Epoch: 082/100 Train Acc.: 82.79% | Validation Acc.: 79.90%
Time elapsed: 49.14 min
Epoch: 083/100 | Batch 000/192 | Cost: 0.4830
Epoch: 083/100 | Batch 120/192 | Cost: 0.5613
Epoch: 083/100 Train Acc.: 82.44% | Validation Acc.: 77.20%
Time elapsed: 49.74 min
Epoch: 084/100 | Batch 000/192 | Cost: 0.4969
Epoch: 084/100 | Batch 120/192 | Cost: 0.4625
Epoch: 084/100 Train Acc.: 82.37% | Validation Acc.: 77.20%
Time elapsed: 50.34 min
Epoch: 085/100 | Batch 000/192 | Cost: 0.5664
Epoch: 085/100 | Batch 120/192 | Cost: 0.4476
Epoch: 085/100 Train Acc.: 83.63% | Validation Acc.: 78.10%
Time elapsed: 50.94 min
Epoch: 086/100 | Batch 000/192 | Cost: 0.4885
Epoch: 086/100 | Batch 120/192 | Cost: 0.5724
Epoch: 086/100 Train Acc.: 82.88% | Validation Acc.: 77.10%
Time elapsed: 51.54 min
Epoch: 087/100 | Batch 000/192 | Cost: 0.4967
Epoch: 087/100 | Batch 120/192 | Cost: 0.3785
Epoch: 087/100 Train Acc.: 83.54% | Validation Acc.: 77.70%
Time elapsed: 52.14 min
Epoch: 088/100 | Batch 000/192 | Cost: 0.3773
Epoch: 088/100 | Batch 120/192 | Cost: 0.5794
Epoch: 088/100 Train Acc.: 83.13% | Validation Acc.: 77.00%
Time elapsed: 52.74 min
Epoch: 089/100 | Batch 000/192 | Cost: 0.4332
Epoch: 089/100 | Batch 120/192 | Cost: 0.5796
Epoch: 089/100 Train Acc.: 83.47% | Validation Acc.: 78.10%
Time elapsed: 53.34 min
Epoch: 090/100 | Batch 000/192 | Cost: 0.4397
Epoch: 090/100 | Batch 120/192 | Cost: 0.4944
Epoch: 090/100 Train Acc.: 83.38% | Validation Acc.: 78.10%
Time elapsed: 53.94 min
Epoch: 091/100 | Batch 000/192 | Cost: 0.3981
Epoch: 091/100 | Batch 120/192 | Cost: 0.5605
Epoch: 091/100 Train Acc.: 81.94% | Validation Acc.: 77.50%
Time elapsed: 54.54 min
Epoch: 092/100 | Batch 000/192 | Cost: 0.4216
Epoch: 092/100 | Batch 120/192 | Cost: 0.4957
Epoch: 092/100 Train Acc.: 83.90% | Validation Acc.: 78.60%
Time elapsed: 55.14 min
Epoch: 093/100 | Batch 000/192 | Cost: 0.4642
Epoch: 093/100 | Batch 120/192 | Cost: 0.4166
Epoch: 093/100 Train Acc.: 80.66% | Validation Acc.: 76.10%
Time elapsed: 55.74 min
Epoch: 094/100 | Batch 000/192 | Cost: 0.6470
Epoch: 094/100 | Batch 120/192 | Cost: 0.4605
Epoch: 094/100 Train Acc.: 82.41% | Validation Acc.: 77.70%
Time elapsed: 56.34 min
Epoch: 095/100 | Batch 000/192 | Cost: 0.5082
Epoch: 095/100 | Batch 120/192 | Cost: 0.5747
Epoch: 095/100 Train Acc.: 83.41% | Validation Acc.: 77.90%
Time elapsed: 56.94 min
Epoch: 096/100 | Batch 000/192 | Cost: 0.5011
Epoch: 096/100 | Batch 120/192 | Cost: 0.4896
Epoch: 096/100 Train Acc.: 83.62% | Validation Acc.: 77.80%
Time elapsed: 57.53 min
Epoch: 097/100 | Batch 000/192 | Cost: 0.4947
Epoch: 097/100 | Batch 120/192 | Cost: 0.4069
Epoch: 097/100 Train Acc.: 82.59% | Validation Acc.: 77.80%
Time elapsed: 58.13 min
Epoch: 098/100 | Batch 000/192 | Cost: 0.5170
Epoch: 098/100 | Batch 120/192 | Cost: 0.5355
Epoch: 098/100 Train Acc.: 84.31% | Validation Acc.: 80.50%
Time elapsed: 58.73 min
Epoch: 099/100 | Batch 000/192 | Cost: 0.3841
Epoch: 099/100 | Batch 120/192 | Cost: 0.4383
Epoch: 099/100 Train Acc.: 84.64% | Validation Acc.: 78.80%
Time elapsed: 59.33 min
Epoch: 100/100 | Batch 000/192 | Cost: 0.3343
Epoch: 100/100 | Batch 120/192 | Cost: 0.5621
Epoch: 100/100 Train Acc.: 83.69% | Validation Acc.: 78.20%
Time elapsed: 59.93 min
Total Training Time: 59.93 min