Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.3.0

BatchNorm before and after Activation for Network-in-Network CIFAR-10 Classifier

The CNN architecture is based on

  • Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).

This paper compares using BatchNorm before the activation function as suggested in

and after the activation function as it is nowadays common practice.

Imports

In [2]:
import os
import time

import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import Subset

from torchvision import datasets
from torchvision import transforms

import matplotlib.pyplot as plt
from PIL import Image


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True

Model Settings

In [3]:
##########################
### SETTINGS
##########################

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.0005
BATCH_SIZE = 256
NUM_EPOCHS = 100

# Architecture
NUM_CLASSES = 10

# Other
DEVICE = "cuda:2"
GRAYSCALE = False
In [4]:
##########################
### CIFAR-10 Dataset
##########################


# Note transforms.ToTensor() scales input images
# to 0-1 range


train_indices = torch.arange(0, 49000)
valid_indices = torch.arange(49000, 50000)


train_and_valid = datasets.CIFAR10(root='data', 
                                   train=True, 
                                   transform=transforms.ToTensor(),
                                   download=True)

train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)


test_dataset = datasets.CIFAR10(root='data', 
                                train=False, 
                                transform=transforms.ToTensor())


#####################################################
### Data Loaders
#####################################################

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=True)

valid_loader = DataLoader(dataset=valid_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=False)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE,
                         num_workers=8,
                         shuffle=False)

#####################################################

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break

for images, labels in test_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
    
for images, labels in valid_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Files already downloaded and verified
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])
Image batch dimensions: torch.Size([256, 3, 32, 32])
Image label dimensions: torch.Size([256])

Without BatchNorm

In [5]:
##########################
### MODEL
##########################


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.classifier = nn.Sequential(
                nn.Conv2d(3, 192, kernel_size=5, stride=1, padding=2),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 160, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.Conv2d(160,  96, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(96, 192, kernel_size=5, stride=1, padding=2),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.Conv2d(192,  10, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=8, stride=1, padding=0),

                )

    def forward(self, x):
        x = self.classifier(x)
        logits = x.view(x.size(0), self.num_classes)
        probas = torch.softmax(logits, dim=1)
        return logits, probas
In [6]:
torch.manual_seed(RANDOM_SEED)

model = NiN(NUM_CLASSES)
model.to(DEVICE)

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  
In [7]:
def compute_accuracy(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for i, (features, targets) in enumerate(data_loader):
            
        features = features.to(device)
        targets = targets.to(device)

        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
    

start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/192 | Cost: 2.3043
Epoch: 001/100 | Batch 120/192 | Cost: 2.0653
Epoch: 001/100 Train Acc.: 24.69% | Validation Acc.: 24.50%
Time elapsed: 0.33 min
Epoch: 002/100 | Batch 000/192 | Cost: 1.8584
Epoch: 002/100 | Batch 120/192 | Cost: 1.7447
Epoch: 002/100 Train Acc.: 36.51% | Validation Acc.: 36.90%
Time elapsed: 0.65 min
Epoch: 003/100 | Batch 000/192 | Cost: 1.6050
Epoch: 003/100 | Batch 120/192 | Cost: 1.5591
Epoch: 003/100 Train Acc.: 40.50% | Validation Acc.: 37.50%
Time elapsed: 0.97 min
Epoch: 004/100 | Batch 000/192 | Cost: 1.5428
Epoch: 004/100 | Batch 120/192 | Cost: 1.4454
Epoch: 004/100 Train Acc.: 46.12% | Validation Acc.: 45.80%
Time elapsed: 1.30 min
Epoch: 005/100 | Batch 000/192 | Cost: 1.4038
Epoch: 005/100 | Batch 120/192 | Cost: 1.4141
Epoch: 005/100 Train Acc.: 50.21% | Validation Acc.: 49.90%
Time elapsed: 1.63 min
Epoch: 006/100 | Batch 000/192 | Cost: 1.3475
Epoch: 006/100 | Batch 120/192 | Cost: 1.2627
Epoch: 006/100 Train Acc.: 52.66% | Validation Acc.: 54.40%
Time elapsed: 1.97 min
Epoch: 007/100 | Batch 000/192 | Cost: 1.3238
Epoch: 007/100 | Batch 120/192 | Cost: 1.2220
Epoch: 007/100 Train Acc.: 54.42% | Validation Acc.: 54.40%
Time elapsed: 2.31 min
Epoch: 008/100 | Batch 000/192 | Cost: 1.2009
Epoch: 008/100 | Batch 120/192 | Cost: 1.2045
Epoch: 008/100 Train Acc.: 55.81% | Validation Acc.: 55.30%
Time elapsed: 2.65 min
Epoch: 009/100 | Batch 000/192 | Cost: 1.2797
Epoch: 009/100 | Batch 120/192 | Cost: 1.1397
Epoch: 009/100 Train Acc.: 59.10% | Validation Acc.: 60.60%
Time elapsed: 2.98 min
Epoch: 010/100 | Batch 000/192 | Cost: 1.0562
Epoch: 010/100 | Batch 120/192 | Cost: 1.1625
Epoch: 010/100 Train Acc.: 59.79% | Validation Acc.: 60.90%
Time elapsed: 3.32 min
Epoch: 011/100 | Batch 000/192 | Cost: 1.0868
Epoch: 011/100 | Batch 120/192 | Cost: 1.0636
Epoch: 011/100 Train Acc.: 60.43% | Validation Acc.: 61.00%
Time elapsed: 3.66 min
Epoch: 012/100 | Batch 000/192 | Cost: 1.0049
Epoch: 012/100 | Batch 120/192 | Cost: 1.2247
Epoch: 012/100 Train Acc.: 62.14% | Validation Acc.: 62.70%
Time elapsed: 4.00 min
Epoch: 013/100 | Batch 000/192 | Cost: 0.9232
Epoch: 013/100 | Batch 120/192 | Cost: 1.0345
Epoch: 013/100 Train Acc.: 61.42% | Validation Acc.: 61.70%
Time elapsed: 4.34 min
Epoch: 014/100 | Batch 000/192 | Cost: 0.9256
Epoch: 014/100 | Batch 120/192 | Cost: 1.1639
Epoch: 014/100 Train Acc.: 63.82% | Validation Acc.: 65.80%
Time elapsed: 4.68 min
Epoch: 015/100 | Batch 000/192 | Cost: 0.9600
Epoch: 015/100 | Batch 120/192 | Cost: 1.0263
Epoch: 015/100 Train Acc.: 63.94% | Validation Acc.: 64.00%
Time elapsed: 5.02 min
Epoch: 016/100 | Batch 000/192 | Cost: 0.8859
Epoch: 016/100 | Batch 120/192 | Cost: 1.0307
Epoch: 016/100 Train Acc.: 65.79% | Validation Acc.: 66.40%
Time elapsed: 5.36 min
Epoch: 017/100 | Batch 000/192 | Cost: 1.0020
Epoch: 017/100 | Batch 120/192 | Cost: 0.9755
Epoch: 017/100 Train Acc.: 66.95% | Validation Acc.: 66.60%
Time elapsed: 5.70 min
Epoch: 018/100 | Batch 000/192 | Cost: 0.9551
Epoch: 018/100 | Batch 120/192 | Cost: 0.8429
Epoch: 018/100 Train Acc.: 67.56% | Validation Acc.: 66.30%
Time elapsed: 6.04 min
Epoch: 019/100 | Batch 000/192 | Cost: 1.0420
Epoch: 019/100 | Batch 120/192 | Cost: 0.9771
Epoch: 019/100 Train Acc.: 69.44% | Validation Acc.: 68.20%
Time elapsed: 6.38 min
Epoch: 020/100 | Batch 000/192 | Cost: 0.8471
Epoch: 020/100 | Batch 120/192 | Cost: 0.8322
Epoch: 020/100 Train Acc.: 69.99% | Validation Acc.: 70.20%
Time elapsed: 6.72 min
Epoch: 021/100 | Batch 000/192 | Cost: 0.8974
Epoch: 021/100 | Batch 120/192 | Cost: 0.8585
Epoch: 021/100 Train Acc.: 69.52% | Validation Acc.: 69.30%
Time elapsed: 7.07 min
Epoch: 022/100 | Batch 000/192 | Cost: 0.8691
Epoch: 022/100 | Batch 120/192 | Cost: 0.6618
Epoch: 022/100 Train Acc.: 68.26% | Validation Acc.: 65.90%
Time elapsed: 7.41 min
Epoch: 023/100 | Batch 000/192 | Cost: 0.9277
Epoch: 023/100 | Batch 120/192 | Cost: 0.9011
Epoch: 023/100 Train Acc.: 71.66% | Validation Acc.: 72.10%
Time elapsed: 7.75 min
Epoch: 024/100 | Batch 000/192 | Cost: 0.7764
Epoch: 024/100 | Batch 120/192 | Cost: 0.7561
Epoch: 024/100 Train Acc.: 71.70% | Validation Acc.: 68.80%
Time elapsed: 8.09 min
Epoch: 025/100 | Batch 000/192 | Cost: 0.8113
Epoch: 025/100 | Batch 120/192 | Cost: 0.7186
Epoch: 025/100 Train Acc.: 73.62% | Validation Acc.: 73.00%
Time elapsed: 8.44 min
Epoch: 026/100 | Batch 000/192 | Cost: 0.6515
Epoch: 026/100 | Batch 120/192 | Cost: 0.6954
Epoch: 026/100 Train Acc.: 72.22% | Validation Acc.: 70.20%
Time elapsed: 8.78 min
Epoch: 027/100 | Batch 000/192 | Cost: 0.7278
Epoch: 027/100 | Batch 120/192 | Cost: 0.7117
Epoch: 027/100 Train Acc.: 74.82% | Validation Acc.: 72.30%
Time elapsed: 9.12 min
Epoch: 028/100 | Batch 000/192 | Cost: 0.6732
Epoch: 028/100 | Batch 120/192 | Cost: 0.6591
Epoch: 028/100 Train Acc.: 74.93% | Validation Acc.: 72.60%
Time elapsed: 9.46 min
Epoch: 029/100 | Batch 000/192 | Cost: 0.7438
Epoch: 029/100 | Batch 120/192 | Cost: 0.6429
Epoch: 029/100 Train Acc.: 75.44% | Validation Acc.: 72.80%
Time elapsed: 9.80 min
Epoch: 030/100 | Batch 000/192 | Cost: 0.7306
Epoch: 030/100 | Batch 120/192 | Cost: 0.6643
Epoch: 030/100 Train Acc.: 76.34% | Validation Acc.: 74.40%
Time elapsed: 10.15 min
Epoch: 031/100 | Batch 000/192 | Cost: 0.5957
Epoch: 031/100 | Batch 120/192 | Cost: 0.5574
Epoch: 031/100 Train Acc.: 76.60% | Validation Acc.: 75.90%
Time elapsed: 10.49 min
Epoch: 032/100 | Batch 000/192 | Cost: 0.6414
Epoch: 032/100 | Batch 120/192 | Cost: 0.6951
Epoch: 032/100 Train Acc.: 77.15% | Validation Acc.: 76.10%
Time elapsed: 10.83 min
Epoch: 033/100 | Batch 000/192 | Cost: 0.6898
Epoch: 033/100 | Batch 120/192 | Cost: 0.7784
Epoch: 033/100 Train Acc.: 77.15% | Validation Acc.: 74.70%
Time elapsed: 11.17 min
Epoch: 034/100 | Batch 000/192 | Cost: 0.5633
Epoch: 034/100 | Batch 120/192 | Cost: 0.6176
Epoch: 034/100 Train Acc.: 77.53% | Validation Acc.: 74.30%
Time elapsed: 11.52 min
Epoch: 035/100 | Batch 000/192 | Cost: 0.6300
Epoch: 035/100 | Batch 120/192 | Cost: 0.6720
Epoch: 035/100 Train Acc.: 78.39% | Validation Acc.: 76.10%
Time elapsed: 11.86 min
Epoch: 036/100 | Batch 000/192 | Cost: 0.7154
Epoch: 036/100 | Batch 120/192 | Cost: 0.6519
Epoch: 036/100 Train Acc.: 78.49% | Validation Acc.: 75.40%
Time elapsed: 12.20 min
Epoch: 037/100 | Batch 000/192 | Cost: 0.6381
Epoch: 037/100 | Batch 120/192 | Cost: 0.6618
Epoch: 037/100 Train Acc.: 79.58% | Validation Acc.: 75.80%
Time elapsed: 12.54 min
Epoch: 038/100 | Batch 000/192 | Cost: 0.6078
Epoch: 038/100 | Batch 120/192 | Cost: 0.5283
Epoch: 038/100 Train Acc.: 79.17% | Validation Acc.: 76.00%
Time elapsed: 12.88 min
Epoch: 039/100 | Batch 000/192 | Cost: 0.5576
Epoch: 039/100 | Batch 120/192 | Cost: 0.6219
Epoch: 039/100 Train Acc.: 79.91% | Validation Acc.: 76.70%
Time elapsed: 13.22 min
Epoch: 040/100 | Batch 000/192 | Cost: 0.5660
Epoch: 040/100 | Batch 120/192 | Cost: 0.5577
Epoch: 040/100 Train Acc.: 80.49% | Validation Acc.: 76.50%
Time elapsed: 13.56 min
Epoch: 041/100 | Batch 000/192 | Cost: 0.5098
Epoch: 041/100 | Batch 120/192 | Cost: 0.6621
Epoch: 041/100 Train Acc.: 80.86% | Validation Acc.: 75.70%
Time elapsed: 13.90 min
Epoch: 042/100 | Batch 000/192 | Cost: 0.4589
Epoch: 042/100 | Batch 120/192 | Cost: 0.5637
Epoch: 042/100 Train Acc.: 81.11% | Validation Acc.: 77.00%
Time elapsed: 14.24 min
Epoch: 043/100 | Batch 000/192 | Cost: 0.4507
Epoch: 043/100 | Batch 120/192 | Cost: 0.4865
Epoch: 043/100 Train Acc.: 82.07% | Validation Acc.: 78.10%
Time elapsed: 14.58 min
Epoch: 044/100 | Batch 000/192 | Cost: 0.4427
Epoch: 044/100 | Batch 120/192 | Cost: 0.5242
Epoch: 044/100 Train Acc.: 82.61% | Validation Acc.: 79.10%
Time elapsed: 14.92 min
Epoch: 045/100 | Batch 000/192 | Cost: 0.4989
Epoch: 045/100 | Batch 120/192 | Cost: 0.5811
Epoch: 045/100 Train Acc.: 82.55% | Validation Acc.: 79.30%
Time elapsed: 15.26 min
Epoch: 046/100 | Batch 000/192 | Cost: 0.5303
Epoch: 046/100 | Batch 120/192 | Cost: 0.4242
Epoch: 046/100 Train Acc.: 81.80% | Validation Acc.: 76.80%
Time elapsed: 15.60 min
Epoch: 047/100 | Batch 000/192 | Cost: 0.4491
Epoch: 047/100 | Batch 120/192 | Cost: 0.4902
Epoch: 047/100 Train Acc.: 82.54% | Validation Acc.: 77.90%
Time elapsed: 15.94 min
Epoch: 048/100 | Batch 000/192 | Cost: 0.4913
Epoch: 048/100 | Batch 120/192 | Cost: 0.6474
Epoch: 048/100 Train Acc.: 83.31% | Validation Acc.: 79.20%
Time elapsed: 16.28 min
Epoch: 049/100 | Batch 000/192 | Cost: 0.4585
Epoch: 049/100 | Batch 120/192 | Cost: 0.4845
Epoch: 049/100 Train Acc.: 83.53% | Validation Acc.: 78.40%
Time elapsed: 16.62 min
Epoch: 050/100 | Batch 000/192 | Cost: 0.6038
Epoch: 050/100 | Batch 120/192 | Cost: 0.5446
Epoch: 050/100 Train Acc.: 83.86% | Validation Acc.: 80.50%
Time elapsed: 16.96 min
Epoch: 051/100 | Batch 000/192 | Cost: 0.3793
Epoch: 051/100 | Batch 120/192 | Cost: 0.4499
Epoch: 051/100 Train Acc.: 83.11% | Validation Acc.: 76.80%
Time elapsed: 17.29 min
Epoch: 052/100 | Batch 000/192 | Cost: 0.5527
Epoch: 052/100 | Batch 120/192 | Cost: 0.4610
Epoch: 052/100 Train Acc.: 84.63% | Validation Acc.: 79.30%
Time elapsed: 17.63 min
Epoch: 053/100 | Batch 000/192 | Cost: 0.5015
Epoch: 053/100 | Batch 120/192 | Cost: 0.4079
Epoch: 053/100 Train Acc.: 84.18% | Validation Acc.: 77.60%
Time elapsed: 17.97 min
Epoch: 054/100 | Batch 000/192 | Cost: 0.5012
Epoch: 054/100 | Batch 120/192 | Cost: 0.4912
Epoch: 054/100 Train Acc.: 84.41% | Validation Acc.: 77.20%
Time elapsed: 18.30 min
Epoch: 055/100 | Batch 000/192 | Cost: 0.4015
Epoch: 055/100 | Batch 120/192 | Cost: 0.4919
Epoch: 055/100 Train Acc.: 85.16% | Validation Acc.: 80.20%
Time elapsed: 18.64 min
Epoch: 056/100 | Batch 000/192 | Cost: 0.3976
Epoch: 056/100 | Batch 120/192 | Cost: 0.4252
Epoch: 056/100 Train Acc.: 85.28% | Validation Acc.: 80.30%
Time elapsed: 18.97 min
Epoch: 057/100 | Batch 000/192 | Cost: 0.3372
Epoch: 057/100 | Batch 120/192 | Cost: 0.4634
Epoch: 057/100 Train Acc.: 84.29% | Validation Acc.: 78.60%
Time elapsed: 19.32 min
Epoch: 058/100 | Batch 000/192 | Cost: 0.4438
Epoch: 058/100 | Batch 120/192 | Cost: 0.3490
Epoch: 058/100 Train Acc.: 85.93% | Validation Acc.: 77.50%
Time elapsed: 19.66 min
Epoch: 059/100 | Batch 000/192 | Cost: 0.4541
Epoch: 059/100 | Batch 120/192 | Cost: 0.4415
Epoch: 059/100 Train Acc.: 84.34% | Validation Acc.: 78.40%
Time elapsed: 19.99 min
Epoch: 060/100 | Batch 000/192 | Cost: 0.3766
Epoch: 060/100 | Batch 120/192 | Cost: 0.4851
Epoch: 060/100 Train Acc.: 86.02% | Validation Acc.: 80.00%
Time elapsed: 20.33 min
Epoch: 061/100 | Batch 000/192 | Cost: 0.4967
Epoch: 061/100 | Batch 120/192 | Cost: 0.3708
Epoch: 061/100 Train Acc.: 85.57% | Validation Acc.: 79.50%
Time elapsed: 20.67 min
Epoch: 062/100 | Batch 000/192 | Cost: 0.4197
Epoch: 062/100 | Batch 120/192 | Cost: 0.3054
Epoch: 062/100 Train Acc.: 86.23% | Validation Acc.: 78.40%
Time elapsed: 21.01 min
Epoch: 063/100 | Batch 000/192 | Cost: 0.4595
Epoch: 063/100 | Batch 120/192 | Cost: 0.4200
Epoch: 063/100 Train Acc.: 86.52% | Validation Acc.: 79.80%
Time elapsed: 21.35 min
Epoch: 064/100 | Batch 000/192 | Cost: 0.3806
Epoch: 064/100 | Batch 120/192 | Cost: 0.3670
Epoch: 064/100 Train Acc.: 86.81% | Validation Acc.: 80.20%
Time elapsed: 21.69 min
Epoch: 065/100 | Batch 000/192 | Cost: 0.3922
Epoch: 065/100 | Batch 120/192 | Cost: 0.3698
Epoch: 065/100 Train Acc.: 86.30% | Validation Acc.: 77.90%
Time elapsed: 22.03 min
Epoch: 066/100 | Batch 000/192 | Cost: 0.3608
Epoch: 066/100 | Batch 120/192 | Cost: 0.4444
Epoch: 066/100 Train Acc.: 88.01% | Validation Acc.: 80.10%
Time elapsed: 22.37 min
Epoch: 067/100 | Batch 000/192 | Cost: 0.3374
Epoch: 067/100 | Batch 120/192 | Cost: 0.3158
Epoch: 067/100 Train Acc.: 87.94% | Validation Acc.: 80.40%
Time elapsed: 22.70 min
Epoch: 068/100 | Batch 000/192 | Cost: 0.3959
Epoch: 068/100 | Batch 120/192 | Cost: 0.2217
Epoch: 068/100 Train Acc.: 87.74% | Validation Acc.: 79.70%
Time elapsed: 23.04 min
Epoch: 069/100 | Batch 000/192 | Cost: 0.3795
Epoch: 069/100 | Batch 120/192 | Cost: 0.3398
Epoch: 069/100 Train Acc.: 88.28% | Validation Acc.: 79.70%
Time elapsed: 23.39 min
Epoch: 070/100 | Batch 000/192 | Cost: 0.3098
Epoch: 070/100 | Batch 120/192 | Cost: 0.3012
Epoch: 070/100 Train Acc.: 87.96% | Validation Acc.: 80.80%
Time elapsed: 23.73 min
Epoch: 071/100 | Batch 000/192 | Cost: 0.3705
Epoch: 071/100 | Batch 120/192 | Cost: 0.2943
Epoch: 071/100 Train Acc.: 88.02% | Validation Acc.: 79.90%
Time elapsed: 24.06 min
Epoch: 072/100 | Batch 000/192 | Cost: 0.3353
Epoch: 072/100 | Batch 120/192 | Cost: 0.3237
Epoch: 072/100 Train Acc.: 88.34% | Validation Acc.: 80.60%
Time elapsed: 24.40 min
Epoch: 073/100 | Batch 000/192 | Cost: 0.3683
Epoch: 073/100 | Batch 120/192 | Cost: 0.4178
Epoch: 073/100 Train Acc.: 88.93% | Validation Acc.: 80.10%
Time elapsed: 24.74 min
Epoch: 074/100 | Batch 000/192 | Cost: 0.2282
Epoch: 074/100 | Batch 120/192 | Cost: 0.1967
Epoch: 074/100 Train Acc.: 88.58% | Validation Acc.: 81.40%
Time elapsed: 25.08 min
Epoch: 075/100 | Batch 000/192 | Cost: 0.2701
Epoch: 075/100 | Batch 120/192 | Cost: 0.3722
Epoch: 075/100 Train Acc.: 87.93% | Validation Acc.: 79.70%
Time elapsed: 25.42 min
Epoch: 076/100 | Batch 000/192 | Cost: 0.2850
Epoch: 076/100 | Batch 120/192 | Cost: 0.2874
Epoch: 076/100 Train Acc.: 88.92% | Validation Acc.: 81.10%
Time elapsed: 25.75 min
Epoch: 077/100 | Batch 000/192 | Cost: 0.2686
Epoch: 077/100 | Batch 120/192 | Cost: 0.4312
Epoch: 077/100 Train Acc.: 89.39% | Validation Acc.: 81.60%
Time elapsed: 26.10 min
Epoch: 078/100 | Batch 000/192 | Cost: 0.2282
Epoch: 078/100 | Batch 120/192 | Cost: 0.3395
Epoch: 078/100 Train Acc.: 88.67% | Validation Acc.: 78.90%
Time elapsed: 26.43 min
Epoch: 079/100 | Batch 000/192 | Cost: 0.3127
Epoch: 079/100 | Batch 120/192 | Cost: 0.2906
Epoch: 079/100 Train Acc.: 90.77% | Validation Acc.: 81.20%
Time elapsed: 26.77 min
Epoch: 080/100 | Batch 000/192 | Cost: 0.2468
Epoch: 080/100 | Batch 120/192 | Cost: 0.3638
Epoch: 080/100 Train Acc.: 89.99% | Validation Acc.: 80.40%
Time elapsed: 27.11 min
Epoch: 081/100 | Batch 000/192 | Cost: 0.2936
Epoch: 081/100 | Batch 120/192 | Cost: 0.3772
Epoch: 081/100 Train Acc.: 90.76% | Validation Acc.: 80.50%
Time elapsed: 27.45 min
Epoch: 082/100 | Batch 000/192 | Cost: 0.2584
Epoch: 082/100 | Batch 120/192 | Cost: 0.2718
Epoch: 082/100 Train Acc.: 91.01% | Validation Acc.: 81.20%
Time elapsed: 27.79 min
Epoch: 083/100 | Batch 000/192 | Cost: 0.1904
Epoch: 083/100 | Batch 120/192 | Cost: 0.3090
Epoch: 083/100 Train Acc.: 90.68% | Validation Acc.: 81.30%
Time elapsed: 28.14 min
Epoch: 084/100 | Batch 000/192 | Cost: 0.2506
Epoch: 084/100 | Batch 120/192 | Cost: 0.2825
Epoch: 084/100 Train Acc.: 90.43% | Validation Acc.: 80.40%
Time elapsed: 28.47 min
Epoch: 085/100 | Batch 000/192 | Cost: 0.2307
Epoch: 085/100 | Batch 120/192 | Cost: 0.2441
Epoch: 085/100 Train Acc.: 90.88% | Validation Acc.: 81.30%
Time elapsed: 28.82 min
Epoch: 086/100 | Batch 000/192 | Cost: 0.3149
Epoch: 086/100 | Batch 120/192 | Cost: 0.3129
Epoch: 086/100 Train Acc.: 90.13% | Validation Acc.: 82.40%
Time elapsed: 29.16 min
Epoch: 087/100 | Batch 000/192 | Cost: 0.3487
Epoch: 087/100 | Batch 120/192 | Cost: 0.2559
Epoch: 087/100 Train Acc.: 90.74% | Validation Acc.: 81.40%
Time elapsed: 29.50 min
Epoch: 088/100 | Batch 000/192 | Cost: 0.2412
Epoch: 088/100 | Batch 120/192 | Cost: 0.1828
Epoch: 088/100 Train Acc.: 91.08% | Validation Acc.: 80.20%
Time elapsed: 29.84 min
Epoch: 089/100 | Batch 000/192 | Cost: 0.2957
Epoch: 089/100 | Batch 120/192 | Cost: 0.2939
Epoch: 089/100 Train Acc.: 90.67% | Validation Acc.: 80.30%
Time elapsed: 30.19 min
Epoch: 090/100 | Batch 000/192 | Cost: 0.2298
Epoch: 090/100 | Batch 120/192 | Cost: 0.2900
Epoch: 090/100 Train Acc.: 91.63% | Validation Acc.: 79.00%
Time elapsed: 30.53 min
Epoch: 091/100 | Batch 000/192 | Cost: 0.2558
Epoch: 091/100 | Batch 120/192 | Cost: 0.2915
Epoch: 091/100 Train Acc.: 91.36% | Validation Acc.: 81.00%
Time elapsed: 30.88 min
Epoch: 092/100 | Batch 000/192 | Cost: 0.1510
Epoch: 092/100 | Batch 120/192 | Cost: 0.1974
Epoch: 092/100 Train Acc.: 91.84% | Validation Acc.: 82.20%
Time elapsed: 31.22 min
Epoch: 093/100 | Batch 000/192 | Cost: 0.2308
Epoch: 093/100 | Batch 120/192 | Cost: 0.2247
Epoch: 093/100 Train Acc.: 91.50% | Validation Acc.: 80.50%
Time elapsed: 31.56 min
Epoch: 094/100 | Batch 000/192 | Cost: 0.2712
Epoch: 094/100 | Batch 120/192 | Cost: 0.3268
Epoch: 094/100 Train Acc.: 91.74% | Validation Acc.: 81.30%
Time elapsed: 31.91 min
Epoch: 095/100 | Batch 000/192 | Cost: 0.2417
Epoch: 095/100 | Batch 120/192 | Cost: 0.2162
Epoch: 095/100 Train Acc.: 91.53% | Validation Acc.: 79.00%
Time elapsed: 32.26 min
Epoch: 096/100 | Batch 000/192 | Cost: 0.2523
Epoch: 096/100 | Batch 120/192 | Cost: 0.2598
Epoch: 096/100 Train Acc.: 91.56% | Validation Acc.: 81.00%
Time elapsed: 32.60 min
Epoch: 097/100 | Batch 000/192 | Cost: 0.2027
Epoch: 097/100 | Batch 120/192 | Cost: 0.2432
Epoch: 097/100 Train Acc.: 92.53% | Validation Acc.: 80.80%
Time elapsed: 32.94 min
Epoch: 098/100 | Batch 000/192 | Cost: 0.2115
Epoch: 098/100 | Batch 120/192 | Cost: 0.2746
Epoch: 098/100 Train Acc.: 92.30% | Validation Acc.: 81.10%
Time elapsed: 33.28 min
Epoch: 099/100 | Batch 000/192 | Cost: 0.1611
Epoch: 099/100 | Batch 120/192 | Cost: 0.2142
Epoch: 099/100 Train Acc.: 92.66% | Validation Acc.: 80.90%
Time elapsed: 33.62 min
Epoch: 100/100 | Batch 000/192 | Cost: 0.1935
Epoch: 100/100 | Batch 120/192 | Cost: 0.2488
Epoch: 100/100 Train Acc.: 92.68% | Validation Acc.: 80.20%
Time elapsed: 33.97 min
Total Training Time: 33.97 min

BatchNorm before Activation

In [8]:
##########################
### MODEL
##########################


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.classifier = nn.Sequential(
                nn.Conv2d(3, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 160, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(160),
                nn.ReLU(inplace=True),
                nn.Conv2d(160,  96, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(96),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(96, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(192),
                nn.ReLU(inplace=True),
                nn.Conv2d(192,  10, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=8, stride=1, padding=0),

                )

    def forward(self, x):
        x = self.classifier(x)
        logits = x.view(x.size(0), self.num_classes)
        probas = torch.softmax(logits, dim=1)
        return logits, probas
In [9]:
torch.manual_seed(RANDOM_SEED)

model = NiN(NUM_CLASSES)
model.to(DEVICE)

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  
In [10]:
start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/192 | Cost: 2.3003
Epoch: 001/100 | Batch 120/192 | Cost: 1.1791
Epoch: 001/100 Train Acc.: 61.28% | Validation Acc.: 61.40%
Time elapsed: 0.37 min
Epoch: 002/100 | Batch 000/192 | Cost: 1.2742
Epoch: 002/100 | Batch 120/192 | Cost: 0.9198
Epoch: 002/100 Train Acc.: 69.36% | Validation Acc.: 66.70%
Time elapsed: 0.74 min
Epoch: 003/100 | Batch 000/192 | Cost: 0.7803
Epoch: 003/100 | Batch 120/192 | Cost: 0.8857
Epoch: 003/100 Train Acc.: 74.03% | Validation Acc.: 71.70%
Time elapsed: 1.11 min
Epoch: 004/100 | Batch 000/192 | Cost: 0.7233
Epoch: 004/100 | Batch 120/192 | Cost: 0.7254
Epoch: 004/100 Train Acc.: 76.76% | Validation Acc.: 75.80%
Time elapsed: 1.48 min
Epoch: 005/100 | Batch 000/192 | Cost: 0.6941
Epoch: 005/100 | Batch 120/192 | Cost: 0.7137
Epoch: 005/100 Train Acc.: 79.56% | Validation Acc.: 77.70%
Time elapsed: 1.84 min
Epoch: 006/100 | Batch 000/192 | Cost: 0.7098
Epoch: 006/100 | Batch 120/192 | Cost: 0.5519
Epoch: 006/100 Train Acc.: 80.33% | Validation Acc.: 78.80%
Time elapsed: 2.22 min
Epoch: 007/100 | Batch 000/192 | Cost: 0.6615
Epoch: 007/100 | Batch 120/192 | Cost: 0.5217
Epoch: 007/100 Train Acc.: 81.49% | Validation Acc.: 79.40%
Time elapsed: 2.58 min
Epoch: 008/100 | Batch 000/192 | Cost: 0.5005
Epoch: 008/100 | Batch 120/192 | Cost: 0.5437
Epoch: 008/100 Train Acc.: 83.25% | Validation Acc.: 80.10%
Time elapsed: 2.94 min
Epoch: 009/100 | Batch 000/192 | Cost: 0.4481
Epoch: 009/100 | Batch 120/192 | Cost: 0.5191
Epoch: 009/100 Train Acc.: 83.73% | Validation Acc.: 80.50%
Time elapsed: 3.32 min
Epoch: 010/100 | Batch 000/192 | Cost: 0.5392
Epoch: 010/100 | Batch 120/192 | Cost: 0.4766
Epoch: 010/100 Train Acc.: 84.86% | Validation Acc.: 80.20%
Time elapsed: 3.68 min
Epoch: 011/100 | Batch 000/192 | Cost: 0.4486
Epoch: 011/100 | Batch 120/192 | Cost: 0.5472
Epoch: 011/100 Train Acc.: 86.29% | Validation Acc.: 82.30%
Time elapsed: 4.05 min
Epoch: 012/100 | Batch 000/192 | Cost: 0.4129
Epoch: 012/100 | Batch 120/192 | Cost: 0.3839
Epoch: 012/100 Train Acc.: 87.13% | Validation Acc.: 82.60%
Time elapsed: 4.42 min
Epoch: 013/100 | Batch 000/192 | Cost: 0.3117
Epoch: 013/100 | Batch 120/192 | Cost: 0.3525
Epoch: 013/100 Train Acc.: 87.16% | Validation Acc.: 83.50%
Time elapsed: 4.78 min
Epoch: 014/100 | Batch 000/192 | Cost: 0.3939
Epoch: 014/100 | Batch 120/192 | Cost: 0.3900
Epoch: 014/100 Train Acc.: 87.78% | Validation Acc.: 83.30%
Time elapsed: 5.15 min
Epoch: 015/100 | Batch 000/192 | Cost: 0.4223
Epoch: 015/100 | Batch 120/192 | Cost: 0.3745
Epoch: 015/100 Train Acc.: 88.49% | Validation Acc.: 82.40%
Time elapsed: 5.52 min
Epoch: 016/100 | Batch 000/192 | Cost: 0.3464
Epoch: 016/100 | Batch 120/192 | Cost: 0.3434
Epoch: 016/100 Train Acc.: 88.83% | Validation Acc.: 83.10%
Time elapsed: 5.88 min
Epoch: 017/100 | Batch 000/192 | Cost: 0.2876
Epoch: 017/100 | Batch 120/192 | Cost: 0.2826
Epoch: 017/100 Train Acc.: 89.34% | Validation Acc.: 82.40%
Time elapsed: 6.25 min
Epoch: 018/100 | Batch 000/192 | Cost: 0.3779
Epoch: 018/100 | Batch 120/192 | Cost: 0.2662
Epoch: 018/100 Train Acc.: 90.05% | Validation Acc.: 84.50%
Time elapsed: 6.62 min
Epoch: 019/100 | Batch 000/192 | Cost: 0.3824
Epoch: 019/100 | Batch 120/192 | Cost: 0.2750
Epoch: 019/100 Train Acc.: 90.35% | Validation Acc.: 82.60%
Time elapsed: 6.99 min
Epoch: 020/100 | Batch 000/192 | Cost: 0.2361
Epoch: 020/100 | Batch 120/192 | Cost: 0.2459
Epoch: 020/100 Train Acc.: 91.09% | Validation Acc.: 83.90%
Time elapsed: 7.35 min
Epoch: 021/100 | Batch 000/192 | Cost: 0.2592
Epoch: 021/100 | Batch 120/192 | Cost: 0.2218
Epoch: 021/100 Train Acc.: 91.12% | Validation Acc.: 82.40%
Time elapsed: 7.72 min
Epoch: 022/100 | Batch 000/192 | Cost: 0.2464
Epoch: 022/100 | Batch 120/192 | Cost: 0.2699
Epoch: 022/100 Train Acc.: 91.39% | Validation Acc.: 84.20%
Time elapsed: 8.14 min
Epoch: 023/100 | Batch 000/192 | Cost: 0.1852
Epoch: 023/100 | Batch 120/192 | Cost: 0.2371
Epoch: 023/100 Train Acc.: 91.83% | Validation Acc.: 85.00%
Time elapsed: 8.92 min
Epoch: 024/100 | Batch 000/192 | Cost: 0.2384
Epoch: 024/100 | Batch 120/192 | Cost: 0.2285
Epoch: 024/100 Train Acc.: 91.90% | Validation Acc.: 83.90%
Time elapsed: 9.48 min
Epoch: 025/100 | Batch 000/192 | Cost: 0.1705
Epoch: 025/100 | Batch 120/192 | Cost: 0.2497
Epoch: 025/100 Train Acc.: 92.28% | Validation Acc.: 85.40%
Time elapsed: 9.99 min
Epoch: 026/100 | Batch 000/192 | Cost: 0.2336
Epoch: 026/100 | Batch 120/192 | Cost: 0.2631
Epoch: 026/100 Train Acc.: 93.21% | Validation Acc.: 85.80%
Time elapsed: 10.36 min
Epoch: 027/100 | Batch 000/192 | Cost: 0.1927
Epoch: 027/100 | Batch 120/192 | Cost: 0.1936
Epoch: 027/100 Train Acc.: 93.37% | Validation Acc.: 84.90%
Time elapsed: 10.85 min
Epoch: 028/100 | Batch 000/192 | Cost: 0.1647
Epoch: 028/100 | Batch 120/192 | Cost: 0.1183
Epoch: 028/100 Train Acc.: 93.40% | Validation Acc.: 84.50%
Time elapsed: 11.21 min
Epoch: 029/100 | Batch 000/192 | Cost: 0.1562
Epoch: 029/100 | Batch 120/192 | Cost: 0.1956
Epoch: 029/100 Train Acc.: 93.58% | Validation Acc.: 84.40%
Time elapsed: 11.58 min
Epoch: 030/100 | Batch 000/192 | Cost: 0.1309
Epoch: 030/100 | Batch 120/192 | Cost: 0.2334
Epoch: 030/100 Train Acc.: 93.98% | Validation Acc.: 86.40%
Time elapsed: 11.95 min
Epoch: 031/100 | Batch 000/192 | Cost: 0.1280
Epoch: 031/100 | Batch 120/192 | Cost: 0.1637
Epoch: 031/100 Train Acc.: 94.12% | Validation Acc.: 85.80%
Time elapsed: 12.31 min
Epoch: 032/100 | Batch 000/192 | Cost: 0.1509
Epoch: 032/100 | Batch 120/192 | Cost: 0.2148
Epoch: 032/100 Train Acc.: 93.88% | Validation Acc.: 84.20%
Time elapsed: 12.68 min
Epoch: 033/100 | Batch 000/192 | Cost: 0.1845
Epoch: 033/100 | Batch 120/192 | Cost: 0.1109
Epoch: 033/100 Train Acc.: 94.30% | Validation Acc.: 83.90%
Time elapsed: 13.05 min
Epoch: 034/100 | Batch 000/192 | Cost: 0.1668
Epoch: 034/100 | Batch 120/192 | Cost: 0.1756
Epoch: 034/100 Train Acc.: 94.75% | Validation Acc.: 83.80%
Time elapsed: 13.42 min
Epoch: 035/100 | Batch 000/192 | Cost: 0.1348
Epoch: 035/100 | Batch 120/192 | Cost: 0.1297
Epoch: 035/100 Train Acc.: 94.46% | Validation Acc.: 85.10%
Time elapsed: 13.79 min
Epoch: 036/100 | Batch 000/192 | Cost: 0.1827
Epoch: 036/100 | Batch 120/192 | Cost: 0.2066
Epoch: 036/100 Train Acc.: 95.07% | Validation Acc.: 83.90%
Time elapsed: 14.15 min
Epoch: 037/100 | Batch 000/192 | Cost: 0.1531
Epoch: 037/100 | Batch 120/192 | Cost: 0.1473
Epoch: 037/100 Train Acc.: 95.05% | Validation Acc.: 85.80%
Time elapsed: 14.52 min
Epoch: 038/100 | Batch 000/192 | Cost: 0.0932
Epoch: 038/100 | Batch 120/192 | Cost: 0.1691
Epoch: 038/100 Train Acc.: 95.08% | Validation Acc.: 84.40%
Time elapsed: 14.89 min
Epoch: 039/100 | Batch 000/192 | Cost: 0.1114
Epoch: 039/100 | Batch 120/192 | Cost: 0.2167
Epoch: 039/100 Train Acc.: 95.74% | Validation Acc.: 86.20%
Time elapsed: 15.25 min
Epoch: 040/100 | Batch 000/192 | Cost: 0.1177
Epoch: 040/100 | Batch 120/192 | Cost: 0.1948
Epoch: 040/100 Train Acc.: 95.52% | Validation Acc.: 85.60%
Time elapsed: 15.62 min
Epoch: 041/100 | Batch 000/192 | Cost: 0.1345
Epoch: 041/100 | Batch 120/192 | Cost: 0.1891
Epoch: 041/100 Train Acc.: 96.02% | Validation Acc.: 85.60%
Time elapsed: 15.99 min
Epoch: 042/100 | Batch 000/192 | Cost: 0.1558
Epoch: 042/100 | Batch 120/192 | Cost: 0.1588
Epoch: 042/100 Train Acc.: 95.50% | Validation Acc.: 85.60%
Time elapsed: 16.35 min
Epoch: 043/100 | Batch 000/192 | Cost: 0.0832
Epoch: 043/100 | Batch 120/192 | Cost: 0.1639
Epoch: 043/100 Train Acc.: 96.13% | Validation Acc.: 83.70%
Time elapsed: 16.72 min
Epoch: 044/100 | Batch 000/192 | Cost: 0.0768
Epoch: 044/100 | Batch 120/192 | Cost: 0.1157
Epoch: 044/100 Train Acc.: 95.69% | Validation Acc.: 85.80%
Time elapsed: 17.09 min
Epoch: 045/100 | Batch 000/192 | Cost: 0.1428
Epoch: 045/100 | Batch 120/192 | Cost: 0.1093
Epoch: 045/100 Train Acc.: 95.85% | Validation Acc.: 84.70%
Time elapsed: 17.45 min
Epoch: 046/100 | Batch 000/192 | Cost: 0.1009
Epoch: 046/100 | Batch 120/192 | Cost: 0.1148
Epoch: 046/100 Train Acc.: 96.11% | Validation Acc.: 82.90%
Time elapsed: 17.82 min
Epoch: 047/100 | Batch 000/192 | Cost: 0.1023
Epoch: 047/100 | Batch 120/192 | Cost: 0.1426
Epoch: 047/100 Train Acc.: 96.00% | Validation Acc.: 84.50%
Time elapsed: 18.19 min
Epoch: 048/100 | Batch 000/192 | Cost: 0.1000
Epoch: 048/100 | Batch 120/192 | Cost: 0.1366
Epoch: 048/100 Train Acc.: 96.49% | Validation Acc.: 85.20%
Time elapsed: 18.56 min
Epoch: 049/100 | Batch 000/192 | Cost: 0.0983
Epoch: 049/100 | Batch 120/192 | Cost: 0.1003
Epoch: 049/100 Train Acc.: 96.57% | Validation Acc.: 85.10%
Time elapsed: 18.93 min
Epoch: 050/100 | Batch 000/192 | Cost: 0.0748
Epoch: 050/100 | Batch 120/192 | Cost: 0.1001
Epoch: 050/100 Train Acc.: 96.27% | Validation Acc.: 85.30%
Time elapsed: 19.29 min
Epoch: 051/100 | Batch 000/192 | Cost: 0.1418
Epoch: 051/100 | Batch 120/192 | Cost: 0.0902
Epoch: 051/100 Train Acc.: 96.55% | Validation Acc.: 85.70%
Time elapsed: 19.66 min
Epoch: 052/100 | Batch 000/192 | Cost: 0.0924
Epoch: 052/100 | Batch 120/192 | Cost: 0.1003
Epoch: 052/100 Train Acc.: 96.74% | Validation Acc.: 86.00%
Time elapsed: 20.03 min
Epoch: 053/100 | Batch 000/192 | Cost: 0.1101
Epoch: 053/100 | Batch 120/192 | Cost: 0.1555
Epoch: 053/100 Train Acc.: 96.44% | Validation Acc.: 84.90%
Time elapsed: 20.39 min
Epoch: 054/100 | Batch 000/192 | Cost: 0.0853
Epoch: 054/100 | Batch 120/192 | Cost: 0.0984
Epoch: 054/100 Train Acc.: 96.78% | Validation Acc.: 85.10%
Time elapsed: 20.76 min
Epoch: 055/100 | Batch 000/192 | Cost: 0.0503
Epoch: 055/100 | Batch 120/192 | Cost: 0.0870
Epoch: 055/100 Train Acc.: 96.91% | Validation Acc.: 84.80%
Time elapsed: 21.13 min
Epoch: 056/100 | Batch 000/192 | Cost: 0.0659
Epoch: 056/100 | Batch 120/192 | Cost: 0.0849
Epoch: 056/100 Train Acc.: 96.95% | Validation Acc.: 86.60%
Time elapsed: 21.50 min
Epoch: 057/100 | Batch 000/192 | Cost: 0.1177
Epoch: 057/100 | Batch 120/192 | Cost: 0.1281
Epoch: 057/100 Train Acc.: 97.02% | Validation Acc.: 86.70%
Time elapsed: 21.87 min
Epoch: 058/100 | Batch 000/192 | Cost: 0.0996
Epoch: 058/100 | Batch 120/192 | Cost: 0.1410
Epoch: 058/100 Train Acc.: 96.55% | Validation Acc.: 85.00%
Time elapsed: 22.24 min
Epoch: 059/100 | Batch 000/192 | Cost: 0.0621
Epoch: 059/100 | Batch 120/192 | Cost: 0.0648
Epoch: 059/100 Train Acc.: 97.04% | Validation Acc.: 85.50%
Time elapsed: 22.61 min
Epoch: 060/100 | Batch 000/192 | Cost: 0.0626
Epoch: 060/100 | Batch 120/192 | Cost: 0.0791
Epoch: 060/100 Train Acc.: 96.42% | Validation Acc.: 84.30%
Time elapsed: 22.98 min
Epoch: 061/100 | Batch 000/192 | Cost: 0.1322
Epoch: 061/100 | Batch 120/192 | Cost: 0.0991
Epoch: 061/100 Train Acc.: 97.13% | Validation Acc.: 85.80%
Time elapsed: 23.35 min
Epoch: 062/100 | Batch 000/192 | Cost: 0.0598
Epoch: 062/100 | Batch 120/192 | Cost: 0.1386
Epoch: 062/100 Train Acc.: 97.04% | Validation Acc.: 84.30%
Time elapsed: 23.71 min
Epoch: 063/100 | Batch 000/192 | Cost: 0.0402
Epoch: 063/100 | Batch 120/192 | Cost: 0.1163
Epoch: 063/100 Train Acc.: 97.16% | Validation Acc.: 84.80%
Time elapsed: 24.19 min
Epoch: 064/100 | Batch 000/192 | Cost: 0.0672
Epoch: 064/100 | Batch 120/192 | Cost: 0.0687
Epoch: 064/100 Train Acc.: 97.28% | Validation Acc.: 85.20%
Time elapsed: 24.70 min
Epoch: 065/100 | Batch 000/192 | Cost: 0.0783
Epoch: 065/100 | Batch 120/192 | Cost: 0.1035
Epoch: 065/100 Train Acc.: 97.17% | Validation Acc.: 85.70%
Time elapsed: 25.46 min
Epoch: 066/100 | Batch 000/192 | Cost: 0.0331
Epoch: 066/100 | Batch 120/192 | Cost: 0.0829
Epoch: 066/100 Train Acc.: 97.63% | Validation Acc.: 86.80%
Time elapsed: 26.24 min
Epoch: 067/100 | Batch 000/192 | Cost: 0.0836
Epoch: 067/100 | Batch 120/192 | Cost: 0.0810
Epoch: 067/100 Train Acc.: 97.38% | Validation Acc.: 84.20%
Time elapsed: 27.03 min
Epoch: 068/100 | Batch 000/192 | Cost: 0.0746
Epoch: 068/100 | Batch 120/192 | Cost: 0.1084
Epoch: 068/100 Train Acc.: 97.64% | Validation Acc.: 85.60%
Time elapsed: 27.79 min
Epoch: 069/100 | Batch 000/192 | Cost: 0.0548
Epoch: 069/100 | Batch 120/192 | Cost: 0.0487
Epoch: 069/100 Train Acc.: 97.65% | Validation Acc.: 86.00%
Time elapsed: 28.57 min
Epoch: 070/100 | Batch 000/192 | Cost: 0.0811
Epoch: 070/100 | Batch 120/192 | Cost: 0.0865
Epoch: 070/100 Train Acc.: 97.45% | Validation Acc.: 86.60%
Time elapsed: 29.34 min
Epoch: 071/100 | Batch 000/192 | Cost: 0.0757
Epoch: 071/100 | Batch 120/192 | Cost: 0.1505
Epoch: 071/100 Train Acc.: 97.52% | Validation Acc.: 84.50%
Time elapsed: 30.12 min
Epoch: 072/100 | Batch 000/192 | Cost: 0.1299
Epoch: 072/100 | Batch 120/192 | Cost: 0.0503
Epoch: 072/100 Train Acc.: 97.53% | Validation Acc.: 86.70%
Time elapsed: 30.91 min
Epoch: 073/100 | Batch 000/192 | Cost: 0.0463
Epoch: 073/100 | Batch 120/192 | Cost: 0.0583
Epoch: 073/100 Train Acc.: 97.75% | Validation Acc.: 85.00%
Time elapsed: 31.67 min
Epoch: 074/100 | Batch 000/192 | Cost: 0.0454
Epoch: 074/100 | Batch 120/192 | Cost: 0.0507
Epoch: 074/100 Train Acc.: 97.64% | Validation Acc.: 86.50%
Time elapsed: 32.45 min
Epoch: 075/100 | Batch 000/192 | Cost: 0.0686
Epoch: 075/100 | Batch 120/192 | Cost: 0.0734
Epoch: 075/100 Train Acc.: 97.79% | Validation Acc.: 86.60%
Time elapsed: 33.22 min
Epoch: 076/100 | Batch 000/192 | Cost: 0.1011
Epoch: 076/100 | Batch 120/192 | Cost: 0.0856
Epoch: 076/100 Train Acc.: 97.77% | Validation Acc.: 85.90%
Time elapsed: 34.00 min
Epoch: 077/100 | Batch 000/192 | Cost: 0.0494
Epoch: 077/100 | Batch 120/192 | Cost: 0.0623
Epoch: 077/100 Train Acc.: 97.74% | Validation Acc.: 86.90%
Time elapsed: 34.78 min
Epoch: 078/100 | Batch 000/192 | Cost: 0.0519
Epoch: 078/100 | Batch 120/192 | Cost: 0.0740
Epoch: 078/100 Train Acc.: 97.52% | Validation Acc.: 86.30%
Time elapsed: 35.55 min
Epoch: 079/100 | Batch 000/192 | Cost: 0.0502
Epoch: 079/100 | Batch 120/192 | Cost: 0.0762
Epoch: 079/100 Train Acc.: 97.44% | Validation Acc.: 86.00%
Time elapsed: 36.33 min
Epoch: 080/100 | Batch 000/192 | Cost: 0.0973
Epoch: 080/100 | Batch 120/192 | Cost: 0.0414
Epoch: 080/100 Train Acc.: 98.03% | Validation Acc.: 86.70%
Time elapsed: 37.10 min
Epoch: 081/100 | Batch 000/192 | Cost: 0.0882
Epoch: 081/100 | Batch 120/192 | Cost: 0.1327
Epoch: 081/100 Train Acc.: 97.92% | Validation Acc.: 86.20%
Time elapsed: 37.88 min
Epoch: 082/100 | Batch 000/192 | Cost: 0.0425
Epoch: 082/100 | Batch 120/192 | Cost: 0.0632
Epoch: 082/100 Train Acc.: 97.72% | Validation Acc.: 85.00%
Time elapsed: 38.66 min
Epoch: 083/100 | Batch 000/192 | Cost: 0.0676
Epoch: 083/100 | Batch 120/192 | Cost: 0.0444
Epoch: 083/100 Train Acc.: 98.06% | Validation Acc.: 87.10%
Time elapsed: 39.43 min
Epoch: 084/100 | Batch 000/192 | Cost: 0.0565
Epoch: 084/100 | Batch 120/192 | Cost: 0.0478
Epoch: 084/100 Train Acc.: 97.96% | Validation Acc.: 86.80%
Time elapsed: 40.22 min
Epoch: 085/100 | Batch 000/192 | Cost: 0.1038
Epoch: 085/100 | Batch 120/192 | Cost: 0.0502
Epoch: 085/100 Train Acc.: 98.02% | Validation Acc.: 87.20%
Time elapsed: 41.00 min
Epoch: 086/100 | Batch 000/192 | Cost: 0.1114
Epoch: 086/100 | Batch 120/192 | Cost: 0.0419
Epoch: 086/100 Train Acc.: 97.93% | Validation Acc.: 86.10%
Time elapsed: 41.77 min
Epoch: 087/100 | Batch 000/192 | Cost: 0.0485
Epoch: 087/100 | Batch 120/192 | Cost: 0.0526
Epoch: 087/100 Train Acc.: 97.99% | Validation Acc.: 87.00%
Time elapsed: 42.56 min
Epoch: 088/100 | Batch 000/192 | Cost: 0.0429
Epoch: 088/100 | Batch 120/192 | Cost: 0.0542
Epoch: 088/100 Train Acc.: 97.95% | Validation Acc.: 87.10%
Time elapsed: 43.34 min
Epoch: 089/100 | Batch 000/192 | Cost: 0.0533
Epoch: 089/100 | Batch 120/192 | Cost: 0.0241
Epoch: 089/100 Train Acc.: 98.05% | Validation Acc.: 86.60%
Time elapsed: 44.13 min
Epoch: 090/100 | Batch 000/192 | Cost: 0.0738
Epoch: 090/100 | Batch 120/192 | Cost: 0.0324
Epoch: 090/100 Train Acc.: 97.87% | Validation Acc.: 86.10%
Time elapsed: 44.91 min
Epoch: 091/100 | Batch 000/192 | Cost: 0.0778
Epoch: 091/100 | Batch 120/192 | Cost: 0.0754
Epoch: 091/100 Train Acc.: 98.22% | Validation Acc.: 86.40%
Time elapsed: 45.68 min
Epoch: 092/100 | Batch 000/192 | Cost: 0.0695
Epoch: 092/100 | Batch 120/192 | Cost: 0.0946
Epoch: 092/100 Train Acc.: 97.94% | Validation Acc.: 86.40%
Time elapsed: 46.47 min
Epoch: 093/100 | Batch 000/192 | Cost: 0.0322
Epoch: 093/100 | Batch 120/192 | Cost: 0.0522
Epoch: 093/100 Train Acc.: 98.28% | Validation Acc.: 86.40%
Time elapsed: 47.26 min
Epoch: 094/100 | Batch 000/192 | Cost: 0.0442
Epoch: 094/100 | Batch 120/192 | Cost: 0.0545
Epoch: 094/100 Train Acc.: 98.22% | Validation Acc.: 86.70%
Time elapsed: 48.04 min
Epoch: 095/100 | Batch 000/192 | Cost: 0.0355
Epoch: 095/100 | Batch 120/192 | Cost: 0.0459
Epoch: 095/100 Train Acc.: 98.13% | Validation Acc.: 87.40%
Time elapsed: 48.84 min
Epoch: 096/100 | Batch 000/192 | Cost: 0.0448
Epoch: 096/100 | Batch 120/192 | Cost: 0.0468
Epoch: 096/100 Train Acc.: 98.19% | Validation Acc.: 85.90%
Time elapsed: 49.60 min
Epoch: 097/100 | Batch 000/192 | Cost: 0.0175
Epoch: 097/100 | Batch 120/192 | Cost: 0.0409
Epoch: 097/100 Train Acc.: 98.17% | Validation Acc.: 87.10%
Time elapsed: 50.39 min
Epoch: 098/100 | Batch 000/192 | Cost: 0.0374
Epoch: 098/100 | Batch 120/192 | Cost: 0.0465
Epoch: 098/100 Train Acc.: 98.27% | Validation Acc.: 86.20%
Time elapsed: 51.15 min
Epoch: 099/100 | Batch 000/192 | Cost: 0.0628
Epoch: 099/100 | Batch 120/192 | Cost: 0.0555
Epoch: 099/100 Train Acc.: 98.00% | Validation Acc.: 85.90%
Time elapsed: 51.94 min
Epoch: 100/100 | Batch 000/192 | Cost: 0.0570
Epoch: 100/100 | Batch 120/192 | Cost: 0.0494
Epoch: 100/100 Train Acc.: 98.31% | Validation Acc.: 85.80%
Time elapsed: 52.72 min
Total Training Time: 52.72 min

BatchNorm after Activation

In [11]:
##########################
### MODEL
##########################


class NiN(nn.Module):
    def __init__(self, num_classes):
        super(NiN, self).__init__()
        self.num_classes = num_classes
        self.classifier = nn.Sequential(
                nn.Conv2d(3, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.Conv2d(192, 160, kernel_size=1, stride=1, padding=0, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(160),
                nn.Conv2d(160,  96, kernel_size=1, stride=1, padding=0, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(96),
                nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(96, 192, kernel_size=5, stride=1, padding=2, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.AvgPool2d(kernel_size=3, stride=2, padding=1),
                nn.Dropout(0.5),

                nn.Conv2d(192, 192, kernel_size=3, stride=1, padding=1, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.Conv2d(192, 192, kernel_size=1, stride=1, padding=0, bias=False),
                nn.ReLU(inplace=True),
                nn.BatchNorm2d(192),
                nn.Conv2d(192,  10, kernel_size=1, stride=1, padding=0),
                nn.ReLU(inplace=True),
                nn.AvgPool2d(kernel_size=8, stride=1, padding=0),

                )

    def forward(self, x):
        x = self.classifier(x)
        logits = x.view(x.size(0), self.num_classes)
        probas = torch.softmax(logits, dim=1)
        return logits, probas
In [12]:
torch.manual_seed(RANDOM_SEED)

model = NiN(NUM_CLASSES)
model.to(DEVICE)

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  
In [13]:
start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print (f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/192 | Cost: 2.3059
Epoch: 001/100 | Batch 120/192 | Cost: 1.0759
Epoch: 001/100 Train Acc.: 64.08% | Validation Acc.: 64.80%
Time elapsed: 0.77 min
Epoch: 002/100 | Batch 000/192 | Cost: 1.1736
Epoch: 002/100 | Batch 120/192 | Cost: 0.8403
Epoch: 002/100 Train Acc.: 72.13% | Validation Acc.: 69.60%
Time elapsed: 1.55 min
Epoch: 003/100 | Batch 000/192 | Cost: 0.7607
Epoch: 003/100 | Batch 120/192 | Cost: 0.7570
Epoch: 003/100 Train Acc.: 76.62% | Validation Acc.: 73.90%
Time elapsed: 2.32 min
Epoch: 004/100 | Batch 000/192 | Cost: 0.6554
Epoch: 004/100 | Batch 120/192 | Cost: 0.6539
Epoch: 004/100 Train Acc.: 78.93% | Validation Acc.: 76.70%
Time elapsed: 3.10 min
Epoch: 005/100 | Batch 000/192 | Cost: 0.5906
Epoch: 005/100 | Batch 120/192 | Cost: 0.7284
Epoch: 005/100 Train Acc.: 81.88% | Validation Acc.: 79.70%
Time elapsed: 3.87 min
Epoch: 006/100 | Batch 000/192 | Cost: 0.5847
Epoch: 006/100 | Batch 120/192 | Cost: 0.5115
Epoch: 006/100 Train Acc.: 83.57% | Validation Acc.: 79.90%
Time elapsed: 4.65 min
Epoch: 007/100 | Batch 000/192 | Cost: 0.5185
Epoch: 007/100 | Batch 120/192 | Cost: 0.4879
Epoch: 007/100 Train Acc.: 84.50% | Validation Acc.: 80.30%
Time elapsed: 5.42 min
Epoch: 008/100 | Batch 000/192 | Cost: 0.4134
Epoch: 008/100 | Batch 120/192 | Cost: 0.4843
Epoch: 008/100 Train Acc.: 85.61% | Validation Acc.: 80.80%
Time elapsed: 6.19 min
Epoch: 009/100 | Batch 000/192 | Cost: 0.3521
Epoch: 009/100 | Batch 120/192 | Cost: 0.5180
Epoch: 009/100 Train Acc.: 87.21% | Validation Acc.: 80.00%
Time elapsed: 6.96 min
Epoch: 010/100 | Batch 000/192 | Cost: 0.4342
Epoch: 010/100 | Batch 120/192 | Cost: 0.4116
Epoch: 010/100 Train Acc.: 87.58% | Validation Acc.: 80.20%
Time elapsed: 7.74 min
Epoch: 011/100 | Batch 000/192 | Cost: 0.4375
Epoch: 011/100 | Batch 120/192 | Cost: 0.4573
Epoch: 011/100 Train Acc.: 88.85% | Validation Acc.: 82.40%
Time elapsed: 8.50 min
Epoch: 012/100 | Batch 000/192 | Cost: 0.3115
Epoch: 012/100 | Batch 120/192 | Cost: 0.3661
Epoch: 012/100 Train Acc.: 89.30% | Validation Acc.: 81.80%
Time elapsed: 9.27 min
Epoch: 013/100 | Batch 000/192 | Cost: 0.2318
Epoch: 013/100 | Batch 120/192 | Cost: 0.2555
Epoch: 013/100 Train Acc.: 89.73% | Validation Acc.: 81.90%
Time elapsed: 10.05 min
Epoch: 014/100 | Batch 000/192 | Cost: 0.3029
Epoch: 014/100 | Batch 120/192 | Cost: 0.3206
Epoch: 014/100 Train Acc.: 90.71% | Validation Acc.: 84.40%
Time elapsed: 10.81 min
Epoch: 015/100 | Batch 000/192 | Cost: 0.3103
Epoch: 015/100 | Batch 120/192 | Cost: 0.3303
Epoch: 015/100 Train Acc.: 91.45% | Validation Acc.: 81.90%
Time elapsed: 11.59 min
Epoch: 016/100 | Batch 000/192 | Cost: 0.3105
Epoch: 016/100 | Batch 120/192 | Cost: 0.2497
Epoch: 016/100 Train Acc.: 91.92% | Validation Acc.: 82.60%
Time elapsed: 12.36 min
Epoch: 017/100 | Batch 000/192 | Cost: 0.1741
Epoch: 017/100 | Batch 120/192 | Cost: 0.2539
Epoch: 017/100 Train Acc.: 92.74% | Validation Acc.: 83.10%
Time elapsed: 13.13 min
Epoch: 018/100 | Batch 000/192 | Cost: 0.2569
Epoch: 018/100 | Batch 120/192 | Cost: 0.2318
Epoch: 018/100 Train Acc.: 93.14% | Validation Acc.: 83.60%
Time elapsed: 13.91 min
Epoch: 019/100 | Batch 000/192 | Cost: 0.2926
Epoch: 019/100 | Batch 120/192 | Cost: 0.1889
Epoch: 019/100 Train Acc.: 92.98% | Validation Acc.: 84.50%
Time elapsed: 14.67 min
Epoch: 020/100 | Batch 000/192 | Cost: 0.1761
Epoch: 020/100 | Batch 120/192 | Cost: 0.1828
Epoch: 020/100 Train Acc.: 93.68% | Validation Acc.: 83.10%
Time elapsed: 15.44 min
Epoch: 021/100 | Batch 000/192 | Cost: 0.1238
Epoch: 021/100 | Batch 120/192 | Cost: 0.1776
Epoch: 021/100 Train Acc.: 93.89% | Validation Acc.: 84.90%
Time elapsed: 16.20 min
Epoch: 022/100 | Batch 000/192 | Cost: 0.2031
Epoch: 022/100 | Batch 120/192 | Cost: 0.1599
Epoch: 022/100 Train Acc.: 93.94% | Validation Acc.: 84.50%
Time elapsed: 16.97 min
Epoch: 023/100 | Batch 000/192 | Cost: 0.1342
Epoch: 023/100 | Batch 120/192 | Cost: 0.1964
Epoch: 023/100 Train Acc.: 94.68% | Validation Acc.: 84.70%
Time elapsed: 17.74 min
Epoch: 024/100 | Batch 000/192 | Cost: 0.1671
Epoch: 024/100 | Batch 120/192 | Cost: 0.1648
Epoch: 024/100 Train Acc.: 94.51% | Validation Acc.: 85.20%
Time elapsed: 18.50 min
Epoch: 025/100 | Batch 000/192 | Cost: 0.1436
Epoch: 025/100 | Batch 120/192 | Cost: 0.1684
Epoch: 025/100 Train Acc.: 94.94% | Validation Acc.: 84.90%
Time elapsed: 19.27 min
Epoch: 026/100 | Batch 000/192 | Cost: 0.1587
Epoch: 026/100 | Batch 120/192 | Cost: 0.1912
Epoch: 026/100 Train Acc.: 95.05% | Validation Acc.: 83.30%
Time elapsed: 20.04 min
Epoch: 027/100 | Batch 000/192 | Cost: 0.1599
Epoch: 027/100 | Batch 120/192 | Cost: 0.1704
Epoch: 027/100 Train Acc.: 95.52% | Validation Acc.: 83.70%
Time elapsed: 20.81 min
Epoch: 028/100 | Batch 000/192 | Cost: 0.1275
Epoch: 028/100 | Batch 120/192 | Cost: 0.1232
Epoch: 028/100 Train Acc.: 95.63% | Validation Acc.: 84.70%
Time elapsed: 21.60 min
Epoch: 029/100 | Batch 000/192 | Cost: 0.1452
Epoch: 029/100 | Batch 120/192 | Cost: 0.1621
Epoch: 029/100 Train Acc.: 95.83% | Validation Acc.: 84.00%
Time elapsed: 22.36 min
Epoch: 030/100 | Batch 000/192 | Cost: 0.0822
Epoch: 030/100 | Batch 120/192 | Cost: 0.1508
Epoch: 030/100 Train Acc.: 95.72% | Validation Acc.: 85.00%
Time elapsed: 23.14 min
Epoch: 031/100 | Batch 000/192 | Cost: 0.1148
Epoch: 031/100 | Batch 120/192 | Cost: 0.0952
Epoch: 031/100 Train Acc.: 95.70% | Validation Acc.: 84.50%
Time elapsed: 23.92 min
Epoch: 032/100 | Batch 000/192 | Cost: 0.1098
Epoch: 032/100 | Batch 120/192 | Cost: 0.1265
Epoch: 032/100 Train Acc.: 95.58% | Validation Acc.: 84.50%
Time elapsed: 24.69 min
Epoch: 033/100 | Batch 000/192 | Cost: 0.0968
Epoch: 033/100 | Batch 120/192 | Cost: 0.1536
Epoch: 033/100 Train Acc.: 96.37% | Validation Acc.: 85.80%
Time elapsed: 25.47 min
Epoch: 034/100 | Batch 000/192 | Cost: 0.1380
Epoch: 034/100 | Batch 120/192 | Cost: 0.1361
Epoch: 034/100 Train Acc.: 96.40% | Validation Acc.: 84.50%
Time elapsed: 26.24 min
Epoch: 035/100 | Batch 000/192 | Cost: 0.1400
Epoch: 035/100 | Batch 120/192 | Cost: 0.1103
Epoch: 035/100 Train Acc.: 96.68% | Validation Acc.: 86.40%
Time elapsed: 27.02 min
Epoch: 036/100 | Batch 000/192 | Cost: 0.0920
Epoch: 036/100 | Batch 120/192 | Cost: 0.1332
Epoch: 036/100 Train Acc.: 96.61% | Validation Acc.: 84.30%
Time elapsed: 27.78 min
Epoch: 037/100 | Batch 000/192 | Cost: 0.0682
Epoch: 037/100 | Batch 120/192 | Cost: 0.1231
Epoch: 037/100 Train Acc.: 96.53% | Validation Acc.: 84.80%
Time elapsed: 28.55 min
Epoch: 038/100 | Batch 000/192 | Cost: 0.1042
Epoch: 038/100 | Batch 120/192 | Cost: 0.1283
Epoch: 038/100 Train Acc.: 96.54% | Validation Acc.: 84.70%
Time elapsed: 29.33 min
Epoch: 039/100 | Batch 000/192 | Cost: 0.1099
Epoch: 039/100 | Batch 120/192 | Cost: 0.0976
Epoch: 039/100 Train Acc.: 97.05% | Validation Acc.: 84.80%
Time elapsed: 30.09 min
Epoch: 040/100 | Batch 000/192 | Cost: 0.0670
Epoch: 040/100 | Batch 120/192 | Cost: 0.1400
Epoch: 040/100 Train Acc.: 96.85% | Validation Acc.: 84.90%
Time elapsed: 30.87 min
Epoch: 041/100 | Batch 000/192 | Cost: 0.1038
Epoch: 041/100 | Batch 120/192 | Cost: 0.1502
Epoch: 041/100 Train Acc.: 97.14% | Validation Acc.: 83.80%
Time elapsed: 31.64 min
Epoch: 042/100 | Batch 000/192 | Cost: 0.0742
Epoch: 042/100 | Batch 120/192 | Cost: 0.1515
Epoch: 042/100 Train Acc.: 97.21% | Validation Acc.: 86.00%
Time elapsed: 32.41 min
Epoch: 043/100 | Batch 000/192 | Cost: 0.1119
Epoch: 043/100 | Batch 120/192 | Cost: 0.1353
Epoch: 043/100 Train Acc.: 97.21% | Validation Acc.: 84.70%
Time elapsed: 33.19 min
Epoch: 044/100 | Batch 000/192 | Cost: 0.0806
Epoch: 044/100 | Batch 120/192 | Cost: 0.0663
Epoch: 044/100 Train Acc.: 97.22% | Validation Acc.: 85.50%
Time elapsed: 33.96 min
Epoch: 045/100 | Batch 000/192 | Cost: 0.0712
Epoch: 045/100 | Batch 120/192 | Cost: 0.0965
Epoch: 045/100 Train Acc.: 97.40% | Validation Acc.: 85.40%
Time elapsed: 34.73 min
Epoch: 046/100 | Batch 000/192 | Cost: 0.0878
Epoch: 046/100 | Batch 120/192 | Cost: 0.0740
Epoch: 046/100 Train Acc.: 97.51% | Validation Acc.: 84.40%
Time elapsed: 35.51 min
Epoch: 047/100 | Batch 000/192 | Cost: 0.1174
Epoch: 047/100 | Batch 120/192 | Cost: 0.0488
Epoch: 047/100 Train Acc.: 97.63% | Validation Acc.: 84.30%
Time elapsed: 36.28 min
Epoch: 048/100 | Batch 000/192 | Cost: 0.0605
Epoch: 048/100 | Batch 120/192 | Cost: 0.1052
Epoch: 048/100 Train Acc.: 97.45% | Validation Acc.: 84.70%
Time elapsed: 37.06 min
Epoch: 049/100 | Batch 000/192 | Cost: 0.0446
Epoch: 049/100 | Batch 120/192 | Cost: 0.0897
Epoch: 049/100 Train Acc.: 97.74% | Validation Acc.: 85.30%
Time elapsed: 37.82 min
Epoch: 050/100 | Batch 000/192 | Cost: 0.0623
Epoch: 050/100 | Batch 120/192 | Cost: 0.0904
Epoch: 050/100 Train Acc.: 97.39% | Validation Acc.: 83.80%
Time elapsed: 38.60 min
Epoch: 051/100 | Batch 000/192 | Cost: 0.0641
Epoch: 051/100 | Batch 120/192 | Cost: 0.0890
Epoch: 051/100 Train Acc.: 97.44% | Validation Acc.: 85.60%
Time elapsed: 39.38 min
Epoch: 052/100 | Batch 000/192 | Cost: 0.0482
Epoch: 052/100 | Batch 120/192 | Cost: 0.0669
Epoch: 052/100 Train Acc.: 97.49% | Validation Acc.: 85.40%
Time elapsed: 40.14 min
Epoch: 053/100 | Batch 000/192 | Cost: 0.0710
Epoch: 053/100 | Batch 120/192 | Cost: 0.1376
Epoch: 053/100 Train Acc.: 97.81% | Validation Acc.: 85.70%
Time elapsed: 40.91 min
Epoch: 054/100 | Batch 000/192 | Cost: 0.0518
Epoch: 054/100 | Batch 120/192 | Cost: 0.0818
Epoch: 054/100 Train Acc.: 97.23% | Validation Acc.: 83.10%
Time elapsed: 41.68 min
Epoch: 055/100 | Batch 000/192 | Cost: 0.0913
Epoch: 055/100 | Batch 120/192 | Cost: 0.1024
Epoch: 055/100 Train Acc.: 97.34% | Validation Acc.: 84.50%
Time elapsed: 42.45 min
Epoch: 056/100 | Batch 000/192 | Cost: 0.0641
Epoch: 056/100 | Batch 120/192 | Cost: 0.1011
Epoch: 056/100 Train Acc.: 97.61% | Validation Acc.: 84.50%
Time elapsed: 43.23 min
Epoch: 057/100 | Batch 000/192 | Cost: 0.0562
Epoch: 057/100 | Batch 120/192 | Cost: 0.0859
Epoch: 057/100 Train Acc.: 98.03% | Validation Acc.: 84.30%
Time elapsed: 44.00 min
Epoch: 058/100 | Batch 000/192 | Cost: 0.0774
Epoch: 058/100 | Batch 120/192 | Cost: 0.0956
Epoch: 058/100 Train Acc.: 97.79% | Validation Acc.: 84.80%
Time elapsed: 44.77 min
Epoch: 059/100 | Batch 000/192 | Cost: 0.0640
Epoch: 059/100 | Batch 120/192 | Cost: 0.0551
Epoch: 059/100 Train Acc.: 97.86% | Validation Acc.: 84.80%
Time elapsed: 45.54 min
Epoch: 060/100 | Batch 000/192 | Cost: 0.0810
Epoch: 060/100 | Batch 120/192 | Cost: 0.0322
Epoch: 060/100 Train Acc.: 97.87% | Validation Acc.: 84.50%
Time elapsed: 46.32 min
Epoch: 061/100 | Batch 000/192 | Cost: 0.0813
Epoch: 061/100 | Batch 120/192 | Cost: 0.0924
Epoch: 061/100 Train Acc.: 97.86% | Validation Acc.: 84.30%
Time elapsed: 47.10 min
Epoch: 062/100 | Batch 000/192 | Cost: 0.0727
Epoch: 062/100 | Batch 120/192 | Cost: 0.0776
Epoch: 062/100 Train Acc.: 97.73% | Validation Acc.: 84.60%
Time elapsed: 47.86 min
Epoch: 063/100 | Batch 000/192 | Cost: 0.0436
Epoch: 063/100 | Batch 120/192 | Cost: 0.0313
Epoch: 063/100 Train Acc.: 98.00% | Validation Acc.: 86.40%
Time elapsed: 48.63 min
Epoch: 064/100 | Batch 000/192 | Cost: 0.0491
Epoch: 064/100 | Batch 120/192 | Cost: 0.0530
Epoch: 064/100 Train Acc.: 98.26% | Validation Acc.: 85.40%
Time elapsed: 49.40 min
Epoch: 065/100 | Batch 000/192 | Cost: 0.0721
Epoch: 065/100 | Batch 120/192 | Cost: 0.0621
Epoch: 065/100 Train Acc.: 97.99% | Validation Acc.: 85.20%
Time elapsed: 50.17 min
Epoch: 066/100 | Batch 000/192 | Cost: 0.0697
Epoch: 066/100 | Batch 120/192 | Cost: 0.0426
Epoch: 066/100 Train Acc.: 98.02% | Validation Acc.: 84.80%
Time elapsed: 50.96 min
Epoch: 067/100 | Batch 000/192 | Cost: 0.0613
Epoch: 067/100 | Batch 120/192 | Cost: 0.0714
Epoch: 067/100 Train Acc.: 97.90% | Validation Acc.: 84.00%
Time elapsed: 51.72 min
Epoch: 068/100 | Batch 000/192 | Cost: 0.0676
Epoch: 068/100 | Batch 120/192 | Cost: 0.0286
Epoch: 068/100 Train Acc.: 98.15% | Validation Acc.: 84.10%
Time elapsed: 52.49 min
Epoch: 069/100 | Batch 000/192 | Cost: 0.0482
Epoch: 069/100 | Batch 120/192 | Cost: 0.0609
Epoch: 069/100 Train Acc.: 97.92% | Validation Acc.: 83.80%
Time elapsed: 53.25 min
Epoch: 070/100 | Batch 000/192 | Cost: 0.0462
Epoch: 070/100 | Batch 120/192 | Cost: 0.0434
Epoch: 070/100 Train Acc.: 98.15% | Validation Acc.: 84.90%
Time elapsed: 54.02 min
Epoch: 071/100 | Batch 000/192 | Cost: 0.0306
Epoch: 071/100 | Batch 120/192 | Cost: 0.1153
Epoch: 071/100 Train Acc.: 98.24% | Validation Acc.: 86.20%
Time elapsed: 54.80 min
Epoch: 072/100 | Batch 000/192 | Cost: 0.0465
Epoch: 072/100 | Batch 120/192 | Cost: 0.0603
Epoch: 072/100 Train Acc.: 98.17% | Validation Acc.: 85.70%
Time elapsed: 55.57 min
Epoch: 073/100 | Batch 000/192 | Cost: 0.0943
Epoch: 073/100 | Batch 120/192 | Cost: 0.0509
Epoch: 073/100 Train Acc.: 98.30% | Validation Acc.: 84.70%
Time elapsed: 56.35 min
Epoch: 074/100 | Batch 000/192 | Cost: 0.0651
Epoch: 074/100 | Batch 120/192 | Cost: 0.0559
Epoch: 074/100 Train Acc.: 98.24% | Validation Acc.: 86.00%
Time elapsed: 57.12 min
Epoch: 075/100 | Batch 000/192 | Cost: 0.0400
Epoch: 075/100 | Batch 120/192 | Cost: 0.0258
Epoch: 075/100 Train Acc.: 98.37% | Validation Acc.: 85.30%
Time elapsed: 57.90 min
Epoch: 076/100 | Batch 000/192 | Cost: 0.0398
Epoch: 076/100 | Batch 120/192 | Cost: 0.0495
Epoch: 076/100 Train Acc.: 98.30% | Validation Acc.: 86.00%
Time elapsed: 58.68 min
Epoch: 077/100 | Batch 000/192 | Cost: 0.0373
Epoch: 077/100 | Batch 120/192 | Cost: 0.0597
Epoch: 077/100 Train Acc.: 98.31% | Validation Acc.: 84.90%
Time elapsed: 59.44 min
Epoch: 078/100 | Batch 000/192 | Cost: 0.0468
Epoch: 078/100 | Batch 120/192 | Cost: 0.0494
Epoch: 078/100 Train Acc.: 98.31% | Validation Acc.: 85.60%
Time elapsed: 60.22 min
Epoch: 079/100 | Batch 000/192 | Cost: 0.0481
Epoch: 079/100 | Batch 120/192 | Cost: 0.0493
Epoch: 079/100 Train Acc.: 98.44% | Validation Acc.: 85.10%
Time elapsed: 60.99 min
Epoch: 080/100 | Batch 000/192 | Cost: 0.0282
Epoch: 080/100 | Batch 120/192 | Cost: 0.0537
Epoch: 080/100 Train Acc.: 98.48% | Validation Acc.: 86.80%
Time elapsed: 61.75 min
Epoch: 081/100 | Batch 000/192 | Cost: 0.0496
Epoch: 081/100 | Batch 120/192 | Cost: 0.0403
Epoch: 081/100 Train Acc.: 98.14% | Validation Acc.: 86.40%
Time elapsed: 62.52 min
Epoch: 082/100 | Batch 000/192 | Cost: 0.1032
Epoch: 082/100 | Batch 120/192 | Cost: 0.0374
Epoch: 082/100 Train Acc.: 98.17% | Validation Acc.: 86.00%
Time elapsed: 63.28 min
Epoch: 083/100 | Batch 000/192 | Cost: 0.0847
Epoch: 083/100 | Batch 120/192 | Cost: 0.0557
Epoch: 083/100 Train Acc.: 98.59% | Validation Acc.: 86.30%
Time elapsed: 64.04 min
Epoch: 084/100 | Batch 000/192 | Cost: 0.0786
Epoch: 084/100 | Batch 120/192 | Cost: 0.0694
Epoch: 084/100 Train Acc.: 98.49% | Validation Acc.: 83.90%
Time elapsed: 64.81 min
Epoch: 085/100 | Batch 000/192 | Cost: 0.0483
Epoch: 085/100 | Batch 120/192 | Cost: 0.0588
Epoch: 085/100 Train Acc.: 98.14% | Validation Acc.: 85.20%
Time elapsed: 65.58 min
Epoch: 086/100 | Batch 000/192 | Cost: 0.0279
Epoch: 086/100 | Batch 120/192 | Cost: 0.0710
Epoch: 086/100 Train Acc.: 98.48% | Validation Acc.: 86.60%
Time elapsed: 66.35 min
Epoch: 087/100 | Batch 000/192 | Cost: 0.0264
Epoch: 087/100 | Batch 120/192 | Cost: 0.0266
Epoch: 087/100 Train Acc.: 98.54% | Validation Acc.: 85.20%
Time elapsed: 67.11 min
Epoch: 088/100 | Batch 000/192 | Cost: 0.0273
Epoch: 088/100 | Batch 120/192 | Cost: 0.0402
Epoch: 088/100 Train Acc.: 98.46% | Validation Acc.: 85.50%
Time elapsed: 67.89 min
Epoch: 089/100 | Batch 000/192 | Cost: 0.0601
Epoch: 089/100 | Batch 120/192 | Cost: 0.0424
Epoch: 089/100 Train Acc.: 98.42% | Validation Acc.: 86.00%
Time elapsed: 68.67 min
Epoch: 090/100 | Batch 000/192 | Cost: 0.0147
Epoch: 090/100 | Batch 120/192 | Cost: 0.0417
Epoch: 090/100 Train Acc.: 98.60% | Validation Acc.: 85.50%
Time elapsed: 69.44 min
Epoch: 091/100 | Batch 000/192 | Cost: 0.0825
Epoch: 091/100 | Batch 120/192 | Cost: 0.1113
Epoch: 091/100 Train Acc.: 98.61% | Validation Acc.: 86.00%
Time elapsed: 70.22 min
Epoch: 092/100 | Batch 000/192 | Cost: 0.0482
Epoch: 092/100 | Batch 120/192 | Cost: 0.0664
Epoch: 092/100 Train Acc.: 98.61% | Validation Acc.: 85.30%
Time elapsed: 70.98 min
Epoch: 093/100 | Batch 000/192 | Cost: 0.0298
Epoch: 093/100 | Batch 120/192 | Cost: 0.0673
Epoch: 093/100 Train Acc.: 98.62% | Validation Acc.: 86.60%
Time elapsed: 71.74 min
Epoch: 094/100 | Batch 000/192 | Cost: 0.0173
Epoch: 094/100 | Batch 120/192 | Cost: 0.0699
Epoch: 094/100 Train Acc.: 98.55% | Validation Acc.: 85.40%
Time elapsed: 72.51 min
Epoch: 095/100 | Batch 000/192 | Cost: 0.0298
Epoch: 095/100 | Batch 120/192 | Cost: 0.0382
Epoch: 095/100 Train Acc.: 98.61% | Validation Acc.: 87.00%
Time elapsed: 73.27 min
Epoch: 096/100 | Batch 000/192 | Cost: 0.0715
Epoch: 096/100 | Batch 120/192 | Cost: 0.0298
Epoch: 096/100 Train Acc.: 98.83% | Validation Acc.: 85.50%
Time elapsed: 74.05 min
Epoch: 097/100 | Batch 000/192 | Cost: 0.0645
Epoch: 097/100 | Batch 120/192 | Cost: 0.0374
Epoch: 097/100 Train Acc.: 98.66% | Validation Acc.: 86.30%
Time elapsed: 74.81 min
Epoch: 098/100 | Batch 000/192 | Cost: 0.0257
Epoch: 098/100 | Batch 120/192 | Cost: 0.0492
Epoch: 098/100 Train Acc.: 98.62% | Validation Acc.: 87.20%
Time elapsed: 75.59 min
Epoch: 099/100 | Batch 000/192 | Cost: 0.0785
Epoch: 099/100 | Batch 120/192 | Cost: 0.0587
Epoch: 099/100 Train Acc.: 98.55% | Validation Acc.: 85.80%
Time elapsed: 76.37 min
Epoch: 100/100 | Batch 000/192 | Cost: 0.0470
Epoch: 100/100 | Batch 120/192 | Cost: 0.0452
Epoch: 100/100 Train Acc.: 98.75% | Validation Acc.: 86.20%
Time elapsed: 77.12 min
Total Training Time: 77.12 min
In [14]:
%watermark -iv
numpy       1.17.4
torchvision 0.4.1a0+d94043a
matplotlib  3.1.0
torch       1.3.0
PIL.Image   6.2.1
pandas      0.24.2