Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.4.0
  • Runs on CPU or GPU (if available)

VGG16 Convolutional Neural Network for Kaggle's Cats and Dogs Images

Implementation of the VGG-16 [1] architecture for training a dogs vs cats classifier.

References

  • [1] Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

Imports

In [2]:
import time
import os

import numpy as np

import torch
import torch.nn.functional as F
import torch.nn as nn


from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision import transforms

from PIL import Image
import matplotlib.pyplot as plt


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True
In [3]:
%matplotlib inline

Settings

In [4]:
##########################
### SETTINGS
##########################

# Device
DEVICE = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.001
NUM_EPOCHS = 100
BATCH_SIZE = 128

# Architecture
NUM_CLASSES = 2

Cats vs Dogs Dataset

Download the Kaggle Cats and Dogs Dataset from https://www.kaggle.com/c/dogs-vs-cats/data by clicking on the "Download All" link.

Then, unzip the dataset.

The dataset folder consists of two subfolders, train and test1, which contain the training and test images in jpg format, respectively. Note that the test set examples are unlabeled.

import os

num_train_cats = len([i for i in os.listdir(os.path.join('dogs-vs-cats', 'train')) 
                      if i.endswith('.jpg') and i.startswith('cat')])

num_train_dogs = len([i for i in os.listdir(os.path.join('dogs-vs-cats', 'train')) 
                      if i.endswith('.jpg') and i.startswith('dog')])

print(f'Training set cats: {num_train_cats}')
print(f'Training set dogs: {num_train_dogs}')

The naming scheme within each of these subfolders is <class>.<imagenumber>.jpg.
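
For example, the class label can be derived directly from a filename (a minimal sketch; the filename below is a hypothetical example):

fname = 'cat.59.jpg'  # hypothetical example
label_str, img_num, _ = fname.split('.')
print(label_str, int(img_num))  # -> cat 59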

In [5]:
img = Image.open(os.path.join('dogs-vs-cats','train', 'cat.59.jpg'))
print(np.asarray(img, dtype=np.uint8).shape)
plt.imshow(img);
(331, 464, 3)

Creating Validation and Test Subsets

  • Move 2500 images from the training folder into a test set folder
  • Move 2500 images from the training folder into a validation set folder
In [6]:
if not os.path.exists(os.path.join('dogs-vs-cats', 'test')):
    os.mkdir(os.path.join('dogs-vs-cats', 'test'))

if not os.path.exists(os.path.join('dogs-vs-cats', 'valid')):
    os.mkdir(os.path.join('dogs-vs-cats', 'valid'))
In [7]:
for fname in os.listdir(os.path.join('dogs-vs-cats', 'train')):
    if not fname.endswith('.jpg'):
        continue
    _, img_num, _ = fname.split('.')
    filepath = os.path.join('dogs-vs-cats', 'train', fname)
    img_num = int(img_num)
    if img_num > 11249:
        os.rename(filepath, filepath.replace('train', 'test'))
    elif img_num > 9999:
        os.rename(filepath, filepath.replace('train', 'valid'))
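
As a quick sanity check (not part of the original notebook), we can verify the resulting split sizes; we expect 20000 training, 2500 validation, and 2500 test images:

for subdir in ('train', 'valid', 'test'):
    num_jpg = len([f for f in os.listdir(os.path.join('dogs-vs-cats', subdir))
                   if f.endswith('.jpg')])
    print(f'{subdir}: {num_jpg} images')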

Standardizing Images

Computing the mean and standard deviation for normalizing the images via z-score normalization. For details, see the related notebook ./cnn-standardized.ipynb.
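
Z-score normalization rescales each channel c as x' = (x - mean_c) / std_c, which is what transforms.Normalize applies later using the channel statistics computed below. A minimal illustration on a toy tensor:

x = torch.rand(3, 4, 4)                       # toy CHW image
mean = x.mean(dim=(1, 2), keepdim=True)       # per-channel mean
std = x.std(dim=(1, 2), keepdim=True)         # per-channel std
z = (x - mean) / std                          # z-score normalization
print(z.mean(dim=(1, 2)), z.std(dim=(1, 2)))  # approx. 0 and 1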

In [8]:
class CatsDogsDataset(Dataset):
    """Custom Dataset for loading CelebA face images"""

    def __init__(self, img_dir, transform=None):
    
        self.img_dir = img_dir
        
        self.img_names = [i for i in 
                          os.listdir(img_dir) 
                          if i.endswith('.jpg')]
        
        self.y = []
        for i in self.img_names:
            if i.split('.')[0] == 'cat':
                self.y.append(0)
            else:
                self.y.append(1)
        
        self.transform = transform

    def __getitem__(self, index):
        img = Image.open(os.path.join(self.img_dir,
                                      self.img_names[index]))
        
        if self.transform is not None:
            img = self.transform(img)
        
        label = self.y[index]
        return img, label

    def __len__(self):
        return len(self.y)

    

custom_transform1 = transforms.Compose([transforms.Resize([64, 64]),
                                        transforms.ToTensor()])

train_dataset = CatsDogsDataset(img_dir=os.path.join('dogs-vs-cats', 'train'), 
                                transform=custom_transform1)

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=5000, 
                          shuffle=False)

train_mean = []
train_std = []

for images, _ in train_loader:
    numpy_image = images.numpy()
    
    batch_mean = np.mean(numpy_image, axis=(0, 2, 3))
    batch_std = np.std(numpy_image, axis=(0, 2, 3))
    
    train_mean.append(batch_mean)
    train_std.append(batch_std)

train_mean = torch.tensor(np.mean(train_mean, axis=0))
train_std = torch.tensor(np.mean(train_std, axis=0))

print('Mean:', train_mean)
print('Std Dev:', train_std)
Mean: tensor([0.4875, 0.4544, 0.4164])
Std Dev: tensor([0.2521, 0.2453, 0.2481])
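
(Note that averaging the per-batch standard deviations is only an approximation of the dataset-level standard deviation; since all four batches here have equal size, the mean is exact, and the std approximation is close enough for normalization purposes.)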

Dataloaders

In [9]:
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(5),
        transforms.RandomHorizontalFlip(),
        transforms.RandomResizedCrop(64, scale=(0.96, 1.0), ratio=(0.95, 1.05)),
        transforms.ToTensor(),
        transforms.Normalize(train_mean, train_std)
    ]),
    'valid': transforms.Compose([
        transforms.Resize([64, 64]),
        transforms.ToTensor(),
        transforms.Normalize(train_mean, train_std)
    ]),
}


train_dataset = CatsDogsDataset(img_dir=os.path.join('dogs-vs-cats', 'train'), 
                                transform=data_transforms['train'])

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE,
                          drop_last=True,
                          shuffle=True)

valid_dataset = CatsDogsDataset(img_dir=os.path.join('dogs-vs-cats', 'valid'), 
                                transform=data_transforms['valid'])

valid_loader = DataLoader(dataset=valid_dataset, 
                          batch_size=BATCH_SIZE, 
                          shuffle=False)

test_dataset = CatsDogsDataset(img_dir=os.path.join('dogs-vs-cats', 'test'), 
                               transform=data_transforms['valid'])

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE, 
                         shuffle=False)

Model

In [10]:
##########################
### MODEL
##########################


class VGG16(torch.nn.Module):

    def __init__(self, num_classes):
        super(VGG16, self).__init__()
        
        # calculate same padding:
        # (w - k + 2*p)/s + 1 = o
        # => p = (s(o-1) - w + k)/2
        
        self.block_1 = nn.Sequential(
                nn.Conv2d(in_channels=3,
                          out_channels=64,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          # (1(64-1) - 64 + 3)/2 = 1
                          padding=1), 
                nn.ReLU(),
                nn.Conv2d(in_channels=64,
                          out_channels=64,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2),
                             stride=(2, 2))
        )
        
        self.block_2 = nn.Sequential(
                nn.Conv2d(in_channels=64,
                          out_channels=128,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=128,
                          out_channels=128,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2),
                             stride=(2, 2))
        )
        
        self.block_3 = nn.Sequential(        
                nn.Conv2d(in_channels=128,
                          out_channels=256,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=256,
                          out_channels=256,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),        
                nn.Conv2d(in_channels=256,
                          out_channels=256,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=256,
                          out_channels=256,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2),
                             stride=(2, 2))
        )
        
          
        self.block_4 = nn.Sequential(   
                nn.Conv2d(in_channels=256,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),        
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),        
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),   
                nn.MaxPool2d(kernel_size=(2, 2),
                             stride=(2, 2))
        )
        
        self.block_5 = nn.Sequential(
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),            
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),            
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),
                nn.Conv2d(in_channels=512,
                          out_channels=512,
                          kernel_size=(3, 3),
                          stride=(1, 1),
                          padding=1),
                nn.ReLU(),   
                nn.MaxPool2d(kernel_size=(2, 2),
                             stride=(2, 2))             
        )
        
        self.classifier = nn.Sequential(
                nn.Linear(512*2*2, 4096),
                nn.ReLU(),   
                nn.Linear(4096, 4096),
                nn.ReLU(),
                nn.Linear(4096, num_classes)
        )
            
        
        for m in self.modules():
            if isinstance(m, torch.nn.Conv2d):
                #n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                #m.weight.data.normal_(0, np.sqrt(2. / n))
                m.weight.detach().normal_(0, 0.05)
                if m.bias is not None:
                    m.bias.detach().zero_()
            elif isinstance(m, torch.nn.Linear):
                m.weight.detach().normal_(0, 0.05)
                m.bias.detach().zero_()
        
        
    def forward(self, x):

        x = self.block_1(x)
        x = self.block_2(x)
        x = self.block_3(x)
        x = self.block_4(x)
        x = self.block_5(x)

        logits = self.classifier(x.view(-1, 512*2*2))
        probas = F.softmax(logits, dim=1)

        return logits, probas
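
As a quick shape check (an illustrative sketch, not part of the original notebook): each of the five max-pooling layers halves the spatial size, 64 → 32 → 16 → 8 → 4 → 2, so the classifier receives 512*2*2 = 2048 input features:

with torch.no_grad():
    tmp_model = VGG16(num_classes=2)
    tmp_x = torch.zeros(1, 3, 64, 64)
    for block in (tmp_model.block_1, tmp_model.block_2, tmp_model.block_3,
                  tmp_model.block_4, tmp_model.block_5):
        tmp_x = block(tmp_x)
    print(tmp_x.shape)  # torch.Size([1, 512, 2, 2])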
In [11]:
torch.manual_seed(RANDOM_SEED)
model = VGG16(num_classes=NUM_CLASSES)

model = model.to(DEVICE)

optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

Training

In [12]:
def compute_accuracy_and_loss(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    cross_entropy = 0.
    for i, (features, targets) in enumerate(data_loader):
            
        features = features.to(device)
        targets = targets.to(device)

        logits, probas = model(features)
        # sum (rather than average) the per-batch losses so that dividing by
        # num_examples below yields the mean per-example loss
        cross_entropy += F.cross_entropy(logits, targets, reduction='sum').item()
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100, cross_entropy/num_examples
    

start_time = time.time()
train_acc_lst, valid_acc_lst = [], []
train_loss_lst, valid_loss_lst = [], []

for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    model.eval()
    with torch.set_grad_enabled(False):
        train_acc, train_loss = compute_accuracy_and_loss(model, train_loader, device=DEVICE)
        valid_acc, valid_loss = compute_accuracy_and_loss(model, valid_loader, device=DEVICE)
        train_acc_lst.append(train_acc)
        valid_acc_lst.append(valid_acc)
        train_loss_lst.append(train_loss)
        valid_loss_lst.append(valid_loss)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/100 | Batch 000/156 | Cost: 1136.9125
Epoch: 001/100 | Batch 120/156 | Cost: 0.6327
Epoch: 001/100 Train Acc.: 63.35% | Validation Acc.: 62.12%
Time elapsed: 3.09 min
Epoch: 002/100 | Batch 000/156 | Cost: 0.6675
Epoch: 002/100 | Batch 120/156 | Cost: 0.6640
Epoch: 002/100 Train Acc.: 66.05% | Validation Acc.: 66.32%
Time elapsed: 6.15 min
Epoch: 003/100 | Batch 000/156 | Cost: 0.6137
Epoch: 003/100 | Batch 120/156 | Cost: 0.6311
Epoch: 003/100 Train Acc.: 65.82% | Validation Acc.: 63.76%
Time elapsed: 9.21 min
Epoch: 004/100 | Batch 000/156 | Cost: 0.5993
Epoch: 004/100 | Batch 120/156 | Cost: 0.5832
Epoch: 004/100 Train Acc.: 66.75% | Validation Acc.: 64.52%
Time elapsed: 12.27 min
Epoch: 005/100 | Batch 000/156 | Cost: 0.5918
Epoch: 005/100 | Batch 120/156 | Cost: 0.5747
Epoch: 005/100 Train Acc.: 68.29% | Validation Acc.: 67.00%
Time elapsed: 15.33 min
Epoch: 006/100 | Batch 000/156 | Cost: 0.5726
Epoch: 006/100 | Batch 120/156 | Cost: 0.5917
Epoch: 006/100 Train Acc.: 68.77% | Validation Acc.: 68.20%
Time elapsed: 18.39 min
Epoch: 007/100 | Batch 000/156 | Cost: 0.6484
Epoch: 007/100 | Batch 120/156 | Cost: 0.6746
Epoch: 007/100 Train Acc.: 69.85% | Validation Acc.: 67.20%
Time elapsed: 21.45 min
Epoch: 008/100 | Batch 000/156 | Cost: 0.6265
Epoch: 008/100 | Batch 120/156 | Cost: 0.5789
Epoch: 008/100 Train Acc.: 69.85% | Validation Acc.: 68.16%
Time elapsed: 24.52 min
Epoch: 009/100 | Batch 000/156 | Cost: 0.6074
Epoch: 009/100 | Batch 120/156 | Cost: 0.5174
Epoch: 009/100 Train Acc.: 71.99% | Validation Acc.: 70.56%
Time elapsed: 27.58 min
Epoch: 010/100 | Batch 000/156 | Cost: 0.5643
Epoch: 010/100 | Batch 120/156 | Cost: 0.5742
Epoch: 010/100 Train Acc.: 71.81% | Validation Acc.: 70.36%
Time elapsed: 30.64 min
Epoch: 011/100 | Batch 000/156 | Cost: 0.5232
Epoch: 011/100 | Batch 120/156 | Cost: 0.5742
Epoch: 011/100 Train Acc.: 73.53% | Validation Acc.: 73.76%
Time elapsed: 33.70 min
Epoch: 012/100 | Batch 000/156 | Cost: 0.4835
Epoch: 012/100 | Batch 120/156 | Cost: 0.5982
Epoch: 012/100 Train Acc.: 73.04% | Validation Acc.: 74.16%
Time elapsed: 36.76 min
Epoch: 013/100 | Batch 000/156 | Cost: 0.5121
Epoch: 013/100 | Batch 120/156 | Cost: 0.4783
Epoch: 013/100 Train Acc.: 74.17% | Validation Acc.: 74.60%
Time elapsed: 39.83 min
Epoch: 014/100 | Batch 000/156 | Cost: 0.4739
Epoch: 014/100 | Batch 120/156 | Cost: 0.4306
Epoch: 014/100 Train Acc.: 75.16% | Validation Acc.: 75.80%
Time elapsed: 42.90 min
Epoch: 015/100 | Batch 000/156 | Cost: 0.4841
Epoch: 015/100 | Batch 120/156 | Cost: 0.7340
Epoch: 015/100 Train Acc.: 75.57% | Validation Acc.: 75.16%
Time elapsed: 45.97 min
Epoch: 016/100 | Batch 000/156 | Cost: 0.5221
Epoch: 016/100 | Batch 120/156 | Cost: 0.4965
Epoch: 016/100 Train Acc.: 74.54% | Validation Acc.: 74.76%
Time elapsed: 49.03 min
Epoch: 017/100 | Batch 000/156 | Cost: 0.5824
Epoch: 017/100 | Batch 120/156 | Cost: 0.4898
Epoch: 017/100 Train Acc.: 69.71% | Validation Acc.: 70.20%
Time elapsed: 52.10 min
Epoch: 018/100 | Batch 000/156 | Cost: 0.5425
Epoch: 018/100 | Batch 120/156 | Cost: 0.4716
Epoch: 018/100 Train Acc.: 73.13% | Validation Acc.: 72.80%
Time elapsed: 55.17 min
Epoch: 019/100 | Batch 000/156 | Cost: 0.4678
Epoch: 019/100 | Batch 120/156 | Cost: 0.4395
Epoch: 019/100 Train Acc.: 77.46% | Validation Acc.: 75.64%
Time elapsed: 58.23 min
Epoch: 020/100 | Batch 000/156 | Cost: 0.4710
Epoch: 020/100 | Batch 120/156 | Cost: 0.4640
Epoch: 020/100 Train Acc.: 76.76% | Validation Acc.: 77.48%
Time elapsed: 61.28 min
Epoch: 021/100 | Batch 000/156 | Cost: 0.5479
Epoch: 021/100 | Batch 120/156 | Cost: 0.6039
Epoch: 021/100 Train Acc.: 78.17% | Validation Acc.: 75.80%
Time elapsed: 64.34 min
Epoch: 022/100 | Batch 000/156 | Cost: 0.4577
Epoch: 022/100 | Batch 120/156 | Cost: 0.4807
Epoch: 022/100 Train Acc.: 79.43% | Validation Acc.: 77.52%
Time elapsed: 67.40 min
Epoch: 023/100 | Batch 000/156 | Cost: 0.4451
Epoch: 023/100 | Batch 120/156 | Cost: 0.4683
Epoch: 023/100 Train Acc.: 78.66% | Validation Acc.: 75.24%
Time elapsed: 70.45 min
Epoch: 024/100 | Batch 000/156 | Cost: 0.4079
Epoch: 024/100 | Batch 120/156 | Cost: 0.4880
Epoch: 024/100 Train Acc.: 80.48% | Validation Acc.: 77.16%
Time elapsed: 73.51 min
Epoch: 025/100 | Batch 000/156 | Cost: 0.3866
Epoch: 025/100 | Batch 120/156 | Cost: 0.4633
Epoch: 025/100 Train Acc.: 81.28% | Validation Acc.: 78.04%
Time elapsed: 76.57 min
Epoch: 026/100 | Batch 000/156 | Cost: 0.4431
Epoch: 026/100 | Batch 120/156 | Cost: 0.4293
Epoch: 026/100 Train Acc.: 79.13% | Validation Acc.: 77.64%
Time elapsed: 79.63 min
Epoch: 027/100 | Batch 000/156 | Cost: 0.4641
Epoch: 027/100 | Batch 120/156 | Cost: 0.3868
Epoch: 027/100 Train Acc.: 81.39% | Validation Acc.: 78.60%
Time elapsed: 82.68 min
Epoch: 028/100 | Batch 000/156 | Cost: 0.3541
Epoch: 028/100 | Batch 120/156 | Cost: 0.4513
Epoch: 028/100 Train Acc.: 76.33% | Validation Acc.: 76.36%
Time elapsed: 85.73 min
Epoch: 029/100 | Batch 000/156 | Cost: 0.4095
Epoch: 029/100 | Batch 120/156 | Cost: 0.4936
Epoch: 029/100 Train Acc.: 82.04% | Validation Acc.: 78.84%
Time elapsed: 88.79 min
Epoch: 030/100 | Batch 000/156 | Cost: 0.3746
Epoch: 030/100 | Batch 120/156 | Cost: 0.3750
Epoch: 030/100 Train Acc.: 82.40% | Validation Acc.: 80.48%
Time elapsed: 91.85 min
Epoch: 031/100 | Batch 000/156 | Cost: 0.3374
Epoch: 031/100 | Batch 120/156 | Cost: 0.3232
Epoch: 031/100 Train Acc.: 83.65% | Validation Acc.: 80.04%
Time elapsed: 94.90 min
Epoch: 032/100 | Batch 000/156 | Cost: 0.3078
Epoch: 032/100 | Batch 120/156 | Cost: 0.3910
Epoch: 032/100 Train Acc.: 82.14% | Validation Acc.: 79.56%
Time elapsed: 97.96 min
Epoch: 033/100 | Batch 000/156 | Cost: 0.3478
Epoch: 033/100 | Batch 120/156 | Cost: 0.3938
Epoch: 033/100 Train Acc.: 84.48% | Validation Acc.: 81.40%
Time elapsed: 101.01 min
Epoch: 034/100 | Batch 000/156 | Cost: 0.4252
Epoch: 034/100 | Batch 120/156 | Cost: 0.3418
Epoch: 034/100 Train Acc.: 81.82% | Validation Acc.: 77.80%
Time elapsed: 104.06 min
Epoch: 035/100 | Batch 000/156 | Cost: 0.4092
Epoch: 035/100 | Batch 120/156 | Cost: 0.4198
Epoch: 035/100 Train Acc.: 83.34% | Validation Acc.: 81.12%
Time elapsed: 107.12 min
Epoch: 036/100 | Batch 000/156 | Cost: 0.3229
Epoch: 036/100 | Batch 120/156 | Cost: 0.3771
Epoch: 036/100 Train Acc.: 84.56% | Validation Acc.: 81.68%
Time elapsed: 110.18 min
Epoch: 037/100 | Batch 000/156 | Cost: 0.3015
Epoch: 037/100 | Batch 120/156 | Cost: 0.3045
Epoch: 037/100 Train Acc.: 85.79% | Validation Acc.: 81.68%
Time elapsed: 113.24 min
Epoch: 038/100 | Batch 000/156 | Cost: 0.3926
Epoch: 038/100 | Batch 120/156 | Cost: 0.3266
Epoch: 038/100 Train Acc.: 86.85% | Validation Acc.: 82.00%
Time elapsed: 116.30 min
Epoch: 039/100 | Batch 000/156 | Cost: 0.2918
Epoch: 039/100 | Batch 120/156 | Cost: 0.3017
Epoch: 039/100 Train Acc.: 86.57% | Validation Acc.: 83.56%
Time elapsed: 119.36 min
Epoch: 040/100 | Batch 000/156 | Cost: 0.4098
Epoch: 040/100 | Batch 120/156 | Cost: 0.3845
Epoch: 040/100 Train Acc.: 85.87% | Validation Acc.: 81.20%
Time elapsed: 122.42 min
Epoch: 041/100 | Batch 000/156 | Cost: 0.3328
Epoch: 041/100 | Batch 120/156 | Cost: 0.3668
Epoch: 041/100 Train Acc.: 85.02% | Validation Acc.: 80.48%
Time elapsed: 125.48 min
Epoch: 042/100 | Batch 000/156 | Cost: 0.3551
Epoch: 042/100 | Batch 120/156 | Cost: 0.4659
Epoch: 042/100 Train Acc.: 87.75% | Validation Acc.: 84.08%
Time elapsed: 128.54 min
Epoch: 043/100 | Batch 000/156 | Cost: 0.2675
Epoch: 043/100 | Batch 120/156 | Cost: 0.2103
Epoch: 043/100 Train Acc.: 87.38% | Validation Acc.: 83.60%
Time elapsed: 131.60 min
Epoch: 044/100 | Batch 000/156 | Cost: 0.3109
Epoch: 044/100 | Batch 120/156 | Cost: 0.3216
Epoch: 044/100 Train Acc.: 88.45% | Validation Acc.: 84.08%
Time elapsed: 134.65 min
Epoch: 045/100 | Batch 000/156 | Cost: 0.2631
Epoch: 045/100 | Batch 120/156 | Cost: 0.2968
Epoch: 045/100 Train Acc.: 86.28% | Validation Acc.: 82.36%
Time elapsed: 137.71 min
Epoch: 046/100 | Batch 000/156 | Cost: 0.3362
Epoch: 046/100 | Batch 120/156 | Cost: 0.2712
Epoch: 046/100 Train Acc.: 87.68% | Validation Acc.: 82.48%
Time elapsed: 140.77 min
Epoch: 047/100 | Batch 000/156 | Cost: 0.2369
Epoch: 047/100 | Batch 120/156 | Cost: 0.2744
Epoch: 047/100 Train Acc.: 89.02% | Validation Acc.: 84.20%
Time elapsed: 143.85 min
Epoch: 048/100 | Batch 000/156 | Cost: 0.2706
Epoch: 048/100 | Batch 120/156 | Cost: 0.1998
Epoch: 048/100 Train Acc.: 83.65% | Validation Acc.: 81.20%
Time elapsed: 146.91 min
Epoch: 049/100 | Batch 000/156 | Cost: 0.3213
Epoch: 049/100 | Batch 120/156 | Cost: 0.2467
Epoch: 049/100 Train Acc.: 89.80% | Validation Acc.: 84.20%
Time elapsed: 149.99 min
Epoch: 050/100 | Batch 000/156 | Cost: 0.2512
Epoch: 050/100 | Batch 120/156 | Cost: 0.1869
Epoch: 050/100 Train Acc.: 88.70% | Validation Acc.: 82.48%
Time elapsed: 153.05 min
Epoch: 051/100 | Batch 000/156 | Cost: 0.2588
Epoch: 051/100 | Batch 120/156 | Cost: 0.1962
Epoch: 051/100 Train Acc.: 90.07% | Validation Acc.: 84.40%
Time elapsed: 156.12 min
Epoch: 052/100 | Batch 000/156 | Cost: 0.2380
Epoch: 052/100 | Batch 120/156 | Cost: 0.2832
Epoch: 052/100 Train Acc.: 89.90% | Validation Acc.: 84.56%
Time elapsed: 159.19 min
Epoch: 053/100 | Batch 000/156 | Cost: 0.2694
Epoch: 053/100 | Batch 120/156 | Cost: 0.2922
Epoch: 053/100 Train Acc.: 90.83% | Validation Acc.: 85.16%
Time elapsed: 162.26 min
Epoch: 054/100 | Batch 000/156 | Cost: 0.2735
Epoch: 054/100 | Batch 120/156 | Cost: 0.1956
Epoch: 054/100 Train Acc.: 91.43% | Validation Acc.: 84.68%
Time elapsed: 165.33 min
Epoch: 055/100 | Batch 000/156 | Cost: 0.1598
Epoch: 055/100 | Batch 120/156 | Cost: 0.2557
Epoch: 055/100 Train Acc.: 89.95% | Validation Acc.: 84.88%
Time elapsed: 168.40 min
Epoch: 056/100 | Batch 000/156 | Cost: 0.2259
Epoch: 056/100 | Batch 120/156 | Cost: 0.2470
Epoch: 056/100 Train Acc.: 88.45% | Validation Acc.: 84.16%
Time elapsed: 171.47 min
Epoch: 057/100 | Batch 000/156 | Cost: 0.2026
Epoch: 057/100 | Batch 120/156 | Cost: 0.2183
Epoch: 057/100 Train Acc.: 91.53% | Validation Acc.: 85.20%
Time elapsed: 174.54 min
Epoch: 058/100 | Batch 000/156 | Cost: 0.3215
Epoch: 058/100 | Batch 120/156 | Cost: 0.1479
Epoch: 058/100 Train Acc.: 90.59% | Validation Acc.: 85.08%
Time elapsed: 177.61 min
Epoch: 059/100 | Batch 000/156 | Cost: 0.1667
Epoch: 059/100 | Batch 120/156 | Cost: 0.2463
Epoch: 059/100 Train Acc.: 91.15% | Validation Acc.: 85.00%
Time elapsed: 180.68 min
Epoch: 060/100 | Batch 000/156 | Cost: 0.1737
Epoch: 060/100 | Batch 120/156 | Cost: 0.1730
Epoch: 060/100 Train Acc.: 91.24% | Validation Acc.: 85.80%
Time elapsed: 183.75 min
Epoch: 061/100 | Batch 000/156 | Cost: 0.2478
Epoch: 061/100 | Batch 120/156 | Cost: 0.2276
Epoch: 061/100 Train Acc.: 92.57% | Validation Acc.: 86.16%
Time elapsed: 186.82 min
Epoch: 062/100 | Batch 000/156 | Cost: 0.1615
Epoch: 062/100 | Batch 120/156 | Cost: 0.1513
Epoch: 062/100 Train Acc.: 92.05% | Validation Acc.: 85.84%
Time elapsed: 189.89 min
Epoch: 063/100 | Batch 000/156 | Cost: 0.2236
Epoch: 063/100 | Batch 120/156 | Cost: 0.2676
Epoch: 063/100 Train Acc.: 92.67% | Validation Acc.: 85.88%
Time elapsed: 192.96 min
Epoch: 064/100 | Batch 000/156 | Cost: 0.1800
Epoch: 064/100 | Batch 120/156 | Cost: 0.3258
Epoch: 064/100 Train Acc.: 92.42% | Validation Acc.: 86.64%
Time elapsed: 196.04 min
Epoch: 065/100 | Batch 000/156 | Cost: 0.1528
Epoch: 065/100 | Batch 120/156 | Cost: 0.1287
Epoch: 065/100 Train Acc.: 92.90% | Validation Acc.: 85.52%
Time elapsed: 199.11 min
Epoch: 066/100 | Batch 000/156 | Cost: 0.1871
Epoch: 066/100 | Batch 120/156 | Cost: 0.1894
Epoch: 066/100 Train Acc.: 93.77% | Validation Acc.: 87.20%
Time elapsed: 202.18 min
Epoch: 067/100 | Batch 000/156 | Cost: 0.1162
Epoch: 067/100 | Batch 120/156 | Cost: 0.1656
Epoch: 067/100 Train Acc.: 93.52% | Validation Acc.: 86.16%
Time elapsed: 205.25 min
Epoch: 068/100 | Batch 000/156 | Cost: 0.1663
Epoch: 068/100 | Batch 120/156 | Cost: 0.1736
Epoch: 068/100 Train Acc.: 93.54% | Validation Acc.: 86.20%
Time elapsed: 208.31 min
Epoch: 069/100 | Batch 000/156 | Cost: 0.1370
Epoch: 069/100 | Batch 120/156 | Cost: 0.1789
Epoch: 069/100 Train Acc.: 92.53% | Validation Acc.: 84.84%
Time elapsed: 211.39 min
Epoch: 070/100 | Batch 000/156 | Cost: 0.1574
Epoch: 070/100 | Batch 120/156 | Cost: 0.1223
Epoch: 070/100 Train Acc.: 94.18% | Validation Acc.: 85.76%
Time elapsed: 214.45 min
Epoch: 071/100 | Batch 000/156 | Cost: 0.1062
Epoch: 071/100 | Batch 120/156 | Cost: 0.1763
Epoch: 071/100 Train Acc.: 94.26% | Validation Acc.: 85.88%
Time elapsed: 217.52 min
Epoch: 072/100 | Batch 000/156 | Cost: 0.0866
Epoch: 072/100 | Batch 120/156 | Cost: 0.2319
Epoch: 072/100 Train Acc.: 94.27% | Validation Acc.: 86.92%
Time elapsed: 220.59 min
Epoch: 073/100 | Batch 000/156 | Cost: 0.1095
Epoch: 073/100 | Batch 120/156 | Cost: 0.1807
Epoch: 073/100 Train Acc.: 92.42% | Validation Acc.: 84.12%
Time elapsed: 223.65 min
Epoch: 074/100 | Batch 000/156 | Cost: 0.3051
Epoch: 074/100 | Batch 120/156 | Cost: 0.1538
Epoch: 074/100 Train Acc.: 95.17% | Validation Acc.: 86.80%
Time elapsed: 226.71 min
Epoch: 075/100 | Batch 000/156 | Cost: 0.1262
Epoch: 075/100 | Batch 120/156 | Cost: 0.0711
Epoch: 075/100 Train Acc.: 93.93% | Validation Acc.: 86.24%
Time elapsed: 229.76 min
Epoch: 076/100 | Batch 000/156 | Cost: 0.0943
Epoch: 076/100 | Batch 120/156 | Cost: 0.0763
Epoch: 076/100 Train Acc.: 95.06% | Validation Acc.: 88.04%
Time elapsed: 232.82 min
Epoch: 077/100 | Batch 000/156 | Cost: 0.1028
Epoch: 077/100 | Batch 120/156 | Cost: 0.1218
Epoch: 077/100 Train Acc.: 94.57% | Validation Acc.: 87.28%
Time elapsed: 235.87 min
Epoch: 078/100 | Batch 000/156 | Cost: 0.1207
Epoch: 078/100 | Batch 120/156 | Cost: 0.0864
Epoch: 078/100 Train Acc.: 95.92% | Validation Acc.: 87.16%
Time elapsed: 238.93 min
Epoch: 079/100 | Batch 000/156 | Cost: 0.1313
Epoch: 079/100 | Batch 120/156 | Cost: 0.1842
Epoch: 079/100 Train Acc.: 94.69% | Validation Acc.: 84.92%
Time elapsed: 241.98 min
Epoch: 080/100 | Batch 000/156 | Cost: 0.1464
Epoch: 080/100 | Batch 120/156 | Cost: 0.0526
Epoch: 080/100 Train Acc.: 96.01% | Validation Acc.: 86.60%
Time elapsed: 245.03 min
Epoch: 081/100 | Batch 000/156 | Cost: 0.1308
Epoch: 081/100 | Batch 120/156 | Cost: 0.1293
Epoch: 081/100 Train Acc.: 95.16% | Validation Acc.: 86.72%
Time elapsed: 248.08 min
Epoch: 082/100 | Batch 000/156 | Cost: 0.1431
Epoch: 082/100 | Batch 120/156 | Cost: 0.1219
Epoch: 082/100 Train Acc.: 94.94% | Validation Acc.: 85.96%
Time elapsed: 251.14 min
Epoch: 083/100 | Batch 000/156 | Cost: 0.2372
Epoch: 083/100 | Batch 120/156 | Cost: 0.1343
Epoch: 083/100 Train Acc.: 95.80% | Validation Acc.: 86.80%
Time elapsed: 254.21 min
Epoch: 084/100 | Batch 000/156 | Cost: 0.2026
Epoch: 084/100 | Batch 120/156 | Cost: 0.1439
Epoch: 084/100 Train Acc.: 95.57% | Validation Acc.: 87.00%
Time elapsed: 257.27 min
Epoch: 085/100 | Batch 000/156 | Cost: 0.1188
Epoch: 085/100 | Batch 120/156 | Cost: 0.2363
Epoch: 085/100 Train Acc.: 96.18% | Validation Acc.: 85.88%
Time elapsed: 260.33 min
Epoch: 086/100 | Batch 000/156 | Cost: 0.1437
Epoch: 086/100 | Batch 120/156 | Cost: 0.2362
Epoch: 086/100 Train Acc.: 95.99% | Validation Acc.: 86.48%
Time elapsed: 263.39 min
Epoch: 087/100 | Batch 000/156 | Cost: 0.1408
Epoch: 087/100 | Batch 120/156 | Cost: 0.0658
Epoch: 087/100 Train Acc.: 95.34% | Validation Acc.: 86.72%
Time elapsed: 266.45 min
Epoch: 088/100 | Batch 000/156 | Cost: 0.1196
Epoch: 088/100 | Batch 120/156 | Cost: 0.1308
Epoch: 088/100 Train Acc.: 96.32% | Validation Acc.: 87.40%
Time elapsed: 269.51 min
Epoch: 089/100 | Batch 000/156 | Cost: 0.1674
Epoch: 089/100 | Batch 120/156 | Cost: 0.1603
Epoch: 089/100 Train Acc.: 95.83% | Validation Acc.: 86.68%
Time elapsed: 272.57 min
Epoch: 090/100 | Batch 000/156 | Cost: 0.1057
Epoch: 090/100 | Batch 120/156 | Cost: 0.1539
Epoch: 090/100 Train Acc.: 95.88% | Validation Acc.: 86.84%
Time elapsed: 275.63 min
Epoch: 091/100 | Batch 000/156 | Cost: 0.0632
Epoch: 091/100 | Batch 120/156 | Cost: 0.2300
Epoch: 091/100 Train Acc.: 95.76% | Validation Acc.: 87.48%
Time elapsed: 278.68 min
Epoch: 092/100 | Batch 000/156 | Cost: 0.1212
Epoch: 092/100 | Batch 120/156 | Cost: 0.2222
Epoch: 092/100 Train Acc.: 95.41% | Validation Acc.: 85.56%
Time elapsed: 281.73 min
Epoch: 093/100 | Batch 000/156 | Cost: 0.1231
Epoch: 093/100 | Batch 120/156 | Cost: 0.1409
Epoch: 093/100 Train Acc.: 95.46% | Validation Acc.: 84.88%
Time elapsed: 284.80 min
Epoch: 094/100 | Batch 000/156 | Cost: 0.1270
Epoch: 094/100 | Batch 120/156 | Cost: 0.1859
Epoch: 094/100 Train Acc.: 96.99% | Validation Acc.: 88.08%
Time elapsed: 287.85 min
Epoch: 095/100 | Batch 000/156 | Cost: 0.1226
Epoch: 095/100 | Batch 120/156 | Cost: 0.0627
Epoch: 095/100 Train Acc.: 95.69% | Validation Acc.: 85.32%
Time elapsed: 290.90 min
Epoch: 096/100 | Batch 000/156 | Cost: 0.1105
Epoch: 096/100 | Batch 120/156 | Cost: 0.2036
Epoch: 096/100 Train Acc.: 95.08% | Validation Acc.: 86.88%
Time elapsed: 293.96 min
Epoch: 097/100 | Batch 000/156 | Cost: 0.1400
Epoch: 097/100 | Batch 120/156 | Cost: 0.1957
Epoch: 097/100 Train Acc.: 95.10% | Validation Acc.: 85.44%
Time elapsed: 297.01 min
Epoch: 098/100 | Batch 000/156 | Cost: 0.1013
Epoch: 098/100 | Batch 120/156 | Cost: 0.0999
Epoch: 098/100 Train Acc.: 97.33% | Validation Acc.: 87.24%
Time elapsed: 300.07 min
Epoch: 099/100 | Batch 000/156 | Cost: 0.0667
Epoch: 099/100 | Batch 120/156 | Cost: 0.0732
Epoch: 099/100 Train Acc.: 97.28% | Validation Acc.: 87.04%
Time elapsed: 303.13 min
Epoch: 100/100 | Batch 000/156 | Cost: 0.0641
Epoch: 100/100 | Batch 120/156 | Cost: 0.0655
Epoch: 100/100 Train Acc.: 97.69% | Validation Acc.: 88.08%
Time elapsed: 306.19 min
Total Training Time: 306.19 min
In [13]:
plt.plot(range(1, NUM_EPOCHS+1), train_loss_lst, label='Training loss')
plt.plot(range(1, NUM_EPOCHS+1), valid_loss_lst, label='Validation loss')
plt.legend(loc='upper right')
plt.ylabel('Cross entropy')
plt.xlabel('Epoch')
plt.show()
In [14]:
plt.plot(range(1, NUM_EPOCHS+1), train_acc_lst, label='Training accuracy')
plt.plot(range(1, NUM_EPOCHS+1), valid_acc_lst, label='Validation accuracy')
plt.legend(loc='upper left')
plt.ylabel('Accuracy (%)')
plt.xlabel('Epoch')
plt.show()

Evaluation

In [15]:
model.eval()
with torch.set_grad_enabled(False): # save memory during inference
    test_acc, test_loss = compute_accuracy_and_loss(model, test_loader, DEVICE)
    print(f'Test accuracy: {test_acc:.2f}%')
Test accuracy: 88.28%
In [16]:
class UnNormalize(object):
    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        """
        Args:
            tensor (Tensor): Normalized Tensor image of size (C, H, W).
        Returns:
            Tensor: Un-normalized (original-scale) image.
        """
        for t, m, s in zip(tensor, self.mean, self.std):
            t.mul_(s).add_(m)
            # The normalize code -> t.sub_(m).div_(s)
        return tensor
    
    
unorm = UnNormalize(mean=train_mean, std=train_std)
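
A quick round-trip check (illustrative; it reuses the train_mean and train_std computed above) confirms that UnNormalize inverts transforms.Normalize:

x = torch.rand(3, 8, 8)
x_normalized = transforms.Normalize(train_mean, train_std)(x.clone())
print(torch.allclose(unorm(x_normalized), x, atol=1e-5))  # True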
In [17]:
test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE, 
                         shuffle=True)

for features, targets in test_loader:
    break
    

_, predictions = model(features[:8].to(DEVICE))
predictions = torch.argmax(predictions, dim=1)

d = {0: 'cat',
     1: 'dog'}
    
fig, ax = plt.subplots(1, 8, figsize=(20, 10))
for i in range(8):
    img = unorm(features[i])
    ax[i].imshow(np.transpose(img, (1, 2, 0)))
    ax[i].set_xlabel(d[predictions[i].item()])

plt.show()