Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.6.8
IPython 7.2.0

torch 1.0.0

Model Zoo -- ResNet-50 Digit Classifier Trained on MNIST

Network Architecture

The network in this notebook is an implementation of the ResNet-50 [1] architecture, trained on the MNIST handwritten digits dataset [2] to classify the ten digit classes.

References

  • [1] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778). (CVPR Link)

  • [2] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.

The ResNet-50 architecture is similar to the ResNet-34 architecture shown below (from [1]):

However, in ResNet-50, the residual blocks use a bottleneck design with 1x1, 3x3, and 1x1 convolutions (from [1]):

The following figure illustrates a residual block with a skip connection such that the input passed via the shortcut already matches the dimensions of the main path's output, which allows the network to learn an identity function.

Where the dimensions do change, the ResNet-34 architecture uses residual blocks in which the input passed via the shortcut is resized (via a 1x1 convolution) to match the dimensions of the main path's output. Such a residual block is illustrated below:

ResNet-50 replaces these basic blocks with bottleneck blocks, as shown below:

For a more detailed explanation see the other notebook, resnet-ex-1.ipynb.
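
For comparison with the Bottleneck block implemented later in this notebook, a minimal sketch of the basic two-layer residual block used in ResNet-34 is shown below. This is illustrative code only (it is not used anywhere in this notebook); the optional downsample module stands for the 1x1 projection applied to the shortcut when the dimensions do not match:

import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """Basic 3x3-3x3 residual block as used in ResNet-34 (illustrative sketch)."""
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample  # 1x1 projection if dimensions change

    def forward(self, x):
        residual = x

        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))

        if self.downsample is not None:
            # resize the shortcut so it matches the main path's output
            residual = self.downsample(x)

        out += residual  # skip connection
        return self.relu(out)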

Imports

In [2]:
import os
import time

import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader

from torchvision import datasets
from torchvision import transforms

import matplotlib.pyplot as plt
from PIL import Image


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True

Model Settings

In [3]:
##########################
### SETTINGS
##########################

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.0001
BATCH_SIZE = 128
NUM_EPOCHS = 20

# Architecture
NUM_FEATURES = 28*28
NUM_CLASSES = 10

# Other
DEVICE = "cuda:0"
GRAYSCALE = True

MNIST Dataset

In [4]:
##########################
### MNIST DATASET
##########################

# Note transforms.ToTensor() scales input images
# to 0-1 range
train_dataset = datasets.MNIST(root='data', 
                               train=True, 
                               transform=transforms.ToTensor(),
                               download=True)

test_dataset = datasets.MNIST(root='data', 
                              train=False, 
                              transform=transforms.ToTensor())


train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE, 
                          shuffle=True)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE, 
                         shuffle=False)

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Image batch dimensions: torch.Size([128, 1, 28, 28])
Image label dimensions: torch.Size([128])
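
Since the comment above states that transforms.ToTensor() scales the input images to the 0-1 range, a quick sanity check on the batch fetched above can confirm this (an optional sketch, not part of the recorded notebook output):

# ToTensor() should map the uint8 pixel values 0-255 to floats in [0, 1]
print('Min pixel value: %.3f' % images.min().item())
print('Max pixel value: %.3f' % images.max().item())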
In [5]:
device = torch.device(DEVICE)
torch.manual_seed(0)

for epoch in range(2):

    for batch_idx, (x, y) in enumerate(train_loader):
        
        print('Epoch:', epoch+1, end='')
        print(' | Batch index:', batch_idx, end='')
        print(' | Batch size:', y.size()[0])
        
        x = x.to(device)
        y = y.to(device)
        break
Epoch: 1 | Batch index: 0 | Batch size: 128
Epoch: 2 | Batch index: 0 | Batch size: 128

The following code cell, which implements the ResNet-50 architecture, is a derivative of the code provided at https://pytorch.org/docs/0.4.0/_modules/torchvision/models/resnet.html.
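
Re-implementing the architecture from scratch makes the MNIST-specific modifications (grayscale input, disabled average pooling, softmax outputs) explicit. For comparison, a hedged sketch of how one could instead adapt torchvision's own ResNet-50 to this task is shown below; the tv_resnet50 name and the layer swaps are my own illustration and assume a torchvision version that ships models.resnet50:

from torchvision import models
import torch.nn as nn

# Illustrative alternative (not what this notebook does): start from
# torchvision's ResNet-50 and adapt it for grayscale, 10-class MNIST input.
tv_resnet50 = models.resnet50(num_classes=NUM_CLASSES)

# Accept 1 input channel instead of 3.
tv_resnet50.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                              padding=3, bias=False)

# Make the pooling robust to the 1x1 feature maps produced by 28x28 inputs
# (newer torchvision versions already use adaptive pooling here).
tv_resnet50.avgpool = nn.AdaptiveAvgPool2d((1, 1))

# Note: unlike the model defined below, this variant returns only logits,
# not a (logits, probas) tuple.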

In [6]:
##########################
### MODEL
##########################


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding
    (kept from the original torchvision code; not used by the Bottleneck block below)"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out




class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes, grayscale):
        self.inplanes = 64
        if grayscale:
            in_dim = 1
        else:
            in_dim = 3
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1)
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, (2. / n)**.5)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        # because MNIST is already 1x1 here:
        # disable avg pooling
        #x = self.avgpool(x)
        
        x = x.view(x.size(0), -1)
        logits = self.fc(x)
        probas = F.softmax(logits, dim=1)
        return logits, probas



def resnet50(num_classes):
    """Constructs a ResNet-50 model."""
    model = ResNet(block=Bottleneck, 
                   layers=[3, 4, 6, 3],
                   num_classes=num_classes,
                   grayscale=GRAYSCALE)
    return model
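
As an optional sanity check (a sketch, not part of the recorded run), the forward pass can be verified with a random MNIST-sized batch. For 28x28 inputs, the final feature map is 1x1 with 2048 channels, so the logits should have shape [batch_size, 10]:

# Quick shape check on the CPU with a random grayscale 28x28 batch
_model = resnet50(NUM_CLASSES)
_logits, _probas = _model(torch.randn(4, 1, 28, 28))
print(_logits.shape)        # expected: torch.Size([4, 10])
print(_probas.sum(dim=1))   # each softmax row should sum to ~1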
In [7]:
torch.manual_seed(RANDOM_SEED)

model = resnet50(NUM_CLASSES)
model.to(DEVICE)
 
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  

Training

In [8]:
def compute_accuracy(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for i, (features, targets) in enumerate(data_loader):
            
        features = features.to(device)
        targets = targets.to(device)

        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
    

start_time = time.time()
for epoch in range(NUM_EPOCHS):
    
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):
        
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 50:
            print ('Epoch: %03d/%03d | Batch %04d/%04d | Cost: %.4f' 
                   %(epoch+1, NUM_EPOCHS, batch_idx, 
                     len(train_loader), cost))

        

    model.eval()
    with torch.set_grad_enabled(False): # save memory during inference
        print('Epoch: %03d/%03d | Train: %.3f%%' % (
              epoch+1, NUM_EPOCHS, 
              compute_accuracy(model, train_loader, device=DEVICE)))
        
    print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))
    
print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))
Epoch: 001/020 | Batch 0000/0469 | Cost: 2.4814
Epoch: 001/020 | Batch 0050/0469 | Cost: 1.6407
Epoch: 001/020 | Batch 0100/0469 | Cost: 1.2549
Epoch: 001/020 | Batch 0150/0469 | Cost: 0.8013
Epoch: 001/020 | Batch 0200/0469 | Cost: 0.7644
Epoch: 001/020 | Batch 0250/0469 | Cost: 0.6729
Epoch: 001/020 | Batch 0300/0469 | Cost: 0.5566
Epoch: 001/020 | Batch 0350/0469 | Cost: 0.4682
Epoch: 001/020 | Batch 0400/0469 | Cost: 0.3663
Epoch: 001/020 | Batch 0450/0469 | Cost: 0.3904
Epoch: 001/020 | Train: 90.932%
Time elapsed: 0.66 min
Epoch: 002/020 | Batch 0000/0469 | Cost: 0.3419
Epoch: 002/020 | Batch 0050/0469 | Cost: 0.2901
Epoch: 002/020 | Batch 0100/0469 | Cost: 0.1937
Epoch: 002/020 | Batch 0150/0469 | Cost: 0.2761
Epoch: 002/020 | Batch 0200/0469 | Cost: 0.2688
Epoch: 002/020 | Batch 0250/0469 | Cost: 0.1875
Epoch: 002/020 | Batch 0300/0469 | Cost: 0.2401
Epoch: 002/020 | Batch 0350/0469 | Cost: 0.1768
Epoch: 002/020 | Batch 0400/0469 | Cost: 0.2161
Epoch: 002/020 | Batch 0450/0469 | Cost: 0.1624
Epoch: 002/020 | Train: 96.738%
Time elapsed: 1.34 min
Epoch: 003/020 | Batch 0000/0469 | Cost: 0.1413
Epoch: 003/020 | Batch 0050/0469 | Cost: 0.0832
Epoch: 003/020 | Batch 0100/0469 | Cost: 0.0924
Epoch: 003/020 | Batch 0150/0469 | Cost: 0.0587
Epoch: 003/020 | Batch 0200/0469 | Cost: 0.0991
Epoch: 003/020 | Batch 0250/0469 | Cost: 0.1508
Epoch: 003/020 | Batch 0300/0469 | Cost: 0.1367
Epoch: 003/020 | Batch 0350/0469 | Cost: 0.1431
Epoch: 003/020 | Batch 0400/0469 | Cost: 0.1650
Epoch: 003/020 | Batch 0450/0469 | Cost: 0.1842
Epoch: 003/020 | Train: 98.288%
Time elapsed: 2.03 min
Epoch: 004/020 | Batch 0000/0469 | Cost: 0.0812
Epoch: 004/020 | Batch 0050/0469 | Cost: 0.0499
Epoch: 004/020 | Batch 0100/0469 | Cost: 0.0413
Epoch: 004/020 | Batch 0150/0469 | Cost: 0.0929
Epoch: 004/020 | Batch 0200/0469 | Cost: 0.0501
Epoch: 004/020 | Batch 0250/0469 | Cost: 0.1147
Epoch: 004/020 | Batch 0300/0469 | Cost: 0.0277
Epoch: 004/020 | Batch 0350/0469 | Cost: 0.0659
Epoch: 004/020 | Batch 0400/0469 | Cost: 0.0854
Epoch: 004/020 | Batch 0450/0469 | Cost: 0.0368
Epoch: 004/020 | Train: 98.942%
Time elapsed: 2.71 min
Epoch: 005/020 | Batch 0000/0469 | Cost: 0.0120
Epoch: 005/020 | Batch 0050/0469 | Cost: 0.0127
Epoch: 005/020 | Batch 0100/0469 | Cost: 0.0516
Epoch: 005/020 | Batch 0150/0469 | Cost: 0.0341
Epoch: 005/020 | Batch 0200/0469 | Cost: 0.0600
Epoch: 005/020 | Batch 0250/0469 | Cost: 0.1150
Epoch: 005/020 | Batch 0300/0469 | Cost: 0.0312
Epoch: 005/020 | Batch 0350/0469 | Cost: 0.0494
Epoch: 005/020 | Batch 0400/0469 | Cost: 0.0711
Epoch: 005/020 | Batch 0450/0469 | Cost: 0.0531
Epoch: 005/020 | Train: 99.060%
Time elapsed: 3.39 min
Epoch: 006/020 | Batch 0000/0469 | Cost: 0.0589
Epoch: 006/020 | Batch 0050/0469 | Cost: 0.0341
Epoch: 006/020 | Batch 0100/0469 | Cost: 0.0205
Epoch: 006/020 | Batch 0150/0469 | Cost: 0.0219
Epoch: 006/020 | Batch 0200/0469 | Cost: 0.0495
Epoch: 006/020 | Batch 0250/0469 | Cost: 0.0344
Epoch: 006/020 | Batch 0300/0469 | Cost: 0.0298
Epoch: 006/020 | Batch 0350/0469 | Cost: 0.0375
Epoch: 006/020 | Batch 0400/0469 | Cost: 0.1366
Epoch: 006/020 | Batch 0450/0469 | Cost: 0.0469
Epoch: 006/020 | Train: 99.312%
Time elapsed: 4.07 min
Epoch: 007/020 | Batch 0000/0469 | Cost: 0.0116
Epoch: 007/020 | Batch 0050/0469 | Cost: 0.0411
Epoch: 007/020 | Batch 0100/0469 | Cost: 0.0115
Epoch: 007/020 | Batch 0150/0469 | Cost: 0.0110
Epoch: 007/020 | Batch 0200/0469 | Cost: 0.1041
Epoch: 007/020 | Batch 0250/0469 | Cost: 0.0172
Epoch: 007/020 | Batch 0300/0469 | Cost: 0.0614
Epoch: 007/020 | Batch 0350/0469 | Cost: 0.0363
Epoch: 007/020 | Batch 0400/0469 | Cost: 0.0366
Epoch: 007/020 | Batch 0450/0469 | Cost: 0.0660
Epoch: 007/020 | Train: 99.482%
Time elapsed: 4.76 min
Epoch: 008/020 | Batch 0000/0469 | Cost: 0.0132
Epoch: 008/020 | Batch 0050/0469 | Cost: 0.0016
Epoch: 008/020 | Batch 0100/0469 | Cost: 0.0226
Epoch: 008/020 | Batch 0150/0469 | Cost: 0.0283
Epoch: 008/020 | Batch 0200/0469 | Cost: 0.0373
Epoch: 008/020 | Batch 0250/0469 | Cost: 0.0584
Epoch: 008/020 | Batch 0300/0469 | Cost: 0.0115
Epoch: 008/020 | Batch 0350/0469 | Cost: 0.0893
Epoch: 008/020 | Batch 0400/0469 | Cost: 0.0368
Epoch: 008/020 | Batch 0450/0469 | Cost: 0.0184
Epoch: 008/020 | Train: 99.192%
Time elapsed: 5.44 min
Epoch: 009/020 | Batch 0000/0469 | Cost: 0.0047
Epoch: 009/020 | Batch 0050/0469 | Cost: 0.0088
Epoch: 009/020 | Batch 0100/0469 | Cost: 0.0021
Epoch: 009/020 | Batch 0150/0469 | Cost: 0.0861
Epoch: 009/020 | Batch 0200/0469 | Cost: 0.0031
Epoch: 009/020 | Batch 0250/0469 | Cost: 0.0761
Epoch: 009/020 | Batch 0300/0469 | Cost: 0.0123
Epoch: 009/020 | Batch 0350/0469 | Cost: 0.0544
Epoch: 009/020 | Batch 0400/0469 | Cost: 0.0174
Epoch: 009/020 | Batch 0450/0469 | Cost: 0.0093
Epoch: 009/020 | Train: 98.930%
Time elapsed: 6.13 min
Epoch: 010/020 | Batch 0000/0469 | Cost: 0.0164
Epoch: 010/020 | Batch 0050/0469 | Cost: 0.0301
Epoch: 010/020 | Batch 0100/0469 | Cost: 0.0198
Epoch: 010/020 | Batch 0150/0469 | Cost: 0.0171
Epoch: 010/020 | Batch 0200/0469 | Cost: 0.1067
Epoch: 010/020 | Batch 0250/0469 | Cost: 0.0099
Epoch: 010/020 | Batch 0300/0469 | Cost: 0.0169
Epoch: 010/020 | Batch 0350/0469 | Cost: 0.0498
Epoch: 010/020 | Batch 0400/0469 | Cost: 0.0394
Epoch: 010/020 | Batch 0450/0469 | Cost: 0.0366
Epoch: 010/020 | Train: 99.385%
Time elapsed: 6.81 min
Epoch: 011/020 | Batch 0000/0469 | Cost: 0.0049
Epoch: 011/020 | Batch 0050/0469 | Cost: 0.0052
Epoch: 011/020 | Batch 0100/0469 | Cost: 0.0019
Epoch: 011/020 | Batch 0150/0469 | Cost: 0.0270
Epoch: 011/020 | Batch 0200/0469 | Cost: 0.0076
Epoch: 011/020 | Batch 0250/0469 | Cost: 0.0091
Epoch: 011/020 | Batch 0300/0469 | Cost: 0.0114
Epoch: 011/020 | Batch 0350/0469 | Cost: 0.0233
Epoch: 011/020 | Batch 0400/0469 | Cost: 0.0443
Epoch: 011/020 | Batch 0450/0469 | Cost: 0.0027
Epoch: 011/020 | Train: 99.693%
Time elapsed: 7.50 min
Epoch: 012/020 | Batch 0000/0469 | Cost: 0.0361
Epoch: 012/020 | Batch 0050/0469 | Cost: 0.0054
Epoch: 012/020 | Batch 0100/0469 | Cost: 0.0485
Epoch: 012/020 | Batch 0150/0469 | Cost: 0.0220
Epoch: 012/020 | Batch 0200/0469 | Cost: 0.0903
Epoch: 012/020 | Batch 0250/0469 | Cost: 0.0144
Epoch: 012/020 | Batch 0300/0469 | Cost: 0.0148
Epoch: 012/020 | Batch 0350/0469 | Cost: 0.0055
Epoch: 012/020 | Batch 0400/0469 | Cost: 0.0012
Epoch: 012/020 | Batch 0450/0469 | Cost: 0.0228
Epoch: 012/020 | Train: 99.530%
Time elapsed: 8.18 min
Epoch: 013/020 | Batch 0000/0469 | Cost: 0.0038
Epoch: 013/020 | Batch 0050/0469 | Cost: 0.0060
Epoch: 013/020 | Batch 0100/0469 | Cost: 0.0206
Epoch: 013/020 | Batch 0150/0469 | Cost: 0.0092
Epoch: 013/020 | Batch 0200/0469 | Cost: 0.0428
Epoch: 013/020 | Batch 0250/0469 | Cost: 0.0627
Epoch: 013/020 | Batch 0300/0469 | Cost: 0.0374
Epoch: 013/020 | Batch 0350/0469 | Cost: 0.0160
Epoch: 013/020 | Batch 0400/0469 | Cost: 0.0013
Epoch: 013/020 | Batch 0450/0469 | Cost: 0.0477
Epoch: 013/020 | Train: 99.625%
Time elapsed: 8.86 min
Epoch: 014/020 | Batch 0000/0469 | Cost: 0.0087
Epoch: 014/020 | Batch 0050/0469 | Cost: 0.0014
Epoch: 014/020 | Batch 0100/0469 | Cost: 0.0032
Epoch: 014/020 | Batch 0150/0469 | Cost: 0.0096
Epoch: 014/020 | Batch 0200/0469 | Cost: 0.0128
Epoch: 014/020 | Batch 0250/0469 | Cost: 0.0131
Epoch: 014/020 | Batch 0300/0469 | Cost: 0.0137
Epoch: 014/020 | Batch 0350/0469 | Cost: 0.0338
Epoch: 014/020 | Batch 0400/0469 | Cost: 0.0393
Epoch: 014/020 | Batch 0450/0469 | Cost: 0.0372
Epoch: 014/020 | Train: 99.483%
Time elapsed: 9.55 min
Epoch: 015/020 | Batch 0000/0469 | Cost: 0.0263
Epoch: 015/020 | Batch 0050/0469 | Cost: 0.0049
Epoch: 015/020 | Batch 0100/0469 | Cost: 0.0198
Epoch: 015/020 | Batch 0150/0469 | Cost: 0.0455
Epoch: 015/020 | Batch 0200/0469 | Cost: 0.0028
Epoch: 015/020 | Batch 0250/0469 | Cost: 0.0069
Epoch: 015/020 | Batch 0300/0469 | Cost: 0.0319
Epoch: 015/020 | Batch 0350/0469 | Cost: 0.0006
Epoch: 015/020 | Batch 0400/0469 | Cost: 0.0022
Epoch: 015/020 | Batch 0450/0469 | Cost: 0.0024
Epoch: 015/020 | Train: 99.795%
Time elapsed: 10.24 min
Epoch: 016/020 | Batch 0000/0469 | Cost: 0.0010
Epoch: 016/020 | Batch 0050/0469 | Cost: 0.0029
Epoch: 016/020 | Batch 0100/0469 | Cost: 0.0031
Epoch: 016/020 | Batch 0150/0469 | Cost: 0.0041
Epoch: 016/020 | Batch 0200/0469 | Cost: 0.0007
Epoch: 016/020 | Batch 0250/0469 | Cost: 0.0130
Epoch: 016/020 | Batch 0300/0469 | Cost: 0.0172
Epoch: 016/020 | Batch 0350/0469 | Cost: 0.0391
Epoch: 016/020 | Batch 0400/0469 | Cost: 0.0171
Epoch: 016/020 | Batch 0450/0469 | Cost: 0.0763
Epoch: 016/020 | Train: 99.533%
Time elapsed: 10.92 min
Epoch: 017/020 | Batch 0000/0469 | Cost: 0.0575
Epoch: 017/020 | Batch 0050/0469 | Cost: 0.0122
Epoch: 017/020 | Batch 0100/0469 | Cost: 0.0356
Epoch: 017/020 | Batch 0150/0469 | Cost: 0.0309
Epoch: 017/020 | Batch 0200/0469 | Cost: 0.0840
Epoch: 017/020 | Batch 0250/0469 | Cost: 0.0178
Epoch: 017/020 | Batch 0300/0469 | Cost: 0.0083
Epoch: 017/020 | Batch 0350/0469 | Cost: 0.0006
Epoch: 017/020 | Batch 0400/0469 | Cost: 0.0114
Epoch: 017/020 | Batch 0450/0469 | Cost: 0.0281
Epoch: 017/020 | Train: 99.777%
Time elapsed: 11.62 min
Epoch: 018/020 | Batch 0000/0469 | Cost: 0.0116
Epoch: 018/020 | Batch 0050/0469 | Cost: 0.0014
Epoch: 018/020 | Batch 0100/0469 | Cost: 0.0149
Epoch: 018/020 | Batch 0150/0469 | Cost: 0.0258
Epoch: 018/020 | Batch 0200/0469 | Cost: 0.0032
Epoch: 018/020 | Batch 0250/0469 | Cost: 0.0026
Epoch: 018/020 | Batch 0300/0469 | Cost: 0.0010
Epoch: 018/020 | Batch 0350/0469 | Cost: 0.0109
Epoch: 018/020 | Batch 0400/0469 | Cost: 0.0003
Epoch: 018/020 | Batch 0450/0469 | Cost: 0.0052
Epoch: 018/020 | Train: 99.540%
Time elapsed: 12.30 min
Epoch: 019/020 | Batch 0000/0469 | Cost: 0.0215
Epoch: 019/020 | Batch 0050/0469 | Cost: 0.0025
Epoch: 019/020 | Batch 0100/0469 | Cost: 0.0884
Epoch: 019/020 | Batch 0150/0469 | Cost: 0.0038
Epoch: 019/020 | Batch 0200/0469 | Cost: 0.0036
Epoch: 019/020 | Batch 0250/0469 | Cost: 0.0061
Epoch: 019/020 | Batch 0300/0469 | Cost: 0.0015
Epoch: 019/020 | Batch 0350/0469 | Cost: 0.0406
Epoch: 019/020 | Batch 0400/0469 | Cost: 0.1211
Epoch: 019/020 | Batch 0450/0469 | Cost: 0.0135
Epoch: 019/020 | Train: 99.617%
Time elapsed: 12.98 min
Epoch: 020/020 | Batch 0000/0469 | Cost: 0.0983
Epoch: 020/020 | Batch 0050/0469 | Cost: 0.0043
Epoch: 020/020 | Batch 0100/0469 | Cost: 0.0492
Epoch: 020/020 | Batch 0150/0469 | Cost: 0.0634
Epoch: 020/020 | Batch 0200/0469 | Cost: 0.0052
Epoch: 020/020 | Batch 0250/0469 | Cost: 0.0082
Epoch: 020/020 | Batch 0300/0469 | Cost: 0.0044
Epoch: 020/020 | Batch 0350/0469 | Cost: 0.0015
Epoch: 020/020 | Batch 0400/0469 | Cost: 0.0153
Epoch: 020/020 | Batch 0450/0469 | Cost: 0.0085
Epoch: 020/020 | Train: 99.685%
Time elapsed: 13.67 min
Total Training Time: 13.67 min

Evaluation

In [9]:
with torch.set_grad_enabled(False): # save memory during inference
    print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader, device=DEVICE)))
Test accuracy: 98.39%
In [10]:
for batch_idx, (features, targets) in enumerate(test_loader):
    # keep only the first test batch for visualization
    break
    
    
nhwc_img = np.transpose(features[0], axes=(1, 2, 0))
nhw_img = np.squeeze(nhwc_img.numpy(), axis=2)
plt.imshow(nhw_img, cmap='Greys');
In [11]:
model.eval()
logits, probas = model(features.to(device)[0, None])
print('Probability 7 %.2f%%' % (probas[0][7]*100))
Probability 7 100.00%
In [12]:
%watermark -iv
numpy       1.15.4
pandas      0.23.4
torch       1.0.0
PIL.Image   5.3.0