Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka 

CPython 3.6.8
IPython 7.2.0

torch 1.1.0

ResNet-101 (on CIFAR-10)

Network Architecture

The network in this notebook is an implementation of the ResNet-101 [1] architecture, trained as a 10-class object classifier on the CIFAR-10 dataset.

References

  • [1] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778). (CVPR Link)

The ResNet-101 architecture is similar to the ResNet-50 architecture, which is in turn similar to the ResNet-34 architecture shown below, except that ResNet-101 (like ResNet-50) uses bottleneck blocks instead of ResNet-34's basic blocks and stacks more of them than ResNet-50 (the figure shows a screenshot from [1]):
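In addition to the figure, the block type and the number of blocks per stage (conv2_x through conv5_x) for the relevant variants, as listed in Table 1 of [1], can be summarized as follows (this summary is my addition, not part of the original figure):

# Block type and blocks per stage (conv2_x ... conv5_x), cf. Table 1 in [1]
resnet_configs = {
    'resnet34':  ('basic',      [3, 4, 6, 3]),
    'resnet50':  ('bottleneck', [3, 4, 6, 3]),
    'resnet101': ('bottleneck', [3, 4, 23, 3]),
    'resnet152': ('bottleneck', [3, 8, 36, 3]),
}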

The following figure illustrates residual blocks with skip connections in which the input passed via the shortcut already matches the dimensions of the main path's output, which allows the network to learn an identity function.

In addition, the ResNet-34 architecture uses residual blocks with modified skip connections, in which the input passed via the shortcut is resized (via a 1x1 convolution) to the dimensions of the main path's output. Such a residual block is illustrated below:

ResNet-50/101/152 then replace the basic block with a bottleneck block, as shown below:

For a more detailed explanation see the other notebook, resnet-ex-1.ipynb.
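Complementing the figures above, the following is a minimal sketch (my addition, loosely based on torchvision's BasicBlock) of the ResNet-34-style residual block with an optional downsampling shortcut; the Bottleneck block actually used by ResNet-101 is implemented in the Model section further below.

import torch.nn as nn


class BasicBlock(nn.Module):
    """Minimal sketch of a ResNet-34-style residual block (illustrative only)."""

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        # the downsample module (1x1 conv + batchnorm) resizes the shortcut
        # whenever the main path changes the spatial size or channel count
        self.downsample = downsample

    def forward(self, x):
        shortcut = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.downsample is not None:
            shortcut = self.downsample(x)
        out += shortcut  # skip connection
        return self.relu(out)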

Imports

In [2]:
import os
import time

import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F

from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torch.utils.data.dataset import Subset

from torchvision import datasets
from torchvision import transforms

import matplotlib.pyplot as plt
from PIL import Image


if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True

Settings

In [3]:
##########################
### SETTINGS
##########################

# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.01
NUM_EPOCHS = 50

# Architecture
NUM_CLASSES = 10
BATCH_SIZE = 128
DEVICE = torch.device('cuda:3')
GRAYSCALE = False
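Note that DEVICE = torch.device('cuda:3') assumes a machine with at least four GPUs. A more portable fallback (my addition, not part of the original settings) could look like this:

# Fall back to the first GPU, or the CPU, if 'cuda:3' is not available
if torch.cuda.device_count() <= 3:
    DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')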

Dataset

In [4]:
##########################
### CIFAR-10 Dataset
##########################


# Note transforms.ToTensor() scales input images
# to 0-1 range


train_indices = torch.arange(0, 49000)
valid_indices = torch.arange(49000, 50000)


train_and_valid = datasets.CIFAR10(root='data', 
                                   train=True, 
                                   transform=transforms.ToTensor(),
                                   download=True)

train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)


test_dataset = datasets.CIFAR10(root='data', 
                                train=False, 
                                transform=transforms.ToTensor())


#####################################################
### Data Loaders
#####################################################

train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=True)

valid_loader = DataLoader(dataset=valid_dataset, 
                          batch_size=BATCH_SIZE,
                          num_workers=8,
                          shuffle=False)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=BATCH_SIZE,
                         num_workers=8,
                         shuffle=False)

#####################################################

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break

for images, labels in test_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
    
for images, labels in valid_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Files already downloaded and verified
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
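To verify the note above that transforms.ToTensor() scales the images to the 0-1 range, a quick check on the last batch from the loop above could look as follows (my addition, not part of the original notebook):

print('Pixel value range: %.2f to %.2f' % (images.min().item(), images.max().item()))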

Model

The code cell below implements the ResNet-101 architecture; it is a derivative of the code provided at https://pytorch.org/docs/0.4.0/_modules/torchvision/models/resnet.html.

In [5]:
##########################
### MODEL
##########################


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out




class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes, grayscale):
        self.inplanes = 64
        if grayscale:
            in_dim = 1
        else:
            in_dim = 3
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1, padding=2)
        #self.fc = nn.Linear(2048 * block.expansion, num_classes)
        self.fc = nn.Linear(2048, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, (2. / n)**.5)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

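        # Note: for 32x32 CIFAR-10 inputs, layer4 already yields a 1x1x2048
        # feature map, so the 7x7 average pooling defined above is skipped here.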
        #x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        logits = self.fc(x)
        probas = F.softmax(logits, dim=1)
        return logits, probas



def resnet101(num_classes, grayscale):
    """Constructs a ResNet-101 model."""
    model = ResNet(block=Bottleneck, 
                   layers=[3, 4, 23, 3],
                   num_classes=num_classes,
                   grayscale=grayscale)
    return model
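As a quick sanity check (my addition, not part of the original notebook), a random CIFAR-10-sized batch can be pushed through a freshly constructed model to confirm the expected output shapes:

# Sketch: a batch of 4 random 32x32 RGB images should yield (4, 10) logits/probas
_model = resnet101(NUM_CLASSES, GRAYSCALE)
_dummy = torch.randn(4, 3, 32, 32)
with torch.no_grad():
    _logits, _probas = _model(_dummy)
print(_logits.shape, _probas.shape)  # expected: torch.Size([4, 10]) twice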
In [6]:
torch.manual_seed(RANDOM_SEED)

##########################
### COST AND OPTIMIZER
##########################

model = resnet101(NUM_CLASSES, GRAYSCALE)
model.to(DEVICE)
 
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)  

Training

In [7]:
def compute_accuracy(model, data_loader, device):
    correct_pred, num_examples = 0, 0
    for i, (features, targets) in enumerate(data_loader):
            
        features = features.to(device)
        targets = targets.to(device)

        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
    

start_time = time.time()

# use random seed for reproducibility (here batch shuffling)
torch.manual_seed(RANDOM_SEED)

for epoch in range(NUM_EPOCHS):
    
    model.train()
    
    for batch_idx, (features, targets) in enumerate(train_loader):
    
        ### PREPARE MINIBATCH
        features = features.to(DEVICE)
        targets = targets.to(DEVICE)
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 120:
            print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                   f'Batch {batch_idx:03d}/{len(train_loader):03d} |' 
                   f' Cost: {cost:.4f}')

    # no need to build the computation graph for backprop when computing accuracy
    with torch.set_grad_enabled(False):
        train_acc = compute_accuracy(model, train_loader, device=DEVICE)
        valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
        print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
              f' | Validation Acc.: {valid_acc:.2f}%')
        
    elapsed = (time.time() - start_time)/60
    print(f'Time elapsed: {elapsed:.2f} min')
  
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
Epoch: 001/050 | Batch 000/383 | Cost: 2.5163
Epoch: 001/050 | Batch 120/383 | Cost: 2.6981
Epoch: 001/050 | Batch 240/383 | Cost: 2.4884
Epoch: 001/050 | Batch 360/383 | Cost: 2.2649
Epoch: 001/050 Train Acc.: 22.23% | Validation Acc.: 20.20%
Time elapsed: 0.91 min
Epoch: 002/050 | Batch 000/383 | Cost: 2.0539
Epoch: 002/050 | Batch 120/383 | Cost: 1.9494
Epoch: 002/050 | Batch 240/383 | Cost: 1.7358
Epoch: 002/050 | Batch 360/383 | Cost: 1.6881
Epoch: 002/050 Train Acc.: 34.62% | Validation Acc.: 33.30%
Time elapsed: 1.82 min
Epoch: 003/050 | Batch 000/383 | Cost: 1.5947
Epoch: 003/050 | Batch 120/383 | Cost: 1.6122
Epoch: 003/050 | Batch 240/383 | Cost: 1.6285
Epoch: 003/050 | Batch 360/383 | Cost: 1.5403
Epoch: 003/050 Train Acc.: 44.03% | Validation Acc.: 44.20%
Time elapsed: 2.75 min
Epoch: 004/050 | Batch 000/383 | Cost: 1.4653
Epoch: 004/050 | Batch 120/383 | Cost: 1.3565
Epoch: 004/050 | Batch 240/383 | Cost: 1.4571
Epoch: 004/050 | Batch 360/383 | Cost: 1.2938
Epoch: 004/050 Train Acc.: 52.26% | Validation Acc.: 51.60%
Time elapsed: 3.68 min
Epoch: 005/050 | Batch 000/383 | Cost: 1.3441
Epoch: 005/050 | Batch 120/383 | Cost: 1.2296
Epoch: 005/050 | Batch 240/383 | Cost: 1.1155
Epoch: 005/050 | Batch 360/383 | Cost: 1.2562
Epoch: 005/050 Train Acc.: 57.62% | Validation Acc.: 56.30%
Time elapsed: 4.62 min
Epoch: 006/050 | Batch 000/383 | Cost: 1.0643
Epoch: 006/050 | Batch 120/383 | Cost: 1.1455
Epoch: 006/050 | Batch 240/383 | Cost: 1.2560
Epoch: 006/050 | Batch 360/383 | Cost: 1.1945
Epoch: 006/050 Train Acc.: 61.26% | Validation Acc.: 57.10%
Time elapsed: 5.55 min
Epoch: 007/050 | Batch 000/383 | Cost: 1.0354
Epoch: 007/050 | Batch 120/383 | Cost: 1.0614
Epoch: 007/050 | Batch 240/383 | Cost: 0.8850
Epoch: 007/050 | Batch 360/383 | Cost: 1.2730
Epoch: 007/050 Train Acc.: 65.91% | Validation Acc.: 61.50%
Time elapsed: 6.48 min
Epoch: 008/050 | Batch 000/383 | Cost: 0.8989
Epoch: 008/050 | Batch 120/383 | Cost: 0.9142
Epoch: 008/050 | Batch 240/383 | Cost: 1.1074
Epoch: 008/050 | Batch 360/383 | Cost: 0.8856
Epoch: 008/050 Train Acc.: 70.48% | Validation Acc.: 64.60%
Time elapsed: 7.40 min
Epoch: 009/050 | Batch 000/383 | Cost: 0.9335
Epoch: 009/050 | Batch 120/383 | Cost: 0.9332
Epoch: 009/050 | Batch 240/383 | Cost: 0.8908
Epoch: 009/050 | Batch 360/383 | Cost: 0.6551
Epoch: 009/050 Train Acc.: 74.30% | Validation Acc.: 68.60%
Time elapsed: 8.33 min
Epoch: 010/050 | Batch 000/383 | Cost: 0.7369
Epoch: 010/050 | Batch 120/383 | Cost: 1.1510
Epoch: 010/050 | Batch 240/383 | Cost: 1.0534
Epoch: 010/050 | Batch 360/383 | Cost: 1.6784
Epoch: 010/050 Train Acc.: 43.73% | Validation Acc.: 46.20%
Time elapsed: 9.25 min
Epoch: 011/050 | Batch 000/383 | Cost: 1.6847
Epoch: 011/050 | Batch 120/383 | Cost: 1.2999
Epoch: 011/050 | Batch 240/383 | Cost: 1.3228
Epoch: 011/050 | Batch 360/383 | Cost: 1.2020
Epoch: 011/050 Train Acc.: 64.80% | Validation Acc.: 63.40%
Time elapsed: 10.19 min
Epoch: 012/050 | Batch 000/383 | Cost: 0.8810
Epoch: 012/050 | Batch 120/383 | Cost: 1.0206
Epoch: 012/050 | Batch 240/383 | Cost: 0.9342
Epoch: 012/050 | Batch 360/383 | Cost: 0.9090
Epoch: 012/050 Train Acc.: 72.51% | Validation Acc.: 69.50%
Time elapsed: 11.12 min
Epoch: 013/050 | Batch 000/383 | Cost: 0.7491
Epoch: 013/050 | Batch 120/383 | Cost: 0.7770
Epoch: 013/050 | Batch 240/383 | Cost: 0.6177
Epoch: 013/050 | Batch 360/383 | Cost: 0.8193
Epoch: 013/050 Train Acc.: 77.19% | Validation Acc.: 71.10%
Time elapsed: 12.05 min
Epoch: 014/050 | Batch 000/383 | Cost: 0.7384
Epoch: 014/050 | Batch 120/383 | Cost: 0.7387
Epoch: 014/050 | Batch 240/383 | Cost: 0.8657
Epoch: 014/050 | Batch 360/383 | Cost: 0.8790
Epoch: 014/050 Train Acc.: 76.11% | Validation Acc.: 70.50%
Time elapsed: 12.98 min
Epoch: 015/050 | Batch 000/383 | Cost: 0.8404
Epoch: 015/050 | Batch 120/383 | Cost: 0.5830
Epoch: 015/050 | Batch 240/383 | Cost: 0.5412
Epoch: 015/050 | Batch 360/383 | Cost: 0.5490
Epoch: 015/050 Train Acc.: 81.81% | Validation Acc.: 73.10%
Time elapsed: 13.91 min
Epoch: 016/050 | Batch 000/383 | Cost: 0.4776
Epoch: 016/050 | Batch 120/383 | Cost: 0.6313
Epoch: 016/050 | Batch 240/383 | Cost: 0.6662
Epoch: 016/050 | Batch 360/383 | Cost: 1.2079
Epoch: 016/050 Train Acc.: 70.04% | Validation Acc.: 65.80%
Time elapsed: 14.85 min
Epoch: 017/050 | Batch 000/383 | Cost: 0.8772
Epoch: 017/050 | Batch 120/383 | Cost: 0.5628
Epoch: 017/050 | Batch 240/383 | Cost: 0.6132
Epoch: 017/050 | Batch 360/383 | Cost: 0.5744
Epoch: 017/050 Train Acc.: 85.69% | Validation Acc.: 76.20%
Time elapsed: 15.78 min
Epoch: 018/050 | Batch 000/383 | Cost: 0.2691
Epoch: 018/050 | Batch 120/383 | Cost: 0.4488
Epoch: 018/050 | Batch 240/383 | Cost: 0.4294
Epoch: 018/050 | Batch 360/383 | Cost: 0.4160
Epoch: 018/050 Train Acc.: 85.21% | Validation Acc.: 74.30%
Time elapsed: 16.70 min
Epoch: 019/050 | Batch 000/383 | Cost: 0.3669
Epoch: 019/050 | Batch 120/383 | Cost: 0.4060
Epoch: 019/050 | Batch 240/383 | Cost: 0.3356
Epoch: 019/050 | Batch 360/383 | Cost: 0.4026
Epoch: 019/050 Train Acc.: 88.68% | Validation Acc.: 75.60%
Time elapsed: 17.63 min
Epoch: 020/050 | Batch 000/383 | Cost: 0.3351
Epoch: 020/050 | Batch 120/383 | Cost: 0.2692
Epoch: 020/050 | Batch 240/383 | Cost: 0.4012
Epoch: 020/050 | Batch 360/383 | Cost: 0.4054
Epoch: 020/050 Train Acc.: 78.68% | Validation Acc.: 69.30%
Time elapsed: 18.56 min
Epoch: 021/050 | Batch 000/383 | Cost: 0.6588
Epoch: 021/050 | Batch 120/383 | Cost: 0.2520
Epoch: 021/050 | Batch 240/383 | Cost: 0.3063
Epoch: 021/050 | Batch 360/383 | Cost: 0.3941
Epoch: 021/050 Train Acc.: 91.77% | Validation Acc.: 75.10%
Time elapsed: 19.46 min
Epoch: 022/050 | Batch 000/383 | Cost: 0.1590
Epoch: 022/050 | Batch 120/383 | Cost: 0.2021
Epoch: 022/050 | Batch 240/383 | Cost: 0.2791
Epoch: 022/050 | Batch 360/383 | Cost: 0.3070
Epoch: 022/050 Train Acc.: 94.08% | Validation Acc.: 75.00%
Time elapsed: 20.41 min
Epoch: 023/050 | Batch 000/383 | Cost: 0.2246
Epoch: 023/050 | Batch 120/383 | Cost: 0.1973
Epoch: 023/050 | Batch 240/383 | Cost: 0.3057
Epoch: 023/050 | Batch 360/383 | Cost: 0.3288
Epoch: 023/050 Train Acc.: 94.23% | Validation Acc.: 75.90%
Time elapsed: 21.34 min
Epoch: 024/050 | Batch 000/383 | Cost: 0.1660
Epoch: 024/050 | Batch 120/383 | Cost: 0.2438
Epoch: 024/050 | Batch 240/383 | Cost: 0.1335
Epoch: 024/050 | Batch 360/383 | Cost: 0.2232
Epoch: 024/050 Train Acc.: 95.17% | Validation Acc.: 77.60%
Time elapsed: 22.29 min
Epoch: 025/050 | Batch 000/383 | Cost: 0.0962
Epoch: 025/050 | Batch 120/383 | Cost: 0.2140
Epoch: 025/050 | Batch 240/383 | Cost: 0.3161
Epoch: 025/050 | Batch 360/383 | Cost: 0.2601
Epoch: 025/050 Train Acc.: 95.61% | Validation Acc.: 75.50%
Time elapsed: 23.22 min
Epoch: 026/050 | Batch 000/383 | Cost: 0.1091
Epoch: 026/050 | Batch 120/383 | Cost: 0.1121
Epoch: 026/050 | Batch 240/383 | Cost: 0.2550
Epoch: 026/050 | Batch 360/383 | Cost: 0.1442
Epoch: 026/050 Train Acc.: 95.77% | Validation Acc.: 75.80%
Time elapsed: 24.14 min
Epoch: 027/050 | Batch 000/383 | Cost: 0.1545
Epoch: 027/050 | Batch 120/383 | Cost: 0.1142
Epoch: 027/050 | Batch 240/383 | Cost: 0.1245
Epoch: 027/050 | Batch 360/383 | Cost: 0.1778
Epoch: 027/050 Train Acc.: 96.51% | Validation Acc.: 76.60%
Time elapsed: 25.07 min
Epoch: 028/050 | Batch 000/383 | Cost: 0.0774
Epoch: 028/050 | Batch 120/383 | Cost: 0.0916
Epoch: 028/050 | Batch 240/383 | Cost: 0.2275
Epoch: 028/050 | Batch 360/383 | Cost: 0.0742
Epoch: 028/050 Train Acc.: 95.90% | Validation Acc.: 77.10%
Time elapsed: 26.00 min
Epoch: 029/050 | Batch 000/383 | Cost: 0.0556
Epoch: 029/050 | Batch 120/383 | Cost: 0.0649
Epoch: 029/050 | Batch 240/383 | Cost: 0.1699
Epoch: 029/050 | Batch 360/383 | Cost: 0.0963
Epoch: 029/050 Train Acc.: 95.77% | Validation Acc.: 74.80%
Time elapsed: 26.93 min
Epoch: 030/050 | Batch 000/383 | Cost: 0.2278
Epoch: 030/050 | Batch 120/383 | Cost: 0.1565
Epoch: 030/050 | Batch 240/383 | Cost: 0.0929
Epoch: 030/050 | Batch 360/383 | Cost: 1.0334
Epoch: 030/050 Train Acc.: 53.82% | Validation Acc.: 52.10%
Time elapsed: 27.85 min
Epoch: 031/050 | Batch 000/383 | Cost: 1.5570
Epoch: 031/050 | Batch 120/383 | Cost: 0.6029
Epoch: 031/050 | Batch 240/383 | Cost: 0.4034
Epoch: 031/050 | Batch 360/383 | Cost: 0.3380
Epoch: 031/050 Train Acc.: 94.16% | Validation Acc.: 73.70%
Time elapsed: 28.78 min
Epoch: 032/050 | Batch 000/383 | Cost: 0.1454
Epoch: 032/050 | Batch 120/383 | Cost: 0.1692
Epoch: 032/050 | Batch 240/383 | Cost: 0.0922
Epoch: 032/050 | Batch 360/383 | Cost: 0.1078
Epoch: 032/050 Train Acc.: 97.44% | Validation Acc.: 77.20%
Time elapsed: 29.71 min
Epoch: 033/050 | Batch 000/383 | Cost: 0.1039
Epoch: 033/050 | Batch 120/383 | Cost: 0.0764
Epoch: 033/050 | Batch 240/383 | Cost: 0.1007
Epoch: 033/050 | Batch 360/383 | Cost: 0.0518
Epoch: 033/050 Train Acc.: 97.78% | Validation Acc.: 76.20%
Time elapsed: 30.64 min
Epoch: 034/050 | Batch 000/383 | Cost: 0.0672
Epoch: 034/050 | Batch 120/383 | Cost: 0.0719
Epoch: 034/050 | Batch 240/383 | Cost: 0.1163
Epoch: 034/050 | Batch 360/383 | Cost: 0.1522
Epoch: 034/050 Train Acc.: 97.79% | Validation Acc.: 75.80%
Time elapsed: 31.58 min
Epoch: 035/050 | Batch 000/383 | Cost: 0.1177
Epoch: 035/050 | Batch 120/383 | Cost: 0.0802
Epoch: 035/050 | Batch 240/383 | Cost: 0.1278
Epoch: 035/050 | Batch 360/383 | Cost: 0.0857
Epoch: 035/050 Train Acc.: 97.90% | Validation Acc.: 76.40%
Time elapsed: 32.52 min
Epoch: 036/050 | Batch 000/383 | Cost: 0.0574
Epoch: 036/050 | Batch 120/383 | Cost: 0.0644
Epoch: 036/050 | Batch 240/383 | Cost: 0.1070
Epoch: 036/050 | Batch 360/383 | Cost: 0.0326
Epoch: 036/050 Train Acc.: 97.71% | Validation Acc.: 75.60%
Time elapsed: 33.43 min
Epoch: 037/050 | Batch 000/383 | Cost: 0.0406
Epoch: 037/050 | Batch 120/383 | Cost: 0.0697
Epoch: 037/050 | Batch 240/383 | Cost: 0.0651
Epoch: 037/050 | Batch 360/383 | Cost: 0.0908
Epoch: 037/050 Train Acc.: 97.94% | Validation Acc.: 77.20%
Time elapsed: 34.37 min
Epoch: 038/050 | Batch 000/383 | Cost: 0.0772
Epoch: 038/050 | Batch 120/383 | Cost: 0.0609
Epoch: 038/050 | Batch 240/383 | Cost: 0.1069
Epoch: 038/050 | Batch 360/383 | Cost: 0.0757
Epoch: 038/050 Train Acc.: 98.20% | Validation Acc.: 76.80%
Time elapsed: 35.31 min
Epoch: 039/050 | Batch 000/383 | Cost: 0.0176
Epoch: 039/050 | Batch 120/383 | Cost: 0.0788
Epoch: 039/050 | Batch 240/383 | Cost: 0.1234
Epoch: 039/050 | Batch 360/383 | Cost: 0.0626
Epoch: 039/050 Train Acc.: 97.88% | Validation Acc.: 76.70%
Time elapsed: 36.24 min
Epoch: 040/050 | Batch 000/383 | Cost: 0.1171
Epoch: 040/050 | Batch 120/383 | Cost: 0.0533
Epoch: 040/050 | Batch 240/383 | Cost: 0.1050
Epoch: 040/050 | Batch 360/383 | Cost: 0.0686
Epoch: 040/050 Train Acc.: 98.21% | Validation Acc.: 75.10%
Time elapsed: 37.17 min
Epoch: 041/050 | Batch 000/383 | Cost: 0.0568
Epoch: 041/050 | Batch 120/383 | Cost: 0.0160
Epoch: 041/050 | Batch 240/383 | Cost: 0.0414
Epoch: 041/050 | Batch 360/383 | Cost: 0.1025
Epoch: 041/050 Train Acc.: 98.27% | Validation Acc.: 77.00%
Time elapsed: 38.09 min
Epoch: 042/050 | Batch 000/383 | Cost: 0.0302
Epoch: 042/050 | Batch 120/383 | Cost: 0.0280
Epoch: 042/050 | Batch 240/383 | Cost: 0.0703
Epoch: 042/050 | Batch 360/383 | Cost: 0.0316
Epoch: 042/050 Train Acc.: 98.09% | Validation Acc.: 75.70%
Time elapsed: 39.02 min
Epoch: 043/050 | Batch 000/383 | Cost: 0.0589
Epoch: 043/050 | Batch 120/383 | Cost: 0.0294
Epoch: 043/050 | Batch 240/383 | Cost: 0.0760
Epoch: 043/050 | Batch 360/383 | Cost: 0.0859
Epoch: 043/050 Train Acc.: 98.11% | Validation Acc.: 75.90%
Time elapsed: 39.96 min
Epoch: 044/050 | Batch 000/383 | Cost: 0.0466
Epoch: 044/050 | Batch 120/383 | Cost: 0.0802
Epoch: 044/050 | Batch 240/383 | Cost: 0.0781
Epoch: 044/050 | Batch 360/383 | Cost: 0.0717
Epoch: 044/050 Train Acc.: 98.52% | Validation Acc.: 77.00%
Time elapsed: 40.89 min
Epoch: 045/050 | Batch 000/383 | Cost: 0.0566
Epoch: 045/050 | Batch 120/383 | Cost: 0.0935
Epoch: 045/050 | Batch 240/383 | Cost: 0.0267
Epoch: 045/050 | Batch 360/383 | Cost: 0.0600
Epoch: 045/050 Train Acc.: 98.18% | Validation Acc.: 77.90%
Time elapsed: 41.83 min
Epoch: 046/050 | Batch 000/383 | Cost: 0.0506
Epoch: 046/050 | Batch 120/383 | Cost: 0.0650
Epoch: 046/050 | Batch 240/383 | Cost: 0.0094
Epoch: 046/050 | Batch 360/383 | Cost: 0.0197
Epoch: 046/050 Train Acc.: 98.54% | Validation Acc.: 77.50%
Time elapsed: 42.76 min
Epoch: 047/050 | Batch 000/383 | Cost: 0.0694
Epoch: 047/050 | Batch 120/383 | Cost: 0.0149
Epoch: 047/050 | Batch 240/383 | Cost: 0.0602
Epoch: 047/050 | Batch 360/383 | Cost: 0.0592
Epoch: 047/050 Train Acc.: 98.01% | Validation Acc.: 77.80%
Time elapsed: 43.69 min
Epoch: 048/050 | Batch 000/383 | Cost: 0.0822
Epoch: 048/050 | Batch 120/383 | Cost: 0.1484
Epoch: 048/050 | Batch 240/383 | Cost: 0.0758
Epoch: 048/050 | Batch 360/383 | Cost: 0.0467
Epoch: 048/050 Train Acc.: 98.75% | Validation Acc.: 76.40%
Time elapsed: 44.62 min
Epoch: 049/050 | Batch 000/383 | Cost: 0.0427
Epoch: 049/050 | Batch 120/383 | Cost: 0.0119
Epoch: 049/050 | Batch 240/383 | Cost: 0.0414
Epoch: 049/050 | Batch 360/383 | Cost: 0.0574
Epoch: 049/050 Train Acc.: 98.89% | Validation Acc.: 77.20%
Time elapsed: 45.57 min
Epoch: 050/050 | Batch 000/383 | Cost: 0.0492
Epoch: 050/050 | Batch 120/383 | Cost: 0.0210
Epoch: 050/050 | Batch 240/383 | Cost: 0.0981
Epoch: 050/050 | Batch 360/383 | Cost: 0.0344
Epoch: 050/050 Train Acc.: 98.27% | Validation Acc.: 76.50%
Time elapsed: 46.50 min
Total Training Time: 46.50 min
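After training, the learned parameters could be persisted for later reuse; a minimal sketch (the filename is hypothetical and not part of the original notebook):

# Save only the learned parameters; restore later via model.load_state_dict(...)
torch.save(model.state_dict(), 'resnet101_cifar10.pt')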

Evaluation

In [8]:
with torch.set_grad_enabled(False): # save memory during inference
    print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader, device=DEVICE)))
Test accuracy: 75.15%
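As an optional qualitative check (my addition; matplotlib and the data loaders are already set up above), a single test image can be displayed together with the model's predicted class:

# CIFAR-10 class names in label order
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

for features, targets in test_loader:
    break  # grab one batch

with torch.no_grad():
    _, probas = model(features.to(DEVICE))
pred = torch.argmax(probas[0]).item()

plt.imshow(features[0].permute(1, 2, 0).numpy())
plt.title('Predicted: %s | True: %s' % (classes[pred], classes[targets[0].item()]))
plt.show()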
In [9]:
%watermark -iv
numpy       1.15.4
pandas      0.23.4
torch       1.1.0
PIL.Image   5.3.0