Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka

CPython 3.6.8
IPython 7.2.0

torch 1.1.0
The network in this notebook is an implementation of the ResNet-101 [1] architecture, trained as an object classifier on the CIFAR-10 dataset [2].
References
[1] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778). (CVPR Link)
[2] Krizhevsky, A. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto.
The ResNet-101 architecture is similar to the ResNet-50 architecture, which is in turn similar to the ResNet-34 architecture shown below, except that ResNet-101 uses bottleneck blocks (in contrast to ResNet-34's basic blocks) and stacks more of them than ResNet-50 (the figure shows a screenshot from [1]):
The following figure illustrates residual blocks with skip connections such that the input passed via the shortcut matches the dimensions of the main path's output, which allows the network to learn identity functions.
The ResNet-34 architecture actually uses modified residual blocks in which the input passed via the shortcut is resized to match the dimensions of the main path's output. Such a residual block is illustrated below:
The ResNet-50/101/152 architectures then use a bottleneck block, as shown below:
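The number in the name refers to the number of weighted layers (convolutions plus the final fully connected layer). As a quick sanity check, here is a minimal sketch of the arithmetic behind the "101":

# Counting the weighted layers in ResNet-101 (convolutions + fully connected;
# batchnorm, pooling, and activation layers are not counted):
stem = 1                 # initial 7x7 convolution
blocks = [3, 4, 23, 3]   # bottleneck blocks per stage
convs_per_block = 3      # each bottleneck block: 1x1 -> 3x3 -> 1x1
fc = 1                   # final fully connected classifier
print(stem + sum(blocks) * convs_per_block + fc)  # 101
# ResNet-50 uses the stage configuration [3, 4, 6, 3]: 1 + 16*3 + 1 = 50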
For a more detailed explanation see the other notebook, resnet-ex-1.ipynb.
import os
import time
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torch.utils.data.dataset import Subset
from torchvision import datasets
from torchvision import transforms
import matplotlib.pyplot as plt
from PIL import Image
if torch.cuda.is_available():
torch.backends.cudnn.deterministic = True
##########################
### SETTINGS
##########################
# Hyperparameters
RANDOM_SEED = 1
LEARNING_RATE = 0.01
NUM_EPOCHS = 50
# Architecture
NUM_CLASSES = 10
BATCH_SIZE = 128
DEVICE = torch.device('cuda:3' if torch.cuda.is_available() else 'cpu')  # fall back to CPU if no GPU is available
GRAYSCALE = False
##########################
### CIFAR-10 Dataset
##########################
# Note transforms.ToTensor() scales input images
# to 0-1 range
train_indices = torch.arange(0, 49000)
valid_indices = torch.arange(49000, 50000)
train_and_valid = datasets.CIFAR10(root='data',
train=True,
transform=transforms.ToTensor(),
download=True)
train_dataset = Subset(train_and_valid, train_indices)
valid_dataset = Subset(train_and_valid, valid_indices)
test_dataset = datasets.CIFAR10(root='data',
train=False,
transform=transforms.ToTensor())
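A quick check (a minimal sketch, not in the original notebook) that the split sizes come out as intended:

# Verify the 49,000/1,000/10,000 train/validation/test split
print(len(train_dataset), len(valid_dataset), len(test_dataset))
# 49000 1000 10000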
#####################################################
### Data Loaders
#####################################################
train_loader = DataLoader(dataset=train_dataset,
batch_size=BATCH_SIZE,
num_workers=8,
shuffle=True)
valid_loader = DataLoader(dataset=valid_dataset,
batch_size=BATCH_SIZE,
num_workers=8,
shuffle=False)
test_loader = DataLoader(dataset=test_dataset,
batch_size=BATCH_SIZE,
num_workers=8,
shuffle=False)
#####################################################
# Checking the dataset
for images, labels in train_loader:
print('Image batch dimensions:', images.shape)
print('Image label dimensions:', labels.shape)
break
for images, labels in test_loader:
print('Image batch dimensions:', images.shape)
print('Image label dimensions:', labels.shape)
break
for images, labels in valid_loader:
print('Image batch dimensions:', images.shape)
print('Image label dimensions:', labels.shape)
break
Files already downloaded and verified
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
Image batch dimensions: torch.Size([128, 3, 32, 32])
Image label dimensions: torch.Size([128])
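Since matplotlib is already imported, it can also be worth eyeballing a few training images. The following is a minimal sketch; the grid size and the CIFAR-10 class-name tuple are illustrative choices, not part of the original notebook:

# Minimal sketch: show the first 8 training images with their class names.
# CIFAR-10 class names in label-index order:
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
images, labels = next(iter(train_loader))
fig, axes = plt.subplots(1, 8, figsize=(16, 2))
for ax, img, label in zip(axes, images[:8], labels[:8]):
    ax.imshow(img.permute(1, 2, 0).numpy())  # CHW -> HWC for matplotlib
    ax.set_title(classes[int(label)])
    ax.axis('off')
plt.show()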
The following code cell implements the ResNet-101 architecture and is a derivative of the code provided at https://pytorch.org/docs/0.4.0/_modules/torchvision/models/resnet.html.
##########################
### MODEL
##########################
def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding (helper from the torchvision reference
    implementation; unused by the bottleneck blocks below but kept for
    completeness)."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)
class Bottleneck(nn.Module):
expansion = 4
def __init__(self, inplanes, planes, stride=1, downsample=None):
super(Bottleneck, self).__init__()
self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
self.bn1 = nn.BatchNorm2d(planes)
self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
padding=1, bias=False)
self.bn2 = nn.BatchNorm2d(planes)
self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
self.bn3 = nn.BatchNorm2d(planes * 4)
self.relu = nn.ReLU(inplace=True)
self.downsample = downsample
self.stride = stride
def forward(self, x):
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
if self.downsample is not None:
residual = self.downsample(x)
out += residual
out = self.relu(out)
return out
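As a quick plausibility check (a minimal sketch, not part of the original code): a stride-1 bottleneck block preserves the spatial resolution while expanding the channel dimension by Bottleneck.expansion = 4, so the 1x1 shortcut convolution is needed whenever the input channels do not already match the main path's output:

# Minimal sketch: run a dummy tensor through one bottleneck block.
_downsample = nn.Sequential(
    nn.Conv2d(64, 256, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(256))
_block = Bottleneck(inplanes=64, planes=64, stride=1, downsample=_downsample)
print(_block(torch.randn(2, 64, 8, 8)).shape)  # torch.Size([2, 256, 8, 8])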
class ResNet(nn.Module):
def __init__(self, block, layers, num_classes, grayscale):
self.inplanes = 64
if grayscale:
in_dim = 1
else:
in_dim = 3
super(ResNet, self).__init__()
self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
bias=False)
self.bn1 = nn.BatchNorm2d(64)
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1, padding=2)
        # For 32x32 CIFAR-10 inputs, layer4 already outputs a 2048x1x1 feature
        # map, so the average pooling is skipped in forward() and the classifier
        # operates directly on the 2048-dimensional feature vector.
        #self.fc = nn.Linear(2048 * block.expansion, num_classes)
        self.fc = nn.Linear(2048, num_classes)
for m in self.modules():
if isinstance(m, nn.Conv2d):
n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
m.weight.data.normal_(0, (2. / n)**.5)
elif isinstance(m, nn.BatchNorm2d):
m.weight.data.fill_(1)
m.bias.data.zero_()
def _make_layer(self, block, planes, blocks, stride=1):
downsample = None
if stride != 1 or self.inplanes != planes * block.expansion:
downsample = nn.Sequential(
nn.Conv2d(self.inplanes, planes * block.expansion,
kernel_size=1, stride=stride, bias=False),
nn.BatchNorm2d(planes * block.expansion),
)
layers = []
layers.append(block(self.inplanes, planes, stride, downsample))
self.inplanes = planes * block.expansion
for i in range(1, blocks):
layers.append(block(self.inplanes, planes))
return nn.Sequential(*layers)
def forward(self, x):
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.maxpool(x)
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.layer4(x)
#x = self.avgpool(x)
x = x.view(x.size(0), -1)
logits = self.fc(x)
probas = F.softmax(logits, dim=1)
return logits, probas
def resnet101(num_classes, grayscale):
"""Constructs a ResNet-101 model."""
model = ResNet(block=Bottleneck,
layers=[3, 4, 23, 3],
                   num_classes=num_classes,  # use the argument, not the global
grayscale=grayscale)
return model
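Before training, a dummy forward pass is a cheap way to verify that the architecture accepts 32x32 CIFAR-10 inputs and to see the model size; a minimal sketch:

# Minimal sketch: dummy forward pass with a CIFAR-10-sized batch.
_model = resnet101(num_classes=NUM_CLASSES, grayscale=GRAYSCALE)
_logits, _probas = _model(torch.randn(4, 3, 32, 32))
print(_logits.shape, _probas.shape)  # torch.Size([4, 10]) torch.Size([4, 10])
print(sum(p.numel() for p in _model.parameters() if p.requires_grad),
      'trainable parameters')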
torch.manual_seed(RANDOM_SEED)
##########################
### COST AND OPTIMIZER
##########################
model = resnet101(NUM_CLASSES, GRAYSCALE)
model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
def compute_accuracy(model, data_loader, device):
    model.eval()  # evaluation mode: use running BatchNorm statistics
    correct_pred, num_examples = 0, 0
for i, (features, targets) in enumerate(data_loader):
features = features.to(device)
targets = targets.to(device)
logits, probas = model(features)
_, predicted_labels = torch.max(probas, 1)
num_examples += targets.size(0)
correct_pred += (predicted_labels == targets).sum()
return correct_pred.float()/num_examples * 100
start_time = time.time()
# use random seed for reproducibility (here batch shuffling)
torch.manual_seed(RANDOM_SEED)
for epoch in range(NUM_EPOCHS):
model.train()
for batch_idx, (features, targets) in enumerate(train_loader):
### PREPARE MINIBATCH
features = features.to(DEVICE)
targets = targets.to(DEVICE)
### FORWARD AND BACK PROP
logits, probas = model(features)
cost = F.cross_entropy(logits, targets)
optimizer.zero_grad()
cost.backward()
### UPDATE MODEL PARAMETERS
optimizer.step()
### LOGGING
        if not batch_idx % 120:  # log every 120 minibatches
            print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} | '
                  f'Batch {batch_idx:03d}/{len(train_loader):03d} | '
                  f'Cost: {cost:.4f}')
# no need to build the computation graph for backprop when computing accuracy
with torch.set_grad_enabled(False):
train_acc = compute_accuracy(model, train_loader, device=DEVICE)
valid_acc = compute_accuracy(model, valid_loader, device=DEVICE)
print(f'Epoch: {epoch+1:03d}/{NUM_EPOCHS:03d} Train Acc.: {train_acc:.2f}%'
f' | Validation Acc.: {valid_acc:.2f}%')
elapsed = (time.time() - start_time)/60
print(f'Time elapsed: {elapsed:.2f} min')
elapsed = (time.time() - start_time)/60
print(f'Total Training Time: {elapsed:.2f} min')
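After training, it is common practice to persist the learned weights so that later inference does not require retraining. A minimal sketch (the filename is an arbitrary choice):

# Minimal sketch: save the trained weights; restore them later via
#   model.load_state_dict(torch.load('resnet101_cifar10.pt', map_location=DEVICE))
torch.save(model.state_dict(), 'resnet101_cifar10.pt')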
Epoch: 001/050 | Batch 000/383 | Cost: 2.5163
Epoch: 001/050 | Batch 120/383 | Cost: 2.6981
Epoch: 001/050 | Batch 240/383 | Cost: 2.4884
Epoch: 001/050 | Batch 360/383 | Cost: 2.2649
Epoch: 001/050 Train Acc.: 22.23% | Validation Acc.: 20.20%
Time elapsed: 0.91 min
Epoch: 002/050 | Batch 000/383 | Cost: 2.0539
Epoch: 002/050 | Batch 120/383 | Cost: 1.9494
Epoch: 002/050 | Batch 240/383 | Cost: 1.7358
Epoch: 002/050 | Batch 360/383 | Cost: 1.6881
Epoch: 002/050 Train Acc.: 34.62% | Validation Acc.: 33.30%
Time elapsed: 1.82 min
Epoch: 003/050 | Batch 000/383 | Cost: 1.5947
Epoch: 003/050 | Batch 120/383 | Cost: 1.6122
Epoch: 003/050 | Batch 240/383 | Cost: 1.6285
Epoch: 003/050 | Batch 360/383 | Cost: 1.5403
Epoch: 003/050 Train Acc.: 44.03% | Validation Acc.: 44.20%
Time elapsed: 2.75 min
Epoch: 004/050 | Batch 000/383 | Cost: 1.4653
Epoch: 004/050 | Batch 120/383 | Cost: 1.3565
Epoch: 004/050 | Batch 240/383 | Cost: 1.4571
Epoch: 004/050 | Batch 360/383 | Cost: 1.2938
Epoch: 004/050 Train Acc.: 52.26% | Validation Acc.: 51.60%
Time elapsed: 3.68 min
Epoch: 005/050 | Batch 000/383 | Cost: 1.3441
Epoch: 005/050 | Batch 120/383 | Cost: 1.2296
Epoch: 005/050 | Batch 240/383 | Cost: 1.1155
Epoch: 005/050 | Batch 360/383 | Cost: 1.2562
Epoch: 005/050 Train Acc.: 57.62% | Validation Acc.: 56.30%
Time elapsed: 4.62 min
Epoch: 006/050 | Batch 000/383 | Cost: 1.0643
Epoch: 006/050 | Batch 120/383 | Cost: 1.1455
Epoch: 006/050 | Batch 240/383 | Cost: 1.2560
Epoch: 006/050 | Batch 360/383 | Cost: 1.1945
Epoch: 006/050 Train Acc.: 61.26% | Validation Acc.: 57.10%
Time elapsed: 5.55 min
Epoch: 007/050 | Batch 000/383 | Cost: 1.0354
Epoch: 007/050 | Batch 120/383 | Cost: 1.0614
Epoch: 007/050 | Batch 240/383 | Cost: 0.8850
Epoch: 007/050 | Batch 360/383 | Cost: 1.2730
Epoch: 007/050 Train Acc.: 65.91% | Validation Acc.: 61.50%
Time elapsed: 6.48 min
Epoch: 008/050 | Batch 000/383 | Cost: 0.8989
Epoch: 008/050 | Batch 120/383 | Cost: 0.9142
Epoch: 008/050 | Batch 240/383 | Cost: 1.1074
Epoch: 008/050 | Batch 360/383 | Cost: 0.8856
Epoch: 008/050 Train Acc.: 70.48% | Validation Acc.: 64.60%
Time elapsed: 7.40 min
Epoch: 009/050 | Batch 000/383 | Cost: 0.9335
Epoch: 009/050 | Batch 120/383 | Cost: 0.9332
Epoch: 009/050 | Batch 240/383 | Cost: 0.8908
Epoch: 009/050 | Batch 360/383 | Cost: 0.6551
Epoch: 009/050 Train Acc.: 74.30% | Validation Acc.: 68.60%
Time elapsed: 8.33 min
Epoch: 010/050 | Batch 000/383 | Cost: 0.7369
Epoch: 010/050 | Batch 120/383 | Cost: 1.1510
Epoch: 010/050 | Batch 240/383 | Cost: 1.0534
Epoch: 010/050 | Batch 360/383 | Cost: 1.6784
Epoch: 010/050 Train Acc.: 43.73% | Validation Acc.: 46.20%
Time elapsed: 9.25 min
Epoch: 011/050 | Batch 000/383 | Cost: 1.6847
Epoch: 011/050 | Batch 120/383 | Cost: 1.2999
Epoch: 011/050 | Batch 240/383 | Cost: 1.3228
Epoch: 011/050 | Batch 360/383 | Cost: 1.2020
Epoch: 011/050 Train Acc.: 64.80% | Validation Acc.: 63.40%
Time elapsed: 10.19 min
Epoch: 012/050 | Batch 000/383 | Cost: 0.8810
Epoch: 012/050 | Batch 120/383 | Cost: 1.0206
Epoch: 012/050 | Batch 240/383 | Cost: 0.9342
Epoch: 012/050 | Batch 360/383 | Cost: 0.9090
Epoch: 012/050 Train Acc.: 72.51% | Validation Acc.: 69.50%
Time elapsed: 11.12 min
Epoch: 013/050 | Batch 000/383 | Cost: 0.7491
Epoch: 013/050 | Batch 120/383 | Cost: 0.7770
Epoch: 013/050 | Batch 240/383 | Cost: 0.6177
Epoch: 013/050 | Batch 360/383 | Cost: 0.8193
Epoch: 013/050 Train Acc.: 77.19% | Validation Acc.: 71.10%
Time elapsed: 12.05 min
Epoch: 014/050 | Batch 000/383 | Cost: 0.7384
Epoch: 014/050 | Batch 120/383 | Cost: 0.7387
Epoch: 014/050 | Batch 240/383 | Cost: 0.8657
Epoch: 014/050 | Batch 360/383 | Cost: 0.8790
Epoch: 014/050 Train Acc.: 76.11% | Validation Acc.: 70.50%
Time elapsed: 12.98 min
Epoch: 015/050 | Batch 000/383 | Cost: 0.8404
Epoch: 015/050 | Batch 120/383 | Cost: 0.5830
Epoch: 015/050 | Batch 240/383 | Cost: 0.5412
Epoch: 015/050 | Batch 360/383 | Cost: 0.5490
Epoch: 015/050 Train Acc.: 81.81% | Validation Acc.: 73.10%
Time elapsed: 13.91 min
Epoch: 016/050 | Batch 000/383 | Cost: 0.4776
Epoch: 016/050 | Batch 120/383 | Cost: 0.6313
Epoch: 016/050 | Batch 240/383 | Cost: 0.6662
Epoch: 016/050 | Batch 360/383 | Cost: 1.2079
Epoch: 016/050 Train Acc.: 70.04% | Validation Acc.: 65.80%
Time elapsed: 14.85 min
Epoch: 017/050 | Batch 000/383 | Cost: 0.8772
Epoch: 017/050 | Batch 120/383 | Cost: 0.5628
Epoch: 017/050 | Batch 240/383 | Cost: 0.6132
Epoch: 017/050 | Batch 360/383 | Cost: 0.5744
Epoch: 017/050 Train Acc.: 85.69% | Validation Acc.: 76.20%
Time elapsed: 15.78 min
Epoch: 018/050 | Batch 000/383 | Cost: 0.2691
Epoch: 018/050 | Batch 120/383 | Cost: 0.4488
Epoch: 018/050 | Batch 240/383 | Cost: 0.4294
Epoch: 018/050 | Batch 360/383 | Cost: 0.4160
Epoch: 018/050 Train Acc.: 85.21% | Validation Acc.: 74.30%
Time elapsed: 16.70 min
Epoch: 019/050 | Batch 000/383 | Cost: 0.3669
Epoch: 019/050 | Batch 120/383 | Cost: 0.4060
Epoch: 019/050 | Batch 240/383 | Cost: 0.3356
Epoch: 019/050 | Batch 360/383 | Cost: 0.4026
Epoch: 019/050 Train Acc.: 88.68% | Validation Acc.: 75.60%
Time elapsed: 17.63 min
Epoch: 020/050 | Batch 000/383 | Cost: 0.3351
Epoch: 020/050 | Batch 120/383 | Cost: 0.2692
Epoch: 020/050 | Batch 240/383 | Cost: 0.4012
Epoch: 020/050 | Batch 360/383 | Cost: 0.4054
Epoch: 020/050 Train Acc.: 78.68% | Validation Acc.: 69.30%
Time elapsed: 18.56 min
Epoch: 021/050 | Batch 000/383 | Cost: 0.6588
Epoch: 021/050 | Batch 120/383 | Cost: 0.2520
Epoch: 021/050 | Batch 240/383 | Cost: 0.3063
Epoch: 021/050 | Batch 360/383 | Cost: 0.3941
Epoch: 021/050 Train Acc.: 91.77% | Validation Acc.: 75.10%
Time elapsed: 19.46 min
Epoch: 022/050 | Batch 000/383 | Cost: 0.1590
Epoch: 022/050 | Batch 120/383 | Cost: 0.2021
Epoch: 022/050 | Batch 240/383 | Cost: 0.2791
Epoch: 022/050 | Batch 360/383 | Cost: 0.3070
Epoch: 022/050 Train Acc.: 94.08% | Validation Acc.: 75.00%
Time elapsed: 20.41 min
Epoch: 023/050 | Batch 000/383 | Cost: 0.2246
Epoch: 023/050 | Batch 120/383 | Cost: 0.1973
Epoch: 023/050 | Batch 240/383 | Cost: 0.3057
Epoch: 023/050 | Batch 360/383 | Cost: 0.3288
Epoch: 023/050 Train Acc.: 94.23% | Validation Acc.: 75.90%
Time elapsed: 21.34 min
Epoch: 024/050 | Batch 000/383 | Cost: 0.1660
Epoch: 024/050 | Batch 120/383 | Cost: 0.2438
Epoch: 024/050 | Batch 240/383 | Cost: 0.1335
Epoch: 024/050 | Batch 360/383 | Cost: 0.2232
Epoch: 024/050 Train Acc.: 95.17% | Validation Acc.: 77.60%
Time elapsed: 22.29 min
Epoch: 025/050 | Batch 000/383 | Cost: 0.0962
Epoch: 025/050 | Batch 120/383 | Cost: 0.2140
Epoch: 025/050 | Batch 240/383 | Cost: 0.3161
Epoch: 025/050 | Batch 360/383 | Cost: 0.2601
Epoch: 025/050 Train Acc.: 95.61% | Validation Acc.: 75.50%
Time elapsed: 23.22 min
Epoch: 026/050 | Batch 000/383 | Cost: 0.1091
Epoch: 026/050 | Batch 120/383 | Cost: 0.1121
Epoch: 026/050 | Batch 240/383 | Cost: 0.2550
Epoch: 026/050 | Batch 360/383 | Cost: 0.1442
Epoch: 026/050 Train Acc.: 95.77% | Validation Acc.: 75.80%
Time elapsed: 24.14 min
Epoch: 027/050 | Batch 000/383 | Cost: 0.1545
Epoch: 027/050 | Batch 120/383 | Cost: 0.1142
Epoch: 027/050 | Batch 240/383 | Cost: 0.1245
Epoch: 027/050 | Batch 360/383 | Cost: 0.1778
Epoch: 027/050 Train Acc.: 96.51% | Validation Acc.: 76.60%
Time elapsed: 25.07 min
Epoch: 028/050 | Batch 000/383 | Cost: 0.0774
Epoch: 028/050 | Batch 120/383 | Cost: 0.0916
Epoch: 028/050 | Batch 240/383 | Cost: 0.2275
Epoch: 028/050 | Batch 360/383 | Cost: 0.0742
Epoch: 028/050 Train Acc.: 95.90% | Validation Acc.: 77.10%
Time elapsed: 26.00 min
Epoch: 029/050 | Batch 000/383 | Cost: 0.0556
Epoch: 029/050 | Batch 120/383 | Cost: 0.0649
Epoch: 029/050 | Batch 240/383 | Cost: 0.1699
Epoch: 029/050 | Batch 360/383 | Cost: 0.0963
Epoch: 029/050 Train Acc.: 95.77% | Validation Acc.: 74.80%
Time elapsed: 26.93 min
Epoch: 030/050 | Batch 000/383 | Cost: 0.2278
Epoch: 030/050 | Batch 120/383 | Cost: 0.1565
Epoch: 030/050 | Batch 240/383 | Cost: 0.0929
Epoch: 030/050 | Batch 360/383 | Cost: 1.0334
Epoch: 030/050 Train Acc.: 53.82% | Validation Acc.: 52.10%
Time elapsed: 27.85 min
Epoch: 031/050 | Batch 000/383 | Cost: 1.5570
Epoch: 031/050 | Batch 120/383 | Cost: 0.6029
Epoch: 031/050 | Batch 240/383 | Cost: 0.4034
Epoch: 031/050 | Batch 360/383 | Cost: 0.3380
Epoch: 031/050 Train Acc.: 94.16% | Validation Acc.: 73.70%
Time elapsed: 28.78 min
Epoch: 032/050 | Batch 000/383 | Cost: 0.1454
Epoch: 032/050 | Batch 120/383 | Cost: 0.1692
Epoch: 032/050 | Batch 240/383 | Cost: 0.0922
Epoch: 032/050 | Batch 360/383 | Cost: 0.1078
Epoch: 032/050 Train Acc.: 97.44% | Validation Acc.: 77.20%
Time elapsed: 29.71 min
Epoch: 033/050 | Batch 000/383 | Cost: 0.1039
Epoch: 033/050 | Batch 120/383 | Cost: 0.0764
Epoch: 033/050 | Batch 240/383 | Cost: 0.1007
Epoch: 033/050 | Batch 360/383 | Cost: 0.0518
Epoch: 033/050 Train Acc.: 97.78% | Validation Acc.: 76.20%
Time elapsed: 30.64 min
Epoch: 034/050 | Batch 000/383 | Cost: 0.0672
Epoch: 034/050 | Batch 120/383 | Cost: 0.0719
Epoch: 034/050 | Batch 240/383 | Cost: 0.1163
Epoch: 034/050 | Batch 360/383 | Cost: 0.1522
Epoch: 034/050 Train Acc.: 97.79% | Validation Acc.: 75.80%
Time elapsed: 31.58 min
Epoch: 035/050 | Batch 000/383 | Cost: 0.1177
Epoch: 035/050 | Batch 120/383 | Cost: 0.0802
Epoch: 035/050 | Batch 240/383 | Cost: 0.1278
Epoch: 035/050 | Batch 360/383 | Cost: 0.0857
Epoch: 035/050 Train Acc.: 97.90% | Validation Acc.: 76.40%
Time elapsed: 32.52 min
Epoch: 036/050 | Batch 000/383 | Cost: 0.0574
Epoch: 036/050 | Batch 120/383 | Cost: 0.0644
Epoch: 036/050 | Batch 240/383 | Cost: 0.1070
Epoch: 036/050 | Batch 360/383 | Cost: 0.0326
Epoch: 036/050 Train Acc.: 97.71% | Validation Acc.: 75.60%
Time elapsed: 33.43 min
Epoch: 037/050 | Batch 000/383 | Cost: 0.0406
Epoch: 037/050 | Batch 120/383 | Cost: 0.0697
Epoch: 037/050 | Batch 240/383 | Cost: 0.0651
Epoch: 037/050 | Batch 360/383 | Cost: 0.0908
Epoch: 037/050 Train Acc.: 97.94% | Validation Acc.: 77.20%
Time elapsed: 34.37 min
Epoch: 038/050 | Batch 000/383 | Cost: 0.0772
Epoch: 038/050 | Batch 120/383 | Cost: 0.0609
Epoch: 038/050 | Batch 240/383 | Cost: 0.1069
Epoch: 038/050 | Batch 360/383 | Cost: 0.0757
Epoch: 038/050 Train Acc.: 98.20% | Validation Acc.: 76.80%
Time elapsed: 35.31 min
Epoch: 039/050 | Batch 000/383 | Cost: 0.0176
Epoch: 039/050 | Batch 120/383 | Cost: 0.0788
Epoch: 039/050 | Batch 240/383 | Cost: 0.1234
Epoch: 039/050 | Batch 360/383 | Cost: 0.0626
Epoch: 039/050 Train Acc.: 97.88% | Validation Acc.: 76.70%
Time elapsed: 36.24 min
Epoch: 040/050 | Batch 000/383 | Cost: 0.1171
Epoch: 040/050 | Batch 120/383 | Cost: 0.0533
Epoch: 040/050 | Batch 240/383 | Cost: 0.1050
Epoch: 040/050 | Batch 360/383 | Cost: 0.0686
Epoch: 040/050 Train Acc.: 98.21% | Validation Acc.: 75.10%
Time elapsed: 37.17 min
Epoch: 041/050 | Batch 000/383 | Cost: 0.0568
Epoch: 041/050 | Batch 120/383 | Cost: 0.0160
Epoch: 041/050 | Batch 240/383 | Cost: 0.0414
Epoch: 041/050 | Batch 360/383 | Cost: 0.1025
Epoch: 041/050 Train Acc.: 98.27% | Validation Acc.: 77.00%
Time elapsed: 38.09 min
Epoch: 042/050 | Batch 000/383 | Cost: 0.0302
Epoch: 042/050 | Batch 120/383 | Cost: 0.0280
Epoch: 042/050 | Batch 240/383 | Cost: 0.0703
Epoch: 042/050 | Batch 360/383 | Cost: 0.0316
Epoch: 042/050 Train Acc.: 98.09% | Validation Acc.: 75.70%
Time elapsed: 39.02 min
Epoch: 043/050 | Batch 000/383 | Cost: 0.0589
Epoch: 043/050 | Batch 120/383 | Cost: 0.0294
Epoch: 043/050 | Batch 240/383 | Cost: 0.0760
Epoch: 043/050 | Batch 360/383 | Cost: 0.0859
Epoch: 043/050 Train Acc.: 98.11% | Validation Acc.: 75.90%
Time elapsed: 39.96 min
Epoch: 044/050 | Batch 000/383 | Cost: 0.0466
Epoch: 044/050 | Batch 120/383 | Cost: 0.0802
Epoch: 044/050 | Batch 240/383 | Cost: 0.0781
Epoch: 044/050 | Batch 360/383 | Cost: 0.0717
Epoch: 044/050 Train Acc.: 98.52% | Validation Acc.: 77.00%
Time elapsed: 40.89 min
Epoch: 045/050 | Batch 000/383 | Cost: 0.0566
Epoch: 045/050 | Batch 120/383 | Cost: 0.0935
Epoch: 045/050 | Batch 240/383 | Cost: 0.0267
Epoch: 045/050 | Batch 360/383 | Cost: 0.0600
Epoch: 045/050 Train Acc.: 98.18% | Validation Acc.: 77.90%
Time elapsed: 41.83 min
Epoch: 046/050 | Batch 000/383 | Cost: 0.0506
Epoch: 046/050 | Batch 120/383 | Cost: 0.0650
Epoch: 046/050 | Batch 240/383 | Cost: 0.0094
Epoch: 046/050 | Batch 360/383 | Cost: 0.0197
Epoch: 046/050 Train Acc.: 98.54% | Validation Acc.: 77.50%
Time elapsed: 42.76 min
Epoch: 047/050 | Batch 000/383 | Cost: 0.0694
Epoch: 047/050 | Batch 120/383 | Cost: 0.0149
Epoch: 047/050 | Batch 240/383 | Cost: 0.0602
Epoch: 047/050 | Batch 360/383 | Cost: 0.0592
Epoch: 047/050 Train Acc.: 98.01% | Validation Acc.: 77.80%
Time elapsed: 43.69 min
Epoch: 048/050 | Batch 000/383 | Cost: 0.0822
Epoch: 048/050 | Batch 120/383 | Cost: 0.1484
Epoch: 048/050 | Batch 240/383 | Cost: 0.0758
Epoch: 048/050 | Batch 360/383 | Cost: 0.0467
Epoch: 048/050 Train Acc.: 98.75% | Validation Acc.: 76.40%
Time elapsed: 44.62 min
Epoch: 049/050 | Batch 000/383 | Cost: 0.0427
Epoch: 049/050 | Batch 120/383 | Cost: 0.0119
Epoch: 049/050 | Batch 240/383 | Cost: 0.0414
Epoch: 049/050 | Batch 360/383 | Cost: 0.0574
Epoch: 049/050 Train Acc.: 98.89% | Validation Acc.: 77.20%
Time elapsed: 45.57 min
Epoch: 050/050 | Batch 000/383 | Cost: 0.0492
Epoch: 050/050 | Batch 120/383 | Cost: 0.0210
Epoch: 050/050 | Batch 240/383 | Cost: 0.0981
Epoch: 050/050 | Batch 360/383 | Cost: 0.0344
Epoch: 050/050 Train Acc.: 98.27% | Validation Acc.: 76.50%
Time elapsed: 46.50 min
Total Training Time: 46.50 min
with torch.set_grad_enabled(False): # save memory during inference
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader, device=DEVICE)))
Test accuracy: 75.15%
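For completeness, here is a minimal sketch (not part of the original notebook) of classifying a single test image with the trained model; index 0 is an arbitrary example:

# Minimal sketch: predict the class index of one test image.
model.eval()
with torch.no_grad():
    image, label = test_dataset[0]
    logits, probas = model(image.unsqueeze(0).to(DEVICE))  # add batch dimension
    predicted = torch.argmax(probas, dim=1).item()
print(f'Predicted class index: {predicted} | true class index: {label}')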
%watermark -iv
numpy     1.15.4
pandas    0.23.4
torch     1.1.0
PIL.Image 5.3.0