Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch
Sebastian Raschka CPython 3.7.3 IPython 7.6.1 torch 1.2.0
Certain types of deep neural networks, especially simple ones without any other form of regularization and with a relatively large number of layers, can suffer from the exploding gradient problem. The exploding gradient problem is a scenario where large loss gradients accumulate during backpropagation, eventually resulting in very large weight updates. As a consequence, the updates become unstable and fluctuate heavily, which often causes severe problems during training. This is a particular concern with unbounded activation functions such as ReLU.
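To make this concrete, here is a minimal standalone sketch (not part of the experiments below; the depth, layer width, and initialization scale are made up for illustration) showing how quickly gradients can blow up in a deep, unregularized ReLU network with a poorly chosen weight initialization:

import torch

torch.manual_seed(1)

# 20 hidden layers with a deliberately large weight initialization (std=2.)
layers = []
for _ in range(20):
    linear = torch.nn.Linear(64, 64)
    torch.nn.init.normal_(linear.weight, std=2.)
    layers.extend([linear, torch.nn.ReLU()])
net = torch.nn.Sequential(*layers)

net(torch.randn(8, 64)).sum().backward()

# the repeated multiplication with large-magnitude Jacobians during
# backpropagation makes this gradient norm astronomically large
print(layers[0].weight.grad.norm())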
One common, classic technique for avoiding exploding gradients is the so-called gradient clipping approach. Here, we simply set gradient values above or below a certain threshold to a user-specified min or max value. In PyTorch, there are several ways to perform gradient clipping.
1 - Basic Clipping
The simplest approach to gradient clipping in PyTorch is the torch.nn.utils.clip_grad_value_ function. For example, if we have instantiated a PyTorch model from a model class based on torch.nn.Module (as usual), we can add the following line of code in order to clip the gradients to the [-1, 1] range:
torch.nn.utils.clip_grad_value_(parameters=model.parameters(),
                                clip_value=1.)
However, notice that with this approach we can only specify a single clip value, which is used for both the upper and lower bound, such that gradients are clipped to the range [-clip_value, clip_value].
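To see what clip_grad_value_ does in isolation, consider the following small standalone sketch (the parameter and gradient values are made up for illustration):

import torch

w = torch.nn.Parameter(torch.tensor([1., 2., 3.]))
loss = (w * torch.tensor([-5., 0.5, 5.])).sum()
loss.backward()

print(w.grad)  # tensor([-5.0000,  0.5000,  5.0000])

# each gradient entry is clipped independently to [-1, 1]
torch.nn.utils.clip_grad_value_([w], clip_value=1.)
print(w.grad)  # tensor([-1.0000,  0.5000,  1.0000])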
2 - Custom Lower and Upper Bounds
If we want to clip the gradients to an asymmetric interval around zero, say [-0.1, 1.0], we can take a different approach and register a backward hook for each parameter:
for param in model.parameters():
    param.register_hook(lambda gradient: torch.clamp(gradient, -0.1, 1.0))
This backward hook only needs to be registered once, right after instantiating the model. Afterwards, whenever backward is called, the gradients are clamped on the fly during the backward pass, so they are already clipped by the time optimizer.step() is executed.
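A quick standalone check (with made-up toy values) confirms this behavior:

import torch

w = torch.nn.Parameter(torch.tensor([1., 2.]))
w.register_hook(lambda grad: torch.clamp(grad, -0.1, 1.0))

(w * torch.tensor([-5., 5.])).sum().backward()

# the raw gradient [-5., 5.] was clamped to [-0.1, 1.0] during backward
print(w.grad)  # tensor([-0.1000,  1.0000])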
3 - Norm Clipping
Lastly, there's a third clipping option, torch.nn.utils.clip_grad_norm_, which clips the gradients using a vector norm as follows:
torch.nn.utils.clip_grad_norm_(parameters, max_norm, norm_type=2)
Clips gradient norm of an iterable of parameters. The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place.
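In other words, whenever the total norm exceeds max_norm, all gradients are rescaled by the same factor max_norm / total_norm, so that, in contrast to value clipping, the direction of the overall gradient vector is preserved. A small standalone sketch with made-up values:

import torch

w = torch.nn.Parameter(torch.tensor([3., 4.]))
(w * torch.tensor([3., 4.])).sum().backward()

print(w.grad)  # tensor([3., 4.]), i.e., an L2 norm of 5

# clip_grad_norm_ rescales the gradients in-place and
# returns the total norm computed before the clipping
total_norm = torch.nn.utils.clip_grad_norm_([w], max_norm=1., norm_type=2)
print(total_norm)  # 5.0
print(w.grad)      # tensor([0.6000, 0.8000]), rescaled to (approximately) norm 1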
import time
import numpy as np
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
import torch.nn.functional as F
import torch
if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True
##########################
### SETTINGS
##########################
# Device
device = torch.device("cuda:2" if torch.cuda.is_available() else "cpu")
# Hyperparameters
random_seed = 1
learning_rate = 0.01
num_epochs = 10
batch_size = 64
# Architecture
num_features = 784
num_hidden_1 = 256
num_hidden_2 = 128
num_hidden_3 = 64
num_hidden_4 = 32
num_classes = 10
##########################
### MNIST DATASET
##########################
# Note transforms.ToTensor() scales input images
# to 0-1 range
train_dataset = datasets.MNIST(root='data',
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)

test_dataset = datasets.MNIST(root='data',
                              train=False,
                              transform=transforms.ToTensor())

train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=True)

test_loader = DataLoader(dataset=test_dataset,
                         batch_size=batch_size,
                         shuffle=False)
# Checking the dataset
for images, labels in train_loader:
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break
Image batch dimensions: torch.Size([64, 1, 28, 28]) Image label dimensions: torch.Size([64])
def compute_accuracy(net, data_loader):
    net.eval()
    correct_pred, num_examples = 0, 0
    with torch.no_grad():
        for features, targets in data_loader:
            features = features.view(-1, 28*28).to(device)
            targets = targets.to(device)
            logits, probas = net(features)
            _, predicted_labels = torch.max(probas, 1)
            num_examples += targets.size(0)
            correct_pred += (predicted_labels == targets).sum()
    return correct_pred.float()/num_examples * 100
##########################
### MODEL
##########################
class MultilayerPerceptron(torch.nn.Module):

    def __init__(self, num_features, num_classes):
        super(MultilayerPerceptron, self).__init__()

        ### 1st hidden layer
        self.linear_1 = torch.nn.Linear(num_features, num_hidden_1)

        ### 2nd hidden layer
        self.linear_2 = torch.nn.Linear(num_hidden_1, num_hidden_2)

        ### 3rd hidden layer
        self.linear_3 = torch.nn.Linear(num_hidden_2, num_hidden_3)

        ### 4th hidden layer
        self.linear_4 = torch.nn.Linear(num_hidden_3, num_hidden_4)

        ### Output layer
        self.linear_out = torch.nn.Linear(num_hidden_4, num_classes)

    def forward(self, x):
        out = self.linear_1(x)
        out = F.relu(out)
        out = self.linear_2(out)
        out = F.relu(out)
        out = self.linear_3(out)
        out = F.relu(out)
        out = self.linear_4(out)
        out = F.relu(out)
        logits = self.linear_out(out)
        # note: log_softmax yields log-probabilities; the argmax for
        # predicting class labels is unaffected by the log
        probas = F.log_softmax(logits, dim=1)
        return logits, probas
torch.manual_seed(random_seed)
model = MultilayerPerceptron(num_features=num_features,
                             num_classes=num_classes)
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
###################################################################
start_time = time.time()
for epoch in range(num_epochs):
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.view(-1, 28*28).to(device)
        targets = targets.to(device)

        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()

        cost.backward()

        ### UPDATE MODEL PARAMETERS
        #########################################################
        ### GRADIENT CLIPPING
        torch.nn.utils.clip_grad_value_(model.parameters(), 1.)
        #########################################################
        optimizer.step()

        ### LOGGING
        if not batch_idx % 50:
            print('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f'
                  % (epoch+1, num_epochs, batch_idx,
                     len(train_loader), cost))

    with torch.set_grad_enabled(False):
        print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
              epoch+1, num_epochs,
              compute_accuracy(model, train_loader)))

    print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))
Epoch: 001/010 | Batch 000/938 | Cost: 2.3054 Epoch: 001/010 | Batch 050/938 | Cost: 0.6427 Epoch: 001/010 | Batch 100/938 | Cost: 0.3220 Epoch: 001/010 | Batch 150/938 | Cost: 0.3492 Epoch: 001/010 | Batch 200/938 | Cost: 0.4505 Epoch: 001/010 | Batch 250/938 | Cost: 0.1510 Epoch: 001/010 | Batch 300/938 | Cost: 0.2062 Epoch: 001/010 | Batch 350/938 | Cost: 0.1287 Epoch: 001/010 | Batch 400/938 | Cost: 0.1714 Epoch: 001/010 | Batch 450/938 | Cost: 0.3522 Epoch: 001/010 | Batch 500/938 | Cost: 0.4268 Epoch: 001/010 | Batch 550/938 | Cost: 0.0133 Epoch: 001/010 | Batch 600/938 | Cost: 0.1868 Epoch: 001/010 | Batch 650/938 | Cost: 0.2312 Epoch: 001/010 | Batch 700/938 | Cost: 0.1471 Epoch: 001/010 | Batch 750/938 | Cost: 0.1321 Epoch: 001/010 | Batch 800/938 | Cost: 0.2776 Epoch: 001/010 | Batch 850/938 | Cost: 0.2223 Epoch: 001/010 | Batch 900/938 | Cost: 0.1812 Epoch: 001/010 training accuracy: 94.72% Time elapsed: 0.25 min Epoch: 002/010 | Batch 000/938 | Cost: 0.2080 Epoch: 002/010 | Batch 050/938 | Cost: 0.2177 Epoch: 002/010 | Batch 100/938 | Cost: 0.1090 Epoch: 002/010 | Batch 150/938 | Cost: 0.1225 Epoch: 002/010 | Batch 200/938 | Cost: 0.2514 Epoch: 002/010 | Batch 250/938 | Cost: 0.1093 Epoch: 002/010 | Batch 300/938 | Cost: 0.0626 Epoch: 002/010 | Batch 350/938 | Cost: 0.1242 Epoch: 002/010 | Batch 400/938 | Cost: 0.0168 Epoch: 002/010 | Batch 450/938 | Cost: 0.2678 Epoch: 002/010 | Batch 500/938 | Cost: 0.1761 Epoch: 002/010 | Batch 550/938 | Cost: 0.2607 Epoch: 002/010 | Batch 600/938 | Cost: 0.1324 Epoch: 002/010 | Batch 650/938 | Cost: 0.2334 Epoch: 002/010 | Batch 700/938 | Cost: 0.1510 Epoch: 002/010 | Batch 750/938 | Cost: 0.1456 Epoch: 002/010 | Batch 800/938 | Cost: 0.2882 Epoch: 002/010 | Batch 850/938 | Cost: 0.1485 Epoch: 002/010 | Batch 900/938 | Cost: 0.2007 Epoch: 002/010 training accuracy: 96.83% Time elapsed: 0.49 min Epoch: 003/010 | Batch 000/938 | Cost: 0.0550 Epoch: 003/010 | Batch 050/938 | Cost: 0.0555 Epoch: 003/010 | Batch 100/938 | Cost: 0.1040 Epoch: 003/010 | Batch 150/938 | Cost: 0.2290 Epoch: 003/010 | Batch 200/938 | Cost: 0.0506 Epoch: 003/010 | Batch 250/938 | Cost: 0.1028 Epoch: 003/010 | Batch 300/938 | Cost: 0.0381 Epoch: 003/010 | Batch 350/938 | Cost: 0.1593 Epoch: 003/010 | Batch 400/938 | Cost: 0.0637 Epoch: 003/010 | Batch 450/938 | Cost: 0.0127 Epoch: 003/010 | Batch 500/938 | Cost: 0.4391 Epoch: 003/010 | Batch 550/938 | Cost: 0.0110 Epoch: 003/010 | Batch 600/938 | Cost: 0.1959 Epoch: 003/010 | Batch 650/938 | Cost: 0.1020 Epoch: 003/010 | Batch 700/938 | Cost: 0.0206 Epoch: 003/010 | Batch 750/938 | Cost: 0.2747 Epoch: 003/010 | Batch 800/938 | Cost: 0.1192 Epoch: 003/010 | Batch 850/938 | Cost: 0.0115 Epoch: 003/010 | Batch 900/938 | Cost: 0.2476 Epoch: 003/010 training accuracy: 97.65% Time elapsed: 0.74 min Epoch: 004/010 | Batch 000/938 | Cost: 0.0875 Epoch: 004/010 | Batch 050/938 | Cost: 0.0335 Epoch: 004/010 | Batch 100/938 | Cost: 0.0530 Epoch: 004/010 | Batch 150/938 | Cost: 0.4291 Epoch: 004/010 | Batch 200/938 | Cost: 0.0634 Epoch: 004/010 | Batch 250/938 | Cost: 0.0437 Epoch: 004/010 | Batch 300/938 | Cost: 0.0547 Epoch: 004/010 | Batch 350/938 | Cost: 0.1602 Epoch: 004/010 | Batch 400/938 | Cost: 0.1071 Epoch: 004/010 | Batch 450/938 | Cost: 0.0351 Epoch: 004/010 | Batch 500/938 | Cost: 0.0712 Epoch: 004/010 | Batch 550/938 | Cost: 0.1261 Epoch: 004/010 | Batch 600/938 | Cost: 0.1212 Epoch: 004/010 | Batch 650/938 | Cost: 0.0802 Epoch: 004/010 | Batch 700/938 | Cost: 0.0844 Epoch: 004/010 | Batch 750/938 | Cost: 0.1496 
Epoch: 004/010 | Batch 800/938 | Cost: 0.1543 Epoch: 004/010 | Batch 850/938 | Cost: 0.0182 Epoch: 004/010 | Batch 900/938 | Cost: 0.0433 Epoch: 004/010 training accuracy: 97.08% Time elapsed: 0.98 min Epoch: 005/010 | Batch 000/938 | Cost: 0.1570 Epoch: 005/010 | Batch 050/938 | Cost: 0.0291 Epoch: 005/010 | Batch 100/938 | Cost: 0.0363 Epoch: 005/010 | Batch 150/938 | Cost: 0.0320 Epoch: 005/010 | Batch 200/938 | Cost: 0.0322 Epoch: 005/010 | Batch 250/938 | Cost: 0.0720 Epoch: 005/010 | Batch 300/938 | Cost: 0.0497 Epoch: 005/010 | Batch 350/938 | Cost: 0.1058 Epoch: 005/010 | Batch 400/938 | Cost: 0.2139 Epoch: 005/010 | Batch 450/938 | Cost: 0.0602 Epoch: 005/010 | Batch 500/938 | Cost: 0.0689 Epoch: 005/010 | Batch 550/938 | Cost: 0.1355 Epoch: 005/010 | Batch 600/938 | Cost: 0.1659 Epoch: 005/010 | Batch 650/938 | Cost: 0.1504 Epoch: 005/010 | Batch 700/938 | Cost: 0.0403 Epoch: 005/010 | Batch 750/938 | Cost: 0.3422 Epoch: 005/010 | Batch 800/938 | Cost: 0.3299 Epoch: 005/010 | Batch 850/938 | Cost: 0.2327 Epoch: 005/010 | Batch 900/938 | Cost: 0.0171 Epoch: 005/010 training accuracy: 97.51% Time elapsed: 1.23 min Epoch: 006/010 | Batch 000/938 | Cost: 0.0548 Epoch: 006/010 | Batch 050/938 | Cost: 0.2781 Epoch: 006/010 | Batch 100/938 | Cost: 0.0657 Epoch: 006/010 | Batch 150/938 | Cost: 0.0444 Epoch: 006/010 | Batch 200/938 | Cost: 0.0057 Epoch: 006/010 | Batch 250/938 | Cost: 0.1058 Epoch: 006/010 | Batch 300/938 | Cost: 0.1610 Epoch: 006/010 | Batch 350/938 | Cost: 0.0353 Epoch: 006/010 | Batch 400/938 | Cost: 0.2474 Epoch: 006/010 | Batch 450/938 | Cost: 0.1038 Epoch: 006/010 | Batch 500/938 | Cost: 0.2918 Epoch: 006/010 | Batch 550/938 | Cost: 0.1360 Epoch: 006/010 | Batch 600/938 | Cost: 0.1977 Epoch: 006/010 | Batch 650/938 | Cost: 0.0314 Epoch: 006/010 | Batch 700/938 | Cost: 0.0968 Epoch: 006/010 | Batch 750/938 | Cost: 0.2215 Epoch: 006/010 | Batch 800/938 | Cost: 0.0328 Epoch: 006/010 | Batch 850/938 | Cost: 0.2423 Epoch: 006/010 | Batch 900/938 | Cost: 0.1192 Epoch: 006/010 training accuracy: 97.47% Time elapsed: 1.48 min Epoch: 007/010 | Batch 000/938 | Cost: 0.0126 Epoch: 007/010 | Batch 050/938 | Cost: 0.0735 Epoch: 007/010 | Batch 100/938 | Cost: 0.2426 Epoch: 007/010 | Batch 150/938 | Cost: 0.0736 Epoch: 007/010 | Batch 200/938 | Cost: 0.1387 Epoch: 007/010 | Batch 250/938 | Cost: 0.2173 Epoch: 007/010 | Batch 300/938 | Cost: 0.0127 Epoch: 007/010 | Batch 350/938 | Cost: 0.1131 Epoch: 007/010 | Batch 400/938 | Cost: 0.2219 Epoch: 007/010 | Batch 450/938 | Cost: 0.0127 Epoch: 007/010 | Batch 500/938 | Cost: 0.0905 Epoch: 007/010 | Batch 550/938 | Cost: 0.2466 Epoch: 007/010 | Batch 600/938 | Cost: 0.0065 Epoch: 007/010 | Batch 650/938 | Cost: 0.1477 Epoch: 007/010 | Batch 700/938 | Cost: 0.0183 Epoch: 007/010 | Batch 750/938 | Cost: 0.0534 Epoch: 007/010 | Batch 800/938 | Cost: 0.1139 Epoch: 007/010 | Batch 850/938 | Cost: 0.1177 Epoch: 007/010 | Batch 900/938 | Cost: 0.0662 Epoch: 007/010 training accuracy: 97.74% Time elapsed: 1.72 min Epoch: 008/010 | Batch 000/938 | Cost: 0.0276 Epoch: 008/010 | Batch 050/938 | Cost: 0.1275 Epoch: 008/010 | Batch 100/938 | Cost: 0.2151 Epoch: 008/010 | Batch 150/938 | Cost: 0.0204 Epoch: 008/010 | Batch 200/938 | Cost: 0.2154 Epoch: 008/010 | Batch 250/938 | Cost: 0.0271 Epoch: 008/010 | Batch 300/938 | Cost: 0.0523 Epoch: 008/010 | Batch 350/938 | Cost: 0.1604 Epoch: 008/010 | Batch 400/938 | Cost: 0.0888 Epoch: 008/010 | Batch 450/938 | Cost: 0.0045 Epoch: 008/010 | Batch 500/938 | Cost: 0.0288 Epoch: 008/010 | Batch 550/938 | 
Cost: 0.1140 Epoch: 008/010 | Batch 600/938 | Cost: 0.0849 Epoch: 008/010 | Batch 650/938 | Cost: 0.0216 Epoch: 008/010 | Batch 700/938 | Cost: 0.0294 Epoch: 008/010 | Batch 750/938 | Cost: 0.0995 Epoch: 008/010 | Batch 800/938 | Cost: 0.1159 Epoch: 008/010 | Batch 850/938 | Cost: 0.1599 Epoch: 008/010 | Batch 900/938 | Cost: 0.1317 Epoch: 008/010 training accuracy: 98.29% Time elapsed: 1.97 min Epoch: 009/010 | Batch 000/938 | Cost: 0.1071 Epoch: 009/010 | Batch 050/938 | Cost: 0.0580 Epoch: 009/010 | Batch 100/938 | Cost: 0.1777 Epoch: 009/010 | Batch 150/938 | Cost: 0.2850 Epoch: 009/010 | Batch 200/938 | Cost: 0.1229 Epoch: 009/010 | Batch 250/938 | Cost: 0.0672 Epoch: 009/010 | Batch 300/938 | Cost: 0.2009 Epoch: 009/010 | Batch 350/938 | Cost: 0.0110 Epoch: 009/010 | Batch 400/938 | Cost: 0.2604 Epoch: 009/010 | Batch 450/938 | Cost: 0.0801 Epoch: 009/010 | Batch 500/938 | Cost: 0.0092 Epoch: 009/010 | Batch 550/938 | Cost: 0.1360 Epoch: 009/010 | Batch 600/938 | Cost: 0.0664 Epoch: 009/010 | Batch 650/938 | Cost: 0.0886 Epoch: 009/010 | Batch 700/938 | Cost: 0.0630 Epoch: 009/010 | Batch 750/938 | Cost: 0.0784 Epoch: 009/010 | Batch 800/938 | Cost: 0.1736 Epoch: 009/010 | Batch 850/938 | Cost: 0.0855 Epoch: 009/010 | Batch 900/938 | Cost: 0.2815 Epoch: 009/010 training accuracy: 97.74% Time elapsed: 2.21 min Epoch: 010/010 | Batch 000/938 | Cost: 0.0024 Epoch: 010/010 | Batch 050/938 | Cost: 0.0497 Epoch: 010/010 | Batch 100/938 | Cost: 0.0888 Epoch: 010/010 | Batch 150/938 | Cost: 0.1719 Epoch: 010/010 | Batch 200/938 | Cost: 0.1729 Epoch: 010/010 | Batch 250/938 | Cost: 0.0543 Epoch: 010/010 | Batch 300/938 | Cost: 0.3770 Epoch: 010/010 | Batch 350/938 | Cost: 0.0270 Epoch: 010/010 | Batch 400/938 | Cost: 0.1400 Epoch: 010/010 | Batch 450/938 | Cost: 0.0526 Epoch: 010/010 | Batch 500/938 | Cost: 0.1984 Epoch: 010/010 | Batch 550/938 | Cost: 0.1677 Epoch: 010/010 | Batch 600/938 | Cost: 0.0550 Epoch: 010/010 | Batch 650/938 | Cost: 0.0294 Epoch: 010/010 | Batch 700/938 | Cost: 0.0465 Epoch: 010/010 | Batch 750/938 | Cost: 0.1103 Epoch: 010/010 | Batch 800/938 | Cost: 0.0272 Epoch: 010/010 | Batch 850/938 | Cost: 0.1376 Epoch: 010/010 | Batch 900/938 | Cost: 0.0279 Epoch: 010/010 training accuracy: 98.09% Time elapsed: 2.46 min Total Training Time: 2.46 min
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))
Test accuracy: 96.80%
torch.manual_seed(random_seed)
model = MultilayerPerceptron(num_features=num_features,
                             num_classes=num_classes)

#########################################################
### GRADIENT CLIPPING (via backward hooks, method 2)
for p in model.parameters():
    p.register_hook(lambda grad: torch.clamp(grad, -0.1, 1.0))
#########################################################

model = model.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
###################################################################
start_time = time.time()
for epoch in range(num_epochs):
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.view(-1, 28*28).to(device)
        targets = targets.to(device)

        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()

        cost.backward()

        ### UPDATE MODEL PARAMETERS
        optimizer.step()

        ### LOGGING
        if not batch_idx % 50:
            print('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f'
                  % (epoch+1, num_epochs, batch_idx,
                     len(train_loader), cost))

    with torch.set_grad_enabled(False):
        print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
              epoch+1, num_epochs,
              compute_accuracy(model, train_loader)))

    print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))
Epoch: 001/010 | Batch 000/938 | Cost: 2.3054 Epoch: 001/010 | Batch 050/938 | Cost: 0.5977 Epoch: 001/010 | Batch 100/938 | Cost: 0.4369 Epoch: 001/010 | Batch 150/938 | Cost: 0.3053 Epoch: 001/010 | Batch 200/938 | Cost: 0.3661 Epoch: 001/010 | Batch 250/938 | Cost: 0.1908 Epoch: 001/010 | Batch 300/938 | Cost: 0.2845 Epoch: 001/010 | Batch 350/938 | Cost: 0.1928 Epoch: 001/010 | Batch 400/938 | Cost: 0.2715 Epoch: 001/010 | Batch 450/938 | Cost: 0.2338 Epoch: 001/010 | Batch 500/938 | Cost: 0.3923 Epoch: 001/010 | Batch 550/938 | Cost: 0.0973 Epoch: 001/010 | Batch 600/938 | Cost: 0.3142 Epoch: 001/010 | Batch 650/938 | Cost: 0.5024 Epoch: 001/010 | Batch 700/938 | Cost: 0.1549 Epoch: 001/010 | Batch 750/938 | Cost: 0.1906 Epoch: 001/010 | Batch 800/938 | Cost: 0.3325 Epoch: 001/010 | Batch 850/938 | Cost: 0.2060 Epoch: 001/010 | Batch 900/938 | Cost: 0.1301 Epoch: 001/010 training accuracy: 94.76% Time elapsed: 0.24 min Epoch: 002/010 | Batch 000/938 | Cost: 0.2553 Epoch: 002/010 | Batch 050/938 | Cost: 0.1858 Epoch: 002/010 | Batch 100/938 | Cost: 0.2514 Epoch: 002/010 | Batch 150/938 | Cost: 0.1413 Epoch: 002/010 | Batch 200/938 | Cost: 0.3071 Epoch: 002/010 | Batch 250/938 | Cost: 0.6133 Epoch: 002/010 | Batch 300/938 | Cost: 0.1657 Epoch: 002/010 | Batch 350/938 | Cost: 0.0828 Epoch: 002/010 | Batch 400/938 | Cost: 0.0733 Epoch: 002/010 | Batch 450/938 | Cost: 0.3012 Epoch: 002/010 | Batch 500/938 | Cost: 0.1857 Epoch: 002/010 | Batch 550/938 | Cost: 0.3618 Epoch: 002/010 | Batch 600/938 | Cost: 0.0777 Epoch: 002/010 | Batch 650/938 | Cost: 0.2648 Epoch: 002/010 | Batch 700/938 | Cost: 0.0242 Epoch: 002/010 | Batch 750/938 | Cost: 0.1050 Epoch: 002/010 | Batch 800/938 | Cost: 0.2148 Epoch: 002/010 | Batch 850/938 | Cost: 0.0817 Epoch: 002/010 | Batch 900/938 | Cost: 0.1354 Epoch: 002/010 training accuracy: 97.04% Time elapsed: 0.49 min Epoch: 003/010 | Batch 000/938 | Cost: 0.1346 Epoch: 003/010 | Batch 050/938 | Cost: 0.0825 Epoch: 003/010 | Batch 100/938 | Cost: 0.0771 Epoch: 003/010 | Batch 150/938 | Cost: 0.2360 Epoch: 003/010 | Batch 200/938 | Cost: 0.0730 Epoch: 003/010 | Batch 250/938 | Cost: 0.1499 Epoch: 003/010 | Batch 300/938 | Cost: 0.0410 Epoch: 003/010 | Batch 350/938 | Cost: 0.2091 Epoch: 003/010 | Batch 400/938 | Cost: 0.0738 Epoch: 003/010 | Batch 450/938 | Cost: 0.0889 Epoch: 003/010 | Batch 500/938 | Cost: 0.3630 Epoch: 003/010 | Batch 550/938 | Cost: 0.0312 Epoch: 003/010 | Batch 600/938 | Cost: 0.0782 Epoch: 003/010 | Batch 650/938 | Cost: 0.1753 Epoch: 003/010 | Batch 700/938 | Cost: 0.0286 Epoch: 003/010 | Batch 750/938 | Cost: 0.2166 Epoch: 003/010 | Batch 800/938 | Cost: 0.0627 Epoch: 003/010 | Batch 850/938 | Cost: 0.0204 Epoch: 003/010 | Batch 900/938 | Cost: 0.2867 Epoch: 003/010 training accuracy: 96.72% Time elapsed: 0.73 min Epoch: 004/010 | Batch 000/938 | Cost: 0.0207 Epoch: 004/010 | Batch 050/938 | Cost: 0.0499 Epoch: 004/010 | Batch 100/938 | Cost: 0.1858 Epoch: 004/010 | Batch 150/938 | Cost: 0.2015 Epoch: 004/010 | Batch 200/938 | Cost: 0.0285 Epoch: 004/010 | Batch 250/938 | Cost: 0.0029 Epoch: 004/010 | Batch 300/938 | Cost: 0.1746 Epoch: 004/010 | Batch 350/938 | Cost: 0.3149 Epoch: 004/010 | Batch 400/938 | Cost: 0.1773 Epoch: 004/010 | Batch 450/938 | Cost: 0.1013 Epoch: 004/010 | Batch 500/938 | Cost: 0.1665 Epoch: 004/010 | Batch 550/938 | Cost: 0.1540 Epoch: 004/010 | Batch 600/938 | Cost: 0.1822 Epoch: 004/010 | Batch 650/938 | Cost: 0.1506 Epoch: 004/010 | Batch 700/938 | Cost: 0.0224 Epoch: 004/010 | Batch 750/938 | Cost: 0.1400 
Epoch: 004/010 | Batch 800/938 | Cost: 0.2262 Epoch: 004/010 | Batch 850/938 | Cost: 0.0679 Epoch: 004/010 | Batch 900/938 | Cost: 0.0020 Epoch: 004/010 training accuracy: 97.63% Time elapsed: 0.98 min Epoch: 005/010 | Batch 000/938 | Cost: 0.0508 Epoch: 005/010 | Batch 050/938 | Cost: 0.0585 Epoch: 005/010 | Batch 100/938 | Cost: 0.1441 Epoch: 005/010 | Batch 150/938 | Cost: 0.0862 Epoch: 005/010 | Batch 200/938 | Cost: 0.0284 Epoch: 005/010 | Batch 250/938 | Cost: 0.0977 Epoch: 005/010 | Batch 300/938 | Cost: 0.0565 Epoch: 005/010 | Batch 350/938 | Cost: 0.0272 Epoch: 005/010 | Batch 400/938 | Cost: 0.2603 Epoch: 005/010 | Batch 450/938 | Cost: 0.1202 Epoch: 005/010 | Batch 500/938 | Cost: 0.0612 Epoch: 005/010 | Batch 550/938 | Cost: 0.0833 Epoch: 005/010 | Batch 600/938 | Cost: 0.1666 Epoch: 005/010 | Batch 650/938 | Cost: 0.2642 Epoch: 005/010 | Batch 700/938 | Cost: 0.1884 Epoch: 005/010 | Batch 750/938 | Cost: 0.1608 Epoch: 005/010 | Batch 800/938 | Cost: 0.1029 Epoch: 005/010 | Batch 850/938 | Cost: 0.1178 Epoch: 005/010 | Batch 900/938 | Cost: 0.0709 Epoch: 005/010 training accuracy: 97.58% Time elapsed: 1.23 min Epoch: 006/010 | Batch 000/938 | Cost: 0.0642 Epoch: 006/010 | Batch 050/938 | Cost: 0.3518 Epoch: 006/010 | Batch 100/938 | Cost: 0.1134 Epoch: 006/010 | Batch 150/938 | Cost: 0.0821 Epoch: 006/010 | Batch 200/938 | Cost: 0.0645 Epoch: 006/010 | Batch 250/938 | Cost: 0.0486 Epoch: 006/010 | Batch 300/938 | Cost: 0.0972 Epoch: 006/010 | Batch 350/938 | Cost: 0.2861 Epoch: 006/010 | Batch 400/938 | Cost: 0.1126 Epoch: 006/010 | Batch 450/938 | Cost: 0.1479 Epoch: 006/010 | Batch 500/938 | Cost: 0.2181 Epoch: 006/010 | Batch 550/938 | Cost: 0.0674 Epoch: 006/010 | Batch 600/938 | Cost: 0.0705 Epoch: 006/010 | Batch 650/938 | Cost: 0.1032 Epoch: 006/010 | Batch 700/938 | Cost: 0.1529 Epoch: 006/010 | Batch 750/938 | Cost: 0.2484 Epoch: 006/010 | Batch 800/938 | Cost: 0.0432 Epoch: 006/010 | Batch 850/938 | Cost: 0.0821 Epoch: 006/010 | Batch 900/938 | Cost: 0.1152 Epoch: 006/010 training accuracy: 97.09% Time elapsed: 1.47 min Epoch: 007/010 | Batch 000/938 | Cost: 0.0418 Epoch: 007/010 | Batch 050/938 | Cost: 0.0527 Epoch: 007/010 | Batch 100/938 | Cost: 0.3778 Epoch: 007/010 | Batch 150/938 | Cost: 0.1742 Epoch: 007/010 | Batch 200/938 | Cost: 0.0725 Epoch: 007/010 | Batch 250/938 | Cost: 0.1187 Epoch: 007/010 | Batch 300/938 | Cost: 0.0980 Epoch: 007/010 | Batch 350/938 | Cost: 0.0077 Epoch: 007/010 | Batch 400/938 | Cost: 0.1274 Epoch: 007/010 | Batch 450/938 | Cost: 0.1387 Epoch: 007/010 | Batch 500/938 | Cost: 0.1959 Epoch: 007/010 | Batch 550/938 | Cost: 0.0874 Epoch: 007/010 | Batch 600/938 | Cost: 0.2559 Epoch: 007/010 | Batch 650/938 | Cost: 0.1413 Epoch: 007/010 | Batch 700/938 | Cost: 0.1285 Epoch: 007/010 | Batch 750/938 | Cost: 0.1931 Epoch: 007/010 | Batch 800/938 | Cost: 0.1151 Epoch: 007/010 | Batch 850/938 | Cost: 0.1889 Epoch: 007/010 | Batch 900/938 | Cost: 0.5518 Epoch: 007/010 training accuracy: 86.62% Time elapsed: 1.72 min Epoch: 008/010 | Batch 000/938 | Cost: 0.3283 Epoch: 008/010 | Batch 050/938 | Cost: 0.1818 Epoch: 008/010 | Batch 100/938 | Cost: 0.1827 Epoch: 008/010 | Batch 150/938 | Cost: 0.0844 Epoch: 008/010 | Batch 200/938 | Cost: 0.4017 Epoch: 008/010 | Batch 250/938 | Cost: 0.0129 Epoch: 008/010 | Batch 300/938 | Cost: 0.0155 Epoch: 008/010 | Batch 350/938 | Cost: 0.1844 Epoch: 008/010 | Batch 400/938 | Cost: 0.1146 Epoch: 008/010 | Batch 450/938 | Cost: 0.0566 Epoch: 008/010 | Batch 500/938 | Cost: 0.0895 Epoch: 008/010 | Batch 550/938 | 
Cost: 0.1851 Epoch: 008/010 | Batch 600/938 | Cost: 0.1134 Epoch: 008/010 | Batch 650/938 | Cost: 0.0838 Epoch: 008/010 | Batch 700/938 | Cost: 0.1157 Epoch: 008/010 | Batch 750/938 | Cost: 0.2275 Epoch: 008/010 | Batch 800/938 | Cost: 0.5753 Epoch: 008/010 | Batch 850/938 | Cost: 0.8735 Epoch: 008/010 | Batch 900/938 | Cost: 0.7114 Epoch: 008/010 training accuracy: 85.51% Time elapsed: 1.97 min Epoch: 009/010 | Batch 000/938 | Cost: 0.4851 Epoch: 009/010 | Batch 050/938 | Cost: 0.4595 Epoch: 009/010 | Batch 100/938 | Cost: 0.1939 Epoch: 009/010 | Batch 150/938 | Cost: 0.1813 Epoch: 009/010 | Batch 200/938 | Cost: 0.4969 Epoch: 009/010 | Batch 250/938 | Cost: 0.4874 Epoch: 009/010 | Batch 300/938 | Cost: 0.1605 Epoch: 009/010 | Batch 350/938 | Cost: 0.0899 Epoch: 009/010 | Batch 400/938 | Cost: 0.3318 Epoch: 009/010 | Batch 450/938 | Cost: 0.0524 Epoch: 009/010 | Batch 500/938 | Cost: 0.0215 Epoch: 009/010 | Batch 550/938 | Cost: 0.0997 Epoch: 009/010 | Batch 600/938 | Cost: 0.0541 Epoch: 009/010 | Batch 650/938 | Cost: 0.3480 Epoch: 009/010 | Batch 700/938 | Cost: 0.0736 Epoch: 009/010 | Batch 750/938 | Cost: 0.1682 Epoch: 009/010 | Batch 800/938 | Cost: 0.2877 Epoch: 009/010 | Batch 850/938 | Cost: 0.0539 Epoch: 009/010 | Batch 900/938 | Cost: 0.2708 Epoch: 009/010 training accuracy: 95.67% Time elapsed: 2.21 min Epoch: 010/010 | Batch 000/938 | Cost: 0.0531 Epoch: 010/010 | Batch 050/938 | Cost: 0.0453 Epoch: 010/010 | Batch 100/938 | Cost: 1.8852 Epoch: 010/010 | Batch 150/938 | Cost: 0.1455 Epoch: 010/010 | Batch 200/938 | Cost: 0.2089 Epoch: 010/010 | Batch 250/938 | Cost: 0.0155 Epoch: 010/010 | Batch 300/938 | Cost: 0.9183 Epoch: 010/010 | Batch 350/938 | Cost: 0.2231 Epoch: 010/010 | Batch 400/938 | Cost: 0.3704 Epoch: 010/010 | Batch 450/938 | Cost: 0.1086 Epoch: 010/010 | Batch 500/938 | Cost: 0.3775 Epoch: 010/010 | Batch 550/938 | Cost: 0.4196 Epoch: 010/010 | Batch 600/938 | Cost: 0.2836 Epoch: 010/010 | Batch 650/938 | Cost: 0.1170 Epoch: 010/010 | Batch 700/938 | Cost: 0.2631 Epoch: 010/010 | Batch 750/938 | Cost: 0.1400 Epoch: 010/010 | Batch 800/938 | Cost: 0.1048 Epoch: 010/010 | Batch 850/938 | Cost: 0.7937 Epoch: 010/010 | Batch 900/938 | Cost: 0.2107 Epoch: 010/010 training accuracy: 87.98% Time elapsed: 2.46 min Total Training Time: 2.46 min
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))
Test accuracy: 86.94%

Note that the test accuracy is noticeably lower than in the previous run. Clamping the gradients to the asymmetric interval [-0.1, 1.0] systematically biases the weight updates (negative gradient components are cut off much more aggressively than positive ones), which likely explains the unstable training behavior and the drop in accuracy here.
torch.manual_seed(random_seed)
model = MultilayerPerceptron(num_features=num_features,
                             num_classes=num_classes)
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
###################################################################
start_time = time.time()
for epoch in range(num_epochs):
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.view(-1, 28*28).to(device)
        targets = targets.to(device)

        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = F.cross_entropy(logits, targets)
        optimizer.zero_grad()

        cost.backward()

        ### UPDATE MODEL PARAMETERS
        #########################################################
        ### GRADIENT CLIPPING
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1., norm_type=2)
        #########################################################
        optimizer.step()

        ### LOGGING
        if not batch_idx % 50:
            print('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f'
                  % (epoch+1, num_epochs, batch_idx,
                     len(train_loader), cost))

    with torch.set_grad_enabled(False):
        print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
              epoch+1, num_epochs,
              compute_accuracy(model, train_loader)))

    print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))

print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))
Epoch: 001/010 | Batch 000/938 | Cost: 2.3054 Epoch: 001/010 | Batch 050/938 | Cost: 0.5121 Epoch: 001/010 | Batch 100/938 | Cost: 0.3424 Epoch: 001/010 | Batch 150/938 | Cost: 0.2765 Epoch: 001/010 | Batch 200/938 | Cost: 0.5126 Epoch: 001/010 | Batch 250/938 | Cost: 0.1481 Epoch: 001/010 | Batch 300/938 | Cost: 0.2240 Epoch: 001/010 | Batch 350/938 | Cost: 0.1948 Epoch: 001/010 | Batch 400/938 | Cost: 0.0655 Epoch: 001/010 | Batch 450/938 | Cost: 0.1893 Epoch: 001/010 | Batch 500/938 | Cost: 0.4133 Epoch: 001/010 | Batch 550/938 | Cost: 0.0375 Epoch: 001/010 | Batch 600/938 | Cost: 0.2691 Epoch: 001/010 | Batch 650/938 | Cost: 0.3342 Epoch: 001/010 | Batch 700/938 | Cost: 0.1662 Epoch: 001/010 | Batch 750/938 | Cost: 0.0702 Epoch: 001/010 | Batch 800/938 | Cost: 0.4246 Epoch: 001/010 | Batch 850/938 | Cost: 0.2282 Epoch: 001/010 | Batch 900/938 | Cost: 0.0459 Epoch: 001/010 training accuracy: 94.99% Time elapsed: 0.25 min Epoch: 002/010 | Batch 000/938 | Cost: 0.2188 Epoch: 002/010 | Batch 050/938 | Cost: 0.3042 Epoch: 002/010 | Batch 100/938 | Cost: 0.1391 Epoch: 002/010 | Batch 150/938 | Cost: 0.1453 Epoch: 002/010 | Batch 200/938 | Cost: 0.3031 Epoch: 002/010 | Batch 250/938 | Cost: 0.1398 Epoch: 002/010 | Batch 300/938 | Cost: 0.0868 Epoch: 002/010 | Batch 350/938 | Cost: 0.1679 Epoch: 002/010 | Batch 400/938 | Cost: 0.0480 Epoch: 002/010 | Batch 450/938 | Cost: 0.2823 Epoch: 002/010 | Batch 500/938 | Cost: 0.2307 Epoch: 002/010 | Batch 550/938 | Cost: 0.1610 Epoch: 002/010 | Batch 600/938 | Cost: 0.0972 Epoch: 002/010 | Batch 650/938 | Cost: 0.3210 Epoch: 002/010 | Batch 700/938 | Cost: 0.0697 Epoch: 002/010 | Batch 750/938 | Cost: 0.0879 Epoch: 002/010 | Batch 800/938 | Cost: 0.2113 Epoch: 002/010 | Batch 850/938 | Cost: 0.2496 Epoch: 002/010 | Batch 900/938 | Cost: 0.2453 Epoch: 002/010 training accuracy: 96.15% Time elapsed: 0.49 min Epoch: 003/010 | Batch 000/938 | Cost: 0.1779 Epoch: 003/010 | Batch 050/938 | Cost: 0.0618 Epoch: 003/010 | Batch 100/938 | Cost: 0.0570 Epoch: 003/010 | Batch 150/938 | Cost: 0.2510 Epoch: 003/010 | Batch 200/938 | Cost: 0.1193 Epoch: 003/010 | Batch 250/938 | Cost: 0.2530 Epoch: 003/010 | Batch 300/938 | Cost: 0.1220 Epoch: 003/010 | Batch 350/938 | Cost: 0.2401 Epoch: 003/010 | Batch 400/938 | Cost: 0.0520 Epoch: 003/010 | Batch 450/938 | Cost: 0.0262 Epoch: 003/010 | Batch 500/938 | Cost: 0.2961 Epoch: 003/010 | Batch 550/938 | Cost: 0.0030 Epoch: 003/010 | Batch 600/938 | Cost: 0.1998 Epoch: 003/010 | Batch 650/938 | Cost: 0.1968 Epoch: 003/010 | Batch 700/938 | Cost: 0.0499 Epoch: 003/010 | Batch 750/938 | Cost: 0.1742 Epoch: 003/010 | Batch 800/938 | Cost: 0.1034 Epoch: 003/010 | Batch 850/938 | Cost: 0.0437 Epoch: 003/010 | Batch 900/938 | Cost: 0.1414 Epoch: 003/010 training accuracy: 97.30% Time elapsed: 0.74 min Epoch: 004/010 | Batch 000/938 | Cost: 0.1098 Epoch: 004/010 | Batch 050/938 | Cost: 0.0060 Epoch: 004/010 | Batch 100/938 | Cost: 0.3551 Epoch: 004/010 | Batch 150/938 | Cost: 0.3143 Epoch: 004/010 | Batch 200/938 | Cost: 0.0527 Epoch: 004/010 | Batch 250/938 | Cost: 0.0204 Epoch: 004/010 | Batch 300/938 | Cost: 0.0289 Epoch: 004/010 | Batch 350/938 | Cost: 0.2386 Epoch: 004/010 | Batch 400/938 | Cost: 0.0694 Epoch: 004/010 | Batch 450/938 | Cost: 0.1200 Epoch: 004/010 | Batch 500/938 | Cost: 0.0797 Epoch: 004/010 | Batch 550/938 | Cost: 0.0891 Epoch: 004/010 | Batch 600/938 | Cost: 0.3322 Epoch: 004/010 | Batch 650/938 | Cost: 0.1640 Epoch: 004/010 | Batch 700/938 | Cost: 0.1170 Epoch: 004/010 | Batch 750/938 | Cost: 0.2028 
Epoch: 004/010 | Batch 800/938 | Cost: 0.2188 Epoch: 004/010 | Batch 850/938 | Cost: 0.0575 Epoch: 004/010 | Batch 900/938 | Cost: 0.0180 Epoch: 004/010 training accuracy: 96.86% Time elapsed: 0.98 min Epoch: 005/010 | Batch 000/938 | Cost: 0.0779 Epoch: 005/010 | Batch 050/938 | Cost: 0.1183 Epoch: 005/010 | Batch 100/938 | Cost: 0.1184 Epoch: 005/010 | Batch 150/938 | Cost: 0.0815 Epoch: 005/010 | Batch 200/938 | Cost: 0.0691 Epoch: 005/010 | Batch 250/938 | Cost: 0.0784 Epoch: 005/010 | Batch 300/938 | Cost: 0.1464 Epoch: 005/010 | Batch 350/938 | Cost: 0.1488 Epoch: 005/010 | Batch 400/938 | Cost: 0.2636 Epoch: 005/010 | Batch 450/938 | Cost: 0.0839 Epoch: 005/010 | Batch 500/938 | Cost: 0.1343 Epoch: 005/010 | Batch 550/938 | Cost: 0.0514 Epoch: 005/010 | Batch 600/938 | Cost: 0.1802 Epoch: 005/010 | Batch 650/938 | Cost: 0.0681 Epoch: 005/010 | Batch 700/938 | Cost: 0.0986 Epoch: 005/010 | Batch 750/938 | Cost: 0.0930 Epoch: 005/010 | Batch 800/938 | Cost: 0.1829 Epoch: 005/010 | Batch 850/938 | Cost: 0.1694 Epoch: 005/010 | Batch 900/938 | Cost: 0.0440 Epoch: 005/010 training accuracy: 97.22% Time elapsed: 1.22 min Epoch: 006/010 | Batch 000/938 | Cost: 0.0142 Epoch: 006/010 | Batch 050/938 | Cost: 0.3528 Epoch: 006/010 | Batch 100/938 | Cost: 0.0710 Epoch: 006/010 | Batch 150/938 | Cost: 0.0553 Epoch: 006/010 | Batch 200/938 | Cost: 0.0084 Epoch: 006/010 | Batch 250/938 | Cost: 0.1178 Epoch: 006/010 | Batch 300/938 | Cost: 0.1271 Epoch: 006/010 | Batch 350/938 | Cost: 0.0404 Epoch: 006/010 | Batch 400/938 | Cost: 0.1435 Epoch: 006/010 | Batch 450/938 | Cost: 0.1568 Epoch: 006/010 | Batch 500/938 | Cost: 0.2100 Epoch: 006/010 | Batch 550/938 | Cost: 0.0019 Epoch: 006/010 | Batch 600/938 | Cost: 0.1721 Epoch: 006/010 | Batch 650/938 | Cost: 0.0943 Epoch: 006/010 | Batch 700/938 | Cost: 0.0913 Epoch: 006/010 | Batch 750/938 | Cost: 0.1211 Epoch: 006/010 | Batch 800/938 | Cost: 0.0890 Epoch: 006/010 | Batch 850/938 | Cost: 0.0390 Epoch: 006/010 | Batch 900/938 | Cost: 0.0521 Epoch: 006/010 training accuracy: 97.79% Time elapsed: 1.47 min Epoch: 007/010 | Batch 000/938 | Cost: 0.0059 Epoch: 007/010 | Batch 050/938 | Cost: 0.0371 Epoch: 007/010 | Batch 100/938 | Cost: 0.2702 Epoch: 007/010 | Batch 150/938 | Cost: 0.1142 Epoch: 007/010 | Batch 200/938 | Cost: 0.0900 Epoch: 007/010 | Batch 250/938 | Cost: 0.1922 Epoch: 007/010 | Batch 300/938 | Cost: 0.0062 Epoch: 007/010 | Batch 350/938 | Cost: 0.0435 Epoch: 007/010 | Batch 400/938 | Cost: 0.0503 Epoch: 007/010 | Batch 450/938 | Cost: 0.1411 Epoch: 007/010 | Batch 500/938 | Cost: 0.1547 Epoch: 007/010 | Batch 550/938 | Cost: 0.1858 Epoch: 007/010 | Batch 600/938 | Cost: 0.0108 Epoch: 007/010 | Batch 650/938 | Cost: 0.0569 Epoch: 007/010 | Batch 700/938 | Cost: 0.0254 Epoch: 007/010 | Batch 750/938 | Cost: 0.0635 Epoch: 007/010 | Batch 800/938 | Cost: 0.2539 Epoch: 007/010 | Batch 850/938 | Cost: 0.1338 Epoch: 007/010 | Batch 900/938 | Cost: 0.3336 Epoch: 007/010 training accuracy: 98.25% Time elapsed: 1.71 min Epoch: 008/010 | Batch 000/938 | Cost: 0.0215 Epoch: 008/010 | Batch 050/938 | Cost: 0.2800 Epoch: 008/010 | Batch 100/938 | Cost: 0.2627 Epoch: 008/010 | Batch 150/938 | Cost: 0.0538 Epoch: 008/010 | Batch 200/938 | Cost: 0.2164 Epoch: 008/010 | Batch 250/938 | Cost: 0.0025 Epoch: 008/010 | Batch 300/938 | Cost: 0.0021 Epoch: 008/010 | Batch 350/938 | Cost: 0.1489 Epoch: 008/010 | Batch 400/938 | Cost: 0.0997 Epoch: 008/010 | Batch 450/938 | Cost: 0.0055 Epoch: 008/010 | Batch 500/938 | Cost: 0.0181 Epoch: 008/010 | Batch 550/938 | 
Cost: 0.1672 Epoch: 008/010 | Batch 600/938 | Cost: 0.0538 Epoch: 008/010 | Batch 650/938 | Cost: 0.0842 Epoch: 008/010 | Batch 700/938 | Cost: 0.0941 Epoch: 008/010 | Batch 750/938 | Cost: 0.0171 Epoch: 008/010 | Batch 800/938 | Cost: 0.0638 Epoch: 008/010 | Batch 850/938 | Cost: 0.2507 Epoch: 008/010 | Batch 900/938 | Cost: 0.0568 Epoch: 008/010 training accuracy: 98.31% Time elapsed: 1.96 min Epoch: 009/010 | Batch 000/938 | Cost: 0.0844 Epoch: 009/010 | Batch 050/938 | Cost: 0.1087 Epoch: 009/010 | Batch 100/938 | Cost: 0.0584 Epoch: 009/010 | Batch 150/938 | Cost: 0.0544 Epoch: 009/010 | Batch 200/938 | Cost: 0.0352 Epoch: 009/010 | Batch 250/938 | Cost: 0.0189 Epoch: 009/010 | Batch 300/938 | Cost: 0.0356 Epoch: 009/010 | Batch 350/938 | Cost: 0.1357 Epoch: 009/010 | Batch 400/938 | Cost: 0.2133 Epoch: 009/010 | Batch 450/938 | Cost: 0.0081 Epoch: 009/010 | Batch 500/938 | Cost: 0.0710 Epoch: 009/010 | Batch 550/938 | Cost: 0.0652 Epoch: 009/010 | Batch 600/938 | Cost: 0.0136 Epoch: 009/010 | Batch 650/938 | Cost: 0.0772 Epoch: 009/010 | Batch 700/938 | Cost: 0.0744 Epoch: 009/010 | Batch 750/938 | Cost: 0.0388 Epoch: 009/010 | Batch 800/938 | Cost: 0.0208 Epoch: 009/010 | Batch 850/938 | Cost: 0.0114 Epoch: 009/010 | Batch 900/938 | Cost: 0.0706 Epoch: 009/010 training accuracy: 97.76% Time elapsed: 2.20 min Epoch: 010/010 | Batch 000/938 | Cost: 0.0773 Epoch: 010/010 | Batch 050/938 | Cost: 0.0362 Epoch: 010/010 | Batch 100/938 | Cost: 0.0406 Epoch: 010/010 | Batch 150/938 | Cost: 0.0900 Epoch: 010/010 | Batch 200/938 | Cost: 0.3629 Epoch: 010/010 | Batch 250/938 | Cost: 0.0016 Epoch: 010/010 | Batch 300/938 | Cost: 0.0314 Epoch: 010/010 | Batch 350/938 | Cost: 0.0677 Epoch: 010/010 | Batch 400/938 | Cost: 0.0821 Epoch: 010/010 | Batch 450/938 | Cost: 0.0717 Epoch: 010/010 | Batch 500/938 | Cost: 0.2704 Epoch: 010/010 | Batch 550/938 | Cost: 0.1784 Epoch: 010/010 | Batch 600/938 | Cost: 0.0899 Epoch: 010/010 | Batch 650/938 | Cost: 0.0578 Epoch: 010/010 | Batch 700/938 | Cost: 0.1572 Epoch: 010/010 | Batch 750/938 | Cost: 0.0106 Epoch: 010/010 | Batch 800/938 | Cost: 0.0714 Epoch: 010/010 | Batch 850/938 | Cost: 0.0125 Epoch: 010/010 | Batch 900/938 | Cost: 0.0235 Epoch: 010/010 training accuracy: 98.38% Time elapsed: 2.45 min Total Training Time: 2.45 min
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))
Test accuracy: 96.89%
%watermark -iv
numpy 1.16.4 torch 1.2.0 torchvision 0.4.0a0+6b959ee