Adapted from: https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html & https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
We will perform the following steps in order:
1. Load and normalize the training and testing datasets using torchvision
2. Define a convolutional neural network
3. Define a loss function and an optimizer
4. Train the network on the training data
5. Test the network on the testing data
from __future__ import print_function, division
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, models, transforms
import numpy as np
import matplotlib.pyplot as plt
import os
plt.ion()
We will load our dataset using torchvision from the following file structure:
├── NN_Emotional_Faces_Classifier.ipynb
└── nim_stim
    ├── training
    │   ├── happiness
    │   └── sadness
    └── testing
        ├── happiness
        └── sadness
The output of torchvision datasets are PILImage images of range [0, 1]. We transform them to Tensors of normalized range [-1, 1].
data_transforms = {
    'training': transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.Resize((128, 128)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]),
    'testing': transforms.Compose([
        transforms.Resize((128, 128)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]),
}
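As a quick sanity check on the normalization arithmetic (a plain-Python sketch with no torch dependency; the pixel values are made up):

```python
# Normalize applies (x - mean) / std per channel.
# With mean = std = 0.5, the [0, 1] pixel range maps to [-1, 1].
def normalize(x, mean=0.5, std=0.5):
    return (x - mean) / std

print(normalize(0.0))  # -1.0 (darkest pixel)
print(normalize(1.0))  # 1.0 (brightest pixel)
print(normalize(0.5))  # 0.0 (midpoint)
```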
data_dir = './nim_stim/'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['training', 'testing']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['training', 'testing']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['training', 'testing']}
class_names = image_datasets['training'].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
Let's view some of our images:
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))  # CHW -> HWC for matplotlib
    mean = np.array([0.5, 0.5, 0.5])
    std = np.array([0.5, 0.5, 0.5])
    inp = std * inp + mean  # undo the Normalize transform
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated
# Get a batch of training data
inputs, classes = next(iter(dataloaders['training']))
# Make a grid from batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in classes])
A typical training procedure for a neural network is as follows:
1. Define the network and its learnable parameters (weights)
2. Iterate over a dataset of inputs
3. Process each input through the network
4. Compute the loss (how far the output is from being correct)
5. Propagate gradients back through the network's parameters
6. Update the weights, typically using a simple rule such as weight = weight - learning_rate * gradient

Luckily, PyTorch lets us implement this fairly easily.
# Define the network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # spatial size: 128 -> 124 (conv1) -> 62 (pool) -> 58 (conv2) -> 29 (pool),
        # so the flattened feature map is 16 channels * 29 * 29 = 16 * 841
        self.fc1 = nn.Linear(16 * 841, 120)
        self.fc2 = nn.Linear(120, 84)
        # 10 output units, inherited from the CIFAR-10 tutorial; only class
        # indices 0 and 1 ever occur in our two-class labels
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(nn.functional.relu(self.conv1(x)))
        x = self.pool(nn.functional.relu(self.conv2(x)))
        x = x.view(x.size(0), 16 * 841)  # flatten for the fully connected layers
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
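The 16 * 841 input size of fc1 falls out of simple shape arithmetic, which we can check without torch: a 5x5 convolution with no padding shrinks each spatial dimension by 4, and a 2x2 max pool halves it.

```python
def conv5(n):
    return n - 4   # 5x5 kernel, stride 1, no padding

def pool2(n):
    return n // 2  # 2x2 max pool

n = 128              # images are resized to 128x128
n = pool2(conv5(n))  # conv1 + pool: 128 -> 124 -> 62
n = pool2(conv5(n))  # conv2 + pool: 62 -> 58 -> 29
print(n, n * n)      # 29 841 -> flattened size is 16 channels * 841
```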
Now that we've defined the network with a feed-forward function, let's specify the loss function, which takes the (output, target) pair of inputs and computes a value that estimates how far the output is from the target.
To backpropagate the error, all we have to do is call loss.backward(). We first need to clear the existing gradients with optimizer.zero_grad(), though, or else new gradients will be accumulated onto the existing ones. When we call loss.backward(), the whole graph is differentiated w.r.t. the loss.
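A plain-Python sketch (no torch, made-up gradient values) of why the zeroing matters: backward() adds new gradients into each parameter's .grad field rather than overwriting it.

```python
# Without zeroing: gradients from every step pile up.
accumulated = 0.0
for step in range(3):
    accumulated += 2.0  # pretend backward() computed a gradient of 2.0 each step
print(accumulated)  # 6.0: three steps' gradients summed together

# With zeroing before each backward pass (what optimizer.zero_grad() does):
zeroed = 0.0
for step in range(3):
    zeroed = 0.0
    zeroed += 2.0
print(zeroed)  # 2.0: only the current step's gradient remains
```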
The package torch.optim implements all sorts of different update rules. For now, we will use the simplest, Stochastic Gradient Descent (SGD): weight = weight - learning_rate * gradient, via optim.SGD().
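Since we pass momentum=0.9 below, the actual update keeps a running velocity of past gradients. A hand-rolled sketch of one step (the weight and gradient values are made up for illustration):

```python
def sgd_step(w, grad, v, lr=0.001, momentum=0.9):
    v = momentum * v + grad  # velocity accumulates past gradients
    return w - lr * v, v     # weight = weight - learning_rate * velocity

w, v = 1.0, 0.0
w, v = sgd_step(w, 0.5, v)
print(w, v)  # first step reduces to plain SGD: v = 0.5, w = 1.0 - 0.001 * 0.5
```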
criterion = nn.CrossEntropyLoss() #nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
num_epochs = 75
We will now loop over our data iterator, feed the inputs to the network, and optimize using the settings specified above.
for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch, num_epochs - 1))
    print('-' * 10)
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(dataloaders['training']):
        # inputs = inputs.to(device)
        # labels = labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)  # loss function defined above
        loss.backward()                    # backprop
        optimizer.step()                   # update the weights

        # print statistics
        running_loss += loss.item()
        if i % 18 == 0:  # print every 18 mini-batches
            print('[%d] loss: %.3f' %
                  (epoch + 1, running_loss / 18))
            running_loss = 0.0

print('Finished Training')
Epoch 0/74 ---------- [1] loss: 0.129
Epoch 1/74 ---------- [2] loss: 0.112
Epoch 2/74 ---------- [3] loss: 0.048
Epoch 3/74 ---------- [4] loss: 0.032
Epoch 4/74 ---------- [5] loss: 0.031
Epoch 5/74 ---------- [6] loss: 0.072
Epoch 6/74 ---------- [7] loss: 0.040
Epoch 7/74 ---------- [8] loss: 0.032
Epoch 8/74 ---------- [9] loss: 0.049
Epoch 9/74 ---------- [10] loss: 0.040
Epoch 10/74 ---------- [11] loss: 0.035
Epoch 11/74 ---------- [12] loss: 0.033
Epoch 12/74 ---------- [13] loss: 0.038
Epoch 13/74 ---------- [14] loss: 0.039
Epoch 14/74 ---------- [15] loss: 0.035
Epoch 15/74 ---------- [16] loss: 0.037
Epoch 16/74 ---------- [17] loss: 0.033
Epoch 17/74 ---------- [18] loss: 0.076
Epoch 18/74 ---------- [19] loss: 0.036
Epoch 19/74 ---------- [20] loss: 0.038
Epoch 20/74 ---------- [21] loss: 0.033
Epoch 21/74 ---------- [22] loss: 0.052
Epoch 22/74 ---------- [23] loss: 0.030
Epoch 23/74 ---------- [24] loss: 0.039
Epoch 24/74 ---------- [25] loss: 0.050
Epoch 25/74 ---------- [26] loss: 0.033
Epoch 26/74 ---------- [27] loss: 0.040
Epoch 27/74 ---------- [28] loss: 0.045
Epoch 28/74 ---------- [29] loss: 0.043
Epoch 29/74 ---------- [30] loss: 0.035
Epoch 30/74 ---------- [31] loss: 0.022
Epoch 31/74 ---------- [32] loss: 0.022
Epoch 32/74 ---------- [33] loss: 0.011
Epoch 33/74 ---------- [34] loss: 0.019
Epoch 34/74 ---------- [35] loss: 0.025
Epoch 35/74 ---------- [36] loss: 0.024
Epoch 36/74 ---------- [37] loss: 0.005
Epoch 37/74 ---------- [38] loss: 0.007
Epoch 38/74 ---------- [39] loss: 0.012
Epoch 39/74 ---------- [40] loss: 0.005
Epoch 40/74 ---------- [41] loss: 0.033
Epoch 41/74 ---------- [42] loss: 0.015
Epoch 42/74 ---------- [43] loss: 0.009
Epoch 43/74 ---------- [44] loss: 0.001
Epoch 44/74 ---------- [45] loss: 0.001
Epoch 45/74 ---------- [46] loss: 0.027
Epoch 46/74 ---------- [47] loss: 0.014
Epoch 47/74 ---------- [48] loss: 0.000
Epoch 48/74 ---------- [49] loss: 0.003
Epoch 49/74 ---------- [50] loss: 0.016
Epoch 50/74 ---------- [51] loss: 0.011
Epoch 51/74 ---------- [52] loss: 0.017
Epoch 52/74 ---------- [53] loss: 0.003
Epoch 53/74 ---------- [54] loss: 0.001
Epoch 54/74 ---------- [55] loss: 0.000
Epoch 55/74 ---------- [56] loss: 0.003
Epoch 56/74 ---------- [57] loss: 0.005
Epoch 57/74 ---------- [58] loss: 0.000
Epoch 58/74 ---------- [59] loss: 0.034
Epoch 59/74 ---------- [60] loss: 0.039
Epoch 60/74 ---------- [61] loss: 0.009
Epoch 61/74 ---------- [62] loss: 0.009
Epoch 62/74 ---------- [63] loss: 0.000
Epoch 63/74 ---------- [64] loss: 0.000
Epoch 64/74 ---------- [65] loss: 0.001
Epoch 65/74 ---------- [66] loss: 0.019
Epoch 66/74 ---------- [67] loss: 0.005
Epoch 67/74 ---------- [68] loss: 0.001
Epoch 68/74 ---------- [69] loss: 0.000
Epoch 69/74 ---------- [70] loss: 0.000
Epoch 70/74 ---------- [71] loss: 0.000
Epoch 71/74 ---------- [72] loss: 0.000
Epoch 72/74 ---------- [73] loss: 0.009
Epoch 73/74 ---------- [74] loss: 0.011
Epoch 74/74 ---------- [75] loss: 0.001
Finished Training
We have trained the network for 75 passes over the training dataset. But we need to check whether the network has learned anything at all.
We will check this by predicting the class label that the neural network outputs, and checking it against the ground-truth. If the prediction is correct, we add the sample to the list of correct predictions.
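torch.max(outputs, 1) returns (maximum value, index of maximum) along the class dimension, and that index is the predicted class. A plain-Python equivalent for a single image (the logit values below are made up):

```python
classes = ['happiness', 'sadness']  # ImageFolder orders class folders alphabetically
logits = [0.3, 1.8]                 # hypothetical raw network outputs for one image
predicted = max(range(len(logits)), key=lambda i: logits[i])
print(classes[predicted])  # sadness
```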
images, labels = next(iter(dataloaders['testing']))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % class_names[labels[j]] for j in range(4)))
print('Predicted:   ', ' '.join('%5s' % class_names[predicted[j]] for j in range(4)))
GroundTruth:  sadness happiness happiness sadness
Predicted:  sadness sadness happiness sadness
Let us look at how the network performs on the whole dataset.
correct = 0
total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for data in dataloaders['testing']:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on test images: %d %%' % (
    100 * correct / total))
Accuracy of the network on test images: 91 %
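One small formatting note: the %d specifier truncates the percentage, so a printed 91 % means the true accuracy is anywhere in [91.0, 92.0). A quick check with made-up counts:

```python
correct, total = 44, 48  # hypothetical counts giving 91.66... percent
# %d truncates the float, so this prints 91 %, not 92 %
print('Accuracy of the network on test images: %d %%' % (100 * correct / total))
```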
And there you have it! In this simple example, we can classify happy and sad faces with high accuracy (91% on the test set), which is better than some people manage!