Adapted from: https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html & https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
We will perform the following steps in order:
1. Load and normalize the training and testing datasets using torchvision
2. Define a convolutional neural network
3. Define a loss function and an optimizer
4. Train the network on the training data
5. Test the network on the testing data
from __future__ import print_function, division
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, models, transforms
import numpy as np
import matplotlib.pyplot as plt
import os
plt.ion()
We will load our dataset using torchvision from the following file structure:
├── NN_Emotional_Faces_Classifier.ipynb
└── nim_stim
    ├── training
    │   ├── happiness
    │   └── sadness
    └── testing
        ├── happiness
        └── sadness
The output of torchvision datasets are PILImage images of range [0, 1]. We transform them to Tensors of normalized range [-1, 1].
data_transforms = {
    'training': transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.Resize((128, 128)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]),
    'testing': transforms.Compose([
        transforms.Resize((128, 128)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ]),
}
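As a quick sanity check on the normalization arithmetic (a plain-Python sketch with no torch dependency; the pixel values are made up):

```python
# Normalize applies (x - mean) / std per channel.
# With mean = std = 0.5, the [0, 1] pixel range maps to [-1, 1].
def normalize(x, mean=0.5, std=0.5):
    return (x - mean) / std

print(normalize(0.0))  # -1.0 (darkest pixel)
print(normalize(1.0))  # 1.0 (brightest pixel)
print(normalize(0.5))  # 0.0 (midpoint)
```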
data_dir = './nim_stim/'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['training', 'testing']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['training', 'testing']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['training', 'testing']}
class_names = image_datasets['training'].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
Let's view some of our images:
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))  # CHW -> HWC for matplotlib
    mean = np.array([0.5, 0.5, 0.5])
    std = np.array([0.5, 0.5, 0.5])
    inp = std * inp + mean  # undo the Normalize transform
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated
# Get a batch of training data
inputs, classes = next(iter(dataloaders['training']))
# Make a grid from batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in classes])
A typical training procedure for a neural network is as follows:
1. Define the network and its learnable parameters (weights)
2. Iterate over a dataset of inputs
3. Process each input through the network
4. Compute the loss (how far the output is from being correct)
5. Propagate gradients back through the network's parameters
6. Update the weights, typically using a simple rule such as weight = weight - learning_rate * gradient

Luckily, PyTorch lets us implement this fairly easily.
# Define the network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # spatial size: 128 -> 124 (conv1) -> 62 (pool) -> 58 (conv2) -> 29 (pool),
        # so the flattened feature map is 16 channels * 29 * 29 = 16 * 841
        self.fc1 = nn.Linear(16 * 841, 120)
        self.fc2 = nn.Linear(120, 84)
        # 10 output units, inherited from the CIFAR-10 tutorial; only class
        # indices 0 and 1 ever occur in our two-class labels
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(nn.functional.relu(self.conv1(x)))
        x = self.pool(nn.functional.relu(self.conv2(x)))
        x = x.view(x.size(0), 16 * 841)  # flatten for the fully connected layers
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
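The 16 * 841 input size of fc1 falls out of simple shape arithmetic, which we can check without torch: a 5x5 convolution with no padding shrinks each spatial dimension by 4, and a 2x2 max pool halves it.

```python
def conv5(n):
    return n - 4   # 5x5 kernel, stride 1, no padding

def pool2(n):
    return n // 2  # 2x2 max pool

n = 128              # images are resized to 128x128
n = pool2(conv5(n))  # conv1 + pool: 128 -> 124 -> 62
n = pool2(conv5(n))  # conv2 + pool: 62 -> 58 -> 29
print(n, n * n)      # 29 841 -> flattened size is 16 channels * 841
```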
Now that we've defined the network with a feed-forward function, let's specify the loss function, which takes the (output, target) pair of inputs and computes a value that estimates how far the output is from the target.
To backpropagate the error, all we have to do is call loss.backward(). We first need to clear the existing gradients with optimizer.zero_grad(), though, or else new gradients will be accumulated onto the existing ones. When we call loss.backward(), the whole graph is differentiated w.r.t. the loss.
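A plain-Python sketch (no torch, made-up gradient values) of why the zeroing matters: backward() adds new gradients into each parameter's .grad field rather than overwriting it.

```python
# Without zeroing: gradients from every step pile up.
accumulated = 0.0
for step in range(3):
    accumulated += 2.0  # pretend backward() computed a gradient of 2.0 each step
print(accumulated)  # 6.0: three steps' gradients summed together

# With zeroing before each backward pass (what optimizer.zero_grad() does):
zeroed = 0.0
for step in range(3):
    zeroed = 0.0
    zeroed += 2.0
print(zeroed)  # 2.0: only the current step's gradient remains
```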
The package torch.optim implements all sorts of different update rules. For now, we will use the simplest, Stochastic Gradient Descent (SGD): weight = weight - learning_rate * gradient, via optim.SGD().
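Since we pass momentum=0.9 below, the actual update keeps a running velocity of past gradients. A hand-rolled sketch of one step (the weight and gradient values are made up for illustration):

```python
def sgd_step(w, grad, v, lr=0.001, momentum=0.9):
    v = momentum * v + grad  # velocity accumulates past gradients
    return w - lr * v, v     # weight = weight - learning_rate * velocity

w, v = 1.0, 0.0
w, v = sgd_step(w, 0.5, v)
print(w, v)  # first step reduces to plain SGD: v = 0.5, w = 1.0 - 0.001 * 0.5
```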
criterion = nn.CrossEntropyLoss() #nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
num_epochs = 75
We will now loop over our data iterator, feed the inputs to the network, and optimize using the settings specified above.
for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch, num_epochs - 1))
    print('-' * 10)
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(dataloaders['training']):
        # inputs = inputs.to(device)
        # labels = labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)  # loss function defined above
        loss.backward()                    # backprop
        optimizer.step()                   # update the weights

        # print statistics
        running_loss += loss.item()
        if i % 18 == 0:  # print every 18 mini-batches
            print('[%d] loss: %.3f' %
                  (epoch + 1, running_loss / 18))
            running_loss = 0.0

print('Finished Training')
Epoch 0/74 ---------- [1] loss: 0.129
Epoch 1/74 ---------- [2] loss: 0.112
Epoch 2/74 ---------- [3] loss: 0.048
Epoch 3/74 ---------- [4] loss: 0.032
Epoch 4/74 ---------- [5] loss: 0.031
Epoch 5/74 ---------- [6] loss: 0.072
Epoch 6/74 ---------- [7] loss: 0.040
Epoch 7/74 ---------- [8] loss: 0.032
Epoch 8/74 ---------- [9] loss: 0.049
Epoch 9/74 ---------- [10] loss: 0.040
Epoch 10/74 ---------- [11] loss: 0.035
Epoch 11/74 ---------- [12] loss: 0.033
Epoch 12/74 ---------- [13] loss: 0.038
Epoch 13/74 ---------- [14] loss: 0.039
Epoch 14/74 ---------- [15] loss: 0.035
Epoch 15/74 ---------- [16] loss: 0.037
Epoch 16/74 ---------- [17] loss: 0.033
Epoch 17/74 ---------- [18] loss: 0.076
Epoch 18/74 ---------- [19] loss: 0.036
Epoch 19/74 ---------- [20] loss: 0.038
Epoch 20/74 ---------- [21] loss: 0.033
Epoch 21/74 ---------- [22] loss: 0.052
Epoch 22/74 ---------- [23] loss: 0.030
Epoch 23/74 ---------- [24] loss: 0.039
Epoch 24/74 ---------- [25] loss: 0.050
Epoch 25/74 ---------- [26] loss: 0.033
Epoch 26/74 ---------- [27] loss: 0.040
Epoch 27/74 ---------- [28] loss: 0.045
Epoch 28/74 ---------- [29] loss: 0.043
Epoch 29/74 ---------- [30] loss: 0.035
Epoch 30/74 ---------- [31] loss: 0.022
Epoch 31/74 ---------- [32] loss: 0.022
Epoch 32/74 ---------- [33] loss: 0.011
Epoch 33/74 ---------- [34] loss: 0.019
Epoch 34/74 ---------- [35] loss: 0.025
Epoch 35/74 ---------- [36] loss: 0.024
Epoch 36/74 ---------- [37] loss: 0.005
Epoch 37/74 ---------- [38] loss: 0.007
Epoch 38/74 ---------- [39] loss: 0.012
Epoch 39/74 ---------- [40] loss: 0.005
Epoch 40/74 ---------- [41] loss: 0.033
Epoch 41/74 ---------- [42] loss: 0.015
Epoch 42/74 ---------- [43] loss: 0.009
Epoch 43/74 ---------- [44] loss: 0.001
Epoch 44/74 ---------- [45] loss: 0.001
Epoch 45/74 ---------- [46] loss: 0.027
Epoch 46/74 ---------- [47] loss: 0.014
Epoch 47/74 ---------- [48] loss: 0.000
Epoch 48/74 ---------- [49] loss: 0.003
Epoch 49/74 ---------- [50] loss: 0.016
Epoch 50/74 ---------- [51] loss: 0.011
Epoch 51/74 ---------- [52] loss: 0.017
Epoch 52/74 ---------- [53] loss: 0.003
Epoch 53/74 ---------- [54] loss: 0.001
Epoch 54/74 ---------- [55] loss: 0.000
Epoch 55/74 ---------- [56] loss: 0.003
Epoch 56/74 ---------- [57] loss: 0.005
Epoch 57/74 ---------- [58] loss: 0.000
Epoch 58/74 ---------- [59] loss: 0.034
Epoch 59/74 ---------- [60] loss: 0.039
Epoch 60/74 ---------- [61] loss: 0.009
Epoch 61/74 ---------- [62] loss: 0.009
Epoch 62/74 ---------- [63] loss: 0.000
Epoch 63/74 ---------- [64] loss: 0.000
Epoch 64/74 ---------- [65] loss: 0.001
Epoch 65/74 ---------- [66] loss: 0.019
Epoch 66/74 ---------- [67] loss: 0.005
Epoch 67/74 ---------- [68] loss: 0.001
Epoch 68/74 ---------- [69] loss: 0.000
Epoch 69/74 ---------- [70] loss: 0.000
Epoch 70/74 ---------- [71] loss: 0.000
Epoch 71/74 ---------- [72] loss: 0.000
Epoch 72/74 ---------- [73] loss: 0.009
Epoch 73/74 ---------- [74] loss: 0.011
Epoch 74/74 ---------- [75] loss: 0.001
Finished Training
We have trained the network for 75 passes over the training dataset. But we need to check whether the network has learned anything at all.
We will check this by predicting the class label that the neural network outputs, and checking it against the ground-truth. If the prediction is correct, we add the sample to the list of correct predictions.
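torch.max(outputs, 1) returns (maximum value, index of maximum) along the class dimension, and that index is the predicted class. A plain-Python equivalent for a single image (the logit values below are made up):

```python
classes = ['happiness', 'sadness']  # ImageFolder orders class folders alphabetically
logits = [0.3, 1.8]                 # hypothetical raw network outputs for one image
predicted = max(range(len(logits)), key=lambda i: logits[i])
print(classes[predicted])  # sadness
```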
images, labels = next(iter(dataloaders['testing']))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % class_names[labels[j]] for j in range(4)))
print('Predicted:   ', ' '.join('%5s' % class_names[predicted[j]] for j in range(4)))
GroundTruth:  sadness happiness happiness sadness
Predicted:  sadness sadness happiness sadness
Let us look at how the network performs on the whole dataset.
correct = 0
total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for data in dataloaders['testing']:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on test images: %d %%' % (
    100 * correct / total))
Accuracy of the network on test images: 91 %
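One small formatting note: the %d specifier truncates the percentage, so a printed 91 % means the true accuracy is anywhere in [91.0, 92.0). A quick check with made-up counts:

```python
correct, total = 44, 48  # hypothetical counts giving 91.66... percent
# %d truncates the float, so this prints 91 %, not 92 %
print('Accuracy of the network on test images: %d %%' % (100 * correct / total))
```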
And there you have it! In this simple example, we can classify happy and sad faces with high accuracy (91% on the test set), which is better than some people manage!