In tutorial 1, we reviewed the basics of Python and how Numpy extends vanilla Python for many tasks in scientific computing.
In this tutorial, we will go over two libraries: Matplotlib for data visualization and PyTorch for machine learning.
Matplotlib is a plotting library. This section gives a brief introduction to the matplotlib.pyplot module, which provides a plotting system similar to that of MATLAB.
import numpy as np
import matplotlib.pyplot as plt
The most important function in matplotlib.pyplot is plot, which allows you to plot 2D data. Here is a simple example:
# Compute the x and y coordinates for points on a sine curve
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)
# Plot the points using matplotlib
plt.plot(x, y)
plt.show() # Display the figure
With just a little bit of extra work we can easily plot multiple lines at once, and add a title, legend, and axis labels:
y_sin = np.sin(x)
y_cos = np.cos(x)
# Can plot multiple graphs
plt.plot(x, y_sin)
plt.plot(x, y_cos)
# Set x and y label
plt.xlabel('x axis label')
plt.ylabel('y axis label')
# Set title and legend
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()
You can plot different things in the same figure using the subplot function. Here is an example:
# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Set up a subplot grid that has height 2 and width 1.
# This sets the first such subplot as active.
plt.subplot(2, 1, 1)
# Make the first plot
plt.plot(x, y_sin)
plt.title('Sine')
# Set the second subplot as active
plt.subplot(2, 1, 2)
# Make the second plot.
plt.plot(x, y_cos)
plt.title('Cosine')
# Show the figure.
plt.show()
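For reference, the same figure can also be built with Matplotlib's object-oriented interface via plt.subplots, which returns the figure and an array of axes objects; a minimal sketch of the two-panel plot above in that style:
fig, axes = plt.subplots(2, 1)
axes[0].plot(x, y_sin)
axes[0].set_title('Sine')
axes[1].plot(x, y_cos)
axes[1].set_title('Cosine')
plt.show()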
The imshow function from the pyplot module can be used to show images. For example:
img = plt.imread('cute-kittens.jpg')
print(img)
# Show the original image
plt.imshow(img) # Similar to plt.plot, but for images
plt.show()
Note that each cell (pixel) in an image is composed of 3 color channels (i.e. RGB color). Often the last axis is used for the color channels, in the order of red, green, and blue.
print(img.shape) # height x width x 3 RGB channels, e.g. (276, 460, 3) for this 460x276 image
# Displaying only the red color channel
plt.imshow(img[:, :, 0])
plt.show()
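Note that a single channel is a 2-D array, so imshow renders it with Matplotlib's default colormap rather than in red. Passing an explicit colormap, e.g. grayscale, makes the intensities easier to interpret:
plt.imshow(img[:, :, 0], cmap='gray') # Red-channel intensities rendered in grayscale
plt.show()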
PyTorch is a Python-based scientific computing package. PyTorch is currently, along with Tensorflow, one of the most popular machine learning libraries.
PyTorch, at its core, is similar to Numpy in the sense that they both provide efficient operations on multi-dimensional arrays.
However, compared to Numpy, PyTorch offers much better GPU support and provides many high-level features for machine learning. Technically, Numpy can be used to perform almost everything PyTorch does. However, Numpy would be a lot slower than PyTorch, especially on a CUDA GPU, and writing machine-learning code would take more effort than with PyTorch.
Mathematically speaking, a tensor is an object representing a multi-dimensional array, and it can be thought of as a generalization of vectors and matrices: it extends a vector (1-D grid of numbers) and a matrix (2-D grid of numbers) to any number of dimensions. For example, the RGB image above can be viewed as a 3-D tensor (height x width x channels).
In PyTorch, a tensor is similar to Numpy's ndarray but can be used on a GPU to accelerate computing (a short device example follows the initialization snippets below).
A tensor can be created using initialization functions, similar to the ones for ndarray.
import torch
x = torch.empty(5, 3)
print(x)
x = torch.rand(5, 3)
print(x)
x = torch.zeros(5, 3, dtype=torch.long) # explicitly specify data type
print(x)
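As mentioned above, a tensor can be placed on a GPU; a minimal sketch that falls back to the CPU when no CUDA device is available:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.rand(5, 3).to(device) # The tensor now lives on the GPU if one is available
print(x.device)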
A tensor can also be created from array-like data such as an ndarray or another tensor:
x = torch.tensor([5.5, 3]) # From Python list
print(x)
np_array = np.arange(6).reshape((2, 3))
torch_tensor = torch.from_numpy(np_array) # From ndarray
print(np_array)
print(torch_tensor)
np_array_2 = torch_tensor.numpy() # Back to ndarray from tensor
print(np_array_2)
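One caveat worth knowing: torch.from_numpy shares memory with the source array, so modifying one is reflected in the other:
np_array[0, 0] = 100
print(torch_tensor) # The change to np_array is visible in the tensor as well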
Operations on tensors use similar syntax as in Numpy:
x = torch.ones(5, 3)
print(x)
x *= 2
print(x)
y = torch.rand(5, 3)
print(y)
print(x + y)
print(x * y)
# Using different syntax for the same operations above
print(torch.add(x, y))
# Inplace operation
x.add_(y)
print(x)
# Using the same indexing syntax from Python list and Numpy
print(x[1:4, :])
print(x.shape) # Similar to Numpy
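Other Numpy-style operations carry over directly as well; for example, reshaping and matrix multiplication (using the x and y tensors above):
print(x.view(3, 5)) # Reshape the 5x3 tensor into 3x5 (analogous to np.reshape)
print(torch.matmul(x, y.t())) # Matrix multiplication: (5x3) x (3x5) -> 5x5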
Using PyTorch's torch.nn package, it is easy to build a neural network.
import torch.nn as nn
In this tutorial, we will create a fully-connected feed-forward network model for 3-class classification. Our neural network will be a 2-layer neural net with one hidden layer and one output layer (reminder that the input layer is normally not counted).
We will be using modified Iris data (https://archive.ics.uci.edu/ml/datasets/Iris). This dataset has 150 samples and classifies 3 types of Iris flowers. Each sample has 4 numerical features.
import numpy as np
data = torch.tensor([
[5.1,3.5,1.4,0.2,],
[4.9,3.0,1.4,0.2,],
[4.7,3.2,1.3,0.2,],
[4.6,3.1,1.5,0.2,],
[5.0,3.6,1.4,0.2,],
[5.4,3.9,1.7,0.4,],
[4.6,3.4,1.4,0.3,],
[5.0,3.4,1.5,0.2,],
[4.4,2.9,1.4,0.2,],
[4.9,3.1,1.5,0.1,],
[5.4,3.7,1.5,0.2,],
[4.8,3.4,1.6,0.2,],
[4.8,3.0,1.4,0.1,],
[4.3,3.0,1.1,0.1,],
[5.8,4.0,1.2,0.2,],
[5.7,4.4,1.5,0.4,],
[5.4,3.9,1.3,0.4,],
[5.1,3.5,1.4,0.3,],
[5.7,3.8,1.7,0.3,],
[5.1,3.8,1.5,0.3,],
[5.4,3.4,1.7,0.2,],
[5.1,3.7,1.5,0.4,],
[4.6,3.6,1.0,0.2,],
[5.1,3.3,1.7,0.5,],
[4.8,3.4,1.9,0.2,],
[5.0,3.0,1.6,0.2,],
[5.0,3.4,1.6,0.4,],
[5.2,3.5,1.5,0.2,],
[5.2,3.4,1.4,0.2,],
[4.7,3.2,1.6,0.2,],
[4.8,3.1,1.6,0.2,],
[5.4,3.4,1.5,0.4,],
[5.2,4.1,1.5,0.1,],
[5.5,4.2,1.4,0.2,],
[4.9,3.1,1.5,0.1,],
[5.0,3.2,1.2,0.2,],
[5.5,3.5,1.3,0.2,],
[4.9,3.1,1.5,0.1,],
[4.4,3.0,1.3,0.2,],
[5.1,3.4,1.5,0.2,],
[5.0,3.5,1.3,0.3,],
[4.5,2.3,1.3,0.3,],
[4.4,3.2,1.3,0.2,],
[5.0,3.5,1.6,0.6,],
[5.1,3.8,1.9,0.4,],
[4.8,3.0,1.4,0.3,],
[5.1,3.8,1.6,0.2,],
[4.6,3.2,1.4,0.2,],
[5.3,3.7,1.5,0.2,],
[5.0,3.3,1.4,0.2,],
[7.0,3.2,4.7,1.4,],
[6.4,3.2,4.5,1.5,],
[6.9,3.1,4.9,1.5,],
[5.5,2.3,4.0,1.3,],
[6.5,2.8,4.6,1.5,],
[5.7,2.8,4.5,1.3,],
[6.3,3.3,4.7,1.6,],
[4.9,2.4,3.3,1.0,],
[6.6,2.9,4.6,1.3,],
[5.2,2.7,3.9,1.4,],
[5.0,2.0,3.5,1.0,],
[5.9,3.0,4.2,1.5,],
[6.0,2.2,4.0,1.0,],
[6.1,2.9,4.7,1.4,],
[5.6,2.9,3.6,1.3,],
[6.7,3.1,4.4,1.4,],
[5.6,3.0,4.5,1.5,],
[5.8,2.7,4.1,1.0,],
[6.2,2.2,4.5,1.5,],
[5.6,2.5,3.9,1.1,],
[5.9,3.2,4.8,1.8,],
[6.1,2.8,4.0,1.3,],
[6.3,2.5,4.9,1.5,],
[6.1,2.8,4.7,1.2,],
[6.4,2.9,4.3,1.3,],
[6.6,3.0,4.4,1.4,],
[6.8,2.8,4.8,1.4,],
[6.7,3.0,5.0,1.7,],
[6.0,2.9,4.5,1.5,],
[5.7,2.6,3.5,1.0,],
[5.5,2.4,3.8,1.1,],
[5.5,2.4,3.7,1.0,],
[5.8,2.7,3.9,1.2,],
[6.0,2.7,5.1,1.6,],
[5.4,3.0,4.5,1.5,],
[6.0,3.4,4.5,1.6,],
[6.7,3.1,4.7,1.5,],
[6.3,2.3,4.4,1.3,],
[5.6,3.0,4.1,1.3,],
[5.5,2.5,4.0,1.3,],
[5.5,2.6,4.4,1.2,],
[6.1,3.0,4.6,1.4,],
[5.8,2.6,4.0,1.2,],
[5.0,2.3,3.3,1.0,],
[5.6,2.7,4.2,1.3,],
[5.7,3.0,4.2,1.2,],
[5.7,2.9,4.2,1.3,],
[6.2,2.9,4.3,1.3,],
[5.1,2.5,3.0,1.1,],
[5.7,2.8,4.1,1.3,],
[6.3,3.3,6.0,2.5,],
[5.8,2.7,5.1,1.9,],
[7.1,3.0,5.9,2.1,],
[6.3,2.9,5.6,1.8,],
[6.5,3.0,5.8,2.2,],
[7.6,3.0,6.6,2.1,],
[4.9,2.5,4.5,1.7,],
[7.3,2.9,6.3,1.8,],
[6.7,2.5,5.8,1.8,],
[7.2,3.6,6.1,2.5,],
[6.5,3.2,5.1,2.0,],
[6.4,2.7,5.3,1.9,],
[6.8,3.0,5.5,2.1,],
[5.7,2.5,5.0,2.0,],
[5.8,2.8,5.1,2.4,],
[6.4,3.2,5.3,2.3,],
[6.5,3.0,5.5,1.8,],
[7.7,3.8,6.7,2.2,],
[7.7,2.6,6.9,2.3,],
[6.0,2.2,5.0,1.5,],
[6.9,3.2,5.7,2.3,],
[5.6,2.8,4.9,2.0,],
[7.7,2.8,6.7,2.0,],
[6.3,2.7,4.9,1.8,],
[6.7,3.3,5.7,2.1,],
[7.2,3.2,6.0,1.8,],
[6.2,2.8,4.8,1.8,],
[6.1,3.0,4.9,1.8,],
[6.4,2.8,5.6,2.1,],
[7.2,3.0,5.8,1.6,],
[7.4,2.8,6.1,1.9,],
[7.9,3.8,6.4,2.0,],
[6.4,2.8,5.6,2.2,],
[6.3,2.8,5.1,1.5,],
[6.1,2.6,5.6,1.4,],
[7.7,3.0,6.1,2.3,],
[6.3,3.4,5.6,2.4,],
[6.4,3.1,5.5,1.8,],
[6.0,3.0,4.8,1.8,],
[6.9,3.1,5.4,2.1,],
[6.7,3.1,5.6,2.4,],
[6.9,3.1,5.1,2.3,],
[5.8,2.7,5.1,1.9,],
[6.8,3.2,5.9,2.3,],
[6.7,3.3,5.7,2.5,],
[6.7,3.0,5.2,2.3,],
[6.3,2.5,5.0,1.9,],
[6.5,3.0,5.2,2.0,],
[6.2,3.4,5.4,2.3,],
[5.9,3.0,5.1,1.8,],
])
data /= torch.max(data, 0)[0] # Normalize each feature by its maximum value
# Create labels
labels = torch.cat([torch.zeros((50, ), dtype=torch.long),
                    torch.ones((50, ), dtype=torch.long),
                    torch.ones((50, ), dtype=torch.long) * 2])
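As a quick sanity check, data and labels should both have 150 rows:
print(data.shape) # torch.Size([150, 4])
print(labels.shape) # torch.Size([150])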
Our neural network model will use the 4 features from data to classify each Iris plant as type 0, 1, or 2.
The network will therefore take 4 input features and have 3 output nodes, with one hidden layer of 12 nodes connected to the 3 output nodes.
input_size = 4
hidden_size = 12
output_size = 3
To build a neural network model, first we need to create a class for our model. nn.Module contains many useful features for building neural networks, so in order to piggyback on them, we make nn.Module the parent class.
class NeuralNet(nn.Module):
    # So far this is empty
    pass
Each layer of our neural network model is defined as an instance attribute inside the constructor. The torch.nn package contains commonly used layers (e.g. linear, convolutional, etc.). We will be using torch.nn.Linear to define our hidden and output layers.
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        # Hidden layer
        self.hidden = nn.Linear(input_size, hidden_size)
        # Output layer
        self.output = nn.Linear(hidden_size, output_size)
Now we need to connect our layers. To do so, we define an instance method called forward inside the NeuralNet class. The forward method takes the input data and returns the output of our neural network.
Let's use the tanh activation for our hidden layer. torch.nn.functional contains many popular activation functions.
import torch.nn.functional as F
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        # Hidden layer
        self.hidden = nn.Linear(input_size, hidden_size)
        # Output layer
        self.output = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x is the input. In our case, it will be samples with 4 features
        out = F.tanh(self.hidden(x)) # x passes through the hidden layer and tanh activation
        out = self.output(out) # The hidden activations pass through the output layer
        return out
Basically, every neural network built with PyTorch is a class inherited from nn.Module, and this class needs to have 2 instance methods: __init__, which specifies the layers, and forward, which specifies how input data passes through each layer to produce the output.
Once we have finished defining the architecture of the neural network by creating the NeuralNet class, we instantiate it.
net = NeuralNet()
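Printing the module gives a quick summary of its layers:
print(net) # Shows the hidden and output Linear layers with their sizes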
Before we continue on with training, let's quickly take a look at the initial parameters of net.
params = list(net.parameters())
print(params)
The above output might look confusing, but if you look at it carefully, you will realize that params is a list containing each layer's weight and bias parameters.
print(len(params)) # 2 layers * (1 weight + 1 bias) = 4 elements in params
print(params[0].size()) # Weights for hidden layer
print(params[1].size()) # Bias for hidden layer
print(params[2].size()) # Weights for output layer
print(params[3].size()) # Bias for output layer
The goal of training is to learn the best values for these parameters. During each iteration of training, their values will be updated.
Now we will train our network. We will be using the Adam optimizer, a widely used variation of stochastic gradient descent.
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
For the loss function, we will be using cross-entropy, which is useful when training a classification problem with C classes (https://pytorch.org/docs/stable/nn.html#crossentropyloss).
criterion = nn.CrossEntropyLoss()
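Note that nn.CrossEntropyLoss expects raw (unnormalized) scores for each class and integer class labels as targets; the softmax is applied internally. A tiny sketch with made-up numbers:
example_scores = torch.tensor([[2.0, 0.5, 0.1]]) # Raw scores for one sample over 3 classes
example_label = torch.tensor([0]) # The correct class index
print(criterion(example_scores, example_label)) # Small loss, since class 0 has the highest score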
Now we have everything we need to train our neural network. In this example, we will iterate over our entire dataset 500 times.
num_epoch = 500
# So the training will look something like this:
for epoch in range(num_epoch): # loop over the dataset multiple times
    # This is where training happens
    pass
Training can be broken down into 4 steps:
for epoch in range(num_epoch): # loop over the dataset multiple times
    # 1. Forward pass
    outputs = net(data)
    # 2. Calculate CrossEntropyLoss
    loss = criterion(outputs, labels)
    # 3. Backward pass
    loss.backward()
    # 4. Update parameters using the Adam optimizer
    optimizer.step()
    # It is important to zero the parameter gradients before the next iteration
    optimizer.zero_grad()
    # Print loss
    if epoch % 50 == 0: # print every 50 epochs
        print('Epoch [%d/%d] Loss: %.4f' % (epoch + 1, num_epoch, loss.item()))
print('Finished Training')
This concludes training our neural network on the Iris dataset.
Let's see how the parameters have changed after training.
print(list(net.parameters()))
We can see that our parameters have changed after the training.
Now let's see how net performs on the training data.
outputs = net(data) # output tensor with 3 columns
# Each sample is classified to the class with the maximum output value
# e.g. a sample with output [0.1, 1.3, 0.3] will be classified as class 1
_, predicted = torch.max(outputs.data, 1) # Obtain the predicted class for each sample
# Accuracy = correct classifications / total samples
print('Accuracy over training data %d %%' % (100 * torch.sum(labels == predicted).item() / labels.shape[0]))
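Since we are only evaluating here (no parameter updates), the forward pass is often wrapped in torch.no_grad() so PyTorch skips gradient bookkeeping; a sketch of the same evaluation:
with torch.no_grad():
    outputs = net(data)
    _, predicted = torch.max(outputs, 1)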
As shown above, net performs very well on the training data. In reality, there needs to be a separate test set to check how well the trained model generalizes.
Some potential exercises you can do are: split the data into a training set and a test set, retrain net, and compare how well it performs on the training set and the test set.
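As a starting point for this exercise, here is a minimal sketch of one way to split the data randomly (the 120/30 split is an arbitrary choice):
perm = torch.randperm(data.shape[0]) # Random permutation of the 150 sample indices
train_idx, test_idx = perm[:120], perm[120:]
train_data, train_labels = data[train_idx], labels[train_idx] # 120 training samples
test_data, test_labels = data[test_idx], labels[test_idx] # 30 held-out test samples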