09 - Introduction to Neural Networks¶

by Fabio A. González, Universidad Nacional de Colombia

version 1.0, June 2018

Part of the class Applied Deep Learning ¶

This notebook is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

To run this notebook you need to download Pybrain (https://github.com/pybrain/pybrain) and copy the pybrain folder to the same folder where this notebook is.

In [2]:

import pybrain
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

%matplotlib inline

Artificial Neuron¶

$$o_j^{(n)} = \varphi\left(\sum_{i\; in\; layer (n-1)}w_{ij}o_i^{(n-1)} \right)$$

Step activation function¶

Logistic activation function¶

$$\varphi(x) = \frac{1}{1 - e^{-(x-b)}}$$

Question: How to program an artificial neuron to calculate the and function?¶

$X$	$Y$	$X$ and $Y$
0	0	0
0	1	0
1	0	0
1	1	1

AND Neural Network¶

In [3]:

from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2, 1, outclass=pybrain.SigmoidLayer)
print(net.params)

[ 0.453198   -1.24483399 -0.91415624]

In [4]:

def print_pred2(dataset, network):
    df = pd.DataFrame(dataset.data['sample'][:dataset.getLength()],columns=['X', 'Y'])
    prediction = np.round(network.activateOnDataset(dataset),3)
    df['output'] = pd.DataFrame(prediction)
    return df

from pybrain.datasets import UnsupervisedDataSet, SupervisedDataSet
D = UnsupervisedDataSet(2) # define a dataset in pybrain
D.addSample([0,0])
D.addSample([0,1])
D.addSample([1,0])
D.addSample([1,1])
print_pred2(D, net)

Out[4]:

	X	Y	output
0	0.0	0.0	0.611
1	0.0	1.0	0.387
2	1.0	0.0	0.312
3	1.0	1.0	0.154

AND Neural Network¶

In [5]:

net.params[:] = [ -150, -100,  100]
print_pred2(D, net)

Out[5]:

	X	Y
0	0.0	0.0
1	0.0	1.0
2	1.0	0.0
3	1.0	1.0

Question: How to program an artificial neuron to calculate the xor function?¶

$X$	$Y$	$X$ xor $Y$
0	0	0
0	1	1
1	0	1
1	1	0

Plotting the NN Output¶

In [6]:

def plot_nn_prediction(N): 
    # a function to plot the binary output of a network on the [0,1]x[0,1] space
    x_list = np.arange(0.0,1.0,0.025)
    y_list = np.arange(1.0,0.0,-0.025)    
    z = [0.0 if N.activate([x,y])[0] <0.5 else 1.0  for y in y_list for x in x_list]
    z = np.array(z)
    grid = z.reshape((len(x_list), len(y_list)))
    plt.imshow(grid, extent=(x_list.min(), x_list.max(), y_list.min(), y_list.max()),cmap=plt.get_cmap('Greys_r'))
    plt.show()

Plotting the NN Output¶

In [7]:

net.params[:] = [-20, -50, 50]
plot_nn_prediction(net)

## Answer: It is impossible with only one neuron!

We need to use more than one neuron....¶

Multilayer Neural Network¶

Learning an XOR NN¶

In [8]:

Dtrain = SupervisedDataSet(2,1) # define a dataset in pybrain
Dtrain.addSample([0,0],[0])
Dtrain.addSample([0,1],[1])
Dtrain.addSample([1,0],[1])
Dtrain.addSample([1,1],[0])

from pybrain.supervised.trainers import BackpropTrainer

net = buildNetwork(2, 2, 1, hiddenclass=pybrain.SigmoidLayer, outclass=pybrain.SigmoidLayer)
T = BackpropTrainer(net, learningrate=0.1, momentum=0.9)
T.trainOnDataset(Dtrain, 1000)
print_pred2(D, net)

Out[8]:

	X	Y	output
0	0.0	0.0	0.044
1	0.0	1.0	0.949
2	1.0	0.0	0.949
3	1.0	1.0	0.049

XOR NN Output Plot¶

In [9]:

plot_nn_prediction(net)

The Little Red Riding Hood Neural Network¶

LRRH Network Architecture¶

Training¶

In [10]:

from pybrain.tools.validation import Validator

validator =  Validator()
Dlrrh = SupervisedDataSet(4,4) 
Dlrrh.addSample([1,1,0,0],[1,0,0,0])
Dlrrh.addSample([0,1,1,0],[0,0,1,1])
Dlrrh.addSample([0,0,0,1],[0,1,1,0])
df = pd.DataFrame(Dlrrh['input'],columns=['Big Ears', 'Big Teeth', 'Handsome', 'Wrinkled'])
print (df.join(pd.DataFrame(Dlrrh['target'],columns=['Scream', 'Hug', 'Food', 'Kiss'])))
net = buildNetwork(4, 3, 4, hiddenclass=pybrain.SigmoidLayer, outclass=pybrain.SigmoidLayer)

   Big Ears  Big Teeth  Handsome  Wrinkled  Scream  Hug  Food  Kiss
0       1.0        1.0       0.0       0.0     1.0  0.0   0.0   0.0
1       0.0        1.0       1.0       0.0     0.0  0.0   1.0   1.0
2       0.0        0.0       0.0       1.0     0.0  1.0   1.0   0.0

Backpropagation¶

In [11]:

T = BackpropTrainer(net, learningrate=0.01, momentum=0.99)
scores = []
for i in range(1000):
    T.trainOnDataset(Dlrrh, 1)
    prediction = net.activateOnDataset(Dlrrh)
    scores.append(validator.MSE(prediction, Dlrrh.getField('target')))
plt.ylabel('Mean Square Error')
plt.xlabel('Iteration')
plt.plot(scores)

Out[11]:

[<matplotlib.lines.Line2D at 0x10dc8cc50>]

Learning as optimization¶

General optimization problem:

$$\min_{f\in H}L(f,D)$$

with $H$: hypothesis space, $D$:training data, $L$:loss/error

Example, least squares linear regression:¶

$$\min_{f\in H}L(f,D)$$

Hypothesis space:

$f(x)=w^{T}x$

Data: $D = \{(x_1, t_1), \dots , (x_n, t_n)\}$
Least squares loss:

$$L(f, D)=-\sum_{t_i \in D}(t_{i} - f(x_i))^2$$

Example, logistic regression:¶

$$\min_{f\in H}L(f,D)$$

Hypothesis space:

$f(x)=P(C_{+}|x)=\sigma(w^{T}x)$

Data: $D = \{(x_1, t_1), \dots , (x_n, t_n)\}$
Cross-entropy error:

$$E(f,D)=-\ln p(D|f)=-\sum_{t_i \in D}(t_{n}\ln y_{n}+(1-t_{n})\ln(1-y_{n}))$$

Gradient descent¶

Prediction¶

In [12]:

def lrrh_input(vals):
    return pd.DataFrame(vals,index=['big ears', 'big teeth', 'handsome', 'wrinkled'], columns=['input'])

def lrrh_output(vals):
    return pd.DataFrame(vals,index=['scream', 'hug', 'offer food', 'kiss cheek'], columns=['output'])

In [13]:

in_vals = [0, 0, 0, 0]
lrrh_input(in_vals)

Out[13]:

	input
big ears	0
big teeth	0
handsome	0
wrinkled	0

In [14]:

lrrh_output(net.activate(in_vals))

Out[14]:

	output
scream	0.055038
hug	0.786478
offer food	0.947874
kiss cheek	0.090418

09 - Introduction to Neural Networks¶

Part of the class Applied Deep Learning¶

Artificial Neuron¶

Step activation function¶

Logistic activation function¶

Question: How to program an artificial neuron to calculate the and function?¶

AND Neural Network¶

AND Neural Network¶

Question: How to program an artificial neuron to calculate the xor function?¶

Plotting the NN Output¶

Plotting the NN Output¶

We need to use more than one neuron....¶

Multilayer Neural Network¶

Learning an XOR NN¶

XOR NN Output Plot¶

The Little Red Riding Hood Neural Network¶

LRRH Network Architecture¶

Training¶

Backpropagation¶

Learning as optimization¶

Example, least squares linear regression:¶

Example, logistic regression:¶

Gradient descent¶

Prediction¶

Part of the class Applied Deep Learning ¶