# 09 - Introduction to Neural Networks¶

by Fabio A. González, Universidad Nacional de Colombia

version 1.0, June 2018

## Part of the class Applied Deep Learning¶

To run this notebook you need to download Pybrain (https://github.com/pybrain/pybrain) and copy the pybrain folder to the same folder where this notebook is.

In [2]:
import pybrain
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

%matplotlib inline


## Artificial Neuron¶

$$o_j^{(n)} = \varphi\left(\sum_{i\; in\; layer (n-1)}w_{ij}o_i^{(n-1)} \right)$$

## Logistic activation function¶

$$\varphi(x) = \frac{1}{1 - e^{-(x-b)}}$$

### Question: How to program an artificial neuron to calculate the and function?¶

$X$ $Y$ $X$ and $Y$
0 0 0
0 1 0
1 0 0
1 1 1

## AND Neural Network¶

In [3]:
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2, 1, outclass=pybrain.SigmoidLayer)
print(net.params)

[ 0.453198   -1.24483399 -0.91415624]

In [4]:
def print_pred2(dataset, network):
df = pd.DataFrame(dataset.data['sample'][:dataset.getLength()],columns=['X', 'Y'])
prediction = np.round(network.activateOnDataset(dataset),3)
df['output'] = pd.DataFrame(prediction)
return df

from pybrain.datasets import UnsupervisedDataSet, SupervisedDataSet
D = UnsupervisedDataSet(2) # define a dataset in pybrain
print_pred2(D, net)

Out[4]:
X Y output
0 0.0 0.0 0.611
1 0.0 1.0 0.387
2 1.0 0.0 0.312
3 1.0 1.0 0.154

## AND Neural Network¶

In [5]:
net.params[:] = [ -150, -100,  100]
print_pred2(D, net)

Out[5]:
X Y output
0 0.0 0.0 0.0
1 0.0 1.0 0.0
2 1.0 0.0 0.0
3 1.0 1.0 0.0

### Question: How to program an artificial neuron to calculate the xor function?¶

$X$ $Y$ $X$ xor $Y$
0 0 0
0 1 1
1 0 1
1 1 0

## Plotting the NN Output¶

In [6]:
def plot_nn_prediction(N):
# a function to plot the binary output of a network on the [0,1]x[0,1] space
x_list = np.arange(0.0,1.0,0.025)
y_list = np.arange(1.0,0.0,-0.025)
z = [0.0 if N.activate([x,y])[0] <0.5 else 1.0  for y in y_list for x in x_list]
z = np.array(z)
grid = z.reshape((len(x_list), len(y_list)))
plt.imshow(grid, extent=(x_list.min(), x_list.max(), y_list.min(), y_list.max()),cmap=plt.get_cmap('Greys_r'))
plt.show()


## Plotting the NN Output¶

In [7]:
net.params[:] = [-20, -50, 50]
plot_nn_prediction(net)


## Learning an XOR NN¶

In [8]:
Dtrain = SupervisedDataSet(2,1) # define a dataset in pybrain

from pybrain.supervised.trainers import BackpropTrainer

net = buildNetwork(2, 2, 1, hiddenclass=pybrain.SigmoidLayer, outclass=pybrain.SigmoidLayer)
T = BackpropTrainer(net, learningrate=0.1, momentum=0.9)
T.trainOnDataset(Dtrain, 1000)
print_pred2(D, net)

Out[8]:
X Y output
0 0.0 0.0 0.044
1 0.0 1.0 0.949
2 1.0 0.0 0.949
3 1.0 1.0 0.049

## XOR NN Output Plot¶

In [9]:
plot_nn_prediction(net)


## Training¶

In [10]:
from pybrain.tools.validation import Validator

validator =  Validator()
Dlrrh = SupervisedDataSet(4,4)
df = pd.DataFrame(Dlrrh['input'],columns=['Big Ears', 'Big Teeth', 'Handsome', 'Wrinkled'])
print (df.join(pd.DataFrame(Dlrrh['target'],columns=['Scream', 'Hug', 'Food', 'Kiss'])))
net = buildNetwork(4, 3, 4, hiddenclass=pybrain.SigmoidLayer, outclass=pybrain.SigmoidLayer)

   Big Ears  Big Teeth  Handsome  Wrinkled  Scream  Hug  Food  Kiss
0       1.0        1.0       0.0       0.0     1.0  0.0   0.0   0.0
1       0.0        1.0       1.0       0.0     0.0  0.0   1.0   1.0
2       0.0        0.0       0.0       1.0     0.0  1.0   1.0   0.0


## Backpropagation¶

In [11]:
T = BackpropTrainer(net, learningrate=0.01, momentum=0.99)
scores = []
for i in range(1000):
T.trainOnDataset(Dlrrh, 1)
prediction = net.activateOnDataset(Dlrrh)
scores.append(validator.MSE(prediction, Dlrrh.getField('target')))
plt.ylabel('Mean Square Error')
plt.xlabel('Iteration')
plt.plot(scores)

Out[11]:
[<matplotlib.lines.Line2D at 0x10dc8cc50>]

## Learning as optimization¶

General optimization problem:

$$\min_{f\in H}L(f,D)$$

with $H$: hypothesis space, $D$:training data, $L$:loss/error

## Example, least squares linear regression:¶

$$\min_{f\in H}L(f,D)$$
• Hypothesis space: $f(x)=w^{T}x$
• Data: $D = \{(x_1, t_1), \dots , (x_n, t_n)\}$
• Least squares loss: $$L(f, D)=-\sum_{t_i \in D}(t_{i} - f(x_i))^2$$

## Example, logistic regression:¶

$$\min_{f\in H}L(f,D)$$
• Hypothesis space: $f(x)=P(C_{+}|x)=\sigma(w^{T}x)$
• Data: $D = \{(x_1, t_1), \dots , (x_n, t_n)\}$
• Cross-entropy error: $$E(f,D)=-\ln p(D|f)=-\sum_{t_i \in D}(t_{n}\ln y_{n}+(1-t_{n})\ln(1-y_{n}))$$

## Prediction¶

In [12]:
def lrrh_input(vals):
return pd.DataFrame(vals,index=['big ears', 'big teeth', 'handsome', 'wrinkled'], columns=['input'])

def lrrh_output(vals):
return pd.DataFrame(vals,index=['scream', 'hug', 'offer food', 'kiss cheek'], columns=['output'])

In [13]:
in_vals = [0, 0, 0, 0]
lrrh_input(in_vals)

Out[13]:
input
big ears 0
big teeth 0
handsome 0
wrinkled 0
In [14]:
lrrh_output(net.activate(in_vals))

Out[14]:
output
scream 0.055038
hug 0.786478
offer food 0.947874
kiss cheek 0.090418