Neural Networks

Introduction

We will assume that you have already been exposed to neural network modeling. This section is designed to help you quickly review the basics you will need in order to create and experiment with neural networks in Pyrobot.

See the Pyrobot website for more info.

We will concentrate mostly on backpropagation networks here. A typical backprop network is a three-layer network containing input, hidden, and output layers. Each layer contains a collection of nodes. Typically, the nodes in a layer are fully connected to the next layer; for instance, every input node has a weighted connection to every hidden node, and every hidden node has a weighted connection to every output node.
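
As a rough illustration (this is not how Pyrobot represents its weights internally), a hypothetical fully connected 2-3-1 network can be pictured as one weight matrix per pair of adjacent layers:

In [ ]:
# A hypothetical fully connected 2-3-1 network: 2 input, 3 hidden, 1 output node.
# Each weight matrix has one row per receiving node and one column per sending node.
numInput, numHidden, numOutput = 2, 3, 1

inputToHidden  = [[0.0] * numInput  for h in range(numHidden)]   # 3 x 2 = 6 weights
hiddenToOutput = [[0.0] * numHidden for o in range(numOutput)]   # 1 x 3 = 3 weights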

Processing in a backprop network works as follows. Input is propagated forward from the input layer through the hidden layer and finally through the output layer to produce a response. Each node, regardless of the layer it is in, uses the same transfer function in order to propagate its information forward to the next layer. This is described next.

Transfer function of a node

Each node maintains an activation value that depends on the activation values of its incoming neighbors, the weights from its incoming neighbors, and its own default bias value. To compute this activation value, we first calculate the node's net input.

The net input is a weighted sum of all the incoming activations plus the node's bias value:

$net_i = \sum\limits_{j=1}^n w_{ij} x_j + b_i$

Here is some corresponding Python code to compute this function for each node:

In [2]:
# Compute the net input for nodes 3 and 4 (the "to" nodes), which receive
# connections from nodes 0 and 1 (the "from" nodes).
toNodes = range(3, 5)
fromNodes = range(0, 2)

bias       = [0.2, -0.1, 0.5, 0.1, 0.4, 0.9]
activation = [0.8, -0.3, -0.8, 0.1, 0.5]
netInput   = [0, 0, 0, 0, 0]
weight = [[ 0.1, -0.8], 
          [-0.3,  0.1], 
          [ 0.2, -0.1], 
          [ 0.0,  0.1], 
          [ 0.8, -0.8], 
          [ 0.4,  0.5]]

for i in toNodes:
    # Start with the node's bias, then add each weighted incoming activation.
    netInput[i] = bias[i]
    for j in fromNodes:
        netInput[i] += (weight[i][j] * activation[j])

where weight[i][j] is the weight $w_{ij}$, or connection strength, from the $j^{th}$ node to the $i^{th}$ node, activation[j] is the activation signal $x_j$ of the $j^{th}$ input node, and bias[i] is the bias value $b_i$ of the $i^{th}$ node.

After computing its net input, each node computes its output activation. The value that results from applying the activation function to the net input is the signal sent as output to all the nodes in the next layer. The activation function used in backprop networks is typically the logistic (sigmoid) function:

$a_i = \sigma(net_i)$

where $\sigma(x) = \dfrac{1}{1 + e^{-x}}$

In [3]:
import math

def activationFunction(netInput):
    # The logistic (sigmoid) function, squashing any net input into (0, 1).
    return 1.0 / (1.0 + math.exp(-netInput))

for i in toNodes:
    # Each receiving node's activation is its squashed net input.
    activation[i] = activationFunction(netInput[i])

This $\sigma$ is the activation function, as shown in the plot below. Notice that the function is monotonically increasing and approaches 0.0 and 1.0 as the net input approaches negative infinity and positive infinity, respectively.

In [4]:
pts = [(x, activationFunction(x)) for x in range(-10, 10)]
In [7]:
calico.ScatterChart(['x', 'activation'], pts, {'width': 600, 'height': 400, 
                                               'legend': {'position': 'in'}, 'lineWidth': 1, 'pointSize': 3})
Out[7]: [scatter plot of the sigmoid activation function]
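
Putting the two pieces together, here is a minimal sketch of a complete forward pass through a hypothetical 2-3-1 network, using the same net input and activation computations shown above (the weights, biases, and inputs are illustrative values only, not taken from Pyrobot):

In [ ]:
import math

def activationFunction(netInput):
    return 1.0 / (1.0 + math.exp(-netInput))

def forward(inputs, weights, biases):
    # Propagate one layer's activations forward through one set of weights.
    # weights[i][j] is the connection from sending node j to receiving node i.
    outputs = []
    for i in range(len(weights)):
        net = biases[i]
        for j in range(len(inputs)):
            net += weights[i][j] * inputs[j]
        outputs.append(activationFunction(net))
    return outputs

# Illustrative weights and biases for a 2-3-1 network (hypothetical values).
inputToHidden  = [[0.1, -0.8], [-0.3, 0.1], [0.2, -0.1]]   # 3 hidden nodes x 2 input nodes
hiddenToOutput = [[0.8, -0.8, 0.4]]                        # 1 output node x 3 hidden nodes
hiddenBias     = [0.1, 0.4, 0.9]
outputBias     = [0.2]

hiddenActivations = forward([0.8, -0.3], inputToHidden, hiddenBias)
outputActivations = forward(hiddenActivations, hiddenToOutput, outputBias)
print(hiddenActivations, outputActivations)

Pyrobot's conx library packages these computations, along with the backpropagation learning rule, behind a Network class; the cells below build such a network.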
In [2]:
from ai.conx import *
In [5]:
net = Network()         # create an empty conx network
net.addLayers(2, 3, 1)  # add input, hidden, and output layers of sizes 2, 3, and 1
print(net)
Conx using seed: 1398275934.28
Layer 'output': (Kind: Output, Size: 1, Active: 1, Frozen: 0)
Target    : 0.00  
Activation: 0.00  
Layer 'hidden': (Kind: Hidden, Size: 3, Active: 1, Frozen: 0)
Activation: 0.00  0.00  0.00  
Layer 'input': (Kind: Input, Size: 2, Active: 1, Frozen: 0)
Activation: 0.00  0.00  
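
Once a network has been built, it can be trained on a set of input/target patterns. The sketch below uses the classic XOR problem and assumes that the conx Network methods setInputs, setTargets, and train and the tolerance attribute behave as in the standard conx examples; check the Pyrobot documentation for the exact API.

In [ ]:
# A sketch of training on XOR, assuming the standard conx training interface
# (setInputs/setTargets/train); consult the Pyrobot docs for the exact API.
net = Network()
net.addLayers(2, 3, 1)
net.setInputs([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
net.setTargets([[0.0], [1.0], [1.0], [0.0]])
net.tolerance = 0.4   # how close an output must be to its target to count as correct
net.train()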
