ConX is an accessible and powerful way to build and understand deep learning neural networks. Specifically, it sits on top of Keras, which sits on top of TensorFlow, CNTK, or Theano (though Theano is no longer being developed).
Rather than attempting to explain each of ConX's features up front, let's demonstrate them. There are 8 steps needed to construct and use a ConX network, and we will walk through each of them below.
This demonstration is being run in a Jupyter Notebook. ConX doesn't require running in the notebook, but if you do, you will be able to use the visualizations and dashboard.
As a demonstration, let's build a simple network for learning the XOR (exclusive or). XOR is defined as:
Input 1 | Input 2 | XOR |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
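In plain Python, XOR of two bits is simply "true exactly when the inputs differ", which we can use to check the table above (the `xor` helper name is ours, not part of ConX):

```python
def xor(a, b):
    # XOR is 1 exactly when the two inputs differ
    return int(a != b)

# Reproduce the truth table above
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor(a, b))
```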
We will need the Network and Layer classes from the conx module; here we import the whole module as cx:
import conx as cx
Using TensorFlow backend. ConX, version 3.7.4
Every network needs a name:
net = cx.Network("XOR Network")
Every layer needs a name and a size. We add each of the layers of our network. The first layer will be an "input" layer (named arbitrarily "input"). We only need to specify the size. For our XOR problem, there are two inputs:
net.add(cx.Layer("input", 2))
'input'
For the next layers, we will also use the default layer type for hidden and output layers. However, we also need to specify the activation function to apply to the "net inputs" of the layer, after the matrix multiplication. There are several choices of activation function, such as "sigmoid", "relu", and "tanh". You can try any of these, though the sigmoid function generally works best for this problem. Feel free to experiment with other options.
net.add(cx.Layer("hidden", 3, activation="sigmoid"))
net.add(cx.Layer("output", 1, activation="sigmoid"))
'output'
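To make the "net input" idea concrete: each unit computes a weighted sum of its inputs plus a bias, and the activation function is then applied to that sum. A minimal sketch with a single sigmoid unit (the weights and bias here are illustrative values, not the network's real ones):

```python
import math

def sigmoid(x):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

inputs  = [0.0, 1.0]
weights = [0.5, -0.3]   # illustrative values only
bias    = 0.1

# net input: weighted sum plus bias; activation: sigmoid of the net input
net_input  = sum(i * w for i, w in zip(inputs, weights)) + bias
activation = sigmoid(net_input)
print(activation)
```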
We connect up the layers as needed. This is a simple 3-layer network:
net.connect("input", "hidden")
net.connect("hidden", "output")
Note: we use the term layer here because each of these items composes the layer itself. In general, though, a layer can be composed of many such items; in that case, we call such a layer a bank.
Before we can do this step, we need to do two things:
The first option is called the error (or loss) function. There are many choices for the error function (such as "mse", the mean squared error), and we'll dive into each later. The second option is called the "optimizer". Again, there are many choices (such as "sgd" or "adam"), but we just briefly name them here.
For now, we'll just pick "mse" for the error function, and "sgd" (with a learning rate and momentum) for the optimizer.
And we compile the network:
net.compile(error="mse", optimizer="sgd", lr=0.1, momentum=0.5)
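To see what the lr and momentum arguments do, here is a minimal sketch of one "sgd" update to a single weight, assuming the common Keras-style convention (velocity = momentum * velocity - lr * gradient); this is an illustration of the update rule, not ConX's actual implementation:

```python
def sgd_momentum_step(weight, velocity, gradient, lr=0.1, momentum=0.5):
    # The velocity accumulates a decaying history of gradients,
    # which smooths and accelerates the descent.
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity

w, v = 0.5, 0.0
w, v = sgd_momentum_step(w, v, gradient=0.2)   # v = -0.02, w = 0.48
w, v = sgd_momentum_step(w, v, gradient=0.2)   # v = -0.03, w = 0.45
```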
Networks in ConX are initialized with small random weights in the range -1..1. Each unit in a layer is also given a bias, which is initialized to 0. A bias is trained just like the weights, and is added in when calculating a unit's incoming activation. Once trained, the bias provides each unit with a default activation value in the absence of other inputs.
You can inspect the weights coming into a layer, as shown below.
net.get_weights("hidden")
[[[-0.767471432685852, 0.10572397708892822, -0.7625459432601929], [-0.6755458116531372, 0.20516455173492432, 0.43674349784851074]], [0.0, 0.0, 0.0]]
net.get_weights("output")
[[[0.471946120262146], [0.9158862829208374], [1.1649702787399292]], [0.0]]
At this point in the steps, you can see a visual representation of the network by simply asking for a picture:
net.picture()
This is useful to see the layers and connections.
Propagating the network places an array on the input layer, and sends the values through the network. We can try any input vector:
net.propagate([-2, 2])
[0.8616290092468262]
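We can reproduce this value by hand. Using the untrained weights printed by net.get_weights above (all biases are zero), a pure-Python forward pass, written as a sketch of what propagate does internally, gives the same output:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights copied from net.get_weights("hidden") and net.get_weights("output");
# the bias terms are all zero, so they are omitted from the sums.
W_hidden = [[-0.767471432685852, 0.10572397708892822, -0.7625459432601929],
            [-0.6755458116531372, 0.20516455173492432, 0.43674349784851074]]
W_output = [0.471946120262146, 0.9158862829208374, 1.1649702787399292]

def forward(x):
    # hidden activations: sigmoid of each unit's weighted sum of the inputs
    hidden = [sigmoid(sum(x[i] * W_hidden[i][j] for i in range(2)))
              for j in range(3)]
    # single output unit: sigmoid of its weighted sum over the hidden layer
    return sigmoid(sum(h * w for h, w in zip(hidden, W_output)))

print(forward([-2, 2]))   # matches net.propagate([-2, 2]), about 0.86163
```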
If we would like to see the activations on all of the units in the network, we can take a picture with the same input vector. You should see some colored squares in the layers representing the activation levels at each unit:
net.picture([-2, 2])
In these visualizations, the color of each neuron gives an indication of its relative activation value.
For input layers, the default is a gray scale representing the possible range of values: black means more negative, and white means more positive.
For non-input layers, the more red a unit is, the smaller its value, and the more blue, the larger its value. Values close to zero appear white.
Notice that if you propagate this untrained network with zeros, then the hidden layer activations are all white. This means that there is no activation at any node in the network. This is because the biases are initialized to zero.
net.propagate([0,0])
[0.7818366289138794]
net.picture([0,0])
For this little experiment, we want to train the network on our table from above. To do that, we add the inputs and the targets to the dataset, one at a time:
net.dataset.append([0, 0], [0])
net.dataset.append([0, 1], [1])
net.dataset.append([1, 0], [1])
net.dataset.append([1, 1], [0])
net.dataset.info()
Dataset: Dataset for XOR Network
Information:
Input Summary:
Target Summary:
net.reset()
net.train(epochs=5000, accuracy=1.0, tolerance=0.2, report_rate=100)
========================================================
       |  Training |  Training
Epochs |     Error |  Accuracy
------ | --------- | ---------
# 4531 |   0.02997 |   1.00000
Perhaps the network learned none, some, or all of the patterns. You can reset the network, and try again, by re-running the above cell.
net.evaluate(show=True)
========================================================
Testing validation dataset with tolerance 0.2...
# | inputs         | targets  | outputs | result
-----------------------------------------------
0 | [[0.00, 0.00]] | [[0.00]] | [0.13]  | correct
1 | [[0.00, 1.00]] | [[1.00]] | [0.82]  | correct
2 | [[1.00, 0.00]] | [[1.00]] | [0.82]  | correct
3 | [[1.00, 1.00]] | [[0.00]] | [0.20]  | correct
Total count: 4
      correct: 4
      incorrect: 0
Total percentage correct: 1.0
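The notion of "correct" here is based on the tolerance: a pattern counts as correct when every output value is within the tolerance of its target. A minimal sketch of that check, applied to the targets and outputs shown above (`is_correct` is our own helper name, not a ConX function):

```python
def is_correct(output, target, tolerance=0.2):
    # correct when every output value is within `tolerance` of its target
    return all(abs(o - t) <= tolerance for o, t in zip(output, target))

targets = [[0.0], [1.0], [1.0], [0.0]]
outputs = [[0.13], [0.82], [0.82], [0.20]]
results = [is_correct(o, t) for o, t in zip(outputs, targets)]
print(sum(results), "of", len(results), "correct")   # 4 of 4 correct
```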
The dashboard allows you to interact, test, and generally work with your network via a GUI.
net.dashboard()
ConX has a number of methods for visualizing images. In the example below, we display a picture of the network for each input pattern in the dataset:
for i in range(4):
display(net.picture(net.dataset.inputs[i], scale=0.15))
There are five ways to propagate activations through the network:

- net.propagate(inputs) - propagate these inputs through the network
- net.propagate_to(bank-name, inputs) - propagate these inputs to this bank (returns the output at that layer)
- net.propagate_from(bank-name, activations) - propagate the activations from bank-name to the outputs
- net.propagate_to_image(bank-name, activations, scale=SCALE) - returns an image of the layer activations
- net.propagate_to_features(bank-name, activations, scale=SCALE) - gets a matrix of images for each feature (channel) at the layer

net.propagate_from("hidden", [0, 1, 0])
[0.0340755395591259]
net.propagate_to("hidden", [0.5, 0.5])
[0.2813264727592468, 0.031041061505675316, 0.1953100860118866]
net.propagate_to("hidden", [0.1, 0.4])
[0.36196693778038025, 0.26838693022727966, 0.06002996116876602]
There is also a propagate_to_image() method that takes a bank name and inputs.
net.propagate_to_image("hidden", [0.1, 0.4]).resize((500, 100))
You can re-plot the plots from the entire training history with:
net.plot_results()
You can plot values from the training history, such as "loss" and "acc", and you can plot any subset of them on the same plot:
net.plot(["acc", "loss"])
You can also see the activations at a particular unit, given a range of input values for two input units. Since this network has only two inputs, and one output, we can see the entire input and output ranges:
for i in range(net["hidden"].size):
net.plot_activation_map(from_layer="input", from_units=(0,1),
to_layer="hidden", to_unit=i)
net.plot_activation_map(from_layer="input", from_units=(0,1),
to_layer="output", to_unit=0)
net.plot_activation_map()
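An activation map like the ones above is just a unit's output sampled over a grid of values for two input units. A minimal sketch of the idea, using a stand-in sigmoid unit with made-up weights rather than the trained network:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def activation_map(unit, resolution=50, lo=0.0, hi=1.0):
    # Sample the unit's activation over a resolution x resolution grid
    # of (input0, input1) values spanning [lo, hi] in each dimension.
    step = (hi - lo) / (resolution - 1)
    return [[unit(lo + i * step, lo + j * step) for i in range(resolution)]
            for j in range(resolution)]

# Stand-in for one hidden unit (illustrative weights, not the trained ones):
unit = lambda x0, x1: sigmoid(2.0 * x0 - 3.0 * x1 + 0.5)
grid = activation_map(unit)
```

Each cell of grid could then be rendered as a colored square, which is essentially what plot_activation_map draws.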