In [1]:

# preliminaries: let's load some stuff
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import AxesImage
import seaborn as sns
sns.set_context('talk')

from matplotlib import animation, rc
from IPython.display import HTML
%config InlineBackend.figure_format = 'retina'
np.random.seed(12345)

/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm

More than a sum of parts: network computing with neurons¶

One of the most important ideas in theoretical neuroscience is the (surprising) finding that brains are more than just a collection of the computations performed by individual neurons. This is partly because the properties we think of as "single neuron" computations are often the byproduct of network effects. But it is also reflective of a deeper truth: nonlinearity and recurrence unlock powerful computational capabilities in networks.

Let's unpack that a bit.

Linearity and its discontents¶

First off, nonlinearity describes any phenomenon which is, well, not linear. So what does linear mean? Mostly this: if we make the input to a system twice as strong, the response of the system will scale in proportion. So, if we're thinking of input-ouput functions of neurons, we'd get something like:

In [2]:

input_range = np.arange(3, 10, 0.1)
plt.figure(figsize=(6, 4), facecolor="white")
plt.plot(input_range, 0.5 * input_range - 3)
plt.xlabel('Input')
plt.ylabel('Output')
plt.title('A linear input-output function');

So far so good. But as we know, neurons don't have negative outputs. Their input-output functions look more like this:

In [3]:

plt.figure(figsize=(6, 4), facecolor="white")
plt.plot(input_range, np.maximum(0.5 * input_range - 3, 0))
plt.xlabel('Input')
plt.ylabel('Output')
plt.title('A nonlinear input-output function');

Notice something really interesting: Even though this function is linear for inputs > 6, it is nonlinear over all. It's a very, very mild nonlinearity, but that's enough for fascinating things to happen.

In the next section, we'll discuss why.

Composition and recurrence¶

One of the mathematically nicest but also most limiting features of linear systems is that when we compose linear systems (feeding the output of one into another), they remain linear. Using lots of linear building blocks in sequence doesn't really buy us more than having one giant linear system.

But when we compose nonlinear systems, all sorts of things can happen. In fact, if we take three sets of neurons — an input layer, a middle (often called hidden) layer, and an output layer — this is enough to approximate any function.

But there's another idea we might have: what if we feed the output of a system (linear or nonlinear) back into itself? We call that idea recurrence, and it's also incredibly powerful. In fact, recurrent linear systems do some pretty cool things, but recurrent nonlinear systems have can have really, really complex behavior.

The rest of this section is about exploring how nonlinearity and recurrence can work together to do something that no collection of isolated neurons can do.

Hopfield networks¶

One of the most influential examples of how nonlinearity and recurrence can work together to perform interesting computations was given by John Hopfield in his seminal 1982 PNAS paper (cited over 20,000 times!). In a nutshell, what Hopfield showed was that collections of neurons with nonlinear interactions could create dynamics that resulted in new kinds of collective behavior. In particular, he applied these ideas to provide a new model of associative memory we'll explore below.

Like most models in neuroscience, Hopfield's model makes a number of simplifications — it's not a simulator. Rather, in theoretical neuroscience, the key question is often how to capture just enough to detail to illustrate a concept while keeping the math tractable.

Our treatment is based on Hopfield's own very readable review here. All the code is below, but we'll focus on concepts.

Just enough math¶

So here goes:

Take a bunch of neurons. Let's index them with $j = 1\ldots N$.

Let the activity of neuron $j$ be $V_j$. In this demo, we will assume that these values are binary, i.e. $V_j$ is either 0 or 1. (This would normally be the membrane voltage. Since we have one per neuron, it's a vector.)

Let the connection strength from neuron $k$ to neuron $j$ be $T_{jk}$. (These form a matrix of synaptic strengths.)

Let's assume every neuron is also getting some input $I_j$. (These would be external currents (also a vector).)

Now here's how it behaves:

Figure out the total input to each neuron: $x_j = I_j + \sum_k T_{jk} V_k$ (or, if you like linear algebra, $\mathbf{x} = \mathbf{T V + I}$).
Each neuron has a nonlinear response: set $V_j = 1$ if $x_j > 0$ and $V_j = 0$ if $x_j < 0$.
Repeat.

Note that this process is nonlinear (from step 2) and recurrent (from steps 1 and 3). Because each neuron's $V$ keeps changing, the inputs to its connections keep changing, and that will lead to interesting dynamics.

Visualized in a graph form, Hopfield networks allow the neurons (shown as circles) can be turned "on" and "off" in each step, depending on the connection strengths $T$ and the input $I$, as well as the state of neurons $V_j$ in the previous step:

This raises two important questions:

Does the system ever settle down?
Can we use these dynamics to do something interesting?

The answer to both questions is yes, and for fascinating reasons.

For the first question, it's that the system has something called a Lyapunov function, which is something like an energy that always decreases over time. And this function has a minimum (under certain mathematical conditions). In these cases, the system is guaranteed to settle down.

As for the second question,

Implementing the Hopfield network¶

We'll start by selecting some patterns for the network to remember.

In this demo, we'll use a formula for getting $T$ that remembers certain patterns, and one of its requirements is that these patterns can't be just anything. For this formula to work, they have to be approximately "non-overlapping" (orthogonal) for the following to work.

Here, we'll choose a particularly simple form for a set of orthogonal patterns: horizontal and vertical stripes. Each of these patterns is a 32×32-pixel image that encodes a possible $V$ of a 1024-neuron Hopfield network.

In [4]:

Npat = 10  # number of patterns
Ndim = 32  # each pattern will be an image Ndim x Ndim

patterns = np.empty((Npat, Ndim**2))

plt.figure(figsize=(Npat * 2, 2), facecolor="white")

for ii in range(Npat):
    plt.subplot(1, Npat, ii + 1)
    patterns[ii] = np.mod((np.arange(Ndim**2) // 2**ii), 2)
    plt.imshow(patterns[ii].reshape(Ndim, Ndim))
    plt.axis('off')

plt.suptitle('Network patterns', y=1)

Out[4]:

Text(0.5, 1, 'Network patterns')

Define a connection matrix according to this formula in Hopfield's original paper:

$$ T_{ij} ~=~ \begin{cases} \sum_{s=1}^S \left ( 2 V_{i}^{(s)} - 1 \right ) \left ( 2 V_{j}^{(s)} - 1 \right ) & \text{if } i \ne j \\ 0 & \text{if } i = j \end{cases} $$

will store the set of $S$ patterns, denoted by $V^{(s)}$ for $s = 1, \cdots S$. In our example, $T$ would be a 1024×1024 matrix.

In [5]:

T = (2 * patterns - 1).T @ (2 * patterns - 1)
np.fill_diagonal(T, 0)
plt.figure(figsize=(8, 6), facecolor='white')
plt.imshow(T)
plt.colorbar()
plt.title('Connection matrix')

Out[5]:

Text(0.5, 1.0, 'Connection matrix')

And if we check the dot products of the patterns against each other, we can see that they're orthogonal:

In [6]:

plt.figure(figsize=(8, 6), facecolor="white")
plt.imshow((2 * patterns - 1) @ (2 * patterns - 1).T)
plt.title('Pattern dot products')
plt.colorbar()

Out[6]:

<matplotlib.colorbar.Colorbar at 0x7fdea2b844a8>

Define a function to evolve the network one time step:

In [7]:

def evolve(V, T, I):
    return T @ V + I > 0

Experiment 1: stable states¶

Pick a random starting pattern to initialize the network to.

(Each time you run this cell, you'll get a new pattern.)

And let's see what happens when we let the network evolve:

(The following function animates the network's evolution based on this blog post.)

In [8]:

def plot_network(Vinit, Nframes=100):
    img = plt.matshow(Vinit.reshape((Ndim, Ndim)))
    plt.axis('off');
    
    def evolver(Vinit):
        v = Vinit
        yield np.zeros(v.shape)
        yield v
        for _ in range(Nframes):
            v = evolve(v, T, I)
            yield v
    
    # animation function. This is called sequentially
    def animate(v):
        img.set_data(v.reshape((Ndim, Ndim)))
        return (img,)
    
    # call the animator. blit=True means only re-draw the parts that have changed.
    anim = animation.FuncAnimation(img.figure, animate, 
                                   frames=evolver(Vinit.copy()), interval=200, repeat=False)

    return HTML(anim.to_jshtml())

In [9]:

V = np.random.rand(Ndim**2) > 0.5
I = 0

plt.figure(facecolor="white")
ax = plt.imshow(V.reshape((Ndim, Ndim)))
plt.title('Initial configuration')
plt.axis('off');
plt.show()

plot_network(V, 20)

Out[9]:

Try running the above cell a few times.

Can you think of a reason why not all initial configurations evolve into one of the "memorized" patterns?

Experiment 2: noise corruption¶

Let's corrupt one of the patterns with some random bit noise and see what the network does with it.

In [10]:

Vinit = patterns[2]

fraction_flips = 0.3
noise = np.random.rand(*V.shape) <= fraction_flips
V = np.logical_xor(Vinit, noise)
I = 0

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 4), facecolor="white")
axes[0].matshow(Vinit.reshape((Ndim, Ndim)))
axes[0].set_title('Original Pattern')
axes[0].axis('off');

axes[1].matshow(noise.reshape((Ndim, Ndim)))
axes[1].set_title('Noise')
axes[1].axis('off');

axes[2].matshow(V.reshape((Ndim, Ndim)))
axes[2].set_title('Combined')
axes[2].axis('off');

In [11]:

plot_network(V, 20)

Out[11]:

Try running this a few times with different patterns. How does the parameter fraction_flips (the fraction of bits that are corrupted in the initial pattern) affect recovery?

Big questions to ponder¶

How might a network like this be used by brains?
How could it be learned?
How could you test the hypothesis that a specific circuit(s) implemented a Hopfield-like network?