# SYDE 556/750: Simulating Neurobiological Systems¶

## Learning¶

• What do we mean by learning?

• When we use an integrator to keep track of location, is that learning?
• What about the learning used to complete a pattern in the Raven's Progressive Matrices task?
• Neither of these requires any connection weights to change in the model
• But both allow future performance to be affected by past performance
• I suggest the term 'adaptation' to capture all such future-affected-by-past phenomena
• So, we'll stick with a simple definition of learning

• Changing connection weights between groups of neurons
• Why might we want to change connection weights?
• This is what traditional neural network approaches do
• Change connection weights until it performs the desired task
• Once it's doing the task, stop changing the weights
• But we have a method for just solving for the optimal connection weights
• So why bother learning?

### Why learning might be useful¶

• We might not know the function at the beginning of the task
• Example: a creature explores its environment and learns that eating red objects is bad, but eating green objects is good
• what are the inputs and outputs here?
• The desired function might change
• Example: an ensemble whose input is a desired hand position, but the output is the muscle tension (or joint angles) needed to get there
• why would this change?
• The optimal weights we solve for might not be optimal
• How could they not be optimal?
• What assumptions are we making?

### The simplest approach¶

• What's the easiest way to deal with this, given what we know?
• If we need new decoders
• Let's solve for them while the model's running
• Gather data to build up our $\Gamma$ and $\Upsilon$ matrices (see the sketch after this list)
• Example: eating red but not green objects
• Decoder from state to $Q$ value (utility of action) for eating
• State is some high-dimensional vector that includes the colour of what we're looking for
• And probably some other things, like whether it's small enough to be eaten
• Initially doesn't use colour to get output
• But we might experience a few bad outcomes after red, and good after green
• These become new $x$ samples, with corresponding $f(x)$ outputs
• Gather a few, recompute decoder
• Could even do this after every timestep
• Example: converting hand position to muscle commands
• Send random signals to muscles
• Observe hand position
• Use that to train decoders
• Example: going from optimal to even more optimal
• As the model runs, we gather $x$ values
• Recompute decoder for those $x$ values
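As a rough illustration, here is a minimal numpy sketch of this idea: accumulate the $\Gamma$ and $\Upsilon$ matrices sample by sample, then re-solve for the decoders whenever you like. The activities and targets below are placeholders, and the regularization choice is just one reasonable option, not the course's prescribed one.

```python
import numpy as np

n_neurons, dims = 50, 1
Gamma = np.zeros((n_neurons, n_neurons))   # accumulates a a^T over samples
Upsilon = np.zeros((n_neurons, dims))      # accumulates a f(x) over samples

for _ in range(1000):
    a = np.random.rand(n_neurons)          # stand-in for measured activities a(x)
    fx = np.array([a.mean()])              # stand-in for the observed target f(x)
    Gamma += np.outer(a, a)
    Upsilon += np.outer(a, fx)

# Re-solve for decoders at any point (ridge-regularized least squares,
# as in the usual NEF solution d = (Gamma + reg I)^-1 Upsilon)
reg = 0.1 * np.max(np.diag(Gamma))
d = np.linalg.solve(Gamma + reg * np.eye(n_neurons), Upsilon)
```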

### What's wrong with this approach¶

• Feels like cheating
• Why?
• Two kinds of problems:
• Not biologically realistic
• How are neurons supposed to do all this?
• store data
• solve decoders
• timing
• Computationally expensive
• Even if we're not worried about realism

• Traditionally, learning is the main method of constructing a model network
• Usually incremental learning (gradient descent)
• As you get examples, shift the connection weights slightly based on that example
• Don't have to consider all the data when making an update
• Example: Perceptron learning (Rosenblatt, 1957); a small sketch follows this list
• $\Delta w_i = \alpha(y_d - y)x_i$

• Can't do all possible functions
• Effectively just linear functions of $x$ (with a threshold; i.e. a linear classifier)
• Is that a problem?
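A small numpy sketch of the perceptron rule on made-up, linearly separable data; the dataset, bias term, and parameter values are all illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 2))
y_d = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # linearly separable target

w = np.zeros(2)
b = 0.0
alpha = 0.1
for epoch in range(20):
    for x, t in zip(X, y_d):
        y = float(np.dot(w, x) + b > 0)   # thresholded (linear classifier) output
        w += alpha * (t - y) * x          # Delta w_i = alpha (y_d - y) x_i
        b += alpha * (t - y)              # same rule applied to the bias
```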

## Backprop and the NEF¶

• How are nonlinear functions included?
• Multiple layers

• But now a new rule is needed
• Same as perceptron for first (output) layer
• Backprop adds: Estimate correct "hidden layer" input, and repeat
• What would this be in NEF terms?
• Remember that we're already fine with linear decoding
• Encoders (and $\alpha$ and $J^{bias}$) are input layer of weights, decoders are output layer
• Note that in the NEF, we combine many of these layers together
• We can just use the standard perceptron rule for decoders
• As long as there are lots of neurons, and we've initialized them well with the desired intercepts, maximum rates, and encoders, we should be able to decode lots of functions
• So, what might backprop add to that?

## Biologically realistic perceptron learning¶

• (MacNeil & Eliasmith, 2011) derive a simple, plausible learning rule starting with a delta rule
• $E = \frac{1}{2} \int (x-\hat{x})^2 dx$
• $\partial E/\partial d_i = -(x-\hat{x})a_i$ (as usual for finding decoders)
• So, to move down the gradient:
• $\Delta d_i = -\kappa\, \partial E/\partial d_i = \kappa (x - \hat{x})a_i$ (NEF notation)
• $\Delta d_i = \kappa (y_d - y)a_i$ (the standard perceptron/delta rule)
• How do we make it realistic?
• Decoders don't exist in the brain
• Need weights
• The NEF tells us:
• $\omega_{ij} = \alpha_j d_i \cdot e_j$
• $\Delta \omega_{ij} = \kappa \alpha_j (y_d - y)a_i \cdot e_j$
• define $(y_d - y) = E$ (for error) and re-arrange
• $\Delta \omega_{ij} = \kappa \alpha_j a_i E \cdot e_j$ (sketched in numpy just before the examples below)
• What's $E \cdot e_j$?
• That's the current that this neuron would get if it had $E$ as an input
• But we don't want this current to drive the neuron
• Rather, we want it to change the weight
• It's a modulatory input
• This is the "Prescribed Error Sensitivity" (PES) rule
• Any model in the NEF could use this instead of computing decoders
• Requires some other neural group computing the error $E$
• Used in Spaun for Q-value learning (reinforcement task)
• Can even be used to learn circular convolution
• Is this realistic?
• Local information only
• Need an error signal
• Does anything like this happen in the brain?
• Yes
• Retinal slip error is computed in the oculomotor system
• Dopamine seems to act as prediction error
• Weight changes proportional to pre-synaptic activity and post-synaptic activity (Hebbian rule)
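Before the Nengo examples below, here is a minimal numpy sketch of the outer-product structure of one PES update; all sizes and values are made up for illustration.

```python
import numpy as np

rng = np.random.RandomState(1)
n_pre, n_post, dims = 40, 30, 2
a = rng.rand(n_pre)            # presynaptic activities a_i
e = rng.randn(n_post, dims)    # postsynaptic encoders e_j
alpha_j = np.ones(n_post)      # postsynaptic gains
E = np.array([0.3, -0.1])      # error vector from some other population
kappa = 1e-3

# PES: Delta omega_ij = kappa * alpha_j * a_i * (E . e_j)
# note that E only modulates the weight change; it does not drive the neuron
delta_omega = kappa * np.outer(a, alpha_j * (e @ E))   # shape (n_pre, n_post)
```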
In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib

In [2]:
#From the learning examples in Nengo - a Communication Channel
import nengo
import numpy as np
from nengo.processes import WhiteSignal

model = nengo.Network('Learn a Communication Channel')
with model:
    stim = nengo.Node(output=WhiteSignal(10, high=5, rms=0.5), size_out=2)

    pre = nengo.Ensemble(60, dimensions=2)
    post = nengo.Ensemble(60, dimensions=2)

    nengo.Connection(stim, pre)
    conn = nengo.Connection(pre, post, function=lambda x: np.random.uniform(-1, 1, 2))

    inp_p = nengo.Probe(stim)
    pre_p = nengo.Probe(pre, synapse=0.01)
    post_p = nengo.Probe(post, synapse=0.01)

sim = nengo.Simulator(model)
sim.run(10.0)


In [3]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/pre_learn.py.cfg')

In [4]:
t=sim.trange()

figure(figsize=(12, 8))
subplot(2, 1, 1)
plot(t, sim.data[inp_p].T[0], c='k', label='Input')
plot(t, sim.data[pre_p].T[0], c='b', label='Pre')
plot(t, sim.data[post_p].T[0], c='r', label='Post')
ylabel("Dimension 1")
legend(loc='best')
title('Random function computation')

subplot(2, 1, 2)
plot(t, sim.data[inp_p].T[1], c='k', label='Input')
plot(t, sim.data[pre_p].T[1], c='b', label='Pre')
plot(t, sim.data[post_p].T[1], c='r', label='Post')
ylabel("Dimension 2")
legend(loc='best');

In [5]:
#Now learn
with model:
    error = nengo.Ensemble(60, dimensions=2)
    error_p = nengo.Probe(error, synapse=0.03)

    # Error = actual - target = post - pre
    nengo.Connection(post, error)
    nengo.Connection(pre, error, transform=-1)

    # Add the learning rule to the connection
    conn.learning_rule_type = nengo.PES()

    # Connect the error into the learning rule
    learn_conn = nengo.Connection(error, conn.learning_rule)

sim = nengo.Simulator(model)
sim.run(10.0)


In [28]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/simple_learn.py.cfg')

In [6]:
t=sim.trange()

figure(figsize=(12, 8))
subplot(3, 1, 1)
plot(t, sim.data[inp_p].T[0], c='k', label='Input')
plot(t, sim.data[pre_p].T[0], c='b', label='Pre')
plot(t, sim.data[post_p].T[0], c='r', label='Post')
ylabel("Dimension 1")
legend(loc='best')
title('Learn a communication channel')

subplot(3, 1, 2)
plot(t, sim.data[inp_p].T[1], c='k', label='Input')
plot(t, sim.data[pre_p].T[1], c='b', label='Pre')
plot(t, sim.data[post_p].T[1], c='r', label='Post')
ylabel("Dimension 2")
legend(loc='best');

subplot(3, 1, 3)
plot(sim.trange(), sim.data[error_p])
ylim(-1, 1)
legend(("Error[0]", "Error[1]"), loc='best')
title('Error');

In [7]:
#Turning learning on and off to test generalization
def inhibit(t):
    return 2.0 if t > 10.0 else 0.0

with model:
    inhib = nengo.Node(inhibit)
    inhib_conn = nengo.Connection(inhib, error.neurons, transform=[[-1]] * error.n_neurons)

sim = nengo.Simulator(model)
sim.run(16.0)


In [30]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/control_learn.py.cfg')

In [8]:
t=sim.trange()

figure(figsize=(12, 8))
subplot(3, 1, 1)
plot(t, sim.data[inp_p].T[0], c='k', label='Input')
plot(t, sim.data[pre_p].T[0], c='b', label='Pre')
plot(t, sim.data[post_p].T[0], c='r', label='Post')
ylabel("Dimension 1")
legend(loc='best')
title('Learn a communication channel')

subplot(3, 1, 2)
plot(t, sim.data[inp_p].T[1], c='k', label='Input')
plot(t, sim.data[pre_p].T[1], c='b', label='Pre')
plot(t, sim.data[post_p].T[1], c='r', label='Post')
ylabel("Dimension 2")
legend(loc='best');

subplot(3, 1, 3)
plot(sim.trange(), sim.data[error_p])
ylim(-1, 1)
legend(("Error[0]", "Error[1]"), loc='best')
title('Error');

In [9]:
# Compute a nonlinear function
#model.connections.remove(err_fcn) # uncomment to try other fcns
#del err_fcn
model.connections.remove(inhib_conn)
model.nodes.remove(inhib)
model.connections.remove(learn_conn)

def nonlinear(x):
    return x[0]*x[0], x[1]*x[1]

with model:
    # note: the unnamed (pre -> error, transform=-1) connection from the
    # communication-channel cell is still in the model at this point
    err_fcn = nengo.Connection(pre, error, function=nonlinear, transform=-1)

    conn.learning_rule_type = nengo.PES(learning_rate=1e-4)
    # Connect the error into the learning rule
    learn_conn = nengo.Connection(error, conn.learning_rule)

sim = nengo.Simulator(model)
sim.run(26.0)


In [35]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/square_learn.py.cfg')

In [10]:
t=sim.trange()

figure(figsize=(12, 8))
subplot(3, 1, 1)
plot(t, sim.data[inp_p].T[0], c='k', label='Input')
plot(t, sim.data[pre_p].T[0], c='b', label='Pre')
plot(t, sim.data[post_p].T[0], c='r', label='Post')
ylabel("Dimension 1")
legend(loc='best')
title('Learn a nonlinear function')

subplot(3, 1, 2)
plot(t, sim.data[inp_p].T[1], c='k', label='Input')
plot(t, sim.data[pre_p].T[1], c='b', label='Pre')
plot(t, sim.data[post_p].T[1], c='r', label='Post')
ylabel("Dimension 2")
legend(loc='best');

subplot(3, 1, 3)
plot(sim.trange(), sim.data[error_p])
ylim(-1, 1)
legend(("Error[0]", "Error[1]"), loc='best')
title('Error');

• This rule can be used to learn any nonlinear vector function
• It does as well as, or better than, the NEF decoder optimization
• It's a 'spike-based' rule... meaning it works in a spiking network
• It has been used for 'constant supervision' as well as 'reinforcement learning' (occasional supervision) tasks (Spaun uses it for the RL task)
• It moves the focus of learning research from weight changes or 'learning rules' to error signals
• Backprop is one way of propagating error signals (unfortunately not bio-plausible)
• It largely ignores encoders (which arguably should capture all the incoming information, so that any function can be computed over it, though they can also be optimized for a given function)

## Applications of PES¶

### Classical conditioning¶

• Classical (Pavlovian) conditioning pairs an unconditioned stimulus (US; e.g., meat for a dog), which elicits an unconditioned response (UR; salivating), with a conditioned stimulus (CS; ringing a bell), so that after learning the CS alone elicits a conditioned response (CR; salivating).

• The best known model of this is the Rescorla-Wagner model, which states:

$\Delta V_x = \alpha (\lambda - \sum V)$

where $V_x$ is the associative strength of conditioned stimulus $x$, $\alpha$ is a learning rate and salience parameter, $\lambda$ is the maximum associative strength the US supports (usually 1), and the sum runs over all CSs present on the trial.

• In the model below there is only one element in $\sum V$ (or you can assume there is little association between other stimuli and the US). The difference in brackets acts like a reward prediction error. A minimal simulation of this rule follows.
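A minimal simulation of Rescorla-Wagner for a single CS, with made-up parameter values, just to show the prediction-error dynamics:

```python
import numpy as np

alpha, lam = 0.3, 1.0   # learning rate/salience and max associative strength
V = 0.0
history = []
for trial in range(20):          # CS and US paired on every trial
    V += alpha * (lam - V)       # Delta V = alpha (lambda - sum V), one CS
    history.append(V)
print(np.round(history[:5], 3))  # V climbs toward lambda with shrinking steps
```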

In this model:

• Three different USs are provided to the model, one after the other.
• Each has a different hardwired UR.
• There is also a CS for each US.
• The model attempts to learn to trigger the correct CR in response to the CS.
• After learning, the CR should start to respond before the corresponding UR.
In [12]:
import nengo
import numpy as np

D = 3
N = D*50

def us_stim(t):
    # cycle through the three US
    t = t % 3
    if 0.9 < t < 1: return [1, 0, 0]
    if 1.9 < t < 2: return [0, 1, 0]
    if 2.9 < t < 3: return [0, 0, 1]
    return [0, 0, 0]

def cs_stim(t):
    # cycle through the three CS
    t = t % 3
    if 0.7 < t < 1: return [0.7, 0, 0.5]
    if 1.7 < t < 2: return [0.6, 0.7, 0.8]
    if 2.7 < t < 3: return [0, 1, 0]
    return [0, 0, 0]

model = nengo.Network(label="Classical Conditioning")
with model:
    us_stim = nengo.Node(us_stim)
    cs_stim = nengo.Node(cs_stim)

    us = nengo.Ensemble(N, D)
    cs = nengo.Ensemble(N*2, D*2)

    nengo.Connection(us_stim, us[:D])
    nengo.Connection(cs_stim, cs[:D])
    nengo.Connection(cs[:D], cs[D:], synapse=0.2)

    ur = nengo.Ensemble(N, D)
    nengo.Connection(us, ur)

    cr = nengo.Ensemble(N, D)
    learn_conn = nengo.Connection(cs, cr, function=lambda x: [0]*D)
    learn_conn.learning_rule_type = nengo.PES(learning_rate=3e-4)

    error = nengo.Ensemble(N, D)
    nengo.Connection(error, learn_conn.learning_rule)
    nengo.Connection(ur, error, transform=-1)
    nengo.Connection(cr, error, transform=1, synapse=0.1)

    stop_learn = nengo.Node([0])
    nengo.Connection(stop_learn, error.neurons, transform=-10*np.ones((N, 1)))

In [13]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/learning2-conditioning.py.cfg')


### Cortical Consolidation¶

• There is evidence that when you first learn a skill, it takes a lot of effort and you tend to perform fairly slowly.
• We would think of this as requiring a lot of intervention from the basal ganglia in selecting actions.
• As you get better at the skill you become much faster, and the BG is used less because cortex 'takes over' central aspects of that skill, consolidating it into cortico-cortical connections.
• The next model shows a toy version of this kind of behaviour.

In this model:

• there is a slow mapping from pre->wm->target (because of long synaptic time constants)
• there is a fast, direct connection from pre->post
• the fast connection is trained using the error signal from the slow system
• the fast system learns to produce the correct output before the slow system
• if you change the 'context', the fast system will learn the new output
• there is a more complete model that uses this kind of thing (but with BG) in Aubin et al., 2016
In [14]:
import nengo
import numpy as np

tau_slow = 0.2

model = nengo.Network("Cortical Consolidation")
with model:
pre_value = nengo.Node(lambda t: np.sin(t))

pre = nengo.Ensemble(100, 1)
post = nengo.Ensemble(100, 1)
target = nengo.Ensemble(100, 1)
nengo.Connection(pre_value, pre)

conn = nengo.Connection(pre, post, function=lambda x: np.random.random(),
learning_rule_type=nengo.PES())

context = nengo.Node(1)
nengo.Connection(context, wm[1])
nengo.Connection(pre, wm[0], synapse=tau_slow)

nengo.Connection(wm, target, synapse=tau_slow,
function=lambda x: x[0]*x[1])

error = nengo.Ensemble(n_neurons=100, dimensions=1)
nengo.Connection(post, error, synapse=tau_slow*2, transform=1) #Delay the fast connection so they line up
nengo.Connection(target, error, transform=-1)

nengo.Connection(error, conn.learning_rule)

stop_learn = nengo.Node([0])
nengo.Connection(stop_learn, error.neurons, transform=-10*np.ones((100,1)))

both = nengo.Node(None, size_in=2) #For plotting
nengo.Connection(post, both[0], synapse=None)
nengo.Connection(target, both[1], synapse=None)

In [15]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/learning3-consolidation.py.cfg')


### Reinforcement Learning¶

• As mentioned in the last lecture, RL is a useful way to think about action selection.
• You have a set of actions and a set of states, and you figure out the value of each action in each state, letting you construct a big table $Q(s,a)$ which you can use to pick good actions.
• RL figures out what those values are through trial and error.

$\Delta Q(s,a) = \alpha (R + \gamma Q_{predicted} - Q_{old})$, where $R$ is the reward, $\alpha$ is a learning rate, and $\gamma$ is a discount factor. A tabular sketch of this update follows.
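For intuition, here is a tabular version of this update on a toy problem; the states, actions, dynamics, and rewards are invented purely for illustration.

```python
import numpy as np

n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9
rng = np.random.RandomState(0)

s = 0
for step in range(1000):
    # epsilon-greedy action selection (trial and error)
    a = rng.randint(n_actions) if rng.rand() < 0.1 else int(np.argmax(Q[s]))
    s_next = (s + a) % n_states                  # toy deterministic dynamics
    R = 1.0 if (s == 0 and a == 1) else 0.0      # toy reward
    Q[s, a] += alpha * (R + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next
```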

In the model:

• the agent has three actions (go forward, turn left, and turn right)
• its only senses are three range finders (radar)
• initially it should always go forward
• it gets reward proportional to its forward speed, but a large negative reward for hitting walls
• the error signal is simply the difference between the computed utility and the instantaneous reward
• $\Delta Q(s,a) = \alpha (R - Q_{current})$
• this error is only applied to whichever action is currently chosen, which means the agent cannot learn actions that lead only to future rewards
In [1]:
import grid

mymap="""
#########
#       #
#       #
#   ##  #
#   ##  #
#       #
#########

"""

class Cell(grid.Cell):
    def color(self):
        return 'black' if self.wall else None

    def load(self, char):  # per-character map loading (method name assumed from the course's grid.py)
        if char == '#':
            self.wall = True

world = grid.World(Cell, map=mymap, directions=4)

body = grid.ContinuousAgent()

import nengo
import numpy as np

def move(t, x):
    speed, rotation = x
    dt = 0.001
    max_speed = 20.0
    max_rotate = 10.0
    body.turn(rotation * dt * max_rotate)
    success = body.go_forward(speed * dt * max_speed)
    if not success:  # hit a wall
        return -1
    else:
        return speed

model = nengo.Network("Simple RL", seed=2)
with model:
env = grid.GridNode(world, dt=0.005)

#set up node to project movement commands to
movement_node = nengo.Node(move, size_in=2, label='reward')
nengo.Connection(movement, movement_node)

def detect(t):
#put sensors at -45 0 45 compared to facing direction
angles = (np.linspace(-0.5, 0.5, 5) + body.dir ) % world.directions
return [body.detect(d, max_distance=4)[0] for d in angles]

#set up low fidelity sensors; noise might help exploration

#set up BG to allow 3 actions (left/fwd/right)
bg = nengo.networks.actionselection.BasalGanglia(3)
thal = nengo.networks.actionselection.Thalamus(3)
nengo.Connection(bg.output, thal.input)

#start with a kind of random selection process, but like going fwd most
def u_fwd(x):
return 0.8
def u_left(x):
return 0.6
def u_right(x):
return 0.7

conn_fwd = nengo.Connection(radar, bg.input[0], function=u_fwd, learning_rule_type=nengo.PES())
conn_left = nengo.Connection(radar, bg.input[1], function=u_left, learning_rule_type=nengo.PES())
conn_right = nengo.Connection(radar, bg.input[2], function=u_right, learning_rule_type=nengo.PES())

nengo.Connection(thal.output[0], movement, transform=[[1],[0]])
nengo.Connection(thal.output[1], movement, transform=[[0],[1]])
nengo.Connection(thal.output[2], movement, transform=[[0],[-1]])

errors = nengo.networks.EnsembleArray(n_neurons=50, n_ensembles=3)
nengo.Connection(movement_node, errors.input, transform=-np.ones((3,1)))
#inhibit learning for actions not currently chosen (recall BG is high for non-chosen actions)
nengo.Connection(bg.output[0], errors.ensembles[0].neurons, transform=np.ones((50,1))*4)
nengo.Connection(bg.output[1], errors.ensembles[1].neurons, transform=np.ones((50,1))*4)
nengo.Connection(bg.output[2], errors.ensembles[2].neurons, transform=np.ones((50,1))*4)
nengo.Connection(bg.input, errors.input, transform=1)

nengo.Connection(errors.ensembles[0], conn_fwd.learning_rule)
nengo.Connection(errors.ensembles[1], conn_left.learning_rule)
nengo.Connection(errors.ensembles[2], conn_right.learning_rule)

In [2]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/learning5-utility.py.cfg')


### Better RL¶

• To improve our RL it would be good to predict future rewards more accurately.
• It would be good to learn the function $Q(s,a)$.
• Let's assume that your policy is fixed, so future actions are fixed.
• As well, future rewards are 90% as good as current rewards (i.e. they are discounted).
• Consequently, we have:

$Q(s_t) = R(s_t) + 0.9\, R(s_{t+1}) + 0.9^2 R(s_{t+2}) + \dots$

• So also,

$Q(s_{t+1}) = R(s_{t+1}) + 0.9\, R(s_{t+2}) + 0.9^2 R(s_{t+3}) + \dots$

$0.9\, Q(s_{t+1}) = 0.9\, R(s_{t+1}) + 0.9^2 R(s_{t+2}) + 0.9^3 R(s_{t+3}) + \dots$

• Substituting this last equation into the first gives

$Q(s_t) = R(s_t) + 0.9\, Q(s_{t+1})$

• This suggests an error rule (a quick numeric check follows):

$Error(t) = Q(s_{t-1}) - \left(R(s_{t-1}) + 0.9\, Q(s_t)\right)$
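A quick numeric check of the recursion on an arbitrary, made-up reward sequence:

```python
import numpy as np

R = np.array([1.0, 0.5, 0.0, 2.0, 1.0])   # arbitrary reward sequence
gamma = 0.9

def Q(t):
    # discounted return from time t onward
    return sum(gamma**k * R[t + k] for k in range(len(R) - t))

# Q(s_t) = R(s_t) + 0.9 Q(s_{t+1}) holds for the discounted sum
assert np.isclose(Q(0), R[0] + gamma * Q(1))
```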

In this model:

• the agent always moves randomly; it's not using what it learns to change its movement (it is just trying to anticipate future rewards)
• the agent is given a reward whenever it is in the green square, and a punishment (negative reward) whenever it is in the red square
• it learns to anticipate the reward/punishment as shown in the value graph
• we convert the error rule into the continuous domain by using a long time constant for $s_{t-1}$ and a short time constant for $s_t$ (assuming we switch states at each time step).
In [1]:
import grid

mymap="""
#######
#     #
# # # #
# # # #
#G   R#
#######
"""

class Cell(grid.Cell):
    def color(self):
        if self.wall:
            return 'black'
        elif self.reward > 0:
            return 'green'
        elif self.reward < 0:
            return 'red'
        return None

    def load(self, char):  # per-character map loading (method name assumed from the course's grid.py)
        self.reward = 0
        if char == '#':
            self.wall = True
        if char == 'G':
            self.reward = 10
        elif char == 'R':
            self.reward = -10

world = grid.World(Cell, map=mymap, directions=4)

body = grid.ContinuousAgent()

import nengo
import numpy as np

tau = 0.1

def move(t, x):
    speed, rotation = x
    dt = 0.001
    max_speed = 20.0
    max_rotate = 10.0
    body.turn(rotation * dt * max_rotate)
    body.go_forward(speed * dt * max_speed)

    # toggle two wall cells depending on the agent's x position
    if int(body.x) == 1:
        world.grid[4][4].wall = True
        world.grid[4][2].wall = False
    if int(body.x) == 4:
        world.grid[4][2].wall = True
        world.grid[4][4].wall = False

model = nengo.Network("Predict Value", seed=2)
with model:
env = grid.GridNode(world, dt=0.005)

movement = nengo.Node(move, size_in=2)

def detect(t):
angles = (np.linspace(-0.5, 0.5, 3) + body.dir) % world.directions
return [body.detect(d, max_distance=4)[0] for d in angles]
noise=nengo.processes.WhiteSignal(10, 0.1, rms=1))

def braiten(x):
turn = x[2] - x[0]
spd = x[1] - 0.5
return spd, turn

def position_func(t):
return body.x / world.width * 2 - 1, 1 - body.y/world.height * 2, body.dir / world.directions
position = nengo.Node(position_func)
state = nengo.Ensemble(100, 3)
nengo.Connection(position, state, synapse=None)

reward = nengo.Node(lambda t: body.cell.reward)

value = nengo.Ensemble(n_neurons=50, dimensions=1)

learn_conn = nengo.Connection(state, value, function=lambda x: 0,
learning_rule_type=nengo.PES(learning_rate=1e-4,
pre_tau=tau))
nengo.Connection(reward, learn_conn.learning_rule,
transform=-1, synapse=tau)
nengo.Connection(value, learn_conn.learning_rule,
transform=-0.9, synapse=0.01)
nengo.Connection(value, learn_conn.learning_rule,
transform=1, synapse=tau)

In [2]:
from nengo_gui.ipython import IPythonViz
IPythonViz(model,'configs/learning6-value.py.cfg')


### Unsupervised learning¶

• Hebbian learning
• Neurons that fire together, wire together
• $\Delta \omega_{ij} = \kappa a_i a_j$
• Just that would be unstable
• Why?
• BCM rule (Bienenstock, Cooper, & Munro, 1982)
• $\Delta \omega_{ij} = \kappa a_i a_j (a_j-\theta)$
• $\theta$ is an activity threshold
• If post-synaptic neuron is more active than this threshold, increase strength
• Otherwise decrease it
• Other than that, it's a standard Hebbian rule
• Where would we get $\theta$?
• Need to store something about the overall recent activity of neuron $j$ so it can be compared to its current activity
• Just let $\theta$ be a PSC-filtered (low-pass) version of the activity $a_j$, as in the sketch below
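A scalar sketch of this homeostasis, with invented constants: when the postsynaptic activity is held constant, the low-pass threshold catches up to it and the weight change shuts off.

```python
kappa, tau_theta, dt = 1e-3, 0.1, 0.001
a_i, a_j = 0.8, 0.5    # filtered pre/post activities (held constant here)
theta, w = 0.0, 0.1

for step in range(2000):
    theta += (dt / tau_theta) * (a_j - theta)    # low-pass filter of a_j
    w += dt * kappa * a_i * a_j * (a_j - theta)  # BCM: grows while a_j > theta

# theta converges to a_j, so the Hebbian growth is self-limiting
```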
In [7]:
import nengo
import numpy as np

model = nengo.Network()
with model:
    sin = nengo.Node(lambda t: np.sin(t*4))

    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)

    nengo.Connection(sin, pre)
    conn = nengo.Connection(pre, post, solver=nengo.solvers.LstsqL2(weights=True))

    pre_p = nengo.Probe(pre, synapse=0.01)
    post_p = nengo.Probe(post, synapse=0.01)

sim = nengo.Simulator(model)
sim.run(2.0)

Building finished in 0:00:01.
Simulating finished in 0:00:01.

In [8]:
plot(sim.trange(), sim.data[pre_p], label="Pre")
plot(sim.trange(), sim.data[post_p], label="Post")
ylabel("Decoded value")
legend(loc="best");

In [9]:
conn.learning_rule_type = nengo.BCM(learning_rate=5e-10)

with model:
trans_p = nengo.Probe(conn, 'weights', synapse=0.01, sample_every=0.01)

sim = nengo.Simulator(model)
sim.run(20.0)

Building finished in 0:00:01.
Simulating finished in 0:00:11.

In [10]:
figure(figsize=(12, 8))
subplot(2, 1, 1)
plot(sim.trange(), sim.data[pre_p], label="Pre")
plot(sim.trange(), sim.data[post_p], label="Post")
ylabel("Decoded value")
ylim(-1.6, 1.6)
legend(loc="lower left")

subplot(2, 1, 2)
# Find weight row with max variance
neuron = np.argmax(np.mean(np.var(sim.data[trans_p], axis=0), axis=1))
plot(sim.trange(dt=0.01), sim.data[trans_p][..., neuron])
ylabel("Connection weight");

In [12]:
def sparsity_measure(vector):
    # Max sparsity = 1 (single 1 in the vector)
    v = np.sort(np.abs(vector))
    n = v.shape[0]
    k = np.arange(n) + 1
    l1norm = np.sum(v)
    summation = np.sum((v / l1norm) * ((n - k + 0.5) / n))
    return 1 - 2 * summation

print("Starting sparsity: {0}".format(sparsity_measure(sim.data[trans_p][0])))
print("Ending sparsity: {0}".format(sparsity_measure(sim.data[trans_p][-1])))

Starting sparsity: 0.1774754410346847
Ending sparsity: 0.42159758311458473

• Result: only a few neurons will fire
• Sparsification
• What would this do in NEF terms?
• Still represent $x$, but with very sparse encoders (assuming the function doesn't change)
• This is still a rule on the weight matrix, but functionally seems to be more about encoders than decoders
• What could we do, given that?

## The homeostatic Prescribed Error Sensitivity (hPES) rule¶

$\Delta \omega_{ij} = \kappa \alpha_j a_i \left(S\, e_j \cdot E + (1-S)\, a_j (a_j-\theta)\right)$, where $0 \le S \le 1$ blends the supervised (PES) and unsupervised (BCM) terms.
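A numpy sketch of one hPES update, blending the two terms; all sizes and values below are invented for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
n_pre, n_post, dims = 40, 30, 2
a_pre = rng.rand(n_pre)          # presynaptic activities a_i
a_post = rng.rand(n_post)        # postsynaptic activities a_j
alpha_j = np.ones(n_post)        # postsynaptic gains
e = rng.randn(n_post, dims)      # encoders e_j
E = np.array([0.2, -0.3])        # error vector
theta = 0.5 * np.ones(n_post)    # homeostatic thresholds
kappa, S = 1e-3, 0.8             # learning rate; supervised/unsupervised mix

supervised = S * (e @ E)                       # PES-like term, per post neuron
hebbian = (1 - S) * a_post * (a_post - theta)  # BCM-like term
delta_w = kappa * np.outer(a_pre, alpha_j * (supervised + hebbian))
```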

• Works as well as (or better than) PES
• Seems to be a bit more stable, but analysis is ongoing
• Biological evidence?
• Spike-Timing Dependent Plasticity

• Still work to do for comparison, but seems promising
• Error-driven for improving decoders
• Hebbian sparsification to improve encoders
• Perhaps to sparsify connections (energy savings in the brain, but not necessarily in simulation)