General Linear Model

Application: population receptive field fitting

A tutorial by Jan Willem de Gee ([email protected])

This is Python, so let's start with importing some modules:

In [11]:
import os
import numpy as np
import scipy as sp
import scipy.stats as stats
import scipy.signal as signal
import matplotlib.pyplot as plt
from sympy import *
import math

Let's do inline plotting:

In [12]:
%matplotlib inline 

Let's first create and plot the true pupil response to a transient input. This is called an "Inpulse Response Function" (IRF).

In [13]:
# define an Impulse Response Function:
def pupil_IRF(timepoints, s=1.0/(10**26), n=10.1, tmax=0.93):
    
    """ pupil_IRF defines the IRF (response to a transient input) of the pupil.
    
    Parameters
    ----------
    t: IRF is defined with respect to 't'
    s: scaling factor
    n: sets the width
    tmax: sets the time to peak 
    IRF_len: function is evaluated for t = [0:IRF_len]
    
    Returns
    -------
    y: IRF evaluated for t = [0:IRF_len]
    yprime: IRF first derivative evaluated for t = [0:IRF_len]
    
    """
    
    # in sympy:
    t = Symbol('t')
    y = ( (s) * (t**n) * (math.e**((-n*t)/tmax)) )
    yprime = y.diff(t)
    
    # lambdify:
    y = lambdify(t, y, "numpy")
    yprime = lambdify(t, yprime, "numpy")
    
    # evaluate:
    y = y(timepoints)
    yprime = yprime(timepoints)
    
    return (y, yprime)

# create the IRF:
sample_rate = 10
IRF_len = 3.0 # in seconds
timepoints = np.linspace(0,IRF_len,IRF_len*sample_rate)

IRF, IRF_prime = pupil_IRF(timepoints=timepoints)
IRF = IRF / IRF.std()
IRF_prime = IRF_prime / IRF_prime.std()

# plot the IRF:
fig = plt.figure()
plt.plot(timepoints, IRF, color='r')
# plt.plot(timepoints, IRF_prime, color='g')
plt.legend(['IRF'])
plt.title('Impulse Response Function')
plt.xlabel('time (s)')
plt.ylabel('a.u.')
Out[13]:
<matplotlib.text.Text at 0x10ff31490>

Let's simulate convolved timeseries data based on this IRF.

In [14]:
duration = 60 # in seconds
times_l = np.array([5,15,16,25])
times_r = np.array([35,45,46,55])
input_signal_l = np.zeros(duration * sample_rate)
input_signal_r = np.zeros(duration * sample_rate)
for i in times_l:
    input_signal_l[i*sample_rate] = 1.5
for i in times_r:
    input_signal_r[i*sample_rate] = 0.5
input_signal = input_signal_l + input_signal_r

# convolve inputs with IRF:    
convolved_signal = (sp.convolve(input_signal, IRF, 'full'))[:-(IRF.shape[0]-1)]

# let's add some noise:
convolved_signal_noise = convolved_signal + np.random.normal(0,0.1,len(convolved_signal))

# plot simulated convolved signal with noise:
timepoints = np.linspace(0,duration,duration*sample_rate)
fig = plt.figure()
plt.plot(timepoints, convolved_signal_noise, 'b')
plt.legend(['BOLD time series'], loc=1)
for i in times_l:
    plt.axvline(i, color='r', alpha=0.5)
for i in times_r:
    plt.axvline(i, color='g', alpha=0.5)
# plt.legend(['"true" neural signal', 'pupil time series'], loc=1)
plt.title('BOLD time series, with measurement noise')
plt.xlabel('time (s)')
plt.ylabel('a.u.')
Out[14]:
<matplotlib.text.Text at 0x110159850>

Let's create a regressor separately for every type of event. In our case, we want to separately model (predict) a BOLD response every time a stimulus was presented on the left side of the screen, and model (predict) a response every time a stimulus was presented on the rights side of the screen:

In [15]:
duration = 60 # in seconds
times_l = np.array([5,15,16,25])
times_r = np.array([35,45,46,55])
regr_l = np.zeros(duration * sample_rate)
regr_r = np.zeros(duration * sample_rate)
for i in times_l:
    regr_l[i*sample_rate] = 1
for i in times_r:
    regr_r[i*sample_rate] = 1

regr_l_convolved = (sp.convolve(regr_l, IRF, 'full'))[:-(IRF.shape[0]-1)]
regr_r_convolved = (sp.convolve(regr_r, IRF, 'full'))[:-(IRF.shape[0]-1)]

# demean measured data and regressors:
regr_l_convolved = regr_l_convolved - regr_l_convolved.mean()
regr_r_convolved = regr_r_convolved - regr_r_convolved.mean()
convolved_signal_noise = convolved_signal_noise - convolved_signal_noise.mean()

timepoints = np.linspace(0,duration,duration*sample_rate)
fig = plt.figure()
plt.plot(timepoints, convolved_signal_noise, 'b')
plt.plot(timepoints, regr_l_convolved, color='r', lw=2)
plt.plot(timepoints, regr_r_convolved+0.05, color='g', lw=2)
plt.legend(['BOLD time series', 'regressor left events', 'regressor right events'], loc=1)
plt.title('regressors')
plt.xlabel('time (s)')
plt.ylabel('a.u.')
Out[15]:
<matplotlib.text.Text at 0x1102aa190>

For every regressor (in the above toy example, we have two regressors) we want to find an associated scalar value (the "beta", $b$) that we can use to scale that particular regressor with, such that it best describes the measured data. Eyebaling the above figure, for the regressor predicting visual events that occured left on the screen, we will need a $b$ > 1; in contrast, for the regressor predicting visual events that occured right on the screen, we will need a $b$ < 1.

In the GLM, with a procedure called "multiple regression" we look for betas that minimimize the sum of squares of errors across all $k$ regressors in our design matrix at the same time.

To do so, we set up the following equation (for a derivation, see all the way below):

$ b = (X'X)^{-1} X'y $

In which,

$b$ is a vector containing the betas (size: number of regressors; in the above toy example: 2);

$X$ is the design matrix (size: length BOLD time series x number of regressors);

$y$ is the measured BOLD timeseries.

In [16]:
# make design matrix:
designMatrix = np.vstack((regr_l_convolved, regr_r_convolved))

# # plot design matrix:
# fig = plt.figure(figsize=(2,2))
# plt.imshow(designMatrix.T)
# plt.xticks([0,1])
# plt.title('design matrix')
# plt.xlabel('nr samples')
# plt.ylabel('length run')

# multiple regression:
designMatrix = np.mat(designMatrix).T
betas = ((designMatrix.T * designMatrix).I * designMatrix.T) * np.mat(convolved_signal_noise).T
betas = np.array(betas)

print 'betas: {}'.format(betas.ravel())
betas: [ 1.49740104  0.50932406]

Indeed, we find betas ~1.5 and ~0.5. This is what we used as input to simulate the BOLD timeseries. So this works!! Note that we don't find the exact same values due to the random noise we added to the BOLD time series.

Let's plot our exaplained signal. We construct this by multiplying our regessors with their respective betas, and add them all up:

In [17]:
explained_signal = regr_l_convolved*betas[0] + regr_r_convolved*betas[1]
residuals = convolved_signal_noise - explained_signal

fig = plt.figure()
plt.plot(timepoints, convolved_signal_noise, 'b')
plt.plot(timepoints, explained_signal, 'm', lw=2)
plt.legend(['BOLD timeseries', 'explained signal'], loc=1)
plt.title('predicted signal')
plt.xlabel('time (s)')
plt.ylabel('a.u.')
Out[17]:
<matplotlib.text.Text at 0x11011d890>

Let's also plot our residual error, the part of the signal that we were not able to explain. We construct this my subtracting our explained signal from the BOLD timeseries. Note that this should look like random (white) noise. If not, then there is still some consistent components in the signal that we failed to exaplain.

In [18]:
fig = plt.figure()
plt.plot(timepoints, residuals, 'c')
plt.legend(['residuals'], loc=1)
plt.title('residuals')
plt.xlabel('time (s)')
plt.ylabel('a.u.')
Out[18]:
<matplotlib.text.Text at 0x110169790>

Finally, to construct a PRF, let's plot a "heatmap" of our betas. The location in the plot of a particular beta should correspond to the location on the screen that that regressor was trying to exaplain:

In [19]:
fig = plt.figure()
ax = fig.add_subplot(111)
a = ax.imshow(betas.T, interpolation='nearest', vmin=0, vmax=2, cmap='seismic') # cmap='YlOrRd')
ax.set_xticks([0,1])
ax.set_yticks([])
ax.set_xticklabels(['beta left', 'beta right'])
fig.colorbar(a, ticks=[0, 1, 2], orientation='vertical', )
Out[19]:
<matplotlib.colorbar.Colorbar instance at 0x1107ea878>

Here you see that this particular voxel indeed was driven more strongly by visual events occuring on the left side of the screen, than by event occuring on the right side of the screen.

Now, this was just a toy example. In practice we won't create two regressors for two locations in the visual field. Instead, we will create for example 30 by 30 regressors that cover the visual field with high resolution. In that case, the fitted pRF of one voxel might look something like this:

In [20]:
sizex = 30
sizey = 30
x, y = np.mgrid[-sizex+10:sizex+10+1, -sizey+10:sizey+10+1]
g = np.exp(-0.333*(x**2/float(sizex)+y**2/float(sizey)))
PRF = g/g.sum()
plt.imshow(PRF)
plt.xticks([]); plt.yticks([])
plt.title('"Real" pRF')
Out[20]:
<matplotlib.text.Text at 0x11086a3d0>

Derivation of the multiple regression equation:

In raw form the regression equation is:

$ y = a + b_1X_1 + b_2X_2 + ... + B_kX_k + e $

This says that y, our dependent variable, is composed of a linear part and error. The linear part is composed of an intercept, a, and k independent variables, $X_1 ... X_k$ along with their associated regression weights $b_1 ... b_k$.

In matrix terms, the same equation can be written:

$ y = X b + e $

If we solve for the $b$ weights, we find that

$ b = (X'X)^{-1} X'y $

To give you an idea why it looks like that, first remember the regression equation:

$ y = X b + e $

Let's assume that error will equal zero on average. Now we can simply drop the error term, in order to sketch a proof:

$ y = Xb $

Now we want to solve for $b$, so we need to get rid of $X$. First we will make $X$ into a square, symmetric matrix by premultiplying both sides of the equation by $X'$:

$ X'y = X'Xb $

And now we have a square, symmetric matrix that with any luck has an inverse, which we will call $(X'X)^{-1}$. Multiply both sides by this inverse, and we have:

$ (X'X)^{-1} X'y = (X'X)^{-1} (X'X)b $

It turns out that a matrix multiplied by its inverse is the identity matrix $(A^{-1}A=I)$:

$ (X'X)^{-1} X'y=Ib $

and a matrix multiplied by the identity matrix is itself $(AI = IA = A)$:

$ (X'X)^{-1} X'y=b $

which is the desired result.