Preparation of raw data for encoding task (vim-2)

We will be working with two categorically distinct kinds of data: visual stimuli in the form of natural movies, and blood oxygen level-dependent (BOLD) signals measured using functional magnetic resonance imaging (fMRI) as an indicator of brain activity.

[Image: stimuli]

[Image: fMRI]

Our chief goal will be to create a working encoder, which predicts brain activity based on the content of the visual stimulus.

Overview of the data set

The data set of interest, called the "Gallant Lab Natural Movie 4T fMRI Data set" (vim-2), contains the following types of data (of interest to us today):

Stimuli

Stimuli.tar.gz 4089110463 (3.8 GB)

BOLD response

VoxelResponses_subject1.tar.gz 3178624411 (2.9 GB)
VoxelResponses_subject2.tar.gz 3121761551 (2.9 GB)
VoxelResponses_subject3.tar.gz 3216874972 (2.9 GB)

All stimuli are stored in the Stimuli.mat file (Matlab v7.3 format). The variables take the following forms.

  • st: training stimuli. 128x128x3x108000 matrix (108000 128x128 RGB frames).
  • sv: validation stimuli. 128x128x3x8100 matrix (8100 128x128 RGB frames).

For the training data, the stimulus was presented at 15fps over 7200 one-second timepoints (120 minutes), for a total of 108000 frames. For the validation data, the stimulus was presented at 15fps over 540 timepoints (9 minutes), for a total of 8100 frames.
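
As a quick sanity check on the arithmetic (each timepoint spans one second, since 7200 timepoints make up 120 minutes):

fps = 15
print(fps * 7200)  # 108000 training frames
print(fps * 540)   # 8100 validation frames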

The validation stimulus was presented a total of ten times to each subject, and the response values used above correspond to the average taken over these trials. The "raw" validation response values (pre-averaging) are available, but the mean values serve our purposes just fine.
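
Should you ever want the raw values, here is a minimal sketch of recovering the averaged response from the per-trial values (stored in the rva array, shown in the ptdump output further below), assuming a (voxels, repeats, timepoints) layout; np.nanmean guards against missing values, should any voxels contain them:

import numpy as np
import tables

f = tables.open_file("data/vim-2/VoxelResponses_subject1.mat", mode="r")
rva = f.get_node(where="/", name="rva").read()  # shape (73728, 10, 540)
f.close()

# Average over the ten validation trials.
rv_check = np.nanmean(rva, axis=1)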

Finally, and fortunately, no re-arranging of the data is required. The authors note:

"The order of the stimuli in the "st" and "sv" variables matches the order of the stimuli in the "rt" and "rv" variables in the response files."

Moving forward, the first task is to decompress the data.

$ tar -xzf Stimuli.tar.gz
$ tar -xzf VoxelResponses_subject1.tar.gz
$ tar -xzf VoxelResponses_subject2.tar.gz
$ tar -xzf VoxelResponses_subject3.tar.gz

This leaves us with files Stimuli.mat and VoxelResponses_subject{1,2,3}.mat. Our key tool here will be the PyTables library (http://www.pytables.org/usersguide/index.html), which is convenient for working with hierarchical data. To see what is stored in the data (and to corroborate it against the documentation), run the following.

$ ptdump Stimuli.mat
/ (RootGroup) ''
/st (EArray(108000, 3, 128, 128), zlib(3)) ''
/sv (EArray(8100, 3, 128, 128), zlib(3)) ''

There is not much of a hierarchy here, just two arrays in the root group. Note the order of the dimensions: dim 1 is the observation (frame) index, dim 2 is the RGB channel, and dims 3 and 4 specify pixel position within a frame.
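
The same listing can be produced programmatically, which is handy inside a notebook:

import tables

with tables.open_file("data/vim-2/Stimuli.mat", mode="r") as f:
    for node in f.walk_nodes("/"):
        print(node)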

The response data has a richer structure.

$ ptdump VoxelResponses_subject1.mat 
/ (RootGroup) ''
/rt (EArray(73728, 7200), zlib(3)) ''
/rv (EArray(73728, 540), zlib(3)) ''
/rva (EArray(73728, 10, 540), zlib(3)) ''
(...Warnings...)
/ei (Group) ''
/ei/TRsec (Array(1, 1)) ''
/ei/datasize (Array(3, 1)) ''
/ei/imhz (Array(1, 1)) ''
/ei/valrepnum (Array(1, 1)) ''
/roi (Group) ''
/roi/FFAlh (EArray(18, 64, 64), zlib(3)) ''
/roi/FFArh (EArray(18, 64, 64), zlib(3)) ''
/roi/IPlh (EArray(18, 64, 64), zlib(3)) ''
/roi/IPrh (EArray(18, 64, 64), zlib(3)) ''
/roi/MTlh (EArray(18, 64, 64), zlib(3)) ''
/roi/MTplh (EArray(18, 64, 64), zlib(3)) ''
/roi/MTprh (EArray(18, 64, 64), zlib(3)) ''
/roi/MTrh (EArray(18, 64, 64), zlib(3)) ''
/roi/OBJlh (EArray(18, 64, 64), zlib(3)) ''
/roi/OBJrh (EArray(18, 64, 64), zlib(3)) ''
/roi/PPAlh (EArray(18, 64, 64), zlib(3)) ''
/roi/PPArh (EArray(18, 64, 64), zlib(3)) ''
/roi/RSCrh (EArray(18, 64, 64), zlib(3)) ''
/roi/STSrh (EArray(18, 64, 64), zlib(3)) ''
/roi/VOlh (EArray(18, 64, 64), zlib(3)) ''
/roi/VOrh (EArray(18, 64, 64), zlib(3)) ''
/roi/latocclh (EArray(18, 64, 64), zlib(3)) ''
/roi/latoccrh (EArray(18, 64, 64), zlib(3)) ''
/roi/v1lh (EArray(18, 64, 64), zlib(3)) ''
/roi/v1rh (EArray(18, 64, 64), zlib(3)) ''
/roi/v2lh (EArray(18, 64, 64), zlib(3)) ''
/roi/v2rh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3alh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3arh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3blh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3brh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3lh (EArray(18, 64, 64), zlib(3)) ''
/roi/v3rh (EArray(18, 64, 64), zlib(3)) ''
/roi/v4lh (EArray(18, 64, 64), zlib(3)) ''
/roi/v4rh (EArray(18, 64, 64), zlib(3)) ''

Within the root group, we have arrays rt, rv, and rva, which contain the BOLD response data; the group roi, which contains arrays used as indices for assigning particular responses to particular voxels; and the group ei, which stores pertinent experimental information. In particular, under the roi group there are numerous arrays named after anatomical/functional regions of interest (ROIs). For example, v4rh corresponds to the V4 region in the right hemisphere. Since there are far fewer ROIs than there are voxels ($18 \times 64 \times 64 = 73728$), note that each region is composed of many voxels.
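
To make this concrete, here is a minimal sketch of how one might pull out the responses for a single ROI. It assumes (plausibly, but worth verifying) that each roi array is a mask over the $18 \times 64 \times 64$ voxel grid whose non-zero entries mark membership, and that C-order flattening matches the first axis of rt:

import numpy as np
import tables

with tables.open_file("data/vim-2/VoxelResponses_subject1.mat", mode="r") as f:
    mask = f.get_node(where="/roi", name="v1lh").read()  # (18, 64, 64)
    rt = f.get_node(where="/", name="rt").read()         # (73728, 7200); large!

idx = np.flatnonzero(mask)  # flattened indices of voxels in left-hemisphere V1
rt_v1lh = rt[idx, :]        # training responses restricted to this ROI
print(idx.size, "voxels in v1lh")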

Stimulus: checking original data

Let's begin by taking a look inside the stimulus.

In [1]:
import numpy as np
import tables
import matplotlib
from matplotlib import pyplot as plt
import pprint as pp

# Open file connection.
f = tables.open_file("data/vim-2/Stimuli.mat", mode="r")

# Get object and array.
stimulus_object = f.get_node(where="/", name="st")
print("stimulus_object:")
print(stimulus_object)
print(type(stimulus_object))
print("----")

stimulus_array = stimulus_object.read()
print("stimulus_array:")
#print(stimulus_array)
print(type(stimulus_array))
print("----")

# Close the connection.
f.close()

# Check that it is closed.
if not f.isopen:
    print("Successfully closed.")
else:
    print("File connection is still open.")
stimulus_object:
/st (EArray(108000, 3, 128, 128), zlib(3)) ''
<class 'tables.earray.EArray'>
----
stimulus_array:
<class 'numpy.ndarray'>
----
Successfully closed.

Let's inspect the content of this array to ensure it has what we expect.

In [2]:
# Open file connection.
f = tables.open_file("data/vim-2/Stimuli.mat", mode="r")
In [3]:
# Get object and array.
stimulus_object = f.get_node(where="/", name="st")
print("stimulus_object:")
print(stimulus_object)
print(type(stimulus_object))
print("----")
stimulus_object:
/st (EArray(108000, 3, 128, 128), zlib(3)) ''
<class 'tables.earray.EArray'>
----
In [4]:
# Print out basic attributes.
stimulus_array = stimulus_object.read()
#print("stimulus_array:")
#print(stimulus_array)
print("(type)")
print(type(stimulus_array))
print("(dtype)")
print(stimulus_array.dtype)
print("----")
(type)
<class 'numpy.ndarray'>
(dtype)
uint8
----
In [5]:
# Print out some frames.
num_frames = stimulus_array.shape[0]
num_channels = stimulus_array.shape[1]
frame_w = stimulus_array.shape[2]
frame_h = stimulus_array.shape[3]

frames_to_play = 5

oneframe = np.zeros((frame_h, frame_w, num_channels), dtype=np.uint8)

for t in range(frames_to_play):
    oneframe[:,:,0] = stimulus_array[t,0,:,:] # red
    oneframe[:,:,1] = stimulus_array[t,1,:,:] # green
    oneframe[:,:,2] = stimulus_array[t,2,:,:] # blue
    plt.imshow(oneframe)
    plt.show()
In [6]:
# Close file connection.
f.close()

In any case, the content is clearly roughly as we would expect (a series of video clips), although the orientation is incorrect, and the frame rate is much higher than the sampling rate of the responses we'll be working with.

Exercises (a sketch addressing exercises 1 and 4 follows the list):

  1. Swap the coordinates (see help(np.swapaxes)) to fix the orientation.

  2. In addition to the first handful of frames, check the last handful of frames to ensure that the content is as we expect.

  3. Repeat the visualization procedure above for sv (the validation data) to ensure it is also as we expect.

  4. Do the above procedures for the first 100 frames. Note that the current frame rate is 15fps. By modifying the for loop above to use np.arange instead of range, we can easily "down-sample" to a lower temporal frequency. Try down-sampling to a rate of 1fps (by displaying one out of every fifteen frames).
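
As a rough sketch for exercises 1 and 4 (assuming stimulus_array from the cells above is still in memory):

# Exercise 1: move to (height, width, channel, frame) order.
arr = np.swapaxes(stimulus_array, 0, 3)
arr = np.swapaxes(arr, 1, 2)

# Exercise 4: display one frame per second (every 15th of the first 100 frames).
for t in np.arange(0, 100, 15):
    plt.imshow(arr[:,:,:,t])
    plt.show()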

Stimulus: spatial down-sampling

For computational purposes, it is advisable to spatially down-sample the stimulus data, in addition to the temporal down-sampling required to align it with the BOLD signal responses (this can be done later). Here we carry out the spatial down-sampling. Using the function resize from the transform module of scikit-image, this can be done easily. First, a simple test.

In [2]:
from skimage import transform
from matplotlib import pyplot as plt
import imageio
import tables
import numpy as np

im = imageio.imread("img/bishop.png") # a 128px x 128px image
In [3]:
med_h = 96 # desired height in pixels
med_w = 96 # desired width in pixels
im_med = transform.resize(image=im, output_shape=(med_h,med_w), mode="reflect")

small_h = 32
small_w = 32
im_small = transform.resize(image=im, output_shape=(small_h,small_w), mode="reflect")

tiny_h = 16
tiny_w = 16
im_tiny = transform.resize(image=im, output_shape=(tiny_h,tiny_w), mode="reflect")
In [4]:
myfig = plt.figure(figsize=(18,4))

ax_im = myfig.add_subplot(1,4,1)
plt.imshow(im)
plt.title("Original image")
ax_med = myfig.add_subplot(1,4,2)
plt.imshow(im_med)
plt.title("Resized image (Medium)")
ax_small = myfig.add_subplot(1,4,3)
plt.imshow(im_small)
plt.title("Resized image (Small)")
ax_tiny = myfig.add_subplot(1,4,4)
plt.imshow(im_tiny)
plt.title("Resized image (Tiny)")

plt.show()

Everything appears to be working as expected. Let us now do this for the entire set of visual stimuli.

In [5]:
# Open file connection.
f = tables.open_file("data/vim-2/Stimuli.mat", mode="r")
In [6]:
# Specify which subset to use.
touse_subset = "sv" 
In [7]:
# Get object, array, and properties.
stimulus_object = f.get_node(where="/", name=touse_subset)
stimulus_array = stimulus_object.read()
num_frames = stimulus_array.shape[0]
num_channels = stimulus_array.shape[1]
In [8]:
# Swap the axes.
print("Stimulus array (before):", stimulus_array.shape)
stimulus_array = np.swapaxes(a=stimulus_array, axis1=0, axis2=3)
stimulus_array = np.swapaxes(a=stimulus_array, axis1=1, axis2=2)
print("Stimulus array (after):", stimulus_array.shape)
Stimulus array (before): (8100, 3, 128, 128)
Stimulus array (after): (128, 128, 3, 8100)
In [9]:
# Prepare new array of downsampled stimulus.
ds_h = 64
ds_w = 64

new_array = np.zeros((ds_h, ds_w, num_channels, num_frames), dtype=np.float32)
In [10]:
# Iterate over frames to be resized.
for t in range(num_frames):
    new_array[:,:,:,t] = transform.resize(image=stimulus_array[:,:,:,t],
                                          output_shape=(ds_h,ds_w),
                                          mode="reflect")
    if t % 1000 == 0:
        print("Update: t =", t)
        
f.close()
Update: t = 0
Update: t = 1000
Update: t = 2000
Update: t = 3000
Update: t = 4000
Update: t = 5000
Update: t = 6000
Update: t = 7000
Update: t = 8000

Note that resizing the images yields pixel values on $[0,1]$ rather than $\{0,1,\ldots,255\}$, which is why we set the dtype to np.float32:

In [12]:
print("(Pre-downsize) max:", np.max(stimulus_array),
      "min:", np.min(stimulus_array),
      "ave:", np.mean(stimulus_array))
print("(Post-downsize) max:", np.max(new_array),
      "min:", np.min(new_array),
      "ave:", np.mean(new_array))
(Pre-downsize) max: 255 min: 0 ave: 90.50942391854745
(Post-downsize) max: 1.0 min: 0.0 ave: 0.3549391
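
As an aside, should the original 0-255 scale ever be needed, resize also accepts a preserve_range=True argument that skips this rescaling (the output is still floating point):

# Same resize, but keeping values on the original 0-255 scale.
frame0 = transform.resize(image=stimulus_array[:,:,:,0],
                          output_shape=(ds_h,ds_w),
                          mode="reflect",
                          preserve_range=True)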

Let's now save this as a binary file for fast reading/writing.

In [13]:
fname = "data/vim-2/" + str(touse_subset) + "_downsampled.dat"
with open(fname, mode="bw") as fbin:
    new_array.tofile(fbin)

fname = "data/vim-2/" + str(touse_subset) + "_downsampled.info"
with open(fname, mode="w") as f:
    f.write("dtype: "+str(new_array.dtype)+"\n")
    f.write("shape: "+str(new_array.shape)+"\n")

Exercises:

  1. Repeat the above down-sampling and writing to disk for st, the training data.

  2. Try some extremely small versions, such as 32x32, 16x16, and so on. What do you think the potential benefits are of doing this? What about potential drawbacks?

To ensure that everything worked as desired, we would like to load up these arrays to check that everything has been saved correctly. First, check the dimensions we saved:

In [1]:
! cat data/vim-2/st_downsampled.info
! cat data/vim-2/sv_downsampled.info

Then plugging these values in below, we can check that the contents are indeed as we expect.

In [23]:
print("Training data:")
fname = "data/vim-2/" + "st" + "_downsampled.dat"
with open(fname, mode="br") as fbin:
    
    # Read array.
    arr_tocheck = np.fromfile(file=fbin, dtype=np.float32).reshape((64, 64, 3, 108000))

    # Check a few frames.
    num_frames = arr_tocheck.shape[3]
    frames_to_play = 10
    for t in np.arange(0, frames_to_play*15, 15):
        plt.imshow(arr_tocheck[:,:,:,t])
        plt.show()
Training data:
In [25]:
print("Testing data:")
fname = "data/vim-2/" + "sv" + "_downsampled.dat"
with open(fname, mode="br") as fbin:
    
    # Read array.
    arr_tocheck = np.fromfile(file=fbin, dtype=np.float32).reshape((64, 64, 3, 8100))

    # Check a few frames.
    num_frames = arr_tocheck.shape[3]
    frames_to_play = 10
    for t in np.arange(0, frames_to_play*15, 15):
        plt.imshow(arr_tocheck[:,:,:,t])
        plt.show()
Validation data: