Description of data sources

Here we give a brief description of where to find the main data sets used in this tutorial. Detailed descriptions of how to work with this data once it has been downloaded are given within the main tutorial content (links given below).

MNIST handwritten digits

This is arguably the best-known benchmark data set for pattern recognition tasks. The data is available at

for anyone with an internet connection. No registration is required.

Once the raw data has been acquired, we assume that it is stored in the data/MNIST directory, relative to your working directory.
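As a quick sanity check on this layout, the following minimal Python sketch creates data/MNIST if it does not exist and reports which files are still missing. The four file names are the conventional names of the gzipped MNIST archives and are an assumption here, as are the helper name missing_files and the constants; adjust them if your copies are named differently or have already been unpacked.

```python
from pathlib import Path

# Assumed layout: data/MNIST relative to the current working directory.
MNIST_DIR = Path("data") / "MNIST"

# Conventional names of the four gzipped archives (an assumption; edit
# this list if your files are named differently or already unpacked).
EXPECTED_FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def missing_files(data_dir, expected):
    """Return the names in `expected` that are not regular files in `data_dir`."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Create the directory (and its parent) if it is not there yet.
    MNIST_DIR.mkdir(parents=True, exist_ok=True)
    missing = missing_files(MNIST_DIR, EXPECTED_FILES)
    if missing:
        print("Still missing from", MNIST_DIR, ":", ", ".join(missing))
    else:
        print("All expected MNIST files found in", MNIST_DIR)
```

Run it from your working directory; an empty "missing" list suggests the raw data is where the rest of the tutorial expects it.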

Visual image data

The vim-2 data set, also known as the "Gallant Lab Natural Movie 4T fMRI Data set", is available from the website of Collaborative Research in Computational Neuroscience (CRCNS), at the following URL:

This requires free registration, which can be done quickly using their "Request Account" page:

The application is screened, and so it may take a day or two before it is (hopefully) accepted.

If you are just downloading the data to your local machine, then logging in and downloading via your browser is perfectly acceptable. If you are using a remote server for computation, be it your own or some cloud-based solution, it is best to make use of the download scripts that are provided:

Under "Batch download method", there is a link to a page which requires input of your username and password. From here, we get access to the sub-directory within tools. Looking inside tools/download, there are a handful of files, including crcns-download-tools-instuctions, which explains how to set up the configuration file and how to use the download/verification scripts. Setup requires only a few minutes; just follow the instructions and take a break while the files are downloaded.

Once the raw data has been acquired, we assume that it is stored in the data/vim-2 directory, relative to your working directory.
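Since these files are large, it can be worth confirming that everything arrived before moving on. The sketch below is a generic check and not part of the CRCNS tools: it lists every file under data/vim-2 along with its size, so that truncated or missing downloads stand out. The helper name summarize is hypothetical, and no particular file names are assumed.

```python
from pathlib import Path

# Assumed layout: data/vim-2 relative to the current working directory.
VIM2_DIR = Path("data") / "vim-2"


def summarize(data_dir):
    """Return sorted (relative path, size in bytes) pairs for files under data_dir."""
    data_dir = Path(data_dir)
    return sorted(
        (p.relative_to(data_dir).as_posix(), p.stat().st_size)
        for p in data_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    if not VIM2_DIR.is_dir():
        print(VIM2_DIR, "does not exist yet")
    else:
        # Print one line per file: size in megabytes, then the path.
        for name, size in summarize(VIM2_DIR):
            print(f"{size / 1e6:10.1f} MB  {name}")
```

This only compares what is on disk against your expectations; the verification scripts shipped with the download tools remain the more thorough check.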