Nipype Tutorial: Cortical Surface Extraction, BDP, SVReg in Parallel

Welcome to our interactive Nipype tutorial. Below you will find a mix of code and explanatory text. If you are viewing this from an active Jupyter Notebook server, you can execute our code as you read through our tutorial. If you are unfamiliar with Jupyter Notebook, you can learn the basics here.

Or, from the toolbar, go to Help | User Interface Tour.

Here are some essentials:

  • The current cell is surrounded by a colored border
  • Runnable code is contained in a greyed box, denoted by In [#]: next to the cell
    • To execute the code in the current cell, use Shift+Enter
    • ! denotes a bash command that will be executed. All bash commands have been commented out, so you may choose whether or not to execute them. We advise that you read each bash command before choosing to run it. To execute a command, uncomment it, then use Shift+Enter.
  • Pressing Shift+Enter on a text cell simply moves you to the cell below

Overview of Workflow

Nipype includes features that allow you to run multiple processes in parallel, in a single Nipype workflow. In a workflow that runs through BrainSuite's CSE, BDP, and SVReg, parallel processing is especially useful, and can help reduce processing time. In particular, once BFC has finished running, BDP and the rest of CSE can be run in parallel. Finally, when CSE is complete, SVReg can be started, regardless of whether BDP has finished.
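The ordering constraints described above can be sketched with Python's standard library alone (no Nipype or BrainSuite required). The stage names below are illustrative stand-ins for the real processing steps, not the actual executables:

```python
from concurrent.futures import ThreadPoolExecutor

def run_stage(name):
    # Stand-in for a real BrainSuite processing stage.
    return name + ' done'

with ThreadPoolExecutor(max_workers=2) as pool:
    bfc = pool.submit(run_stage, 'BFC')
    bfc.result()  # everything below depends on BFC finishing first

    # BDP and the remaining CSE stages depend only on BFC,
    # so they can run at the same time.
    bdp = pool.submit(run_stage, 'BDP')
    cse = pool.submit(run_stage, 'CSE')

    cse.result()  # SVReg waits for CSE only...
    svreg = pool.submit(run_stage, 'SVReg')

    # ...so SVReg may start while BDP is still running.
    results = [f.result() for f in (bdp, svreg)]

print(results)  # ['BDP done', 'SVReg done']
```

Nipype's MultiProc plugin, used later in this tutorial, discovers this same kind of dependency graph automatically from the workflow's connections.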

Below, we will walk through our example code, which is downloadable here. The code creates a workflow, brainsuite_workflow, that runs through CSE, BDP, and SVReg, and is executed using Python's multiprocessing library via Nipype's MultiProc plugin. Finally, ThicknessPVC is run after brainsuite_workflow has finished executing.

Checking your installation

First, we'll check that all of BrainSuite's executables have been added to your system's path variable.

If you have not done so already, first install BrainSuite by following the instructions on our installation page. Be sure to follow the instructions for installing the MATLAB Compiler Runtime, which is required by SVReg, BDP, and ThicknessPVC.

The following python code will ensure that your system's path variable has been set up properly. In particular, the following paths must be added (replace your_path with the path to your BrainSuite directory):

  • /your_path/BrainSuite16a1/bin/
  • /your_path/BrainSuite16a1/svreg/bin/
  • /your_path/BrainSuite16a1/bdp/

Run the cell below and follow the instructions printed out.

(Note that bse is just one of many steps in CSE. Since the rest of CSE is contained in the same folder as bse, it is sufficient to ensure that bse can be called properly.)

In [1]:
from __future__ import unicode_literals, print_function
from builtins import str

from distutils.spawn import find_executable
print('Message to user:')
if find_executable('bse') and find_executable('svreg.sh') and find_executable('bdp.sh'):
    print('Your system path has been set up correctly. Continue on with the tutorial.')
else:
    print('Your system path has not been set up correctly.')
    print('Please add the above paths to your system path variable and restart the kernel for this tutorial.')
    print('Edit your ~/.bashrc file, and add the following line, replacing your_path with the path to BrainSuite16a1:\n')
    print('export PATH=$PATH:/your_path/BrainSuite16a1/svreg/bin:/your_path/BrainSuite16a1/bdp:/your_path/BrainSuite16a1/bin')

You must also have Nipype and its dependencies installed. See Nipype's installation page for more information. Another good resource for Nipype installation is the Nipype Beginner's Guide.

Be sure to update your Nipype installation to the latest development version, so that your installation includes our BrainSuite interface. (git clone git://github.com/nipy/nipype.git automatically downloads the developer version).

Alternatively, you may retain your current installation of Nipype, manually download the folder for the BrainSuite interface, and copy it into your Nipype interfaces directory. (Note that our Nipype tutorial code has been tested using the latest development version of Nipype.)

Acquiring the data

Below are bash commands to download the required data for this tutorial. The commands will download our tutorial data, then unzip a folder to your Documents directory. Uncomment the commands, and use Shift+Enter to run the cell. If you already have the tutorial data, you can skip this step.

In [2]:
#!curl http://users.bmap.ucla.edu/~jwong/BrainSuiteNipype_Tutorial.zip -o "BrainSuiteNipype_Tutorial.zip"
#!unzip -o -qq BrainSuiteNipype_Tutorial.zip -d ~/Documents

The Code

Here, we will discuss some important components of our workflow. Code cells are explained in the text cells above them.

First, we must adjust Nipype's execution configuration. By default, Nipype will automatically remove outputs from Nodes that are not directly connected to other Nodes using workflow.connect(). In this workflow, since SVReg does not directly take a single input from a CSE Node, but instead takes in a directory and a filename prefix, we don't want Nipype to remove outputs deemed unnecessary. More information about Nipype configuration can be found here.

We directly set global options in our workflow script, import the pipeline, then declare a workflow. Note that setting Nipype configurations must be done before importing the Nipype pipeline:

In [3]:
from nipype import config #Set configuration before importing nipype pipeline
cfg = dict(execution={'remove_unnecessary_outputs' : False}) #We do not want nipype to remove unnecessary outputs
config.update_config(cfg)

import nipype.pipeline.engine as pe
import nipype.interfaces.brainsuite as bs
import nipype.interfaces.io as io
import os

brainsuite_workflow = pe.Workflow(name='brainsuite_workflow')
brainsuite_workflow.base_dir='./'

As described on our SVReg Page, SVReg requires a collection of files produced by CSE, all contained within the same folder. The Nipype interface for the CSE command line tools automatically appends the appropriate suffixes onto auto-generated output file names, when the user does not specify an output file name. However, in order to collect all these required outputs in the same folder, we must use Nipype’s DataSink.

Here, we create Nodes containing DataSink objects. Note that you should pass absolute paths to DataSink's base_directory input. We use os.path.expanduser() to replace the abbreviation ~ with the absolute path to the user's home directory.

In [4]:
#Note that you must pass an absolute path to DataSink base_directory
#We use os.path.expanduser to expand ~/ to our user home directory

#We use this DataSink Node to save CSE outputs that will eventually be required by SVReg
ds = pe.Node(io.DataSink(), name='DATASINK')
ds.inputs.base_directory= os.path.expanduser('~/Documents/')

#We will use this DataSink Node to save BFC file for BDP
ds_BFC = pe.Node(io.DataSink(), name='DATASINK_BFC') 
ds_BFC.inputs.base_directory= os.path.expanduser('~/Documents/BrainSuiteNipype_Tutorial/')

Next we create all nodes for our pipeline.

In [5]:
bseObj = pe.Node(interface=bs.Bse(), name='BSE')
bseObj.inputs.inputMRIFile = '~/Documents/BrainSuiteNipype_Tutorial/2523412.nii.gz' #Provided input files
bfcObj = pe.Node(interface=bs.Bfc(),name='BFC')


bdp = pe.Node(interface=bs.BDP(), name='BDP')

#This file does not exist yet. 
#Hence we delay execution of BDP until this file does exist after ds_BFC sinks it to this specified location.
bdp.inputs.bfcFile = '~/Documents/BrainSuiteNipype_Tutorial/DWI/2523412.bfc.nii.gz'
bdp.inputs.inputDiffusionData = '~/Documents/BrainSuiteNipype_Tutorial/DWI/2523412.dwi.nii.gz' #Provided input files
bdp.inputs.BVecBValPair = ['~/Documents/BrainSuiteNipype_Tutorial/DWI/2523412.dwi.bvec', '~/Documents/BrainSuiteNipype_Tutorial/DWI/2523412.dwi.bval'] #Provided input files

pvcObj = pe.Node(interface=bs.Pvc(), name = 'PVC')

In this example, we use the atlas files that are included with BrainSuite. We determine your BrainSuite atlas directory using distutils.spawn.find_executable().

In [6]:
from distutils.spawn import find_executable
brainsuite_atlas_directory = find_executable('bse')[:-3] + '../atlas/'

cerebroObj = pe.Node(interface=bs.Cerebro(), name='CEREBRO')
#Provided atlas files
cerebroObj.inputs.inputAtlasMRIFile =(brainsuite_atlas_directory + 'brainsuite.icbm452.lpi.v08a.img')
cerebroObj.inputs.inputAtlasLabelFile = (brainsuite_atlas_directory + 'brainsuite.icbm452.v15a.label.img')

We continue creating nodes.

In [7]:
cortexObj = pe.Node(interface=bs.Cortex(), name='CORTEX')
scrubmaskObj = pe.Node(interface=bs.Scrubmask(), name='SCRUBMASK')
tcaObj = pe.Node(interface=bs.Tca(), name='TCA')
dewispObj=pe.Node(interface=bs.Dewisp(), name='DEWISP')
dfsObj=pe.Node(interface=bs.Dfs(),name='DFS')
pialmeshObj=pe.Node(interface=bs.Pialmesh(),name='PIALMESH')
hemisplitObj=pe.Node(interface=bs.Hemisplit(),name='HEMISPLIT')

svreg = pe.Node(interface=bs.SVReg(), name='SVREG')
svreg.inputs.subjectFilePrefix = '~/Documents/BrainSuiteNipype_Tutorial/2523412'

Add nodes to our workflow

In [8]:
brainsuite_workflow.add_nodes([bseObj, bfcObj, ds_BFC, bdp, pvcObj, cerebroObj, cortexObj, scrubmaskObj, 
                               tcaObj, dewispObj, dfsObj, pialmeshObj, hemisplitObj, ds, svreg])

Connect Nodes to each other

In [9]:
brainsuite_workflow.connect(bseObj, 'outputMRIVolume', bfcObj, 'inputMRIFile')
brainsuite_workflow.connect(bfcObj, 'outputMRIVolume', pvcObj, 'inputMRIFile')
brainsuite_workflow.connect(bfcObj, 'outputMRIVolume', cerebroObj, 'inputMRIFile')

As an example, suppose we wish to keep BDP related files in their own folder. We use ds_BFC to sink BFC's outputMRIVolume. We also connect ds_BFC's out_file to bdp's dataSinkDelay. This is a way of forcing BDP to delay execution until ds_BFC has finished sinking its files.

The data that gets passed to the dataSinkDelay input is irrelevant and is only used to aid in delaying execution; our Nipype interface does not change the parameters of BDP in any way based on the data received from DataSink’s out_file, and simply appends an empty string to the parameter list. However, by making this connection, we create a ‘dependency’ for BDP, forcing the pipeline to delay BDP’s execution until the connected DataSink (in this case, ds_BFC) has finished saving outputs. The dataSinkDelay feature is included in the interfaces for BDP and SVReg.

Later in the workflow, we again use dataSinkDelay to delay SVReg's execution.

In [10]:
brainsuite_workflow.connect(bfcObj, 'outputMRIVolume', ds_BFC, 'DWI')
#ds_BFC will create ~/Documents/BrainSuiteNipype_Tutorial/DWI/2523412.bfc.nii.gz
brainsuite_workflow.connect(ds_BFC, 'out_file', bdp, 'dataSinkDelay')
#BDP's dataSinkDelay forces BDP to delay execution until ds_BFC has finished saving the bfc file

Finish making connections

In [11]:
brainsuite_workflow.connect(cerebroObj, 'outputLabelVolumeFile', cortexObj, 'inputHemisphereLabelFile')
brainsuite_workflow.connect(pvcObj, 'outputTissueFractionFile', cortexObj, 'inputTissueFractionFile')
brainsuite_workflow.connect(cortexObj, 'outputCerebrumMask', scrubmaskObj, 'inputMaskFile')
brainsuite_workflow.connect(cortexObj, 'outputCerebrumMask', tcaObj, 'inputMaskFile')
brainsuite_workflow.connect(tcaObj, 'outputMaskFile', dewispObj, 'inputMaskFile')
brainsuite_workflow.connect(dewispObj, 'outputMaskFile', dfsObj, 'inputVolumeFile')
brainsuite_workflow.connect(dfsObj, 'outputSurfaceFile', pialmeshObj, 'inputSurfaceFile')
brainsuite_workflow.connect(pvcObj, 'outputTissueFractionFile', pialmeshObj, 'inputTissueFractionFile')
brainsuite_workflow.connect(cerebroObj, 'outputCerebrumMaskFile', pialmeshObj, 'inputMaskFile')
brainsuite_workflow.connect(dfsObj, 'outputSurfaceFile', hemisplitObj, 'inputSurfaceFile')
brainsuite_workflow.connect(cerebroObj, 'outputLabelVolumeFile', hemisplitObj, 'inputHemisphereLabelFile')
brainsuite_workflow.connect(pialmeshObj, 'outputSurfaceFile', hemisplitObj, 'pialSurfaceFile')

Here, we make our DataSink connections. ds will save all files required by SVReg in one folder. The name of datasink’s input represents a folder name that is appended to DataSink’s base directory. Thus the argument ‘BrainSuiteNipype_Tutorial’ indicates that we wish to save the output from bseObj in a folder called BrainSuiteNipype_Tutorial, within datasink’s base directory.

Multiple connections to ports of the same name are not allowed. However, appending .@suffix to the folder name (for example, BrainSuiteNipype_Tutorial.@1) allows us to create ports with different names that all point to the same folder. The text after .@ serves only to distinguish the port names. See Nipype's page on DataSink for more information.

In [12]:
#**DataSink connections**
brainsuite_workflow.connect(bseObj, 'outputMRIVolume', ds, 'BrainSuiteNipype_Tutorial')
brainsuite_workflow.connect(bseObj, 'outputMaskFile', ds, 'BrainSuiteNipype_Tutorial.@1')
brainsuite_workflow.connect(bfcObj, 'outputMRIVolume', ds, 'BrainSuiteNipype_Tutorial.@2')
brainsuite_workflow.connect(pvcObj, 'outputLabelFile', ds, 'BrainSuiteNipype_Tutorial.@3')
brainsuite_workflow.connect(pvcObj, 'outputTissueFractionFile', ds, 'BrainSuiteNipype_Tutorial.@4')
brainsuite_workflow.connect(cerebroObj, 'outputCerebrumMaskFile', ds, 'BrainSuiteNipype_Tutorial.@5')
brainsuite_workflow.connect(cerebroObj, 'outputLabelVolumeFile', ds, 'BrainSuiteNipype_Tutorial.@6')
brainsuite_workflow.connect(cerebroObj, 'outputAffineTransformFile', ds, 'BrainSuiteNipype_Tutorial.@7')
brainsuite_workflow.connect(cerebroObj, 'outputWarpTransformFile', ds, 'BrainSuiteNipype_Tutorial.@8')
brainsuite_workflow.connect(cortexObj, 'outputCerebrumMask', ds, 'BrainSuiteNipype_Tutorial.@9')
brainsuite_workflow.connect(scrubmaskObj, 'outputMaskFile', ds, 'BrainSuiteNipype_Tutorial.@10')
brainsuite_workflow.connect(tcaObj, 'outputMaskFile', ds, 'BrainSuiteNipype_Tutorial.@11')
brainsuite_workflow.connect(dewispObj, 'outputMaskFile', ds, 'BrainSuiteNipype_Tutorial.@12')
brainsuite_workflow.connect(dfsObj, 'outputSurfaceFile', ds, 'BrainSuiteNipype_Tutorial.@13')
brainsuite_workflow.connect(pialmeshObj, 'outputSurfaceFile', ds, 'BrainSuiteNipype_Tutorial.@14')
brainsuite_workflow.connect(hemisplitObj, 'outputLeftHemisphere', ds, 'BrainSuiteNipype_Tutorial.@15')
brainsuite_workflow.connect(hemisplitObj, 'outputRightHemisphere', ds, 'BrainSuiteNipype_Tutorial.@16')
brainsuite_workflow.connect(hemisplitObj, 'outputLeftPialHemisphere', ds, 'BrainSuiteNipype_Tutorial.@17')
brainsuite_workflow.connect(hemisplitObj, 'outputRightPialHemisphere', ds, 'BrainSuiteNipype_Tutorial.@18')
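As an aside, the folder-name-plus-suffix convention can be illustrated with a small stand-alone helper. This function is not part of Nipype; it is only a sketch of the naming rule, showing that differently named ports resolve to the same destination folder:

```python
import os

def sink_path(base_directory, input_name, filename):
    """Sketch of how a DataSink input name maps to a save location.

    'folder' and 'folder.@suffix' both resolve to base_directory/folder;
    the text after '.@' serves only to make the port name unique.
    """
    folder = input_name.split('.@')[0]
    return os.path.join(base_directory, folder, filename)

# All three port names point to the same folder:
print(sink_path('/home/user/Documents', 'BrainSuiteNipype_Tutorial', 'a.nii.gz'))
print(sink_path('/home/user/Documents', 'BrainSuiteNipype_Tutorial.@1', 'b.nii.gz'))
print(sink_path('/home/user/Documents', 'BrainSuiteNipype_Tutorial.@2', 'c.nii.gz'))
```

This is why every CSE output above lands in the single folder that SVReg expects, even though each connection uses a unique port name.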

As explained before, we must ensure that SVReg does not execute until all the required files produced by CSE have been saved to a central folder by DataSink. Simply calling SVReg in a new workflow, separate from brainsuite_workflow, would work, but then SVReg would have to wait until brainsuite_workflow completes entirely; this might mean waiting for BDP, a process SVReg does not depend on, to finish. However, we want SVReg to execute immediately once its dependencies have been satisfied, in parallel with BDP if possible. Thus we keep SVReg in brainsuite_workflow.

To force SVReg to delay execution until DataSink has finished, we use the dataSinkDelay feature. As before, we connect the DataSink’s ‘out_file’ output to SVReg’s ‘dataSinkDelay’ input, effectively forcing SVReg to delay execution until ds has finished sinking its files.

(If we didn't use dataSinkDelay to create a dependency, the parallel execution model would submit SVReg's job at the very beginning of the pipeline, without waiting for the CSE nodes to finish, which is not what we want.)

In [13]:
brainsuite_workflow.connect(ds, 'out_file', svreg, 'dataSinkDelay')
#SVReg's dataSinkDelay forces SVReg to delay execution until ds has finished saving the CSE files

Now that the setup is complete, we are ready to run our workflow. The syntax used below tells the workflow engine to use Python's multiprocessing library during execution. More information on Nipype plugins can be found here.

After brainsuite_workflow has completed, we run thicknessPVC as a stand-alone node.

The workflow will take around 6 hours to fully complete. You may follow the status of the workflow by opening the terminal that is associated with this Jupyter session.

The ShimWarning you may receive is expected; it is a known issue in Nipype that is being worked on, and should not affect run results.

In [14]:
brainsuite_workflow.run(plugin='MultiProc') #Run workflow, using MultiProc

thicknessPVC = bs.ThicknessPVC() #Run thicknessPVC when brainsuite_workflow is done
thicknessPVC.inputs.subjectFilePrefix = '~/Documents/BrainSuiteNipype_Tutorial/2523412'
thicknessPVC.run()

#Print message when all processing is complete.
print('Processing has completed.')

The file 2523412.nii.gz is part of the Beijing Enhanced dataset, released by Beijing Normal University as part of the International Neuroimaging Data-sharing Initiative (INDI) under a Creative Commons Attribution-NonCommercial 3.0 Unported License (CC BY-NC). More information on the complete dataset is available online here, and it can be downloaded from NITRC.