BIDS Containers

Author/Maintainer: Dianne Patterson Ph.D. dkp @ email.arizona.edu
Date Created: 2019_07_30
Date Updated: 2020_04_06
Tags: BIDS, containers
OS: UNIX (e.g., Mac or Linux)

Introduction

This page provides detailed instructions, examples and scripts for using Singularity containers that I provide on the HPC for the neuroimaging community. Some discussion of Docker containers is also provided. As always, check Date Updated above to ensure you are not getting stale information. My preference is to point to official documentation when I can find it and think it is useful. If you have found documentation you believe I should consider linking to, please let me know (dkp @ email.arizona.edu).

BIDS Containers are BIG and Resource Intensive

  • The containers we use for neuroimage processing tend to be large and resource intensive. That brings its own set of problems.
  • For Docker on the Mac, you need to check Preferences ➜ Disk to allow Docker plenty of room to store the containers it builds (several containers can easily require 50-100 GB of space). In addition, check Preferences ➜ Advanced to make sure you give Docker plenty of resources to run tools like Freesurfer. For example, I allocate 8 CPUs and 32 GB of memory to Docker (you can confirm what Docker actually sees with the quick check after this list).
  • In theory, it is possible to move your Docker storage to an external disk. In my experience this is flaky and results in slowness and lots of crashing (early 2019). Of course, it could be better in the future.
  • The implications of container size for Singularity containers are explored here.
  • Below I provide information about neuroimaging Singularity containers I maintain on the U of A HPC. You are free to use these if you have an HPC account. In addition to providing containers, I also provide example scripts for running those containers. You’ll have to copy and modify those scripts for your own directories and group name. A detailed walk-through is provided in the BET section below.
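
The quick check mentioned in the Docker bullet above: docker info reports the CPU and memory allocation Docker is actually running with on your machine.

# Show how many CPUs and how much memory Docker currently has available
docker info --format '{{.NCPU}} CPUs, {{.MemTotal}} bytes of memory'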

Create Subdirectories under Derivatives for your Outputs

In general, you want to put the results of running these containers into a derivatives directory. fMRIPrep creates subdirectories under derivatives, which seems like a good idea: it keeps the outputs of different containers separated and it does not annoy the BIDS validator. At the time of this writing, 02/27/2020, the standards for naming in the derivatives directory have not been finalized.

Note

It is a good idea to create the derivatives directory before you run your Docker or Singularity container. Sometimes the containerized app looks for a directory and fails if it does not find it.
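
For example, from the directory that contains your BIDS dataset (bids_data here):

# -p makes this safe to run even if derivatives already exists
mkdir -p bids_data/derivatives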

BET

This is a small neuroimaging container (under 500 MB), which runs quickly. This walk-through will provide you with experience working at the Unix command line, transferring data, running interactive and batch mode processes on the HPC, building a Singularity container from a Docker container, and running a BIDS-compliant workflow with Singularity on the HPC. This assumes you have an HPC account and are familiar with the material on the HPC page. You will also need some experience with the Unix command line.

Login to OOD and Get Ready

  • Open a File Explorer window in your home directory (Files ➜ Home Directory).
  • From the File Explorer window, select Open in Terminal (at the top, 2nd from the left) and choose Ocelote.

Use the Dataset on the HPC

Copy the tutorial data to your home directory: The first command below, cd, ensures you are actually in your home directory. The second command copies the dataset to your home directory:

cd
cp /extra/dkp/tutorial_dataset/bids_data.zip .

Optional: Try Data Transfer

Download bids_data.zip. The dataset is too big to upload with the OOD upload button. Use scp instead, e.g. (note you will need to use your username instead of dkp):

scp -v bids_data.zip dkp@filexfer.hpc.arizona.edu:~

Prepare the Dataset

The first command below unzips the dataset. The second command changes directories so you are in the unzipped dataset folder. The third command creates a derivatives subdirectory for your output. The last command changes your directory again, so you are up one level from bids_data:

unzip bids_data.zip
cd bids_data
mkdir derivatives
cd ..

Build the BET Singularity Container

Copy interact_small.sh from /extra/dkp/Scripts:

cp /extra/dkp/Scripts/interact_small.sh .

Start an interactive session:

./interact_small.sh

interact_small.sh tells you what groups you are in, so you can start it correctly. I am in group dkp, so the following works for me. It may take several minutes to start your interactive job depending on how busy the HPC is:

./interact_small.sh dkp
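
Under the hood, a script like this wraps an interactive PBS request. The sketch below shows roughly what that request looks like; the queue, resources, and walltime are placeholders, not the actual contents of interact_small.sh:

# Roughly what interact_small.sh submits on your behalf (values are illustrative)
qsub -I -N interact -W group_list=dkp -q standard \
    -l select=1:ncpus=1:mem=4gb -l walltime=01:00:00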

Once you have an interactive prompt, you can build the Singularity container. First, you have to load the Singularity module so the HPC will understand Singularity commands you issue. Second, you can build the container by pulling it from dockerhub:

module load singularity
singularity build bet.sif docker://bids/example
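
If the build succeeds, bet.sif appears in your current directory. A quick way to confirm:

# Confirm the image file exists and view its metadata
ls -lh bet.sif
singularity inspect bet.sif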

Note

If you have any trouble with this step, you can use /extra/dkp/singularity-images/bet.sif instead. You are welcome to copy it, but you can also use it without copying it.

Run BET with Singularity

You should be in interactive mode and have the Singularity module loaded. Determine whether the Singularity module is loaded:

module list

You should see something like this (we are interested in #3):

Currently Loaded Modulefiles:
1) pbspro/19.2.4   2) gcc/6.1.0(default)   3) singularity/3/3.5.3

Run Singularity on the dataset:

singularity run --cleanenv --bind bids_data:/data ./bet.sif  /data /data/derivatives participant --participant_label 1012

If you do not have the bet.sif container in your home directory, you can use the one in /extra/dkp/singularity-images:

singularity run --cleanenv --bind bids_data:/data /extra/dkp/singularity-images/bet.sif  /data /data/derivatives participant --participant_label 1012

If Singularity runs properly it creates sub-1012_brain.nii.gz in the bids_data/derivatives directory. Confirm that it worked:

cd bids_data
ls derivatives

Provided that worked, we can run the group-level BIDS command. Let’s try it from the bids_data directory this time:

singularity run --cleanenv --bind ${PWD}:/data ../bet.sif /data /data/derivatives group

That should create avg_brain_size.txt in the derivatives directory.

Understanding the Singularity Command

Singularity takes a number of subcommands. So far you’ve seen build and run: build creates the .sif file, and run uses that file to perform some processing.

  • --cleanenv prevents conflicts between libraries outside the container and libraries inside the container; although sometimes the container runs fine without --cleanenv, it is generally a good idea to include it.
  • --bind: Singularity (like Docker) has a concept of what is inside the container and what is outside the container. The BIDS standard requires that certain directories exist inside every BIDS container, e.g., /data (or sometimes /bids_dataset) and /outputs. You must bind your preferred directory outside the container to these internal directories. Order is important (outside:inside). Here are our two examples: --bind bids_data:/data and --bind ${PWD}:/data
  • What container are we running? You must provide the unix path to the container. There are three examples here:
    • ./bet.sif assumes that bet.sif is in the same directory where you are running the singularity command.
    • /extra/dkp/singularity-images/bet.sif provides the path to bet.sif in /extra/dkp/singularity-images.
    • ../bet.sif says the container is up one directory level from where you are running the singularity command.
  • BIDS requires that we list input and output directories. This is relative to the bind statement that defines the directory on the outside corresponding to /data on the inside. Thus /data/derivatives will correctly find our derivatives directory outside the container. This is the same for Docker containers.
  • Finally, further BIDS options are specified just like they would be for the corresponding Docker runs.
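
Putting those pieces together, here is the participant-level command from above with each part labeled (the command itself is unchanged):

# --cleanenv                 isolate host environment variables
# --bind bids_data:/data     map bids_data (outside) onto /data (inside)
# ./bet.sif                  path to the container image
# /data /data/derivatives    BIDS input and output directories, as seen inside the container
# participant                analysis level
# --participant_label 1012   standard BIDS-app option selecting the subject
singularity run --cleanenv --bind bids_data:/data ./bet.sif \
    /data /data/derivatives participant --participant_label 1012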

Running a Batch Job

Batch jobs, like interactive mode, use your allocated time. Copy runbet.sh from /extra/dkp/Scripts:

cp /extra/dkp/Scripts/runbet.sh .

The script consists of two sections. The first section specifies the resources you need to run the script. All the scripts I make available to you have pretty good estimates of the time and resources required.

The second part of the script is a standard bash script. It defines a variable Subject and calls Singularity.
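
To make those two sections concrete, here is a condensed sketch of the shape of such a script. It is not the actual contents of runbet.sh (the real script is longer); the queue, resource values, and email directive are placeholders:

#!/bin/bash
### Section 1: PBS directives (resource requests); values here are illustrative
#PBS -N bet
#PBS -W group_list=dkp
#PBS -q standard
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=00:10:00
#PBS -M you@email.arizona.edu
#PBS -m abe

### Section 2: ordinary bash; define the subject and call Singularity
Subject=${sub}                      # filled in by qsub -v sub=1012
cd ${PBS_O_WORKDIR}                 # move to the directory the job was submitted from (illustrative)
module load singularity
singularity run --cleanenv --bind bids_data:/data ./bet.sif \
    /data /data/derivatives participant --participant_label ${Subject}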

Open the script with the editor, because you will need to modify it to specify your group name on line 13. Change the group name from group_list=dkp to your own group name, e.g.,:

#PBS -W group_list=akielar

In addition, on line 40 put in your email address instead of mine:

#PBS -M joe@email.arizona.edu

Save the script. We will pass the subject variable to qsub using -v sub=1012:

qsub -v sub=1012 runbet.sh

Look in the active jobs window to see if your job is queued. It runs very quickly so it may complete before you have a chance to see it. When it finishes, it creates a text log (e.g., bet.o3117511) using the name you specified on line 10 of the PBS script. The job submission system should also send you an email from root telling you the job is complete. See below. Exit status=0 means the job completed correctly. The job used 11 seconds of walltime and 5 seconds of cpu time:

PBS Job Id: 3117556.head1.cm.cluster
Job Name:   bet
Execution terminated
Exit_status=0
resources_used.cpupercent=30
resources_used.cput=00:00:05
resources_used.mem=77428kb
resources_used.ncpus=1
resources_used.vmem=746916kb
resources_used.walltime=00:00:11

Other Batch Scripts

Other scripts are available on bitbucket and in /extra/dkp/Scripts. Some of them create additional directory structure or run as arrays. Read more about qsub.

In addition to altering the group and the email in the PBS portion of the scripts, many of the scripts export a top-level directory variable, STUFF. This can be very useful because you’ll want to place some files and directories outside of your BIDS data directory. Change STUFF to point to your directory instead of /extra/dkp, e.g.:

export STUFF=/extra/akielar

  • Change the MRIS variable if appropriate as well. This is the directory containing your BIDS compliant subject directories.
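
In practice, the top of such a script contains a small variable block along these lines (the paths are placeholders; whether MRIS lives under STUFF is up to you):

# Point these at your own directories
export STUFF=/extra/akielar        # scripts, Freesurfer license, work directories
export MRIS=${STUFF}/bids_data     # directory containing your BIDS-compliant subject folders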

BIP

BIP (Bidirectional Iterative Parcellation) runs FSL DWI processing with BedpostX and ProbtrackX2. The twist is that BIP runs ProbtrackX2 iteratively until the size of the connected grey matter regions stabilizes. This provides a unique and useful characterization of connectivity (the locations and volumes of the connected grey matter regions) not available with other solutions.

Patterson, D. K., Van Petten, C., Beeson, P., Rapcsak, S. Z., & Plante, E. (2014). Bidirectional iterative parcellation of diffusion weighted imaging data: separating cortical regions connected by the arcuate fasciculus and extreme capsule. NeuroImage, 102 Pt 2, 704–716. http://doi.org/10.1016/j.neuroimage.2014.08.032

A detailed description of how to run BIP is available as a Readme on the bipbids Bitbucket site. BIP runs in three stages: setup, prep and bip. setup prepares the T1w and DWI images. prep runs eddy, dtifit and bedpostX. bip does the iterating for the selected tracts.

Docker

Any of these steps can be run with a local Docker container: diannepat/bip2. Run docker pull diannepat/bip2 to get the container and download the helpful bash script bip_wrap.sh.
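
A hedged sketch of what a local run might look like follows. The image name is given above, but the run arguments (including how the setup, prep, or bip stage is selected) are assumptions, so treat bip_wrap.sh and the Readme as the authority:

# Pull the image named above
docker pull diannepat/bip2

# Generic BIDS-app-style invocation; the argument pattern is an assumption
# and bip_wrap.sh shows the real one
docker run --rm -v /path/to/bids_data:/data diannepat/bip2 \
    /data /data/derivatives participant --participant_label 1012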

Singularity

To take advantage of GPU processing for the prep and bip steps, you should run the Singularity container.

You can build the Singularity container from the Singularity_bip recipe. See Build Singularity Containers from Recipes.

For more about the GPU code, see Bedpostx_GPU (BedpostX_gpu runs in 5 minutes instead of 12-24 hours for BedpostX) and Probtrackx_GPU (200x faster). The result of running Probtrackx_GPU is slightly different than running probtrackx so don’t mix results from the regular and GPU versions.

A Singularity container is available on the HPC: /extra/dkp/singularity-images/bip2.sif. Several example scripts that facilitate running the different stages of BIP can be found in /extra/dkp/Scripts and on Bitbucket. Here are examples:

# This script calls prep on a single subject that you pass to qsub
qsub -v sub=1012 runbip2prep.sh

# This script calls prep on an array of subjects
qsubr Scripts/arraybip2prep.sh SubjectLists/plante181-252.subjects

# This script runs a single subject and tract that you pass to qsub
qsub -v "sub=1012, tract=arc_l" runbip2bip_1.sh

# This script also runs an array job.
# It includes several calls to singularity, each for a different tract:
qsubr Scripts/arraybip2bip_2.sh SubjectLists/plante253-294.subjects

  • As with all of the scripts I’ve made available, you’ll want to copy them into your own directory and modify the STUFF and MRIS directories, the associated group name, and the email address.
  • In these GPU scripts, you’ll note that 28 CPUs are requested: #PBS -l select=1:ncpus=28:mem=56gb:ngpus=1. This is a whole node on Ocelote and is necessary in order to use the GPU processing capabilities. In addition, you’ll notice that the cuda 10 module is loaded: module load cuda10.0. Cuda is required for GPU processing. Any Cuda version later than cuda80 is backward compatible, so feel free to use the latest and greatest Cuda version available when you run.
  • Example code to run on El Gato’s GPUs is commented out but available in the scripts. There are two differences between El Gato and Ocelote with regard to the GPUs: (1) El Gato has 16 cores per node instead of 28. (2) El Gato uses module load cuda10 instead of module load cuda10.0. (The GPU-related lines are collected in the excerpt after this list.)
  • Just like bip_wrap.sh, these scripts assume derivative data will go into subdirectories of derivatives: fsl_anat_proc, fsl_dwi_proc and bip. These subdirectories are the same ones used by dwiqc, described below, so if you’ve run one, you will not have to worry about creating redundant output or wasting time repeating the same steps.
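
For reference, the GPU-related lines described above look like this when pulled out of the scripts (the Ocelote lines are as described; the El Gato resource line is an assumption based on the 16-core difference):

# Ocelote: request a whole 28-core GPU node and load CUDA
#PBS -l select=1:ncpus=28:mem=56gb:ngpus=1
module load cuda10.0

# El Gato (commented out in the scripts): 16 cores per node, different module name
# #PBS -l select=1:ncpus=16:mem=56gb:ngpus=1   (ncpus/mem here are assumptions)
# module load cuda10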

Warning

BedpostX_gpu will create all the appropriate output files even when it fails to run correctly! Clues are: the files (dyads, mean, merged, etc.) are too small, and the logs in the dwi.bedpostX directory will report the failure. ProbtrackX fails to run correctly when BedpostX has failed.

Job Times

prep takes roughly 3 hours of walltime. bip takes 5-10 minutes for a single tract and single subject. This varies by tract.

DWIQC

DWIQC uses FSL 6.X to run topup, eddy and the eddy quality control metrics. It relies on Siemens fieldmaps and several text files to define the parameters for running.

Docker

DWIQC can be run with a local Docker container: diannepat/dwiqc. Run docker pull diannepat/dwiqc to get the container and download the helpful bash script dwiqc_wrap.sh.

Singularity

A Singularity image is available on the HPC: /extra/dkp/singularity-images/dwiqc.sif, along with scripts in /extra/dkp/Scripts that facilitate running it at the participant level (rundwiqc.sh), at the group level (rundwiqc_group.sh), and as an array job at the participant level (arraydwiqc.sh). Each script has appropriate parameters for cpu, memory and walltime. The dwiqc Readme provides more information.

Here are examples of using the script on the HPC:

# Here we run a single subject
qsub -v sub=1012 Scripts/rundwiqc.sh

# Here we run an array job
qsubr Scripts/arraydwiqcPB.sh SubjectLists/plante078-084.subjects

Job Times

DWIqc on the HPC takes about 2 hours to run. There is probably some variation based on head size and number of directions.

fMRIPrep

There is lots of online documentation for fMRIPrep.

Note

You need to download a Freesurfer license and make sure your container knows where it is. See The Freesurfer License.

Note

Be careful to name sessions with dashes and not underscores. That is, itbs-pre will work fine, but itbs_pre will cause you pain in later processing.

You can rerun fMRIPrep and it’ll find its old output and avoid repeating steps, especially if you have created the -w work directory (the runfmriprep.sh script does this). So, it is not the end of the world if you underestimate the time needed to run.

Determine which version of fMRIPrep is in the Singularity container (This is especially useful for fmriprep as it changes frequently and several versions are available in /extra/dkp/singularity-images):

singularity run --cleanenv /extra/dkp/singularity-images/fmriprep.sif --version

A usable Singularity job script is available here: runfmriprep.sh. This should be easy to modify for your own purposes.
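
The heart of that script is a single singularity call. Below is a minimal sketch of its shape, assuming STUFF and MRIS variables like those described earlier; the binds and the exact option set are assumptions, so the bullets that follow and the script itself are the authority:

# A sketch of the kind of call runfmriprep.sh makes, not the literal script
singularity run --cleanenv --bind ${MRIS}:/data \
    /extra/dkp/singularity-images/fmriprep.sif \
    /data /data/derivatives participant \
    --participant_label ${sub} \
    --fs-license-file ${STUFF}/license.txt \
    -w ${STUFF}/fmriprep_work \
    --stop-on-first-crash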

  • fMRIPrep needs to know where your freesurfer license is. Ensure the singularity command points to your license and that it is outside of your BIDS directory:

    --fs-license-file ${STUFF}/license.txt
    
  • Create a work directory for fMRIPrep. This is where it will store intermediate steps it can use if it has to start over. Like the freesurfer license, this directory should not be in your BIDS directory:

    -w ${STUFF}/fmriprep_work
    
  • BIDS singularity images are big and take some effort to build (Gory details are in Building Large Containers if you want to do it yourself).

  • Currently, the singularity command in this script points to the fMRIPrep singularity container in /extra/dkp/singularity-images:

    /extra/dkp/singularity-images/fmriprep.sif
    
  • Permissions should allow you to use containers in this directory freely. If my version of the containers is satisfactory for you, then you do not need to replicate them in your own directory. I am hoping we’ll have a shared community place for these images at some point (other than my directory).

  • You do not NEED to change any other arguments. --stop-on-first-crash is a good idea. You may wish to test with the reduced version that does not run recon-all. It is currently commented out, but it’ll succeed or fail faster than the default call.

  • Once you are satisfied with the changes you’ve made to your script, run your copy of the script like this:

    qsub -v sub=1012 Scripts/runfmriprep.sh
    
  • qsub needs the path to your script (in my case, it is in the Scripts directory).

  • When the job finishes (or crashes), it’ll leave behind a text log, e.g., fmriprep_extra.o2252098. You can view a log of the job in the Jobs dropdown on the OOD dashboard.

  • Read this log with Edit in OOD or cat at the command line. It may suggest how many resources you should have allocated to the job (scroll to the bottom of the job file). This can tell you whether you have vastly over or under-estimated. In addition, it’ll provide a log of what the container did, which may help you debug.

  • See BIDS containers for more detail about individual containers.

MRIQC

MRIQC is from the Poldracklab, just like fMRIPrep. MRIQC runs quickly and produces nice reports that can alert you to data quality problems (mostly movement, but a few other issues).

Docker

The mriqc Docker container is available on dockerhub:

docker pull poldracklab/mriqc

A wrapper script for the Docker container is also available, mriqc_wrap.sh.
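
Here is a hedged sketch of a local Docker run following the usual BIDS-app bind-mount pattern (the paths are placeholders; mriqc_wrap.sh has the exact arguments used here):

# Participant-level MRIQC on a local BIDS dataset
docker run --rm \
    -v /path/to/bids_data:/data:ro \
    -v /path/to/bids_data/derivatives/mriqc:/out \
    poldracklab/mriqc /data /out participant --participant_label 1012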

Singularity

MRIQC is available on the HPC: /extra/dkp/singularity-images/mriqc.sif, along with two scripts that facilitate running it at the participant level (/extra/dkp/Scripts/runmriqc.sh) and the group level (/extra/dkp/Scripts/runmriqc_group.sh). Here is an example of using the script on the HPC. We pass in the subject number:

qsub -v sub=1012 Scripts/runmriqc.sh

A participant-level MRIQC run with one T1w anatomical image and one fMRI file took about 25 minutes of walltime with 2 CPUs. A group run with a single subject took just under 4 minutes with 2 CPUs.

The HTML reports output by MRIQC can be viewed on OOD by selecting View.

Job Times

MRIQC job times vary by the number of files to be processed. Examples on the HPC are 22 minutes for a T1w image only; 1+ hours for a T1w image and 4 fMRI images.

MRtrix3_connectome

MRtrix3_connectome facilitates running the MRtrix software, which processes DWI images to create a connectome. To determine which version of MRtrix3_connectome you have, you can run the following command on Docker:

docker run --rm bids/mrtrix3_connectome -v

Or, on the HPC, you can run the equivalent Singularity command (this is quick, you do not need to submit a job):

module load singularity
singularity run mrtrix3_connectome.sif -v

Singularity

  • Here’s a script for running MRtrix3_connectome on the HPC using the HCP (Human Connectome) atlas runmrtrix3_hcp.sh

  • And, here’s a script for running MRtrix3_connectome on the HPC using the Desikan atlas runmrtrix3_desikan.sh

  • Finally, below is a call to qsub to run the desikan script on a particular subject.

  • Note that you should copy the script to your own area on the HPC and change the group_list in the script from dkp to your group. The time and CPU requirements should be roughly correct:

    qsub -v sub=1012 runmrtrix3_desikan.sh
    

Note

As of 04/06/2020: Read how operating systems interact with singularity before attempting to run the MRtrix3_connectome container. You have to run mrtrix3_connectome on ElGato or using a special argument in your qsub statement on Ocelote to set CentOS to version 7.

Job Times

Jobs were run on ElGato, because Ocelote’s operating system was too old to run this MRtrix3_connectome container at the time of this writing. See the section on Host OS and Singularity Interactions.

Test Data: DWI image with 64 directions, 2 mm isotropic voxels, and 10 B0 images; T1w image with 1 mm isotropic voxels.

  • desikan parcellation: 16 cores, 8 GB of RAM, and 13 hours of walltime. Working sample script: runmrtrix3_desikan.sh
  • hcpmmp1 parcellation: 16 cores, 13 GB of RAM, and 22 hours of walltime. Working sample script: runmrtrix3_hcp.sh