Spatial RNA-seq data analysis using Space Ranger on SGE Cluster

This post describes how to run spaceranger in cluster mode, with Sun Grid Engine (SGE) as the job scheduler.

Analyzing spatial RNA-seq data takes two steps [1].

Step 1: spaceranger mkfastq demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files.
Step 2: spaceranger count takes FASTQ files from spaceranger mkfastq and performs alignment, filtering, barcode counting, and UMI counting.

Running the pipelines on a cluster requires the following:

1. Load the Space Ranger module (spaceranger-1.0.0) [1], or download and uncompress spaceranger in your $HOME directory and add it to PATH in ~/.bashrc.
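For the manual install route, here is a minimal sketch of adding the directory to PATH without duplicating the line on repeated runs. The function name add_to_path_once is hypothetical, and the install directory is whatever location you unpacked the tarball into.

```shell
# Append an "export PATH=..." line for the given directory to an rc file,
# but only if that exact line is not already present.
# add_to_path_once is a hypothetical helper, not part of Space Ranger.
add_to_path_once() {
    dir="$1"
    rc_file="${2:-$HOME/.bashrc}"
    # \$PATH is written literally so it is expanded when the rc file is read.
    line="export PATH=\"$dir:\$PATH\""
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

For example, run add_to_path_once "$HOME/spaceranger-1.0.0", then source ~/.bashrc and confirm with command -v spaceranger.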

2. Update the job config file (spaceranger-1.0.0/external/martian/jobmanagers/config.json) with the threads and memory to request per job. For example:

"threads_per_job": 8,
"memGB_per_job": 64,

3. Update template file (spaceranger-1.0.0/external/martian/jobmanagers/sge.template).

#!/bin/bash
#$ -pe smp __MRO_THREADS__
##$ -l mem_free=__MRO_MEM_GB__G (comment out this line, as done here, if your cluster does not support it)
#$ -q b.q
#$ -S /bin/bash
#$ -m abe
#$ -M <e-mail>
cd __MRO_JOB_WORKDIR__
source ../spaceranger-1.0.0/sourceme.bash  # update with the complete path

For clusters whose job managers do not support memory requests, it is possible to request memory in the form of cores via the --mempercore command-line option. This option scales up the number of threads requested via the __MRO_THREADS__ variable according to how much memory a stage requires.
Read more in the Space Ranger Cluster Mode documentation.
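To get a feel for what --mempercore does to the thread request, the arithmetic can be sketched as below. This illustrates the idea only and is not the actual Martian code: it assumes the thread count is raised to the stage memory divided by the memory per core, rounded up, and never drops below the threads the stage asked for. The function name threads_for_stage is hypothetical.

```shell
# Estimate the threads requested for a stage under --mempercore.
# Usage: threads_for_stage <stage_threads> <stage_mem_gb> <mem_per_core_gb>
# Assumption: threads = max(stage_threads, ceil(stage_mem_gb / mem_per_core_gb)).
threads_for_stage() {
    threads="$1"
    mem_gb="$2"
    mem_per_core="$3"
    # Integer ceiling division.
    by_mem=$(( (mem_gb + mem_per_core - 1) / mem_per_core ))
    if [ "$by_mem" -gt "$threads" ]; then
        echo "$by_mem"
    else
        echo "$threads"
    fi
}
```

Under this assumption, a single-threaded stage that needs 32 GB with --mempercore=8 would request 4 slots, so the parallel environment in the template must allow at least that many.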

4. Download the spatial gene expression data, the image file, and the reference genome from 10x Genomics.

5. Create the sge.sh file

TR="$HOME/refdata-cellranger-mm10-3.0.0"

Change to the working directory. spaceranger creates the pipeline output directory here, and the output files will appear in the outs/ subdirectory within it.

cd $HOME/10xgenomics/out

The --id argument names the pipeline output directory, here Adult_Mouse_Brain. The command also expects DATA_DIR to be set to the directory that contains the image file.

FASTQS="$HOME/V1_Adult_Mouse_Brain_fastqs"

spaceranger count --disable-ui \
--id=Adult_Mouse_Brain \
--transcriptome=${TR} \
--fastqs=${FASTQS} \
--sample=V1_Adult_Mouse_Brain \
--image=$DATA_DIR/V1_Adult_Mouse_Brain_image.tif \
--slide=V19L01-041 \
--area=C1 \
--jobmode=sge \
--mempercore=8 \
--jobinterval=5000 \
--maxjobs=3
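Because a missing reference or image only shows up after jobs have been submitted, it can help to check the inputs at the top of sge.sh. Here is a small sketch of such a check; check_inputs is a hypothetical helper, not a Space Ranger feature.

```shell
# Return non-zero and report on stderr if any of the given paths is missing.
# check_inputs is a hypothetical helper, not part of Space Ranger.
check_inputs() {
    missing=0
    for item in "$@"; do
        if [ ! -e "$item" ]; then
            echo "missing: $item" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In sge.sh it could be called before spaceranger count as check_inputs "$TR" "$FASTQS" "$DATA_DIR/V1_Adult_Mouse_Brain_image.tif" || exit 1.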

6. Run the script in screen, then detach and reconnect

Use the screen command so that the process keeps running after you log out of the system.

screen -S screen_name

bash sge.sh

If you want to exit the terminal without killing the running process, detach from the screen by pressing Ctrl+A, then D.

To reconnect to the screen: screen -R screen_name

7. Monitor work progress through a web browser

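Open the _log file in the output folder Adult_Mouse_Brain.

If it contains a line such as serving UI as http://cluster.university.edu:3600?auth=rlSdT_QLzQ9O7fxEo-INTj1nQManinD21RzTAzkDVJ8, then run the following from your laptop. Note that the UI is served only when --disable-ui is not passed to spaceranger count.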

ssh -NT -L 9000:cluster.university.edu:3600 user@cluster.university.edu

user@cluster.university.edu's password:

Then open the UI in your web browser at http://localhost:9000/. If access is denied, append the ?auth=... query string from the _log line to the URL.
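The host and port for the tunnel can also be pulled out of the _log file instead of being copied by hand. The sketch below assumes the log line has the form shown above (serving UI as http://host:port?auth=...). The function ui_tunnel_command is hypothetical, and it only prints the ssh command rather than running it.

```shell
# Print the ssh tunnel command for the UI address found in a _log file.
# Usage: ui_tunnel_command <log_file> [local_port] [user]
# ui_tunnel_command is a hypothetical helper, not part of Space Ranger.
ui_tunnel_command() {
    log_file="$1"
    local_port="${2:-9000}"
    user="${3:-user}"
    # Take the last UI URL mentioned in the log.
    url=$(grep -i 'serving UI' "$log_file" | grep -o 'http://[^[:space:]]*' | tail -n 1)
    [ -n "$url" ] || return 1
    hostport=${url#http://}        # drop the scheme
    hostport=${hostport%%[/?]*}    # drop any path or query string
    host=${hostport%%:*}
    port=${hostport##*:}
    echo "ssh -NT -L $local_port:$host:$port $user@$host"
}
```

Run it on the cluster, for example ui_tunnel_command Adult_Mouse_Brain/_log 9000 user, and paste the printed command into a terminal on your laptop.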