Spatial gene expression data analysis on Cluster (10X Genomics, Space Ranger)
This post covers running spaceranger in cluster mode, with Sun Grid Engine (SGE) as the job queuing system.
There are two steps to analyze spatial RNA-seq data [1].
Step 1: spaceranger mkfastq demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files.
Step 2: spaceranger count takes FASTQ files from spaceranger mkfastq and performs alignment, filtering, barcode counting, and UMI counting.
Running the pipelines on a cluster requires the following:
1. Load the Space Ranger module (spaceranger-1.0.0) [1], or download and uncompress spaceranger in your $HOME directory and add it to PATH in ~/.bashrc.
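For the manual route, a minimal sketch is shown here. It assumes the tarball was saved in $HOME as spaceranger-1.0.0.tar.gz (the file name and location are assumptions, adjust them to your download).

```shell
# Assumption: the download was saved as $HOME/spaceranger-1.0.0.tar.gz.
SPACERANGER_DIR="$HOME/spaceranger-1.0.0"

# Unpack only if the tarball is present and has not been unpacked yet.
if [ -f "$HOME/spaceranger-1.0.0.tar.gz" ] && [ ! -d "$SPACERANGER_DIR" ]; then
    tar -xzf "$HOME/spaceranger-1.0.0.tar.gz" -C "$HOME"
fi

# Put the spaceranger directory on PATH for this shell.
# Add the same export line to ~/.bashrc to make it permanent.
export PATH="$SPACERANGER_DIR:$PATH"
```

After opening a new shell, `which spaceranger` should report the path inside that directory.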
2. Update the job config file (spaceranger-1.0.0/external/martian/jobmanagers/config.json) for threads and memory. For example:
"threads_per_job": 8,
"memGB_per_job": 64,
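To confirm the edit, the two keys can be printed back from the file. The helper name `show_job_limits` is hypothetical, and the path in the usage comment assumes spaceranger was unpacked in $HOME.

```shell
# Print the thread and memory limits currently set in a Martian job config file.
# show_job_limits is a hypothetical helper name, not part of spaceranger.
show_job_limits() {
    grep -E '"(threads_per_job|memGB_per_job)"' "$1"
}

# Usage (path assumes the install location from step 1):
# show_job_limits "$HOME/spaceranger-1.0.0/external/martian/jobmanagers/config.json"
```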
3. Update the template file (spaceranger-1.0.0/external/martian/jobmanagers/sge.template).
#!/bin/bash
#$ -pe smp __MRO_THREADS__
##$ -l mem_free=__MRO_MEM_GB__G
(this line is commented out here with the extra #; leave it commented out if your cluster does not support memory requests)
#$ -q b.q
#$ -S /bin/bash
#$ -m abe
#$ -M <e-mail>
cd __MRO_JOB_WORKDIR__
source ../spaceranger-1.0.0/sourceme.bash
(replace the relative path with the complete path to sourceme.bash)
For clusters whose job managers do not support memory requests, it is possible to request memory in the form of cores via the --mempercore command-line option. This option scales up the number of threads requested via the __MRO_THREADS__ variable according to how much memory a stage requires.
Read more at Cluster Mode
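The scaling can be illustrated with a little arithmetic. The sketch below assumes the thread count is raised to the number of cores needed to cover the stage's memory (memory divided by --mempercore, rounded up) whenever that exceeds the threads the stage asked for. The function `scaled_threads` is a hypothetical illustration, not Martian's actual code.

```shell
# scaled_threads THREADS MEM_GB MEMPERCORE
# Hypothetical illustration of how --mempercore turns a memory need into cores.
scaled_threads() {
    local threads=$1 mem_gb=$2 mempercore=$3
    # Cores needed to cover the memory, rounded up (integer ceiling division).
    local by_mem=$(( (mem_gb + mempercore - 1) / mempercore ))
    if [ "$by_mem" -gt "$threads" ]; then
        echo "$by_mem"
    else
        echo "$threads"
    fi
}

scaled_threads 1 12 8   # 1 thread, 12 GB, 8 GB per core -> prints 2
scaled_threads 2 40 8   # 2 threads, 40 GB, 8 GB per core -> prints 5
scaled_threads 8 64 8   # memory already covered by 8 threads -> prints 8
```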
4. Download the spatial gene expression data, image file, and reference genome datasets from 10x Genomics.
5. Create the sge.sh file:
TR="$HOME/refdata-cellranger-mm10-3.0.0"
# Output files will appear in the outs/ subdirectory within the pipeline output directory.
cd $HOME/10xgenomics/out
# The pipeline output directory is named by the --id argument, i.e. Adult_Mouse_Brain.
FASTQS="$HOME/V1_Adult_Mouse_Brain_fastqs"
# DATA_DIR, used by --image, must be set to the directory that contains the image file.
spaceranger count --disable-ui \
--id=Adult_Mouse_Brain \
--transcriptome=${TR} \
--fastqs=${FASTQS} \
--sample=V1_Adult_Mouse_Brain \
--image=$DATA_DIR/V1_Adult_Mouse_Brain_image.tif \
--slide=V19L01-041 \
--area=C1 \
--jobmode=sge \
--mempercore=8 \
--jobinterval=5000 \
--maxjobs=3
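Because a missing input only surfaces after jobs have been queued, it can help to check the paths before launching. This is a minimal sketch; `check_inputs` is a hypothetical helper, not a spaceranger feature.

```shell
# check_inputs PATH...
# Report every path that does not exist; return non-zero if any is missing.
check_inputs() {
    local missing=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage inside sge.sh, before the spaceranger count call:
# check_inputs "$TR" "$FASTQS" "$DATA_DIR/V1_Adult_Mouse_Brain_image.tif" || exit 1
```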
6. Run the command in screen, then detach and reconnect
Use the screen command to log out of and back into the system while keeping the processes running.
screen -S screen_name
bash sge.sh
If you want to exit the terminal without killing the running process, press Ctrl+A, then D.
To reconnect to the screen: screen -R screen_name
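As an alternative to attaching and then detaching by hand, screen can start the session already detached with its -d -m options. The wrapper name `start_detached` is hypothetical.

```shell
# start_detached SESSION_NAME SCRIPT
# Launch SCRIPT inside a new screen session that starts detached (-d -m)
# and is named SESSION_NAME (-S). start_detached is a hypothetical wrapper.
start_detached() {
    local name=$1 script=$2
    screen -dmS "$name" bash "$script"
}

# Usage:
# start_detached screen_name sge.sh
# screen -ls                 # list sessions to confirm it is running
# screen -R screen_name      # reattach when needed
```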
7. Monitor work progress through a web browser
Open the _log file in the output folder Adult_Mouse_Brain. The UI is served only when spaceranger runs without the --disable-ui option.
If the log shows serving UI as http://cluster.university.edu:3600?auth=rlSdT_QLzQ9O7fxEo-INTj1nQManinD21RzTAzkDVJ8, type the following from your laptop:
ssh -NT -L 9000:cluster.university.edu:3600 user@cluster.university.edu
user@cluster.university.edu's password:
Then access the UI in your web browser at the following URL: http://localhost:9000/
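The tunnel command can be derived from the URL in the log. The sketch below assumes the URL has the form http://host:port with an optional path or query, as in the example above. Both `ui_url` and `tunnel_command` are hypothetical helpers, and the default local port 9000 follows the example.

```shell
# ui_url LOGFILE
# Print the last http:// URL found in the log (assumed to be the UI address).
ui_url() {
    grep -o 'http://[^[:space:]]*' "$1" | tail -n 1
}

# tunnel_command URL USER [LOCAL_PORT]
# Print the ssh port-forwarding command for a URL of the form http://host:port[...].
tunnel_command() {
    local url=$1 user=$2 local_port=${3:-9000}
    local hostport=${url#*://}       # drop the scheme
    hostport=${hostport%%[/?]*}      # drop any path or query string
    local host=${hostport%%:*}       # text before the colon
    local port=${hostport##*:}       # text after the colon
    echo "ssh -NT -L ${local_port}:${host}:${port} ${user}@${host}"
}

tunnel_command "http://cluster.university.edu:3600?auth=rlSdT_QLzQ9O7fxEo-INTj1nQManinD21RzTAzkDVJ8" user
# prints: ssh -NT -L 9000:cluster.university.edu:3600 user@cluster.university.edu
```

On the cluster, `ui_url Adult_Mouse_Brain/_log` prints the URL to pass to `tunnel_command`; the printed ssh command is then run from the laptop.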