Parabricks

NVIDIA Parabricks is a GPU-accelerated software suite for secondary analysis of next-generation sequencing (NGS) data. Previously branded as NVIDIA Clara Parabricks, it produces results equivalent to those of popular open-source tools such as GATK.

Installation

Parabricks is provided as a Docker image. On Palmetto, the Docker image has already been downloaded and converted to run with Apptainer.

You can access and use the Parabricks container with Palmetto's module system.

$ module spider parabricks

------------------------------------------------------------------------------------------------------------------------------------------------
ngc/parabricks:
------------------------------------------------------------------------------------------------------------------------------------------------
Versions:
ngc/parabricks/4.1.1-1
ngc/parabricks/4.5.0-1

------------------------------------------------------------------------------------------------------------------------------------------------
For detailed information about a specific "ngc/parabricks" package (including how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other modules.
For example:

$ module spider ngc/parabricks/4.5.0-1
------------------------------------------------------------------------------------------------------------------------------------------------

Choosing a compute node

The installation requirements list the supported GPUs. Parabricks can run on any NVIDIA GPU that supports CUDA architecture 75, 80, 86, 89, 90, 100, or 120 and has at least 16GB of GPU RAM. Many tools have default settings that require more than 16GB of VRAM; these cases are documented on the installation requirements page.

On Palmetto, we have the following supported GPUs:

GPU     VRAM
A40     48GB
A100    40GB
A100    80GB
L40S    48GB
H100    80GB
H200    141GB
Note: The number of GPUs of each type varies, as does the number of GPUs per server. Consult the Hardware Table for more information. If you'd like to purchase a GPU node for use with Parabricks, send us a ticket!
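
Once a job is running on a GPU node, one way to confirm that the allocated GPU meets the 16GB minimum is to parse the output of `nvidia-smi`. The sketch below is illustrative only: `check_vram` is a hypothetical helper name, and the 15000 MiB threshold is an assumption, chosen because the memory a driver reports can be slightly below the nominal size.

```shell
# check_vram MIN_MIB: read "name, memory_in_MiB" lines on stdin and
# report whether each GPU has at least MIN_MIB of memory.
# (Hypothetical helper, not part of Parabricks or Palmetto.)
check_vram() {
    awk -F', ' -v min="$1" '{
        status = ($2 + 0 >= min + 0) ? "ok" : "below minimum"
        printf "%s: %d MiB (%s)\n", $1, $2, status
    }'
}

# On a GPU node (not run here):
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits | check_vram 15000

# Example with sample input instead of a live GPU:
printf 'NVIDIA A100-SXM4-40GB, 40960\n' | check_vram 15000
```

Passing the threshold as an argument makes it easy to raise for tools whose default settings need more than 16GB.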

Running Parabricks

This section covers how to run the Getting Started tutorials from NVIDIA.

Downloading the Data

srun wget -O /scratch/$USER/parabricks_sample.tar.gz \
"https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz" 2>/dev/null
srun --mem=4G tar xvf /scratch/$USER/parabricks_sample.tar.gz -C /scratch/$USER
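
Because the download command above discards error output, it can be worth confirming that the extracted files exist before requesting a GPU. The following is a minimal sketch; `check_sample_data` is a hypothetical helper, and the file list is taken from the paths used in the commands later in this guide.

```shell
# check_sample_data DIR: verify that the files used later in this guide
# exist and are non-empty under DIR. Prints each missing path to stderr
# and returns non-zero if any are missing. (Hypothetical helper.)
check_sample_data() {
    dir="$1"
    missing=0
    for f in Ref/Homo_sapiens_assembly38.fasta \
             Data/sample_1.fq.gz \
             Data/sample_2.fq.gz; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical call after extraction (not run here):
#   check_sample_data "/scratch/$USER/parabricks_sample" && echo "sample data looks complete"
```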

FQ2BAM

salloc -c16 --mem=64G --gpus=a100:1 -t 30

module load ngc/parabricks

pbrun fq2bam \
--ref /scratch/$USER/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-fq /scratch/$USER/parabricks_sample/Data/sample_1.fq.gz /scratch/$USER/parabricks_sample/Data/sample_2.fq.gz \
--out-bam /scratch/$USER/fq2bam_output.bam

HaplotypeCaller

salloc -n1 --mem=64G --gpus=a100:1 -t 30

module load ngc/parabricks

pbrun haplotypecaller \
--ref /scratch/$USER/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-bam /scratch/$USER/fq2bam_output.bam \
--out-variants /scratch/$USER/variants.vcf
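
The two interactive steps above can also be combined into a single batch job. The following is a sketch under stated assumptions rather than a tested Palmetto recipe: the file name `parabricks_job.sh`, the job name, and the 60-minute time limit are assumptions, while the resource requests and `pbrun` arguments mirror the commands above.

```shell
# Write a batch script that runs fq2bam followed by haplotypecaller.
# The quoted 'EOF' keeps $USER unexpanded until the job itself runs.
cat > parabricks_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=parabricks-tutorial
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --gpus=a100:1
#SBATCH --time=60

# Stop at the first failing command so haplotypecaller never runs
# against a missing or partial BAM file.
set -e

# Pin one of the versions listed by `module spider parabricks`.
module load ngc/parabricks/4.5.0-1

SAMPLE=/scratch/$USER/parabricks_sample

pbrun fq2bam \
    --ref "$SAMPLE/Ref/Homo_sapiens_assembly38.fasta" \
    --in-fq "$SAMPLE/Data/sample_1.fq.gz" "$SAMPLE/Data/sample_2.fq.gz" \
    --out-bam "/scratch/$USER/fq2bam_output.bam"

pbrun haplotypecaller \
    --ref "$SAMPLE/Ref/Homo_sapiens_assembly38.fasta" \
    --in-bam "/scratch/$USER/fq2bam_output.bam" \
    --out-variants "/scratch/$USER/variants.vcf"
EOF

# Submit from a login node (not run here):
#   sbatch parabricks_job.sh
```

Loading a specific module version in the script keeps results reproducible if the default version changes later.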