Parabricks
NVIDIA Parabricks is a GPU-accelerated software suite for secondary analysis of next-generation sequencing (NGS) data. Also known as Clara Parabricks, it produces results that match those of popular open-source tools such as GATK.
Installation
Parabricks is provided as a Docker image. On Palmetto, the Docker image has already been downloaded and converted to run with Apptainer.
You can access and use the Parabricks container with Palmetto's module system.
$ module spider parabricks
------------------------------------------------------------------------------------------------------------------------------------------------
ngc/parabricks:
------------------------------------------------------------------------------------------------------------------------------------------------
Versions:
ngc/parabricks/4.1.1-1
ngc/parabricks/4.5.0-1
------------------------------------------------------------------------------------------------------------------------------------------------
For detailed information about a specific "ngc/parabricks" package (including how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other modules.
For example:
$ module spider ngc/parabricks/4.5.0-1
------------------------------------------------------------------------------------------------------------------------------------------------
Choosing a compute node
The installation requirements list the supported GPUs. Parabricks can run on any NVIDIA GPU that supports CUDA architecture 75, 80, 86, 89, 90, 100, or 120 and has at least 16GB of GPU RAM. Many tools have default settings that require more than 16GB of VRAM; these are documented on the installation requirements page.
On Palmetto, we have the following supported GPUs:
| GPU | VRAM |
|---|---|
| A40 | 48GB |
| A100 | 40GB |
| A100 | 80GB |
| L40S | 48GB |
| H100 | 80GB |
| H200 | 141GB |
The number of GPUs of each type varies, as does the number of GPUs per server. Consult the Hardware Table for more information. If you'd like to purchase a GPU node for use with Parabricks, send us a ticket!
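Once you have a GPU allocation, you may want to confirm that the GPU you received meets the architecture requirement above. The following is a minimal sketch, not an official check: `cc_supported` is a hypothetical helper name, and the `compute_cap` query field assumes a reasonably recent NVIDIA driver.

```shell
# Supported CUDA architectures, as listed above.
SUPPORTED_ARCHS="75 80 86 89 90 100 120"

# Hypothetical helper: succeeds if a compute capability such as "8.0"
# corresponds to one of the supported architectures.
cc_supported() {
    arch=$(printf '%s' "$1" | tr -d '.')   # "8.0" -> "80"
    for a in $SUPPORTED_ARCHS; do
        if [ "$a" = "$arch" ]; then
            return 0
        fi
    done
    return 1
}

# On a GPU node, report each GPU's name, memory, and compute capability.
# Assumption: the driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total,compute_cap --format=csv,noheader |
    while IFS=, read -r name mem cap; do
        cap=$(printf '%s' "$cap" | tr -d ' ')
        if cc_supported "$cap"; then
            echo "$name ($mem): compute capability $cap is supported"
        else
            echo "$name ($mem): compute capability $cap is NOT supported"
        fi
    done
fi
```

The memory column is printed so you can compare it against the 16GB minimum and the per-tool requirements by eye; the sketch does not enforce a memory threshold.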
Running Parabricks
This section covers how to run the Getting Started tutorials from NVIDIA.
Downloading the Data
srun wget -O /scratch/$USER/parabricks_sample.tar.gz \
"https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz" 2>/dev/null
srun --mem=4G tar xvf /scratch/$USER/parabricks_sample.tar.gz -C /scratch/$USER
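Because the download command above discards error output, it can be worth confirming that the archive extracted fully before requesting a GPU. The sketch below checks only the three files referenced by the commands in this guide; `check_sample_files` is a hypothetical helper name, not part of Parabricks.

```shell
# Hypothetical helper: succeeds only if the three files used by the
# tutorials in this guide exist and are non-empty under the given directory.
check_sample_files() {
    dir=$1
    status=0
    for f in Ref/Homo_sapiens_assembly38.fasta \
             Data/sample_1.fq.gz \
             Data/sample_2.fq.gz; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return $status
}

# Usage after the extraction step has finished:
#   check_sample_files "/scratch/$USER/parabricks_sample" && echo "sample data looks complete"
```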
FQ2BAM
salloc -c16 --mem=64G --gpus=a100:1 -t 30
module load ngc/parabricks
pbrun fq2bam \
--ref /scratch/$USER/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-fq /scratch/$USER/parabricks_sample/Data/sample_1.fq.gz /scratch/$USER/parabricks_sample/Data/sample_2.fq.gz \
--out-bam /scratch/$USER/fq2bam_output.bam
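The same run can also be submitted as a batch job instead of an interactive session. The sketch below writes a batch script that requests the same resources as the `salloc` command above; the file name `fq2bam.sbatch` and the job name are illustrative choices, and it assumes the `module` command is available inside batch jobs on your system.

```shell
# Write a batch script that mirrors the interactive fq2bam session.
# The quoted 'EOF' keeps $USER unexpanded until the job actually runs.
cat > fq2bam.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=pb-fq2bam
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --gpus=a100:1
#SBATCH --time=00:30:00

module load ngc/parabricks

pbrun fq2bam \
    --ref /scratch/$USER/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq /scratch/$USER/parabricks_sample/Data/sample_1.fq.gz /scratch/$USER/parabricks_sample/Data/sample_2.fq.gz \
    --out-bam /scratch/$USER/fq2bam_output.bam
EOF

# Submit from a login node once the script looks right:
#   sbatch fq2bam.sbatch
```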
HaplotypeCaller
salloc -n1 --mem=64G --gpus=a100:1 -t 30
module load ngc/parabricks
pbrun haplotypecaller \
--ref /scratch/$USER/parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-bam /scratch/$USER/fq2bam_output.bam \
--out-variants /scratch/$USER/variants.vcf
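As a quick sanity check on the result, you can count the variant records in the output file. This is a minimal sketch that assumes the VCF is uncompressed, as in the command above; `count_variants` is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of variant records in an
# uncompressed VCF, i.e. the lines that do not start with '#'.
count_variants() {
    grep -vc '^#' "$1"
}

# Usage once pbrun haplotypecaller has finished:
#   count_variants "/scratch/$USER/variants.vcf"
```

Note that `grep -c` exits with a non-zero status when the count is zero, so an empty call set will both print `0` and report failure to the shell.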