# Job Arrays

## Overview
A Slurm job array allows you to submit a collection of similar jobs with a single `sbatch` command. It is an efficient method for submitting many jobs that are conceptually similar, such as tasks that vary only by input data or parameters. Using job arrays, you can submit and manage a large number of jobs without writing separate scripts or submitting each job by hand, which reduces management overhead and improves the overall efficiency of large-scale computational work.
## Why Use Job Arrays?

- **Efficiency:** Submitting a group of jobs as a job array avoids submitting individual jobs and writing a separate script for each one.
- **Resource allocation:** Job arrays let you efficiently manage resource allocation for many tasks running in parallel.
- **Parallelism:** If the jobs are independent of each other, they can run concurrently on different compute nodes, maximizing resource utilization.
- **Simplified job management:** You can monitor, cancel, and control all jobs in a job array at once, reducing the complexity of managing large numbers of jobs.
## Slurm Flags for Job Arrays

To submit a job array, use the `--array` flag of the `sbatch` command. This flag specifies the range or list of indices for the jobs that make up the array:

```bash
--array=<index_list>
```

The index list can be a range of the form `start-end`:

```bash
--array=1-16
```

a range with a step size, `start-end:step`:

```bash
--array=100-200:2
```

or individual indices separated by commas:

```bash
--array=1,3,5,7
```
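The `--array` flag can also be passed directly on the `sbatch` command line, where it overrides any `#SBATCH --array` directive inside the script. A quick sketch (the script name `myscript.sh` is a placeholder):

```bash
# Submit myscript.sh as a 16-task job array (indices 1-16)
sbatch --array=1-16 myscript.sh

# A "%" separator throttles concurrency: run at most 4 tasks at once
sbatch --array=1-16%4 myscript.sh
```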
## Environment Variables

The following environment variables are set for each task in a job array:

| Slurm Environment Variable | Definition |
|----------------------------|------------|
| `SLURM_ARRAY_JOB_ID` | The first job ID of the array. This is the job ID printed when the job is submitted with `sbatch`. |
| `SLURM_ARRAY_TASK_ID` | The index value of the current task in the array. |
| `SLURM_ARRAY_TASK_COUNT` | The number of tasks in the job array. |
| `SLURM_ARRAY_TASK_MAX` | The highest job array index value. |
| `SLURM_ARRAY_TASK_MIN` | The lowest job array index value. |
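A quick way to see what these variables contain is a small array job that simply echoes them. A minimal sketch (the job name is an arbitrary placeholder):

```bash
#!/bin/bash
#SBATCH --job-name=array_env_demo  # placeholder name for this demo
#SBATCH --array=1-3                # three tasks: indices 1, 2, 3

# Each task prints its own view of the array-related variables
echo "Array job ID: ${SLURM_ARRAY_JOB_ID}"
echo "Task index:   ${SLURM_ARRAY_TASK_ID}"
echo "Task count:   ${SLURM_ARRAY_TASK_COUNT}"
echo "Index range:  ${SLURM_ARRAY_TASK_MIN}-${SLURM_ARRAY_TASK_MAX}"
```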
## Monitoring Job Arrays

Once your job array is submitted, you can monitor and manage the jobs in the array individually or as a group.

### Check the Status of a Job Array

```bash
squeue -j <job_array_id>
```

This displays the status of all jobs in the job array.

### Cancel a Job Array

Cancel the entire job array:

```bash
scancel 12345
```

Cancel an individual job in the array (here, task 5 of job array 12345):

```bash
scancel 12345_5
```
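`scancel` also accepts a bracketed index list, so a subset of tasks can be cancelled in one command:

```bash
# Cancel tasks 5 through 9 of job array 12345
scancel 12345_[5-9]
```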
## Examples

### Simple Example

```bash
#!/bin/bash
#SBATCH --job-name=my_job_array   # Job array name
#SBATCH --array=0-9               # Job array indices from 0 to 9
#SBATCH --ntasks=1                # Each job is a single task
#SBATCH --cpus-per-task=1         # Each task uses 1 CPU core
#SBATCH --mem=4GB                 # Each job requests 4 GB of memory
#SBATCH --time=00:30:00           # Time limit: 30 minutes

# Load necessary modules (if needed)
module load anaconda3

# Run the task for each index in the job array
python myscript.py input_file_${SLURM_ARRAY_TASK_ID}.txt
```
This script runs 10 jobs (indices 0 through 9), each executing the Python script `myscript.py` on a different input file (`input_file_0.txt`, `input_file_1.txt`, ..., `input_file_9.txt`).
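By default, each task writes its output to its own file named `slurm-%A_%a.out`, where `%A` expands to the array job ID and `%a` to the task index. To direct output elsewhere, add an `--output` directive; a sketch, assuming a `logs/` directory that you have created before submitting:

```bash
# Write each task's output to logs/<job-name>_<array-ID>_<task-ID>.out
#SBATCH --output=logs/%x_%A_%a.out
```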
### FastQC Example

You have a directory of FASTQ files and want to run FastQC on each one using a Slurm job array, with the array index determining which file each job processes.

#### Using a Text File with File Names

Suppose you have a text file called `file_list.txt`, where each line contains the path to a FASTQ file to be processed by FastQC. You can use the job array index to dynamically select which file to process.
```bash
#!/bin/bash
#SBATCH --job-name=fastqc_array   # Job array name
#SBATCH --array=1-10              # Job array indices (1 to 10, for 10 files)
#SBATCH --ntasks=1                # Each job is a single task
#SBATCH --cpus-per-task=1         # Each task uses 1 CPU core
#SBATCH --mem=2GB                 # Each job requests 2 GB of memory
#SBATCH --time=00:30:00           # Time limit: 30 minutes

# Load the FastQC module
module load biocontainers fastqc/0.12.1

# Define the path to the file list
FILE_LIST="file_list.txt"

# Read the line from the list corresponding to the job array index
FILE=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$FILE_LIST")

# Run FastQC on the selected file
fastqc "$FILE" -o fastqc_results/
```
#### Explanation

The `--array=1-10` flag creates 10 jobs in the array, each processing a different file. The `sed -n "${SLURM_ARRAY_TASK_ID}p" "$FILE_LIST"` command reads the line of `file_list.txt` corresponding to the current job's index (`SLURM_ARRAY_TASK_ID`). For example, job 1 processes the first file in the list, job 2 the second, and so on. The `fastqc "$FILE" -o fastqc_results/` command runs FastQC on the selected file and writes the results to the `fastqc_results/` directory.
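A convenient way to build `file_list.txt`, and to size the `--array` range to match, is to list the files before submitting. A sketch, assuming the FASTQ files live in `/path/to/fastq_files`:

```bash
# Build the file list, one path per line
ls /path/to/fastq_files/*.fastq > file_list.txt

# Count the lines so the --array range can match the number of files
wc -l < file_list.txt
```

If the count is, say, 25, submit the script with `--array=1-25`.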
#### Using ls to Dynamically List Files in a Directory

Alternatively, if you don't have a predefined list of files but simply want to process all FASTQ files in a directory, you can use `ls` to list the files and the job array index to pick one.
```bash
#!/bin/bash
#SBATCH --job-name=fastqc_array   # Job array name
#SBATCH --array=0-9               # Job array indices (0 to 9, for 10 files)
#SBATCH --ntasks=1                # Each job is a single task
#SBATCH --cpus-per-task=1         # Each task uses 1 CPU core
#SBATCH --mem=2GB                 # Each job requests 2 GB of memory
#SBATCH --time=00:30:00           # Time limit: 30 minutes

# Load the FastQC module
module load biocontainers fastqc/0.12.1

# Define the directory where the FASTQ files are located
FASTQ_DIR="/path/to/fastq_files"

# Get the list of FASTQ files in the directory
FILES=($(ls $FASTQ_DIR/*.fastq))

# Select the file corresponding to the job array index
FILE=${FILES[$SLURM_ARRAY_TASK_ID]}

# Run FastQC on the selected file
fastqc "$FILE" -o fastqc_results/
```
#### Explanation

The `FILES=($(ls $FASTQ_DIR/*.fastq))` command lists all `.fastq` files in the specified directory (`/path/to/fastq_files`) and stores them in a Bash array. The file corresponding to the current job's index (`SLURM_ARRAY_TASK_ID`) is then selected from the array. The `fastqc "$FILE" -o fastqc_results/` command runs FastQC on the selected file and stores the results in the `fastqc_results/` directory.
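Note that parsing `ls` output breaks if a filename contains spaces. A shell glob expanded directly into the array avoids this and is otherwise equivalent; a sketch of the safer variant:

```bash
# Expand the glob straight into a Bash array; no ls parsing involved
FILES=("$FASTQ_DIR"/*.fastq)

# Select the file for this task, quoting to preserve any spaces
FILE="${FILES[$SLURM_ARRAY_TASK_ID]}"
fastqc "$FILE" -o fastqc_results/
```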
In the first example, the indices start at 1 to match the line numbers in the file list. In the second example, the indices start at 0 because Bash array indices start at 0. Both examples run 10 jobs in total.