Migration Guide (PBS to Slurm)
Welcome! We are glad that you are interested in migrating your workflow to the new Palmetto 2 cluster using the Slurm job scheduler!
While there are many things that are different on the new cluster, there is a wealth of similarity between the two systems that will help existing users. This page has everything you need to get started.
New Account System
Accounts on the new Palmetto 2 cluster are separate from accounts on the Palmetto 1 cluster.
Users will need to obtain an account on Palmetto 2 before proceeding with any of the other steps in this migration guide.
To learn more, see the account setup page.
New Login Node Address
The login node for Palmetto 2 has a different hostname, so you will need to use the new address when connecting via SSH.
Users should connect to the Palmetto 2 login node using this address: `slogin.palmetto.clemson.edu`.
The `s` at the front of the new address stands for Slurm! This should help you remember the new address.
You can review the updated SSH connection instructions for more details.
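For example, connecting from a terminal looks like the sketch below; `your-username` is only a placeholder for your own Clemson username.

```bash
# Connect to the Palmetto 2 login node over SSH.
# "your-username" is a placeholder; use your own Clemson username.
ssh your-username@slogin.palmetto.clemson.edu
```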
New OnDemand Instance
The new Palmetto 2 cluster has a separate instance of Open OnDemand. You will need to update your browser bookmarks.
The new address for Open OnDemand is: https://ondemand.rcd.clemson.edu
New Job Scheduler
The RCD team is excited to welcome users to the new Slurm scheduling experience on Palmetto 2.
On the former Palmetto 1 cluster, the Portable Batch System (PBS) was used to schedule and manage jobs on the cluster.
What makes Slurm different from PBS?
Under the hood, there are many differences between Slurm and PBS. However, most of these differences are transparent to users.
The primary difference that users will notice is a different set of commands. For example, instead of using `qsub`, users must use `srun`, `sbatch`, or `salloc`.
Users are encouraged to review this guide for a direct comparison between the two systems and advice on how to convert existing workflows.
Why move to Slurm?
The way PBS handles scheduling often results in jobs getting stuck in the incorrect queue or never running at all. This was frustrating for users who were waiting for their jobs to run and a burden for our support staff to monitor and fix.
The switch to Slurm allows users to have more control over when, where, and how their jobs get scheduled. With its fair share algorithm, jobs will no longer get indefinitely stuck in queues due to request size or rare/limited resources.
Switching to Slurm also allows RCD to add more features to job scheduling, resulting in an overall better experience for users on Palmetto 2. Since all scheduling is handled automatically by Slurm, rather than pseudo-manually as with PBS, RCD staff also gain time to better help users and develop helpful tools for Palmetto.
Slurm Cheat Sheet
SchedMD provides a convenient Cheat Sheet that has all of the basic Slurm commands and their usage.
View the Cheat Sheet
This cheat sheet is super useful - you might want to print out a copy to use at your desk!
PBS to Slurm Command Map
The table below shows common PBS commands and their Slurm counterparts.
| Description | PBS Command | Slurm Command |
| --- | --- | --- |
| Submit a batch job | `qsub` (without `-I`) | `sbatch` – see instructions for using `sbatch` |
| Submit an interactive job | `qsub` (with `-I`) | `salloc` – see instructions for using `salloc` |
| Job statistics or information | `qstat` | multiple options – see job monitoring options in Slurm |
| View job logs while running | `qpeek` | not needed – see instructions for viewing job output |
| Cancel a job | `qdel` | `scancel` – see instructions for using `scancel` |
| See available compute resources | `whatsfree` or `freeres` | `whatsfree` or `sinfo` – see checking compute availability in Slurm |
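As a quick illustration, a typical batch workflow using the Slurm commands from the table might look like the sketch below; the script name `job.sh` and the job ID `12345` are placeholders, and `squeue` is just one of the monitoring options mentioned above.

```bash
# Submit a batch job; sbatch prints the assigned job ID on success.
sbatch job.sh

# Check the status of your own queued and running jobs.
squeue -u $USER

# See what compute resources are available.
sinfo

# Cancel a job by its ID (12345 is a placeholder).
scancel 12345
```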
PBS to Slurm Environment Variables Map
The table below shows common PBS environment variables and their Slurm counterparts.
| PBS Variable | Slurm Variable | Description |
| --- | --- | --- |
| `$PBS_JOBID` | `$SLURM_JOB_ID` | Job ID |
| `$PBS_JOBNAME` | `$SLURM_JOB_NAME` | Job name |
| `$PBS_O_WORKDIR` | `$SLURM_SUBMIT_DIR` | Submitting directory |
| `cat $PBS_NODEFILE` | `$SLURM_JOB_NODELIST` or `srun hostname` | Nodes allocated to the job |
| N/A | `$SLURM_NTASKS` | Total number of tasks or MPI processes (NOTE: not the total number of cores if `--cpus-per-task` is not 1) |
| N/A | `$SLURM_CPUS_PER_TASK` | Number of CPU cores for each task or MPI process |
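As a sketch, the Slurm variables from the table can be inspected from inside a batch script, roughly like this (the job options shown are arbitrary examples):

```bash
#!/bin/bash
#SBATCH --job-name env-demo
#SBATCH --nodes 1
#SBATCH --tasks-per-node 2
#SBATCH --cpus-per-task 1
#SBATCH --time 00:05:00

echo "Job ID:            $SLURM_JOB_ID"        # replaces $PBS_JOBID
echo "Job name:          $SLURM_JOB_NAME"      # replaces $PBS_JOBNAME
echo "Submit directory:  $SLURM_SUBMIT_DIR"    # replaces $PBS_O_WORKDIR
echo "Allocated nodes:   $SLURM_JOB_NODELIST"  # replaces cat $PBS_NODEFILE
echo "Total tasks:       $SLURM_NTASKS"
echo "CPUs per task:     $SLURM_CPUS_PER_TASK"
```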
Converting PBS Batch Scripts for Slurm
- In PBS, users would use `#PBS` directives in their batch script files to inform the scheduler about what options they wanted to execute their job with. In Slurm, users must use `#SBATCH` directives instead.
- In PBS, users use `qsub job-script` to submit the job to the scheduler; in Slurm, users would use `sbatch job-script` instead.
- NOTE: Modules loaded before the job is submitted will be carried into the batch job environment. Therefore, it is highly recommended to put `module purge` at the beginning of the job script.
For example, in PBS, users might use a job script like this:
```bash
#!/bin/bash
#PBS -N my-job-name
#PBS -l select=2:ncpus=8:mpiprocs=2:mem=2gb:ngpus=1:gpu_model=a100:interconnect=hdr,walltime=02:00:00

export OMP_NUM_THREADS=8
python3 run-my-science-workflow.py
```
The same script, written for Slurm, would look like this (we recommend using the long option names, for example `--nodes` instead of `-N`):
```bash
#!/bin/bash
#SBATCH --job-name my-job-name
#SBATCH --nodes 2
#SBATCH --tasks-per-node 2
#SBATCH --cpus-per-task 8
#SBATCH --gpus-per-node a100:1
#SBATCH --mem 2gb
#SBATCH --time 02:00:00
#SBATCH --constraint interconnect_hdr

export OMP_NUM_THREADS=8
python3 run-my-science-workflow.py
```
Below are some brief explanations of the parameters used here:

- `--nodes` selects the number of nodes for the job. This is equivalent to `select` used along with `place=scatter` in PBS, which means selecting distinct physical nodes rather than chunks.
- `--tasks-per-node` is the number of tasks on each node, which is equivalent to `mpiprocs` in PBS.
- `--cpus-per-task` controls the number of cores for each task. The default is 1 unless you are using multi-threading, in which case `--cpus-per-task` is usually set to the value of `OMP_NUM_THREADS`.
- The total number of cores is not specified explicitly; on each node it is the value of `--tasks-per-node` multiplied by `--cpus-per-task`.
- `--mem` is the memory per node.
- `--gpus-per-node` specifies the GPU model and the number of GPUs per node, following the format `--gpus-per-node <gpu_model>:<gpu_number>`.
- `--time` is the walltime of the job; the maximum is 72 hours for c2 nodes.
- Interconnect selection on Slurm has not been fully implemented yet, so the `--constraint interconnect_hdr` line in the script above may not be fully honored for now.
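Assuming the Slurm script above is saved as `my-job.sh` (a placeholder name), it would be submitted with `sbatch`, which prints the assigned job ID:

```bash
sbatch my-job.sh
# Submitted batch job 1234567   <- example output; the actual job ID will differ
```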
Converting PBS Interactive Job Workflows for Slurm
- In PBS, users use `qsub` to request an interactive job; in Slurm, users will use `salloc` instead. (Notice that the command for an interactive job, `salloc`, is different from the one for a batch job, `sbatch`.)
- NOTE: Modules loaded before the job is submitted will be carried into the interactive job environment. Therefore, it is highly recommended to run `module purge` once the interactive job is allocated (see the sketch after the examples below).
For example, users can use the following command to request a PBS interactive job:
```bash
qsub -I -l select=2:ncpus=4:mem=2gb:ngpus=1:gpu_model=a100:interconnect=hdr,walltime=02:00:00
```
The corresponding command in Slurm would look like:
```bash
salloc --nodes 2 --tasks-per-node 4 --cpus-per-task 1 --mem 2gb --time 02:00:00 --gpus-per-node a100:1 --constraint interconnect_hdr
```
The explanations of the parameters can be found in the above section.
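Once `salloc` grants the allocation, you typically land in a shell from which you can run `module purge` (as recommended above) and then launch work on the allocated nodes. A minimal sketch, where the module name is only a hypothetical example:

```bash
# Inside the interactive allocation granted by salloc:
module purge            # clear modules carried over from the submission environment
module load anaconda3   # hypothetical module name; load whatever your workflow needs
srun hostname           # runs on the allocated nodes
exit                    # release the allocation when finished
```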
PBS Select Quantity vs Slurm Task Quantity
Although the syntax and usage of Slurm can look similar to PBS, there are some main differences. The most important one is that `--nodes` is not required. Its value will be determined by the tasks requested:

- If `--tasks-per-node` is specified (without `--nodes`), all of the CPU cores will be allocated on the same node, which means the number of nodes is 1 in this case. NOTE: the number of tasks/CPU cores must not exceed the number of CPU cores on a single node.
- If you need more tasks/CPU cores than the number of CPU cores on a single node, but you do not care which nodes these CPU cores are allocated on, you can specify `--ntasks` (see the sketch after this list). In this case, your job may wait less time in the queue since it can land on different nodes. A potential drawback is that the performance of CPU cores on different nodes might differ, given the heterogeneous nature of the Palmetto cluster.
- As mentioned above, `--mem` is the memory per node. Besides `--mem`, there are some other options, such as memory per CPU (`--mem-per-cpu`) and memory per GPU (`--mem-per-gpu`).
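As a sketch of the two request styles described above (the resource numbers are arbitrary examples, not recommendations):

```bash
#!/bin/bash
# Request a total of 32 tasks and let Slurm decide which nodes they land on.
# (Alternatively, "#SBATCH --tasks-per-node 32" with no --nodes line would place
#  all 32 tasks on a single node, provided one node has enough cores.)
#SBATCH --job-name ntasks-demo
#SBATCH --ntasks 32
#SBATCH --cpus-per-task 1
#SBATCH --mem-per-cpu 1gb
#SBATCH --time 00:30:00

# Print which node each task landed on.
srun hostname
```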
PBS Queues vs Slurm Partitions
In PBS, jobs were submitted to queues. In Slurm, the analogous concept is partitions.
To learn more, see the partition flag instructions.
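For reference, a minimal sketch of selecting a partition in a batch script is shown below; the partition name `work1` is only a hypothetical placeholder, so check the partition flag instructions for the actual partition names on Palmetto 2.

```bash
#!/bin/bash
#SBATCH --job-name partition-demo
#SBATCH --nodes 1
#SBATCH --tasks-per-node 1
#SBATCH --time 00:10:00
#SBATCH --partition work1   # hypothetical partition name; see the partition flag instructions

hostname
```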