Job Control with Slurm
The new Palmetto 2 cluster uses the Slurm Workload Manager to schedule jobs.
Lifecycle of a Slurm Job
The life of a job begins when you submit the job to the scheduler. If accepted, it will enter the Pending state.
Thereafter, the job may move to other states, as defined below:
- Pending - the job has been accepted by the scheduler and is eligible for execution; waiting for resources.
- Held - the job is not eligible for execution because it was held by user request, administrative action, or job dependency.
- Running - the job is currently executing on the compute node(s).
- Completed - the job finished executing successfully.
- Failed - the job finished executing, but returned an error.
- Timeout - the job was executing, but was terminated because it exceeded the amount of wall time requested.
- Out of Memory - the job was executing, but was terminated because it exceeded the amount of memory requested.
- Canceled - the job was canceled by the requestor or a system administrator.
The diagram below provides a visual representation of these states:
For more details, you can review the Job State Codes in the Slurm Documentation.
Slurm Job Control Commands
This section provides essential commands and approaches to efficiently monitor and manage jobs within the Slurm workload manager on Palmetto.
Listing Slurm jobs with squeue
You can list the status of your current jobs using the squeue command:
squeue --me
Don't forget to add the --me flag to show only your jobs; Slurm shows all users'
jobs by default.
This returns basic information for your queued and running jobs. You can control
the output fields and format using the --format flag. For instance, to see all
available fields for each job:
squeue --me --format "%all"
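The %all format is verbose; in practice you may want to pick out just a few fields. As an illustrative sketch (the field specifiers %i, %j, %T, %M, %D, and %R are documented in the squeue man page), the following shows the job ID, name, state, elapsed time, node count, and node list:
squeue --me --format "%.10i %.20j %.8T %.10M %.6D %R"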
For more information, see the official
Slurm squeue documentation. Find
additional examples at the bottom of the page. You can also access this
documentation by running man squeue from a terminal.
Checking on Slurm jobs with scontrol
You can list detailed information about a specific job using the scontrol
command:
scontrol show job <job-id>
where you replace <job-id> with the ID of a running or queued job. Retrieve
this ID using the squeue command as described above. The scontrol show job
command gives detailed information about your job, including the resources
allocated to it.
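Because this output is lengthy, it can help to filter it down to the fields you need. The line below is a small sketch using grep; the JobState, RunTime, and NodeList field names appear in typical scontrol show job output:
scontrol show job <job-id> | grep -E "JobState|RunTime|NodeList"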
In this example, we used scontrol to show job information. However, this
powerful command also interacts with other aspects of the Slurm configuration
including nodes and partitions. For instance, you can view detailed node
information with scontrol show node <hostname> where you replace <hostname>
with the host name for a specific node.
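Along the same lines, you can inspect a partition's limits and member nodes. This is a brief sketch, where <partition-name> is one of the partitions reported by sinfo (covered later on this page):
scontrol show partition <partition-name>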
For more information, see the official
Slurm scontrol documentation. You
can also access this documentation by running man scontrol from a terminal.
See the output of a Slurm job during execution
Slurm logs your job outputs to a text file. By default, this log file is located
in the working directory used when creating the job and has the naming format
slurm-<job-id>.out, where <job-id> matches the job ID returned by squeue.
The name and location of the log file can be changed when
submitting a job. You can view the
output of your running job using standard Linux commands. For instance, you can
display live updates using the tail command:
tail -f path/to/slurm-<job-id>.out
The -f (follow) flag gives continuous updates as the file changes.
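If you prefer to choose the log location yourself, the --output (and optionally --error) sbatch directives control where Slurm writes the file. The lines below are a sketch only: the logs/ directory is a hypothetical path that must already exist, and %x and %j are Slurm filename patterns for the job name and job ID:
#SBATCH --output=logs/%x-%j.out
#SBATCH --error=logs/%x-%j.err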
Canceling a Slurm job with scancel
You can terminate a running Slurm job and free the associated resources using
the scancel command:
scancel <job-id>
where <job-id> matches the job ID returned by squeue.
To cancel all jobs submitted by your user, you can use the following command:
scancel --me
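scancel can also target a subset of your jobs. As a sketch using flags described in the scancel man page, you could cancel only your pending jobs, or only jobs with a particular name (replace <job-name> accordingly):
scancel --me --state=PENDING
scancel --me --name=<job-name>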
For more information, see the official
Slurm scancel documentation. You can
also access this documentation by running man scancel from a terminal.
Connecting to a running Slurm job
Sometimes you may wish to connect to a running job for performance monitoring or debugging purposes. You can get a shell within your job using the steps below:
- First, you need to determine which node(s) your job is running on. You can use
  squeue --me to see a list of your running jobs. Pay attention to the
  JOBID and NODELIST fields. If your job requested multiple nodes, you will see a
  range and/or list of ranges. For example, you could see output similar to the
  following:

  $ squeue --me
  JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
  2346 work1 submit.s bfgodfr R 3:39:09 4 node[0483-0486]
  2344 work1 submit.s bfgodfr R 3:45:19 1 node0432
  2347 work1 submit.s bfgodfr R 10:57 4 node[0432-0435]
  2349 work1 submit.s bfgodfr R 0:07 4 node[0432,0435,0483,0486]

- Make a note of the desired Job ID and node hostname from the list. From the
  example above, we could select job ID 2349 and hostname node0483.

- To get a shell session within your selected job and node, you can use the
  srun command below:

  srun --jobid <jobid> --overlap --nodelist <hostname> --pty bash --login

  Replace <jobid> and <hostname> accordingly. The correct command for the example
  above would be:

  srun --jobid 2349 --overlap --nodelist node0483 --pty bash --login

- You will now have a shell session on the node running your job and can try
  some of the suggestions below.
Viewing available Slurm partitions to run your job
The sinfo -s command allows you to see all
partitions you have access to. For example:
sinfo -s
Example output could be:
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
work1* up 3-00:00:00 3/16/0/19 node[0401,0403-0419,0421]
rcde up infinite 3/18/0/21 node[0400-0401,0403-0419,0421,1036]
rcdeprio up infinite 1/3/0/4 node[0400-0401,0404,1036]
training up 2:00:00 0/1/0/1 node0400
This output would mean you have access to the work1 (the * indicates the
default), rcde, rcdeprio, and training partitions. To run a job on a
particular partition, use -p <partition-name> with srun, sbatch, or salloc.
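For example, to submit a hypothetical batch script named myjob.sh to the rcde partition shown in the output above:
sbatch -p rcde myjob.sh
Equivalently, you can add #SBATCH --partition=rcde to the script itself.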