Job Queuing and Control with PBS
The Palmetto cluster uses the Portable Batch System (PBS) to manage jobs.
Lifecycle of a PBS Job
The life of a job begins when you submit it to the scheduler. If accepted, it will enter the Queued state.
Thereafter, the job may move to other states, as defined below:
- Queued - the job has been accepted by the scheduler and is eligible for execution; waiting for resources.
- Held - the job is not eligible for execution because it was held by user request, administrative action, or job dependency.
- Running - the job is currently executing on the compute node(s).
- Finished - the job finished executing or was canceled/deleted.
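In `qstat` output, these states appear as single-letter codes in the `S` column:

```
Q = Queued    H = Held    R = Running
F = Finished  (finished jobs appear only in qstat -x / -xf output)
```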
Useful PBS Commands
Here are some basic PBS commands for submitting, querying and deleting jobs:
| Command | Action |
|---|---|
| `qsub -I` | Submit an interactive job (reserves 1 core, 1gb RAM, 30 minutes walltime) |
| `qsub xyz.pbs` | Submit the job script `xyz.pbs` |
| `qstat <job id>` | Check the status of the job with the given job ID |
| `qstat -u <username>` | Check the status of all jobs submitted by the given username |
| `qstat -xf <job id>` | Check detailed information for the job with the given job ID |
| `qstat -Qf <queuename>` | Check the status of the queue `<queuename>` |
| `qsub -q <queuename> xyz.pbs` | Submit the job script `xyz.pbs` to the queue `<queuename>` |
| `qdel <job id>` | Delete the job (queued or running) with the given job ID |
| `qpeek <job id>` | "Peek" at the standard output of a running job |
| `qdel -Wforce <job id>` | Force-delete a job that does not respond to plain `qdel` |
For more details and more advanced commands for submitting and controlling jobs, please refer to the PBS Professional User's Guide.
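For reference, here is a minimal sketch of a job script for `qsub xyz.pbs`; the job name, resource amounts, and `my_program` are illustrative placeholders:

```
#!/bin/bash
#PBS -N example-job
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=01:00:00
#PBS -j oe

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# Replace with your own program
./my_program
```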
Querying PBS Job Information
PBS provides a variety of useful commands to query the scheduler for information about jobs and make changes.
Check Status of All Jobs in PBS
To list the job IDs and status of all your jobs, you can use `qstat`:
```
$ qstat

pbs02:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
7600567.pbs02   username c1_singl pi-mpi-1     1382   4   8    4gb 00:05 R 00:00
7600569.pbs02   username c1_singl pi-mpi-2    20258   4   8    4gb 00:05 R 00:00
7600570.pbs02   username c1_singl pi-mpi-3     2457   4   8    4gb 00:05 R 00:00
```
Check Status of a Particular Job in PBS
The `qstat` command can be used to query the status of a particular job:
```
$ qstat 7600424.pbs02
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
7600424.pbs02     pi-mpi           username          00:00:00 R c1_single
```
Detailed PBS Job Information
Once a job has finished running, `qstat -xf` can be used to obtain detailed job information.
Job history is only retained for 24 hours after the job ends.
Below is an example of querying detailed information for a finished job:
```
$ qstat -xf 7600424.pbs02
Job Id: 7600424.pbs02
Job_Name = pi-mpi
Job_Owner = username@login001.palmetto.clemson.edu
resources_used.cpupercent = 103
resources_used.cput = 00:00:04
resources_used.mem = 45460kb
resources_used.ncpus = 8
resources_used.vmem = 785708kb
resources_used.walltime = 00:02:08
job_state = F
queue = c1_single
server = pbs02
Checkpoint = u
ctime = Tue Dec 13 14:09:32 2016
Error_Path = login001.palmetto.clemson.edu:/home/username/MPI/pi-mpi.e7600424
exec_host = node0088/1*2+node0094/1*2+node0094/2*2+node0085/0*2
exec_vnode = (node0088:ncpus=2:mem=1048576kb:ngpus=0:nphis=0)+(node0094:ncp
us=2:mem=1048576kb:ngpus=0:nphis=0)+(node0094:ncpus=2:mem=1048576kb:ngp
us=0:nphis=0)+(node0085:ncpus=2:mem=1048576kb:ngpus=0:nphis=0)
Hold_Types = n
Join_Path = oe
Keep_Files = n
Mail_Points = a
Mail_Users = username@clemson.edu
mtime = Tue Dec 13 14:11:42 2016
Output_Path = login001.palmetto.clemson.edu:/home/username/MPI/pi-mpi.o760042
4
Priority = 0
qtime = Tue Dec 13 14:09:32 2016
Rerunable = True
Resource_List.mem = 4gb
Resource_List.mpiprocs = 8
Resource_List.ncpus = 8
Resource_List.ngpus = 0
Resource_List.nodect = 4
Resource_List.nphis = 0
Resource_List.place = free:shared
Resource_List.qcat = c1_workq_qcat
Resource_List.select = 4:ncpus=2:mem=1gb:interconnect=1g:mpiprocs=2
Resource_List.walltime = 00:05:00
stime = Tue Dec 13 14:09:33 2016
session_id = 2708
jobdir = /home/username
substate = 92
Variable_List = PBS_O_SYSTEM=Linux,PBS_O_SHELL=/bin/bash,
PBS_O_HOME=/home/username,PBS_O_LOGNAME=username,
PBS_O_WORKDIR=/home/username/MPI,PBS_O_LANG=en_US.UTF-8,
PBS_O_PATH=/software/examples/:/home/username/local/bin:/usr/lib64/qt-3
.3/bin:/opt/pbs/default/bin:/opt/gold/bin:/usr/local/bin:/bin:/usr/bin:
/usr/local/sbin:/usr/sbin:/sbin:/opt/mx/bin:/home/username/bin,
PBS_O_MAIL=/var/spool/mail/username,PBS_O_QUEUE=c1_workq,
PBS_O_HOST=login001.palmetto.clemson.edu
comment = Job run at Tue Dec 13 at 14:09 on (node0088:ncpus=2:mem=1048576kb
:ngpus=0:nphis=0)+(node0094:ncpus=2:mem=1048576kb:ngpus=0:nphis=0)+(nod
e0094:ncpus=2:mem=1048576kb:ngpus=0:nphis=0)+(node0085:ncpus=2:mem=1048
576kb:ngpus=0:nphis=0) and finished
etime = Tue Dec 13 14:09:32 2016
run_count = 1
Stageout_status = 1
Exit_status = 0
Submit_arguments = job.sh
history_timestamp = 1481656302
project = _pbs_project_default
```
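To pull specific attributes out of this output, you can filter it with standard tools, for example:

```
# Show only the exit status and resource usage of the finished job
$ qstat -xf 7600424.pbs02 | grep -E 'Exit_status|resources_used'
```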
Similarly, to get detailed information about a running job, you can use `qstat -f`.
Cancel a PBS Job
To delete a job (whether queued, running, or in an error state), you can use the `qdel` command:
```
qdel 7600424.pbs02
```
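To clean up many jobs at once, `qdel` can be combined with the standard PBS `qselect` command, which prints the IDs of jobs matching the given criteria; a sketch:

```
# Delete all of your jobs (use with care); qselect -u prints the
# job IDs owned by the given user
qdel $(qselect -u username)
```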
PBS Job Limits on Palmetto
Walltime in PBS
Jobs running in phases 1-6 of the cluster (nodes with interconnect `1g`) can run for a maximum walltime of 336 hours (14 days).
Jobs running in phases 7 and higher of the cluster can run for a maximum walltime of 72 hours (3 days).
Jobs running on node-owner queues can run for a maximum walltime of 336 hours (14 days).
These values may be updated; the current limits are shown when you run the `checkqueuecfg` command.
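For example, to request the 14-day maximum, a job has to target the older hardware; the core and memory amounts here are illustrative:

```
# Request 4 cores and 8gb on a 1g-interconnect (phase 1-6) node for 14 days
qsub -l select=1:ncpus=4:mem=8gb:interconnect=1g -l walltime=336:00:00 xyz.pbs
```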
Number of Jobs in PBS
When you submit a job, it is forwarded to a specific execution queue based on job criteria (how many cores, RAM, etc.). There are four classes of execution queues:
- MX queues (`c1_` queues): jobs submitted to run on the older hardware (phases 1-6) will be forwarded to these queues.
- IB queues (`c2_` queues): jobs submitted to run on the newer hardware (phases 7 and up) will be forwarded to these queues.
- GPU queues (`gpu_` queues): jobs that request GPUs will be forwarded to these queues.
- `bigmem` queue: jobs submitted to the large-memory machines (phase 0).
Each execution queue has its own limits for how many jobs can be running at one
time, and how many jobs can be waiting in that execution queue. The maximum
number of running jobs per user in execution queues may vary throughout the day
depending on cluster load. Users can see what the current limits are using the `checkqueuecfg` command:
```
$ checkqueuecfg
MX QUEUES min_cores_per_job max_cores_per_job max_mem_per_queue max_jobs_per_queue max_walltime
c1_solo 1 1 4000gb 2000 336:00:00
c1_single 2 24 90000gb 750 336:00:00
c1_tiny 25 128 25600gb 25 336:00:00
c1_small 129 512 24576gb 6 336:00:00
c1_medium 513 2048 81920gb 5 336:00:00
c1_large 2049 4096 32768gb 1 336:00:00
IB QUEUES min_cores_per_job max_cores_per_job max_mem_per_queue max_jobs_per_queue max_walltime
c2_single 1 24 600gb 5 72:00:00
c2_tiny 25 128 4096gb 2 72:00:00
c2_small 129 512 6144gb 1 72:00:00
c2_medium 513 2048 16384gb 1 72:00:00
c2_large 2049 4096 0gb 0 72:00:00
GPU QUEUES min_gpus_per_job max_gpus_per_job min_cores_per_job max_cores_per_job max_mem_per_queue max_jobs_per_queue max_walltime
gpu_small 1 4 1 96 3840gb 20 72:00:00
gpu_medium 5 16 1 256 5120gb 5 72:00:00
gpu_large 17 128 1 1024 20480gb 5 72:00:00
SMP QUEUE min_cores max_cores max_jobs max_walltime
bigmem 1 64 3 72:00:00
'max_mem' is the maximum amount of memory all your jobs in this queue can
consume at any one time. For example, if the max_mem for the solo queue
is 4000gb, and your solo jobs each need 10gb, then you can run a
maximum number of 4000/10 = 400 jobs in the solo queue, even though the
current max_jobs setting for the solo queue may be set higher than 400.
```
The `qstat` command tells you which of the execution queues your job is
forwarded to. For example, here is an interactive job requesting 8 CPU cores, a
K40 GPU, and 32gb RAM:
```
$ qsub -I -l select=1:ncpus=8:ngpus=1:gpu_model=k40:mem=32gb,walltime=2:00:00
qsub (Warning): Interactive jobs will be treated as not rerunnable
qsub: waiting for job 9567792.pbs02 to start
```
We see from `qstat` that the job request is forwarded to the `c2_single` queue:
```
[username@login001 ~]$ qstat 9567792.pbs02
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
9567792.pbs02     STDIN            username                 0 Q c2_single
```
From the output of `checkqueuecfg` above, we see that each user can have a
maximum of 5 running jobs in this queue.
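To see the current settings of a particular execution queue directly, you can also use `qstat -Qf` from the commands table above:

```
# Inspect the configuration and limits of the c2_single queue
$ qstat -Qf c2_single
```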
Job Preemption in PBS
Node owners are granted priority access on the hardware they own. However, users are welcome to use any compute nodes on the cluster that are available for their jobs.
If a node owner submits a job to their priority queue while your job is executing on their node, your job will be preempted.
The preemption process works like so:
- Your job is sent a graceful termination signal by the operating system.
- The scheduler will grant a grace period of 2 minutes for your job to perform any final operations and exit.
- If your job is still running after the grace period expires, the operating system will force your processes to terminate.
- Your job is returned to the Queued state. Since your job was preempted, it will be sent to the front of the queue.
- The owner's job begins executing on their node.
- The scheduler will run your job again when resources become available, either on the same node or another node that meets your specifications.
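If your program can save its state quickly, one way to use the grace period is to catch the termination signal in your job script; a minimal sketch, assuming the signal is SIGTERM (the usual PBS default) and that `save_state` stands in for your own checkpoint routine:

```
#!/bin/bash
# save_state is a hypothetical placeholder for your own checkpoint
# routine (e.g. copying partial results to safe storage)
save_state() {
    cp -r scratch_results/ $PBS_O_WORKDIR/   # illustrative only
}

# Save state when the graceful termination signal (SIGTERM) arrives,
# then exit before the 2-minute grace period runs out
trap 'save_state; exit 0' TERM

./my_program &   # run the real work in the background...
wait             # ...so the shell can run the trap immediately
```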
If you do not need the latest hardware for your program to work, consider using older hardware that does not have an owner. This will allow you to avoid preemption entirely.
If you plan to run a long job on a node where you would risk preemption, you may want to gracefully handle preemption by:
- saving work periodically while running
- designing your program to support starting from previous partial work
- checking for previously saved work to load from when your job begins
This will allow your job to resume close to where it left off when it starts running again after a preemption event.
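A minimal sketch of this save-and-resume pattern in a job script; `my_program`, its flags, and `checkpoint.dat` are hypothetical placeholders for your application's own checkpoint/restart mechanism:

```
#!/bin/bash
#PBS -N resumable-job
#PBS -l select=1:ncpus=4:mem=8gb
#PBS -l walltime=72:00:00

cd $PBS_O_WORKDIR

# Resume from saved work if a checkpoint exists; otherwise start
# fresh and save work periodically while running
if [ -f checkpoint.dat ]; then
    ./my_program --resume checkpoint.dat
else
    ./my_program --save checkpoint.dat
fi
```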
Example PBS Scripts
A list of example PBS scripts for submitting jobs to the Palmetto cluster can be found here.