Job Submission

Now that you understand how jobs are controlled and the basic types of jobs, you are ready to submit a job.

When you submit a job, the scheduler places it in the queue until it can find available resources to assign to it.

Examples

The easiest way to learn how to submit a job is to look at example submissions. Below are examples of both job types.

Start an interactive job

An interactive job can be started using the qsub command. Here is an example of an interactive job:

[username@login001 ~]$ qsub -I -l select=1:ncpus=2:mem=4gb:interconnect=1g,walltime=4:00:00
qsub (Warning): Interactive jobs will be treated as not rerunnable
qsub: waiting for job 8730.pbs02 to start
qsub: job 8730.pbs02 ready

[username@node0021 ~]$ module add anaconda3/2019.10-gcc/8.3.1
[username@node0021 ~]$ python runsim.py
.
.
.
[username@node0021 ~]$ exit
[username@login001 ~]$

Above, we request an interactive job using 1 "chunk" of hardware (select=1), 2 CPU cores per "chunk", and 4 GB of RAM per "chunk", for a wall time of 4 hours. Once these resources are available, we receive a Job ID (8730.pbs02) and a command-line session running on node0021.


Submit a batch job

Interactive jobs require you to stay logged in while your tasks are running. In contrast, you may log out after submitting a batch job and examine the results at a later time. This is useful when you need to submit several computational tasks to the cluster, or when your computational tasks are expected to run for a long time.

To submit a batch job, you must prepare a batch script (you can do this using an editor like vim or nano). Below is an example of a batch script (call it example.pbs). The batch job below doesn't do anything useful; it just sleeps ("does nothing") for 60 seconds:

#PBS -N example
#PBS -l select=1:ncpus=1:mem=2gb:interconnect=1g,walltime=00:10:00

module add gcc/9.3.0-gcc/8.3.1

cd /home/username
echo Hello World from `hostname`
sleep 60

After saving the above file, you can submit the batch job using qsub:

[username@login001 ~]$ qsub example.pbs
8738.pbs02
Tip: make a note of the job number.

Since batch jobs run in the background, you won't see the script running in your terminal. However, you can use the job control commands to get information about your job's status or take action.
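
For example, using the job number from above, you can check the job's status with qstat, or cancel the job with qdel if you no longer need it:

[username@login001 ~]$ qstat 8738.pbs02
[username@login001 ~]$ qdel 8738.pbs02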

Once the job has completed, you will see the files example.o8738 (containing the output, if any) and example.e8738 (containing the errors, if any) from your job.

[username@login001 ~]$ cat example.o8738
Hello World from node0230.palmetto.clemson.edu
Next Steps

Now that you have seen an example, review the qsub command section to learn what other options are available.

The qsub Command

The qsub command is used to submit jobs to the scheduler. The defaults are not very useful, so you will want to pass several options that describe your job type and what resources you want.

Options for qsub

The following switches can be used either with qsub on the command line, or with a #PBS directive in a batch script.

| Parameter | Purpose | Example |
| --- | --- | --- |
| -N | Job name (7 characters) | -N maxrun1 |
| -l | Job limits (lowercase L): hardware and other requirements for the job | -l select=1:ncpus=8:mem=1gb |
| -q | Queue to direct this job to (work1 is the default; supabad is an example of a specific research group's queue) | -q supabad |
| -o | Path to the stdout file for this job (environment variables are not accepted here) | -o stdout.txt |
| -e | Path to the stderr file for this job (environment variables are not accepted here) | -e stderr.txt |
| -m | Mail events: the PBS server emails you on abort (a), begin (b), and/or end (e); use n for no mail | -m abe |
| -M | List of users to whom mail about the job is sent, of the form user[@host],user[@host],... If -M is not used and -m is specified, PBS will send email to userid@clemson.edu | -M user1@domain1.com,user2@domain2.com |
| -j oe | Join the output and error streams and write them to a single file | -j oe |

For example, in a batch script:

#PBS -N hydrogen
#PBS -l select=1:ncpus=24:mem=200gb,walltime=4:00:00
#PBS -q bigmem
#PBS -m abe
#PBS -M userid@domain.com
#PBS -j oe

And in an interactive job request on the command line:

qsub -I -N hydrogen -q bigmem -j oe -l select=1:ncpus=24:mem=200gb,walltime=4:00:00
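
Options given on the qsub command line take precedence over the corresponding #PBS directives in a script, so you can override a script's values at submission time. For example, to resubmit example.pbs with a longer wall time:

qsub -l walltime=8:00:00 example.pbs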

For more detailed information, please take a look at:

man qsub
Unsupported PBS Options

Palmetto does not support all PBS options; see the list below for unsupported options:

  • Using -r n to mark the job as not re-runnable is not supported because our preemption system requires that all jobs be re-runnable. Jobs marked as not re-runnable will be marked as re-runnable automatically.

Resource Limits Specification

The -l switch, provided to qsub on the command line or with a #PBS directive, specifies the amount and kind of compute hardware (cores, memory, GPUs, interconnect, etc.), its location (i.e., the node(s) and phase from which to request hardware), and other limits such as the job's wall time.

| Option | Purpose |
| --- | --- |
| select | Number of chunks and resources per chunk. Two or more "chunks" can be placed on a single node, but a single "chunk" cannot span more than one node. |
| walltime | Expected wall time of the job (the job is terminated after this time) |
| place | Controls the placement of the different chunks |
Tip: a chunk in PBS is a set of resources (CPU cores, memory, and GPUs) that must be allocated on a single physical machine (a single node). Each chunk corresponds to one or more MPI "slots"; by default, a single MPI slot is created for each chunk. This can be changed with the mpiprocs option (see MPI Processes below).

PBS may place different chunks on different nodes, or the same node. If needed, you can control the placement with the place option.

Although cgroups prevent processes within a job from using more resources than allocated to the job as a whole, there are no such controls between chunks. If multiple chunks of a job land on the same node, nothing prevents a single process within the job from using all of the resources of every chunk (within the job) on that node.

Here are some examples of resource limits specification:

-l select=1:ncpus=8:chip_model=opteron:interconnect=10g
-l select=1:ncpus=16:chip_type=e5-2665:interconnect=56g:mem=62gb,walltime=16:00:00
-l select=1:ncpus=8:chip_type=2356:interconnect=10g:mem=15gb
-l select=1:ncpus=1:node_manufacturer=ibm:mem=15gb,walltime=00:20:00
-l select=1:ncpus=4:mem=15gb:ngpus=2,walltime=00:20:00
-l select=1:ncpus=4:mem=15gb:ngpus=1:gpu_model=k40,walltime=00:20:00
-l select=1:ncpus=2:mem=15gb:host=node1479,walltime=00:20:00
-l select=2:ncpus=2:mem=15gb,walltime=00:20:00,place=scatter # force each chunk to be on a different node
-l select=2:ncpus=2:mem=15gb,walltime=00:20:00,place=pack # force each chunk to be on the same node

and examples of options you can use in the job limit specification:

# CPU Options
chip_manufacturer=amd
chip_manufacturer=intel
chip_model=opteron
chip_model=xeon
chip_type=e5345
chip_type=e5410
chip_type=l5420
chip_type=x7542
chip_type=2356
chip_type=6172
chip_type=e5-2665

# Node Manufacturer Options
node_manufacturer=dell
node_manufacturer=hp
node_manufacturer=ibm
node_manufacturer=sun

# GPU count options
ngpus=1
ngpus=2

# GPU Options
gpu_model=k20
gpu_model=k40
gpu_model=p100
gpu_model=v100
gpu_model=a100

# Phase Options
phase=1a
phase=3
phase=19b
phase=28 # You can specify any phase from /etc/hardware-table

# Interconnect Options
interconnect=1g # 1 Gbps Ethernet
interconnect=10ge # 10 Gbps Ethernet
interconnect=56g # 56 Gbps FDR InfiniBand; same as fdr
interconnect=fdr # 56 Gbps FDR InfiniBand; same as 56g
interconnect=100g # 100 Gbps HDR InfiniBand; same as hdr
interconnect=hdr # 100 Gbps HDR InfiniBand; same as 100g
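
To see which phases, chip types, GPU models, and interconnects are currently available, you can inspect the hardware table mentioned above from a login node:

[username@login001 ~]$ cat /etc/hardware-table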

MPI Processes

If you are using MPI, you can tell the scheduler how many MPI processes per chunk to make available using the mpiprocs resource. For example:

# 1 chunk, 4 cores per chunk, 4 MPI processes per chunk
# 4 cores total, 4 MPI processes total, 1 core per process:
-l select=1:ncpus=4:mpiprocs=4:mem=8gb

# 1 chunk, 4 cores per chunk, 2 MPI processes per chunk
# 4 cores total, 2 MPI processes total, 2 cores per process:
-l select=1:ncpus=4:mpiprocs=2:mem=8gb

# 4 chunks, 4 cores per chunk, 4 MPI processes per chunk
# 16 cores total, 16 MPI processes total, 1 core per process:
-l select=4:ncpus=4:mpiprocs=4:mem=8gb

# 4 chunks, 4 cores per chunk, 1 MPI processes per chunk
# 16 cores total, 4 MPI processes total, 4 cores per process:
-l select=4:ncpus=4:mpiprocs=1:mem=8gb

If your program uses MPI only, you likely want mpiprocs equal to ncpus. If your program uses both MPI and OpenMP, you likely want mpiprocs less than ncpus, since each MPI process will spawn multiple threads (and thus make use of multiple cores). If mpiprocs is greater than ncpus, you will be oversubscribing the cores, which almost always results in lower performance.
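
Putting this together, here is a minimal sketch of a batch script for a hybrid MPI/OpenMP job using the last specification above. The module and program names (openmpi, mysim) are placeholders rather than specific Palmetto software; run module avail to find the MPI builds actually installed:

#PBS -N mpi-example
#PBS -l select=4:ncpus=4:mpiprocs=1:mem=8gb,walltime=01:00:00

module add openmpi        # placeholder; pick an exact build from `module avail`

cd $PBS_O_WORKDIR         # PBS_O_WORKDIR is the directory you submitted from

export OMP_NUM_THREADS=4  # each MPI process spawns 4 OpenMP threads
mpirun ./mysim            # MPI builds with PBS support detect the node allocation
                          # automatically; otherwise pass -machinefile $PBS_NODEFILE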

Environment Variables

The following table contains potentially useful environment variables set by the PBS scheduler every time you submit a job:

| Variable Name | Description |
| --- | --- |
| HOSTNAME | The name of the current host on Palmetto, e.g. node0581.palmetto.clemson.edu |
| MODULEPATH | The list of paths containing software available to the module command-line tool |
| NCPUS | The number of requested CPUs per node |
| PATH | The ordered list of paths used by Linux to locate executable files when running a command |
| PBS_ENVIRONMENT | Takes the value PBS_INTERACTIVE for interactive jobs or PBS_BATCH for batch jobs |
| PBS_JOBDIR | Pathname of the job's staging and execution directory on the primary host |
| PBS_JOBID | Job identifier given by PBS when the job is submitted |
| PBS_JOBNAME | Job name given by the user |
| PBS_NODEFILE | The name of the file containing the list of node hostnames assigned to the job |
| TMPDIR | Pathname of the job's local scratch directory |

You can access these variables in your programs. For example, you can print the name of your host in bash:

echo $HOSTNAME
# example output: "node0581.palmetto.clemson.edu"

or in Python:

import os
print(os.environ['HOSTNAME'])
# example output: "node0581.palmetto.clemson.edu"
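
Similarly, you can list the node(s) assigned to your job by reading the file that PBS_NODEFILE points to; it contains one hostname per MPI slot:

cat $PBS_NODEFILE
# example output: one hostname per line, such as "node0230.palmetto.clemson.edu"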