Compute Node Hardware
Cluster Type
Palmetto is a heterogeneous cluster: compute nodes vary in hardware configuration across the cluster, unlike many other HPC systems.
Within a single phase, however, Palmetto is homogeneous: all compute nodes in a given phase share the same hardware configuration.
If your job needs a certain type of hardware, make sure the resource list you pass to qsub will target compatible nodes.
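For example, a resource list targeting a 16-core FDR InfiniBand node could be assembled as follows. This is a sketch with placeholder values (the walltime, core/memory figures, and job script name `myjob.sh` are illustrative; consult the phase tables below for valid combinations):

```shell
# Compose a PBS resource request (hypothetical values; adjust to the phase
# tables in this document). Per note (1) in the hardware table, request a
# couple of GB less memory than the node physically has.
select="select=1:ncpus=16:mem=60gb:interconnect=fdr"
walltime="walltime=24:00:00"
# The assembled qsub command (myjob.sh is a placeholder job script):
echo "qsub -l ${select},${walltime} myjob.sh"
```

Remember that PBS resource requests are always lowercase.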
Phases
The cluster is divided into several "phases"; the basic hardware configuration (node count, cores, RAM) is given below. For more detailed and up-to-date information, view the file /etc/hardware-table after logging in.
Ethernet phases
Phases 1 through 6 of the cluster consist of older hardware with 1 Gbps Ethernet interconnect (shown as interconnect=1g in the hardware table); the phase 0 big-memory nodes use 10 Gbps Ethernet. Maximum run time for a single task is limited to 336 hours.
Phase | Node count | Cores | RAM |
---|---|---|---|
1a | 118 | 8 | 31 GB |
1b | 46 | 12 | 92 GB |
2a | 68 | 16 | 251 GB |
2c | 88 | 16 | 62 GB |
5c | 37 | 8 | 22 GB |
5d | 23 | 12 | 46 GB |
InfiniBand phases
Phases 7-29 of the cluster consist of newer hardware with 56 Gbps FDR InfiniBand interconnect (phases 7-17) or 100 Gbps HDR InfiniBand interconnect (phases 18-29). Maximum run time for a single task is limited to 72 hours.
Phase | Node count | Cores | RAM | GPU |
---|---|---|---|---|
7 | 39 | 16 | 62 GB | none |
8a | 71 | 16 | 62 GB | K20 |
8b | 57 | 16 | 62 GB | K20 |
9 | 72 | 16 | 126 GB | K20 |
10 | 80 | 20 | 126 GB | K20 |
11a | 40 | 20 | 126 GB | K40 |
11b | 3 | 20 | 126 GB | K40 |
11c | 41 | 16 | 250 GB | none |
12 | 28 | 24 | 126 GB | K40 |
13 | 24 | 24 | 126 GB | K40 |
14 | 12 | 24 | 126 GB | K40 |
15 | 32 | 24 | 126 GB | K40 |
16 | 40 | 28 | 126 GB | P100 |
17 | 20 | 28 | 126 GB | P100 |
18a | 2 | 40 | 372 GB | V100 |
18b | 65 | 40 | 372 GB | V100 |
18c | 10 | 40 | 748 GB | V100 |
19a | 28 | 40 | 372 GB | V100 |
19b | 7 | 48 | 372 GB | none |
20 | 22 | 56 | 372 GB | V100 |
27 | 34 | 56 | 372 GB | A100 |
28 | 26 | 64 | 250 GB | A100 |
29 | 40 | 64 | 250 GB | A100 |
GPUs
The InfiniBand nodes each have 2 NVIDIA GPU cards of the models shown above, except for the phases marked "none", which have no GPUs.
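Following the footnotes in the hardware table, a GPU request combines the ngpus and gpu_model resources. The sketch below uses placeholder walltime and job-script values:

```shell
# Request both GPUs on one V100 node (resource names from the hardware
# table footnotes; walltime and the train.sh script are placeholders).
gpu_request="select=1:ngpus=2:gpu_model=v100"
echo "qsub -l ${gpu_request},walltime=12:00:00 train.sh"
```

If you don't care which GPU model you get, the hardware table notes that you can specify gpu_model=any instead of a specific model.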
Big-memory nodes
Phase 0 consists of several "bigmem" machines with large amounts of RAM.
Phase | Node count | Cores | RAM |
---|---|---|---|
0a | 3 | 24 | 1 TB |
0b | 5 | 32 | 750 GB |
0c | 1 | 40 | 1 TB |
0d | 2 | 36 | 1.5 TB |
0e | 1 | 40 | 1.5 TB |
0f | 3 | 80 | 1.5 TB |
0g | 1 | 56 | 1 TB |
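Per note (2) in the hardware table, the big-memory machines are reached through the "bigmem" queue. A hypothetical request might look like this (core, memory, walltime, and script values are illustrative):

```shell
# Target a bigmem node via the "bigmem" queue (note (2) in the hardware
# table). ncpus/mem are illustrative for a 24-core, 1 TB phase 0a node;
# leave a few GB of RAM for the OS per note (1).
queue="bigmem"
resources="select=1:ncpus=24:mem=1000gb"
echo "qsub -q ${queue} -l ${resources},walltime=48:00:00 bigjob.sh"
```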
Hardware Table
The hardware configuration of the different nodes is available in the file /etc/hardware-table.
Different hardware is available on the PBS and Slurm sides of the cluster, so the contents of /etc/hardware-table differ depending on whether you connect to login (PBS) or slogin (Slurm).
You can view this file by running the following command from the login node:

```
cat /etc/hardware-table
```
Sample Output
The output below shows the contents of /etc/hardware-table as of January 17, 2023. Hardware in the cluster can change at any time; for the most up-to-date information, connect to the login node and run the command above.
```
[coreyf@login002 ~]$ cat /etc/hardware-table
PALMETTO HARDWARE TABLE      Last updated:  Fri Jul 1 2022

PHASE COUNT  MAKE   MODEL    CHIP(0)                 CORES  RAM(1)    /local_scratch   Interconnect     GPUs

BIGMEM nodes
 0a     3    HP     DL580    Intel Xeon 7542         24     1.0 TB(2)   99 GB          10ge             0
 0b     1    Dell   R820     Intel Xeon E5-4640      32     750 GB(2)  740 GB(13)      10ge             0
 0c     1    Dell   R830     Intel Xeon E5-4627v4    40     1.0 TB(2)  880 GB          10ge             0
 0d     2    Lenovo SR650    Intel Xeon 6240         36     1.5 TB(2)  400 GB(13)      10ge             0
 0e     1    HP     DL560    Intel Xeon E5-4627v4    40     1.5 TB(2)  881 GB          10ge             0
 0f     1    HPE    DL560    Intel Xeon 6138G        80     1.5 TB(2)  3.6 TB          10ge             0
 0f     1    HPE    DL560    Intel Xeon 6148G        80     1.5 TB(2)  745 GB(13)      10ge             0
 0f     1    HPE    DL560    Intel Xeon 6148G        80     1.5 TB(2)  3.6 TB          10ge             0

C1 CLUSTER (older nodes with interconnect=1g)
 1a   118    Dell   R610     Intel Xeon E5520         8     31 GB      220 GB          1g               0
 1b    46    Dell   R610     Intel Xeon E5645        12     92 GB      220 GB          1g               0
 2a    68    Dell   R620     Intel Xeon E5-2660      16     251 GB     2.7 TB          1g               0
 2c    88    Dell   PEC6220  Intel Xeon E5-2665      16     62 GB      250 GB          1g               0
 3    149    Sun    X2200    AMD Opteron 2356         8     15 GB      193 GB          1g               0
 4    280    IBM    DX340    Intel Xeon E5410         8     31 GB      111 GB          1g               0
 5c    37    Dell   R510     Intel Xeon E5640         8     22 GB      7 TB            1g               0
 5d    23    Dell   R520     Intel Xeon E5-2450      12     46 GB      2.7 TB          1g               0
 6     65    HP     DL165    AMD Opteron 6176        24     46 GB      193 GB          1g               0

C2 CLUSTER (newer nodes with interconnect=FDR)
 7a    42    HP     SL230    Intel Xeon E5-2665      16     62 GB      240 GB          56g, fdr, 10ge   0
 7b    12    HP     SL250s   Intel Xeon E5-2665      16     62 GB      240 GB          56g, fdr, 10ge   0
 8a    71    HP     SL250s   Intel Xeon E5-2665      16     62 GB      900 GB          56g, fdr, 10ge   2 x K20(4)
 8b    57    HP     SL250s   Intel Xeon E5-2665      16     62 GB      420 GB          56g, fdr, 10ge   2 x K20(4)
 9     72    HP     SL250s   Intel Xeon E5-2665      16     125 GB     420 GB          56g, fdr, 10ge   2 x K20(4)
10     80    HP     SL250s   Intel Xeon E5-2670v2    20     125 GB     800 GB          56g, fdr, 10ge   2 x K20(4)
11a    40    HP     SL250s   Intel Xeon E5-2670v2    20     125 GB     800 GB          56g, fdr, 10ge   2 x K40(6)
11b     3    HP     SL250s   Intel Xeon E5-2670v2    20     125 GB     800 GB          56g, fdr, 10ge   0
11c    41    Dell   MISC     Intel Xeon E5-2650v2    16     250 GB     2.7 TB          56g, fdr, 10ge   0
12     29    Lenovo NX360M5  Intel Xeon E5-2680v3    24     125 GB     800 GB          56g, fdr, 10ge   2 x K40(6)
13     24    Dell   C4130    Intel Xeon E5-2680v3    24     125 GB     1.8 TB          56g, fdr, 10ge   2 x K40(6)
14     12    HPE    XL1X0R   Intel Xeon E5-2680v3    24     125 GB     880 GB          56g, fdr, 10ge   2 x K40(6)
15     32    Dell   C4130    Intel Xeon E5-2680v3    24     125 GB     1.8 TB          56g, fdr, 10ge   2 x K40(6)
16     40    Dell   C4130    Intel Xeon E5-2680v4    28     125 GB     1.8 TB          56g, fdr, 10ge   2 x P100(8)
17     20    Dell   C4130    Intel Xeon E5-2680v4    28     124 GB     1.8 TB          56g, fdr, 10ge   2 x P100(8)

C2 CLUSTER (newer nodes without FDR)
19b     4    HPE    XL170    Intel Xeon 6252G        48     372 GB     1.8 TB          56g, 10ge        0

C2 CLUSTER (newest nodes with interconnect=HDR)
18a     2    Dell   C4140    Intel Xeon 6148G        40     372 GB     1.9 TB          100g, hdr, 25ge  4 x V100NV(9)
18b    65    Dell   R740     Intel Xeon 6148G        40     372 GB     1.8 TB          100g, hdr, 25ge  2 x V100(10)
18c    10    Dell   R740     Intel Xeon 6148G        40     748 GB     1.8 TB          100g, hdr, 25ge  2 x V100(10)
19a    28    Dell   R740     Intel Xeon 6248G        40     372 GB     1.8 TB          100g, hdr, 25ge  2 x V100(10)
20     22    Dell   R740     Intel Xeon 6238R        56     372 GB     1.8 TB          100g, hdr, 25ge  2 x V100S(11)
21      2    Dell   R740     Intel Xeon 6248G        40     372 GB     1.8 TB          100g, hdr, 25ge  2 x V100
24a     2    NVIDIA DGXA100  AMD EPYC 7742          128     1 TB       28 TB           100g, hdr, 100ge 8 x A100(17)
24b     1    NVIDIA DGX-1    Intel Xeon E5-2698v4    40     512 GB     6.6 TB          100g, hdr, 100ge 8 x V100
27     34    Dell   R740     Intel Xeon 6258R        56     372 GB     1.8 TB          100g, hdr, 25ge  2 x A100(16)
28     26    Dell   R750     Intel Xeon 8358         64     250 GB     790 GB          100g, hdr, 25ge  2 x A100(18)
29     40    Dell   R750     Intel Xeon 8358         64     250 GB     790 GB          100g, hdr, 25ge  2 x A100(18)

*** PBS resource requests are always lowercase ***

If you don't care which GPU MODEL you get (K20, K40, P100, V100, V100S, V100NV), you can specify gpu_model=any
If you don't care which IB you get (FDR or HDR), you can specify interconnect=any

(0) CHIP has 3 resources: chip_manufacturer, chip_model, chip_type
(1) Leave 2 or 3GB for the operating system when requesting memory in PBS jobs
(2) Specify queue "bigmem" to access the large memory machines
(4) 2 NVIDIA Tesla K20m cards per node, use resource request "ngpus=[1|2]" and "gpu_model=k20"
(6) 2 NVIDIA Tesla K40m cards per node, use resource request "ngpus=[1|2]" and "gpu_model=k40"
(8) 2 NVIDIA Tesla P100 cards per node, use resource request "ngpus=[1|2]" and "gpu_model=p100"
(9) 4 NVIDIA Tesla V100 cards per node with NVLINK2, use resource request "ngpus=[1|2|3|4]" and "gpu_model=v100nv"
(10) 2 NVIDIA Tesla V100 cards per node, use resource request "ngpus=[1|2]" and "gpu_model=v100"
(11) 2 NVIDIA Tesla V100S cards per node, use resource request "ngpus=[1|2]" and "gpu_model=v100s"
(16) 2 NVIDIA A100 40GB cards per node, use resource request "ngpus=[1|2]" and "gpu_model=a100"
(17) 8 NVIDIA A100 cards per node, use resource request "ngpus=[1..8]" and "gpu_model=dgxa100"
(18) 2 NVIDIA A100 80GB cards per node, use resource request "ngpus=[1|2]" and "gpu_model=a100"
```
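Because the table is plain text, standard tools can filter it. As a sketch, the snippet below lists phases with A100 GPUs; on Palmetto you would read /etc/hardware-table itself, while here a heredoc with a few rows copied from the table above stands in for the file:

```shell
# Print the phase (first field) of every row mentioning A100 GPUs.
# The heredoc holds sample rows; on the cluster, replace it with
#   awk '/A100/ {print $1}' /etc/hardware-table
awk '/A100/ {print $1}' <<'EOF'
27     34    Dell   R740     Intel Xeon 6258R        56     372 GB     1.8 TB   100g, hdr, 25ge  2 x A100(16)
28     26    Dell   R750     Intel Xeon 8358         64     250 GB     790 GB   100g, hdr, 25ge  2 x A100(18)
20     22    Dell   R740     Intel Xeon 6238R        56     372 GB     1.8 TB   100g, hdr, 25ge  2 x V100S(11)
EOF
# prints "27" and "28", one per line; the V100S row does not match
```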