Overview

Scientific research today involves ever larger and more complex data sets. Meanwhile, CPU performance has improved much faster than storage performance, so I/O can become a bottleneck in HPC environments. Read this section to learn how to optimize your storage usage and improve I/O performance on Palmetto.

Palmetto provides three data spaces: home, scratch, and paid storage.

Every user has a home directory and access to the scratch file systems. Paid storage is purchased by our research scientists to house their data and can be accessed only by users who have the owner’s approval. Each data space is accessible from anywhere in the cluster.

Storage Video

The video below will walk you through the different data spaces and how they are used.

Storage Hardware Grid

The table below lists each file system on Palmetto along with its storage medium and its network connection to the nodes.

| NAME | SIZE | DISK TYPE | FILE SYSTEM | NETWORK CONNECTION | BACKUP |
| /home | 100GB per user | HDD | zfs | ethernet | yes |
| /scratch | 500TB | SSD | Indigo | IB, ethernet | no |
| /scratch1 ⚠️ | 1933TB | HDD | beegfs | IB, ethernet | no |
| /fastscratch ⚠️ | 168TB | NVMe | beegfs | IB, ethernet | no |
| /local_scratch | 99GB - 2.7TB per node | HDD, SSD, NVMe | ext4 | internal | no |
| /zfs/{group} | Varies by owner | HDD | zfs | ethernet | yes |
| /project/{path} | Varies by owner | SSD | Indigo | IB, ethernet | yes |
/scratch1 and /fastscratch are deprecated

/scratch1 and /fastscratch are deprecated and are scheduled to be removed from the cluster in the summer of 2024. Instead, use our newer /scratch system for shared scratch space.
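If you have older job scripts, it may help to check them for references to the deprecated file systems. The sketch below is one possible way to do that in Python. The function name `find_deprecated_paths` and the matching rule are illustrative assumptions, not a Palmetto-provided tool.

```python
import re

# Matches /scratch1 or /fastscratch at the start of a path.
# It does not match /scratch, /scratch10, or /home/user/scratch1.
DEPRECATED = re.compile(r"(?<![\w./-])/(scratch1|fastscratch)(?![\w-])")


def find_deprecated_paths(text):
    """Return (line_number, path) pairs for each deprecated path in `text`."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in DEPRECATED.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits
```

To scan a script, pass its contents to `find_deprecated_paths` (for example, the result of `Path(name).read_text()`) and update each reported line to use /scratch.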

What do the abbreviations in the storage hardware grid mean?

The legend below can be used to understand the abbreviations in the storage hardware grid table.

| Term | Explanation |
| HDD | Hard Disk Drive |
| SSD | Solid State Drive |
| NVMe | Non-Volatile Memory Express |
| IB | InfiniBand |
| GB | Gigabyte |
| TB | Terabyte |

Performance Guidelines

/local_scratch is generally the fastest file system to use because no network is involved. However, this space cannot be shared among the nodes participating in a job, and any data you want to keep must be moved to permanent storage before the job completes.
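The stage-in, compute, stage-out pattern this implies can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `run_staged` and the `work` callable are hypothetical names, and the caller supplies `scratch_root`, because the exact per-job directory layout under /local_scratch is not described here.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(input_file, results_dir, work, scratch_root):
    """Copy input to node-local scratch, run `work`, then copy outputs back.

    `work` is a hypothetical callable taking (local_input, local_output_dir).
    Returns the names of the result files copied to `results_dir`.
    """
    input_file = Path(input_file)
    results_dir = Path(results_dir)
    # Private working directory on the node-local file system.
    workdir = Path(tempfile.mkdtemp(prefix="job_", dir=scratch_root))
    try:
        # Stage in: copy the input next to the computation.
        local_input = workdir / input_file.name
        shutil.copy2(input_file, local_input)
        local_out = workdir / "out"
        local_out.mkdir()

        # Compute against local files only.
        work(local_input, local_out)

        # Stage out: copy results to permanent storage before the job ends.
        results_dir.mkdir(parents=True, exist_ok=True)
        copied = []
        for item in sorted(local_out.iterdir()):
            if item.is_file():
                shutil.copy2(item, results_dir / item.name)
                copied.append(item.name)
        return copied
    finally:
        # Clean up the local working directory even if `work` fails.
        shutil.rmtree(workdir, ignore_errors=True)
```

In a real job, `scratch_root` would point to a directory on /local_scratch and `results_dir` to a location in home or paid storage. Because the cleanup runs in a `finally` block, results that were not copied out before a failure are lost, which mirrors what happens to local scratch data when a job ends.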

/scratch is our newest parallel file system and runs on flash storage. It performs well across most access patterns. /scratch1 is a parallel file system that runs on spinning disk drives and is best suited to workflows that issue large, sequential reads or writes. /fastscratch is also a parallel file system, but it runs on NVMe flash drives and is best suited to workflows with small, random reads or writes. It also handles sequential workloads well.

The Palmetto support team encourages you to test your workflows against all three file systems to see which one works best for you.
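One rough way to run such a comparison is a small timing script like the sketch below. It is not an official Palmetto benchmark: `benchmark_dir` and its default sizes are illustrative assumptions, and the read figures may be inflated by operating system caching, so treat the numbers as relative indicators only. Timing your actual workflow on each file system remains the more reliable test.

```python
import os
import random
import sys
import time
from pathlib import Path


def benchmark_dir(directory, total_bytes=64 * 1024 * 1024, block_bytes=1024 * 1024,
                  random_reads=1000, random_read_bytes=4096, seed=0):
    """Time a sequential write, a sequential read, and small random reads."""
    n_blocks = total_bytes // block_bytes
    size = n_blocks * block_bytes
    if size < random_read_bytes:
        raise ValueError("total_bytes is too small for the requested reads")
    path = Path(directory) / f"io_bench_{os.getpid()}.tmp"
    block = os.urandom(block_bytes)
    try:
        # Sequential write, flushed to storage before the timer stops.
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        write_s = max(time.perf_counter() - start, 1e-9)

        # Sequential read in large blocks.
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_bytes):
                pass
        seq_read_s = max(time.perf_counter() - start, 1e-9)

        # Small reads at random offsets.
        rng = random.Random(seed)
        start = time.perf_counter()
        with open(path, "rb") as f:
            for _ in range(random_reads):
                f.seek(rng.randrange(0, size - random_read_bytes + 1))
                f.read(random_read_bytes)
        rand_read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        # Remove the temporary benchmark file.
        if path.exists():
            path.unlink()
    megabytes = size / 1e6
    return {
        "seq_write_MBps": megabytes / write_s,
        "seq_read_MBps": megabytes / seq_read_s,
        "rand_read_ops_per_s": random_reads / rand_read_s,
    }


if __name__ == "__main__":
    # Usage: python io_bench.py DIR [DIR ...]
    for target in sys.argv[1:]:
        print(target, benchmark_dir(target))
```

Pass a directory you own on each file system as an argument, and run the script from a compute node so the measurements reflect the network path your jobs actually use.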