
Summer 2024 Maintenance

The RCD team has scheduled a maintenance window for the Palmetto Cluster between May 6th and May 13th.

Planned Changes

Owner nodes and HDR phases will move to Palmetto 2

During the maintenance, we will move all nodes from owner queues and HDR phases to Palmetto 2.

This transition will allow us to consolidate our resources and provide a more streamlined user experience.

/scratch1 and /fastscratch will be decommissioned

The /scratch1 and /fastscratch file systems are approaching end of life and their warranty coverage will soon expire. These file systems will be decommissioned during the maintenance.

danger

All data on /scratch1 and /fastscratch will be permanently lost.

tip

Since large data transfers may take some time, we recommend that users back up any important files stored on scratch file systems immediately.

/scratch will be purged

To improve performance, system administrators will erase the /scratch file system during the maintenance.

danger

All data on /scratch will be permanently lost.

tip

Since large data transfers may take some time, we recommend that users back up any important files stored on scratch file systems immediately.

ZFS group storage is moving to the Indigo Data Lake

Palmetto 2 will move to general availability

CCIT Research Computing and Data is proud to announce the general availability of the Palmetto 2 HPC resource at Clemson. Palmetto 2 will consist of all HDR and NDR InfiniBand phases of nodes and will also include 65 new core-dense CPU nodes. Palmetto 2 will use Slurm for job scheduling and will provide self-service group management for owner resources through ColdFront. Extended office hours will be provided to assist users transitioning to Palmetto 2.
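For users new to Slurm, the sketch below shows what a minimal batch script can look like. The resource values and the module name are illustrative placeholders, not Palmetto 2 defaults; consult the cluster documentation for actual partition names and limits.

```bash
#!/bin/bash
#SBATCH --job-name=example      # name shown in squeue output
#SBATCH --nodes=1               # number of nodes
#SBATCH --ntasks=1              # number of tasks (processes)
#SBATCH --cpus-per-task=4       # CPU cores per task
#SBATCH --mem=8G                # memory per node
#SBATCH --time=02:00:00         # walltime limit (HH:MM:SS)

# Load required software, then run the workload.
# "anaconda3" is a hypothetical module name.
module load anaconda3
python my_script.py
```

Submit the script with `sbatch job.slurm` and monitor it with `squeue -u $USER`.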

New software modules will be available on Palmetto 2

  • Software modules will be updated to newer versions
  • A new software module system will be implemented so that users can load CPU- and GPU-specific optimized software (see the sketch after this list)
  • Documentation on the new module system will be made available
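The exact commands for the new module system are not spelled out above. Assuming it is a hierarchical Lmod-style setup (an assumption on our part), a typical session might look like the sketch below; package names and versions are placeholders.

```bash
# Show modules currently available to load
module avail

# Search the full hierarchy for a package (Lmod-specific command;
# "openmpi" is an illustrative package name)
module spider openmpi

# Load a compiler first; in hierarchical systems this can expose
# builds optimized for specific CPU/GPU targets
module load gcc/12.3.0      # hypothetical version
module load openmpi/4.1.5   # hypothetical version

# Confirm what is loaded
module list
```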

ColdFront will be available for self-service user management

We are excited to announce the rollout of Clemson ColdFront, our new allocation and resource management interface. All accounts for Palmetto 2 will be managed through ColdFront, making the process simpler and more efficient.

ColdFront is now available and controls access to the Slurm Beta/Palmetto 2 cluster. Allocations created in ColdFront now will continue to function on Palmetto 2 after the maintenance.

To gain access to Palmetto 2, faculty must:

  1. Create one or more projects in ColdFront. Faculty can have multiple projects to separate research groups and classes.
  2. Add users to the project(s). Project owners and managers can also bulk-add accounts by entering a list of usernames in the search field. Optionally, the project owner can delegate a project user to be a manager, allowing a trusted user (e.g., a graduate student or lab coordinator) to request allocations and manage users on the project owner's behalf.
  3. Request an allocation for Palmetto 2 General Queue to the project(s).

Once the allocation is approved, users added to the allocation will be granted access to the cluster. The project owner and managers can then add and remove users from their project and allocation at any time and changes should propagate quickly (within 30 minutes).

We have a new documentation section that covers more details, and if there are any issues, questions, or feedback, please submit a support request.

Open OnDemand will be updated

  • Open OnDemand interactive apps will be updated for use with Palmetto 2
  • Maximum walltime on all Open OnDemand interactive apps will be limited to 12 hours

New job monitoring and visualization tool: jobstats

  • jobstats will provide in-depth job metrics, giving users a more comprehensive view of job performance and a clearer picture of how much of their requested compute resources are actually being used (see the example after this list)
  • jobstats will also have an Open OnDemand app that provides job performance visualizations
  • We will monitor jobs on the cluster and automatically terminate idle jobs or jobs that severely underutilize their requested compute resources
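Assuming jobstats follows the common pattern of taking a Slurm job ID on the command line (the invocation and job ID below are illustrative, not confirmed Palmetto 2 syntax):

```bash
# Print utilization metrics for a job
# (12345678 is a placeholder job ID)
jobstats 12345678
```

Comparing the reported utilization against what you requested is a quick way to right-size future submissions and avoid the automatic termination described above.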

FAQ

Will I be able to access RCD services during the maintenance?

When the maintenance begins at 9:00 AM on Monday, May 6th, the following RCD services will go offline:

  • Palmetto Cluster
  • Indigo Data Lake
  • Open OnDemand
  • XDMoD
  • ColdFront
  • RCD GitLab

When will users be able to access RCD services again?

Users should expect that RCD services will be available again no earlier than 9:00 AM EDT on Monday, May 13th.

What will happen to jobs that are running when maintenance begins?

All jobs will be canceled when maintenance begins and will not be re-queued.

Users may resubmit their jobs after the maintenance.

How can I back up my important files in scratch storage?

warning

Large data transfers can take a surprising amount of time. Please begin your transfers early enough that they complete before the maintenance begins.

tip

We encourage you to only save the files you still need. Use this purge of scratch as an opportunity for spring cleaning!

Within the cluster, you can move files to your home directory (limited to the 100 GB quota) or an Indigo project space.

You can also transfer the data outside the cluster. Refer to our data transfer documentation for a complete set of options. We recommend Globus for large data transfers.
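As a sketch of an in-cluster copy, the rsync commands below preserve timestamps and permissions and can be re-run to resume an interrupted transfer; the paths are placeholders for your own scratch directory and destination.

```bash
# Create a destination and copy a project directory out of scratch
mkdir -p /home/$USER/backup
rsync -av --progress /scratch/$USER/my_project /home/$USER/backup/

# Spot-check the copy before the maintenance window
diff -r /scratch/$USER/my_project /home/$USER/backup/my_project
```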