Summer 2024 Maintenance
The RCD team has scheduled a maintenance window for the Palmetto Cluster between May 6th and May 13th.
Planned Changes
Owner nodes and HDR phases will move to Palmetto 2
During the maintenance, we will move all nodes from owner queues and HDR phases to Palmetto 2.
This transition will allow us to consolidate our resources and provide a more efficient and streamlined user experience.
/scratch1 and /fastscratch will be decommissioned
The /scratch1 and /fastscratch file systems are approaching end of life and their warranty coverage will soon expire. These file systems will be decommissioned during the maintenance.
All data on /scratch1 and /fastscratch will be permanently lost.
Since large data transfers may take some time, we recommend that users back up any important files stored on scratch file systems immediately.
/scratch will be purged
To improve performance, system administrators will erase the /scratch file system during the maintenance. All data on /scratch will be permanently lost.
Since large data transfers may take some time, we recommend that users back up any important files stored on scratch file systems immediately.
ZFS group storage is moving to the Indigo Data Lake
Palmetto 2 will move to general availability
CCIT Research Computing and Data is proud to announce the general availability of the Palmetto 2 HPC resource at Clemson. Palmetto 2 will consist of all HDR and NDR IB phases of nodes and will also contain 65 new core-dense CPU nodes. Palmetto 2 will use Slurm for job scheduling and will offer self-service group management for owner resources through ColdFront. Extended office hours will be provided to assist users transitioning to Palmetto 2.
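Since Palmetto 2 schedules jobs with Slurm, existing PBS job scripts will need to be adapted. Below is a minimal sketch of a Slurm batch script; the job name and all resource values are illustrative placeholders, and partition names on Palmetto 2 should be taken from the cluster documentation, not from this example.

```shell
# Write a minimal Slurm batch script (all resource values are illustrative).
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello          # job name shown in squeue
#SBATCH --ntasks=1                # one task
#SBATCH --cpus-per-task=4         # CPU cores for that task
#SBATCH --mem=8G                  # total memory for the job
#SBATCH --time=01:00:00           # walltime limit (HH:MM:SS)

echo "Running on $(hostname)"
EOF

# On a Palmetto 2 login node you would then submit it with:
#   sbatch hello_job.sh
```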
New software modules will be available on Palmetto 2
- Software modules will be updated to newer versions
- A new software module system will be implemented so that users can load CPU/GPU-specific optimized software
- Documentation on the new module system will be made available
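For users new to environment modules, the basic workflow looks like the sketch below. The module name (`gcc`) is only an example; run `module avail` on Palmetto 2 to see what is actually installed. The `command -v` guard and the status variable simply keep the snippet from failing on machines without a module system.

```shell
# Guarded so the snippet is a no-op on machines without a module system.
if command -v module >/dev/null 2>&1; then
    MODULES_AVAILABLE=yes
    module avail                 # list software available on this node
    module load gcc              # load a compiler (example module name)
    module list                  # show what is currently loaded
    module purge                 # unload everything before switching stacks
else
    MODULES_AVAILABLE=no         # not on a cluster node; nothing to do
fi
```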
ColdFront will be available for self-service user management
We are excited to announce the rollout of Clemson ColdFront, our new allocation and resource management interface. All accounts for Palmetto 2 will be managed through ColdFront, making the process simpler and more efficient.
ColdFront is now available and controls access to the Slurm Beta/Palmetto 2 cluster. Allocations created in ColdFront today will continue to function on Palmetto 2 after the maintenance.
To gain access to Palmetto 2, faculty must:
- Create one or more projects in ColdFront. Faculty can have multiple projects to separate research groups and classes.
- Add users to the project(s). Project owners and managers can also bulk add accounts by entering a list of usernames in the search field. Optionally, the project owner can delegate a project user to be a manager allowing a trusted user (e.g. graduate student or lab coordinator) to request allocations and manage users on the project owner's behalf.
- Request an allocation for Palmetto 2 General Queue to the project(s).
Once the allocation is approved, users added to the allocation will be granted access to the cluster. The project owner and managers can then add and remove users from their project and allocation at any time and changes should propagate quickly (within 30 minutes).
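Once access has propagated, one way to confirm it from the command line is to check your Slurm account association on a login node. This is a hedged sketch: the exact account and partition names shown by `sacctmgr` will depend on your ColdFront allocation, and the snippet is guarded so it is a no-op on machines without Slurm tools.

```shell
# Confirm your Slurm account association after allocation approval.
# Guarded so this is a no-op off the cluster.
if command -v sacctmgr >/dev/null 2>&1; then
    SLURM_TOOLS=yes
    sacctmgr show associations user="$USER" format=Account,User,Partition
else
    SLURM_TOOLS=no   # sacctmgr not found; run this on a Palmetto 2 login node
fi
```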
A new documentation section covers these steps in more detail. If you have any issues, questions, or feedback, please submit a support request.
Open OnDemand will be updated
- Open OnDemand interactive apps will be updated for use with Palmetto 2
- Maximum walltime on all Open OnDemand interactive apps will be limited to 12 hours
New job monitoring and visualization tool: jobstats
jobstats will provide in-depth job metrics, giving users a more comprehensive view of job performance. This will help users understand how much of their requested compute resources their jobs actually use.
- jobstats will also have an Open OnDemand app that provides job performance visualizations
- We will be monitoring jobs on the cluster and will automatically terminate idle jobs or jobs that severely underutilize their requested compute resources
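Usage of the command-line tool is expected to follow the common jobstats pattern of passing a completed job's ID; the job ID below is a placeholder, and the snippet is guarded so it is a no-op where jobstats is not installed.

```shell
# jobstats takes a Slurm job ID; 1234567 is a placeholder example.
# Guarded so the snippet is a no-op on systems without jobstats installed.
if command -v jobstats >/dev/null 2>&1; then
    JOBSTATS_PRESENT=yes
    jobstats 1234567     # summarizes CPU, memory, and GPU utilization
else
    JOBSTATS_PRESENT=no  # run this on a Palmetto 2 login node instead
fi
```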
FAQ
Will I be able to access RCD services during the maintenance?
When the maintenance begins at 9:00 AM on Monday, May 6th, the following RCD services will go offline:
- Palmetto Cluster
- Indigo Data Lake
- Open OnDemand
- XDMoD
- ColdFront
- RCD GitLab
When will users be able to access RCD services again?
Users should expect that RCD services will be available again no earlier than 9:00 AM EDT on Monday, May 13th.
What will happen to jobs that are running when maintenance begins?
All jobs will be canceled when maintenance begins and will not be re-queued.
Users may resubmit their jobs after the maintenance.
How can I back up my important files in scratch storage?
Large data transfers can take a surprising amount of time, so please begin your transfers early enough that they complete before the maintenance begins.
We encourage you to only save the files you still need. Use this purge of scratch as an opportunity for spring cleaning!
Within the cluster, you can move files to your home directory (limited to the 100GB quota) or an Indigo project space.
You can also transfer the data outside the cluster. Refer to our data transfer documentation for a complete set of options. We recommend Globus for large data transfers.