Cluster Resources

Hardware specifications and limits

Modified: 2026-04-22

This page summarizes the available hardware and resource limits.

Hardware overview

Compute nodes (12x)

The 12 compute nodes (node01-node12) are identical:

Resource           Spec
CPU                AMD EPYC 9655P (Zen 5, Turin)
Cores              96
Threads per core   2
Base clock         2.6 GHz
L3 cache           384 MB
RAM                1152 GB DDR5-5600 ECC (24x 48 GB, i.e., 12 GB per core, 6 GB per thread)
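
As a sizing rule of thumb from the table above, requesting more than about 12 GB of memory per core will exhaust a node's RAM before its cores. A minimal Slurm sketch, assuming standard flags (the values are illustrative, not site defaults):

# Illustrative request: 16 cores at the per-core memory ceiling from the table
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=12G   # 16 x 12 GB = 192 GB total, within one node's 1152 GB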

GPU node (1x)

The GPU node (gnode01) has two CPUs and two GPUs:

Resource           Spec
GPUs               2x NVIDIA H200 (141 GB HBM3e each, PCIe)
CPUs               2x AMD EPYC 9555 (Zen 5, Turin)
Cores              128 (2x 64)
Threads per core   1 (SMT disabled)
Base clock         3.2 GHz
L3 cache           512 MB (2x 256 MB)
RAM                3072 GB DDR5-4800 ECC (48x 64 GB, i.e., 24 GB per core)
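
GPU jobs typically go through the gpu partition with a --gres request. The exact gres name is site-configured, so treat this as a sketch rather than a guaranteed flag set:

# Sketch: single-GPU job (gres name "gpu" assumed; verify with: scontrol show node gnode01)
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=16

nvidia-smi   # should show one of the two H200s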

Head node

Resource     Spec
CPU          AMD EPYC 7443 (Zen 3, Milan)
Cores        24
Base clock   2.85 GHz
L3 cache     128 MB
RAM          1024 GB DDR4-3200 ECC

The head node is for login, package installation, and job submission. Do not run computations here.
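
Even short tasks belong on the compute nodes. srun can run a single command there without writing a job script; a sketch with illustrative options and a placeholder program:

# Run a quick command on a compute node instead of the head node
srun --partition=compute --cpus-per-task=1 --time=00:10:00 ./quick_check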

Partitions

sinfo  # View current partition status
Partition   Nodes            Default   Max time
compute     12 (node01-12)   Yes       20 days
gpu         1 (gnode01)      No        20 days
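
Because compute is the default partition, a plain sbatch submission lands there; the GPU node must be requested explicitly. A minimal sketch (job.sh is a placeholder):

sbatch job.sh                   # runs on the default compute partition
sbatch --partition=gpu job.sh   # runs on gnode01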

Quality of Service (QoS)

QoS           Max time              Limits             Priority
interactive   3 days                2 jobs, 192 CPUs   highest
short         1 hour                                   high
medium        1 day                                    medium
long          7 days                                   low
extended      20 days               1 job              lowest
normal        (partition default)                      baseline
Note

The interactive QoS is automatically applied for salloc sessions.
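
For batch jobs, select a QoS with --qos; salloc needs no extra flag. A sketch (job.sh is a placeholder):

sbatch --qos=long job.sh   # batch job with up to 7 days of walltime
salloc --cpus-per-task=8   # interactive session; interactive QoS applied automatically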

Storage

Path               Type               Purpose
/srv/home/<user>   NFS (NVMe RAID5)   Home directory, scripts, active project data
/mnt/sas           NFS (HDD array)    Long-term storage, archives, separate directories for users, working groups, and projects
Important
Keep /srv/home lean

/srv/home is kept on fast NVMe storage with limited capacity. Move inactive projects and large datasets you don’t actively need to /mnt/sas to keep space available for everyone. /mnt/sas is slower, but has substantially more capacity.
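
A safe way to move an inactive project is to copy it to the archive, verify the copy, then delete the original; for example (paths are placeholders):

# Copy an inactive project to your archive space, then free the NVMe space
rsync -a ~/old_project/ /mnt/sas/users/$USER/old_project/
# After verifying the copy arrived intact:
rm -r ~/old_project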

Archive storage (/mnt/sas)

The /mnt/sas directory is intended for large datasets and archival storage – data that you don’t need immediate access to but want to keep available on the cluster.

Path                          Purpose
/mnt/sas/users/<username>     Your personal archive space
/mnt/sas/groups/<group>       Shared data for your research group
/mnt/sas/projects/<project>   Project-specific shared storage (restricted access)
/mnt/sas/scratch              Temporary workspace (may be cleaned periodically)

Project storage (/mnt/sas/projects)

Some research projects have dedicated shared storage under /mnt/sas/projects/<project>. Access to project directories is restricted – only members of the project group can read or write files there, and no other users can access the data.

  • Project membership is granted by the responsible PI and enforced by the cluster administrator.
  • Access may be time-limited and will automatically expire after the agreed-upon date.
  • If you need access to a project directory, contact the PI responsible for the project.
  • If you believe your access has expired in error, contact the cluster administrator.
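
To see whether you already have access, check your group memberships and the directory's owning group with standard tools:

groups                               # groups your account belongs to
ls -ld /mnt/sas/projects/<project>   # owning group and permissions of the project directory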

Local scratch disk (/localdisk)

Each node has a fast local SSD mounted at /localdisk. This is not shared across nodes — each node has its own independent disk.

Node            Disk                           Usable space
Compute nodes   2x 480 GB SATA SSD (RAID-1)    ~417 GB
GPU node        2x 1.92 TB NVMe SSD (RAID-1)   ~1.8 TB

When you run a Slurm job, a per-job scratch directory is automatically created and cleaned up afterwards. Two environment variables point to it:

  • TMPDIR — set to /localdisk/slurm-<jobid>
  • LOCALDISK_DIR — same path

Use these for temporary files that benefit from fast local I/O (e.g. intermediate results, caches, temporary databases). Data written here is automatically deleted when your job ends.

# In R: tempdir() automatically uses the job scratch dir (follows TMPDIR)
tempdir()
#> [1] "/localdisk/slurm-12345/RtmpXyz"

# For explicit access to the scratch directory
scratch <- Sys.getenv("LOCALDISK_DIR")

# In shell scripts
echo "$TMPDIR"          # /localdisk/slurm-12345
echo "$LOCALDISK_DIR"   # /localdisk/slurm-12345

# In Python
import os, tempfile
tempfile.gettempdir()          # /localdisk/slurm-12345
os.environ["LOCALDISK_DIR"]   # /localdisk/slurm-12345
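
A common pattern is to stage input into the scratch directory, compute there, and copy results back to shared storage before the job ends, since everything under $TMPDIR is deleted at job exit. A sketch with placeholder paths and programs:

#!/bin/bash
#SBATCH --partition=compute
#SBATCH --cpus-per-task=8

# Stage input onto the fast local disk (paths are placeholders)
cp /srv/home/$USER/input.dat "$TMPDIR"/
cd "$TMPDIR"

./my_program input.dat -o results.dat   # intermediate I/O stays on the local SSD

# Copy results back before the job ends -- $TMPDIR is wiped afterwards
cp results.dat /srv/home/$USER/results/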
Warning

Do not write directly to /localdisk/ outside of your job’s scratch directory — there is no automatic cleanup for files outside TMPDIR.

Warning
Reminder: no backups

The cluster storage is not backed up. Any deleted data cannot be restored. This applies to all storage paths (/srv/home, /mnt/sas, /srv/data). The cluster is meant for active computation, not as a primary archive.
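
If results are irreplaceable, copy them off the cluster yourself; for example, from your own machine (hostname and paths are placeholders):

# Run on your local machine; "cluster" stands in for the login hostname
rsync -a cluster:/srv/home/<user>/important_results/ ~/backups/important_results/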

Useful commands

# Cluster status
sinfo

# Your running jobs
squeue --me

# Your past jobs
sacct --starttime=today

# Detailed job info
scontrol show job <jobid>

# QoS limits
sacctmgr show qos format=name,maxwall,maxjobspu

# Node details
scontrol show node <nodename>