Tips & Tools

Useful tools and practices for working on the cluster

Modified

2026-01-28

The cluster comes with a number of modern command-line tools pre-installed on the head node. No modules or special setup needed – they’re ready to use.

Text editors

If you need to quickly edit a file on the cluster (a script, a config file, your ~/.bashrc), these terminal-based editors are available.

nano

The simplest editor. If you’ve never used a terminal editor before, start here.

nano my_script.R

Commands are shown at the bottom of the screen. The ^ symbol means Ctrl:

Action       Keys
Save         Ctrl+O, then Enter
Exit         Ctrl+X
Cut line     Ctrl+K
Paste        Ctrl+U
Search       Ctrl+W
Go to line   Ctrl+_

micro

A more comfortable terminal editor with familiar keybindings (Ctrl+S to save, Ctrl+Q to quit, Ctrl+C/V for copy/paste). Supports syntax highlighting, mouse interaction, and multiple tabs.

micro my_script.R

Action       Keys
Save         Ctrl+S
Quit         Ctrl+Q
Copy         Ctrl+C
Paste        Ctrl+V
Cut line     Ctrl+K
Undo         Ctrl+Z
Find         Ctrl+F
Go to line   Ctrl+L

Tip

If you’re used to graphical editors, micro will feel more natural than nano. For anything more involved, consider using Positron or VS Code remotely.

File search and navigation

fd – find files by name

fd is a modern replacement for the traditional find command. It’s faster, has simpler syntax, and by default skips hidden files and anything excluded by .gitignore.

# Find files by name (searches recursively from current directory)
fd analysis
# my_project/analysis.R
# my_project/results/analysis_output.csv

# Find only R files
fd --extension R

# Find files matching a pattern
fd 'sim_[0-9]+\.csv'

# Find files in a specific directory
fd --extension R /srv/home/burk/projects

# Include hidden files (ignored by default)
fd --hidden .envrc

The traditional equivalent would be find . -name '*.R'; fd does the same with just fd -e R.
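
If a script also has to run on machines where fd may not be installed, a small wrapper can fall back to find. This is only a sketch: the function name find_r_files is made up for this example, and the two branches are not strictly identical (fd skips hidden and ignored files, find does not).

```shell
# find_r_files [DIR]: list R scripts below DIR (hypothetical helper, not part of the cluster setup).
find_r_files() {
  dir="${1:-.}"
  if command -v fd >/dev/null 2>&1; then
    # fd: match any file name ('.'), keep only the R extension, search below $dir
    fd --extension R . "$dir"
  else
    # Fallback: the traditional find equivalent
    find "$dir" -type f -name '*.R'
  fi
}

find_r_files .
```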

rg (ripgrep) – search file contents

rg searches through file contents. It’s a modern replacement for grep – much faster, labels each match with its file name and line number, and respects .gitignore.

# Search for a pattern in all files (rg patterns are regular expressions)
rg "read.csv"
# analysis.R:3: dt <- read.csv("data/input.csv")
# process.R:12: raw <- read.csv(args[1])

# Search only in R files
rg "library" --type r

# Search case-insensitively
rg -i "anova"

# Show 2 lines of context around matches
rg -C 2 "model <-"

# Count matches per file
rg -c "TODO"

# List files that contain a pattern (without showing matches)
rg -l "data.table"

# Search for a whole word (not substring)
rg -w "dt"

# Search with a regex
rg 'mclapply\(.*mc\.cores'

The traditional equivalent would be grep -r "read.csv" . (the trailing dot is the current directory); rg is faster and produces cleaner output.
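
One detail worth knowing: rg, like grep, treats the pattern as a regular expression, so the dot in "read.csv" matches any character and would also hit read_csv. The -F (--fixed-strings) flag searches for the literal text instead. A minimal sketch, using a made-up helper name search_literal and a grep fallback for machines without rg:

```shell
# search_literal TEXT [DIR]: list files that contain TEXT exactly as typed.
# (hypothetical helper for illustration only)
search_literal() {
  text="$1"
  dir="${2:-.}"
  if command -v rg >/dev/null 2>&1; then
    # -F: literal match, -l: print only the names of matching files
    rg --fixed-strings --files-with-matches -- "$text" "$dir"
  else
    # Same idea with plain grep: recursive, file names only, literal match
    grep -r -l -F -- "$text" "$dir"
  fi
}
```

For example, search_literal "read.csv" ~/my_project lists files that call read.csv but not those that only call read_csv.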

Disk usage

diskus – fast directory size

diskus quickly shows the total size of a directory. It’s a faster alternative to du -sh.

# Check how much space a directory uses
diskus ~/my_project
# 2.34 GB

# Check your home directory size
diskus ~

To find what’s taking up space, combine with du for a breakdown:

# Largest subdirectories (the first line is the total for ~ itself)
du -h --max-depth=1 ~ | sort -hr | head -10
Note

Remember to move inactive projects to /mnt/sas to keep /srv/home lean.

tmux – persistent terminal sessions

When you SSH into the cluster, your session is tied to your connection. If your network drops or you close your laptop, any running interactive session is lost. tmux solves this by running a persistent terminal session on the head node that survives disconnects.

This is especially useful for interactive Slurm sessions – you can start a salloc session inside tmux, detach, and come back later.

Basic workflow

# Start a new tmux session (give it a name)
tmux new -s analysis

# Inside tmux, do your work
salloc --cpus-per-task=4 --mem=8G --time=04:00:00
module load R/4.5.2
R
# ... work ...

# Detach from tmux (session keeps running): Ctrl+b, then d

# You can now disconnect from SSH entirely.
# Later, reconnect via SSH and reattach:
tmux attach -t analysis

Essential commands

Action                            Keys
Detach from session               Ctrl+b, then d
New window                        Ctrl+b, then c
Next window                       Ctrl+b, then n
Previous window                   Ctrl+b, then p
Split into top and bottom panes   Ctrl+b, then "
Split into left and right panes   Ctrl+b, then %
Switch pane                       Ctrl+b, then arrow key

# List running sessions
tmux ls

# Attach to a session
tmux attach -t analysis

# Kill a session
tmux kill-session -t analysis

Tip: Naming sessions

Always name your tmux sessions (tmux new -s name) so you can easily find and reattach to them. Without a name, tmux assigns numbers (0, 1, …), which are easy to mix up.
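
To avoid remembering whether a session already exists, tmux new-session accepts -A, which attaches to the named session if it is already running and creates it otherwise. A small sketch for ~/.bashrc; the function name tm and the default session name "main" are made up for this example:

```shell
# tm [NAME]: attach to the tmux session NAME, creating it first if needed.
# (hypothetical helper; "main" is an arbitrary default)
tm() {
  tmux new-session -A -s "${1:-main}"
}
```

With this in place, tm analysis does the right thing both on the first login and after a reconnect.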

direnv – per-project environments

direnv automatically loads and unloads environment variables when you enter or leave a directory. This is useful for loading project-specific modules or settings without cluttering your ~/.bashrc.

Setup (one-time)

Add this to your ~/.bashrc:

module load spack/1.1.1
module load direnv
eval "$(direnv hook bash)"

Usage

Create an .envrc file in any project directory:

# ~/my_project/.envrc
module load R/4.5.2

The first time, direnv will ask you to allow the file:

cd ~/my_project
# direnv: error .envrc is blocked. Run `direnv allow` to approve its content.
direnv allow
# direnv: loading ~/my_project/.envrc
# direnv: export +PATH ...

From now on, R is loaded automatically when you cd into the project and unloaded when you leave.
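
Since .envrc is an ordinary bash script, it can also set plain environment variables for the project. The fragment below is a sketch: the chosen variables and the .r-lib directory are illustrative assumptions, not cluster defaults, and it assumes direnv evaluates the file from the directory that contains it, so $PWD is the project root.

```shell
# ~/my_project/.envrc (illustrative values, adjust to your project)

# Keep R packages for this project in a project-local library (hypothetical path)
export R_LIBS_USER="$PWD/.r-lib"

# Limit implicit multithreading in OpenMP-backed code
export OMP_NUM_THREADS=1
```

After every edit to .envrc, direnv blocks the file again until you rerun direnv allow.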

File transfer

Transferring files between your local machine and the cluster is a common task. We recommend rsync for command-line transfers – it’s efficient, resumable, and handles large datasets well.

rsync basics

rsync synchronizes files between locations. It only transfers what’s changed, making it ideal for updating projects.

Upload to the cluster:

# Copy a directory to your home folder
rsync -avP ~/my_project/ <user>@bips-cluster:/srv/home/<user>/my_project/

# Copy specific files
rsync -avP data/*.csv <user>@bips-cluster:/srv/home/<user>/project/data/

Download from the cluster:

# Download results to your local machine
rsync -avP <user>@bips-cluster:/srv/home/<user>/project/results/ ~/Downloads/results/

# Download a single file
rsync -avP <user>@bips-cluster:/srv/home/<user>/project/output.csv .

Common flags:

Flag        Purpose
-a          Archive mode (recursive; preserves permissions, timestamps, symlinks)
-v          Verbose output
-P          Show progress and keep partially transferred files so interrupted transfers can resume
-z          Compress during transfer (useful for text files over slow connections)
--dry-run   Preview what would be transferred without actually copying
--exclude   Skip files matching a pattern (e.g., --exclude='*.tmp')
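
The flags can be tried out safely on throwaway local directories before pointing rsync at the cluster. The following sketch uses temporary directories and made-up file names; nothing in it touches real data:

```shell
# Set up a scratch source and destination (hypothetical file names)
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo "keep" > "$demo/src/results.csv"
echo "skip" > "$demo/src/scratch.tmp"

# Preview only: lists what would be copied, but copies nothing
rsync -av --dry-run --exclude='*.tmp' "$demo/src/" "$demo/dst/"

# Real run with the same filter: results.csv is copied, scratch.tmp is skipped
rsync -a --exclude='*.tmp' "$demo/src/" "$demo/dst/"
ls "$demo/dst"
```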

Tip: Trailing slashes matter

In rsync, a trailing slash on the source has meaning:

  • rsync -av my_project/ remote:my_project/ copies the contents of my_project into my_project
  • rsync -av my_project remote:my_project/ copies the directory itself, creating my_project/my_project

When in doubt, use --dry-run first to see what would happen.
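
The two cases can be reproduced locally to see the difference. This is a minimal sketch with temporary directories; the directory and file names are placeholders:

```shell
# Scratch area with one source directory and two empty destinations
demo=$(mktemp -d)
mkdir -p "$demo/my_project" "$demo/dest_a" "$demo/dest_b"
echo "hello" > "$demo/my_project/notes.txt"

# Trailing slash on the source: the contents land directly in dest_a/
rsync -a "$demo/my_project/" "$demo/dest_a/"

# No trailing slash: the directory itself is created inside dest_b/
rsync -a "$demo/my_project" "$demo/dest_b/"

ls "$demo/dest_a"   # notes.txt
ls "$demo/dest_b"   # my_project
```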

Where to put large data

Before transferring large datasets, consider where they should live:

  • Active project data: /srv/home/<user>/ (fast NVMe storage)
  • Large datasets, archives, inactive projects: /mnt/sas/<user>/ (high capacity, slower)

See Storage for details on the different storage tiers.

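For very large transfers to /mnt/sas, consider running rsync inside a tmux session on the machine where you start the transfer, so it keeps running if you close that terminal or lose your connection to that machine. If the transfer itself is interrupted, rerunning the same command picks up where it left off: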

tmux new -s transfer
rsync -avP /local/big_dataset/ <user>@bips-cluster:/mnt/sas/<user>/big_dataset/
# Ctrl+b, d to detach

GUI applications (SFTP clients)

If you prefer a graphical interface for file transfers, these SFTP clients work well:

Application   Platform                Notes
FileZilla     Windows, macOS, Linux   Popular, full-featured, supports SFTP
WinSCP        Windows                 Integrates with PuTTY, includes basic editor
Cyberduck     macOS, Windows          Clean interface, integrates with Finder/Explorer
Transmit      macOS                   Polished, fast (commercial)

To connect, use these settings:

  • Protocol: SFTP (SSH File Transfer Protocol)
  • Host: bips-cluster (or the full hostname)
  • Port: 22
  • Username: your cluster username
  • Authentication: your SSH key or password
Note

The same storage guidelines apply – use /srv/home/<user> for active work and /mnt/sas/<user> for large or archival data.