Batch Jobs
Submit scripts that run unattended
Batch jobs let you submit work and disconnect. The job runs when resources are available, and you can check results later. This is ideal for long-running analyses, overnight jobs, or running many jobs in parallel.
Anatomy of a batch script
A batch script is a shell script with special #SBATCH directives that tell Slurm what resources you need.
```bash
#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=06:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

# Run your script
Rscript analysis.R
```

Save this as my_job.slurm (or .sh – the extension doesn't matter).
Submitting a batch job
Load any required modules on the head node before submitting. Slurm inherits your current environment (including PATH) by default, so software you load before sbatch is available inside the job.
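A quick pre-flight check can catch a missing module load before you submit. A minimal sketch, using `bash` as a stand-in for whatever tool your job actually needs (on the cluster you would check e.g. `Rscript`):

```shell
# Verify the tool the job needs is on PATH before submitting.
# "bash" is a stand-in here; substitute the command your job runs.
tool=bash
if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool found on PATH"
else
    echo "$tool missing: run module load first" >&2
fi
```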
```bash
# Create logs directory if it doesn't exist
mkdir -p logs

# Load R (this sets up PATH — inherited by the job)
module load R/4.5.3

# Submit the job
sbatch my_job.slurm
```

You'll get a job ID:

```
Submitted batch job 12345
```
You can now disconnect – the job will run when resources are available.
SBATCH directives reference
| Directive | Example | Meaning |
|---|---|---|
| `--job-name` | `--job-name=analysis` | Name shown in `squeue` |
| `--cpus-per-task` | `--cpus-per-task=4` | Number of CPU cores |
| `--mem` | `--mem=8G` | Memory limit |
| `--time` | `--time=06:00:00` | Max runtime (HH:MM:SS) |
| `--output` | `--output=logs/%x_%j.out` | Where to write stdout |
| `--error` | `--error=logs/%x_%j.err` | Where to write stderr |
| `--partition` | `--partition=gpu` | Which partition to use |
| `--qos` | `--qos=medium` | Quality of service |
Output file patterns
Special patterns in `--output` and `--error`:

| Pattern | Expands to |
|---|---|
| `%j` | Job ID |
| `%x` | Job name |
| `%N` | Node name |
| `%a` | Array task ID |

Example: `--output=logs/%x_%j.out` becomes `logs/my_analysis_12345.out`
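The expansion can be emulated in plain shell to preview what file name a given pattern produces (the job name and ID below are hypothetical):

```shell
# Emulate Slurm's %x/%j expansion for --output=logs/%x_%j.out
job_name="my_analysis"
job_id=12345
pattern="logs/%x_%j.out"
outfile=$(printf '%s\n' "$pattern" | sed "s/%x/$job_name/; s/%j/$job_id/")
echo "$outfile"   # logs/my_analysis_12345.out
```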
Slurm environment variables
Inside a running job, Slurm sets environment variables that describe the allocation. These are useful in scripts and R code to adapt to the allocated resources automatically:
| Variable | Example value | Description |
|---|---|---|
| `SLURM_JOB_ID` | `12345` | Unique job ID |
| `SLURM_JOB_NAME` | `my_analysis` | Job name (from `--job-name`) |
| `SLURM_CPUS_PER_TASK` | `4` | Number of CPUs allocated to this task |
| `SLURM_MEM_PER_NODE` | `8192` | Memory allocated in MB (when using `--mem`) |
| `SLURM_NTASKS` | `1` | Number of tasks (from `--ntasks`) |
| `SLURM_NODELIST` | `node03` | Node(s) the job is running on |
| `SLURM_JOB_PARTITION` | `compute` | Partition the job is in |
| `SLURM_JOB_QOS` | `medium` | QoS the job is running under |
| `SLURM_ARRAY_TASK_ID` | `7` | Current index in an array job |
| `SLURM_ARRAY_JOB_ID` | `12340` | Parent job ID of an array job |
| `SLURM_SUBMIT_DIR` | `/srv/home/user/project` | Directory where `sbatch` was called |
| `TMPDIR` | `/localdisk/slurm-12345` | Fast local scratch directory for the job |
The most useful one is `SLURM_CPUS_PER_TASK` — use it to set thread counts so your code automatically adapts to the allocation:

```r
# In R: match parallelism to allocated CPUs
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))

# Works with any parallel framework
parallel::mclapply(data, my_fun, mc.cores = n_cores)
future::plan(future::multisession, workers = n_cores)
mirai::daemons(n = n_cores)
ranger::ranger(y ~ ., data = df, num.threads = n_cores)
```

```bash
# In bash: use for any tool that takes a thread count
Rscript --vanilla my_script.R --cores=$SLURM_CPUS_PER_TASK
```

The cluster task prolog sets `OMP_NUM_THREADS` (and related BLAS thread variables) to match `SLURM_CPUS_PER_TASK`. Libraries that use OpenMP or BLAS (linear algebra, matrix operations) automatically use the right number of threads — no action needed.
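The same fallback idea works on the shell side: when you test a script outside Slurm, `SLURM_CPUS_PER_TASK` is unset, so give it a default. A minimal sketch:

```shell
# Default to 1 core when SLURM_CPUS_PER_TASK is unset (e.g. head-node testing)
n_cores=${SLURM_CPUS_PER_TASK:-1}
echo "Using $n_cores core(s)"
```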
Example: R analysis script
analysis.R:

```r
library(data.table)

# Your analysis code
dt <- fread("data/input.csv")
result <- dt[, .(mean_value = mean(value)), by = group]
fwrite(result, "output/results.csv")
message("Analysis complete!")
```

run_analysis.slurm:

```bash
#!/bin/bash
#SBATCH --job-name=analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=04:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

Rscript analysis.R
```

Submit with:

```bash
module load R/4.5.3
sbatch run_analysis.slurm
```

Monitoring batch jobs
```bash
# Check status
squeue --me

# Watch status (updates every 2 seconds)
watch squeue --me

# Check output while running
tail -f logs/analysis_12345.out

# After completion, check what happened
sacct -j 12345 --format=JobID,State,ExitCode,Elapsed,MaxRSS
```

Array jobs
Run the same script many times with different inputs – perfect for simulations, cross-validation, or processing many files.
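One common setup is a task-ID-to-input mapping: each array task picks its own file from a list. A sketch with hypothetical file names; inside a job, `SLURM_ARRAY_TASK_ID` is set by Slurm, and the `:-1` default only matters when testing outside Slurm:

```shell
# Pick the Nth input file (1-based) using the array task ID
# Note: set -- overwrites the script's positional parameters
set -- data/sample_a.csv data/sample_b.csv data/sample_c.csv
task_id=${SLURM_ARRAY_TASK_ID:-1}
shift $((task_id - 1))
input=$1
echo "Processing $input"
```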
```bash
#!/bin/bash
#SBATCH --job-name=simulation
#SBATCH --array=1-100
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00
#SBATCH --output=logs/%x_%a.out

# SLURM_ARRAY_TASK_ID contains the current task number (1-100)
Rscript simulation.R $SLURM_ARRAY_TASK_ID
```

In your R script:
```r
args <- commandArgs(trailingOnly = TRUE)
task_id <- as.integer(args[1])

# Use task_id to vary your analysis
set.seed(task_id)
# ... run simulation ...
```

Array job options

```bash
#SBATCH --array=1-100      # Tasks 1 through 100
#SBATCH --array=1,3,5,7    # Specific tasks
#SBATCH --array=1-100%10   # Max 10 running at once
```

Passing arguments to scripts
You can pass arguments to your batch script:
process.slurm:

```bash
#!/bin/bash
#SBATCH --job-name=process
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out

Rscript process.R "$1" "$2"
```

```bash
module load R/4.5.3
sbatch process.slurm input.csv output.csv
```

Job dependencies
Run jobs in sequence – job B waits for job A:
```bash
# Submit first job
JOB1=$(sbatch --parsable job1.slurm)

# Submit second job, depends on first
sbatch --dependency=afterok:$JOB1 job2.slurm
```

Dependency types:

- `afterok:jobid` – Run after job succeeds
- `afterany:jobid` – Run after job finishes (success or fail)
- `afternotok:jobid` – Run after job fails
Common issues
Job output not appearing
Make sure the output directory exists before submitting:
```bash
mkdir -p logs
sbatch my_job.slurm
```

Job failed immediately

Check the error log:

```bash
cat logs/my_job_12345.err
```

Common causes:

- Script not found (wrong path)
- File or directory not found (output directory doesn't exist — create `logs/` first if using `--output=logs/...`)
- R or other software not found — make sure you ran `module load` before `sbatch`
R packages not found
Remember: install packages on the head node (which has internet), then they're available in batch jobs:

```bash
# On head node
module load R/4.5.3
R -e "install.packages('data.table')"

# Then submit your job
sbatch my_job.slurm
```

Tips
Test interactively first
Before submitting a long batch job, test your script in an interactive session:
```bash
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
module load R/4.5.3
Rscript my_script.R   # Make sure it works
exit
sbatch my_job.slurm   # Now submit for real
```

Use descriptive job names

Makes it easier to track jobs in `squeue`:

```bash
#SBATCH --job-name=simulation_v2
```

Log everything
Write output files with timestamps or job IDs so you can trace results back to specific runs.
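For example, stamping file names with the date and job ID makes any result traceable to a run. A sketch; `SLURM_JOB_ID` is unset outside a job, so a placeholder is used:

```shell
# Build a traceable output name: date plus job ID (or "manual" outside Slurm)
stamp="$(date +%Y%m%d)_${SLURM_JOB_ID:-manual}"
outfile="results_${stamp}.csv"
echo "$outfile"
```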