Batch Jobs

Submit scripts that run unattended

Modified: 2026-03-23

Batch jobs let you submit work and disconnect. The job runs when resources are available, and you can check results later. This is ideal for long-running analyses, overnight jobs, or running many jobs in parallel.

Anatomy of a batch script

A batch script is a shell script with special #SBATCH directives that tell Slurm what resources you need.

#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=06:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

# Run your script
Rscript analysis.R

Save this as my_job.slurm (or .sh – the extension doesn’t matter).

Submitting a batch job

Load any required modules on the head node before submitting. Slurm inherits your current environment (including PATH) by default, so software you load before sbatch is available inside the job.

# Create logs directory if it doesn't exist
mkdir -p logs

# Load R (this sets up PATH — inherited by the job)
module load R/4.5.3

# Submit the job
sbatch my_job.slurm

You’ll get a job ID:

Submitted batch job 12345

You can now disconnect – the job will run when resources are available.

SBATCH directives reference

Directive         Example                    Meaning
--job-name        --job-name=analysis        Name shown in squeue
--cpus-per-task   --cpus-per-task=4          Number of CPU cores
--mem             --mem=8G                   Memory limit
--time            --time=06:00:00            Max runtime (HH:MM:SS)
--output          --output=logs/%x_%j.out    Where to write stdout
--error           --error=logs/%x_%j.err     Where to write stderr
--partition       --partition=gpu            Which partition
--qos             --qos=medium               Quality of service

Output file patterns

Special patterns in --output and --error:

Pattern   Expands to
%j        Job ID
%x        Job name
%N        Node name
%a        Array task ID

Example: --output=logs/%x_%j.out becomes logs/my_analysis_12345.out

Slurm environment variables

Inside a running job, Slurm sets environment variables that describe the allocation. These are useful in scripts and R code to adapt to the allocated resources automatically:

Variable              Example value            Description
SLURM_JOB_ID          12345                    Unique job ID
SLURM_JOB_NAME        my_analysis              Job name (from --job-name)
SLURM_CPUS_PER_TASK   4                        Number of CPUs allocated to this task
SLURM_MEM_PER_NODE    8192                     Memory allocated in MB (when using --mem)
SLURM_NTASKS          1                        Number of tasks (from --ntasks)
SLURM_NODELIST        node03                   Node(s) the job is running on
SLURM_JOB_PARTITION   compute                  Partition the job is in
SLURM_JOB_QOS         medium                   QoS the job is running under
SLURM_ARRAY_TASK_ID   7                        Current index in an array job
SLURM_ARRAY_JOB_ID    12340                    Parent job ID of an array job
SLURM_SUBMIT_DIR      /srv/home/user/project   Directory where sbatch was called
TMPDIR                /localdisk/slurm-12345   Fast local scratch directory for the job
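
A quick way to see what you actually got is to echo a short summary of these variables at the top of the batch script. A minimal sketch (the helper name alloc_summary is made up; the :- fallbacks make it safe to run outside a Slurm job, e.g. when testing locally):

```shell
# alloc_summary: one-line view of the current allocation (hypothetical helper);
# the ${VAR:-default} fallbacks apply when the Slurm variables are unset
alloc_summary() {
  echo "job=${SLURM_JOB_ID:-none} node=${SLURM_NODELIST:-localhost} cpus=${SLURM_CPUS_PER_TASK:-1}"
}

alloc_summary   # inside a job, prints something like: job=12345 node=node03 cpus=4
```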

The most useful one is SLURM_CPUS_PER_TASK — use it to set thread counts so your code automatically adapts to the allocation:

# In R: match parallelism to allocated CPUs
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))

# Works with any parallel framework
parallel::mclapply(data, my_fun, mc.cores = n_cores)
future::plan(future::multisession, workers = n_cores)
mirai::daemons(n = n_cores)
ranger::ranger(y ~ ., data = df, num.threads = n_cores)

# In bash: use for any tool that takes a thread count
Rscript --vanilla my_script.R --cores=$SLURM_CPUS_PER_TASK

Tip: OMP_NUM_THREADS is set automatically

The cluster task prolog sets OMP_NUM_THREADS (and related BLAS thread variables) to match SLURM_CPUS_PER_TASK. Libraries that use OpenMP or BLAS (linear algebra, matrix operations) automatically use the right number of threads — no action needed.

Example: R analysis script

analysis.R

library(data.table)

# Your analysis code
dt <- fread("data/input.csv")
result <- dt[, .(mean_value = mean(value)), by = group]
fwrite(result, "output/results.csv")

message("Analysis complete!")

run_analysis.slurm

#!/bin/bash
#SBATCH --job-name=analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=04:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

Rscript analysis.R

Submit it:

module load R/4.5.3
sbatch run_analysis.slurm

Monitoring batch jobs

# Check status
squeue --me

# Watch status (updates every 2 seconds)
watch squeue --me

# Check output while running
tail -f logs/analysis_12345.out

# After completion, check what happened
sacct -j 12345 --format=JobID,State,ExitCode,Elapsed,MaxRSS

Array jobs

Run the same script many times with different inputs – perfect for simulations, cross-validation, or processing many files.

#!/bin/bash
#SBATCH --job-name=simulation
#SBATCH --array=1-100
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00
#SBATCH --output=logs/%x_%a.out

# SLURM_ARRAY_TASK_ID contains the current task number (1-100)
Rscript simulation.R $SLURM_ARRAY_TASK_ID

In your R script:

args <- commandArgs(trailingOnly = TRUE)
task_id <- as.integer(args[1])

# Use task_id to vary your analysis
set.seed(task_id)
# ... run simulation ...

Array job options

#SBATCH --array=1-100        # Tasks 1 through 100
#SBATCH --array=1,3,5,7      # Specific tasks
#SBATCH --array=1-100%10     # Max 10 running at once
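
A common array pattern is mapping the task ID to the Nth input file instead of a seed. A sketch, assuming inputs live in a hypothetical data/ directory (nth_input and process_one.R are made-up names):

```shell
# nth_input: map a 1-based array task ID to the Nth file matching a glob
# (hypothetical helper; SLURM_ARRAY_TASK_ID is 1-based, bash arrays 0-based)
nth_input() {
  local task_id=$1; shift
  local files=("$@")                  # glob expanded at the call site
  echo "${files[$((task_id - 1))]}"   # hence the -1
}

# In the batch script:
# INPUT=$(nth_input "$SLURM_ARRAY_TASK_ID" data/*.csv)
# Rscript process_one.R "$INPUT"
```

Because the glob expands in sorted order, each task gets a stable, distinct file as long as the directory contents don't change between submission and execution.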

Passing arguments to scripts

You can pass arguments to your batch script:

process.slurm

#!/bin/bash
#SBATCH --job-name=process
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out

Rscript process.R "$1" "$2"

Submit with the arguments after the script name:

module load R/4.5.3
sbatch process.slurm input.csv output.csv

Job dependencies

Run jobs in sequence – job B waits for job A:

# Submit first job
JOB1=$(sbatch --parsable job1.slurm)

# Submit second job, depends on first
sbatch --dependency=afterok:$JOB1 job2.slurm

Dependency types:

  • afterok:jobid – Run after job succeeds
  • afterany:jobid – Run after job finishes (success or fail)
  • afternotok:jobid – Run after job fails
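
For longer pipelines, the --parsable pattern above extends naturally to a loop. A sketch (the submit_chain helper and the script names are hypothetical):

```shell
# submit_chain: submit scripts in order, each gated on the previous one
# succeeding (afterok), using the job IDs returned by sbatch --parsable
submit_chain() {
  local prev="" id script
  for script in "$@"; do
    if [ -z "$prev" ]; then
      id=$(sbatch --parsable "$script")
    else
      id=$(sbatch --parsable --dependency="afterok:$prev" "$script")
    fi
    echo "$id"
    prev=$id
  done
}

# Usage on the cluster:
# submit_chain prep.slurm analysis.slurm report.slurm
```

If any job in the chain fails, the dependent jobs never start and stay pending with reason DependencyNeverSatisfied, so check squeue if a pipeline stalls.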

Common issues

Job output not appearing

Make sure the output directory exists before submitting:

mkdir -p logs
sbatch my_job.slurm

Job failed immediately

Check the error log:

cat logs/my_job_12345.err

Common causes:

  • Script not found (wrong path)
  • File or directory not found (output directory doesn’t exist — create logs/ first if using --output=logs/...)
  • R or other software not found — make sure you ran module load before sbatch

R packages not found

Remember: install packages on the head node (which has internet), then they’re available in batch jobs:

# On head node
module load R/4.5.3
R -e "install.packages('data.table')"

# Then submit your job
sbatch my_job.slurm

Tips

Test interactively first

Before submitting a long batch job, test your script in an interactive session:

salloc --cpus-per-task=2 --mem=4G --time=00:30:00
module load R/4.5.3
Rscript my_script.R  # Make sure it works
exit
sbatch my_job.slurm  # Now submit for real

Use descriptive job names

Makes it easier to track jobs in squeue:

#SBATCH --job-name=simulation_v2

Log everything

Write output files with timestamps or job IDs so you can trace results back to specific runs.
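
For example, a result path that embeds the job ID and a timestamp might be built like this (a sketch; result_path, the output/ layout, and the "manual" fallback are invented for illustration):

```shell
# result_path: build an output file name that records which job produced it
# (hypothetical helper; "manual" is the fallback when run outside Slurm)
result_path() {
  local stamp
  stamp=$(date +%Y%m%d_%H%M%S)
  echo "output/results_${SLURM_JOB_ID:-manual}_${stamp}.csv"
}

# In the batch script:
# Rscript analysis.R --out="$(result_path)"
```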