Using {batchtools}
Submitting Slurm jobs from R
{batchtools} is the recommended approach for multi-node parallelism on this cluster, especially for simulation studies and benchmark experiments. It submits and manages Slurm jobs directly from R — handling sbatch submission, log collection, result retrieval, and job status tracking so you work in R instead of writing shell scripts.
batchtools is robust, well-tested, and the tool most users on this cluster are already familiar with. For more complex pipelines with dependency tracking and caching, see also targets + crew.cluster, which offers more flexibility at the cost of additional complexity.
The cluster ships with a default configuration and a custom Slurm template that work out of the box. You can use them as-is or override them per project.
Quick start
```r
library(batchtools)

# Create a registry (stores jobs and results in a directory)
reg <- makeRegistry(file.dir = "my_registry", seed = 1)

# Define jobs -- here, a simple function applied to different inputs
batchMap(function(x) {
  Sys.sleep(2)
  x^2
}, x = 1:10)

# Submit all jobs to Slurm (uses cluster defaults)
submitJobs()

# Check status
getStatus()

# Wait for completion, then collect results
waitForJobs()
reduceResultsList()
```

That’s it. Behind the scenes, batchtools generates sbatch scripts from the cluster template, submits them to Slurm, and tracks everything in the registry directory.
Cluster defaults
The cluster provides default configuration files at /etc/xdg/batchtools/:
| File | Purpose |
|---|---|
| `config.R` | Sets the Slurm cluster functions, default resources, and job limits |
| `slurm_bips.tmpl` | Custom Slurm template with resource validation and QoS handling |
{batchtools} automatically picks up /etc/xdg/batchtools/config.R as a fallback when no project-level or user-level config is found. You don’t need to do anything to use it.
Default resource values
The cluster defaults are:
| Resource | Default | Meaning |
|---|---|---|
| `ncpus` | 1 | CPUs per task (`--cpus-per-task`) |
| `memory` | 1024 | Total memory in MB (`--mem`) |
| `walltime` | 21600 | Walltime in seconds (6 hours) |
| `qos` | `"medium"` | QoS level (1 day limit) |
| `partition` | `"compute"` | Slurm partition |
| `measure.memory` | `TRUE` | Track peak memory usage (rough heuristic, will likely underestimate) |
The maximum number of concurrent jobs (max.concurrent.jobs) is set to 500 as a somewhat conservative default.
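If you need a different cap, `max.concurrent.jobs` can be set in your own batchtools configuration file; a minimal sketch (the value here is illustrative):

```r
# In batchtools.conf.R or ~/.batchtools.conf.R
max.concurrent.jobs <- 200  # limit this project to 200 jobs in the queue at once
```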
Overriding defaults per submission
Pass a resources list to submitJobs() to override any default:
```r
submitJobs(resources = list(
  ncpus = 4,
  memory = 4096,
  hours = 48,
  qos = "long"
))
```

Only the resources you specify are overridden – everything else keeps its default value.
Resources reference
The Slurm template supports the following resources. Set them via submitJobs(resources = list(...)) or as default.resources in your config.
CPU and memory
| Resource | Type | Default | Slurm flag | Description |
|---|---|---|---|---|
| `ncpus` | integer | 1 | `--cpus-per-task` | Number of CPUs per task. Request as many as your code actually uses (e.g. `ncpus = 4` for `mc.cores = 4`). |
| `ntasks` | integer | 1 | `--ntasks` | Number of MPI tasks. Only set > 1 for MPI parallelism (Rmpi, pbdMPI). |
| `memory` | integer | 1024 | `--mem` | Total memory for the job in MB. Mutually exclusive with `mem_per_cpu`. |
| `mem_per_cpu` | integer | — | `--mem-per-cpu` | Memory per CPU in MB. Mutually exclusive with `memory`. |
By default, memory specifies the total memory for the job (--mem). If you prefer Slurm’s per-CPU model, use mem_per_cpu instead — but not both.
```r
# 8 GB total for the job
submitJobs(resources = list(memory = 8 * 1024))

# 2 GB per CPU (e.g. 8 GB total with ncpus = 4)
submitJobs(resources = list(mem_per_cpu = 2048, ncpus = 4))
```

ncpus vs ntasks
Most R users want ncpus for parallel computing (e.g., mclapply(mc.cores = 4) or future::plan(multicore, workers = 4)). Setting ntasks > 1 is for MPI and will trigger a warning if used.
Time specification
Specify walltime using any of these units:
| Resource | Type | Description |
|---|---|---|
| `walltime` | integer | Walltime in seconds |
| `days` | integer | Walltime in days |
| `hours` | integer | Walltime in hours |
| `minutes` | integer | Walltime in minutes |
```r
submitJobs(resources = list(hours = 6))    # 6 hours
submitJobs(resources = list(days = 3))     # 3 days
submitJobs(resources = list(minutes = 30)) # 30 minutes
```

If you don’t specify any time, the walltime defaults to the QoS limit (1 day for the default `"medium"` QoS).
If multiple time units are present (e.g. from merging default.resources with per-job resources), the largest unit takes priority: days > hours > minutes > walltime.
For example, if your config sets walltime = 21600 (6 hours) and you submit a job with days = 4, the days specification wins and the job gets 4 days.
This means you can safely override time defaults without clearing them first:
```r
# Config has: default.resources = list(walltime = 21600) # 6 hours
# Per-job override with a coarser unit just works:
submitJobs(resources = list(days = 4)) # → 4 days (days > walltime)
```

walltime in default resources
Use walltime (seconds) for the time setting in your default.resources. Since it has the lowest priority, any per-job override using minutes, hours, or days will cleanly take precedence.
QoS (Quality of Service)
| Resource | Type | Default | Description |
|---|---|---|---|
| `qos` | character | `"medium"` | QoS level. See Slurm Basics – QoS for available levels and their time limits. |
In most cases you don’t need to set qos explicitly. The template handles it automatically:
- Only walltime set (recommended): The template picks the smallest QoS that fits your walltime. For example, `hours = 3` auto-selects `"medium"`, `days = 3` auto-selects `"long"`.
- Only QoS set (or neither): The walltime is set conservatively to the QoS limit. With the default `"medium"`, that means 1 day.
- Both set: The template validates that the walltime fits within the QoS limit and errors if it doesn’t.
```r
# Template auto-selects qos = "long" (7 days covers 3 days)
submitJobs(resources = list(days = 3))

# Template auto-selects qos = "short" (1 hour covers 30 minutes)
submitJobs(resources = list(minutes = 30))
```

The `"interactive"` QoS is not allowed for batch jobs and will be rejected.
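When both a walltime and a QoS are given, the template rejects combinations that don’t fit; a sketch of what that looks like (assuming `"short"` has a 1 hour limit, as in the auto-selection example above):

```r
# Fails template validation: 3 days exceeds the "short" QoS time limit
submitJobs(resources = list(days = 3, qos = "short"))
```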
GPU
| Resource | Type | Default | Description |
|---|---|---|---|
| `gpus` | integer | — | Number of GPUs (1–2). |
```r
submitJobs(resources = list(gpus = 1, partition = "gpu"))
```

Node selection
| Resource | Type | Default | Description |
|---|---|---|---|
| `partition` | character | `"compute"` | Slurm partition (`"compute"` or `"gpu"`). |
| `nodelist` | character | — | Run on specific node(s), e.g. `"node01"` or `"node[11-12]"`. |
| `exclude` | character | — | Exclude node(s), e.g. `"node01"`. |
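For example (node names are illustrative, taken from the table above):

```r
# Keep jobs off a specific node; otherwise let Slurm choose
submitJobs(resources = list(exclude = "node01"))

# Pin jobs to a specific range of nodes
submitJobs(resources = list(nodelist = "node[11-12]"))
```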
Deadline
| Resource | Type | Default | Description |
|---|---|---|---|
| `deadline` | character | — | Slurm removes the job if it can’t start in time to finish by this deadline. Passed directly to sbatch `--deadline`. |
Slurm supports many time formats for deadlines:
```r
# Absolute date/time
submitJobs(resources = list(deadline = "2026-03-10T07:30"))

# Relative: tomorrow at 7:30
submitJobs(resources = list(deadline = "tomorrow07:30"))

# Relative: 12 hours from now
submitJobs(resources = list(deadline = "now+12hours"))

# Keywords: midnight, noon, teatime (4PM), fika (3PM)
submitJobs(resources = list(deadline = "tomorrow"))
```

See the sbatch documentation for the full list of supported formats.
Priority
| Resource | Type | Default | Description |
|---|---|---|---|
| `nice` | integer | — | Priority adjustment (-10000 to 10000). Higher values = lower priority. Negative values require elevated privileges. |
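A typical use is letting a large low-urgency batch yield to your other jobs; a sketch (the value is illustrative, anything up to 10000 works):

```r
# Lower this submission's priority relative to your other jobs
submitJobs(resources = list(nice = 5000))
```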
Other
| Resource | Type | Default | Description |
|---|---|---|---|
| `comment` | character | — | Annotation for the job, visible in squeue. Useful for identifying jobs by project. |
Customizing the configuration
{batchtools} looks for configuration files in this order (first found wins):
1. `batchtools.conf.R` in the current working directory (project-level)
2. `~/.batchtools.conf.R` in your home directory (user-level)
3. `/etc/xdg/batchtools/config.R` (cluster default)
Per-project configuration
To customize settings for a specific project, create a batchtools.conf.R in the project directory:
```r
# batchtools.conf.R -- project-level overrides

# Use the cluster template (same as default)
cluster.functions <- makeClusterFunctionsSlurm(
  "/etc/xdg/batchtools/slurm_bips.tmpl",
  array.jobs = TRUE
)

# Override defaults for this project
# Use walltime (seconds) so per-job overrides with hours/days take precedence
default.resources <- list(
  ncpus = 4,
  walltime = 43200, # 12 hours
  memory = 4096,
  qos = "long",
  partition = "compute"
)
```

Per-user configuration
For user-wide defaults, create ~/.batchtools.conf.R:
```r
# ~/.batchtools.conf.R -- user-level defaults
cluster.functions <- makeClusterFunctionsSlurm(
  "/etc/xdg/batchtools/slurm_bips.tmpl",
  array.jobs = TRUE
)

default.resources <- list(
  ncpus = 2,
  walltime = 21600, # 6 hours; per-job hours/days overrides take precedence
  memory = 2048,
  qos = "medium",
  partition = "compute"
)
```

Using a custom template
If you need to modify the Slurm template itself (e.g., to add environment setup), copy it to your project or home directory and point your config at the copy:
```sh
# Copy the template
cp /etc/xdg/batchtools/slurm_bips.tmpl ~/my_slurm.tmpl
```

```r
# In your batchtools.conf.R, point to the copy
cluster.functions <- makeClusterFunctionsSlurm(
  "~/my_slurm.tmpl",
  array.jobs = TRUE
)
```

The default template includes input validation and QoS auto-selection. Copy it rather than writing one from scratch.
Registries and experiments
{batchtools} provides two workflow patterns:
Simple registry (makeRegistry)
Best for applying a function to many inputs:
```r
reg <- makeRegistry(file.dir = "sim_registry", seed = 42)

# Map a function over inputs
batchMap(function(n, dist) {
  data <- switch(dist,
    normal = rnorm(n),
    uniform = runif(n)
  )
  mean(data)
}, n = c(100, 1000, 10000), dist = c("normal", "uniform", "normal"))

# Submit and collect
submitJobs()
waitForJobs()
reduceResultsList()
```

Experiment registry (makeExperimentRegistry)
Best for structured simulation studies with multiple problems/algorithms:
```r
library(data.table) # for data.table() in the problem designs

reg <- makeExperimentRegistry(file.dir = "experiment", seed = 42)

# Define a "problem" (data generation)
addProblem("sim_data", fun = function(n, ...) {
  list(data = rnorm(n))
})

# Define "algorithms" (methods to compare)
addAlgorithm("mean", fun = function(instance, ...) mean(instance$data))
addAlgorithm("median", fun = function(instance, ...) median(instance$data))

# Create experiment grid
addExperiments(
  prob.designs = list(sim_data = data.table(n = c(100, 1000))),
  repls = 50 # 50 replications each
)

# Submit all
submitJobs()
```

Job management
```r
# Status overview
getStatus()

# Find specific jobs
findNotSubmitted()
findRunning()
findDone()
findErrors()

# View logs of failed jobs
getLog(id = 42)

# Or for errors specifically
getErrorMessages()

# Resubmit failed jobs
submitJobs(findErrors())

# Cancel running jobs
killJobs(findRunning())

# Clean up a registry (deletes its file.dir and all results)
removeRegistry(reg = reg)
```

Common patterns
Multicore parallelism within jobs
To use parallel processing within each batchtools job, set ncpus to the number of cores you need:
```r
submitJobs(resources = list(ncpus = 4))
```

Then inside your function, match the worker count to ncpus:

```r
# With future (recommended)
future::plan(future::multicore, workers = 4)

# With mirai
mirai::daemons(n = 4)

# With parallel (base R)
parallel::mclapply(data, my_fun, mc.cores = 4)
```

Make sure the number of workers in your R code matches ncpus. Requesting `ncpus = 4` but using 8 workers will oversubscribe the allocation.
Large memory jobs
```r
submitJobs(resources = list(
  memory = 32 * 1024 # 32 GB total for the job
))
```

Long-running jobs

```r
submitJobs(resources = list(
  days = 5,
  qos = "long" # optional -- auto-selected from walltime
))
```

Annotating jobs for tracking

```r
submitJobs(resources = list(
  comment = "my_project_sim_v3"
))
```

The comment is visible in `squeue --me --format="%.18i %.9P %.30j %.8T %.10M %.9l %.6D %k"`.