Python on the Cluster

Setting up Python projects with uv, conda/mamba, or system Python

Modified: 2026-03-13

Python is not centrally managed on this cluster — you are responsible for your own Python toolchain. This is by design: different projects need different Python versions and dependencies, and user-managed environments avoid conflicts between users.

This page covers three approaches, in order of recommendation:

  1. uv — fast, modern project and environment manager (recommended for most use cases)
  2. Conda / Micromamba — heavier but useful when you need non-Python dependencies bundled together
  3. System Python — last resort, limited to the OS-provided version

All of these tools install into your home directory and do not require admin privileges.

Important: Install packages on the head node

Compute nodes do not have internet access. Always create environments and install packages on the head node, then run your code on compute nodes. Your home directory (/srv/home/<user>) is shared via NFS, so environments are available on all nodes.

uv (recommended)

uv is a fast Python package and project manager. It handles Python version management, virtual environments, dependency resolution, and lockfiles — all in one tool. It is significantly faster than pip and does not require a pre-installed Python.

Installing uv

curl -LsSf https://astral.sh/uv/install.sh | sh

Then restart your shell, or source the environment file the installer reports (recent uv releases use ~/.local/bin/env; older ones used ~/.cargo/env):

source ~/.local/bin/env

Verify it works:

uv --version

Starting a new project

uv init creates a project with a pyproject.toml, which is the standard way to define Python project metadata and dependencies:

mkdir ~/my_project && cd ~/my_project
uv init

This creates:

  • pyproject.toml — project configuration and dependencies
  • .python-version — the Python version for this project
  • hello.py — a placeholder script (you can delete this)
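
For reference, a freshly initialized pyproject.toml looks roughly like this; the exact fields and values vary with your uv version and project name, so treat it as illustrative:

[project]
name = "my-project"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = []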

Specifying a Python version

If your project requires a specific Python version, specify it when initializing:

uv init --python 3.12

Or change it later:

uv python pin 3.12

uv automatically downloads and manages Python interpreters — you don’t need to install Python yourself. The interpreters are cached in ~/.local/share/uv/python/ and shared across projects that use the same version.

You can see which Python versions are available and installed:

# List available versions
uv python list

# See what's installed locally
uv python list --only-installed
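
If you prefer, you can also pre-install an interpreter explicitly rather than letting uv fetch it the first time a project needs it:

uv python install 3.12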

Adding dependencies

uv add pandas numpy scikit-learn

This resolves dependencies, installs them into a .venv/ virtual environment, and writes a uv.lock lockfile for reproducibility.
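Version constraints work too; quote them so the shell does not interpret the comparison operators (the versions here are just examples):

uv add "pandas>=2.2" "numpy<3"
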

To add a development-only dependency (e.g. a linter or test framework):

uv add --dev pytest ruff

Running code

The simplest way is uv run, which ensures the correct environment is active:

uv run python analysis.py

You can also activate the environment manually:

source .venv/bin/activate
python analysis.py

Batch jobs with uv

run_analysis.slurm

#!/bin/bash
#SBATCH --job-name=python_analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

cd /srv/home/<user>/my_project
uv run python analysis.py

Create the logs directory (Slurm will not create the output directory for you) and submit from the project directory:

mkdir -p logs
sbatch run_analysis.slurm
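
Because compute nodes have no internet access, make sure the environment is already built before the job starts. Running uv sync on the head node materializes .venv from the lockfile; inside the job script you can additionally pass --frozen and --offline so uv never tries to resolve or download anything (both are standard uv flags, but check against your uv version if unsure):

# On the head node, before submitting
uv sync

# Inside the job script, as a stricter alternative to plain `uv run`
uv run --frozen --offline python analysis.py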

Reproducibility

Always commit pyproject.toml and uv.lock to version control. The lockfile pins exact versions of all dependencies, so anyone (or any node) can reproduce your environment:

# On another machine or after cloning
uv sync   # Installs exactly what's in uv.lock

Example workflow

# 1. Set up project on head node
mkdir ~/analysis && cd ~/analysis
uv init --python 3.12
uv add pandas numpy matplotlib

# 2. Write your script
cat > analysis.py << 'EOF'
import pandas as pd
import numpy as np

data = pd.read_csv("data.csv")
print(data.describe())
data.to_csv("results.csv")
EOF

# 3. Test interactively on a compute node
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
cd ~/analysis
uv run python analysis.py
exit

# 4. Submit as batch job
sbatch run_analysis.slurm

Conda / Micromamba

You may have heard of conda environments — they are widely used in data science and can bundle non-Python dependencies (C libraries, CUDA toolkits, etc.) alongside Python packages.

In practice, you almost certainly want to use micromamba rather than the full conda/Anaconda/Miniconda distribution. Micromamba is a standalone C++ reimplementation that is much faster and lighter, while being fully compatible with conda packages and environment files. Think of it as “conda but fast and without the bloat”.

                     conda (Miniconda)                    micromamba
Install size         ~400 MB                              ~5 MB
Speed                Slow dependency resolution           Fast (C++ solver)
Compatibility        conda packages + pip                 Same conda packages + pip
Base environment     Creates one (can cause conflicts)    None (cleaner)

When to use conda/mamba over uv

  • You need packages that include compiled non-Python libraries (e.g. cudatoolkit, r-base, gdal)
  • You’re working with an existing environment.yml from a collaborator
  • A tutorial or paper provides a conda environment file for reproducibility

For pure Python projects, uv is simpler and faster.

Installing micromamba

curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
./bin/micromamba shell init -s bash -p ~/micromamba
source ~/.bashrc

For zsh users, replace -s bash with -s zsh and source ~/.zshrc instead.
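
Check that the install worked:

micromamba --version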

Creating an environment

# Create environment with a specific Python version
micromamba create -n myenv python=3.12 pandas numpy -c conda-forge

# Activate it
micromamba activate myenv

# Install more packages
micromamba install -n myenv scikit-learn -c conda-forge

# You can also use pip inside an activated environment
pip install some-pypi-only-package
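
To see which environments exist and to leave the active one when you are done:

# List all environments
micromamba env list

# Deactivate the current environment
micromamba deactivate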

Using in batch jobs

#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

# Initialize micromamba for this shell
eval "$(micromamba shell hook --shell bash)"
micromamba activate myenv

cd /srv/home/<user>/my_project
python analysis.py
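
As with the uv example, create the logs directory on the head node before submitting. The filename conda_job.slurm below is just an example name for the script above:

mkdir -p logs
sbatch conda_job.slurm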

Environment files

Export and recreate environments for reproducibility:

# Export
micromamba env export -n myenv > environment.yml

# Recreate from file
micromamba create -n myenv -f environment.yml

Warning: Watch your disk space

Conda/mamba environments can be large (several GB each). They live in ~/micromamba/envs/ by default. Remove environments you no longer need:

micromamba env remove -n old_env
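
The package cache (under ~/micromamba/pkgs/ with the root prefix used above) also grows over time and can usually be cleared safely; the flags mirror conda's clean command:

micromamba clean --all --yes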

System Python

The system provides Python 3.9 (Rocky Linux 9 default). This is not recommended for data analysis — it’s an older version and installing packages into it can cause conflicts with system tools.

If you need a standalone Python outside of uv or micromamba (e.g. for a quick one-off script), you can use the Spack-installed version:

module load spack/1.1.1
spack load python   # Currently python@3.14.0

python -m venv myenv
source myenv/bin/activate
pip install pandas numpy

However, uv is almost always a better choice — it’s faster, handles dependency resolution properly, and manages Python versions for you.

Tips

Test before submitting

Run your script interactively first to catch errors early:

salloc --cpus-per-task=2 --mem=4G --time=00:30:00
cd ~/my_project
uv run python script.py

Mind the memory

Python can use significant memory with large DataFrames or when copying data between operations. Monitor usage and request appropriate resources. Use --mem=16G or more for large datasets.
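
To see how much memory a finished job actually used (assuming Slurm job accounting is enabled on this cluster), sacct can report the peak resident set size:

sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed,State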

Parallel Python

For multiprocessing or multithreaded workloads, request multiple cores and match your worker count:

salloc --cpus-per-task=8 --mem=16G --time=02:00:00

Then read the allocated core count from Slurm's environment instead of hard-coding it:

import os
from multiprocessing import Pool

def my_function(x):
    # Placeholder for your per-item computation
    return x * x

data = range(100)  # placeholder input

# Match the worker count to the cores Slurm allocated to this job
n_workers = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))

with Pool(n_workers) as pool:
    results = pool.map(my_function, data)
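
Some libraries parallelize with threads rather than processes (NumPy's BLAS backend, for example). Those are typically controlled through environment variables in your job script rather than a Pool; setting them to the allocated core count is a common pattern, though which variable matters depends on the BLAS backend in use:

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OPENBLAS_NUM_THREADS=$SLURM_CPUS_PER_TASK
uv run python analysis.py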