Python on the Cluster
Setting up Python projects with uv, conda/mamba, or system Python
Python is not centrally managed on this cluster — you are responsible for your own Python toolchain. This is by design: different projects need different Python versions and dependencies, and user-managed environments avoid conflicts between users.
This page covers three approaches, in order of recommendation:
- uv — fast, modern project and environment manager (recommended for most use cases)
- Conda / Micromamba — heavier but useful when you need non-Python dependencies bundled together
- System Python — last resort, limited to the OS-provided version
All of these tools install into your home directory and do not require admin privileges.
Compute nodes do not have internet access. Always create environments and install packages on the head node, then run your code on compute nodes. Your home directory (/srv/home/<user>) is shared via NFS, so environments are available on all nodes.
uv (recommended)
uv is a fast Python package and project manager. It handles Python version management, virtual environments, dependency resolution, and lockfiles — all in one tool. It is significantly faster than pip and does not require a pre-installed Python.
Installing uv
curl -LsSf https://astral.sh/uv/install.sh | sh
Then restart your shell or run:
source ~/.cargo/env
Verify it works:
uv --version
Starting a new project
uv init creates a project with a pyproject.toml, which is the standard way to define Python project metadata and dependencies:
mkdir ~/my_project && cd ~/my_project
uv init
This creates:
- pyproject.toml — project configuration and dependencies
- .python-version — the Python version for this project
- hello.py — a placeholder script (you can delete this)
Specifying a Python version
If your project requires a specific Python version, specify it when initializing:
uv init --python 3.12
Or change it later:
uv python pin 3.12
uv automatically downloads and manages Python interpreters — you don't need to install Python yourself. The interpreters are cached in ~/.local/share/uv/python/ and shared across projects that use the same version.
You can see which Python versions are available and installed:
# List available versions
uv python list
# See what's installed locally
uv python list --only-installed
Adding dependencies
uv add pandas numpy scikit-learn
This resolves dependencies, installs them into a .venv/ virtual environment, and writes a uv.lock lockfile for reproducibility.
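After these commands, the dependencies are recorded in pyproject.toml. It will contain a section roughly like the following (the name and version specifiers here are illustrative, not what uv will write verbatim):

```toml
[project]
name = "my-project"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "numpy>=2.0",
    "pandas>=2.2",
    "scikit-learn>=1.5",
]
```

You can edit this file by hand, but letting uv add manage it keeps pyproject.toml and uv.lock in sync.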
To add a development-only dependency (e.g. a linter or test framework):
uv add --dev pytest ruff
Running code
The simplest way is uv run, which ensures the correct environment is active:
uv run python analysis.py
You can also activate the environment manually:
source .venv/bin/activate
python analysis.py
Batch jobs with uv
run_analysis.slurm
#!/bin/bash
#SBATCH --job-name=python_analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err
cd /srv/home/<user>/my_project
uv run python analysis.py
Create the log directory and submit the job:
mkdir -p logs
sbatch run_analysis.slurm
Reproducibility
Always commit pyproject.toml and uv.lock to version control. The lockfile pins exact versions of all dependencies, so anyone (or any node) can reproduce your environment:
# On another machine or after cloning
uv sync  # Installs exactly what's in uv.lock
Example workflow
# 1. Set up project on head node
mkdir ~/analysis && cd ~/analysis
uv init --python 3.12
uv add pandas numpy matplotlib
# 2. Write your script
cat > analysis.py << 'EOF'
import pandas as pd
import numpy as np
data = pd.read_csv("data.csv")
print(data.describe())
data.to_csv("results.csv")
EOF
# 3. Test interactively on a compute node
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
cd ~/analysis
uv run python analysis.py
exit
# 4. Submit as batch job
sbatch run_analysis.slurm
Conda / Micromamba
You may have heard of conda environments — they are widely used in data science and can bundle non-Python dependencies (C libraries, CUDA toolkits, etc.) alongside Python packages.
In practice, you almost certainly want to use micromamba rather than the full conda/Anaconda/Miniconda distribution. Micromamba is a standalone C++ reimplementation that is much faster and lighter, while being fully compatible with conda packages and environment files. Think of it as “conda but fast and without the bloat”.
| | conda (Miniconda) | micromamba |
|---|---|---|
| Install size | ~400 MB | ~5 MB |
| Speed | Slow dependency resolution | Fast (C++ solver) |
| Compatibility | conda packages + pip | Same conda packages + pip |
| Base environment | Creates one (can cause conflicts) | None (cleaner) |
When to use conda/mamba over uv
- You need packages that include compiled non-Python libraries (e.g. cudatoolkit, r-base, gdal)
- You're working with an existing environment.yml from a collaborator
- A tutorial or paper provides a conda environment file for reproducibility
For pure Python projects, uv is simpler and faster.
Installing micromamba
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
./bin/micromamba shell init -s bash -p ~/micromamba
source ~/.bashrc
For zsh users, replace -s bash with -s zsh and source ~/.zshrc instead.
Creating an environment
# Create environment with a specific Python version
micromamba create -n myenv python=3.12 pandas numpy -c conda-forge
# Activate it
micromamba activate myenv
# Install more packages
micromamba install -n myenv scikit-learn -c conda-forge
# You can also use pip inside an activated environment
pip install some-pypi-only-package
Using in batch jobs
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err
# Initialize micromamba for this shell
eval "$(micromamba shell hook --shell bash)"
micromamba activate myenv
cd /srv/home/<user>/my_project
python analysis.py
Environment files
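An environment.yml is a small YAML file listing channels and packages. The one below is a hand-written sketch matching the myenv environment created above (version pins illustrative; an exported file will also include exact builds):

```yaml
name: myenv
channels:
  - conda-forge
dependencies:
  - python=3.12
  - pandas
  - numpy
  - scikit-learn
```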
Export and recreate environments for reproducibility:
# Export
micromamba env export -n myenv > environment.yml
# Recreate from file
micromamba create -n myenv -f environment.yml
Conda/mamba environments can be large (several GB each). They live in ~/micromamba/envs/ by default. Remove environments you no longer need:
micromamba env remove -n old_env
System Python
The system provides Python 3.9 (Rocky Linux 9 default). This is not recommended for data analysis — it’s an older version and installing packages into it can cause conflicts with system tools.
If you need a standalone Python outside of uv or micromamba (e.g. for a quick one-off script), you can use the Spack-installed version:
module load spack/1.1.1
spack load python # Currently python@3.14.0
python -m venv myenv
source myenv/bin/activate
pip install pandas numpy
However, uv is almost always a better choice — it's faster, handles dependency resolution properly, and manages Python versions for you.
Tips
Test before submitting
Run your script interactively first to catch errors early:
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
cd ~/my_project
uv run python script.py
Mind the memory
Python can use significant memory with large DataFrames or when copying data between operations. Monitor usage and request appropriate resources. Use --mem=16G or more for large datasets.
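To see why plain Python objects are costly, compare a list of boxed floats with a packed double array. This is a standard-library-only sketch; exact byte counts vary by Python version and platform:

```python
import sys
from array import array

n = 100_000
as_list = [float(i) for i in range(n)]   # one ~24-byte float object per element, plus a pointer
as_array = array("d", range(n))          # packed 8-byte doubles, no per-element object

# Total bytes: list container (pointers) plus every boxed float it references
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
array_bytes = sys.getsizeof(as_array)

print(f"list : {list_bytes / 1e6:.1f} MB")
print(f"array: {array_bytes / 1e6:.1f} MB")
```

NumPy arrays and pandas numeric columns store data like the packed array case, which is one reason they are far more memory-efficient than plain Python lists for large datasets.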
Parallel Python
For multiprocessing or multithreaded workloads, request multiple cores and match your worker count:
salloc --cpus-per-task=8 --mem=16G --time=02:00:00
Then, inside your script, read the core count from Slurm's environment rather than hard-coding it:
import os
from multiprocessing import Pool

def my_function(x):  # placeholder for your per-item work
    return x * x

n_workers = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))
data = range(100)  # example input; replace with your own

if __name__ == "__main__":
    with Pool(n_workers) as pool:
        results = pool.map(my_function, data)