# Python on the Cluster

*Environment management with uv*

Python works well on the cluster, especially with modern environment management tools. We recommend uv for its speed and simplicity.
## Installing uv

uv is a fast Python package manager and environment tool. Install it in your home directory:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Add it to your PATH (or log out and back in):

```bash
source ~/.cargo/env
```

## Creating a project
```bash
# Create a new project
mkdir my_project && cd my_project

# Initialize with uv
uv init

# Add dependencies
uv add pandas numpy scikit-learn
```

This creates:

- `pyproject.toml` – project configuration and dependencies
- `.venv/` – the virtual environment
- `uv.lock` – a lockfile for reproducible installs
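For reference, a freshly initialized project's `pyproject.toml` looks roughly like this (the exact contents depend on your uv version, and uv normally records version constraints rather than the bare names shown here):

```toml
[project]
name = "my-project"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "numpy",
    "pandas",
    "scikit-learn",
]
```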
## Running Python

```bash
# Run Python in the project environment
uv run python script.py

# Or activate the environment manually
source .venv/bin/activate
python script.py
```

## Batch jobs with Python
**run_analysis.slurm**

```bash
#!/bin/bash
#SBATCH --job-name=python_analysis
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --output=logs/%x_%j.out

cd /home/<user>/my_project
uv run python analysis.py
```

Since compute nodes don’t have internet access, create your environment and install packages on the head node before submitting batch jobs.
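Because installs must happen on the head node, a quick pre-flight check that every dependency is importable from the project environment can save a failed batch job. A minimal sketch (the module names are examples; adjust them to whatever your batch script imports):

```python
import importlib.util

def missing_packages(required):
    """Return the subset of `required` that is not installed in this environment."""
    return [m for m in required if importlib.util.find_spec(m) is None]

# Adjust to your project's imports, then run with: uv run python check_deps.py
print(missing_packages(["pandas", "numpy"]))
```

An empty list means you are safe to submit; anything else must be installed on the head node first.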
## Example workflow

```bash
# On the head node
mkdir ~/analysis && cd ~/analysis
uv init
uv add pandas numpy matplotlib

# Create your script
cat > analysis.py << 'EOF'
import pandas as pd
import numpy as np

data = pd.read_csv("data.csv")
print(data.describe())
EOF

# Test interactively
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
uv run python analysis.py
exit

# Submit as a batch job
sbatch run_analysis.slurm
```

## Using system Python
If you don’t need uv, you can use the system Python:

```bash
module avail python   # See available versions
module load python/3.11
python -m venv myenv
source myenv/bin/activate
pip install pandas numpy
```

However, uv is faster and handles dependencies better.
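Whichever route you choose, it is easy to confirm from inside Python which interpreter and environment you are actually using, which helps when a job picks up the wrong one:

```python
import sys

# Path of the running interpreter; inside an activated venv (or under
# `uv run`) this points into the environment's bin/ directory.
print(sys.executable)

# True when running inside a virtual environment
print(sys.prefix != sys.base_prefix)
```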
## Conda / Micromamba
For complex environments with non-Python dependencies (e.g., CUDA libraries for ML), you might prefer conda or micromamba.
Install micromamba
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
./bin/micromamba shell init -s bash -p ~/micromamba
source ~/.bashrcCreate environment
```bash
micromamba create -n myenv python=3.11 pandas numpy
micromamba activate myenv
```

Conda environments can be large. Be mindful of disk space in your home directory.
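To see how much space an environment actually takes, you can sum its file sizes. A small standard-library sketch (the path in the comment is an example for a default micromamba prefix; adjust to yours):

```python
from pathlib import Path

def dir_size_mb(path):
    """Total size in MB of all regular files under `path`."""
    root = Path(path).expanduser()
    return sum(f.stat().st_size for f in root.rglob("*") if f.is_file()) / 1e6

# e.g. print(f"{dir_size_mb('~/micromamba/envs/myenv'):.0f} MB")
```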
## Tips

### Lock your dependencies

Always commit `uv.lock` or `requirements.txt` to version control. This ensures reproducible installs.
### Test before submitting

Run your script interactively first to catch errors:

```bash
salloc --cpus-per-task=2 --mem=4G --time=00:30:00
uv run python script.py
```

### Mind the memory
Python can use significant memory with large DataFrames. Monitor usage and request appropriate resources.
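To get a rough sense of peak memory before picking a `--mem` value, the standard-library `tracemalloc` module can report allocations made through Python's allocator (allocations by C extensions that bypass it may not be counted, so treat the number as a lower bound):

```python
import tracemalloc

tracemalloc.start()

# Stand-in for your real workload, e.g. loading a large DataFrame.
table = [[float(i * j) for j in range(100)] for i in range(1000)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"peak Python memory: {peak / 1e6:.1f} MB")
```

If the reported peak is close to your requested `--mem`, ask for more headroom.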