Compose-style multi-service workflows, compiled into one inspectable Slurm job.
One allocation · one script · Slurm-native runtime
hpc-compose turns a small Compose-like YAML file into one inspectable Slurm job for multi-service HPC and research ML workflows.
Use it when you want Docker Compose-style authoring on Slurm without adding Kubernetes, a long-running control plane, or a pile of hand-written sbatch glue.
These commands work from a laptop, workstation, or login node because the new command only writes a local starter spec and the plan command is purely static:
hpc-compose new --template minimal-batch --name my-app --output compose.yaml
hpc-compose plan -f compose.yaml
hpc-compose plan --show-script -f compose.yaml
For real cluster runs, configure a cache path visible from both the Slurm submission host and the compute nodes, using x-slurm.cache_dir, hpc-compose setup --cache-dir, or the [defaults.cache] / [profiles.<name>.cache] settings. From a source checkout, you can also inspect the checked-in examples with hpc-compose plan -f examples/minimal-batch.yaml.
Expected signals:
spec is valid
service order: app
Rendered script:
Additional lines (runtime mode, cache dir, allocation geometry, per-service image state) are normal.
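The signals above can also be checked mechanically, for example in a CI step. The following is a minimal sketch, not part of hpc-compose: check_plan_signals is a hypothetical helper name, and it assumes the plan output arrives on standard output with the wording shown above, which may differ between versions.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with hpc-compose): reads plan output on
# stdin and succeeds only if the expected signal lines are present.
check_plan_signals() {
    output=$(cat)
    # Assumption: these strings match the signals listed above verbatim.
    for signal in "spec is valid" "service order:"; do
        if ! printf '%s\n' "$output" | grep -qF -- "$signal"; then
            echo "missing expected signal: $signal" >&2
            return 1
        fi
    done
    echo "plan signals ok"
}

# Usage (requires hpc-compose on PATH):
#   hpc-compose plan -f compose.yaml | check_plan_signals
```

The helper only inspects text, so it stays as static as plan itself and is safe to run off-cluster.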
Run hpc-compose up -f compose.yaml only after moving to a supported Linux Slurm submission host with the runtime backend your spec selects. From a source checkout, the local Slurm dev cluster can smoke-test host-backend specs against real local sbatch before you use a shared cluster. If a run fails, start triage with hpc-compose debug -f compose.yaml --preflight.
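The plan, submit, and triage sequence described above can be wrapped in a small script. This is a sketch under stated assumptions: plan_then_up and the HPC_COMPOSE override variable are hypothetical names introduced here, and the script assumes each subcommand exits non-zero on failure. Only the plan, up, and debug --preflight invocations come from this README.

```shell
#!/bin/sh
# Hypothetical wrapper: plan first, submit only if the plan succeeds, and
# run preflight triage if submission fails.
# HPC_COMPOSE is an illustrative override so the flow can be exercised
# with a stub instead of the real binary.
HPC_COMPOSE=${HPC_COMPOSE:-hpc-compose}

plan_then_up() {
    spec=${1:-compose.yaml}

    # Static check; safe to run anywhere.
    if ! "$HPC_COMPOSE" plan -f "$spec"; then
        echo "plan failed; not submitting" >&2
        return 1
    fi

    # Real submission; only meaningful on a supported Slurm submission host.
    if ! "$HPC_COMPOSE" up -f "$spec"; then
        echo "up failed; running preflight triage" >&2
        "$HPC_COMPOSE" debug -f "$spec" --preflight
        return 2
    fi
}

# Usage:
#   plan_then_up compose.yaml
```

Distinct return codes (1 for a rejected plan, 2 for a failed submission) let a caller tell the two failure modes apart.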
hpc-compose is intentionally narrow:
- one Slurm allocation per application
- one generated batch script you can inspect
- service startup ordering and readiness gates inside that allocation
- Slurm-native arrays, submit-time dependencies, and reusable resource profiles
- Pyxis/Enroot, Apptainer, Singularity, or host runtime backends
- finite spec smoke tests plus local dev and tmux workflows for single-host authoring
- one-off run --image ... -- <cmd> jobs and direct shell --image ... sessions
- tracked notebook sessions launching JupyterLab or VS Code on a compute node
- tracked logs, state, metrics, artifacts, cache entries, and follow-up commands
It does not aim to be a full Docker Compose runtime. Unsupported Compose features include build:, ports, custom Docker networks, deploy, and dynamic scheduler-style placement across arbitrary nodes.
The fastest path installs the most recent published release with no edits. The
script resolves the latest GitHub Release tag for you and downloads the matching
asset into ~/.local/bin by default:
curl -fsSL https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/main/install.sh | sh
For reproducible installs (recommended for shared clusters), pin a specific release tag so every run resolves the exact same asset:
RELEASE_TAG=vX.Y.Z
curl -fsSL "https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/${RELEASE_TAG}/install.sh" \
| env HPC_COMPOSE_VERSION="${RELEASE_TAG}" sh
Replace vX.Y.Z with the release tag shown on the GitHub Releases page. Fetching install.sh from main runs the moving script, but it still installs from a published releases/download/<tag>/... asset, not unreleased main; pin HPC_COMPOSE_VERSION when you need every machine to land on the same build.
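When the pinned install is scripted, a small guard can stop the vX.Y.Z placeholder from reaching the installer by accident. The sketch below is illustrative only: is_release_tag and install_pinned are hypothetical helper names, and the pattern assumes release tags have the plain v<major>.<minor>.<patch> shape, so any tag with a suffix would be rejected.

```shell
#!/bin/sh
# Hypothetical guard: accept only tags shaped like v<digits>.<digits>.<digits>.
# Assumption: published release tags follow this plain three-part form.
is_release_tag() {
    printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Runs the pinned install shown above, but only for a concrete tag.
install_pinned() {
    tag=$1
    if ! is_release_tag "$tag"; then
        echo "refusing to install: '$tag' is not a concrete release tag" >&2
        return 1
    fi
    curl -fsSL "https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/${tag}/install.sh" \
        | env HPC_COMPOSE_VERSION="$tag" sh
}

# Usage (replace with a tag from the GitHub Releases page):
#   install_pinned "$RELEASE_TAG"
```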
Other install paths:
- Linux .deb or .rpm assets from the release page
- macOS Homebrew tap: brew install NicolasSchuler/hpc-compose/hpc-compose
- source checkout for development: cargo build --release
Installer availability is not the same as full runtime support. Check the Support Matrix before assuming a platform or cluster can run submission workflows end to end.
- Published manual
- Support Matrix
- Installation
- Quickstart
- Examples
- Task Guide
- Development Workflow
- Runtime Backends
- Runbook
- Troubleshooting
- CLI Reference
- Spec Reference
You can ask any LLM agent (Claude, Codex, Copilot, Cursor) to set up hpc-compose on your cluster. Point it at the published machine-readable map first, which carries a curated doc index, a safety contract (which commands are static-safe vs. which submit Slurm jobs), and the canonical spec conventions:
- Agent entry map: docs/src/llms.txt, served at https://nicolasschuler.github.io/hpc-compose/llms.txt
- Walkthrough and copy-paste prompt: Set Up With an AI Agent
- Drop-in skill bundle: skills/hpc-compose/SKILL.md
Agents author and statically verify a spec (validate, plan --show-script, inspect) before any real run, and ask before submitting jobs.
If you try hpc-compose, open an adoption feedback issue with:
- cluster type
- workload type
- the main failure or friction point
- License: LICENSE
- Contributing: CONTRIBUTING.md
- Security: SECURITY.md
- Code of Conduct: CODE_OF_CONDUCT.md
If you use hpc-compose in research, please cite the software. GitHub also exposes the same metadata through the repository citation UI via CITATION.cff.
@software{schuler_hpc_compose_2026,
author = {Schuler, Nicolas},
title = {hpc-compose},
version = {0.1.48},
year = {2026},
publisher = {Karlsruhe Institute of Technology (KIT)},
url = {https://github.com/NicolasSchuler/hpc-compose}
}