Compose-style multi-service workflows, compiled into one inspectable Slurm job.
One allocation · one script · Slurm-native runtime

hpc-compose

hpc-compose turns a small Compose-like YAML file into one inspectable Slurm job for multi-service HPC and research ML workflows.

Use it when you want Docker Compose-style authoring on Slurm without adding Kubernetes, a long-running control plane, or a pile of hand-written sbatch glue.

Safe First Path

These commands are safe to run from a laptop, workstation, or login node: the new subcommand only writes a local starter spec, and the plan subcommand is purely static.

hpc-compose new --template minimal-batch --name my-app --output compose.yaml
hpc-compose plan -f compose.yaml
hpc-compose plan --show-script -f compose.yaml

For real cluster runs, configure a cache path that is visible from both the Slurm submission host and the compute nodes. Set it in one of three places: x-slurm.cache_dir, hpc-compose setup --cache-dir, or the [defaults.cache] / [profiles.<name>.cache] settings. From a source checkout, you can also inspect the checked-in examples with hpc-compose plan -f examples/minimal-batch.yaml.

Expected signals:

spec is valid
service order: app
Rendered script:

Additional lines (runtime mode, cache dir, allocation geometry, per-service image state) are normal.
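The expected signals above can also be checked mechanically. The following is a minimal sketch, not part of hpc-compose: check_plan_signals is a hypothetical helper that reads plan output on stdin and reports the first missing signal. It assumes the three signal strings appear verbatim in the output and that "Rendered script:" is printed when --show-script is passed.

```shell
#!/bin/sh
# Hypothetical helper, not an hpc-compose command.
# Reads plan output on stdin and returns 0 only if every expected
# signal string from the README is present.
check_plan_signals() {
  plan_output=$(cat)
  for signal in "spec is valid" "service order:" "Rendered script:"; do
    # -F matches the signal as a fixed string, -q suppresses output.
    if ! printf '%s\n' "$plan_output" | grep -qF -- "$signal"; then
      printf 'missing signal: %s\n' "$signal" >&2
      return 1
    fi
  done
  printf 'all expected signals present\n'
}

# Example (2>&1 because this sketch does not assume which stream
# carries the messages):
#   hpc-compose plan --show-script -f compose.yaml 2>&1 | check_plan_signals
```

Extra lines in the output do not affect the check, since it only looks for the three signals.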

Run hpc-compose up -f compose.yaml only after moving to a supported Linux Slurm submission host with the runtime backend your spec selects. From a source checkout, the local Slurm dev cluster can smoke-test host-backend specs against real local sbatch before you use a shared cluster. If a run fails, start triage with hpc-compose debug -f compose.yaml --preflight.
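One way to avoid running up on the wrong machine is to gate it on the presence of sbatch. This is a rough sketch under stated assumptions: next_step is a hypothetical helper, and finding sbatch on PATH is only a proxy for "Slurm submission host". It does not verify that the platform is supported or that the runtime backend selected by the spec is available.

```shell
#!/bin/sh
# Hypothetical guard, not an hpc-compose command.
# Prints "up" when the given submit command (default: sbatch) is on
# PATH, otherwise prints "plan" so the caller stays on the static path.
next_step() {
  submit_cmd=${1:-sbatch}
  if command -v "$submit_cmd" >/dev/null 2>&1; then
    printf 'up\n'
  else
    printf 'plan\n'
  fi
}

# Example:
#   hpc-compose "$(next_step)" -f compose.yaml
```

On a laptop without Slurm this falls back to plan, which matches the safe first path described earlier.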

Scope

hpc-compose is intentionally narrow:

  • one Slurm allocation per application
  • one generated batch script you can inspect
  • service startup ordering and readiness gates inside that allocation
  • Slurm-native arrays, submit-time dependencies, and reusable resource profiles
  • Pyxis/Enroot, Apptainer, Singularity, or host runtime backends
  • finite spec smoke tests plus local dev and tmux workflows for single-host authoring
  • one-off run --image ... -- <cmd> jobs and direct shell --image ... sessions
  • tracked notebook sessions launching JupyterLab or VS Code on a compute node
  • tracked logs, state, metrics, artifacts, cache entries, and follow-up commands

It does not aim to be a full Docker Compose runtime. Unsupported Compose features include build:, ports, custom Docker networks, deploy, and dynamic scheduler-style placement across arbitrary nodes.

Install

The fastest path installs the most recent published release with no edits. The script resolves the latest GitHub Release tag for you and downloads the matching asset into ~/.local/bin by default:

curl -fsSL https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/main/install.sh | sh

For reproducible installs (recommended for shared clusters), pin a specific release tag so every run resolves the exact same asset:

RELEASE_TAG=vX.Y.Z
curl -fsSL "https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/${RELEASE_TAG}/install.sh" \
  | env HPC_COMPOSE_VERSION="${RELEASE_TAG}" sh

Replace vX.Y.Z with a release tag shown on the GitHub Releases page. Fetching install.sh from main runs whatever version of the script is currently on that branch, but the script still installs a published releases/download/<tag>/... asset, never an unreleased build of main. Pin HPC_COMPOSE_VERSION when you need every machine to land on the same build.
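For the pinned install, a small wrapper can catch an unreplaced placeholder before anything is downloaded. This is a sketch, not part of the installer: pinned_installer_url is a hypothetical helper, and its loose shape check assumes release tags look like v followed by three dot-separated numeric parts, as the vX.Y.Z placeholder suggests.

```shell
#!/bin/sh
# Hypothetical helper, not part of install.sh.
# Prints the install.sh URL for a pinned tag, or fails when the tag
# does not loosely look like vX.Y.Z with digits filled in.
pinned_installer_url() {
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *)
      printf 'not a release tag: %s\n' "$1" >&2
      return 1
      ;;
  esac
  printf 'https://raw.githubusercontent.com/NicolasSchuler/hpc-compose/%s/install.sh\n' "$1"
}

# Example (replace vX.Y.Z with a real tag first):
#   RELEASE_TAG=vX.Y.Z
#   url=$(pinned_installer_url "$RELEASE_TAG") || exit 1
#   curl -fsSL "$url" -o install.sh    # download so the script can be read first
#   env HPC_COMPOSE_VERSION="$RELEASE_TAG" sh install.sh
```

Downloading to a file instead of piping into sh is a design choice of this sketch: it lets you read the pinned script before running it.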

Other install paths:

  • Linux .deb or .rpm assets from the release page
  • macOS Homebrew tap: brew install NicolasSchuler/hpc-compose/hpc-compose
  • source checkout for development: cargo build --release

Installer availability is not the same as full runtime support. Check the Support Matrix before assuming a platform or cluster can run submission workflows end to end.

Set Up With an AI Agent

You can ask an LLM agent (such as Claude, Codex, Copilot, or Cursor) to set up hpc-compose on your cluster. Point it at the published machine-readable map first. The map carries a curated doc index, a safety contract (which commands are static-safe and which submit Slurm jobs), and the canonical spec conventions.

Agents author and statically verify a spec (validate, plan --show-script, inspect) before any real run, and ask before submitting jobs.

Feedback

If you try hpc-compose, open an adoption feedback issue with:

  • cluster type
  • workload type
  • the main failure or friction point

Citation

If you use hpc-compose in research, please cite the software. GitHub also exposes the same metadata through the repository citation UI via CITATION.cff.

@software{schuler_hpc_compose_2026,
  author = {Schuler, Nicolas},
  title = {hpc-compose},
  version = {0.1.48},
  year = {2026},
  publisher = {Karlsruhe Institute of Technology (KIT)},
  url = {https://github.com/NicolasSchuler/hpc-compose}
}
