Persist runner state across restarts (stop /tmp wipe)#1

Merged
NathanEdg merged 2 commits into main from persist-runner-state-across-restarts
May 30, 2026

@NathanEdg

Contributor

What

The runner wrote compose files and .env data into /tmp while its systemd unit ran with PrivateTmp=true. A private /tmp is a fresh tmpfs on every (re)start, so compose projects and other deploy artifacts were destroyed whenever the runner restarted. This moves all runner-owned state to a persistent location and ensures it survives restarts.

Changes

  • Compose data out of /tmp: compose_up and compose_up_stream now write to /var/lib/lockethq/compose/<project>/ instead of std::env::temp_dir(), matching the persistent convention already used by the proxy module (/var/lib/lockethq/traefik).
  • Systemd unit (runner/lockethq-runner.service): changed PrivateTmp=true to PrivateTmp=false and added StateDirectory=lockethq, so systemd creates/owns /var/lib/lockethq and never clears it on restart. ProtectSystem=full leaves /var writable, so no extra ReadWritePaths is needed.
  • "Update runner" now refreshes the unit: update_runner previously only swapped the binary and restarted. It now also re-uploads the systemd unit (same include_str! source as a fresh install) and runs daemon-reload before the restart, so unit changes reach already-installed servers.
  • Startup migration: migrate_legacy_compose_dirs() runs on boot and moves any leftover /tmp/lockethq-compose-* projects into the persistent location. It uses rename with a recursive copy + delete fallback (a tmpfs → disk rename fails with EXDEV). Fully best-effort: errors are logged and skipped so a bad entry never blocks startup.
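
For reviewers who want the shape of the migration without opening the diff, here is a minimal sketch of the rename-then-copy fallback described above. It is not the code from this PR: the function names (`copy_dir_all`, `move_dir`, `migrate_legacy_dirs`), the choice to derive the project name by stripping the `lockethq-compose-` prefix, and the rule of skipping a destination that already exists are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Prefix of the legacy project directories named in the PR description.
const LEGACY_PREFIX: &str = "lockethq-compose-";

/// Recursively copy `src` into `dst` (hypothetical helper).
fn copy_dir_all(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir_all(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Move a directory: try a cheap rename first, then fall back to
/// copy + delete. A rename across filesystems (tmpfs to disk) fails
/// with EXDEV, which is the case the fallback exists for.
fn move_dir(src: &Path, dst: &Path) -> io::Result<()> {
    if let Some(parent) = dst.parent() {
        fs::create_dir_all(parent)?;
    }
    if fs::rename(src, dst).is_ok() {
        return Ok(());
    }
    copy_dir_all(src, dst)?;
    fs::remove_dir_all(src)
}

/// Best-effort migration. Returns how many projects were moved;
/// failures are logged to stderr and skipped, never propagated.
fn migrate_legacy_dirs(legacy_root: &Path, persistent_root: &Path) -> usize {
    let entries = match fs::read_dir(legacy_root) {
        Ok(entries) => entries,
        Err(_) => return 0, // nothing to migrate, or root unreadable
    };
    let mut moved = 0;
    for entry in entries.flatten() {
        let name = entry.file_name().to_string_lossy().into_owned();
        // Assumption: the project name is whatever follows the prefix.
        let Some(project) = name.strip_prefix(LEGACY_PREFIX) else {
            continue;
        };
        if project.is_empty() || !entry.path().is_dir() {
            continue;
        }
        let dst = persistent_root.join(project);
        if dst.exists() {
            continue; // assumption: never overwrite existing state
        }
        match move_dir(&entry.path(), &dst) {
            Ok(()) => moved += 1,
            Err(err) => eprintln!("skipping {}: {err}", entry.path().display()),
        }
    }
    moved
}
```

Under these assumptions the runner would call something like `migrate_legacy_dirs(Path::new("/tmp"), Path::new("/var/lib/lockethq/compose"))` once at startup and ignore the count. Returning a count rather than a `Result` keeps the "never blocks startup" property explicit in the signature.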

Also verified (no change needed)

Detection of containers not created by LocketHQ already works: list_containers calls Docker with all: true and no label filter, so every container on the host is returned regardless of who created it, and the frontend renders them all.

Reviewer notes

  • The startup migration only helps boxes updated without an intervening restart (older PrivateTmp=true runners already wiped /tmp on restart). The durable fix for everyone is the new unit + new compose location, both delivered by the updated "Update runner" action.
  • Both crates build clean; remaining warnings (mut in docker.rs, unused arch field) are pre-existing and unrelated.

🤖 Generated with Claude Code

NathanEdg and others added 2 commits May 30, 2026 11:56
The runner stored compose files and .env data in /tmp while the systemd
unit ran with PrivateTmp=true, which gives the service a fresh tmpfs on
every (re)start. As a result, compose projects and other deploy
artifacts were destroyed whenever the runner restarted.

- Move compose projects from /tmp to /var/lib/lockethq/compose/<project>,
  matching the existing persistent convention used by the proxy module.
- Set PrivateTmp=false and add StateDirectory=lockethq in the systemd
  unit so /var/lib/lockethq is created/owned by systemd and never cleared
  on restart.
- "Update runner" now re-pushes the systemd unit and runs daemon-reload
  before restart, so unit changes reach already-installed servers.
- Add a best-effort startup migration that moves any leftover
  /tmp/lockethq-compose-* projects into the persistent location, with a
  recursive copy fallback for cross-filesystem (EXDEV) moves.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`cargo check --workspace` compiles the Tauri app crate, whose build
script resolves the `binaries/*` resource glob from tauri.conf.json.
Those runner binaries are gitignored release artifacts, so in CI the
glob matched nothing and the build script failed with "glob pattern
binaries/* ... didn't match any files" — CI had been red on main for
this reason regardless of the change under test.

Build the runner and stage it at app/src-tauri/binaries/ before the
check so the bundled resource genuinely exists. This also verifies the
runner compiles to a binary as a side effect.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@NathanEdg NathanEdg merged commit f046d0b into main May 30, 2026
2 checks passed
@NathanEdg NathanEdg deleted the persist-runner-state-across-restarts branch May 30, 2026 16:05