
fix(dev): harden docker-desktop image delivery, drop broken --no-workers (#517, #518, #519)#521

Merged
ericfitz merged 1 commit into main from fix/dev-docker-desktop-followups on Jul 4, 2026
Conversation

@ericfitz (Owner) commented Jul 4, 2026

Three follow-ups from the Docker Desktop dev-target work (#520), all in the dev-tooling layer (scripts/lib/deploy.py + dev k8s overlays). No production/Go code, no DB schema.

#517 — pre-import postgres/redis to avoid first-run cgr.dev flake

Docker Desktop's containerd pulls cgr.dev/chainguard/{postgres,redis} independently of the host Docker daemon; that first pull occasionally fails with a transient EOF, leaving pods in ErrImagePull.

  • build_and_push now docker pulls the base images on the host and imports them into the node's containerd alongside the tmi-* images.
  • Postgres/redis manifests pinned to imagePullPolicy: IfNotPresent so the imported copy is used (a :latest tag otherwise defaults to Always and re-pulls, defeating the import). Redis pin is a per-overlay kustomize patch because redis.yml is shared with k3s (which remaps redis to redis:7-alpine); postgres is pinned directly in docker-desktop/postgres.yml (applied raw by deploy.py).
  • Base-image set is DB-aware: oracle uses external ADB and deploys no Postgres pod, so only redis is imported there.
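The selection and pull-policy reasoning above can be sketched in Python. This is an illustration only, not the code in deploy.py: the names `base_images_for`, `default_pull_policy`, `POSTGRES_IMAGE` and `REDIS_IMAGE` are hypothetical, and treating every non-oracle backend as needing the Postgres image is an assumption. The pull-policy function mirrors Kubernetes' documented defaulting when a manifest sets no imagePullPolicy.

```python
from typing import List

# Image references as named in the PR description. The constant and
# function names are hypothetical; they are not taken from deploy.py.
POSTGRES_IMAGE = "cgr.dev/chainguard/postgres"
REDIS_IMAGE = "cgr.dev/chainguard/redis"


def base_images_for(db: str) -> List[str]:
    """Return the base images to pre-import for a database backend."""
    images = [REDIS_IMAGE]  # redis is imported in every case described
    if db != "oracle":
        # Assumption: any backend other than oracle runs a Postgres pod.
        # oracle uses an external ADB, so no Postgres image is needed.
        images.insert(0, POSTGRES_IMAGE)
    return images


def default_pull_policy(image: str) -> str:
    """Kubernetes' default imagePullPolicy when the manifest omits it."""
    if "@" in image:
        return "IfNotPresent"  # digest-pinned reference
    # Only look for a tag after the last '/', so a registry port such as
    # localhost:5000 is not mistaken for a tag.
    last = image.rsplit("/", 1)[-1]
    tag = last.split(":", 1)[1] if ":" in last else "latest"
    return "Always" if tag == "latest" else "IfNotPresent"
```

Under this defaulting, an untagged or `:latest` reference resolves to `Always`, which is why an explicit `IfNotPresent` pin is needed for the imported copy to be used.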

#518 — remove the broken --no-workers path

The flag applied raw leaf manifests (image: localhost:5000/tmi-*:dev) that only worked against the retired kind local registry; on docker-desktop/k3s they fail with ErrImagePull. No make target passed the flag, so it was dead code reachable only by manual invocation. Removed the flag, the no_workers params on start/restart/apply_overlay, and _no_workers_files.

#519 — harden import_image_to_node against a pipe hang

If the importer Popen raised before saver.stdout.close(), the parent kept the pipe's read end open and saver.wait() could deadlock once the buffer filled. The importer Popen is now wrapped so saver's stdout is released (and saver killed) on any exception before the wait.
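The teardown pattern described above can be sketched as follows. This is a minimal illustration, not the actual import_image_to_node: the function name `pipe_commands` and its parameters are hypothetical, and the real saver and importer commands are not shown in this PR.

```python
import subprocess
from typing import Sequence


def pipe_commands(saver_cmd: Sequence[str], importer_cmd: Sequence[str]) -> int:
    """Run `saver_cmd | importer_cmd` and return the importer's exit code."""
    saver = subprocess.Popen(saver_cmd, stdout=subprocess.PIPE)
    try:
        try:
            importer = subprocess.Popen(importer_cmd, stdin=saver.stdout)
        except BaseException:
            # The importer never started, so nothing will drain the pipe.
            # Release our read end and stop the saver before the wait in
            # the finally block, otherwise the saver can block forever on
            # a full pipe buffer.
            saver.stdout.close()
            saver.kill()
            raise
        # The importer holds its own copy of the read end. Drop ours so
        # the saver sees a broken pipe if the importer exits early.
        saver.stdout.close()
        return importer.wait()
    finally:
        saver.wait()  # always reap the saver
```

The inner `try` is what closes the gap: without it, a failed importer launch would skip `saver.stdout.close()` and go straight to `saver.wait()`.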

Verification

  • make test-dev-scripts — 94 pass (added tests for DB-aware base-image selection and the import teardown path; removed the --no-workers tests).
  • make lint — pass.
  • kubectl kustomize renders IfNotPresent on redis across all three docker-desktop overlays; k3s unaffected.
  • ⚠️ The runtime behavior of #517 (actually dodging the cgr.dev flake) can only be confirmed with a live make dev-up CLUSTER=docker-desktop on a fresh node.

Closes #517
Closes #518
Closes #519

🤖 Generated with Claude Code

…orkers (#517, #518, #519)

Three follow-ups from the Docker Desktop dev-target work (#520), all in the
dev-tooling layer (scripts/lib/deploy.py + dev k8s overlays).

#517 — pre-import postgres/redis base images to avoid the first-run cgr.dev
pull flake. Docker Desktop's containerd pulls cgr.dev/chainguard/{postgres,redis}
independently of the host Docker daemon, and that first pull occasionally fails
with a transient EOF, leaving the pods in ErrImagePull. build_and_push now
`docker pull`s the base images on the host and imports them into the node's
containerd alongside the tmi-* images, and the postgres/redis manifests are
pinned to imagePullPolicy: IfNotPresent so the imported copy is used (a :latest
tag otherwise defaults to Always and re-pulls, defeating the import). The
redis pin is a per-overlay kustomize patch (redis.yml is shared with k3s, which
remaps redis to redis:7-alpine); postgres is pinned directly in the
docker-desktop postgres.yml (applied raw by deploy.py). The base-image set is
db-aware: oracle uses an external ADB and deploys no Postgres pod, so only redis
is imported there.

#518 — remove the --no-workers bring-up path. It applied the raw leaf manifests
(image: localhost:5000/tmi-*:dev), which only worked against the retired kind
local registry and yields ErrImagePull on docker-desktop/k3s. No make target
passes it, so it was developer-manual-only dead/broken code. Dropped the flag
from devenv.py, the no_workers params from start/restart/apply_overlay, and the
_no_workers_files helper.

#519 — harden import_image_to_node against a Popen-raises-before-close hang. If
the importer Popen raised before saver.stdout.close() ran, the parent kept the
pipe's read end open and saver.wait() in the finally could deadlock once the
pipe buffer filled. The importer Popen is now wrapped so saver's stdout is
released (and saver killed) on any exception before the wait.

Unit tests added for the db-aware base-image selection and the import teardown
path; the --no-workers tests were removed. make test-dev-scripts (94) and
make lint pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
@ericfitz ericfitz merged commit a9938a8 into main Jul 4, 2026
12 checks passed
@ericfitz ericfitz deleted the fix/dev-docker-desktop-followups branch July 4, 2026 17:44