feat(dev): replace standalone-kind dev path with Docker Desktop Kubernetes (CLUSTER=docker-desktop default) #520
Merged
Replace the standalone-kind dev path with Docker Desktop's kind-provisioned Kubernetes (context docker-desktop): registry-free image delivery via ctr import, port-forward endpoint, in-cluster Postgres, chainguard redis (4KB pages). e2e-platform stays on standalone kind (needs Calico/NetworkPolicy + ephemeral clusters). Sequenced after the k3s target (#516). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
… | ctr import) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
…ageclass) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
- Rename apply_k3s_postgres -> apply_incluster_postgres(cluster_target)
so it resolves the right overlay's postgres.yml (k3s or docker-desktop).
- Rename teardown_k3s_namespace -> teardown_namespace and generalize its
docstring to cover both no-own-cluster targets.
- Extend server port-forward gating in wait_and_forward and restart from
cluster_target == 'k3s' to cluster_target in ('k3s', 'docker-desktop').
- Add docker-desktop branch to start(): skip registry setup, import images
via build_and_push, bring up in-cluster Postgres via apply_incluster_postgres.
- Add docker-desktop branch to restart(): same no-registry guard shape.
- Generalize cmd_nuke guard in devenv.py from k3s-only to both no-own-cluster
targets; calls teardown_namespace and redeploys via deploy.start.
- Update test_server_port_forward_is_k3s_only to assert tuple-form gating;
add test_server_port_forward_gated_for_docker_desktop_too.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
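The tuple-form gating and the per-target overlay lookup described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/lib/deploy.py: the names `NO_OWN_CLUSTER_TARGETS`, `needs_server_port_forward`, and `incluster_postgres_manifest` are hypothetical, and the overlay root is assumed from the `deployments/k8s/dev/<cluster>/` layout mentioned elsewhere in this PR.

```python
from pathlib import Path

# Targets that reuse an existing cluster instead of creating their own.
# (Hypothetical constant name; the PR only states the tuple membership test.)
NO_OWN_CLUSTER_TARGETS = ("k3s", "docker-desktop")

# Assumed overlay root, based on the deployments/k8s/dev/<cluster>/ layout.
DEV_OVERLAY_ROOT = Path("deployments/k8s/dev")


def needs_server_port_forward(cluster_target: str) -> bool:
    """Return True when the server is reached through a port-forward."""
    return cluster_target in NO_OWN_CLUSTER_TARGETS


def incluster_postgres_manifest(cluster_target: str) -> Path:
    """Resolve the postgres.yml that belongs to the given target's overlay."""
    if cluster_target not in NO_OWN_CLUSTER_TARGETS:
        raise ValueError(f"no in-cluster Postgres overlay for {cluster_target!r}")
    return DEV_OVERLAY_ROOT / cluster_target / "postgres.yml"
```

Keeping the membership test in one tuple means a future no-own-cluster target only needs to be added in one place.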
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
… the default)
- Delete deployments/k8s/dev/kind-cluster.yml (dev kind config, not e2e)
- scripts/lib/cluster.py: remove all kind lifecycle helpers (ensure_registry, is_registry_running, cluster_exists, _kind_node_containers, start_stopped_nodes, connect_registry_to_kind) and kind-only constants (CLUSTER_NAME, REGISTRY_CONTAINER, REGISTRY_IMAGE, REGISTRY_PORT, LOCAL_REGISTRY, KIND_CONFIG); up()/down() now handle only k3s and docker-desktop (raises ValueError on unknown target); registry_for() and expected_context() likewise simplified and made explicit.
- scripts/lib/deploy.py: remove kind registry block from start()/restart(), remove REGISTRY_CONTAINER stop from teardown(), update HOST_PORT/NODE_PORT comment block, defaults updated from "kind" to "docker-desktop".
- scripts/devenv.py: drop _uses_host_db(), _db_profile(), database import, cmd_db and "db" verb (host Postgres container no longer in use); drop "kind" from --cluster choices; simplify cmd_nuke (always namespace-scoped for k3s/docker-desktop).
- scripts/lib/devstatus.py: replace kind/registry/host-db status rows with a single kube-context row; remove cluster and database imports.
- All tests updated: kind-specific tests removed, kind defaults updated to docker-desktop, new tests assert kind raises ValueError.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
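A rough sketch of what "raises ValueError on unknown target" and the narrowed `--cluster` choices could look like is given below. The helper names `require_supported` and `build_parser` are hypothetical and the real cluster.py/devenv.py may be structured differently; only the two supported targets, the default, and the ValueError behavior come from the commit message.

```python
import argparse

SUPPORTED_TARGETS = ("docker-desktop", "k3s")
DEFAULT_TARGET = "docker-desktop"


def require_supported(cluster_target: str = DEFAULT_TARGET) -> str:
    """Return the target unchanged, or raise ValueError for anything else."""
    if cluster_target not in SUPPORTED_TARGETS:
        raise ValueError(
            f"unknown cluster target {cluster_target!r}; "
            f"expected one of {', '.join(SUPPORTED_TARGETS)}"
        )
    return cluster_target


def build_parser() -> argparse.ArgumentParser:
    """CLI sketch: 'kind' is no longer an accepted --cluster choice."""
    parser = argparse.ArgumentParser(prog="devenv")
    parser.add_argument(
        "--cluster",
        choices=SUPPORTED_TARGETS,
        default=DEFAULT_TARGET,
        help="dev cluster target",
    )
    return parser
```

Validating in the library as well as in the parser means callers that bypass the CLI (tests, other scripts) still get an explicit error for a retired target.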
…ng after kind retirement
- Makefile: delete dev-db-up and dev-db-down targets (called now-removed devenv.py db verb) and remove them from .PHONY
- scripts/help.py: remove dev-db-up/dev-db-down help entries; update stale "kind" wording to match current docker-desktop/k3s reality
- scripts/lib/deploy.py: change seven cluster_target default args from "kind" to "docker-desktop" for consistency with cluster.py
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
…kind overlays; fix stale comments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
ericfitz added a commit that referenced this pull request on Jul 4, 2026
…orkers (#517, #518, #519)

Three follow-ups from the Docker Desktop dev-target work (#520), all in the dev-tooling layer (scripts/lib/deploy.py + dev k8s overlays).

#517: pre-import postgres/redis base images to avoid the first-run cgr.dev pull flake. Docker Desktop's containerd pulls cgr.dev/chainguard/{postgres,redis} independently of the host Docker daemon, and that first pull occasionally fails with a transient EOF, leaving the pods in ErrImagePull. build_and_push now `docker pull`s the base images on the host and imports them into the node's containerd alongside the tmi-* images, and the postgres/redis manifests are pinned to imagePullPolicy: IfNotPresent so the imported copy is used (a :latest tag otherwise defaults to Always and re-pulls, defeating the import). The redis pin is a per-overlay kustomize patch (redis.yml is shared with k3s, which remaps redis to redis:7-alpine); postgres is pinned directly in the docker-desktop postgres.yml (applied raw by deploy.py). The base-image set is db-aware: oracle uses an external ADB and deploys no Postgres pod, so only redis is imported there.

#518: remove the --no-workers bring-up path. It applied the raw leaf manifests (image: localhost:5000/tmi-*:dev), which only worked against the retired kind local registry and yields ErrImagePull on docker-desktop/k3s. No make target passes it, so it was developer-manual-only dead/broken code. Dropped the flag from devenv.py, the no_workers params from start/restart/apply_overlay, and the _no_workers_files helper.

#519: harden import_image_to_node against a Popen-raises-before-close hang. If the importer Popen raised before saver.stdout.close() ran, the parent kept the pipe's read end open and saver.wait() in the finally could deadlock once the pipe buffer filled. The importer Popen is now wrapped so saver's stdout is released (and saver killed) on any exception before the wait.

Unit tests added for the db-aware base-image selection and the import teardown path; the --no-workers tests were removed. make test-dev-scripts (94) and make lint pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
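The #519 hardening can be illustrated with a generic two-process pipeline. The sketch below is an assumption about the shape of the fix, not the actual import_image_to_node body: `pipe_commands` is a hypothetical name, and the argv lists stand in for the real saver and importer commands.

```python
import subprocess
import sys
from typing import Sequence


def pipe_commands(saver_argv: Sequence[str], importer_argv: Sequence[str]) -> int:
    """Run `saver | importer` and return the importer's exit code."""
    saver = subprocess.Popen(saver_argv, stdout=subprocess.PIPE)
    try:
        try:
            importer = subprocess.Popen(importer_argv, stdin=saver.stdout)
        except BaseException:
            # The importer never started, so nothing will drain the pipe.
            # Stop the saver and release the parent's read end so the wait
            # in the finally block cannot hang on a full pipe buffer.
            saver.kill()
            saver.stdout.close()
            raise
        # The importer holds its own copy of the read end; drop the parent's
        # copy so the saver sees a broken pipe if the importer exits early.
        saver.stdout.close()
        return importer.wait()
    finally:
        # Always reap the saver, on both the success and the failure path.
        saver.wait()


if __name__ == "__main__":
    # Harmless stand-in commands; a real caller would pass docker argv lists.
    code = pipe_commands(
        [sys.executable, "-c", "print('payload')"],
        [sys.executable, "-c", "import sys; sys.stdin.read()"],
    )
    print("importer exit code:", code)
```

The function returns only the importer's exit code; a stricter variant might also inspect `saver.returncode` after the wait.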
ericfitz added a commit that referenced this pull request on Jul 4, 2026
…orkers (#517, #518, #519) (#521)

Three follow-ups from the Docker Desktop dev-target work (#520), all in the dev-tooling layer (scripts/lib/deploy.py + dev k8s overlays).

#517: pre-import postgres/redis base images to avoid the first-run cgr.dev pull flake. Docker Desktop's containerd pulls cgr.dev/chainguard/{postgres,redis} independently of the host Docker daemon, and that first pull occasionally fails with a transient EOF, leaving the pods in ErrImagePull. build_and_push now `docker pull`s the base images on the host and imports them into the node's containerd alongside the tmi-* images, and the postgres/redis manifests are pinned to imagePullPolicy: IfNotPresent so the imported copy is used (a :latest tag otherwise defaults to Always and re-pulls, defeating the import). The redis pin is a per-overlay kustomize patch (redis.yml is shared with k3s, which remaps redis to redis:7-alpine); postgres is pinned directly in the docker-desktop postgres.yml (applied raw by deploy.py). The base-image set is db-aware: oracle uses an external ADB and deploys no Postgres pod, so only redis is imported there.

#518: remove the --no-workers bring-up path. It applied the raw leaf manifests (image: localhost:5000/tmi-*:dev), which only worked against the retired kind local registry and yields ErrImagePull on docker-desktop/k3s. No make target passes it, so it was developer-manual-only dead/broken code. Dropped the flag from devenv.py, the no_workers params from start/restart/apply_overlay, and the _no_workers_files helper.

#519: harden import_image_to_node against a Popen-raises-before-close hang. If the importer Popen raised before saver.stdout.close() ran, the parent kept the pipe's read end open and saver.wait() in the finally could deadlock once the pipe buffer filled. The importer Popen is now wrapped so saver's stdout is released (and saver killed) on any exception before the wait.

Unit tests added for the db-aware base-image selection and the import teardown path; the --no-workers tests were removed. make test-dev-scripts (94) and make lint pass.
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
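The db-aware base-image set from #517, and the reason the IfNotPresent pin is needed, can be sketched as below. The function names are hypothetical and the image references are written without tags because the commit message does not state them. `default_pull_policy` mirrors the Kubernetes defaulting rule (Always for `:latest` or an omitted tag, IfNotPresent otherwise) rather than anything in deploy.py.

```python
POSTGRES_IMAGE = "cgr.dev/chainguard/postgres"
REDIS_IMAGE = "cgr.dev/chainguard/redis"


def base_images_for(db: str) -> list[str]:
    """Base images to pre-import for the given DB mode (hypothetical helper)."""
    if db == "oracle":
        # Oracle runs as an external ADB, so no Postgres pod is deployed.
        return [REDIS_IMAGE]
    return [POSTGRES_IMAGE, REDIS_IMAGE]


def default_pull_policy(image: str) -> str:
    """Pull policy Kubernetes applies when imagePullPolicy is omitted."""
    if "@" in image:
        # Pinned by digest.
        return "IfNotPresent"
    # Only look for a tag in the last path segment, so a registry port such
    # as localhost:5000 is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    _, sep, tag = last_segment.partition(":")
    if not sep or tag == "latest":
        return "Always"
    return "IfNotPresent"
```

Under the default rule an untagged or `:latest` image would be re-pulled even after a successful import, which is why the manifests pin the policy explicitly.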
ericfitz added a commit that referenced this pull request on Jul 4, 2026
The dev environment was migrated to Docker Desktop Kubernetes (#520/#521), but CLAUDE.md still described a kind cluster with the database and Redis running as containers "external to the cluster." That stale description caused a misdiagnosis when the in-cluster PostgreSQL PVC came up empty: the real data was assumed lost when it was actually stranded in the old host Docker volume the new topology no longer mounts.

Update both occurrences to reflect reality:
- Default CLUSTER=docker-desktop (k3s also supported)
- server, PostgreSQL, Redis, and NATS all run in-cluster in the tmi-platform namespace (Deployments + StatefulSets)
- PostgreSQL data persists in a Kubernetes PVC (data-postgres-0), NOT a host Docker volume; re-provisioning the PVC starts from an empty DB
- With DB=oracle the database is an external managed Oracle ADB
- orchestration is via scripts/devenv.py; manifests under deployments/k8s/dev/<cluster>/
Claude-Session: https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Replaces the standalone-kind local dev deployment path with Docker Desktop's kind-provisioned Kubernetes (`CLUSTER=docker-desktop`, the new default). `CLUSTER=k3s` is retained; the separate e2e-platform kind usage is untouched. Motivation: fewer moving parts, lower resource use, reuse the DD cluster the developer already runs.

Design spec: `docs/superpowers/specs/2026-07-03-docker-desktop-dev-target-design.md` · Plan: `docs/superpowers/plans/2026-07-03-docker-desktop-dev-target.md`

What changed
- Cluster: `docker-desktop` context, never create/destroy it. `dev-nuke` is namespace-scoped.
- Image delivery: `docker save <img> | docker exec -i desktop-control-plane ctr -n k8s.io images import -`. No registry container, no containerd mirror. (The standalone `kind` CLI can't address the DD-managed cluster, and DD LoadBalancer isn't `localhost` here; verified.)
- Endpoint: port-forward to `localhost:8080`/`:6379` (reuses the k3s port-forward, gated to no-own-cluster targets).
- Postgres: in-cluster, on the `hostpath` storageclass; DB-URL host rewritten to the `postgres` Service.
- `DB=oracle` deploys the `server-oracle` overlay against the external Oracle ADB (no in-cluster Postgres), preserving kind's behavior; `CLUSTER=k3s DB=oracle` now fails fast (out of scope) instead of hanging.
- Redis: chainguard redis (the k3s `redis:7-alpine` remap does not apply).
- Removed: `deployments/k8s/dev/kind-cluster.yml`, the `tmi-dev-registry` container + mirror, `extraPortMappings`, the kind lifecycle in `cluster.py`, the host-Postgres dev path, and orphaned kind-era overlays. `Makefile` default is now `CLUSTER ?= docker-desktop`.

Kept for e2e only: the `kind` CLI, `deployments/k8s/platform/kind-cluster.yml`, `e2e-platform-*`, `test/e2e/platform/`.

Testing
- `make test-dev-scripts`: 94 unit tests pass (cluster-aware helpers, `--cluster docker-desktop` parser + default, `ctr import` command builder, overlay routing incl. oracle, port-forward gating).
- `make dev-up CLUSTER=docker-desktop` → HTTP 200 (7 pods Running); `make dev-up CLUSTER=k3s` → HTTP 200 (non-regression).
- Oracle overlay verified by render + code routing; live oracle bring-up requires ADB creds (deferred).

New docs
`deployments/k8s/dev/docker-desktop/README.md`: one-time prereq (enable DD Kubernetes with the kind provisioner), the image-import mechanism, and why e2e stays on kind.

Follow-ups
#517 (pre-import postgres/redis for first-run reliability), #518 (`--no-workers` path), #519 (`import_image_to_node` hardening).

🤖 Generated with Claude Code
https://claude.ai/code/session_01Kk9GxWS9EpazjbwBKfMpUX
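For reference, the registry-free image delivery listed under "What changed" can be expressed as a small command builder. This is a hedged sketch: `image_import_commands` and `render_pipeline` are hypothetical names, and `tmi-server:dev` in the usage lines is an illustrative image tag. Only the command text and the `desktop-control-plane` node name come from the PR description.

```python
import shlex


def image_import_commands(
    image: str, node: str = "desktop-control-plane"
) -> tuple[list[str], list[str]]:
    """Build argv lists for `docker save <img> | docker exec ... ctr import -`."""
    save = ["docker", "save", image]
    load = [
        "docker", "exec", "-i", node,
        "ctr", "-n", "k8s.io", "images", "import", "-",
    ]
    return save, load


def render_pipeline(image: str, node: str = "desktop-control-plane") -> str:
    """Render the two commands as one shell pipeline string, e.g. for logging."""
    save, load = image_import_commands(image, node)
    return f"{shlex.join(save)} | {shlex.join(load)}"


if __name__ == "__main__":
    # Illustrative image name only; the PR refers to the images as tmi-*:dev.
    print(render_pipeline("tmi-server:dev"))
```

Returning argv lists rather than a shell string lets the caller connect the two processes with `subprocess.Popen` pipes and avoids shell quoting issues; the rendered string is meant for display.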