fix(docker): link work-queue workspace deps in harness image #42

Merged: pdettori merged 1 commit into main from fix/harness-dockerfile-work-queue-deps on Jul 1, 2026
pdettori commented Jul 1, 2026

Problem

The published ghcr.io/kagenti/serverless-harness image crashes on startup:

ERR_MODULE_NOT_FOUND: Cannot find package 'redis' imported from /app/packages/work-queue/src/queue.ts

server.ts imports @sh/work-queue at top level, but the redis package is never linked for that workspace package.

Root cause

The Dockerfile's dependency-cache COPY block is meant to copy every workspace's package.json before pnpm install --frozen-lockfile runs (for layer caching), but the list omitted packages/work-queue:

COPY packages/session-backend/package.json ./packages/session-backend/
COPY packages/k8s-sandbox/package.json ./packages/k8s-sandbox/
COPY packages/knative-server/package.json ./packages/knative-server/
COPY harness/package.json ./harness/

So pnpm install ran without work-queue in the tree and never created packages/work-queue/node_modules. Inspecting the published image confirms this: packages/session-backend/node_modules/redis is linked, but packages/work-queue/node_modules does not exist (even though redis@6.0.0 is present in the pnpm store).

Fix

Add the missing COPY packages/work-queue/package.json ./packages/work-queue/ line.
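
To illustrate the failure mode and the fix, the sketch below checks a Dockerfile's COPY lines against a list of workspace directories and reports any directory whose package.json is never copied. It is a hypothetical guard, not part of this PR: the function name `findMissingManifestCopies`, the regular expression, and the hard-coded workspace list are assumptions made for the example, and the real workspace may contain other packages.

```typescript
// Hypothetical helper (not in the repository): returns the workspace
// directories whose package.json is never COPY'd by the given Dockerfile text.
// Assumption: manifests are copied with plain `COPY <dir>/package.json <dest>`
// lines; COPY lines with flags (e.g. --from=...) are not recognised.
function findMissingManifestCopies(
  dockerfile: string,
  workspaceDirs: string[],
): string[] {
  const copied = new Set<string>();
  for (const line of dockerfile.split("\n")) {
    const match = /^\s*COPY\s+(\S+)\/package\.json\s+\S+/.exec(line);
    const dir = match?.[1];
    if (dir !== undefined) {
      // Normalise an optional leading "./" so both spellings compare equal.
      copied.add(dir.replace(/^\.\//, ""));
    }
  }
  return workspaceDirs.filter((dir) => !copied.has(dir));
}

// Assumed workspace layout, taken from the directories named in this PR.
const workspaceDirs = [
  "packages/session-backend",
  "packages/k8s-sandbox",
  "packages/knative-server",
  "packages/work-queue",
  "harness",
];

// The COPY block as quoted in the root-cause section (before the fix).
const buggyBlock = [
  "COPY packages/session-backend/package.json ./packages/session-backend/",
  "COPY packages/k8s-sandbox/package.json ./packages/k8s-sandbox/",
  "COPY packages/knative-server/package.json ./packages/knative-server/",
  "COPY harness/package.json ./harness/",
].join("\n");

// The same block with the line this PR adds.
const fixedBlock =
  buggyBlock +
  "\nCOPY packages/work-queue/package.json ./packages/work-queue/";

console.log(findMissingManifestCopies(buggyBlock, workspaceDirs)); // [ 'packages/work-queue' ]
console.log(findMissingManifestCopies(fixedBlock, workspaceDirs)); // []
```

Run against the pre-fix block, the check flags packages/work-queue; with the added COPY line it reports nothing. A check along these lines could run in CI, with the directory list derived from the workspace configuration rather than hard-coded.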

Verification

Built the fixed image in-cluster on OpenShift 4.20.8 and deployed it: the harness starts cleanly and ksvc/serverless-harness reaches Ready (previously it was in CrashLoopBackOff with the module-not-found error). The existing build.yaml CI workflow republishes the corrected image on merge.

Found while implementing #41 (OpenShift setup script).

Assisted-By: Claude Code

The dependency-cache COPY block listed the session-backend, k8s-sandbox,
knative-server and harness package.json files but omitted
packages/work-queue. `pnpm install --frozen-lockfile` therefore ran
before work-queue's package.json was present in the image, so its
node_modules (including the `redis` package) were never linked. Since
server.ts imports @sh/work-queue at top level, the published image
crashed on startup:

  ERR_MODULE_NOT_FOUND: Cannot find package 'redis' imported from
  /app/packages/work-queue/src/queue.ts

Copy work-queue's package.json alongside the others so pnpm links its
dependencies.

Verified by building the fixed image in-cluster on OpenShift 4.20 and
deploying it: the harness starts and ksvc/serverless-harness reaches
Ready (previously CrashLoopBackOff).

Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
Signed-off-by: Paolo Dettori <paolo.dettori@example.com>

pdettori merged commit a461dfc into main on Jul 1, 2026
9 checks passed
pdettori deleted the fix/harness-dockerfile-work-queue-deps branch on Jul 1, 2026, 17:07
pdettori pushed a commit that referenced this pull request on Jul 1, 2026

Adds deploy/knative/setup-ocp.sh, the OpenShift-native sibling of
setup-kind.sh (issue #41), plus a shared OCP kustomize overlay, a
pre-baked sandbox image, docs, and CI to publish the sandbox image.

Base bring-up on OpenShift 4.20+:
- OpenShift Serverless Operator (OLM Subscription) + KnativeServing CR with
  the PVC/securityContext feature flags and autoscaler tuning set in the CR
  spec (the operator reverts direct ConfigMap patches).
- Redis, sandbox, leaf-work PVC, LLM secret and the harness Knative Service
  applied via deploy/knative/overlays/ocp; OCP tweaks are kustomize patches,
  base YAMLs stay shared with Kind.
- Sandbox image pre-baked (deploy/knative/sandbox.Dockerfile, sets USER
  65532) and built in-cluster against the internal registry; also published
  to GHCR by build.yaml alongside the harness image.
- Harness SA granted the nonroot-v2 SCC so its explicit non-root UID is
  admitted (the published image declares no USER) - issue #41 item #4b.
- Ingress via the auto-created OpenShift Route (no Kourier port-forward).
- Idempotent; --dry-run/--help and --image/--namespace/--skip-sandbox-build
  flags.

KEDA (async leaf) and the optional Redis Enterprise Operator are deferred
follow-ups (--skip-keda is the default).

Verified end-to-end on OpenShift 4.20.8: operator install, KnativeServing
Ready, SCC grant, in-cluster sandbox build, PVC bind (gp3-csi RWO), ksvc
Ready, /health 200 over the Route, and an idempotent re-run. A full /turn
inference needs LLM-gateway egress, which is environment-specific.

Depends on #42 (harness image fix) for a runnable default image.

Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
Signed-off-by: Paolo Dettori <paolo.dettori@example.com>