Skip to content

Guard local model runtime boundaries#1059

Merged
joelteply merged 1 commit into
canaryfrom
fix/local-model-runtime-guardrails
May 8, 2026
Merged

Guard local model runtime boundaries#1059
joelteply merged 1 commit into
canaryfrom
fix/local-model-runtime-guardrails

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

Summary

  • reject removed local llama/phi/codellama aliases at LOCAL_MODELS.mapToHuggingFace so stale DB rows and command params fail loudly instead of silently routing to old Candle/CPU paths
  • route ai/should-respond and ai/validate-response through provider=local Qwen defaults instead of provider=candle / llama3.2 defaults
  • make PersonaLifecycleManager startup allocation use SecretManager's non-empty key semantics instead of raw process.env truthiness
  • update the should-respond docs/types to match the Qwen/local runtime contract

Validation

  • npx vitest run tests/unit/local-model-guardrails.test.ts --reporter=verbose
  • npx tsc --noEmit --project tsconfig.json --pretty false
  • git commit precommit gate: TypeScript build passed, browser ping passed, ESLint baseline improved
  • git push prepush gate: TypeScript clean, ESLint baseline improved

Review focus

  • confirm the command defaults do not leave any persona/chat path on Candle
  • confirm SecretManager gating matches the config.env single-source-of-truth rule
  • confirm removed alias errors are the right fail-loud behavior for stale local runtime configs

@joelteply
Copy link
Copy Markdown
Contributor Author

LGTM — small focused PR (+78/-18, 9 files), each change addresses a specific concern raised in the #1058 reviews + audits.

Verified

  • ai/should-respond + ai/validate-response: provider: 'candle''local' in all 3 call sites (gating, JSON-repair retry, validate). Now consistent with Stabilize startup persona backpressure #1058's catalog/admission flip.
  • validate-response default: hardcoded 'llama3.2:3b' literal → LOCAL_MODELS.GATING constant. Removes a divergence point.
  • Constants.ts.REMOVED_LOCAL_ALIASES + assertNotRemoved: directly addresses my flag (B) from Stabilize startup persona backpressure #1058 review (legacy alias removal was breaking-by-omission). Now: explicit error names the alias, gives replacement, explains "Continuum local chat uses Qwen through Rust/llama.cpp admission only." 12-entry mapping covers the removed families (llama3*, phi3, tinyllama, smollm2, codellama).
  • Defensive coverage: assertNotRemoved is called twice — once on normalized, once on withoutSuffix. Correctly catches llama3.2:3b-instruct via the suffix-strip path. Good.
  • PersonaLifecycleManager.collectAvailableApiKeys: switches from raw process.env[key] to SecretManager.get(...) filter. Picks up Stabilize startup persona backpressure #1058's empty-string-as-undefined treatment automatically. Addresses my audit flag (B) from earlier.

Tiny

  • Test coverage: local-model-guardrails.test.ts covers 4 of 12 removed aliases. Worth expanding to all 12 (loop over REMOVED_LOCAL_ALIASES keys) so a future addition to the dict is automatically tested. Cheap to widen.
  • Test gap: no assertion that 'llama3.2:3b-instruct' (suffix variant) throws via the second assertNotRemoved call. The defensive double-check is the most subtle piece of the assert logic; a test would lock it in.

LGTM ship.

@joelteply joelteply force-pushed the fix/local-model-runtime-guardrails branch from bb58d59 to 0e722a2 Compare May 8, 2026 00:28
@joelteply joelteply merged commit 48ed439 into canary May 8, 2026
3 checks passed
@joelteply joelteply deleted the fix/local-model-runtime-guardrails branch May 8, 2026 00:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant