chore(deploy): gp.nano migration — env-driven kill switches + low-maintenance notices#22

Merged
javierdejesusda merged 14 commits into main from chore/gp-nano-migration on May 8, 2026

@javierdejesusda
Owner

Summary

  • Add ENABLE_SCHEDULER and ENABLE_TFT_FORECASTS env-driven kill switches: ENABLE_SCHEDULER lives in backend config and is wired into scheduler bootstrap; ENABLE_TFT_FORECASTS is read directly as a module constant in ml/training/config.py
  • Guard chat_service against empty OPENAI_API_KEY (raises 503 before SSE stream construction)
  • Add docker-compose.nano.yml overlay for 1vCPU/2GB hosts with memory/CPU caps
  • Add .env.production.nano.example reference file for nano deployments
  • Add SENTRY_TRACES_SAMPLE_RATE (backend) and NEXT_PUBLIC_SENTRY_TRACES_SAMPLE_RATE (frontend) env vars to make the Sentry traces sample rate configurable
  • Cap frontend Node.js heap below the 384M container limit (NODE_OPTIONS=--max-old-space-size=300)
  • Add deployment runbook (docs/deploy/gp-nano-runbook.md) with migration and revert steps
  • Frontend: add MaintenanceCard component and wrap disabled-feature pages with low-maintenance notices when kill switches are active

Test plan

  • Frontend type-check: npx tsc --noEmit — passes
  • Frontend lint: npx eslint src/ — 4 pre-existing warnings (not in our diff), no new warnings
  • Backend ruff: ruff check . — clean
  • Backend mypy: targeted syntax check on modified files — clean (full mypy has pre-existing TypeGuardedType internal crash unrelated to our changes)
  • Docker compose overlay merge: docker compose -f docker-compose.prod.yml -f docker-compose.nano.yml config — exits 0
  • CI frontend and backend jobs

Move the openai_api_key guard from stream_chat_response (async generator)
into send_message (route handler body). An HTTPException raised inside an
async generator body is caught by the except-Exception wrapper and emitted
as an SSE error event, so the HTTP status was always 200. Placing the guard
synchronously before EventSourceResponse is constructed ensures FastAPI
returns a real 503. Remove the now-redundant guard and the unused
HTTPException import from chat_service.py.
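
The difference between the two guard placements can be reproduced without a web framework. The sketch below is only an illustration under stated assumptions: `ServiceUnavailable` is a hypothetical stand-in for `HTTPException(status_code=503)`, and the function bodies are simplified versions of `stream_chat_response` and `send_message`, not the project's actual code.

```python
import asyncio
from typing import AsyncIterator


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for HTTPException(status_code=503)."""

    status_code = 503


async def stream_with_inner_guard(api_key: str) -> AsyncIterator[str]:
    # Old shape: the guard lives inside the generator body, so the broad
    # except clause swallows it and turns it into an SSE error event.
    try:
        if not api_key:
            raise ServiceUnavailable("chat disabled")
        yield "data: hello"
    except Exception as exc:
        yield f"event: error\ndata: {exc}"


async def stream_chat(api_key: str) -> AsyncIterator[str]:
    # New shape: the generator assumes the key was validated by the caller.
    try:
        yield "data: hello"
    except Exception as exc:
        yield f"event: error\ndata: {exc}"


def send_message(api_key: str) -> AsyncIterator[str]:
    # The guard runs in the handler body, before any stream object exists,
    # so the exception propagates to the caller instead of into the stream.
    if not api_key:
        raise ServiceUnavailable("chat disabled")
    return stream_chat(api_key)


async def collect(stream: AsyncIterator[str]) -> list[str]:
    # Drain an async stream into a list for inspection.
    return [chunk async for chunk in stream]
```

With an empty key, `stream_with_inner_guard` produces a normal stream whose only chunk is an error event, while `send_message` raises before a stream is created, which is the point at which a framework can still choose the response status.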

The TFT kill-switch is already controlled by ENABLE_TFT_FORECASTS read
directly in ml/training/config.py. The Settings field added in the prior
commit was never read and could mislead future maintainers into thinking
Settings is the authoritative control point for TFT. Remove the field and
update the comment to describe only the scheduler flag that remains.
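
A module-level kill switch read straight from the environment might look like the sketch below. The helper name `env_flag`, the accepted truthy spellings, and the default of `True` are assumptions made for illustration; the PR does not show how ml/training/config.py parses the value.

```python
import os

# Assumed set of spellings treated as "enabled"; not taken from the PR.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = True) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        # Unset or empty: fall back to the default (assumed to be enabled).
        return default
    return raw.strip().lower() in _TRUTHY


# Evaluated once at import time, so the module constant is the single
# control point and no Settings field is needed.
ENABLE_TFT_FORECASTS = env_flag("ENABLE_TFT_FORECASTS")
```

Because the constant is evaluated at import time, changing the variable requires a process restart, which fits a deploy-time kill switch.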

Drop NODE_OPTIONS max-old-space-size from 384 to 300, keeping the
container memory limit at 384M. This reserves ~84 MiB for native
Node memory (OpenSSL, buffers, addons) and prevents guaranteed OOM
when the heap fills to the container ceiling.

Replace all three hardcoded tracesSampleRate: 0.1 literals with reads
from environment variables. Frontend (server, edge, client) reads
NEXT_PUBLIC_SENTRY_TRACES_SAMPLE_RATE; backend reads
SENTRY_TRACES_SAMPLE_RATE. Both fall back to 0.1 when unset, so
behaviour is unchanged without the overlay.
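
On the backend side, the fallback described above could be sketched as follows. The function name `traces_sample_rate` is hypothetical, and treating unparseable or out-of-range values the same as an unset variable is an assumption; the commit only states the 0.1 fallback for the unset case.

```python
import os
from typing import Mapping, Optional

DEFAULT_TRACES_SAMPLE_RATE = 0.1


def traces_sample_rate(env: Optional[Mapping[str, str]] = None) -> float:
    """Return the Sentry traces sample rate from the environment."""
    env = os.environ if env is None else env
    raw = env.get("SENTRY_TRACES_SAMPLE_RATE", "")
    try:
        rate = float(raw)
    except ValueError:
        # Unset, empty, or non-numeric: keep the previous hardcoded value.
        return DEFAULT_TRACES_SAMPLE_RATE
    # Assumption: values outside [0, 1] (including NaN) are rejected.
    if not 0.0 <= rate <= 1.0:
        return DEFAULT_TRACES_SAMPLE_RATE
    return rate
```

The result would then be passed as the `traces_sample_rate` argument of `sentry_sdk.init`, so a host without the variable behaves as before.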

The previous comment claimed NEXT_PUBLIC_DISABLED_FEATURES was set in
docker-compose.nano.yml, but that variable doesn't exist yet. Reword
to describe the intended behaviour without referencing a future env var.

Introduces NEXT_PUBLIC_DISABLED_FEATURES env var (comma-separated list)
that gates AI-powered features behind a bilingual MaintenanceCard notice
instead of firing requests destined to fail on gp.nano.

- src/lib/feature-flags.ts: build-time feature-flag helper evaluated
  from NEXT_PUBLIC_DISABLED_FEATURES
- src/components/ui/maintenance-card.tsx: styled elevated notice using
  project brand tokens (bg-card, border-hover, accent-yellow tint);
  wrench icon, role=status, tight tracking on title, relaxed leading
  on body; supports optional feature-specific copy
- messages/{en,es}.json: adds "maintenance" namespace with title, body,
  and featureBody (ICU placeholder for feature name)
- Wraps five host components with the outer/inner pattern so hooks never
  fire when the feature is flagged off: AiWeatherSummary, RiskNarrative,
  PersonalizedSuggestions, PlanWizard, ChatPage
- Hooks (use-ai-summary, use-narrative, use-personalized-suggestions,
  use-emergency-plan) now expose disabled: boolean and set it on HTTP
  503, giving a defence-in-depth fallback when the env var is missing
- docker-compose.nano.yml: adds NEXT_PUBLIC_DISABLED_FEATURES to
  frontend environment block
- .env.production.nano.example: documents the var and updates the
  OPENAI_API_KEY comment to reference it
@javierdejesusda
Owner Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@javierdejesusda merged commit 8ba1cf9 into main on May 8, 2026
3 checks passed
@javierdejesusda deleted the chore/gp-nano-migration branch on May 8, 2026 15:04