fix: forward required env vars to sre-agent container in docker-compose#209

Open
Sanjays2402 wants to merge 1 commit into fuzzylabs:main from Sanjays2402:fix/docker-compose-missing-required-env

Sanjays2402 commented Apr 19, 2026

What

Forward the environment variables that the sre-agent container requires to start (GITHUB_OWNER, GITHUB_REPO, GITHUB_REF, LOG_GROUP, SERVICE_NAME), plus the optional TIME_RANGE_MINUTES, through the sre-agent service's environment: block in docker-compose.yaml.

Why

docker-compose.yaml does not forward several environment variables that the sre-agent container requires. docker compose only passes variables that are explicitly listed in a service's environment: (or env_file:) block. Values in the user's shell or the project .env are used only to interpolate the compose file, so they never reach the container unless the service references them. As a result, running docker compose up -d produces a crash-loop on the sre-agent service:

  • GITHUB_OWNER, GITHUB_REPO, GITHUB_REF are declared as required fields on GitHubSettings in src/sre_agent/core/settings.py (lines 22-35, no defaults). Settings validation raises a pydantic.ValidationError as soon as sre_agent.run imports the agent.
  • LOG_GROUP and SERVICE_NAME are required by sre_agent/run.py::_load_request_from_args_or_env (lines 22-36). Without them the entrypoint prints a usage banner and exits with SystemExit(1).
  • TIME_RANGE_MINUTES is optional for sre_agent.run but is honoured when set.

With restart: unless-stopped, the container keeps restarting and the agent never comes up.

How

Extend the environment: block of the sre-agent service so compose forwards every variable the agent needs to start:

  • GITHUB_OWNER, GITHUB_REPO, GITHUB_REF (with main as the default for the ref, matching IntegrationConfig.github_ref).
  • LOG_GROUP, SERVICE_NAME (required by sre_agent.run).
  • TIME_RANGE_MINUTES (optional, default 10, matching the CLI default).

Diff: +7 / −0 in docker-compose.yaml.
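
For illustration, the added entries might look roughly like the fragment below. This is a sketch, not the actual diff: `${VAR}` and `${VAR:-default}` are standard Compose interpolation forms, but whether the PR uses the mapping or list form of environment:, and how it spells the defaults, are assumptions. Existing keys of the service are omitted.

```yaml
# Hypothetical sketch of the sre-agent service's environment: block after the change.
# Other keys of the service (image, restart policy, existing variables) are omitted.
services:
  sre-agent:
    environment:
      # Required fields on GitHubSettings; an unset value interpolates to an empty string.
      GITHUB_OWNER: ${GITHUB_OWNER}
      GITHUB_REPO: ${GITHUB_REPO}
      # Falls back to "main" when GITHUB_REF is unset or empty.
      GITHUB_REF: ${GITHUB_REF:-main}
      # Required by sre_agent.run.
      LOG_GROUP: ${LOG_GROUP}
      SERVICE_NAME: ${SERVICE_NAME}
      # Optional; falls back to 10 when unset or empty.
      TIME_RANGE_MINUTES: ${TIME_RANGE_MINUTES:-10}
```

With the values supplied from the shell or a project-level .env, docker compose config should print the resolved block, which is a quick way to confirm what the container will receive.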

Extra

No new dependencies.

Repro on current main (6fe7329):

$ env -i PATH="$PATH" HOME=/tmp/noexist \
  ANTHROPIC_API_KEY=x AWS_REGION=us-east-1 AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=x \
  SLACK_CHANNEL_ID=C01 SLACK_MCP_URL=http://slack:13080/sse \
  GITHUB_PERSONAL_ACCESS_TOKEN=x GITHUB_MCP_URL=https://api.githubcopilot.com/mcp/ \
  LOG_GROUP=/ecs/foo SERVICE_NAME=cartservice \
  uv run python -m sre_agent.run
# FATAL ERROR: 3 validation errors for GitHubSettings (owner/repo/ref)

After the fix, docker compose config renders all required vars:

$ ANTHROPIC_API_KEY=x AWS_REGION=us-east-1 AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=x \
  SLACK_CHANNEL_ID=C01 SLACK_BOT_TOKEN=xoxb-y \
  GITHUB_PERSONAL_ACCESS_TOKEN=ghp_z GITHUB_OWNER=foo GITHUB_REPO=bar GITHUB_REF=main \
  LOG_GROUP=/ecs/foo SERVICE_NAME=cartservice \
  docker compose config
  sre-agent:
    environment:
      GITHUB_OWNER: foo
      GITHUB_REF: main
      GITHUB_REPO: bar
      LOG_GROUP: /ecs/foo
      SERVICE_NAME: cartservice
      TIME_RANGE_MINUTES: "10"
      ...

Notes

  • This matches the environment that the ECS task definition already builds in src/sre_agent/core/deployments/aws_ecs/ecs_tasks.py (GITHUB_OWNER/REPO/REF are set there explicitly), so compose and ECS are now consistent about what the container needs.
  • Users still need to set these values (in their shell or a project-level .env) before docker compose up, but that is the same contract compose already uses for ANTHROPIC_API_KEY, SLACK_CHANNEL_ID, etc.

Checklist

  • I have run application tests ensuring nothing has broken.
  • I have updated the documentation if required. (N/A — environment contract matches ECS docs already.)
  • I have added tests which cover my changes. (N/A — compose config verified via docker compose config; no unit-testable code change.)

Type of change

Bug fix. Adjusted label on the PR panel.

MacOS tests

Not required — docker-compose YAML change only, no dependency or OS-specific behavior changes.

The sre-agent service defined in docker-compose.yaml crashes at startup
because several required environment variables are never forwarded into
the container:

- GITHUB_OWNER, GITHUB_REPO, and GITHUB_REF are required fields on
  GitHubSettings (no defaults), so settings validation raises a
  pydantic ValidationError as soon as the container starts.
- LOG_GROUP and SERVICE_NAME are required by sre_agent.run; without
  them the entrypoint exits with usage output and a non-zero status.
- TIME_RANGE_MINUTES is optional but honoured by sre_agent.run when
  set, so it is plumbed through with a sensible default.

Because the service uses restart: unless-stopped, this produces a
crash-loop that prevents docker compose up from ever bringing up the
agent, even when the missing values are present in the user shell
environment or project .env file (docker compose only passes through
variables listed under environment: or env_file:).