feat(iam/agent)!: require explicit assume_role_arns, drop implicit permissions role #421

Merged
agustincelentano merged 18 commits into main from feat/agent-remove-implicit-permissions-role
Jul 1, 2026

@davidf-null

Collaborator

What

The infrastructure/aws/iam/agent module used to prepend a conventionally-named permissions role ARN (nullplatform-{cluster}-agent-permissions-role) into the agent's sts:AssumeRole policy on every apply, without the caller asking for it. This PR removes that implicit behavior so the agent only assumes the roles explicitly provided.

Why

  • Single responsibility: the agent module no longer needs to know about the k8s permissions role, which is owned by the k8s scope module.
  • No hidden ARN / no duplicates: every assumable role is now visible in one place (assume_role_arns + permissions_roles).
  • No dangling grants: clusters without containers no longer get an AssumeRole pointing at a nonexistent role.

Changes

  • main.tf: drop [local.permissions_role_arn] from the policy Resource; remove the dead permissions_role_arn local; add a lifecycle.precondition requiring at least one assumable role (avoids an empty Resource, which AWS rejects — now fails fast at plan).
  • variables.tf: remove unused permissions_role_name.
  • outputs.tf: remove nullplatform_agent_permissions_role_arn.
  • tests/agent.tftest.hcl: assert the convention ARN is no longer injected, and that plan fails when no role is provided (expect_failures).
  • README.md: docs updated.
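
For reviewers who want to picture the main.tf change, the precondition could look roughly like the sketch below. The resource type and name (aws_iam_policy.agent_assume_role), the policy name, and the error message are assumptions made for illustration and are not taken from the module. permissions_roles is left out because its shape is not shown in this PR.

```hcl
# Hypothetical sketch, not the module's actual code.
variable "assume_role_arns" {
  description = "Roles the agent is allowed to assume."
  type        = list(string)
  default     = []
}

# Assumed resource name; the real module may structure its policy differently.
resource "aws_iam_policy" "agent_assume_role" {
  name = "example-agent-assume-role"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "sts:AssumeRole"
      Resource = var.assume_role_arns # only what the caller passed, nothing prepended
    }]
  })

  lifecycle {
    # Reject an empty Resource list at plan time instead of letting AWS reject it at apply.
    # The real check would also need to count roles passed via permissions_roles.
    precondition {
      condition     = length(var.assume_role_arns) > 0
      error_message = "Provide at least one role ARN for the agent to assume."
    }
  }
}
```

Placing the check on the policy resource keeps the failure next to the statement it protects, so the plan error points at the resource that would otherwise be invalid.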

Verification

  • tofu fmt -check -recursive → clean
  • tofu validate → Success
  • tofu test → 4 passed, 0 failed
  • No internal consumers of the removed output/variable in this repo.
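
The negative test in tests/agent.tftest.hcl might be shaped like the following sketch. The run block name and the resource address aws_iam_policy.agent_assume_role are hypothetical, and the sketch assumes the AWS provider is configured or mocked for the test run.

```hcl
# Hypothetical sketch of a plan-time negative test; names are illustrative.
run "fails_without_assumable_roles" {
  command = plan

  variables {
    # No roles supplied, so the precondition is expected to trip.
    assume_role_arns = []
  }

  # The run passes only if the precondition on this resource fails.
  expect_failures = [
    aws_iam_policy.agent_assume_role,
  ]
}
```

With expect_failures, the run is reported as passed when the listed object's condition fails, which is how a test suite can stay green while still asserting that an invalid input is rejected.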

⚠️ Breaking change for callers

Every cluster with containers that relied on the implicit permissions role must now pass it explicitly:

assume_role_arns = [
  module.k8s_agent_permissions.permissions_role_arn, # k8s permissions role — now explicit
  # ...any other roles (e.g. lambda)
]

Without it, the next plan fails the new precondition (explicit failure, not silent). Those caller changes live in the consuming repo.

Note: the AI_METADATA hash in the README is regenerated by the publish tooling; content was updated by hand.

🤖 Generated with Claude Code

David Fernandez and others added 18 commits April 9, 2026 15:55
…rmissions role

The agent module no longer injects the k8s permissions role ARN by naming
convention into the sts:AssumeRole policy. Callers must now pass every role the
agent should assume explicitly via assume_role_arns and/or permissions_roles.

- Remove [local.permissions_role_arn] from the assume-role policy Resource
- Remove the now-dead permissions_role_name variable, permissions_role_arn
  local, and nullplatform_agent_permissions_role_arn output
- Add a lifecycle precondition requiring at least one assumable role, so an
  empty Resource (rejected by AWS) fails fast at plan with a clear message
- Update tests and README accordingly

BREAKING CHANGE: clusters that relied on the implicit permissions role must now
pass it explicitly in assume_role_arns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@agustincelentano agustincelentano merged commit 44b8fd6 into main Jul 1, 2026
44 checks passed
@agustincelentano agustincelentano deleted the feat/agent-remove-implicit-permissions-role branch July 1, 2026 17:11