From 4a11116dc76e1f20a3a5317e02c4f7916a4d14ff Mon Sep 17 00:00:00 2001 From: JacobPEvans <20714140+JacobPEvans-personal@users.noreply.github.com> Date: Sat, 30 May 2026 19:49:26 -0400 Subject: [PATCH 1/2] docs(infrastructure): add Terraform on AWS section MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three new pages under Infrastructure → Terraform on AWS: - overview: isolation model (per-project IAM role, GitHub OIDC for CI, named operator IAM users with MFA), naming conventions, the SSE-S3 vs SSE-KMS pricing rationale, aws-vault keychain context, tagging, and tool versions. - aws-bootstrap: admin-runnable Terraform that provisions the state bucket, GitHub OIDC trust, and per-project IAM role. Includes the chicken-and-egg flow (local apply → init -migrate-state into the bucket the bootstrap just created). - consuming-repo: backend.tf with use_lockfile=true, providers.tf with default_tags, the operator's ~/.aws/config profile chain (mfa-base → tf- via source_profile/role_arn), and the GitHub Actions workflow with aws-actions/configure-aws-credentials@v4 + OIDC. Plus a Terragrunt variant section. Three CardGroup cross-links added to existing pages: - infrastructure/overview.mdx (Cross-cutting topics) - security/tools/aws-vault.mdx (See also list) - infrastructure/terraform-check-placement.mdx (Where to go next) docs.json gains the new "Terraform on AWS" nested subgroup inside the Infrastructure group, alongside the existing CI/CD subgroup. Standards established: - One IAM role per repo; no shared "terraform" power-user. - GitHub OIDC only for CI; no static AWS_ACCESS_KEY_ID secrets. - S3 native locking via use_lockfile=true; no DynamoDB lock tables. - SSE-S3 (AES256) for state buckets; no per-project KMS keys — pricing wins, isolation already enforced at the IAM role boundary. - Bootstrap state lives in the bucket it provisions, under _bootstrap/terraform.tfstate. 
All Mermaid diagrams use the canonical theme directive and classDef palette; internal click directives only; no real account IDs, hostnames, or IPs anywhere in the new pages. Assisted-by: Claude --- docs.json | 8 + infrastructure/overview.mdx | 3 + infrastructure/terraform-check-placement.mdx | 3 + infrastructure/terraform/aws-bootstrap.mdx | 407 +++++++++++++++++++ infrastructure/terraform/consuming-repo.mdx | 267 ++++++++++++ infrastructure/terraform/overview.mdx | 125 ++++++ security/tools/aws-vault.mdx | 1 + 7 files changed, 814 insertions(+) create mode 100644 infrastructure/terraform/aws-bootstrap.mdx create mode 100644 infrastructure/terraform/consuming-repo.mdx create mode 100644 infrastructure/terraform/overview.mdx diff --git a/docs.json b/docs.json index e38a38d..9d23fbd 100644 --- a/docs.json +++ b/docs.json @@ -83,6 +83,14 @@ "infrastructure/cicd/git-signing", "infrastructure/cicd/terraform-runs-on" ] + }, + { + "group": "Terraform on AWS", + "pages": [ + "infrastructure/terraform/overview", + "infrastructure/terraform/aws-bootstrap", + "infrastructure/terraform/consuming-repo" + ] } ] }, diff --git a/infrastructure/overview.mdx b/infrastructure/overview.mdx index 04a0c48..d7a4ee3 100644 --- a/infrastructure/overview.mdx +++ b/infrastructure/overview.mdx @@ -69,6 +69,9 @@ Terraform builds VMs and LXCs (coral). Ansible takes the inventory and configure ## Cross-cutting topics + + Per-project IAM role, GitHub OIDC, S3 native locking, SSE-S3 — the standard for any new AWS-backed Terraform repo. + OrbStack as the local control plane; what runs on K8s vs LXC vs Docker. 
diff --git a/infrastructure/terraform-check-placement.mdx b/infrastructure/terraform-check-placement.mdx index dd6fea2..88fc72b 100644 --- a/infrastructure/terraform-check-placement.mdx +++ b/infrastructure/terraform-check-placement.mdx @@ -122,6 +122,9 @@ Direnv already activates the Nix dev shell on `cd` — no `nix develop` wrapper ## Where to go next + + The admin-runnable module that creates the role this placement rule applies to. + Marketplace actions, release-please, version pinning, runner choice. diff --git a/infrastructure/terraform/aws-bootstrap.mdx b/infrastructure/terraform/aws-bootstrap.mdx new file mode 100644 index 0000000..735db3c --- /dev/null +++ b/infrastructure/terraform/aws-bootstrap.mdx @@ -0,0 +1,407 @@ +--- +title: "AWS bootstrap" +description: "Admin-runnable Terraform that provisions the state bucket, GitHub OIDC trust, and per-project IAM role for a new Terraform repo. Idempotent; runs once per project." +tier: 2 +--- + +{/* TIER-GUARD: reference page — prerequisites, the bootstrap module, the migrate-state flow, and verify all belong together. */} + +> Run this once per new project, with admin AWS credentials. Output is everything a new repo's `backend.tf` needs. + +The bootstrap is plain Terraform. The first `apply` runs with local state; once the bucket exists, you uncomment the `backend "s3"` block at the top of the file and `terraform init -migrate-state` lifts the state into the bucket the bootstrap itself just created. The bucket then hosts both its own bootstrap state (`_bootstrap/terraform.tfstate`) and the consuming repo's state (`/terraform.tfstate`). + +## Prerequisites + +- Admin AWS credentials in the shell — `aws sts get-caller-identity` returns an admin ARN. +- Terraform ≥ 1.10 or OpenTofu ≥ 1.10 on PATH. +- The new GitHub repo (`/`) already exists. +- The GitHub Actions OIDC provider exists in the AWS account. 
Check with: + + ```bash + aws iam list-open-id-connect-providers \ + --query 'OpenIDConnectProviderList[?contains(Arn, `token.actions.githubusercontent.com`)]' + ``` + + If the result is empty, create it once per account (one-time, account-wide): + + ```bash + aws iam create-open-id-connect-provider \ + --url https://token.actions.githubusercontent.com \ + --client-id-list sts.amazonaws.com + ``` + + AWS verifies the GitHub Actions issuer's certificate chain automatically — no manual thumbprint is needed. + +- Each human operator has an IAM user with MFA enabled, and a policy granting only `sts:AssumeRole` on `arn:aws:iam:::role/tf-*` (no direct resource permissions). Operator IAM user creation is a per-operator one-time step, separate from per-project bootstrap. + +## Where this lives + +The recommended layout is one directory per project inside a single admin-owned repo (suggested name: `terraform-aws-foundation`): + +```text +terraform-aws-foundation/ +├── bootstrap/ +│ ├── proxmox/ +│ │ ├── main.tf # this file +│ │ └── terraform.tfvars # per-project values +│ ├── unifi/ +│ │ ├── main.tf +│ │ └── terraform.tfvars +│ └── ... +└── README.md +``` + +Each per-project directory is independent: its own state object lives in its own bucket, so projects cannot affect each other even by accident. + +## The bootstrap module + +Paste the block below into `main.tf`. It is the entire module — no submodules, no provider aliases, no remote dependencies. + +```hcl +terraform { + required_version = ">= 1.10" + + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 5.0" + } + } + + # The first `terraform apply` runs with local state. + # After the bucket exists, uncomment this block (substitute the outputs) + # and run `terraform init -migrate-state` to lift state into the bucket. 
+ # + # backend "s3" { + # bucket = "tfstate--" + # key = "_bootstrap/terraform.tfstate" + # region = "us-east-1" + # use_lockfile = true + # encrypt = true + # } +} + +variable "project" { + description = "Short kebab-case project id. Matches the consuming repo's last segment." + type = string +} + +variable "github_org" { + description = "GitHub organization that owns the consuming repo." + type = string +} + +variable "github_repo" { + description = "Name of the consuming repo. Used in the OIDC sub-claim match." + type = string +} + +variable "branch_pattern" { + description = "Branch name CI is allowed to assume the role from on push." + type = string + default = "main" +} + +variable "operator_user_arns" { + description = "IAM user ARNs of human operators allowed to assume the role with MFA." + type = list(string) + default = [] +} + +variable "aws_region" { + description = "Region for the state bucket." + type = string + default = "us-east-1" +} + +provider "aws" { + region = var.aws_region + + default_tags { + tags = { + Project = var.project + ManagedBy = "Terraform" + Repo = "${var.github_org}/${var.github_repo}" + Environment = "bootstrap" + } + } +} + +data "aws_caller_identity" "current" {} + +data "aws_iam_openid_connect_provider" "github" { + url = "https://token.actions.githubusercontent.com" +} + +locals { + bucket_name = "tfstate-${var.project}-${data.aws_caller_identity.current.account_id}" + role_name = "tf-${var.project}" +} + +resource "aws_s3_bucket" "state" { + bucket = local.bucket_name +} + +resource "aws_s3_bucket_versioning" "state" { + bucket = aws_s3_bucket.state.id + versioning_configuration { + status = "Enabled" + } +} + +resource "aws_s3_bucket_server_side_encryption_configuration" "state" { + bucket = aws_s3_bucket.state.id + rule { + apply_server_side_encryption_by_default { + sse_algorithm = "AES256" + } + } +} + +resource "aws_s3_bucket_public_access_block" "state" { + bucket = aws_s3_bucket.state.id + block_public_acls = true + 
block_public_policy = true + ignore_public_acls = true + restrict_public_buckets = true +} + +resource "aws_s3_bucket_lifecycle_configuration" "state" { + bucket = aws_s3_bucket.state.id + rule { + id = "expire-noncurrent-versions" + status = "Enabled" + filter {} + noncurrent_version_expiration { + noncurrent_days = 90 + } + } +} + +resource "aws_s3_bucket_policy" "deny_insecure_transport" { + bucket = aws_s3_bucket.state.id + policy = jsonencode({ + Version = "2012-10-17" + Statement = [{ + Sid = "DenyInsecureTransport" + Effect = "Deny" + Principal = "*" + Action = "s3:*" + Resource = [aws_s3_bucket.state.arn, "${aws_s3_bucket.state.arn}/*"] + Condition = { + Bool = { "aws:SecureTransport" = "false" } + } + }] + }) +} + +resource "aws_iam_role" "terraform" { + name = local.role_name + + assume_role_policy = jsonencode({ + Version = "2012-10-17" + Statement = concat( + [ + { + Sid = "GitHubOIDCBranchPush" + Effect = "Allow" + Principal = { Federated = data.aws_iam_openid_connect_provider.github.arn } + Action = "sts:AssumeRoleWithWebIdentity" + Condition = { + StringEquals = { + "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com" + } + StringLike = { + "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:ref:refs/heads/${var.branch_pattern}" + } + } + }, + { + Sid = "GitHubOIDCPullRequest" + Effect = "Allow" + Principal = { Federated = data.aws_iam_openid_connect_provider.github.arn } + Action = "sts:AssumeRoleWithWebIdentity" + Condition = { + StringEquals = { + "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com" + } + StringLike = { + "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:pull_request" + } + } + } + ], + length(var.operator_user_arns) > 0 ? 
[{ + Sid = "OperatorAssumeWithMFA" + Effect = "Allow" + Principal = { AWS = var.operator_user_arns } + Action = "sts:AssumeRole" + Condition = { + Bool = { "aws:MultiFactorAuthPresent" = "true" } + } + }] : [] + ) + }) +} + +resource "aws_iam_role_policy" "state" { + name = "${local.role_name}-state" + role = aws_iam_role.terraform.id + + policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Sid = "ListBucket" + Effect = "Allow" + Action = ["s3:ListBucket", "s3:GetBucketVersioning"] + Resource = aws_s3_bucket.state.arn + }, + { + Sid = "ObjectAccess" + Effect = "Allow" + Action = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"] + Resource = "${aws_s3_bucket.state.arn}/*" + } + ] + }) +} + +output "state_bucket" { + description = "Name of the S3 bucket for Terraform state." + value = aws_s3_bucket.state.bucket +} + +output "state_bucket_arn" { + description = "ARN of the state bucket." + value = aws_s3_bucket.state.arn +} + +output "tf_role_arn" { + description = "Role ARN for the consuming repo's backend, local dev, and CI." + value = aws_iam_role.terraform.arn +} + +output "aws_region" { + description = "Region where the state bucket lives." + value = var.aws_region +} + +output "state_key_prefix" { + description = "Prefix the consuming repo writes its state objects under." + value = "${var.project}/" +} +``` + +The S3-native lock object (`/terraform.tfstate.tflock`) is just another S3 object under the same prefix as state — no separate IAM permission is required for it, and no DynamoDB table. + +## Bootstrap the chicken-and-egg + + + + Create a `terraform.tfvars` next to `main.tf`: + + ```hcl + project = "proxmox" + github_org = "" + github_repo = "terraform-proxmox" + branch_pattern = "main" + operator_user_arns = [ + "arn:aws:iam:::user/", + ] + ``` + + Then: + + ```bash + terraform init # local state — no backend block yet + terraform apply + ``` + + Confirm the apply. 
Save the outputs — `terraform output -json` gives you a single blob you can pipe into the consuming repo's backend block. + + + In `main.tf`, uncomment the `backend "s3"` block at the top of the `terraform {}` block and substitute the outputs the apply just produced: + + ```hcl + backend "s3" { + bucket = "tfstate-proxmox-" + key = "_bootstrap/terraform.tfstate" + region = "us-east-1" + use_lockfile = true + encrypt = true + } + ``` + + `encrypt = true` instructs the client to send the SSE header on every PutObject. The bucket's default SSE-S3 encryption is already configured by the module above — no `kms_key_id` is needed because there is no KMS key. + + + ```bash + terraform init -migrate-state + ``` + + Terraform prompts to copy the local state file into the bucket. Confirm. Then delete the local artefacts: + + ```bash + rm terraform.tfstate terraform.tfstate.backup + ``` + + The bootstrap is now self-hosting in the bucket it created. Subsequent `terraform plan` / `terraform apply` runs against the bootstrap (for example to widen `branch_pattern` or add another operator) work like any other Terraform module. + + + +## Verify + +```bash +# State bucket exists and contains the bootstrap state. +aws s3 ls s3://tfstate--/_bootstrap/ + +# Role exists with the expected name. +aws iam get-role --role-name tf- + +# Outputs — feed these into the consuming repo's backend.tf. +terraform output -json +``` + +If all three succeed, the consuming repo can immediately set up its [backend.tf](/infrastructure/terraform/consuming-repo) and run its first `terraform plan`. + +{/* Shape: linear chain. 5 nodes. Boundary crossings: 0. Aspect: ~5:1 LR. Pass. 
*/} + +```mermaid +%%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%% +flowchart LR + Admin([Admin shell]) + Apply([terraform apply<br/>local state]) + Bucket[(State bucket created)] + Migrate([init -migrate-state]) + Done([Self-hosting]) + + Admin --> Apply --> Bucket --> Migrate --> Done + + classDef src fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6; + classDef hop fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6; + classDef sink fill:#102937,stroke:#F4EFE6,stroke-width:2px,color:#F4EFE6; + + class Admin src + class Apply,Migrate hop + class Bucket,Done sink + + linkStyle 0,1,2,3 stroke:#F4EFE6,stroke-width:1.5px; +``` + +## Where to go next + + + + The isolation model and naming conventions this bootstrap implements. + + + What the new repo drops in next to use the outputs above. + + + Where every Terraform / OpenTofu command runs — pre-commit vs CI. + + + Operator-side credential management for the role this bootstrap created. + + diff --git a/infrastructure/terraform/consuming-repo.mdx b/infrastructure/terraform/consuming-repo.mdx new file mode 100644 index 0000000..844e6ed --- /dev/null +++ b/infrastructure/terraform/consuming-repo.mdx @@ -0,0 +1,267 @@ +--- +title: "Consuming-repo setup" +description: "backend.tf, providers.tf, the operator's ~/.aws/config profile, and the GitHub Actions OIDC workflow that lets a new Terraform repo run plan and apply against the per-project role." +tier: 2 +--- + +{/* TIER-GUARD: reference page — file layout, backend config, local dev, and CI workflow belong together. */} + +> Drop these files into the new repo. The role and bucket from the [AWS bootstrap](/infrastructure/terraform/aws-bootstrap) supply everything the backend block needs. + +Four of the bootstrap's outputs (`state_bucket`, `tf_role_arn`, `aws_region`, `state_key_prefix`) feed straight into this page. Once the files below are in place, the repo's first `terraform plan` runs in under a minute — locally and in CI — with no static credentials anywhere. 
+ +## Repo file layout + +```text +/ +├── backend.tf +├── providers.tf +├── main.tf +└── .github/workflows/terraform.yml +``` + +`main.tf` is whatever the repo actually manages — VPC, Bedrock agents, Proxmox guests, anything. The three files below set up the surrounding plumbing. + +## `backend.tf` + +```hcl +terraform { + required_version = ">= 1.10" + + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 5.0" + } + } + + backend "s3" { + bucket = "tfstate--" + key = "/terraform.tfstate" + region = "us-east-1" + use_lockfile = true + encrypt = true + } +} +``` + +`use_lockfile = true` enables S3-native locking via conditional writes (Terraform ≥ 1.10, OpenTofu ≥ 1.10). No DynamoDB table, no separate lock service. `encrypt = true` instructs the client to send the SSE header on every PutObject; the bucket's default SSE-S3 encryption already configured by the bootstrap handles the actual cipher. + +There is no `assume_role` block here. The aws-vault profile (locally) and `aws-actions/configure-aws-credentials@v4` (in CI) perform the AssumeRole before Terraform runs, exporting the role's STS credentials into the subprocess environment. Terraform consumes those credentials directly. + +## `providers.tf` + +```hcl +provider "aws" { + region = "us-east-1" + + default_tags { + tags = { + Project = "" + ManagedBy = "Terraform" + Repo = "/" + Environment = "prod" + } + } +} +``` + +Every resource the AWS provider creates inherits the four tags. Setting `default_tags` at the provider level keeps individual resource declarations clean and prevents per-resource tag drift. + +## Local development + + + + The base profile holds your IAM user identity. 
The per-project profile chains off it and assumes the role: + + ```ini + [profile mfa-base] + region = us-east-1 + mfa_serial = arn:aws:iam:::mfa/ + session_ttl = 1h + + [profile tf-] + source_profile = mfa-base + role_arn = arn:aws:iam:::role/tf- + region = us-east-1 + ``` + + `mfa-base` is added once per operator (same block in every repo's docs). The `tf-` block is repo-specific — copy it into `~/.aws/config` the first time you clone the repo. + + + One time per operator. aws-vault writes the access key into its dedicated macOS keychain — nothing lands in `~/.aws/credentials`: + + ```bash + aws-vault add mfa-base + ``` + + + aws-vault prompts for your MFA token on the first invocation per cached session (`session_ttl = 1h`), assumes the per-project role via the chained `source_profile`, and exports the role's STS credentials into the subprocess: + + ```bash + aws-vault exec tf- -- terraform init + aws-vault exec tf- -- terraform plan + ``` + + Subsequent commands inside the `session_ttl` window do not re-prompt for MFA. + + + See the [aws-vault page](/security/tools/aws-vault) for keychain backends, `session_ttl` tuning, the canonical `aws-vault exec ... -- doppler run -- terragrunt plan` envelope, and the anti-patterns the tool exists to prevent. 
+ + +## CI/CD — GitHub Actions with OIDC + +`.github/workflows/terraform.yml`: + +```yaml +name: terraform +on: + pull_request: + branches: [main] + push: + branches: [main] + +permissions: + id-token: write # required for OIDC token exchange + contents: read + +jobs: + plan: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - uses: hashicorp/setup-terraform@v3 + with: + terraform_version: 1.10.x + + - uses: aws-actions/configure-aws-credentials@v4 + with: + role-to-assume: arn:aws:iam:::role/tf- + aws-region: us-east-1 + + - run: terraform init + - run: terraform plan -no-color + + apply: + if: github.event_name == 'push' && github.ref == 'refs/heads/main' + needs: plan + runs-on: ubuntu-latest + environment: prod # require manual reviewer approval via GitHub Environment + steps: + - uses: actions/checkout@v4 + + - uses: hashicorp/setup-terraform@v3 + with: + terraform_version: 1.10.x + + - uses: aws-actions/configure-aws-credentials@v4 + with: + role-to-assume: arn:aws:iam:::role/tf- + aws-region: us-east-1 + + - run: terraform init + - run: terraform apply -auto-approve +``` + +The `environment: prod` line ties the apply job to a GitHub Environment with required reviewers — apply pauses until a maintainer approves. Configure reviewers in the repo's *Settings → Environments → prod*. + +The same `tf-` role used locally is the role CI assumes. The OIDC trust statements in the bootstrap (`refs/heads/` for push, `pull_request` for PR runs) gate which workflow events are allowed. One caveat: a job that declares `environment: prod` presents the OIDC subject `repo:ORG/REPO:environment:prod` rather than a branch ref, so the apply job can assume the role only if the trust policy also allows that subject. + + +For the [self-hosted RunsOn-on-AWS-spot](/infrastructure/cicd/terraform-runs-on) runner alternative, the workflow shape is identical — `runs-on: ubuntu-latest` swaps for a self-hosted runner label, and everything else stays the same. OIDC works the same way on self-hosted runners. + + +## Terragrunt variant + +Terragrunt repos generate the same `backend.tf` shape at runtime instead of hand-writing it. 
Drop a `root.hcl` (or `terragrunt.hcl` at the repo root) with: + +```hcl +remote_state { + backend = "s3" + generate = { + path = "backend.tf" + if_exists = "overwrite_terragrunt" + } + config = { + bucket = "tfstate-${local.project}-${local.account_id}" + key = "${path_relative_to_include()}/terraform.tfstate" + region = "us-east-1" + use_lockfile = true + encrypt = true + } +} + +locals { + project = "" + account_id = get_aws_account_id() +} +``` + +Per-leaf `terragrunt.hcl` files at each environment include the parent and do not repeat backend config: + +```hcl +include "root" { + path = find_in_parent_folders("root.hcl") # a bare find_in_parent_folders() looks for terragrunt.hcl +} +``` + +Local invocation pairs aws-vault with [Doppler](/security/tools/doppler) for runtime secret injection — the canonical chain from the [check-placement page](/infrastructure/terraform-check-placement): + +```bash +aws-vault exec tf- -- doppler run -- terragrunt plan +``` + +In CI, `aws-actions/configure-aws-credentials@v4` replaces `aws-vault exec`, and `doppler run` is replaced by whatever secret-injection path the workflow uses (Doppler CLI inside the runner, or repo secrets). + +{/* Shape: parallel convergence (operator chain and CI chain join at AssumeRole, continue to bucket via plan). Ranks: 2x2x1x1x1. Boundary crossings: 0. Aspect: ~3:1 LR. Pass. 
*/} + +```mermaid +%%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%% +flowchart LR + Op([Operator]) + Vault([aws-vault exec]) + CI([GitHub Actions]) + Cred([configure-aws-credentials]) + Role{tf-project} + Plan([terraform plan]) + Bucket[(State bucket)] + + Op --> Vault --> Role + CI --> Cred --> Role + Role --> Plan --> Bucket + + classDef src fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6; + classDef hop fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6; + classDef external fill:#102937,stroke:#E6B35A,stroke-width:2px,color:#F4EFE6; + classDef gate fill:#102937,stroke:#E06B4A,stroke-width:2.5px,color:#F4EFE6; + classDef sink fill:#102937,stroke:#F4EFE6,stroke-width:2px,color:#F4EFE6; + + class Op src + class Vault,Plan hop + class CI,Cred external + class Role gate + class Bucket sink + + linkStyle 0,1 stroke:#F4EFE6,stroke-width:1.5px; + linkStyle 2,3 stroke:#E6B35A,stroke-width:1.5px,stroke-dasharray:2 4; + linkStyle 4,5 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3; +``` + +## Where to go next + + + + The isolation model and naming conventions this repo implements. + + + The admin-side module that created the bucket and role this repo points at. + + + Static checks in pre-commit, credentialed plan/apply in CI only. + + + The wider CI/CD shape — marketplace actions, runner choice, version pinning. 
+ + diff --git a/infrastructure/terraform/overview.mdx b/infrastructure/terraform/overview.mdx new file mode 100644 index 0000000..2da2612 --- /dev/null +++ b/infrastructure/terraform/overview.mdx @@ -0,0 +1,125 @@ +--- +title: "Terraform on AWS" +description: "Per-project IAM role, GitHub OIDC for CI, S3 native locking, SSE-S3 encryption, aws-vault + MFA for local dev. The standard for any new AWS-backed Terraform/OpenTofu/Terragrunt repo." +tier: 2 +--- + +{/* TIER-GUARD: reference page — the isolation model, naming, encryption policy, and tagging belong together. */} + +> One IAM role per repo. Humans and CI both `AssumeRole`; nothing holds direct AWS resource access. State lives in a project-scoped S3 bucket; locks live in the same bucket via S3 conditional writes. + +The pattern below is the single supported shape for any new AWS-backed Terraform / OpenTofu / Terragrunt repo. The [bootstrap snippet](/infrastructure/terraform/aws-bootstrap) provisions every resource named here in one `terraform apply`; the [consuming-repo page](/infrastructure/terraform/consuming-repo) shows what the new repo itself needs to drop in. + +## The isolation model + +The per-project IAM role is the security boundary. Its trust policy lists exactly two principal types: GitHub OIDC for the matching repo, and named IAM users with MFA. Its permissions policy grants S3 access to exactly one bucket — its own. + +Humans authenticate as themselves with a base IAM user whose long-lived access key lives in the [aws-vault](/security/tools/aws-vault) macOS keychain. The base user holds one permission only: `sts:AssumeRole` on roles named `tf-*`. Every Terraform command runs through `aws-vault exec tf- -- ...`, which mints a short-lived STS session for the per-project role and injects it into the subprocess. + +CI uses no static credentials at all. GitHub Actions exchanges its short-lived OIDC token for STS credentials directly via the role's trust policy. 
There is no `AWS_ACCESS_KEY_ID` secret in any repo. The same role that a human assumes is the role CI assumes — one trust policy, one permissions policy, audited the same way. + +{/* Shape: parallel convergence (two source chains join at the per-project role, continue to bucket). Ranks: 2x2x1x1. Boundary crossings: 0. Aspect: ~2:1 LR. Pass. */} + +```mermaid +%%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%% +flowchart LR + Op([Operator]) + Vault([aws-vault MFA]) + CI([GitHub Actions]) + OIDC([OIDC token]) + Role{tf-project} + Bucket[(tfstate-project)] + + Op --> Vault --> Role + CI --> OIDC --> Role + Role --> Bucket + + classDef src fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6; + classDef hop fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6; + classDef external fill:#102937,stroke:#E6B35A,stroke-width:2px,color:#F4EFE6; + classDef gate fill:#102937,stroke:#E06B4A,stroke-width:2.5px,color:#F4EFE6; + classDef sink fill:#102937,stroke:#F4EFE6,stroke-width:2px,color:#F4EFE6; + + class Op src + class Vault hop + class CI,OIDC external + class Role gate + class Bucket sink + + click Role "/infrastructure/terraform/aws-bootstrap" "Bootstrap that provisions this role" + click Bucket "/infrastructure/terraform/consuming-repo" "How consuming repos wire to this bucket" + + linkStyle 0,1 stroke:#F4EFE6,stroke-width:1.5px; + linkStyle 2,3 stroke:#E6B35A,stroke-width:1.5px,stroke-dasharray:2 4; + linkStyle 4 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3; +``` + +## Naming conventions + +Every project uses the same naming shape so that an account-wide audit (`aws s3 ls`, `aws iam list-roles --query "Roles[?starts_with(RoleName, \`tf-\`)]"`) is trivial. 
+ +| Resource | Pattern | Example | +| --- | --- | --- | +| S3 state bucket | `tfstate--` | `tfstate-proxmox-111122223333` | +| IAM role | `tf-` | `tf-proxmox` | +| State object key | `/terraform.tfstate` | `proxmox/terraform.tfstate` | +| Bootstrap state key | `_bootstrap/terraform.tfstate` | (same key in every project's bucket) | + +`` is a short kebab-case identifier matching the consuming repo's last path segment (e.g. `proxmox` for `terraform-proxmox`, `unifi` for `terraform-unifi`). `` is the 12-digit AWS account number — its inclusion in the bucket name makes the name globally unique across the S3 namespace without requiring a random suffix. + +## Encryption — why SSE-S3, not SSE-KMS + +Every state bucket has bucket-default SSE-S3 (`AES256`) applied; the consuming repo's backend block sets `encrypt = true` so each PutObject carries the SSE header explicitly. + +SSE-KMS uses the same AES-256 cipher under the hood. The difference is who owns the key material. SSE-KMS costs about $1 per month per project key plus a KMS API call on every state read and write — a real number in pipelines that re-plan on every PR. See [AWS KMS pricing](https://aws.amazon.com/kms/pricing/). Since access to the state bucket is already gated by the per-project IAM role's trust policy (MFA-required for humans, OIDC-bound for CI), the KMS layer adds operational cost without changing who can read the state. + +Application-layer secrets that genuinely need MFA-gated or cross-account key control belong in [Bitwarden](/security/tools/bitwarden) for cold human secrets or [Doppler](/security/tools/doppler) for warm runtime injection — never inside the state file. + +## Where the long-lived AWS key actually lives + +The base IAM user's access key is stored in a dedicated [aws-vault](https://github.com/99designs/aws-vault) macOS keychain — a separate keychain from the login keychain, with its own password and access policy. 
No long-lived AWS credentials ever land in `~/.aws/credentials`, in a `.env` file, or in shell history. + +Every Terraform invocation runs under a one-hour STS session minted by `aws-vault exec tf- -- `. aws-vault prompts for an MFA token on the first invocation per cached session window and silently re-uses the cached chained session for the remainder of `session_ttl`. See [aws-vault](/security/tools/aws-vault) for the profile-management mechanics. + +## Tagging + +Every resource carries four tags, applied via the AWS provider's `default_tags` block so individual resource declarations stay clean: + +| Tag | Value | +| --- | --- | +| `Project` | `` (same as in the naming table above) | +| `ManagedBy` | `Terraform` | +| `Repo` | `/` | +| `Environment` | `bootstrap` for the bootstrap module; per-environment (`prod`, `staging`) for the consuming repo | + +The `Project` tag should be activated as an AWS cost allocation tag (Billing → Cost allocation tags) so per-project spend appears in Cost Explorer. + +## Tool versions + +| Tool | Minimum version | Why | +| --- | --- | --- | +| Terraform | 1.10 | S3 native locking (`use_lockfile`) released in Nov 2024 | +| OpenTofu | 1.10 | S3 native locking released in 1.10 (conditional writes via `If-None-Match`) | +| aws-vault | 7.x | Stable keychain backend and chained session caching | +| AWS CLI | v2 | `aws sts assume-role` behavior matches what the IAM trust policy expects | + + +Terragrunt wraps Terraform and uses the same backend config — none of the isolation model changes. The Terragrunt-specific `remote_state {}` and `generate {}` blocks live in the [consuming-repo page](/infrastructure/terraform/consuming-repo#terragrunt-variant). + + +## Where to go next + + + + The admin-runnable Terraform that creates every per-project resource named on this page. + + + What goes inside the new repo so `terraform plan` runs immediately. + + + How the keychain backend, MFA, and session TTLs interact in practice. 
+
+
+    Static checks in pre-commit, credentialed ops in CI. The placement rule every repo follows.
+
+
diff --git a/security/tools/aws-vault.mdx b/security/tools/aws-vault.mdx
index 3b2f24d..22bb706 100644
--- a/security/tools/aws-vault.mdx
+++ b/security/tools/aws-vault.mdx
@@ -86,4 +86,5 @@ Storing long-lived AWS access keys in `~/.aws/credentials` instead of the keycha

 - [Doppler](/security/tools/doppler) — the runtime config layer wrapped inside `aws-vault exec`.
 - [How it fits together](/security/how-it-fits-together#local-dev-flow-aws-vault-into-doppler-into-terragrunt) — flow diagram.
+- [Consuming-repo setup](/infrastructure/terraform/consuming-repo) — where the `mfa-base` + `tf-<project>` profile chain plugs into a real Terraform repo.
 - [`docs.dryvist.com`](https://docs.dryvist.com) — account IDs, MFA serials, real-world profile names.
From 60d6c134c7fae3b683c3551abe64fdf69c8ce9c2 Mon Sep 17 00:00:00 2001
From: JacobPEvans <20714140+JacobPEvans-personal@users.noreply.github.com>
Date: Sat, 30 May 2026 22:41:06 -0400
Subject: [PATCH 2/2] docs(infrastructure): extract bootstrap HCL to terraform-aws-template repo

The inline ~230-line Terraform snippet on the aws-bootstrap page is gone. The module code now lives in dryvist/terraform-aws-template (Apache-2.0, public). The page shows a ~45-line module-block usage pinned to v0.1.0; full input/output reference and the underlying resource list live in the module repo's README.

The page is roughly half the length and easier to scan. The chicken-and-egg flow, verify commands, Mermaid diagram, and CardGroup are unchanged.
Repo: https://github.com/dryvist/terraform-aws-template
Release: https://github.com/dryvist/terraform-aws-template/releases/tag/v0.1.0

Assisted-by: Claude:claude-opus-4-7
---
 infrastructure/terraform/aws-bootstrap.mdx | 246 +++------------------
 1 file changed, 25 insertions(+), 221 deletions(-)

diff --git a/infrastructure/terraform/aws-bootstrap.mdx b/infrastructure/terraform/aws-bootstrap.mdx
index 735db3c..8a23102 100644
--- a/infrastructure/terraform/aws-bootstrap.mdx
+++ b/infrastructure/terraform/aws-bootstrap.mdx
@@ -55,7 +55,7 @@ Each per-project directory is independent: its own state object lives in its own

 ## The bootstrap module

-Paste the block below into `main.tf`. It is the entire module — no submodules, no provider aliases, no remote dependencies.
+The Terraform code lives in [`dryvist/terraform-aws-template`][repo] (Apache-2.0, public). Each per-project bootstrap directory is a small root module that wires the published module to the project's values:

 ```hcl
 terraform {
@@ -68,9 +68,9 @@ terraform {
     }
   }

-  # The first `terraform apply` runs with local state.
-  # After the bucket exists, uncomment this block (substitute the outputs)
-  # and run `terraform init -migrate-state` to lift state into the bucket.
+  # First `terraform apply` runs with local state. Once the bucket exists,
+  # uncomment this block (substitute the bucket name the apply emits) and
+  # run `terraform init -migrate-state` to lift state into the bucket.
   #
   # backend "s3" {
   #   bucket = "tfstate-<project>-<account-id>"
@@ -81,244 +81,48 @@ terraform {
   # }
 }

-variable "project" {
-  description = "Short kebab-case project id. Matches the consuming repo's last segment."
-  type        = string
-}
-
-variable "github_org" {
-  description = "GitHub organization that owns the consuming repo."
-  type        = string
-}
-
-variable "github_repo" {
-  description = "Name of the consuming repo. Used in the OIDC sub-claim match."
-  type        = string
-}
-
-variable "branch_pattern" {
-  description = "Branch name CI is allowed to assume the role from on push."
-  type        = string
-  default     = "main"
-}
-
-variable "operator_user_arns" {
-  description = "IAM user ARNs of human operators allowed to assume the role with MFA."
-  type        = list(string)
-  default     = []
-}
-
-variable "aws_region" {
-  description = "Region for the state bucket."
-  type        = string
-  default     = "us-east-1"
-}
-
 provider "aws" {
-  region = var.aws_region
-
-  default_tags {
-    tags = {
-      Project     = var.project
-      ManagedBy   = "Terraform"
-      Repo        = "${var.github_org}/${var.github_repo}"
-      Environment = "bootstrap"
-    }
-  }
-}
-
-data "aws_caller_identity" "current" {}
-
-data "aws_iam_openid_connect_provider" "github" {
-  url = "https://token.actions.githubusercontent.com"
-}
-
-locals {
-  bucket_name = "tfstate-${var.project}-${data.aws_caller_identity.current.account_id}"
-  role_name   = "tf-${var.project}"
+  region = "us-east-1"
 }

-resource "aws_s3_bucket" "state" {
-  bucket = local.bucket_name
-}
+module "state_backend" {
+  source = "git::https://github.com/dryvist/terraform-aws-template.git?ref=v0.1.0"

-resource "aws_s3_bucket_versioning" "state" {
-  bucket = aws_s3_bucket.state.id
-  versioning_configuration {
-    status = "Enabled"
-  }
-}
+  project        = "proxmox"
+  github_org     = "<org>"
+  github_repo    = "terraform-proxmox"
+  branch_pattern = "main"

-resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
-  bucket = aws_s3_bucket.state.id
-  rule {
-    apply_server_side_encryption_by_default {
-      sse_algorithm = "AES256"
-    }
-  }
+  operator_user_arns = [
+    "arn:aws:iam::<account-id>:user/<username>",
+  ]
 }

-resource "aws_s3_bucket_public_access_block" "state" {
-  bucket                  = aws_s3_bucket.state.id
-  block_public_acls       = true
-  block_public_policy     = true
-  ignore_public_acls      = true
-  restrict_public_buckets = true
-}
-
-resource "aws_s3_bucket_lifecycle_configuration" "state" {
-  bucket = aws_s3_bucket.state.id
-  rule {
-    id     = "expire-noncurrent-versions"
-    status =
"Enabled" - filter {} - noncurrent_version_expiration { - noncurrent_days = 90 - } - } -} - -resource "aws_s3_bucket_policy" "deny_insecure_transport" { - bucket = aws_s3_bucket.state.id - policy = jsonencode({ - Version = "2012-10-17" - Statement = [{ - Sid = "DenyInsecureTransport" - Effect = "Deny" - Principal = "*" - Action = "s3:*" - Resource = [aws_s3_bucket.state.arn, "${aws_s3_bucket.state.arn}/*"] - Condition = { - Bool = { "aws:SecureTransport" = "false" } - } - }] - }) -} - -resource "aws_iam_role" "terraform" { - name = local.role_name - - assume_role_policy = jsonencode({ - Version = "2012-10-17" - Statement = concat( - [ - { - Sid = "GitHubOIDCBranchPush" - Effect = "Allow" - Principal = { Federated = data.aws_iam_openid_connect_provider.github.arn } - Action = "sts:AssumeRoleWithWebIdentity" - Condition = { - StringEquals = { - "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com" - } - StringLike = { - "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:ref:refs/heads/${var.branch_pattern}" - } - } - }, - { - Sid = "GitHubOIDCPullRequest" - Effect = "Allow" - Principal = { Federated = data.aws_iam_openid_connect_provider.github.arn } - Action = "sts:AssumeRoleWithWebIdentity" - Condition = { - StringEquals = { - "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com" - } - StringLike = { - "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:pull_request" - } - } - } - ], - length(var.operator_user_arns) > 0 ? 
[{
-        Sid       = "OperatorAssumeWithMFA"
-        Effect    = "Allow"
-        Principal = { AWS = var.operator_user_arns }
-        Action    = "sts:AssumeRole"
-        Condition = {
-          Bool = { "aws:MultiFactorAuthPresent" = "true" }
-        }
-      }] : []
-    )
-  })
-}
-
-resource "aws_iam_role_policy" "state" {
-  name = "${local.role_name}-state"
-  role = aws_iam_role.terraform.id
-
-  policy = jsonencode({
-    Version = "2012-10-17"
-    Statement = [
-      {
-        Sid      = "ListBucket"
-        Effect   = "Allow"
-        Action   = ["s3:ListBucket", "s3:GetBucketVersioning"]
-        Resource = aws_s3_bucket.state.arn
-      },
-      {
-        Sid      = "ObjectAccess"
-        Effect   = "Allow"
-        Action   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
-        Resource = "${aws_s3_bucket.state.arn}/*"
-      }
-    ]
-  })
-}
-
-output "state_bucket" {
-  description = "Name of the S3 bucket for Terraform state."
-  value       = aws_s3_bucket.state.bucket
-}
-
-output "state_bucket_arn" {
-  description = "ARN of the state bucket."
-  value       = aws_s3_bucket.state.arn
-}
-
-output "tf_role_arn" {
-  description = "Role ARN for the consuming repo's backend, local dev, and CI."
-  value       = aws_iam_role.terraform.arn
-}
+output "backend_config" { value = module.state_backend.backend_config }
+output "tf_role_arn" { value = module.state_backend.tf_role_arn }
+output "state_bucket" { value = module.state_backend.state_bucket }
+output "state_key_prefix" { value = module.state_backend.state_key_prefix }
+```

-output "aws_region" {
-  description = "Region where the state bucket lives."
-  value       = var.aws_region
-}
+Full input / output reference and the list of underlying AWS resources live in the module repo's [README][readme]. The module pins to a tagged release (`v0.1.0` above) — breaking changes ship as new majors so existing bootstraps stay valid until you re-pin.

-output "state_key_prefix" {
-  description = "Prefix the consuming repo writes its state objects under."
-  value       = "${var.project}/"
-}
-```
+The S3-native lock object (`<project>/terraform.tfstate.tflock`) is just another S3 object under the same prefix as state — no separate IAM permission required, no DynamoDB table.

-The S3-native lock object (`<project>/terraform.tfstate.tflock`) is just another S3 object under the same prefix as state — no separate IAM permission is required for it, and no DynamoDB table.
+[repo]: https://github.com/dryvist/terraform-aws-template
+[readme]: https://github.com/dryvist/terraform-aws-template/blob/main/README.md

 ## Bootstrap the chicken-and-egg

-    Create a `terraform.tfvars` next to `main.tf`:
-
-    ```hcl
-    project        = "proxmox"
-    github_org     = "<org>"
-    github_repo    = "terraform-proxmox"
-    branch_pattern = "main"
-    operator_user_arns = [
-      "arn:aws:iam::<account-id>:user/<username>",
-    ]
-    ```
-
-    Then:
+    With the module block above pointing at your project's values:

     ```bash
     terraform init  # local state — no backend block yet
     terraform apply
     ```

-    Confirm the apply. Save the outputs — `terraform output -json` gives you a single blob you can pipe into the consuming repo's backend block.
+    Confirm the apply. `terraform output backend_config` emits the ready-to-paste `backend "s3" {}` block for the consuming repo (`terraform output -raw backend_config > /tmp/backend.tf` to ship it straight to a file).

     In `main.tf`, uncomment the `backend "s3"` block at the top of the `terraform {}` block and substitute the outputs the apply just produced: