chore(terraform): migrate cross-variable validations to lifecycle.precondition for Terraform v1.15 compatibility #608

Description

@katriendg

Summary

Terraform v1.15.0 enforces stricter evaluation of validation blocks inside variable
declarations. When a validation condition references another variable (var.Y
inside the validation of variable "X"), and the referenced variable has not been
processed yet (for example, during terraform test with targeted operations or partial
graph evaluation), Terraform now errors:

Error: Reference to uninitialized variable

The variable var.<name> was not processed by the most recent operation,
this likely means the previous operation either failed or was incomplete due to targeting.

This was previously tolerated. Local apply with Terraform v1.14.x passes, but any
operation under v1.15+ that initializes a working directory containing these
cross-variable validations fails. There are 10 cross-variable validation blocks across
5 components that need to migrate from variable.validation to lifecycle.precondition
on a root-level guard resource.

Problem

Cross-variable validation (var.Y referenced inside variable "X") is no longer safe
under Terraform v1.15. Preconditions are evaluated during plan with all variables fully
resolved, so they are unaffected by the v1.15 strictness on validation blocks.

Affected Locations (current as of main, 2026-06-11)

variables.deps.tf cross-variable validations

| #  | File | Variable | Validation line | References |
|----|------|----------|-----------------|------------|
| 1  | src/000-cloud/080-azureml/terraform/variables.deps.tf | ml_workload_identity | L77 | var.should_assign_ml_workload_identity_roles |
| 2  | src/000-cloud/080-azureml/terraform/variables.deps.tf | network_security_group | L89 | var.should_associate_network_security_group |
| 3  | src/000-cloud/080-azureml/terraform/variables.deps.tf | nat_gateway | L111 | var.should_enable_nat_gateway |
| 4  | src/100-edge/100-cncf-cluster/terraform/variables.deps.tf | arc_onboarding_identity (validation 1) | L56 | var.should_assign_roles, var.arc_onboarding_sp, var.arc_onboarding_principal_ids |
| 5  | src/100-edge/100-cncf-cluster/terraform/variables.deps.tf | arc_onboarding_identity (validation 2) | L61 | var.should_assign_roles, var.arc_onboarding_sp, var.arc_onboarding_principal_ids |
| 6  | src/100-edge/100-cncf-cluster/terraform/variables.deps.tf | key_vault | L86 | var.should_upload_to_key_vault |
| 7  | src/000-cloud/036-managed-redis/terraform/variables.deps.tf | private_endpoint_subnet | L20 | var.should_enable_private_endpoint |

variables.tf cross-variable validations

| #  | File | Variable | Validation line | References |
|----|------|----------|-----------------|------------|
| 8  | src/000-cloud/031-fabric/terraform/variables.tf | fabric_capacity_admins | L57 | var.should_create_fabric_capacity |
| 9  | src/100-edge/100-cncf-cluster/terraform/variables.tf | should_generate_cluster_server_token | L159 | var.cluster_server_token |
| 10 | src/100-edge/100-cncf-cluster/terraform/modules/ubuntu-k3s/variables.tf | should_generate_cluster_server_token | L102 | var.cluster_server_token |

Proposed Solution

Migrate each cross-variable validation into a lifecycle.precondition block on a
terraform_data "validation guard" resource at the root of each module. Preconditions
are evaluated during plan with all variables fully resolved.

Before (in variables.deps.tf)

variable "ml_workload_identity" {
  type = object({
    id           = string
    principal_id = string
  })
  description = "AzureML workload managed identity object containing id and principal_id."
  default     = null
  validation {
    condition     = (!var.should_assign_ml_workload_identity_roles) || (var.ml_workload_identity != null)
    error_message = "ml_workload_identity must be provided when should_assign_ml_workload_identity_roles is true."
  }
}

After (validation removed from the variable)

variable "ml_workload_identity" {
  type = object({
    id           = string
    principal_id = string
  })
  description = "AzureML workload managed identity object containing id and principal_id."
  default     = null
}

Added in main.tf (single guard resource per component)

resource "terraform_data" "validate_inputs" {
  lifecycle {
    precondition {
      condition     = !var.should_assign_ml_workload_identity_roles || var.ml_workload_identity != null
      error_message = "ml_workload_identity must be provided when should_assign_ml_workload_identity_roles is true."
    }
    precondition {
      condition     = !var.should_associate_network_security_group || var.network_security_group != null
      error_message = "network_security_group must be provided when should_associate_network_security_group is true."
    }
    precondition {
      condition     = !var.should_enable_nat_gateway || var.nat_gateway != null
      error_message = "nat_gateway must be provided when should_enable_nat_gateway is true."
    }
  }
}

For multi-condition rules (for example, arc_onboarding_identity mutual exclusion),
translate each validation block to a separate precondition block on the same
terraform_data resource, preserving the original error messages verbatim.
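
As a rough sketch of that translation, the fragment below places two preconditions for arc_onboarding_identity on the same guard resource. The conditions are illustrative guesses at a "required" rule and a "mutual exclusion" rule, not the component's actual expressions, and the error messages are placeholders. The real conditions and messages should be copied verbatim from the existing validation blocks in variables.deps.tf. The sketch assumes the referenced variables are already declared by the component and accept null.

```hcl
resource "terraform_data" "validate_inputs" {
  lifecycle {
    # Hypothetical "required" rule: when roles are assigned, at least one
    # identity source must be set. Replace with the original condition.
    precondition {
      condition = (
        !var.should_assign_roles
        || var.arc_onboarding_identity != null
        || var.arc_onboarding_sp != null
        || var.arc_onboarding_principal_ids != null
      )
      error_message = "<original error message of validation 1, copied verbatim>"
    }

    # Hypothetical "mutual exclusion" rule: at most one identity source is set.
    # Replace with the original condition.
    precondition {
      condition = (
        (var.arc_onboarding_identity != null ? 1 : 0)
        + (var.arc_onboarding_sp != null ? 1 : 0)
        + (var.arc_onboarding_principal_ids != null ? 1 : 0)
      ) <= 1
      error_message = "<original error message of validation 2, copied verbatim>"
    }
  }
}
```

Keeping one precondition per original validation block means each rule still reports its own message, which makes the plan-time spot checks easier to compare against the pre-migration behavior.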

Per-component edits

  • src/000-cloud/080-azureml/terraform/ — Remove three validation blocks from
    variables.deps.tf (ml_workload_identity, network_security_group, nat_gateway).
    Add terraform_data "validate_inputs" to main.tf with three preconditions.
  • src/100-edge/100-cncf-cluster/terraform/ — Remove two validation blocks on
    arc_onboarding_identity and one on key_vault from variables.deps.tf; remove the
    validation block on should_generate_cluster_server_token from variables.tf. Add
    terraform_data "validate_inputs" to main.tf with four preconditions (arc identity
    required, arc identity mutual exclusion, key vault required, cluster_server_token vs
    should_generate_cluster_server_token).
  • src/100-edge/100-cncf-cluster/terraform/modules/ubuntu-k3s/ — Remove the
    validation block on should_generate_cluster_server_token from variables.tf. Add a
    precondition (on a new terraform_data or an existing root-level resource in the
    module's main.tf) enforcing the same rule.
  • src/000-cloud/036-managed-redis/terraform/ — Remove the validation block on
    private_endpoint_subnet from variables.deps.tf. Add terraform_data "validate_inputs"
    to main.tf with a precondition.
  • src/000-cloud/031-fabric/terraform/ — Remove the validation block on
    fabric_capacity_admins from variables.tf. Add terraform_data "validate_inputs" to
    main.tf with a precondition.
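
For the single-rule components the guard reduces to one precondition. Below is a minimal sketch for 036-managed-redis. It assumes the existing validation has the same "must be provided when the flag is true" shape as the azureml rules. The condition is inferred by analogy rather than taken from the file, and the error message is a placeholder for the original text.

```hcl
# main.tf in src/000-cloud/036-managed-redis/terraform/ (sketch only)
resource "terraform_data" "validate_inputs" {
  lifecycle {
    precondition {
      # Assumed shape of the rule; confirm against the validation block
      # currently on private_endpoint_subnet in variables.deps.tf.
      condition     = !var.should_enable_private_endpoint || var.private_endpoint_subnet != null
      error_message = "<original error message, copied verbatim>"
    }
  }
}
```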

Implementation notes

  • Preserve original error messages verbatim so downstream documentation and operator
    tooling continue to match.
  • Prefer terraform_data "validate_inputs" over null_resource: it is built-in (no extra
    provider) and idempotent across plans.
  • If a component's main.tf already contains a sentinel/guard resource with
    lifecycle.precondition, append the new preconditions there instead of introducing a new
    terraform_data resource.

Suggested approach: HVE Core RPI workflow

Contributors can use the HVE Core Research → Plan → Implement → Review (RPI) workflow
to let GitHub Copilot AI assist with this change while keeping every decision traceable. See
Understanding the RPI Workflow.

This change qualifies for RPI: it spans multiple files (5 components) and crosses concerns
across variables.tf, variables.deps.tf, and main.tf.

  1. Research: /task-research migrate cross-variable terraform validations to lifecycle.precondition.
    Capture the exact current locations (table in this issue), the v1.15 error class, and the
    terraform_data "validate_inputs" pattern with evidence and line numbers.
  2. /clear
  3. Plan: /task-plan. Produce a per-component task list with checkboxes covering the
    10 validation migrations and the preconditions to add to each main.tf.
  4. /clear
  5. Implement: /task-implement. Execute the plan component by component, tracking
    changes in the changes log.
  6. /clear
  7. Review: /task-review. Validate against the plan, run lint/validate, and confirm
    intentional misconfigurations still fail at plan time with identical error messages.

Remember to /clear between phases and open the relevant .copilot-tracking/ artifact
before invoking the next agent.

Validation

For each affected component:

  1. Run npm run tf-validate -- <component-path> and confirm clean validation.
  2. Run npm run tflint-fix -- <component-path> to ensure lint cleanliness.
  3. Run terraform test (or npm run tf-test -- <component-path>) under Terraform v1.15+
    to confirm the original error class no longer occurs.
  4. Spot-check that intentional misconfigurations (for example,
    should_assign_ml_workload_identity_roles = true with ml_workload_identity = null)
    still fail at plan time with the same error messages.
  5. Run npm run tf-docs (or the Terraform Build task) to regenerate each component's
    auto-generated README.md. The new terraform_data "validate_inputs" resource appears
    in the generated Resources section, so the docs must be regenerated and committed.
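
The spot check in step 4 could also be captured as a terraform test case, so the regression is exercised automatically. The sketch below assumes a hypothetical test file name and run label, and omits any other required inputs of the azureml component. It uses expect_failures to assert that the plan fails on the guard resource's precondition.

```hcl
# tests/validate_inputs.tftest.hcl (hypothetical file name)
run "rejects_missing_ml_workload_identity" {
  # Preconditions are evaluated at plan time, so no apply is needed.
  command = plan

  variables {
    should_assign_ml_workload_identity_roles = true
    ml_workload_identity                     = null
    # Other required inputs of the component are omitted in this sketch.
  }

  # The run passes only if the plan fails on this checkable object.
  expect_failures = [
    terraform_data.validate_inputs,
  ]
}
```

Note that expect_failures confirms which object failed, not the message text, so the "same error messages" part of the check still needs a manual comparison or a separate assertion on the CLI output.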

Acceptance criteria

  • All 10 cross-variable validation blocks removed from the variables above.
  • A root-level terraform_data "validate_inputs" (or appended preconditions on an
    existing guard) added to each of the 5 affected components/modules.
  • Original error messages preserved verbatim.
  • npm run tf-validate and npm run tflint-fix-all pass with zero errors.
  • npm run tf-docs regenerated and committed for each affected component.
  • Intentional misconfigurations still fail at plan time with the same messages.

Note: ADO pipeline Terraform version (manual, secondary)

The ADO pipeline template still uses terraformVersion: latest, which resolves to v1.15+
and surfaces this error before the migration lands.

ADO pipelines are secondary (GitHub Actions is the primary CI, and those workflows were
already pinned to 1.12.2 via #484). A long-term fix is not required here. As a manual
stopgap until this migration merges, pin the ADO template to a v1.14.x version, for example:

terraformVersion: 1.14.9

Revert to latest (or bump deliberately) once the precondition migration is verified under
v1.15+.
