fix(scripts): aio-version-checker misses Terraform and cert-manager drift #654

Description

@katriendg

Summary

scripts/aio-version-checker.py silently fails to detect version drift for Terraform files and for the cert-manager extension. As a result, CI can pass while the codebase is pinned to AIO component versions that no longer match the upstream manifest. Two independent gaps were found while preparing the AIO 2606 (v1.3.137) upgrade.

Gap 1 — Terraform detection is broken (python-hcl2 list wrapping)

python-hcl2 (0.4.3, currently installed) wraps block bodies in a single-element list, so a variable's default parses as [{...}] rather than {...}. The extractors (extract_tf_variables, extract_tf_instance_variables) check isinstance(defaults, dict), which is now always False, so they return an empty list. The checker therefore reports zero Terraform mismatches regardless of actual drift.

Reproduction (before fix):

python3 scripts/aio-version-checker.py --release-tag v1.2.36 -t terraform
# -> []   (should report secret_sync_controller and operations_config mismatches)
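
The failure mode can be illustrated without python-hcl2 itself. The sketch below hand-builds a structure shaped like the list-wrapped output described above and runs a dict-only check over it. The exact nesting, the `version` and `train` field names, the placeholder values, and the `extract_versions_dict_only` function are assumptions for illustration, not copied from the parser output or from the checker.

```python
# Hypothetical parsed structure, assuming the list wrapping described above:
# the variable's default arrives as [{...}] instead of {...}.
parsed_variables = {
    "secret_sync_controller": {
        "default": [{"version": "0.0.0", "train": "stable"}],  # placeholder values
    }
}


def extract_versions_dict_only(variables):
    """Mimic an extractor that accepts only a dict-typed default."""
    found = []
    for name, body in variables.items():
        defaults = body.get("default")
        # For a list-wrapped default this check is False, so nothing is collected.
        if isinstance(defaults, dict):
            found.append((name, defaults.get("version")))
    return found


print(extract_versions_dict_only(parsed_variables))  # prints []
```

Because the check never raises, the extractor returns an empty list and the run looks clean, which matches the silent behaviour reported here.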

Gap 2 — cert-manager drift is invisible (component moved to 109-arc-extensions)

cert-manager and container storage were moved out of 110-iot-ops into the 109-arc-extensions component, but the checker still points at the old location/names:

  • TERRAFORM_COMPONENTS still carries a cert_manager mapping, but no such variable exists in 110-iot-ops anymore.
  • BICEP_COMPONENTS still references aioCertManagerExtensionDefaults, which no longer exists (the 109 component uses certManagerExtensionDefaults).
  • The checker never reads 109-arc-extensions/terraform/variables.tf or 109-arc-extensions/bicep/types.bicep.

Consequently the manifest bump certManager 0.12.0 → 0.13.3 was not flagged by the checker.
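
A rough way to picture this gap: a lookup that searches only the old location for the old name finds nothing, so there is no local version to compare against the manifest. In the sketch below, the `declared_names` snapshot, the `find_declaring_file` helper, and the `110-iot-ops/bicep/types.bicep` path are assumptions for illustration. Only the two variable names and the 109-arc-extensions path come from this issue.

```python
# Hypothetical snapshot of which defaults each Bicep types file declares.
declared_names = {
    # Assumed path; cert-manager defaults no longer live in this component.
    "110-iot-ops/bicep/types.bicep": set(),
    "109-arc-extensions/bicep/types.bicep": {"certManagerExtensionDefaults"},
}


def find_declaring_file(name, files):
    """Return the first file in `files` that declares `name`, or None."""
    for path in files:
        if name in declared_names.get(path, set()):
            return path
    return None


# Old behaviour: old name, old location only.
old = find_declaring_file(
    "aioCertManagerExtensionDefaults", ["110-iot-ops/bicep/types.bicep"]
)
# Covering the gap: new name, searched across both components.
new = find_declaring_file("certManagerExtensionDefaults", list(declared_names))

print(old)  # prints None
print(new)  # prints 109-arc-extensions/bicep/types.bicep
```

With no declaring file found, a checker has nothing to diff, which would explain why the 0.12.0 → 0.13.3 bump went unreported.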

Impact

  • CI (aio-version-checker) can report a clean run while Terraform pins and the cert-manager version are stale.
  • Version drift is caught only by manual inspection during upgrades.

Proposed fix

  • Unwrap python-hcl2 list-wrapped default blocks in the Terraform extractors (add a small _unwrap_hcl helper) so Terraform detection works again.
  • Add cert-manager detection sourced from 109-arc-extensions for both Terraform (arc_extensions.cert_manager_extension) and Bicep (certManagerExtensionDefaults), including a BICEP_COMPONENT_FILES map for variables declared outside 110-iot-ops.
  • Attach the correct local_file to each reported mismatch so CI output points at the right component.
  • Leave container storage out until the manifest republishes a container storage key.

Acceptance criteria

  • python3 scripts/aio-version-checker.py --release-tag <current> returns [] when the codebase matches the manifest.
  • Running against an older tag reports both Terraform and Bicep mismatches, including cert_manager from 109-arc-extensions, each with the correct local_file.
  • The mapping documentation notes how to add a new manifest key (TERRAFORM_COMPONENTS, BICEP_COMPONENTS, BICEP_COMPONENT_FILES, dedicated extractor) so future extensions are covered.
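
The first two criteria can be sketched as a small comparison routine. The `find_mismatches` function, the record fields other than `local_file`, and the shape of the inputs are assumptions for illustration. The certManager versions and the 109-arc-extensions path are the ones mentioned in this issue.

```python
def find_mismatches(manifest_versions, local_versions):
    """Compare manifest versions with local pins.

    manifest_versions: component -> expected version
    local_versions:    component -> (pinned version, file declaring the pin)
    """
    mismatches = []
    for component, expected in sorted(manifest_versions.items()):
        if component not in local_versions:
            continue  # component not tracked locally in this sketch
        actual, local_file = local_versions[component]
        if actual != expected:
            mismatches.append(
                {
                    "component": component,
                    "expected": expected,
                    "actual": actual,
                    "local_file": local_file,
                }
            )
    return mismatches


cert_manager_file = "109-arc-extensions/terraform/variables.tf"

# Codebase matches the manifest: expect an empty list.
in_sync = find_mismatches(
    {"certManager": "0.13.3"}, {"certManager": ("0.13.3", cert_manager_file)}
)
# Codebase still pinned to the older version: expect one mismatch.
drifted = find_mismatches(
    {"certManager": "0.13.3"}, {"certManager": ("0.12.0", cert_manager_file)}
)

print(in_sync)                   # prints []
print(drifted[0]["local_file"])  # prints 109-arc-extensions/terraform/variables.tf
```

Carrying the file path alongside each pinned version is one simple way to make sure the reported `local_file` names the component that actually declares the pin.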

Notes

  • The iotops-version-upgrade.prompt.md should reference the checker as a validation step and describe the component-to-manifest version map (delivered alongside this fix).

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), ci (Continuous integration), infrastructure (Infrastructure as code and platform), priority-2 (High priority, address soon), scripts (PowerShell, Bash, or Python scripts)
