43 changes: 40 additions & 3 deletions .github/prompts/iotops-version-upgrade.prompt.md
Expand Up @@ -17,10 +17,23 @@ Use runSubagent tool for complex steps in this prompt like executing scripts, fe
### Components Analyzed by This Prompt

This prompt analyzes and updates the following IoT Operations-related components:
- **110-iot-ops**: Core IoT Operations instance, brokers, authentication, and listeners
- **109-arc-extensions**: Arc dependency extensions — `cert-manager` (`certManager`) and container storage. cert-manager is tracked against the AIO manifest; container storage is **not** currently published in the manifest, so leave it unless the manifest reintroduces it.
- **110-iot-ops**: Core IoT Operations instance, brokers, authentication, and listeners — includes the AIO instance version (`iotOperations`) and the Secret Store extension (`secretStore`).
- **111-assets**: Azure Device Registry assets and endpoint profiles
- **130-messaging**: Dataflow endpoints, profiles, and messaging integration

### Component-to-Manifest Version Map

The AIO manifests publish component versions under `variables.VERSIONS`/`variables.TRAINS`. These map to codebase variables as follows (`scripts/aio-version-checker.py` checks these mappings for drift):

| Manifest key | Manifest file | Terraform | Bicep |
|-----------------|---------------|-----------------------------------------------------------------------------------------------|-------------------------------------------------------------------------|
| `certManager` | enablement | `109-arc-extensions/terraform/variables.tf` (`arc_extensions.cert_manager_extension.version`) | `109-arc-extensions/bicep/types.bicep` (`certManagerExtensionDefaults`) |
| `secretStore` | enablement | `110-iot-ops/terraform/variables.init.tf` (`secret_sync_controller`) | `110-iot-ops/bicep/types.bicep` (`secretStoreExtensionDefaults`) |
| `iotOperations` | instance | `110-iot-ops/terraform/variables.instance.tf` (`operations_config`) | `110-iot-ops/bicep/types.bicep` (`aioExtensionDefaults`) |

If a new manifest key appears (e.g. container storage returns, or a new dependency extension is added), add it to both the codebase and the `aio-version-checker.py` mappings (`TERRAFORM_COMPONENTS`, `BICEP_COMPONENTS`, `BICEP_COMPONENT_FILES`, and any dedicated extractor) so future CI catches drift.
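
A minimal sketch of how a newly published manifest key could be spotted, assuming the `local_name:remote_name` entry format used by `BICEP_COMPONENTS`. The helper names and the `containerStorage` key are hypothetical and are not part of `aio-version-checker.py`; the manifest dictionary is an illustrative merged view of the enablement and instance manifests.

```python
# Illustrative copy of the mapping list; the real one lives in
# scripts/aio-version-checker.py.
BICEP_COMPONENTS = [
    "certManagerExtensionDefaults:certManager",
    "secretStoreExtensionDefaults:secretStore",
    "aioExtensionDefaults:iotOperations",
]


def remote_keys(components: list[str]) -> set[str]:
    """Return the manifest keys covered by 'local_name:remote_name' entries."""
    return {entry.split(":", 1)[1] for entry in components}


def untracked_keys(manifest_versions: dict[str, str], components: list[str]) -> list[str]:
    """List manifest keys that have no mapping yet, sorted for stable output."""
    return sorted(set(manifest_versions) - remote_keys(components))


# Hypothetical manifest fragment; "containerStorage" is a made-up key.
manifest_versions = {
    "certManager": "0.13.3",
    "secretStore": "1.5.0",
    "iotOperations": "1.3.137",
    "containerStorage": "9.9.9",
}

print(untracked_keys(manifest_versions, BICEP_COMPONENTS))
```

Any key reported here would need a matching entry in the codebase and in the checker mappings before CI can detect drift for it.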

### Workflow Overview

The prompt follows a three-phase approach:
Expand All @@ -42,7 +55,7 @@ These steps must be completed immediately to gather all necessary information be

### Execution Requirements for Phase 1:
- **Sequential Execution**: Complete steps 1-6 in order before proceeding to Phase 2
- **Component Coverage**: Analyze ALL three components (110-iot-ops, 111-assets, 130-messaging)
- **Component Coverage**: Analyze ALL four components (109-arc-extensions, 110-iot-ops, 111-assets, 130-messaging)
- **API Validation**: Cross-validate with REST specifications to catch breaking changes
- **Structural Comparison**: Detect ALL differences between JSON and codebase (new, removed, changed)
- **Complete Analysis**: Do not skip to planning until all immediate analysis is complete
Expand Down Expand Up @@ -313,6 +326,27 @@ Check the following files within the `src/100-edge/110-iot-ops/` component:

Before moving to the next step: update the plan file.

## 4.5. Arc Extensions Component Analysis (109-arc-extensions) - EXECUTE IMMEDIATELY

**EXECUTE IMMEDIATELY**: The `cert-manager` (`certManager`) dependency extension lives in `109-arc-extensions`, not `110-iot-ops`. Container storage also lives here but is currently **not** published in the enablement manifest.

**Analysis Process**:
1. Read the current cert-manager and container storage versions:
- Terraform: `src/100-edge/109-arc-extensions/terraform/variables.tf` (`arc_extensions.cert_manager_extension.version` and `container_storage_extension.version`)
- Bicep: `src/100-edge/109-arc-extensions/bicep/types.bicep` (`certManagerExtensionDefaults`, `containerStorageExtensionDefaults`)
2. Compare the enablement manifest `variables.VERSIONS.certManager` against the codebase cert-manager version. Plan a bump (both Terraform and Bicep) when they differ.
3. For container storage: only plan a change if the manifest reintroduces a container storage key. Otherwise leave it as-is and note this in the plan.
4. Confirm no structural changes — these are version-default-only variables.

<!-- <example-arc-extensions-plan-entries> -->
```markdown
- [ ] Update `certManagerExtensionDefaults.release.version` from "<old>" to "<new>" in `src/100-edge/109-arc-extensions/bicep/types.bicep` (manifest `certManager`).
- [ ] Update `arc_extensions.cert_manager_extension.version` default from "<old>" to "<new>" in `src/100-edge/109-arc-extensions/terraform/variables.tf` (manifest `certManager`).
```
<!-- </example-arc-extensions-plan-entries> -->
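
The comparison in step 2 can be sketched as below. This is an illustration under assumptions, not the prompt's own tooling: the function name and the `CERT_MANAGER_TARGETS` list are hypothetical, and the manifest is assumed to be already loaded as a dictionary with the `variables.VERSIONS.certManager` path described above.

```python
# Settings and files that carry the cert-manager version default.
CERT_MANAGER_TARGETS = [
    (
        "certManagerExtensionDefaults.release.version",
        "src/100-edge/109-arc-extensions/bicep/types.bicep",
    ),
    (
        "arc_extensions.cert_manager_extension.version",
        "src/100-edge/109-arc-extensions/terraform/variables.tf",
    ),
]


def cert_manager_plan_entries(manifest: dict, local_version: str) -> list[str]:
    """Return plan checklist lines when the manifest version differs from the local one."""
    target = manifest.get("variables", {}).get("VERSIONS", {}).get("certManager")
    if not target or target == local_version:
        # Key missing from the manifest, or already in sync: nothing to plan.
        return []
    return [
        f'- [ ] Update `{setting}` from "{local_version}" to "{target}" '
        f"in `{path}` (manifest `certManager`)."
        for setting, path in CERT_MANAGER_TARGETS
    ]


manifest = {"variables": {"VERSIONS": {"certManager": "0.13.3"}}}
for line in cert_manager_plan_entries(manifest, "0.12.0"):
    print(line)
```

Returning an empty list when the key is absent mirrors the guidance for container storage: no manifest key, no planned change.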

Before moving to the next step: update the plan file.

## 5. Assets Component Analysis (111-assets) - EXECUTE IMMEDIATELY

**EXECUTE IMMEDIATELY**: Extend the same API version analysis to the Assets module to detect any API changes:
Expand Down Expand Up @@ -349,11 +383,12 @@ Before moving to the next step: update the plan file.
- ✅ Target version configuration downloaded
- ✅ REST API specifications cross-validated
- ✅ Core IoT Operations component (110-iot-ops) analyzed
- ✅ Arc extensions component (109-arc-extensions) analyzed (cert-manager)
- ✅ Assets component (111-assets) analyzed
- ✅ Messaging component (130-messaging) analyzed

**CRITICAL**: Do not proceed to Phase 2 until ALL analysis is complete. The following are common execution pitfalls to avoid:
- ❌ **Don't skip component analysis**: All three components must be analyzed, not just 110-iot-ops
- ❌ **Don't skip component analysis**: All four components must be analyzed (109-arc-extensions, 110-iot-ops, 111-assets, 130-messaging), not just 110-iot-ops
- ❌ **Don't defer REST validation**: API specifications must be validated immediately, not marked as "future work"
- ❌ **Don't mix analysis with planning**: Complete all discovery before creating implementation plans

Expand Down Expand Up @@ -679,6 +714,8 @@ Plan validation steps to ensure comprehensive coverage of changes beyond the JSO
```markdown
- [ ] VALIDATE: Test all blueprints that use `110-iot-ops` component deploy successfully.
- [ ] VALIDATE: Verify CI tests cover new parameters and resource types.
- [ ] VALIDATE: Run `python3 scripts/aio-version-checker.py --release-tag <target-tag>` and confirm it returns `[]` (no mismatches) for both Terraform and Bicep across 109-arc-extensions and 110-iot-ops. If the checker misses a component you changed (e.g. a newly added manifest key), update its mappings so future CI catches drift.
- [ ] VALIDATE: Update `docs/getting-started/upgrade-aio.md`: set the target AIO release, the version-matrix row (compatible `azure-iot-ops` CLI version, cert-manager, secret-sync-controller, iotOperations), and `ms.date`. Source the CLI↔AIO↔component mapping from the [IoT Operations versions wiki](https://github.com/Azure/azure-iot-ops-cli-extension/wiki/IoT-Operations-versions).
```
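
The checker validation step could be automated roughly as follows. This sketch assumes the script writes only a JSON array of mismatches to stdout and that `--release-tag` is the only argument needed; the diff does not confirm either, and the wrapper function names are hypothetical.

```python
import json
import subprocess


def is_clean(checker_stdout: str) -> bool:
    """True when the checker output parses to an empty JSON list (no mismatches)."""
    return json.loads(checker_stdout) == []


def run_checker(release_tag: str) -> bool:
    """Run the checker for one release tag and report whether it found no drift.

    Assumption: other flags (for example, selecting Terraform or Bicep) may be
    required by the real script and are omitted here.
    """
    result = subprocess.run(
        ["python3", "scripts/aio-version-checker.py", "--release-tag", release_tag],
        capture_output=True,
        text=True,
        check=True,
    )
    return is_clean(result.stdout)


print(is_clean("[]"))
```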

Before moving to the next step: update the plan file.
Expand Down
8 changes: 4 additions & 4 deletions docs/getting-started/upgrade-aio.md
Expand Up @@ -2,7 +2,7 @@
title: Upgrade Azure IoT Operations
description: How to upgrade Azure IoT Operations (AIO) and reconcile the upgrade with the edge-ai Terraform or Bicep deployments
author: Edge AI Team
ms.date: 2026-05-05
ms.date: 2026-07-01
ms.topic: how-to
estimated_reading_time: 5
keywords:
Expand All @@ -28,18 +28,18 @@ The reconciliation steps differ between Terraform (stateful) and Bicep (stateles

## Version matrix

This repository currently targets the **AIO 2605** release. The table below maps `az iot ops` CLI versions to the component versions pinned in edge-ai:
This repository currently targets the **AIO 2606** release. The table below maps `az iot ops` CLI versions to the component versions pinned in edge-ai:

| CLI extension (`azure-iot-ops`) | AIO release | cert-manager | secret-sync-controller | iotOperations |
|---------------------------------|-------------|--------------|------------------------|---------------|
| 2.5.0 | 2605 | 0.12.0 | 1.4.1 | 1.3.105 |
| 2.7.0 | 2606 | 0.13.3 | 1.5.0 | 1.3.137 |

For the full upstream compatibility matrix, see [Supported versions — Azure IoT Operations](https://learn.microsoft.com/azure/iot-operations/deploy-iot-ops/howto-upgrade?tabs=portal#supported-versions).

## Prerequisites

- Azure CLI logged in to the target subscription.
- `azure-iot-ops` CLI extension installed (this repo expects version `2.5.0`).
- `azure-iot-ops` CLI extension installed (this repo expects version `2.7.0`).
- `<RESOURCE_GROUP_NAME>` — the resource group containing the AIO instance.
- `<INSTANCE_NAME>` — the AIO instance name (for edge-ai blueprints this is typically `iotops-arck-<resource_prefix>-<environment>-<instance>`).
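
As a rough sketch, the CLI extension prerequisite could be checked from Python before starting an upgrade. The helper names are hypothetical; the sketch assumes the Azure CLI is on `PATH` and that the extension is installed, and it uses the expected version stated in the prerequisites.

```python
import subprocess

EXPECTED_CLI_VERSION = "2.7.0"  # version this repo expects, per the prerequisites


def installed_extension_version(name: str = "azure-iot-ops") -> str:
    """Ask the Azure CLI for the installed version of an extension."""
    result = subprocess.run(
        ["az", "extension", "show", "--name", name,
         "--query", "version", "--output", "tsv"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the extension is missing
    )
    return result.stdout.strip()


def matches_expected(installed: str, expected: str = EXPECTED_CLI_VERSION) -> bool:
    """Compare an installed version string with the expected one."""
    return installed.strip() == expected


print(matches_expected("2.7.0"))
```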

Expand Down
147 changes: 135 additions & 12 deletions scripts/aio-version-checker.py
Expand Up @@ -98,7 +98,12 @@
TERRAFORM_VARS_INSTANCE_FILE = (
    "./src/100-edge/110-iot-ops/terraform/variables.instance.tf"
)
# cert-manager and container storage extensions live in the 109-arc-extensions component
TERRAFORM_ARC_EXTENSIONS_FILE = (
    "./src/100-edge/109-arc-extensions/terraform/variables.tf"
)
BICEP_VARS_FILE = "./src/100-edge/110-iot-ops/bicep/types.bicep"
BICEP_ARC_EXTENSIONS_FILE = "./src/100-edge/109-arc-extensions/bicep/types.bicep"

# IaC type definition (discriminated union)
IaCType = Literal["terraform", "bicep"]
Expand All @@ -112,11 +117,17 @@

# Component mappings for Bicep (bicep_name:remote_name)
BICEP_COMPONENTS = [
"aioCertManagerExtensionDefaults:certManager",
"certManagerExtensionDefaults:certManager",
"secretStoreExtensionDefaults:secretStore",
"aioExtensionDefaults:iotOperations", # Maps to iotOperations in manifest
]

# Bicep variables whose declaration lives outside BICEP_VARS_FILE (110-iot-ops).
# Maps the bicep variable name to the file that declares it.
BICEP_COMPONENT_FILES = {
"certManagerExtensionDefaults": BICEP_ARC_EXTENSIONS_FILE,
}


def parse_args() -> argparse.Namespace:
"""
Expand Down Expand Up @@ -400,6 +411,26 @@ def download_manifests(
return manifest_data


def _unwrap_hcl(value: Any) -> Any:
    """
    Normalize a value parsed by python-hcl2.

    Recent python-hcl2 releases wrap block bodies (such as a variable's
    ``default``) in a single-element list, e.g. ``[{...}]`` instead of ``{...}``.
    This helper unwraps that list so callers can treat the result as a dict.

    Args:
        value (Any): A value produced by ``hcl2.load``.

    Returns:
        Any: The inner dict when ``value`` is a single-element list wrapping a
            dict; otherwise ``value`` unchanged.
    """
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def extract_tf_variables(tf_file: str) -> list[dict[str, str]]:
"""
Extract component information from the Terraform variables file using HCL2 parser.
Expand Down Expand Up @@ -451,7 +482,7 @@ def extract_tf_variables(tf_file: str) -> list[dict[str, str]]:

# Check if default exists and contains version/train
if isinstance(var_props, dict) and "default" in var_props:
defaults = var_props["default"]
defaults = _unwrap_hcl(var_props["default"])
if isinstance(defaults, dict):
version = defaults.get("version", "")
train = defaults.get("train", "")
Expand All @@ -462,6 +493,7 @@ def extract_tf_variables(tf_file: str) -> list[dict[str, str]]:
"name": var_name,
"version": version,
"train": train,
"local_file": tf_file,
}
)

Expand Down Expand Up @@ -510,7 +542,7 @@ def extract_tf_instance_variables(tf_instance_file: str) -> list[dict[str, str]]
ops_config = var_item["operations_config"]

if isinstance(ops_config, dict) and "default" in ops_config:
defaults = ops_config["default"]
defaults = _unwrap_hcl(ops_config["default"])

if isinstance(defaults, dict):
version = defaults.get("version", "")
Expand All @@ -524,13 +556,86 @@ def extract_tf_instance_variables(tf_instance_file: str) -> list[dict[str, str]]
"version": version,
"train": train,
"namespace": namespace,
"local_file": tf_instance_file,
}
)
break

return variable_blocks


def extract_tf_arc_extension_variables(
    tf_arc_file: str,
) -> list[dict[str, str]]:
    """
    Extract cert-manager version info from the 109-arc-extensions Terraform variables.

    The arc extensions component nests extension settings inside the
    ``arc_extensions`` variable default, e.g.::

        variable "arc_extensions" {
          default = {
            cert_manager_extension = {
              version = "0.13.3"
              train = "stable"
            }
          }
        }

    Only cert-manager is tracked against the AIO manifest; container storage is
    intentionally excluded because the manifest no longer publishes a version
    for it.

    Args:
        tf_arc_file (str): Path to the 109-arc-extensions Terraform variables file.

    Returns:
        List[Dict[str, str]]: A list with a single ``cert_manager`` entry when a
            version/train is found; otherwise an empty list.
    """
    logger.debug(f"Reading Terraform arc-extensions file: {tf_arc_file}")

    try:
Comment on lines +596 to +598

Contributor


Is the OSError handler here intentional? The two sibling extractors (extract_tf_variables, extract_tf_instance_variables) call sys.exit(1) on all failures, including file-not-found. Returning [] here means that if TERRAFORM_ARC_EXTENSIONS_FILE is ever missing or the path changes, cert-manager produces no local components and the comparison silently reports no mismatches rather than failing. Since catching cert-manager drift is the purpose of this extractor, sys.exit(1) may be the safer choice to keep behavior consistent with the sibling extractors.

        with open(tf_arc_file) as f:
            parsed = hcl2.load(f)
    except OSError as e:
        logger.error(f"Failed to read Terraform arc-extensions file: {e}")
        return []
    except Exception as e:
        logger.error(f"Failed to parse Terraform arc-extensions file: {e}")
        sys.exit(1)

    for var_item in parsed.get("variable", []):
        if not (isinstance(var_item, dict) and "arc_extensions" in var_item):
            continue

        arc_props = var_item["arc_extensions"]
        if not (isinstance(arc_props, dict) and "default" in arc_props):
            continue

        defaults = _unwrap_hcl(arc_props["default"])
        if not isinstance(defaults, dict):
            continue

        cert_manager = _unwrap_hcl(defaults.get("cert_manager_extension", {}))
        if not isinstance(cert_manager, dict):
            continue

        version = cert_manager.get("version", "")
        train = cert_manager.get("train", "")
        if version or train:
            return [
                {
                    "name": "cert_manager",
                    "version": version,
                    "train": train,
                    "local_file": tf_arc_file,
                }
            ]

    return []
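
One way the review comment's suggestion could look, as a sketch rather than the change actually adopted in the PR: a missing or unreadable file exits with status 1, matching the behavior the reviewer describes for the sibling extractors. The helper name `read_text_or_exit` is hypothetical, and the HCL parsing step is left out so the example stays self-contained.

```python
import logging
import sys

logger = logging.getLogger(__name__)


def read_text_or_exit(path: str) -> str:
    """Read a file, exiting with status 1 instead of returning an empty result."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # Failing loudly keeps a moved or missing file from looking like
        # "no cert-manager mismatches".
        logger.error(f"Failed to read Terraform arc-extensions file: {e}")
        sys.exit(1)


if __name__ == "__main__":
    print(len(read_text_or_exit(__file__)))
```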


def extract_bicep_variables(bicep_file: str) -> list[dict[str, str]]:
"""
Extract component information from the Bicep file using regex pattern matching.
Expand Down Expand Up @@ -560,19 +665,30 @@ def extract_bicep_variables(bicep_file: str) -> list[dict[str, str]]:
"""
logger.debug(f"Reading Bicep variables file: {bicep_file}")

try:
with open(bicep_file) as f:
content = f.read()
except OSError as e:
logger.error(f"Failed to read Bicep file: {e}")
sys.exit(1)
# Cache file contents so each file is read at most once.
file_cache: dict[str, str] = {}

def _read(path: str) -> str:
if path not in file_cache:
try:
with open(path) as f:
file_cache[path] = f.read()
except OSError as e:
logger.error(f"Failed to read Bicep file: {e}")
sys.exit(1)
return file_cache[path]

variable_blocks = []

# Use the specific Bicep component names for searching
for bicep_component in BICEP_COMPONENTS:
bicep_name, remote_name = bicep_component.split(":")

# Some components (e.g. cert-manager) are declared outside the default
# 110-iot-ops types file.
component_file = BICEP_COMPONENT_FILES.get(bicep_name, bicep_file)
content = _read(component_file)

# Find the component definition block
var_pattern = f"var {bicep_name} = {{([\\s\\S]*?)}}"
var_match = re.search(var_pattern, content)
Expand Down Expand Up @@ -612,7 +728,12 @@ def extract_bicep_variables(bicep_file: str) -> list[dict[str, str]]:
component_name = "azure-iot-operations"

variable_blocks.append(
{"name": component_name, "version": version, "train": train}
{
"name": component_name,
"version": version,
"train": train,
"local_file": component_file,
}
)

return variable_blocks
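
The block-matching step can be tried in isolation. The sketch below reuses the `var <name> = {...}` pattern from the diff on an in-memory snippet; the `version:` and `train:` patterns and the function name are assumptions, since the corresponding lines of the real function are not shown in this hunk.

```python
import re

# Snippet modeled on 109-arc-extensions/bicep/types.bicep from this PR.
BICEP_SNIPPET = """
var certManagerExtensionDefaults = {
  enabled: true
  release: {
    version: '0.13.3'
    train: 'stable'
    autoUpgradeMinorVersion: false
  }
}
"""


def extract_defaults(content: str, bicep_name: str) -> dict[str, str]:
    """Find a Bicep var block and pull out its version and train, if present."""
    # Same shape as the pattern in the diff: a non-greedy match up to the
    # first closing brace, which here is the end of the nested release block.
    var_pattern = f"var {re.escape(bicep_name)} = {{([\\s\\S]*?)}}"
    var_match = re.search(var_pattern, content)
    if not var_match:
        return {}
    block = var_match.group(1)
    # Assumed field patterns; the real extractor may differ.
    version = re.search(r"version:\s*'([^']*)'", block)
    train = re.search(r"train:\s*'([^']*)'", block)
    return {
        "version": version.group(1) if version else "",
        "train": train.group(1) if train else "",
    }


print(extract_defaults(BICEP_SNIPPET, "certManagerExtensionDefaults"))
```

Because the match is non-greedy, it ends at the first `}`. That works when `version` and `train` appear before any closing brace, as they do in this snippet, but a block with an earlier nested object would be cut short.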
Expand Down Expand Up @@ -641,7 +762,9 @@ def extract_variables(iac_type: IaCType, file_path: str) -> list[dict[str, str]]
variables = extract_tf_variables(TERRAFORM_VARS_FILE)
instance_variables = extract_tf_instance_variables(
TERRAFORM_VARS_INSTANCE_FILE)
return variables + instance_variables
arc_variables = extract_tf_arc_extension_variables(
TERRAFORM_ARC_EXTENSIONS_FILE)
return variables + instance_variables + arc_variables
elif iac_type == "bicep":
return extract_bicep_variables(file_path)
else:
Expand Down Expand Up @@ -785,7 +908,7 @@ def compare_versions(
mismatches.append(
{
"name": name,
"local_file": file_path,
"local_file": component.get("local_file", file_path),
"remote_url": manifest_url,
"local_version": local_version,
"remote_version": remote_version,
Expand Down
2 changes: 1 addition & 1 deletion src/100-edge/109-arc-extensions/bicep/types.bicep
Expand Up @@ -39,7 +39,7 @@ type CertManagerExtension = {
var certManagerExtensionDefaults = {
enabled: true
release: {
version: '0.12.0'
version: '0.13.3'
train: 'stable'
autoUpgradeMinorVersion: false
}
Expand Down