[copilot] Add CI postmortem skill and weekly agentic workflow by rolfbjarne · Pull Request #25303 · dotnet/macios

rolfbjarne · 2026-05-04T08:10:39Z

Summary

Adds a new Copilot skill and agentic workflow for automated weekly CI post-mortem analysis.

What's included

.agents/skills/macios-ci-postmortem/SKILL.md — Skill definition with a 4-phase workflow:
1. Discovery — collect recent PR-validation builds from AzDO
2. Extraction — download TestSummary (triage) then HtmlReport artifacts, parse NUnit XML for individual test failures
3. Classification — categorize as flaky, infrastructure/bot-specific, shared regression, or PR-specific
4. Issue Actions — file/update GitHub issues with ci-postmortem + copilot labels
.agents/skills/macios-ci-postmortem/references/azure-devops-cli.md — AzDO CLI reference
.github/workflows/ci-postmortem.md (+ compiled .lock.yml) — Agentic workflow running weekly on Sunday
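
As a rough illustration of the classification phase, the four categories can be modelled as a small decision function. This is only a sketch: the `Category` enum, the `classify` function, its three input signals, and the order in which they are checked are assumptions for illustration and are not taken from SKILL.md.

```python
from enum import Enum


class Category(Enum):
    FLAKY = "flaky"
    INFRASTRUCTURE = "infrastructure/bot-specific"
    SHARED_REGRESSION = "shared regression"
    PR_SPECIFIC = "PR-specific"


def classify(passed_on_rerun: bool, bot_concentrated: bool, distinct_prs: int) -> Category:
    """Map precomputed signals to one of the four categories.

    The precedence used here (rerun evidence first, then bot concentration,
    then PR spread) is an assumed ordering, not the skill's documented one.
    """
    if passed_on_rerun:
        return Category.FLAKY
    if bot_concentrated:
        return Category.INFRASTRUCTURE
    if distinct_prs >= 2:
        return Category.SHARED_REGRESSION
    return Category.PR_SPECIFIC
```

A failure seen in a single PR, with no rerun evidence and no bot concentration, falls through to `Category.PR_SPECIFIC` and would not be filed.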

Key design decisions

TestSummary-first triage: Only download large HtmlReport zips (~100MB each) for jobs with confirmed test failures, not all failed jobs
One issue per test: Separate issues are easier to merge later than a grouped issue is to split
AppSizeTest excluded: Expected to fail across PRs
Bot-specific analysis: Extracts workerName from timelines to detect bot-concentrated failures
Rerun detection: Uses commit SHA matching to confirm flaky tests
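
The rerun detection idea can be sketched as follows: a test counts as flaky only when the same commit SHA has both a failing and a passing run of it. The `find_flaky` name and the tuple-based input shape are hypothetical; the real skill works from AzDO build data.

```python
from collections import defaultdict
from typing import Iterable


def find_flaky(results: Iterable[tuple[str, str, bool]]) -> list[str]:
    """Return tests that both failed and passed on the same commit.

    results: (commit_sha, test_name, passed) tuples, one per test per build run.
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for sha, test, passed in results:
        outcomes[(sha, test)].add(bool(passed))
    # Both outcomes seen for one (commit, test) pair means a rerun changed the result.
    return sorted({test for (_, test), seen in outcomes.items() if seen == {True, False}})
```

A test that fails on one commit and passes on a different commit is deliberately not reported, since the code itself changed between the two runs.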

Validation

The skill was run manually twice (Apr 21-27 and Apr 21-28 windows). Results:

- Filed issues for genuinely flaky tests:
  - [CI Postmortem] Flaky: MonoTouchFixtures.Security.RecordTest.DeskCase_83099_InmutableDictionary #25222
  - [CI Postmortem] Flaky: MonoTouchFixtures.Foundation.UrlProtocolTest.RegistrarTest #25223
  - [CI Postmortem] Flaky: MonoTests.System.Net.Http.MessageHandlerTest.TestNSUrlSessionHandlerSendClientCertificate #25240 through [CI Postmortem] Flaky: MonoTouchFixtures.SystemConfiguration.NetworkReachabilityTest.CtorIPAddressPair #25242
- Filed issues for infrastructure patterns:
  - [CI Postmortem] Windows bot VSM-XAM-126: 'Dotnet tests' consistently failing (path not found) #25263 through [CI Postmortem] Intermittent AzDO REST API failures in test jobs #25265
- Updated existing issues with new occurrence data

🤖 Pull request created by Copilot

New Copilot CLI skill that analyzes CI builds across recent PRs to identify failures unrelated to any specific PR:

- Flaky tests (pass on rerun with same commit)
- Shared regressions (same failure across multiple unrelated PRs)
- Infrastructure issues (provisioning, timeouts, etc.)

The skill operates in 4 phases: discovery, extraction, classification, and issue filing (with user confirmation before any GitHub issue changes).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Update SKILL.md to reflect lessons learned from running the skill:

- Add steps for downloading and parsing HtmlReport artifacts
- Add NUnit XML parsing for individual test failures
- Add handling for crashes, build failures, and dotnettests
- Fix --query-order flag (not supported by az pipelines build list)
- Add HTML entity normalization for test name deduplication
- Note performance concerns with large artifact downloads

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
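
The HTML entity normalization mentioned in this commit could look roughly like the sketch below, which uses Python's standard `html.unescape`. The helper names `normalize_test_name` and `dedupe` are hypothetical, and collapsing whitespace is an extra assumption not stated in the commit message.

```python
import html


def normalize_test_name(raw: str) -> str:
    """Decode HTML entities and collapse whitespace so equal names compare equal."""
    return " ".join(html.unescape(raw).split())


def dedupe(names: list[str]) -> list[str]:
    """Return normalized names, keeping first-seen order and dropping duplicates."""
    seen: dict[str, None] = {}
    for name in names:
        seen.setdefault(normalize_test_name(name), None)
    return list(seen)
```

Without this step, the same generic test could be counted twice, once as `List&lt;int&gt;` from an HTML report and once as `List<int>` from NUnit XML.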

- Exclude AppSizeTest from filing (expected to fail across PRs)
- Add rule: always file one issue per test, never group unrelated tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The HtmlReport download step takes 96% of the total analysis time. Make it explicit that HtmlReports should only be downloaded for jobs where TestSummary confirms test failures, not for all failed jobs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
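
A minimal sketch of that triage rule, assuming the TestSummary artifacts have already been parsed into a mapping from job name to failed-test count. The function name and the input shape are hypothetical; the actual TestSummary format is not shown in this PR.

```python
def jobs_needing_html_report(failed_jobs: list[str], failed_test_counts: dict[str, int]) -> list[str]:
    """Select the failed jobs whose TestSummary confirms at least one test failure.

    failed_test_counts maps job name -> number of failed tests reported by
    TestSummary. Jobs with no entry are assumed to have failed before any
    tests ran, so their large HtmlReport zip is not downloaded.
    """
    return [job for job in failed_jobs if failed_test_counts.get(job, 0) > 0]
```

Only the jobs returned here would trigger the expensive HtmlReport download; the rest are handled from timeline data alone.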

… skill

- Extract workerName from timeline to correlate failures with bots
- Identify bot-specific failures (disproportionate failure rates)
- Detect cross-bot infrastructure patterns (timeouts, REST API, paths)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
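
One way to read "disproportionate failure rates" is to compare each bot's failure rate against the overall rate. The sketch below does that; the `suspicious_bots` name, the `min_runs` and `ratio` thresholds, and the input shape are all assumptions chosen for illustration, not values from the skill.

```python
from collections import Counter


def suspicious_bots(runs: list[tuple[str, bool]], min_runs: int = 5, ratio: float = 2.0) -> list[str]:
    """Return bots whose failure rate is at least `ratio` times the overall rate.

    runs: (worker_name, failed) pairs, one per job run.
    Bots with fewer than `min_runs` runs are ignored as too noisy to judge.
    """
    if not runs:
        return []
    total = Counter(worker for worker, _ in runs)
    failed = Counter(worker for worker, did_fail in runs if did_fail)
    overall = sum(failed.values()) / len(runs)
    if overall == 0:
        return []
    return sorted(
        worker
        for worker, count in total.items()
        if count >= min_runs and failed[worker] / count >= ratio * overall
    )
```

A bot that fails every run while the fleet-wide failure rate is 30% would be flagged; a bot failing at or below the fleet-wide rate would not.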

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Runs every Sunday (fuzzy schedule) to analyze the past week's CI failures and file issues for flaky tests, infrastructure problems, and bot-specific issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Don't reopen fix-closed issues less than 2 weeks old
- Require failing builds from main branch
- Allow reopen for lack-of-info or debug-instrumentation closures
- Always allow commenting on closed issues with explanation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
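
These reopen rules can be read as a small policy function. The sketch below is one possible interpretation: it assumes "2 weeks old" is measured from the close time and that the main-branch requirement applies only to fix-closed issues, neither of which the commit message states outright. The `CloseReason` enum and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class CloseReason(Enum):
    FIXED = "fixed"
    LACK_OF_INFO = "lack-of-info"
    DEBUG_INSTRUMENTATION = "debug-instrumentation"


def action_for_closed_issue(
    reason: CloseReason,
    closed_at: datetime,
    now: datetime,
    has_main_branch_failure: bool,
) -> str:
    """Return "reopen" or "comment" for a closed issue with a new occurrence."""
    if reason in (CloseReason.LACK_OF_INFO, CloseReason.DEBUG_INSTRUMENTATION):
        return "reopen"
    # Fix-closed: give the fix two weeks to propagate, and require evidence from main.
    if now - closed_at >= timedelta(weeks=2) and has_main_branch_failure:
        return "reopen"
    # Commenting on a closed issue is always allowed.
    return "comment"
```

The two-week window gives open PR branches time to pick up a fix from main before a recurrence is treated as a regression.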

Include actual compiler/linker/assertion errors from NUnit XML. Flag when different PRs show different errors for the same test (likely different root causes).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Use AzDO URL format with j= and t= parameters from timeline record IDs to link directly to the failing log, not just the build.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
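
Building such a deep link might look like the sketch below. The `j=` and `t=` parameters come from the commit message; the `dev.azure.com` host, the `_build/results` path, and the `buildId` and `view=logs` parameters are assumptions based on the usual shape of AzDO build URLs, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def log_deep_link(org: str, project: str, build_id: int, job_id: str, task_id: str) -> str:
    """Build a URL pointing at one task's log inside a build.

    job_id and task_id are the timeline record IDs of the job and of the failing task.
    """
    query = urlencode({"buildId": build_id, "view": "logs", "j": job_id, "t": task_id})
    return f"https://dev.azure.com/{org}/{project}/_build/results?{query}"
```

For example, `log_deep_link("example-org", "example-project", 123, "aaa", "bbb")` yields a link that opens the log view with that job and task selected rather than the build summary.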

'Path does not exist' on artifact publish is a downstream symptom of earlier failures (Install dotnet workloads, azdev-secrets, etc). Always trace back to the first failed task in the timeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Emphasize that only the first failed step (without continueOnError) is the root cause. All subsequent failures are cascading and must not be reported as separate issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
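
The "first failed step" rule can be sketched as a filter over one job's task records. The record shape used here (dicts with `name`, `order`, `result`, and an optional `continueOnError` flag) is an assumption made for illustration; in particular, how the skill determines `continueOnError` for a task is not shown in this PR.

```python
from typing import Optional


def first_failed_step(records: list[dict]) -> Optional[dict]:
    """Return the earliest failed task that does not have continueOnError set.

    records: task records from a single job. Later failures are treated as
    cascading symptoms and are ignored.
    """
    blocking = [
        r for r in records
        if r.get("result") == "failed" and not r.get("continueOnError", False)
    ]
    return min(blocking, key=lambda r: r["order"], default=None)
```

With a job where "Install dotnet workloads" fails at step 3 and the artifact publish fails at step 4, only step 3 is returned, which matches the guidance that the 'Path does not exist' publish error is a downstream symptom.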

vs-mobiletools-engineering-service2 · 2026-05-07T07:47:29Z

✅ [PR Build #3ae047d] Build passed (Build macOS tests) ✅

Pipeline on Agent
Hash: 3ae047d0a4e819345a746b9d38533bf3a4a86557 [PR build]

vs-mobiletools-engineering-service2 · 2026-05-07T08:50:42Z

🔥 [CI Build #3ae047d] Test results 🔥

Test results

❌ Tests failed on VSTS: test results

0 tests crashed, 2 tests failed, 173 tests passed.

Failures

❌ monotouch tests (iOS)

1 tests failed, 15 tests passed.

Failed tests

monotouch-test/iOS - simulator/Release (strict inline dlfcn): Failed

Html Report (VSDrops) Download

❌ monotouch tests (tvOS)

1 tests failed, 15 tests passed.

Failed tests

monotouch-test/tvOS - simulator/Release (trimmable static registrar, NativeAOT, ARM64): BuildFailure

Html Report (VSDrops) Download

Successes

✅ cecil: All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (iOS): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (MacCatalyst): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (macOS): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (Multiple platforms): All 1 tests passed. Html Report (VSDrops) Download
✅ dotnettests (tvOS): All 1 tests passed. Html Report (VSDrops) Download
✅ framework: All 2 tests passed. Html Report (VSDrops) Download
✅ fsharp: All 4 tests passed. Html Report (VSDrops) Download
✅ generator: All 5 tests passed. Html Report (VSDrops) Download
✅ interdependent-binding-projects: All 4 tests passed. Html Report (VSDrops) Download
✅ introspection: All 6 tests passed. Html Report (VSDrops) Download
✅ linker: All 44 tests passed. Html Report (VSDrops) Download
✅ monotouch (MacCatalyst): All 18 tests passed. (⚠️ Html Report Publish failed ⚠️) Download
✅ monotouch (macOS): All 18 tests passed. Html Report (VSDrops) Download
✅ msbuild: All 2 tests passed. Html Report (VSDrops) Download
✅ sharpie: All 1 tests passed. Html Report (VSDrops) Download
✅ windows: All 3 tests passed. Html Report (VSDrops) Download
✅ xcframework: All 4 tests passed. Html Report (VSDrops) Download
✅ xtro: All 1 tests passed. (⚠️ Html Report Publish failed ⚠️) Download

macOS tests

✅ Tests on macOS Monterey (12): All 5 tests passed. Html Report (VSDrops) Download
✅ Tests on macOS Ventura (13): All 5 tests passed. Html Report (VSDrops) Download
✅ Tests on macOS Sonoma (14): All 5 tests passed. Html Report (VSDrops) Download
✅ Tests on macOS Sequoia (15): All 5 tests passed. Html Report (VSDrops) Download
✅ Tests on macOS Tahoe (26): All 5 tests passed. Html Report (VSDrops) Download

Linux Build Verification

✅ Linux build succeeded

Pipeline on Agent
Hash: 3ae047d0a4e819345a746b9d38533bf3a4a86557 [PR build]

For failures in the Windows integration stage, always identify the macOS bot from the 'Reserve macOS bot for tests' job, even when the failure is on a Windows bot (e.g. ssh connection failures).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

vs-mobiletools-engineering-service2 · 2026-05-07T20:54:40Z

✅ [PR Build #0ffa72f] Build passed (Detect API changes) ✅

Pipeline on Agent
Hash: 0ffa72ff671669a0c3ac525fe8f0fe13cb1c2ba1 [PR build]

vs-mobiletools-engineering-service2 · 2026-05-07T21:01:21Z

✅ [PR Build #0ffa72f] Build passed (Build packages) ✅

Pipeline on Agent
Hash: 0ffa72ff671669a0c3ac525fe8f0fe13cb1c2ba1 [PR build]

vs-mobiletools-engineering-service2 · 2026-05-07T21:07:16Z

✅ API diff for current PR / commit

NET (empty diffs)

iOS: vsdrops gist ( No breaking changes )
tvOS: vsdrops gist ( No breaking changes )
MacCatalyst: vsdrops gist ( No breaking changes )
macOS: vsdrops gist ( No breaking changes )

✅ API diff vs stable

NET (empty diffs)

iOS: vsdrops gist ( No breaking changes )
tvOS: vsdrops gist ( No breaking changes )
MacCatalyst: vsdrops gist ( No breaking changes )
macOS: vsdrops gist ( No breaking changes )

ℹ️ Generator diff

Generator Diff: vsdrops (html) vsdrops (raw diff) gist (raw diff) - Please review changes

Pipeline on Agent
Hash: 0ffa72ff671669a0c3ac525fe8f0fe13cb1c2ba1 [PR build]

rolfbjarne and others added 8 commits May 4, 2026 09:47

Use 'ci-postmortem' label for all issues filed by the skill

232ff7c

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Update CI postmortem skill: exclude AppSizeTest, one issue per test

0282b6a

- Exclude AppSizeTest from filing (expected to fail across PRs)
- Add rule: always file one issue per test, never group unrelated tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add 'copilot' label requirement to CI postmortem issues

66a72d0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add weekly CI postmortem agentic workflow

fee85ef

Runs every Sunday (fuzzy schedule) to analyze the past week's CI failures and file issues for flaky tests, infrastructure problems, and bot-specific issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rolfbjarne added the copilot label May 4, 2026

rolfbjarne and others added 2 commits May 4, 2026 10:11

Revert Xamarin.MacDev and dependency bumps to match main

3fbf470

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Fix: align dependency files with origin/main

aa9e596

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rolfbjarne and others added 4 commits May 4, 2026 16:20

Require specific error messages in postmortem issues

bb28093

Include actual compiler/linker/assertion errors from NUnit XML. Flag when different PRs show different errors for the same test (likely different root causes).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Require deep links to specific job/step in postmortem issues

03ae5f6

Use AzDO URL format with j= and t= parameters from timeline record IDs to link directly to the failing log, not just the build.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rolfbjarne added 2 commits May 4, 2026 17:35

Revert submodule bump.

4d15697

Merge remote-tracking branch 'origin/main' into dev/rolf/ci-postmortem

f580229

Merge branch 'main' into dev/rolf/ci-postmortem

3ae047d
