AO CI-failure notifier should suppress pings when jobs never ran (billing/quota/admin block) #82

@harshitsinghbhandari

Problem

When GitHub Actions cancels a workflow run before any job executes — billing block, spending-limit cap, admin-disabled Actions, etc. — the GitHub API still marks the run as conclusion: failure. AO's CI-failure notifier picks that up and pings the responsible agent session with a "CI checks are failing on your PR" message, instructing the agent to "investigate the failures, fix the issues, and push again."

The agent cannot fix this. The code never ran. Every push re-triggers the same notification, creating a loop:

  1. Agent pushes a commit.
  2. GitHub queues the workflows, the billing block kicks in, and every job is marked failure with the message "The job was not started because recent account payments have failed or your spending limit needs to be increased."
  3. AO pings the agent: "CI failed, fix and re-push."
  4. Agent investigates → confirms billing cause → reports waiting.
  5. If anything triggers a new push (e.g. the agent makes an actual fix for a different reason), go back to step 1.

This wastes tokens, keeps an agent occupied in a non-productive state, and confuses the session lifecycle.

Examples from today (2026-06-02)

Both PR #65 push events triggered this loop. Sample runs:

  • 26811288622, 26811288833, 26811288920 (first push) — every job conclusion: failure with the billing message.
  • 26814681179, 26814681182, 26814681210 (second push) — same.

gh run view 26811288622 output:

[FAIL] X The job was not started because recent account payments have failed or your spending limit needs to be increased. Please check the 'Billing & plans' section in your settings

Expected behaviour

The notifier should recognise "the job never started" terminal states and skip the agent ping (or route it to a different channel — needs-input on the human-facing orchestrator session, not the worker agent). Failure classes worth distinguishing:

  • billing/quota: message contains "recent account payments have failed" or "spending limit needs to be increased".
  • admin disable: message contains "Actions is disabled" or run conclusion: action_required.
  • cancelled before start: run status: completed, conclusion: cancelled, and zero jobs with started_at.
  • skipped by paths filter: conclusion: skipped. Probably already handled, but worth confirming.
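
The four classes above could be told apart with something like the sketch below. This is only an illustration under assumptions: RunSummary, FailureClass, and Classify are hypothetical names, not existing AO code, and the sketch assumes the notifier has already collected the run conclusion, the annotation messages, and a count of jobs that started.

```go
package main

import (
	"fmt"
	"strings"
)

// FailureClass is a hypothetical label for why a run did not succeed.
type FailureClass string

const (
	CodeFailure FailureClass = "code_failure"
	InfraBlock  FailureClass = "infra_block"
	Cancelled   FailureClass = "cancelled"
	Skipped     FailureClass = "skipped"
)

// RunSummary is an assumed, flattened view of a workflow run.
// It is not a GitHub API type; the notifier would have to build it.
type RunSummary struct {
	Conclusion  string   // e.g. "failure", "cancelled", "skipped", "action_required"
	Messages    []string // annotation or job messages collected for the run
	JobsStarted int      // number of jobs that have a started_at value
}

// infraMarkers are the literal substrings quoted in this issue.
var infraMarkers = []string{
	"recent account payments have failed",
	"spending limit needs to be increased",
	"Actions is disabled",
}

// Classify assumes the caller only passes runs that did not succeed.
// Anything that matches no external-blocker rule falls through to
// CodeFailure, including a run cancelled after jobs had started.
func Classify(r RunSummary) FailureClass {
	if r.Conclusion == "skipped" {
		return Skipped
	}
	if r.Conclusion == "action_required" {
		return InfraBlock
	}
	for _, m := range r.Messages {
		for _, marker := range infraMarkers {
			if strings.Contains(m, marker) {
				return InfraBlock
			}
		}
	}
	if r.Conclusion == "cancelled" && r.JobsStarted == 0 {
		return Cancelled
	}
	return CodeFailure
}

func main() {
	billing := RunSummary{
		Conclusion: "failure",
		Messages: []string{
			"The job was not started because recent account payments have failed or your spending limit needs to be increased.",
		},
	}
	fmt.Println(Classify(billing)) // prints "infra_block"
}
```

The marker check runs before the cancelled check so that a billing-blocked run is reported as infra_block whatever conclusion it carries.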

For all of these, the right action is not "page the worker agent." It is either:

  1. Silently suppress (the human will see it on the PR UI anyway), OR
  2. Notify the human orchestrator session (e.g. aa-37) with a different message that says "external blocker, agent cannot resolve."

How to detect

The failure reason is available through gh run view <id> --json, but the most reliable signal for the billing case is the annotations / per-job step output containing the literal string "The job was not started". A short check on the first job's annotations is enough:

if strings.Contains(annotation.Message, "The job was not started because") {
    // External infra block, not a code failure.
    return notifyOrchestrator(...)
}

For the general "never executed" case, check whether any job has steps[*].started_at != nil. If none does, no code ran, so don't ping the agent that wrote the code.
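
A rough sketch of that "never executed" check, decoding a jobs payload and looking for any step with a start time. The struct shapes and the snake_case started_at field are assumptions based on the wording above; the real field names depend on whether the notifier reads the REST API or gh output, so treat AnyStepStarted and its types as hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// step, job and jobsPayload model only the fields this check needs.
// The JSON field names are assumed, not confirmed against AO's fetcher.
type step struct {
	Name      string     `json:"name"`
	StartedAt *time.Time `json:"started_at"` // nil when absent or null
}

type job struct {
	Name  string `json:"name"`
	Steps []step `json:"steps"`
}

type jobsPayload struct {
	Jobs []job `json:"jobs"`
}

// AnyStepStarted reports whether at least one step in any job has a
// non-zero start time. A zero time is treated the same as a missing one.
func AnyStepStarted(raw []byte) (bool, error) {
	var p jobsPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return false, fmt.Errorf("decode jobs payload: %w", err)
	}
	for _, j := range p.Jobs {
		for _, s := range j.Steps {
			if s.StartedAt != nil && !s.StartedAt.IsZero() {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	// Hand-written fixtures, not real payloads from the runs listed in this issue.
	blocked := []byte(`{"jobs":[{"name":"build","steps":[]}]}`)
	ran := []byte(`{"jobs":[{"name":"build","steps":[{"name":"checkout","started_at":"2026-06-02T10:00:00Z"}]}]}`)

	for _, raw := range [][]byte{blocked, ran} {
		started, err := AnyStepStarted(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(started)
	}
}
```

On a decode error the function returns false along with the error, so the caller has to decide whether an unreadable payload should page anyone.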

Scope

  • Add classification in the notifier: code_failure vs infra_block vs cancelled vs skipped.
  • Only code_failure pings the worker agent with the "investigate + push again" prompt.
  • infra_block and similar route to the orchestrator session with a different message.
  • Unit test the classifier against the JSON payloads of the example run IDs above (or equivalent fixtures).
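
The routing half of this scope might look roughly like the following. All names here (Target, Notification, Route) are hypothetical, and sending cancelled runs to the orchestrator while suppressing skipped ones is one reading of "infra_block and similar", not a settled decision.

```go
package main

import "fmt"

// FailureClass mirrors the four classes proposed in this issue.
type FailureClass string

const (
	CodeFailure FailureClass = "code_failure"
	InfraBlock  FailureClass = "infra_block"
	Cancelled   FailureClass = "cancelled"
	Skipped     FailureClass = "skipped"
)

// Target is a hypothetical destination for a notification.
type Target string

const (
	WorkerAgent  Target = "worker_agent"
	Orchestrator Target = "orchestrator"
	Suppress     Target = "suppress"
)

// Notification pairs a destination with the text to send.
type Notification struct {
	Target  Target
	Message string
}

// Route decides who hears about a run. Only CodeFailure reaches the
// worker agent. Unknown classes go to the orchestrator (an assumption)
// so a human sees them instead of an agent looping on them.
func Route(c FailureClass) Notification {
	switch c {
	case CodeFailure:
		return Notification{
			Target:  WorkerAgent,
			Message: "CI checks are failing on your PR. Investigate the failures, fix the issues, and push again.",
		}
	case Skipped:
		return Notification{Target: Suppress}
	default: // InfraBlock, Cancelled, and anything unrecognised
		return Notification{
			Target:  Orchestrator,
			Message: "External blocker, agent cannot resolve.",
		}
	}
}

func main() {
	for _, c := range []FailureClass{CodeFailure, InfraBlock, Cancelled, Skipped} {
		n := Route(c)
		fmt.Printf("%s -> %s\n", c, n.Target)
	}
}
```

A table-driven unit test over the same four classes would cover the "only code_failure pings the worker" rule directly.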

Priority

While the aoagents org's GitHub Actions billing is unresolved (per the team, 1-2 days), every push from every session will hit this loop. Worth landing before tomorrow's work cycle so no agent burns tokens chasing a non-issue.

Metadata

Labels: bug (Something isn't working), daemon (HTTP daemon lane), enhancement (New feature or request)
