Add failing tests for #1617: provider unavailability abort missing backoff + misclassification#1620

Draft
prompt-driven-github[bot] wants to merge 1 commit into main from fix/issue-1617

Summary

Adds 7 failing tests across 5 files that detect the two bugs reported in #1617.

Test Files

  • tests/test_agentic_checkup_orchestrator.py — backoff sleep test for checkup orchestrator
  • tests/test_agentic_bug_orchestrator.py — backoff sleep test for bug orchestrator
  • tests/test_agentic_change_orchestrator.py — backoff sleep test for change orchestrator
  • tests/test_agentic_e2e_fix_orchestrator.py — backoff sleep test for e2e-fix orchestrator
  • tests/test_final_pr_gate.py — 3 tests for _classify_layer1_failure_category returning "provider_unavailable"

Prompt Files

  • pdd/prompts/agentic_checkup_orchestrator_python.prompt — updated to require 30s backoff before each consecutive_provider_failures increment
  • pdd/prompts/agentic_checkup_python.prompt — updated to specify new "provider_unavailable" failure category

What This PR Contains

  • 4 backoff tests (Tests 1–4): Assert time.sleep(30) is called at least 3 times when all steps return "All agent providers failed". Currently call_count == 0, so each test raises AssertionError. Once the fix adds PROVIDER_FAILURE_BACKOFF_SECONDS = 30 and calls time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) before each counter increment, these pass.
  • 3 classification tests (Tests 5–7): Assert _classify_layer1_failure_category("Aborting: 3 consecutive steps failed - agent providers unavailable") returns "provider_unavailable" (not "layer1_failed"). Currently the message falls through to the catch-all "layer1_failed", so each test raises AssertionError. Once the fix adds a new branch, these pass.
  • All 7 tests are verified to fail on current code and are expected to pass once both bugs are fixed.
  • Prompt specs updated to reflect the required behavior.
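
The backoff tests rely on a standard mocking pattern, sketched below under stated assumptions. `run_steps_without_backoff` and `count_backoff_sleeps` are hypothetical stand-ins written for illustration, not the code in `pdd/` or in the test files; only the failure string, the abort string, and the `time.sleep(30)` expectation are taken from this PR.

```python
import time
from unittest.mock import patch

PROVIDER_FAILURE_MSG = "All agent providers failed"  # string quoted in this PR
ABORT_MSG = "Aborting: 3 consecutive steps failed - agent providers unavailable"
MAX_CONSECUTIVE_FAILURES = 3  # assumption: threshold implied by the abort string


def run_steps_without_backoff(step_outputs):
    """Hypothetical stand-in for the current loop: no sleep before the increment."""
    consecutive_provider_failures = 0
    for output in step_outputs:
        if PROVIDER_FAILURE_MSG in output:
            consecutive_provider_failures += 1
            if consecutive_provider_failures >= MAX_CONSECUTIVE_FAILURES:
                return False, ABORT_MSG
        else:
            consecutive_provider_failures = 0
    return True, "ok"


def count_backoff_sleeps(loop_fn):
    """Run loop_fn with time.sleep patched; count how many sleep(30) calls it made."""
    with patch("time.sleep") as mock_sleep:
        loop_fn([PROVIDER_FAILURE_MSG] * 3)
    return sum(1 for c in mock_sleep.call_args_list if c.args == (30,))


sleeps = count_backoff_sleeps(run_steps_without_backoff)
```

Against a loop with no backoff, `sleeps` is 0, so an assertion of the form `sleeps >= 3` fails. That is the failure mode Tests 1–4 are meant to capture.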

Root Cause

Bug 1 (pdd/agentic_checkup_orchestrator.py:3283-3298 and 3 siblings): The consecutive_provider_failures counter is incremented immediately on each provider-failure step with no time.sleep(). Three back-to-back steps during a transient provider blip fill the counter and abort permanently with no recovery window.

Bug 2 (pdd/agentic_checkup.py:346-389): The abort string "Aborting: 3 consecutive steps failed - agent providers unavailable" matches no specific branch in _classify_layer1_failure_category and falls through to "layer1_failed" with severity: blocker, making a transient infrastructure outage look like a real PR-level code defect.
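
The fall-through in Bug 2 can be illustrated with a minimal sketch. Both functions below are hypothetical simplifications, not the body of `_classify_layer1_failure_category`; the substring used for matching and the `FINAL_GATE_CATEGORY_LAYER1_FAILED` constant name are assumptions made for illustration.

```python
FINAL_GATE_CATEGORY_LAYER1_FAILED = "layer1_failed"  # hypothetical constant name
FINAL_GATE_CATEGORY_PROVIDER_UNAVAILABLE = "provider_unavailable"

ABORT_MESSAGE = "Aborting: 3 consecutive steps failed - agent providers unavailable"


def classify_current(message: str) -> str:
    """Stand-in for current behavior: no specific branch matches the abort string."""
    # The real function has other specific branches; they are elided here
    # because, per this PR, none of them match the abort message.
    return FINAL_GATE_CATEGORY_LAYER1_FAILED


def classify_with_provider_branch(message: str) -> str:
    """Same classifier with a provider-unavailability branch before the catch-all."""
    # Assumption: a case-insensitive substring check is enough to detect the abort.
    if "agent providers unavailable" in message.lower():
        return FINAL_GATE_CATEGORY_PROVIDER_UNAVAILABLE
    return classify_current(message)
```

With the extra branch, the abort message maps to "provider_unavailable", while any other unmatched message still reaches the catch-all.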

Next Steps

  1. Add PROVIDER_FAILURE_BACKOFF_SECONDS = 30 and time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) before consecutive_provider_failures += 1 in all four orchestrators; add import time to the three missing it
  2. Add FINAL_GATE_CATEGORY_PROVIDER_UNAVAILABLE = "provider_unavailable" to pdd/checkup_review_loop.py, import in pdd/agentic_checkup.py, add branch in _classify_layer1_failure_category before the catch-all
  3. Verify all 7 generated tests pass
  4. Run full test suite for regressions
  5. Mark PR ready for review
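
Step 1 could look roughly like the sketch below. `run_steps`, its injectable `sleep` parameter, the threshold constant, and the reset-on-success behavior are assumptions for illustration; only `PROVIDER_FAILURE_BACKOFF_SECONDS = 30`, the counter name, and the quoted strings come from this PR.

```python
import time

PROVIDER_FAILURE_BACKOFF_SECONDS = 30
MAX_CONSECUTIVE_PROVIDER_FAILURES = 3  # assumption: implied by the abort string


def run_steps(steps, sleep=time.sleep):
    """Hypothetical orchestrator loop that backs off before counting a failure."""
    consecutive_provider_failures = 0
    for step in steps:
        output = step()
        if "All agent providers failed" in output:
            # Give a transient provider outage a recovery window first.
            sleep(PROVIDER_FAILURE_BACKOFF_SECONDS)
            consecutive_provider_failures += 1
            if consecutive_provider_failures >= MAX_CONSECUTIVE_PROVIDER_FAILURES:
                return False, (
                    "Aborting: 3 consecutive steps failed - "
                    "agent providers unavailable"
                )
        else:
            # Assumption: any successful step resets the consecutive counter.
            consecutive_provider_failures = 0
    return True, "All steps completed"


# Usage: inject a recorder instead of really sleeping for 90 seconds.
recorded = []
failing_step = lambda: "All agent providers failed"
success, message = run_steps([failing_step] * 3, sleep=recorded.append)
```

Passing `sleep` as a parameter is one way to keep such a loop testable without patching; the tests in this PR instead assert on a patched `time.sleep`, which works equally well with a direct call.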

Fixes #1617


Generated by PDD agentic bug workflow

…no backoff

Adds 7 failing tests across 5 test files that detect two bugs:

1. No orchestrator-level backoff before the consecutive-provider-failures
   abort threshold in all four orchestrators (checkup, bug, change, e2e-fix).
   Tests assert time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) is called before
   each increment — currently call_count==0, will be 3 after the fix.

2. _classify_layer1_failure_category falls through to 'layer1_failed' for
   provider-unavailability abort messages, misclassifying a transient
   infrastructure outage as a PR-level blocker. Tests assert the new
   'provider_unavailable' category is returned.

Also updates prompt specs (agentic_checkup_orchestrator_python.prompt and
agentic_checkup_python.prompt) to require the backoff and the new category.

Fixes #1617

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Checkup failure on PR #1524: Layer 1 final gate aborts — agent providers unavailable
