Add failing tests for #1617: provider unavailability abort missing backoff + misclassification #1620
Draft
prompt-driven-github[bot] wants to merge 1 commit into
…no backoff

Adds 7 failing tests across 5 test files that detect two bugs:

1. No orchestrator-level backoff before the consecutive-provider-failures abort threshold in all four orchestrators (checkup, bug, change, e2e-fix). Tests assert time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) is called before each increment; currently call_count == 0, and it will be 3 after the fix.
2. _classify_layer1_failure_category falls through to 'layer1_failed' for provider-unavailability abort messages, misclassifying a transient infrastructure outage as a PR-level blocker. Tests assert the new 'provider_unavailable' category is returned.

Also updates prompt specs (agentic_checkup_orchestrator_python.prompt and agentic_checkup_python.prompt) to require the backoff and the new category.

Fixes #1617

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Adds 7 failing tests across 5 files that detect the two bugs reported in #1617.
Test Files
- tests/test_agentic_checkup_orchestrator.py: backoff sleep test for the checkup orchestrator
- tests/test_agentic_bug_orchestrator.py: backoff sleep test for the bug orchestrator
- tests/test_agentic_change_orchestrator.py: backoff sleep test for the change orchestrator
- tests/test_agentic_e2e_fix_orchestrator.py: backoff sleep test for the e2e-fix orchestrator
- tests/test_final_pr_gate.py: 3 tests for _classify_layer1_failure_category returning "provider_unavailable"
Prompt Files
- pdd/prompts/agentic_checkup_orchestrator_python.prompt: updated to require a 30s backoff before each consecutive_provider_failures increment
- pdd/prompts/agentic_checkup_python.prompt: updated to specify the new "provider_unavailable" failure category
What This PR Contains
- time.sleep(30) is called at least 3 times when all steps return "All agent providers failed". Currently call_count == 0 → AssertionError. After the fix adds PROVIDER_FAILURE_BACKOFF_SECONDS = 30 and time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) before each counter increment, these tests pass.
- _classify_layer1_failure_category("Aborting: 3 consecutive steps failed - agent providers unavailable") returns "provider_unavailable" (not "layer1_failed"). Currently the call falls through to the catch-all "layer1_failed" → AssertionError. After the fix adds a new branch, these tests pass.
Root Cause
Bug 1 (pdd/agentic_checkup_orchestrator.py:3283-3298 and 3 siblings): The consecutive_provider_failures counter is incremented immediately on each provider-failure step with no time.sleep(). Three back-to-back steps during a transient provider blip fill the counter and abort permanently with no recovery window.
Bug 2 (pdd/agentic_checkup.py:346-389): The abort string "Aborting: 3 consecutive steps failed - agent providers unavailable" matches no specific branch in _classify_layer1_failure_category and falls through to "layer1_failed" with severity: blocker, making a transient infrastructure outage look like a real PR-level code defect.
Next Steps
- Add PROVIDER_FAILURE_BACKOFF_SECONDS = 30 and time.sleep(PROVIDER_FAILURE_BACKOFF_SECONDS) before consecutive_provider_failures += 1 in all four orchestrators; add import time to the three that are missing it
- Add FINAL_GATE_CATEGORY_PROVIDER_UNAVAILABLE = "provider_unavailable" to pdd/checkup_review_loop.py, import it in pdd/agentic_checkup.py, and add a branch in _classify_layer1_failure_category before the catch-all
Fixes #1617
Generated by PDD agentic bug workflow