fix(backend): abort no-progress unattended runs and lower maxSteps defaults #271
Merged
…faults Unattended (trigger/scheduled) agent runs could burn compute up to the step ceiling when a model failed to converge — re-issuing the same tool call for the same result without making progress. Weaker self-hosted models (vLLM/Ollama) hit this most. Add no-progress detection on the headless `generate` path only: a run-scoped, in-memory detector tracks a signature per tool call (name + normalized args + a hash of the result) and aborts when the same full signature recurs K=3 times within the run, even with unrelated calls interleaved. The result is part of the signature, so a repeated call whose result differs (e.g. a re-read after a write that changed state) is legitimate progress and does not count. The run is recorded as failed with a machine-readable `no_progress` reason naming the repeated tool, and the abort is logged. Interactive/streamed runs are unaffected. Also lower the maxSteps defaults: the new-agent create form now defaults to 15 (was 30), and the backend fallback for an agent with no explicit maxSteps is 15 (was 1) so API-created agents behave sanely. maxSteps remains the single user-owned ceiling — no separate clamp for triggers. Closes #268 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
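As a rough illustration of the detection this commit message describes, here is a minimal sketch rather than the code in the PR. The class, method, and helper names are hypothetical, and SHA-256 is an assumed choice of hash. Only the signature parts (tool name + normalized args + result hash), the threshold K = 3, and the tolerance for interleaved calls come from the description above.

```typescript
import { createHash } from "node:crypto";

// Threshold from the description (K = 3); the constant name is hypothetical.
const DEFAULT_THRESHOLD = 3;

// Serialize with object keys sorted, so {a:1,b:2} and {b:2,a:1} normalize
// to the same string. Hypothetical helper, not taken from the PR.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Run-scoped, in-memory detector: create one instance per unattended run.
class NoProgressDetector {
  private readonly threshold: number;
  private readonly counts = new Map<string, number>();
  private tripped: { toolName: string; count: number } | null = null;

  constructor(threshold: number = DEFAULT_THRESHOLD) {
    this.threshold = threshold;
  }

  // Record one completed tool call. Returns true once the run should abort.
  record(toolName: string, args: unknown, result: unknown): boolean {
    if (this.tripped !== null) return true; // sticky once tripped
    const resultHash = createHash("sha256")
      .update(stableStringify(result))
      .digest("hex");
    const signature = `${toolName}|${stableStringify(args)}|${resultHash}`;
    const count = (this.counts.get(signature) ?? 0) + 1;
    this.counts.set(signature, count);
    if (count >= this.threshold) this.tripped = { toolName, count };
    return this.tripped !== null;
  }

  get trippedBy(): { toolName: string; count: number } | null {
    return this.tripped;
  }
}

// Same call, same result, three times, with an unrelated call interleaved.
const detector = new NoProgressDetector();
detector.record("read_file", { path: "a.txt", encoding: "utf8" }, "hello");
detector.record("list_dir", { path: "." }, ["a.txt"]);
detector.record("read_file", { encoding: "utf8", path: "a.txt" }, "hello");
const aborted = detector.record(
  "read_file",
  { path: "a.txt", encoding: "utf8" },
  "hello",
);
```

Because the result hash is part of the key, a repeated call whose result differs gets a different signature and never accumulates toward the threshold.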
Reflect the lowered create-form default (30 → 15) in the Agents guide and note that unattended (trigger/scheduled) runs now stop early — and are recorded as failed — when the model makes no progress. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TRIGGER_PER_STEP_TIMEOUT_MS and TRIGGER_PER_RUN_TIMEOUT_MS bound unattended runs and are the wall-clock backstop alongside the maxSteps ceiling and the new no-progress guard. They were read from env but undocumented. Note: still need adding to apps/backend/.env.example (source of truth). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
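For readers unfamiliar with how such env-driven timeouts are typically consumed, the following is a sketch under stated assumptions. The helper `readTimeoutMs` is hypothetical, and the fallback values are placeholders chosen for illustration, not the project's real defaults. Only the two variable names come from the commit message above.

```typescript
// Hypothetical helper: parse a millisecond timeout from an env-like map,
// falling back when the variable is unset, empty, or not a positive number.
function readTimeoutMs(
  env: Record<string, string | undefined>,
  name: string,
  fallbackMs: number,
): number {
  const raw = env[name];
  const parsed = raw === undefined ? Number.NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// Stand-in for process.env so the example is self-contained.
const exampleEnv: Record<string, string | undefined> = {
  TRIGGER_PER_STEP_TIMEOUT_MS: "60000",
  TRIGGER_PER_RUN_TIMEOUT_MS: undefined,
};

// Placeholder fallbacks (30 s per step, 5 min per run): NOT the real defaults.
const perStepMs = readTimeoutMs(exampleEnv, "TRIGGER_PER_STEP_TIMEOUT_MS", 30_000);
const perRunMs = readTimeoutMs(exampleEnv, "TRIGGER_PER_RUN_TIMEOUT_MS", 300_000);
```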
…nd vars Both are read by the frontend (next.config.ts, About page) but were absent from the frontend configuration reference. NODE_ENV is intentionally left out as a standard Node convention rather than Platypus config. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #268.
Problem
A trigger/scheduled (unattended) agent run that fails to converge, repeatedly issuing the same tool call and getting the same result with no resulting state change, keeps going until it hits `maxSteps`, with context and token cost growing every step. The only backstop today is the step ceiling, and the defaults are poorly chosen (create form 30; backend fallback for an agent with no explicit value is 1). This is most likely on weaker self-hosted models (vLLM/Ollama).

Changes
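No-progress detection (unattended runs only): `apps/backend/src/runs/no-progress.ts`
- A run-scoped, in-memory detector tracks a signature per tool call (name + normalized args + a hash of the result) and aborts when the same full signature recurs `K = 3` times within the run, even when other calls are interleaved (occurrences need not be consecutive).
- Added as a `stopWhen` condition included only on the headless `generate` path (the trigger/scheduled execution path). Interactive/streamed runs are untouched.
- The run is recorded as failed with a machine-readable `no_progress:` reason naming the repeated tool and count (persisted to `triggerRun.errorMessage`), and the abort is logged. It does not appear as a successful run.

Lower `maxSteps` defaults
- New-agent create form default: 30 → 15 (`apps/frontend/components/agent-form.tsx`).
- Backend fallback for an agent with no explicit `maxSteps`: 1 → 15 (`DEFAULT_AGENT_MAX_STEPS` in `chat-execution.ts`).
- `maxSteps` stays the single user-owned ceiling everywhere; there is no separate clamp for triggered runs.

Tests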
- `no-progress.test.ts`: interleaved same-result abort at K; changing-result no-abort; result-in-signature; arg-key-order normalization; distinct-args no-collision; custom + default threshold; sticky once tripped; `NoProgressError` shape.
- `agent-runner.test.ts`: unattended `generate` records a `no_progress` failure when it trips; no abort when results change; interactive `stream` has no no-progress condition.
- `chat-execution.test.ts`: agent without an explicit `maxSteps` resolves to 15.

`pnpm typecheck`, `pnpm lint`, and the full backend suite (974 tests) pass.

🤖 Generated with Claude Code
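
The `maxSteps` fallback and the generate-only guard described under Changes might be wired along these lines. This is a simplified sketch, not the PR's code: the `StopCondition` type and the functions `resolveMaxSteps` and `buildStopConditions` are hypothetical, and the real `stopWhen` conditions may have a different shape. Only the constant `DEFAULT_AGENT_MAX_STEPS = 15` and the `generate`/`stream` split come from the description.

```typescript
// From the PR description: fallback when an agent has no explicit maxSteps.
const DEFAULT_AGENT_MAX_STEPS = 15;

// Hypothetical, simplified stop-condition shape: given the number of steps
// taken so far, return true when the run should stop.
type StopCondition = (stepsSoFar: number) => boolean;

type AgentConfig = { maxSteps?: number | null };

// An explicit user-set value always wins; the default applies only when
// maxSteps is missing or null.
function resolveMaxSteps(agent: AgentConfig): number {
  return agent.maxSteps ?? DEFAULT_AGENT_MAX_STEPS;
}

// The step ceiling applies to every run. The no-progress condition is added
// only on the headless "generate" (trigger/scheduled) path.
function buildStopConditions(
  mode: "generate" | "stream",
  agent: AgentConfig,
  noProgressTripped: () => boolean,
): StopCondition[] {
  const ceiling = resolveMaxSteps(agent);
  const conditions: StopCondition[] = [(stepsSoFar) => stepsSoFar >= ceiling];
  if (mode === "generate") {
    conditions.push(() => noProgressTripped());
  }
  return conditions;
}

// An API-created agent with no maxSteps, run unattended and interactively.
const unattended = buildStopConditions("generate", {}, () => false);
const interactive = buildStopConditions("stream", {}, () => false);
```

Keeping the ceiling in a single resolver matches the stated design choice that `maxSteps` is the one user-owned limit, with no separate clamp for triggered runs.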