fix(backend): abort no-progress unattended runs and lower maxSteps defaults by willdady · Pull Request #271 · willdady/platypus

willdady · 2026-06-23T21:23:53Z

Closes #268.

Problem

A trigger/scheduled (unattended) agent run that fails to converge — repeatedly issuing the same tool call and getting the same result with no resulting state change — keeps going, with context and token cost growing every step, until it hits maxSteps. The only backstop today is the step ceiling, and the defaults are poorly chosen (create form 30; backend fallback for an agent with no explicit value is 1). This is most likely on weaker self-hosted models (vLLM/Ollama).

Changes

No-progress detection (unattended runs only) — apps/backend/src/runs/no-progress.ts

A run-scoped, in-memory detector tracks a signature per tool call: tool name + a stable normalized serialization of its arguments + a hash of the result.
Aborts when the same full signature recurs K = 3 times within the run, even when other calls are interleaved (occurrences need not be consecutive).
The result is part of the signature, so a repeated call whose result differs (e.g. a board re-read after an intervening write that changed it) is legitimate progress and does not count.
Strictly intra-run and in-memory: recomputes counts from the steps the AI SDK already accumulates on each stop-condition evaluation. No external store; no state crosses a run/process boundary.
Wired as an additional stopWhen condition included only on the headless generate path (the trigger/scheduled execution path). Interactive/streamed runs are untouched.
On trip, the run is recorded as failed with a machine-readable no_progress: reason naming the repeated tool and count (persisted to triggerRun.errorMessage), and the abort is logged. It does not appear as a successful run.
Fail-safe: if a result legitimately differs every call (volatile timestamps etc.), signatures never collide, the detector under-counts, and the step ceiling remains the backstop — rather than risk a false abort.
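
The detection rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in no-progress.ts: the `ToolCallRecord` shape, the helper names (`stableStringify`, `fnv1a`, `signatureOf`, `findNoProgress`, `formatNoProgressReason`), the choice of a 32-bit FNV-1a hash, and the message wording after the `no_progress:` prefix are all hypothetical.

```typescript
// Hypothetical record of one completed tool call; the real step objects
// accumulated by the AI SDK carry more fields than this sketch needs.
interface ToolCallRecord {
  toolName: string;
  args: unknown;
  result: unknown;
}

interface NoProgressHit {
  toolName: string;
  count: number;
}

// K from the description: abort on the third occurrence of a signature.
const DEFAULT_THRESHOLD = 3;

// Serialize with object keys sorted, so {a: 1, b: 2} and {b: 2, a: 1}
// produce the same string.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") {
    // String(...) covers undefined, which JSON.stringify cannot represent.
    return String(JSON.stringify(value));
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
  return `{${entries.join(",")}}`;
}

// Small non-cryptographic hash (FNV-1a, 32-bit), chosen only to keep the
// sketch dependency-free; a real implementation may use a stronger hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Signature = tool name + normalized args + hash of the result.
function signatureOf(call: ToolCallRecord): string {
  const args = stableStringify(call.args);
  const resultHash = fnv1a(stableStringify(call.result));
  return `${call.toolName}|${args}|${resultHash}`;
}

// Recount from the full list on every evaluation (no state is kept), and
// report the first signature that reaches the threshold. Occurrences do
// not need to be consecutive.
function findNoProgress(
  calls: ToolCallRecord[],
  threshold: number = DEFAULT_THRESHOLD,
): NoProgressHit | null {
  const counts = new Map<string, number>();
  for (const call of calls) {
    const signature = signatureOf(call);
    const count = (counts.get(signature) ?? 0) + 1;
    counts.set(signature, count);
    if (count >= threshold) {
      return { toolName: call.toolName, count };
    }
  }
  return null;
}

// Hypothetical wording; only the "no_progress:" prefix comes from the PR.
function formatNoProgressReason(hit: NoProgressHit): string {
  return `no_progress: tool "${hit.toolName}" repeated ${hit.count} times with identical arguments and result`;
}
```

Because `findNoProgress` is recomputed from the whole call list, a stop condition could call it on each evaluation and abort when it returns a non-null hit. A result that changes on every call yields a new signature each time, so the function returns `null` and the step ceiling remains the only limit, matching the fail-safe behaviour described above.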

Lower maxSteps defaults

New-agent create form default 30 → 15 (apps/frontend/components/agent-form.tsx).
Backend fallback for an agent with no explicit maxSteps 1 → 15 (DEFAULT_AGENT_MAX_STEPS in chat-execution.ts).
maxSteps stays the single user-owned ceiling everywhere — no separate clamp for triggered runs.
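
The backend fallback can be pictured with the short sketch below. Only the constant name `DEFAULT_AGENT_MAX_STEPS` and its value of 15 come from this PR; the `AgentLike` type and the `resolveMaxSteps` helper are hypothetical names used for illustration, and the real logic in chat-execution.ts may be structured differently.

```typescript
// Value and constant name from the PR description.
const DEFAULT_AGENT_MAX_STEPS = 15;

// Hypothetical minimal agent shape: maxSteps may be absent or null.
interface AgentLike {
  maxSteps?: number | null;
}

// Use the agent's own ceiling when set, otherwise fall back to the default.
function resolveMaxSteps(agent: AgentLike): number {
  return agent.maxSteps ?? DEFAULT_AGENT_MAX_STEPS;
}
```

Under these assumptions an agent created through the API without a `maxSteps` value resolves to 15, while an agent with an explicit value keeps it unchanged for both interactive and triggered runs.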

Tests

no-progress.test.ts — interleaved same-result abort at K; changing-result no-abort; result-in-signature; arg-key-order normalization; distinct-args no-collision; custom + default threshold; sticky once tripped; NoProgressError shape.
agent-runner.test.ts — unattended generate records a no_progress failure when it trips; no abort when results change; interactive stream has no no-progress condition.
chat-execution.test.ts — agent without an explicit maxSteps resolves to 15.

pnpm typecheck, pnpm lint, and the full backend suite (974 tests) pass.

🤖 Generated with Claude Code

…faults

Unattended (trigger/scheduled) agent runs could burn compute up to the step ceiling when a model failed to converge, re-issuing the same tool call for the same result without making progress. Weaker self-hosted models (vLLM/Ollama) hit this most.

Add no-progress detection on the headless `generate` path only: a run-scoped, in-memory detector tracks a signature per tool call (name + normalized args + a hash of the result) and aborts when the same full signature recurs K=3 times within the run, even with unrelated calls interleaved. The result is part of the signature, so a repeated call whose result differs (e.g. a re-read after a write that changed state) is legitimate progress and does not count. The run is recorded as failed with a machine-readable `no_progress` reason naming the repeated tool, and the abort is logged. Interactive/streamed runs are unaffected.

Also lower the maxSteps defaults: the new-agent create form now defaults to 15 (was 30), and the backend fallback for an agent with no explicit maxSteps is 15 (was 1) so API-created agents behave sanely. maxSteps remains the single user-owned ceiling, with no separate clamp for triggers.

Closes #268

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Reflect the lowered create-form default (30 → 15) in the Agents guide and note that unattended (trigger/scheduled) runs now stop early, and are recorded as failed, when the model makes no progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

TRIGGER_PER_STEP_TIMEOUT_MS and TRIGGER_PER_RUN_TIMEOUT_MS bound unattended runs and are the wall-clock backstop alongside the maxSteps ceiling and the new no-progress guard. They were read from env but undocumented. Note: they still need to be added to apps/backend/.env.example (the source of truth).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…nd vars

Both are read by the frontend (next.config.ts, About page) but were absent from the frontend configuration reference. NODE_ENV is intentionally left out as a standard Node convention rather than Platypus config.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

willdady and others added 5 commits June 23, 2026 18:23

chore: updated backend .env.example

5d76c9c

willdady merged commit 908d0a0 into main Jun 23, 2026
4 checks passed

willdady mentioned this pull request Jun 23, 2026

chore(main): release 1.99.1 #270

Merged

willdady merged 5 commits into main from fix/268-no-progress-termination
