
fix(hitl): log pre-review failures and add learn_strict mode (closes #5725) #5726

Open
devin-ai-integration[bot] wants to merge 1 commit into main from devin/1778040886-fix-hitl-pre-review-silent-fallback

Conversation

@devin-ai-integration
Contributor

Summary

Closes #5725.

@human_feedback(..., learn=True) runs _pre_review_with_lessons to apply past HITL lessons via an LLM before the human sees the output. Today, that helper wraps the entire pre-review path in a broad except Exception: return method_output, which swallows LLM, network, auth, and structured-output errors with no log, no event, and no return signal; callers cannot distinguish a pre-reviewed output from a raw one.

This PR makes the failure observable by default, and adds opt-in fail-closed behavior:

lib/crewai/src/crewai/flow/human_feedback.py

  • Add a module logger (logging.getLogger(__name__)).
  • Narrow the try/except in _pre_review_with_lessons so that the "mem is None" and "not matches" short-circuits stay outside the failure path (those are not errors).
  • On memory.recall failure or pre-review LLM failure, log WARNING with exc_info=True so the silent fallback is detectable. By default the flow still falls back to the raw method_output and continues — we don't change the default fail-open contract.
  • Add learn_strict: bool = False to the human_feedback decorator and HumanFeedbackConfig. When learn_strict=True, recall/pre-review failures are re-raised instead of falling back, for callers that need fail-closed behavior.
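The narrowed control flow could be sketched as follows (a minimal sketch: the memory and review_llm parameters are hypothetical stand-ins; the short-circuit placement, the two WARNING strings, and the strict re-raise behavior are from the PR):

```python
import logging

logger = logging.getLogger(__name__)

def pre_review_with_lessons(method_output, memory, review_llm, strict=False):
    # Short-circuits are not errors, so they live outside any try/except.
    if memory is None:
        return method_output
    try:
        matches = memory.recall(method_output)
    except Exception:
        logger.warning(
            "HITL pre-review: memory recall failed for %s; falling back to raw output.",
            "step", exc_info=True,
        )
        if strict:  # learn_strict-style fail-closed behavior
            raise
        return method_output
    if not matches:
        return method_output
    try:
        return review_llm(method_output, matches)
    except Exception:
        logger.warning(
            "HITL pre-review failed for %s; falling back to raw output.",
            "step", exc_info=True,
        )
        if strict:
            raise
        return method_output
```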

docs/en/learn/human-feedback-in-flows.mdx

  • Update the "Graceful degradation" / Configuration section to document that pre-review failures are now logged at WARNING with traceback and to introduce the learn_strict parameter.

lib/crewai/tests/test_human_feedback_decorator.py

  • New TestHumanFeedbackPreReviewFailure class with 7 tests:
    • sync pre-review LLM failure → WARNING logged with exc_info, raw output shown to human;
    • sync memory.recall failure → WARNING logged with exc_info, raw output shown to human;
    • sync learn_strict=True → pre-review LLM RuntimeError propagates;
    • sync learn_strict=True → recall RuntimeError propagates;
    • async pre-review LLM failure → WARNING logged with exc_info, raw output shown to human;
    • async learn_strict=True → pre-review LLM RuntimeError propagates;
    • learn_strict defaults to False and round-trips through HumanFeedbackConfig.
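The fail-open and strict assertions can be sketched without the real decorator (names here are hypothetical; only the WARNING-with-exc_info plus raw-fallback contract and the strict re-raise are from the PR):

```python
import logging

logger = logging.getLogger("hitl_demo")

def flaky_pre_review(output, llm, strict=False):
    # Hypothetical stand-in for the decorator's pre-review step.
    try:
        return llm(output)
    except Exception:
        logger.warning("HITL pre-review failed; falling back to raw output.", exc_info=True)
        if strict:
            raise
        return output

def test_fail_open_logs_warning():
    records = []
    handler = logging.Handler()
    handler.emit = records.append  # capture LogRecord objects instead of formatting them
    logger.addHandler(handler)
    try:
        def down(_):
            raise RuntimeError("LLM down")
        assert flaky_pre_review("raw", down) == "raw"  # fail-open: raw output returned
        assert any(r.levelno == logging.WARNING and r.exc_info for r in records)
    finally:
        logger.removeHandler(handler)

def test_strict_propagates():
    def down(_):
        raise RuntimeError("LLM down")
    try:
        flaky_pre_review("raw", down, strict=True)
    except RuntimeError:
        return
    raise AssertionError("strict mode should re-raise")
```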

Behavior outside the pre-review step (distillation, outcome collapse, resume emit fallback, listener failures) is intentionally untouched — issue #5725 is scoped only to pre-review.

All 40 tests in test_human_feedback_decorator.py pass locally. ruff check, ruff format --check, and mypy on the modified file are clean.

Review & Testing Checklist for Human

  • Confirm the new learn_strict decorator argument name matches your preferred public API surface (the issue suggested it as one option; alternatives like pre_review_strict or a separate strict= flag would be one-line renames).
  • Confirm fail-open + WARNING is the right default. The PR keeps the documented "Graceful degradation" contract but makes it observable. If you want fail-closed by default, learn_strict would just need to be flipped (and one test inverted).
  • Sanity-check the WARNING log strings ("HITL pre-review failed for %s; falling back to raw output." and "HITL pre-review: memory recall failed for %s; falling back to raw output.") — these will show up in user logs.
  • Spot-check the docs update at docs/en/learn/human-feedback-in-flows.mdx — only the English doc was updated; the pt-BR/ko/ar mirrors still describe silent degradation. Happy to follow up with translations if you'd like them in this PR.
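For reviewers checking how the new WARNING surfaces, a small logging sketch (the logger name is taken from the PR's notes; adjust if the module moves):

```python
import logging

# Surface the fail-open warnings in an application with no logging configured:
logging.basicConfig(level=logging.WARNING)

# Or, if the fallback is expected in your environment, silence just this logger:
logging.getLogger("crewai.flow.human_feedback").setLevel(logging.ERROR)
```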

Notes

  • Repro from the issue (@human_feedback(learn=True) + seeded memory + LLM raising during pre-review) now produces a single WARNING under the crewai.flow.human_feedback logger with full traceback, instead of silent passthrough.
  • The change is intentionally minimal and scoped to _pre_review_with_lessons. Distillation's existing except Exception: pass remains untouched per the issue's "out of scope" note, but it would be a natural follow-up to give it the same WARNING-on-failure treatment.

Link to Devin session: https://app.devin.ai/sessions/8615574d124449a5a1b09ea4d1fd88bd

Closes #5725.

In _pre_review_with_lessons (lib/crewai/src/crewai/flow/human_feedback.py),
a broad except Exception: return method_output silently swallowed any
LLM, network, auth, or structured-output failure during the pre-review
step. Callers could not distinguish pre-reviewed output from raw output,
and there was no log or event surfaced.

Changes:
- Add a module logger.
- Narrow the try/except in _pre_review_with_lessons so that the
  "mem is None" and "not matches" short-circuits stay outside the
  failure path (those are not error cases).
- On recall or LLM pre-review failure, log a WARNING with exc_info=True
  so the silent fallback is observable. Continue to fall back to the raw
  method_output so the flow does not break.
- Add an opt-in learn_strict=True parameter on the human_feedback
  decorator and HumanFeedbackConfig that re-raises pre-review failures
  instead of falling back, for callers that need fail-closed behavior.
- Update the Graceful degradation docs section to reflect the new
  observable-by-default behavior and document learn_strict.

Tests (lib/crewai/tests/test_human_feedback_decorator.py):
- New TestHumanFeedbackPreReviewFailure class with 7 tests covering
  WARNING logging on LLM and recall failures, learn_strict propagation
  in both sync and async paths, and config introspection of the new
  learn_strict flag.

Co-Authored-By: João <joao@crewai.com>



mintlify Bot commented May 6, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project | Status | Preview | Updated (UTC)
crewai | 🟢 Ready | View Preview | May 6, 2026, 4:28 AM




Development

Successfully merging this pull request may close these issues.

[BUG]: HITL pre-review fails open and silently bypasses automated safeguards
