fix(proxy): invoke post-call guardrails on pass-through endpoint responses #2
Open
tuhinspatra wants to merge 2 commits into
032ed13 to
a2d1d95
Compare
…onses (BerriAI#20270)

Wire post_call_success_hook into non-streaming pass-through response path, gated on explicit guardrail config (opt-in only, no backwards-compat break).

- Call post_call_success_hook after reading non-streaming response body
- Build enriched hook_data with guardrails metadata and litellm_logging_obj at call site (avoids mutation of _parsed_body, which is shared by logging)
- Handle ModifyResponseException with provider-agnostic error envelope, post_call_failure_hook, and defensive try/except
- Strip stale content-length when guardrail modifies response body
- Move ModifyResponseException to litellm.exceptions to break cyclic import; re-export from custom_guardrail for backwards compat
- Add call_type fallback in UnifiedLLMGuardrails for pass-through endpoints using CallTypes.pass_through.value enum
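The control flow described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: `apply_post_call_guardrails`, the hook signatures, the stand-in `ModifyResponseException` class, and the error envelope shape are all hypothetical names and shapes chosen for the example.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional, Tuple


class ModifyResponseException(Exception):
    """Stand-in (hypothetical) for the exception a guardrail raises to replace a response."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message


SuccessHook = Callable[[Dict[str, Any], Dict[str, Any]], Awaitable[Optional[Dict[str, Any]]]]
FailureHook = Callable[[Dict[str, Any], Exception], Awaitable[None]]


def _without_content_length(headers: Dict[str, str]) -> Dict[str, str]:
    # The upstream content-length no longer matches once the body changes.
    return {k: v for k, v in headers.items() if k.lower() != "content-length"}


async def apply_post_call_guardrails(
    parsed_body: Dict[str, Any],
    guardrails: Optional[List[str]],
    response_body: Dict[str, Any],
    headers: Dict[str, str],
    success_hook: SuccessHook,
    failure_hook: FailureHook,
) -> Tuple[int, Dict[str, Any], Dict[str, str]]:
    # Opt-in gate: with no guardrails configured, behave exactly as before.
    if not guardrails:
        return 200, response_body, headers

    # Build a fresh dict so the shared parsed_body is never mutated.
    metadata = {**(parsed_body.get("metadata") or {}), "guardrails": guardrails}
    hook_data = {**parsed_body, "metadata": metadata}

    try:
        modified = await success_hook(hook_data, response_body)
    except ModifyResponseException as exc:
        try:
            await failure_hook(hook_data, exc)
        except Exception:
            pass  # Defensive: a failing failure hook must not mask the guardrail result.
        # Hypothetical provider-agnostic envelope, returned with status 200.
        envelope = {"error": {"message": exc.message, "type": "guardrail_violation"}}
        return 200, envelope, _without_content_length(headers)

    if modified is not None and modified != response_body:
        return 200, modified, _without_content_length(headers)
    return 200, response_body, headers
```

In this sketch the success hook may return a replacement body, return `None` to leave the response untouched, or raise `ModifyResponseException` to block it; only the last two branches that change the body drop `content-length`.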
5 tests covering the post-call guardrail invocation on pass-through endpoints:

- post_call_success_hook fires when guardrails configured
- post_call_success_hook skipped when no guardrails (backwards compat)
- ModifyResponseException returns 200 with provider-agnostic error
- UnifiedLLMGuardrails resolves call_type from logging_obj for pass-through
- ModifyResponseException re-export from custom_guardrail stays in sync
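The call_type resolution exercised by the fourth test might look something like the sketch below. Everything here is an assumption made for illustration: `resolve_call_type` is a hypothetical helper, the `CallTypes` enum is a local stand-in with an assumed value, and the real fallback inside UnifiedLLMGuardrails may differ.

```python
from enum import Enum
from types import SimpleNamespace
from typing import Any, Dict, Optional


class CallTypes(str, Enum):
    # Local stand-in enum; the string values are assumed for this example.
    completion = "completion"
    pass_through = "pass_through_endpoint"


def resolve_call_type(data: Dict[str, Any], logging_obj: Optional[Any]) -> Optional[str]:
    """Prefer an explicit call_type in the request data, else fall back to the logging object."""
    explicit = data.get("call_type")
    if explicit:
        return explicit
    # Pass-through requests may carry no call_type in their data,
    # so read it from the logging object when one is attached.
    return getattr(logging_obj, "call_type", None)


# Usage: a pass-through request whose data carries no call_type.
logging_obj = SimpleNamespace(call_type=CallTypes.pass_through.value)
resolved = resolve_call_type({"model": "gemini"}, logging_obj)
```

Comparing against `CallTypes.pass_through.value` rather than a bare string literal keeps the check tied to the enum, which is the approach the commit message names.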
a2d1d95 to
32cdace
Compare
seph-barker
pushed a commit
that referenced
this pull request
May 18, 2026
…ps (BerriAI#28028)

* fix(ci): flag codecov uploads and enable carryforward

  Coverage uploads from GHA and CircleCI were unflagged. Commits that receive the push-triggered workflows more than once (re-runs, or branches cut at the same SHA) accumulated many overlapping flagless sessions, and Codecov's per-commit merge dropped the largest, ubiquitously-imported files (router.py, proxy_server.py, main.py, utils.py, cost_calculator.py) from the report even though the uploaded XMLs contained them.

  - codecov.yaml: flag_management.default_rules.carryforward: true
  - GHA reusable bases: tag each upload with its workflow/shard name
  - CircleCI: tag the combined upload "circleci"; also combine the agent / google_generate_content_endpoint / litellm_utils datafiles that were produced and required but missing from the combine list

* fix(ci): close coverage gaps in proxy-legacy, router-unit, auth-ui, caching-redis

  - test-unit-proxy-legacy: route through _test-unit-base so the full proxy_unit_tests suite (incl. comprehensive test_proxy_server*.py) is measured and uploaded with per-group flags (was plain pytest, no --cov)
  - _test-unit-services-base: declare the enable-redis input + the six secrets test-unit-caching-redis passes; that workflow had a workflow_call signature mismatch and startup_failed on every push (never ran). Changes are additive/optional - proxy-db and security callers unchanged
  - circleci: add --cov + persist + combine + upload-coverage requires for litellm_router_unit_testing (tests/router_unit_tests) and auth_ui_unit_tests (tests/proxy_admin_ui_tests); neither was covered anywhere. Redundant -k subset jobs left as-is (local_testing covers them)

* fix(ci): remove dead GHA Redis workflow; keep Redis on CircleCI only

  CircleCI redis_caching_unit_tests already runs the exact same files (tests/local_testing/test_dual_cache.py, test_redis_batch_optimizations.py, test_router_utils.py) with --cov, and that datafile is already combined and uploaded. The GHA test-unit-caching-redis workflow was redundant and had never run (workflow_call signature mismatch -> startup_failure on every push).

  - Delete .github/workflows/test-unit-caching-redis.yml
  - Revert _test-unit-services-base.yml to the flag-fix state (drop the enable-redis input / secrets / env wiring added only to prop up the GHA Redis workflow); the verified per-upload flags line is kept
  - The only single-star "litellm_*" branch glob lived in the deleted file; no other single-star globs exist, so none remain to widen

* fix(ci): keep proxy-legacy as a standalone job to preserve required check names

  Routing proxy-legacy through the reusable workflow renamed each check from the bare matrix name (e.g. "proxy-response-and-misc") to "proxy-response-and-misc / Run tests". Those bare names are required status checks in branch protection, so the old contexts never reported and PRs sat "Expected — Waiting for status to be reported" indefinitely. Restore the original standalone matrix job (job name == matrix name, so the required contexts report again) and add coverage in place: --cov on pytest plus an OIDC Codecov upload flagged proxy-legacy-<group>. Net effect of the gap-#2 fix is preserved (flagged coverage for tests/proxy_unit_tests/**) without changing any check name.

* revert(ci): drop all proxy-legacy changes from this PR

  tests/proxy_unit_tests/** is already fully covered by test-unit-proxy-db (its shard-coverage guard fails CI if any file in that dir is unassigned), which this PR already flags + carryforwards. Adding --cov and id-token:write to the legacy pull_request job was redundant and put OIDC on a job that runs untrusted PR code. Restore the file to the base version verbatim so this PR no longer touches proxy-legacy at all (also restores its original required check names). Retiring proxy-legacy in favor of proxy-db on pull_request is a separate effort that needs a branch-protection change.
Summary
Upstream bug fix for BerriAI/litellm — enables post-call guardrail invocation on pass-through endpoints.
post_call BerriAI/litellm#20270
What this enables
The Rubrik guardrail plugin (predibase/llmrouter) uses apply_guardrail(input_type="response") to check tool calls. Without this fix, pass-through routes like /vertex_ai/* skip post-call hooks entirely, so tool blocking never fires for Gemini models accessed via pass-through.
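To make the tool-blocking use case concrete, here is a minimal sketch of the kind of check a response-side guardrail could run on a Gemini-style response body. It is not the Rubrik plugin's code: `find_blocked_tool_calls` and the blocked tool names are hypothetical, and the sketch assumes the response nests tool calls as `functionCall` entries under `candidates[].content.parts[]`.

```python
from typing import Any, Dict, List, Set


def find_blocked_tool_calls(response: Dict[str, Any], blocked: Set[str]) -> List[str]:
    """Return the names of tool calls in the response that appear in the blocked set."""
    hits: List[str] = []
    for candidate in response.get("candidates", []):
        parts = (candidate.get("content") or {}).get("parts", [])
        for part in parts:
            call = part.get("functionCall")
            if call and call.get("name") in blocked:
                hits.append(call["name"])
    return hits


# Usage with a hand-written example response (tool names are made up).
example_response = {
    "candidates": [
        {
            "content": {
                "parts": [
                    {"text": "Deleting the file now."},
                    {"functionCall": {"name": "delete_file", "args": {"path": "/tmp/a"}}},
                    {"functionCall": {"name": "read_file", "args": {"path": "/tmp/b"}}},
                ]
            }
        }
    ]
}
violations = find_blocked_tool_calls(example_response, {"delete_file"})
```

A check like this can only run if the proxy hands the response body to post-call hooks, which is the gap this PR closes for pass-through routes.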
Changes
- Wire post_call_success_hook into non-streaming pass-through response path
- Add call_type fallback in UnifiedLLMGuardrails for pass-through endpoints
- Handle ModifyResponseException with proper logging and failure hooks

Test Plan
- test_post_call_success_hook_called_when_guardrails_configured: verifies hook fires with guardrails
- test_post_call_success_hook_skipped_when_no_guardrails: verifies no-op without guardrails (backwards compat)
- test_modify_response_exception_returns_200: verifies guardrail violation returns 200 with message, calls post_call_failure_hook, includes usage
- test_pass_through_call_type_resolved_from_logging_obj: verifies unified guardrail resolves call_type for pass-through endpoints

Run tests:
Next steps
PassThroughEndpointHandler (follow-up)