feat(ai): wire external HTTP guardrail providers on the input path #551
Merged
An AI origin's `guardrails.external` list now runs external guardrail
services (Presidio, Lakera, Aporia, or a custom endpoint) alongside the
built-in checks. Input-mode entries (pre_call / during_call) inspect the
request before dispatch and block on a not-allowed verdict; logging_only
records only, and transport or parse errors honor each entry's fail_open
flag.
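
The blocking rule described above can be sketched as a pure function. This is a minimal illustration, not the PR's code: the `Mode` and `Verdict` enums and the function signature are assumptions, as is the choice to treat both `flagged` and `blocked` as "not allowed".

```rust
/// When a guardrail entry runs. Variant names mirror the mode strings
/// mentioned in the text; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    PreCall,
    DuringCall,
    PostCall,
    LoggingOnly,
}

/// Verdict a provider response is normalized to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Allowed,
    Flagged,
    Blocked,
}

/// Stand-in for the PR's `verdict_blocks`: a request is blocked only when
/// the mode is a blocking input mode AND the content is not allowed.
pub fn verdict_blocks(mode: Mode, verdict: Verdict) -> bool {
    // Assumption: on the input path only pre_call and during_call can block;
    // logging_only records the verdict and post_call is not evaluated here.
    let mode_blocks = matches!(mode, Mode::PreCall | Mode::DuringCall);
    // Assumption: both `flagged` and `blocked` count as "not allowed".
    let disallowed = verdict != Verdict::Allowed;
    mode_blocks && disallowed
}
```

Keeping this decision free of I/O is what makes it unit-testable without a running guardrail service.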
Provider presets shape the request and response: Presidio posts
{text, language} and treats a non-empty findings array as a flag; the
others post {input} and parse a common allowed/flagged/blocked verdict,
with an optional API key on a configurable auth header.
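
To make the preset shapes concrete, here is a rough sketch of the request shaping using only the standard library. The `Provider` enum, the function names, and the hand-rolled JSON escaping are illustrative assumptions; the real code presumably uses a JSON library. Sending the key bare when the prefix is empty is also an assumption.

```rust
/// Provider presets named in the text (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Generic,
    Presidio,
    Lakera,
    Aporia,
}

/// Minimal JSON string escaping so the sketch needs no external crate.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Presidio gets `{text, language}`; the other presets get `{input}`.
pub fn request_body(provider: Provider, text: &str, language: &str) -> String {
    let text = json_escape(text);
    match provider {
        Provider::Presidio => format!(
            "{{\"text\":\"{}\",\"language\":\"{}\"}}",
            text,
            json_escape(language)
        ),
        Provider::Generic | Provider::Lakera | Provider::Aporia => {
            format!("{{\"input\":\"{}\"}}", text)
        }
    }
}

/// Builds the optional auth header, defaulting to `Authorization: Bearer <key>`.
pub fn auth_header(
    api_key: Option<&str>,
    header: Option<&str>,
    prefix: Option<&str>,
) -> Option<(String, String)> {
    let key = api_key?; // no key configured: send no auth header
    let name = header.unwrap_or("Authorization");
    let prefix = prefix.unwrap_or("Bearer");
    // Assumption: an empty prefix means the key is sent bare.
    let value = if prefix.is_empty() {
        key.to_string()
    } else {
        format!("{} {}", prefix, key)
    };
    Some((name.to_string(), value))
}

/// Presidio response rule: any finding at all counts as a flag.
pub fn presidio_flagged<T>(findings: &[T]) -> bool {
    !findings.is_empty()
}
```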
The decision logic (verdict_blocks plus the provider request/response
shaping) is pure and unit-tested; the dispatch wiring is a thin async
call before the built-in pipeline. Output-side and AWS Bedrock (SigV4)
guardrails are not yet wired.
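
As a rough picture of the dispatch step, the loop below evaluates the default-on input-mode entries in order and stops at the first block. It is a synchronous sketch under stated assumptions: the `Entry` struct, the `run_input_guardrails_sketch` name, and the closure standing in for the HTTP call are all hypothetical, and the real wiring is async. It also assumes a `logging_only` entry never blocks, even when the call fails.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    PreCall,
    DuringCall,
    PostCall,
    LoggingOnly,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Allowed,
    Flagged,
    Blocked,
}

/// Hypothetical, trimmed-down view of one `guardrails.external` entry.
pub struct Entry {
    pub name: &'static str,
    pub mode: Mode,
    pub default_on: bool,
    pub fail_open: bool,
}

/// Returns the name of the first entry that blocks the request, if any.
/// `call` stands in for the HTTP round trip to the provider.
pub fn run_input_guardrails_sketch<F>(entries: &[Entry], mut call: F) -> Option<&'static str>
where
    F: FnMut(&Entry) -> Result<Verdict, String>,
{
    for entry in entries {
        // Only default-on entries run; post_call is not part of the input path.
        if !entry.default_on || entry.mode == Mode::PostCall {
            continue;
        }
        let enforcing = matches!(entry.mode, Mode::PreCall | Mode::DuringCall);
        let blocks = match call(entry) {
            // Assumption: `flagged` and `blocked` both count as not allowed.
            Ok(verdict) => enforcing && verdict != Verdict::Allowed,
            // Transport or parse error: fail_open lets the request through.
            Err(_) => enforcing && !entry.fail_open,
        };
        if blocks {
            return Some(entry.name);
        }
    }
    None
}
```

A caller would turn a `Some(name)` result into the block response and otherwise continue into the built-in pipeline.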
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01X19S6eQzKKExZ9RUPAHuGy
…-external-guardrails # Conflicts: # CHANGELOG.md
What

Wires external HTTP guardrail providers into the AI gateway. An origin's `guardrails.external` list runs external guardrail services alongside the built-in checks; input-mode entries inspect the request before dispatch and block on a not-allowed verdict.

The generic adapter and LiteLLM `mode` mapping already existed but were never called by the runtime. This connects them and adds provider presets.

How

- `GuardrailsConfig.external: Vec<ExternalGuardrailConfig>` (new, `#[serde(default)]`, no schema change since the type is deserialize-only).
- Provider presets (`GuardrailProvider`): Presidio posts `{text, language}` and treats a non-empty findings array as a flag; Generic/Lakera/Aporia post `{input}` and parse a common `allowed`/`flagged`/`blocked` verdict. Optional `api_key` on a configurable `auth_header`/`auth_prefix` (defaults to `Authorization: Bearer`).
- `verdict_blocks()` decides blocking (mode blocks AND content disallowed; `logging_only` never blocks); `run_input_external_guardrails()` evaluates the `default_on` input-mode entries and returns the first block.
- `400 guardrail_violation` on a block, matching the existing built-in guardrail block shape. Errors honor each entry's `fail_open`.

Scope / follow-ups

- Output-side (`post_call`) external guardrails and AWS Bedrock (SigV4 `ApplyGuardrail`) are not yet wired and are noted as follow-ups.
- `default_on: true` runs a guardrail on every request; per-request opt-in via metadata is not included.

Tests

- `sbproxy-ai`: Presidio vs generic request/response shapes, `verdict_blocks` across modes, config parsing with provider + auth defaults (8 external-guardrail tests pass).
- `sbproxy-ai` (962) and `sbproxy-core` (459) lib suites pass; clippy `-D warnings` and rustdoc `-D warnings -D missing_docs` clean; regenerated config schema is byte-identical (no schema change).