Harden the agent gate and CI action for value-routing and untrusted CI #70
Merged
Tighten the trust surface a behavioral grade is used through, without changing
the default decision for read-only/low-value callers.
- agent gate: add opt-in GateOptions (allowedAttesters, acceptedMethodologyVersions,
requireEgressVerified), all default-off so the base decision is unchanged; export
PAYMENT_PASSING ({A}) for signed/value actions; DEFAULT_PASSING is now {A,B}
(C is reserved/unassigned, so this only drops an accept for a grade never emitted).
- onchain read: surface methodologyVersion and a derived egressVerified (from the
on-chain C-02 verdict), so a payment gate can distinguish an egress-verified local
A from a remote/no-sandbox B whose network behavior was never observed.
- action.yml: discovery is now opt-in (default off) so a public-repo gate doesn't run
PR-controlled targets by default; pin the run version; document SHA-pinning the
action, pull_request (not pull_request_target), and scoping bearer/api-url.
- ci command: warn when auto-discovery is on (targets are repo-controllable and
grading a server runs its code).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011FW3vDMau8UYnNWCanfT4k
…the trust bar
Align the polygraph skill docs and the README action section with the hardened
behavior:
- npx is a lookup that runs our CLI, not a "no-install" check; pin the version.
- SHA-pin the GitHub Action, trigger on pull_request not pull_request_target, and
prefer an explicit allowlist over auto-discovery (now off by default).
- bearer is sent as an Authorization header to the target: trusted pinned remote
only, scoped and short-lived.
- raise the bar for signed/value actions to a local A (PAYMENT_PASSING /
requireEgressVerified); a remote B never had its egress observed.
- a self-minted grade is forgeable: also check attester + methodology version, or
re-run the harness, before routing value.
- POLYGRAPH_API_URL enforces https; the residual risk is endpoint trust.
- a skill grade is a static scan, not equivalent to a behavioral server grade;
install-time code or transaction instructions need manual review.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011FW3vDMau8UYnNWCanfT4k
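The https requirement on POLYGRAPH_API_URL mentioned above could look roughly like the following. This is a minimal sketch under assumptions: requireHttpsUrl is a hypothetical helper name, and the CLI's actual check may be structured differently.

```typescript
// Hypothetical helper, not the CLI's real code: reject any API URL that is
// malformed or does not use the https scheme.
function requireHttpsUrl(raw: string): URL {
  const url = new URL(raw); // throws a TypeError on malformed input
  if (url.protocol !== "https:") {
    throw new Error(`POLYGRAPH_API_URL must use https, got "${url.protocol}"`);
  }
  return url;
}

// Usage: validate the value before any request (or bearer token) is sent.
const apiUrl = requireHttpsUrl("https://example.com/api");
console.log(apiUrl.hostname);
```

A scheme check only rules out plaintext transport. As the commit message notes, whether the endpoint itself is trustworthy remains a separate question.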
RubenSousaDinis added a commit that referenced this pull request on Jun 29, 2026
Ships the gate/action hardening from #70: opt-in GateOptions (allowedAttesters, acceptedMethodologyVersions, requireEgressVerified), PAYMENT_PASSING, on-chain methodologyVersion + egressVerified surfaced for the read path, discovery off-by-default in the action, and the ci discovery warning. Backward-compatible minor. Claude-Session: https://claude.ai/code/session_011FW3vDMau8UYnNWCanfT4k Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tightens the trust surface a behavioral grade is used through, prompted by a security review of the polygraph skill. The default decision for read-only / low-value callers is unchanged; the new rigor is opt-in.

Agent gate (packages/agent)
- Add opt-in GateOptions, all default-off: allowedAttesters, acceptedMethodologyVersions, requireEgressVerified.
- Export PAYMENT_PASSING ({A}) as the documented bar for signed/value actions. A remote server caps at B because its egress was never observed, so requiring a local A excludes egress-unverified grades.
- DEFAULT_PASSING is now {A,B} (C is reserved/unassigned, so this only drops an accept for a grade that is never emitted).

Onchain read (packages/onchain)
- Surface methodologyVersion and a derived egressVerified (decoded from the on-chain C-02 verdict), so a payment gate can distinguish an egress-verified local A from a remote/no-sandbox B.

CI action + command
- action.yml: auto-discovery now defaults off so a public-repo gate doesn't run PR-controlled targets by default; pin the run version; document SHA-pinning the action, pull_request (not pull_request_target), and scoping bearer/api-url.
- ci command: warns when discovery is on (discovered targets are repo-controllable and grading a server runs its code).

Docs
- polygraph skill docs reframed: npx is a lookup that runs our CLI (pin the version), SHA-pin the action, bearer is sent to the target, raise the bar to a local A for value, a self-minted grade is forgeable (check attester/methodology or re-run), and a skill grade is a static scan, not equivalent to a behavioral server grade.

Backward-compatible (new optional surface only). No methodology/schema semantics change, so methodologyVersion is not bumped. Full suite green (522 passed). Ships in 0.20.0.
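
To illustrate how the opt-in checks described above compose, here is a minimal self-contained sketch. It is not the packages/agent API: the GradeRecord shape, the passesGate function, the string type chosen for methodologyVersion, and the attester values are all assumptions made for illustration. Only the option names, PAYMENT_PASSING ({A}) and DEFAULT_PASSING ({A,B}) come from this PR.

```typescript
// Illustrative sketch only; types and passesGate are hypothetical.
type Grade = string; // e.g. "A" or "B"; the full grade scale is not specified here

interface GradeRecord {
  grade: Grade;
  attester: string;
  methodologyVersion: string; // assumed to be a string for this sketch
  egressVerified: boolean;
}

interface GateOptions {
  allowedAttesters?: readonly string[];
  acceptedMethodologyVersions?: readonly string[];
  requireEgressVerified?: boolean;
}

const DEFAULT_PASSING: ReadonlySet<Grade> = new Set(["A", "B"]);
const PAYMENT_PASSING: ReadonlySet<Grade> = new Set(["A"]);

// Every option is default-off: with no options, only the grade set is checked.
function passesGate(
  record: GradeRecord,
  passing: ReadonlySet<Grade> = DEFAULT_PASSING,
  options: GateOptions = {},
): boolean {
  if (!passing.has(record.grade)) return false;
  if (options.allowedAttesters && !options.allowedAttesters.includes(record.attester)) {
    return false;
  }
  if (
    options.acceptedMethodologyVersions &&
    !options.acceptedMethodologyVersions.includes(record.methodologyVersion)
  ) {
    return false;
  }
  if (options.requireEgressVerified && !record.egressVerified) return false;
  return true;
}

// Made-up records: a remote B whose egress was never observed, a local A,
// and a self-minted A from an attester that is not on the allowlist.
const remoteB: GradeRecord = { grade: "B", attester: "attester-1", methodologyVersion: "1", egressVerified: false };
const localA: GradeRecord = { grade: "A", attester: "attester-1", methodologyVersion: "1", egressVerified: true };
const selfMintedA: GradeRecord = { grade: "A", attester: "unknown", methodologyVersion: "1", egressVerified: true };

const paymentOptions: GateOptions = {
  allowedAttesters: ["attester-1"],
  acceptedMethodologyVersions: ["1"],
  requireEgressVerified: true,
};

console.log(passesGate(remoteB)); // read-only caller, default bar
console.log(passesGate(remoteB, PAYMENT_PASSING, paymentOptions));
console.log(passesGate(localA, PAYMENT_PASSING, paymentOptions));
console.log(passesGate(selfMintedA, PAYMENT_PASSING, paymentOptions));
```

In this sketch a read-only caller that passes no options keeps the base decision, while a payment caller opts into the stricter grade set and the attester, methodology and egress checks together.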