Add polygraph skill: behavioral trust grades for MCP servers by RubenSousaDinis · Pull Request #477 · BankrBot/skills

RubenSousaDinis · 2026-06-16T10:55:21Z

polygraph — behavioral trust grades (A–F) for MCP servers and Agent Skills

Adds a polygraph/ skill (polygraph.so). Polygraph connects to an MCP
server the way an agent would, fingerprints its exact tool surface, and runs four behavioral
probes — C-01 tool-output injection, C-02 permission/egress overreach, C-03
sensitive-data leak, C-04 adversarial-input handling — then grades it A–F and can publish a
reproducible grade as an onchain EAS attestation on Base. The harness is open source, so anyone
can re-run it and disprove a bad grade.

CTA for builders

Check any MCP server before your agent uses it: npx polygraphso check <server>.
Get your own server graded at polygraph.so.

What the skill covers

- Check a grade: `npx polygraphso check npm/@modelcontextprotocol/server-filesystem`
- Verify before you trust: recompute the live tool-surface fingerprint and require it to match the attestation before letting Bankr execute (the runtime gate)
- Gate CI on grades: the GitHub Action `polygraphso/litmus@v1` (or `npx @polygraphso/litmus ci`) fails a build when an MCP server or a skill it ships grades D/F; see `polygraph/references/ci-gate.md`
- Get your project graded: run the open litmus harness on your own server or skill
- Why a server got grade X: the A–F decision logic and how to read the report
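The verify-before-trust item can be sketched in a few lines. This is only an illustration under stated assumptions: the PR does not say how Polygraph computes its tool-surface fingerprint, so the canonical-JSON SHA-256 used here, the tool descriptor shape, and the function names `fingerprint_tool_surface` and `surface_matches` are all hypothetical stand-ins rather than the harness's real format.

```python
import hashlib
import hmac
import json


def fingerprint_tool_surface(tools):
    """Hash a list of tool descriptors into a stable hex digest.

    Assumption: the real Polygraph fingerprint format is not given in this PR;
    canonical JSON plus SHA-256 is only an illustrative stand-in.
    """
    # Sort by tool name so the order a server lists its tools in does not matter.
    ordered = sorted(tools, key=lambda tool: tool["name"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def surface_matches(live_tools, attested_fingerprint):
    """Return True only if the live surface hashes to the attested value."""
    live = fingerprint_tool_surface(live_tools)
    return hmac.compare_digest(live, attested_fingerprint)


# Hypothetical surface as it looked when the grade was issued.
graded = [
    {"name": "read_file", "inputSchema": {"type": "object"}},
    {"name": "list_dir", "inputSchema": {"type": "object"}},
]
attested = fingerprint_tool_surface(graded)

# Same tools in a different order: the fingerprint is unchanged.
reordered = list(reversed(graded))
print(surface_matches(reordered, attested))  # True

# A tool added after grading changes the fingerprint, so the gate should refuse.
drifted = graded + [{"name": "send_funds", "inputSchema": {"type": "object"}}]
print(surface_matches(drifted, attested))  # False
```

The point of the pattern is that a grade only describes the surface that was probed; any drift in the live tool list should invalidate it until the server is graded again.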

Conforms to the contribution guide

- `polygraph/catalog.json`: slug equals the folder, `install.type: bankr`, so it appears in the Bankr Discover catalog
- `polygraph/logo.svg` (square mark), `polygraph/SKILL.md` with name + description frontmatter, supporting docs under `polygraph/references/`
- README row added; rebased onto current main

Verified against the published packages

- `polygraphso`: the lookup CLI (`check` / `list`)
- `@polygraphso/litmus`: the open harness, the `ci` gate, and the `polygraphso/litmus@v1` GitHub Action

Happy to adjust naming/scope to match your conventions.

Polygraph grades MCP servers A–F by connecting like an agent, fingerprinting the exact tool surface, and running three behavioral probes (C-01 tool-output injection, C-02 permission/egress overreach, C-03 sensitive-data leak), then publishing a reproducible grade as an onchain EAS attestation on Base. The skill covers: checking a grade (`npx polygraphso check <server>`), running the open litmus harness locally to grade your own server, why a server got a given grade, and the verify-before-trust pattern for Bankr agents (recompute the live tool-surface fingerprint and require it to match the attestation before executing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Behavioral grades are now live via `polygraphso check` / `list` (A–F across graded servers), so replace the "rolling out / not yet available" framing and the stale example outputs with the real current CLI output, including the shipped grades.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

"Triggers on:" inside the plain-scalar description made YAML read it as a nested mapping ("mapping values are not allowed in this context"). Reword to "Triggers on mentions of" (no colon), matching the zerion skill convention.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
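The YAML failure described in this commit comes from a general rule: inside an unquoted (plain) scalar in a block mapping, a colon followed by a space is read as a key/value separator, and a space followed by `#` starts a comment. A minimal sketch of a pre-commit style check follows; the function name `plain_scalar_problems` and the sample description strings are hypothetical, and the check covers only these two cases rather than the full YAML grammar.

```python
def plain_scalar_problems(value):
    """List reasons a value is unsafe as an unquoted YAML scalar in a block mapping.

    Only two common pitfalls are checked; this is not a full YAML validator.
    """
    problems = []
    if ": " in value or value.endswith(":"):
        problems.append("contains ': ' or a trailing ':' (read as a mapping separator)")
    if " #" in value:
        problems.append("contains ' #' (read as the start of a comment)")
    return problems


# Hypothetical, shortened frontmatter descriptions.
before = "Behavioral trust grades for MCP servers. Triggers on: polygraph, trust grade"
after = "Behavioral trust grades for MCP servers. Triggers on mentions of polygraph, trust grade"

print(plain_scalar_problems(before))  # one problem reported (the colon)
print(plain_scalar_problems(after))   # []
```

Quoting the whole description would also avoid the error; the commit instead rewords the text so it stays a plain scalar, matching the convention it cites.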

Address review feedback:
- Scope to MCP servers (drop "AI tools"; the whole harness is MCP-specific).
- Make the remote/Docker-less B-cap explicit and frame it as a property of the measurement, not a knock (a remote B is not "worse than" a local A).
- Stop hardcoding named third-party grades; keep one live first-party A as proof and treat the live set / attestation as the point-in-time source of truth.
- Present the live scale as A/B/D/F; note C/E are not assigned (C reserved).
- Elevate runtime verify-before-trust above the get-graded CTA and surface the evasion caveat at the trust decision.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Verified the skill against the now-published package. Remove the `challenge` command and the `check <ref[@Version]>` form; neither exists in the published CLI. Confirmed present: commands `litmus` / `check` / `list`; flags `--json`, `--bearer`, `--header`, `--allow-state-changing`; env `POLYGRAPH_API_URL`, `LITMUS_BEARER`, `LITMUS_STDIO_ISOLATION`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add `references/ci-gate.md` (the `polygraphso/litmus@v1` Action that fails a build when an MCP server or an Agent Skill grades D/F) and a 'Gate your CI on grades' section + reference link in SKILL.md.

Co-Authored-By: Claude <noreply@anthropic.com>

…ory, the ci command

Bring the skill up to date with the published `@polygraphso/litmus`: four probe categories (adds C-04 adversarial-input handling), methodology version litmus-v9 in the illustrative outputs, and the `ci` command in the CLI reference.

Co-Authored-By: Claude <noreply@anthropic.com>

Adds the required catalog.json (slug=polygraph, install type bankr) so the skill appears in the Bankr Discover catalog, a square logo.svg, and a README table row. Rebased onto current main.

saltoriousSIG

A few things to address before merging!

The skill says “one command before your agent installs anything,” but that command is npx polygraphso ..., which installs/executes a third-party npm package. Same for @polygraphso/litmus. Please pin package versions/integrity and avoid presenting npx as a no-install trust check.
The GitHub Action is referenced as polygraphso/litmus@v1. For a security gate, this should be pinned to a commit SHA, not a mutable tag. Same concern for npx @polygraphso/litmus ci.
The CI gate auto-discovers MCP configs and may run/grade servers from PR-controlled files. That is risky on CI, especially with Docker/socket/network/secrets available. Please add strong warnings: don’t run this with secrets on untrusted PRs, don’t use pull_request_target, pin dependencies, and prefer explicit allowlisted targets over auto-discovery for public repos.
bearer / --bearer support can pass credentials into remote MCP checks. Please document that bearer tokens must not be provided on untrusted PRs or auto-discovered targets, and should be scoped/ephemeral.
The runtime gate defaults to accepting A/B, but remote MCP servers cap at B because egress is unverified. That means the default accepts servers where network exfiltration was not tested. Please make this explicit in the Bankr execution path and require a higher/manual review bar before routing signed actions/payments through remote B servers.
Attestation verification needs stricter trust rules. The docs note self-published grades are forgeable, but the examples still frame readAttestation + gateDecision as enough before “pay/execute.” Please require schema ID, chain ID, revocation status, methodology version, attester allowlist or reproducible rerun, and exact fingerprint match before any Bankr action.
POLYGRAPH_API_URL can override the lookup endpoint. That’s useful for dev but dangerous in agent/runtime docs. Please warn not to use untrusted lookup endpoints for execution decisions, or require host allowlisting.
The skill grading described in CI is static-only. Static scanning skill text can miss bundled scripts, install commands, remote URLs, and runtime behavior. Please avoid implying this is equivalent to behavioral MCP grading, and require manual/security review for skills with install-time code execution or transaction instructions.
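The stricter trust rules requested in the review (schema ID, chain ID, revocation status, methodology version, attester allowlist, exact fingerprint match, and a higher bar for remote B servers) can be sketched as a single gate function. Everything here is an assumption for illustration: the `Attestation` fields, the `gate` function, and the placeholder schema ID and attester address are hypothetical and are not the skill's actual `readAttestation` / `gateDecision` API. The chain ID assumes Base mainnet (8453), and `litmus-v9` is the methodology version named in the commits above. The review also allows a reproducible rerun in place of an attester allowlist; this sketch implements only the allowlist.

```python
from dataclasses import dataclass

# Hypothetical placeholders; substitute real values before any use.
EXPECTED_SCHEMA_ID = "0xSCHEMA_PLACEHOLDER"
EXPECTED_CHAIN_ID = 8453  # assumes Base mainnet
ACCEPTED_METHODOLOGIES = {"litmus-v9"}
TRUSTED_ATTESTERS = {"0xATTESTER_PLACEHOLDER"}


@dataclass(frozen=True)
class Attestation:
    schema_id: str
    chain_id: int
    revoked: bool
    methodology: str
    attester: str
    fingerprint: str
    grade: str
    remote: bool  # True when the server was graded remotely (egress unverified)


def gate(att, live_fingerprint, high_risk):
    """Return 'allow', 'manual_review', or 'deny' for one pending action.

    high_risk marks signed actions or payments, per the review comment.
    """
    checks = [
        att.schema_id == EXPECTED_SCHEMA_ID,
        att.chain_id == EXPECTED_CHAIN_ID,
        not att.revoked,
        att.methodology in ACCEPTED_METHODOLOGIES,
        att.attester in TRUSTED_ATTESTERS,
        att.fingerprint == live_fingerprint,
    ]
    if not all(checks):
        return "deny"
    if att.grade not in ("A", "B"):
        return "deny"
    # Remote servers cap at B because network egress was not tested, so a
    # remote grade is not enough on its own for signed actions or payments.
    if att.remote and high_risk:
        return "manual_review"
    return "allow"


example = Attestation(
    schema_id=EXPECTED_SCHEMA_ID,
    chain_id=EXPECTED_CHAIN_ID,
    revoked=False,
    methodology="litmus-v9",
    attester="0xATTESTER_PLACEHOLDER",
    fingerprint="abc123",
    grade="B",
    remote=True,
)
print(gate(example, "abc123", high_risk=False))  # allow
print(gate(example, "abc123", high_risk=True))   # manual_review
print(gate(example, "changed", high_risk=False)) # deny
```

Failing closed on any single mismatch keeps a forged or stale attestation from reaching the grade check at all, which is the ordering the review asks for before any Bankr action.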

RubenSousaDinis marked this pull request as ready for review June 16, 2026 13:49

RubenSousaDinis and others added 8 commits June 26, 2026 12:14

polygraph skill: add catalog.json + logo, list in README

f353528

RubenSousaDinis force-pushed the add-polygraph-skill branch from 6738fcb to f353528 Compare June 26, 2026 11:17

saltoriousSIG reviewed Jun 29, 2026

RubenSousaDinis wants to merge 8 commits into BankrBot:main from RubenSousaDinis:add-polygraph-skill
