prism42 is a Managed Agents harness on Claude Opus 4.7 that audits (a) numerical correctness in GPU inference kernels and (b) clinical reasoning on public benchmarks. This policy covers the security of the auditor and its infrastructure; it is not a coordinated-disclosure channel for kernel findings.
prism42 carries no embargoed reproduction material, no target-specific
naming, and no maintainer identifiers. Kernel findings that reach the
threshold for external disclosure route through private, off-repo channels
(see docs/kernel-research-posture.md). Clinical findings route through
Anthropic's model-feedback channel after physician review (see
docs/clinical-handling.md).
This policy applies to:
- Prism's own codebase: harness runners, agents, skills, schemas, CI workflows, tooling.
- Generated artifacts: case directories, verdicts, rubric-graded session outputs under results/.
- Third-party pins: third_party/simple-evals/ (vendored at a specific commit, Apache-2.0).
Out of scope (separate policies):
- Kernel-research findings: docs/kernel-research-posture.md.
- Clinical model-behavior observations: docs/clinical-handling.md.
- Anthropic Managed Agents platform itself: Anthropic's responsibility.
If you discover a security issue in Prism's harness code (not in a target being audited), please report it:
- Email: security@goatnote.app (with b@thegoatnote.com CC'd).
- GitHub private security advisory on this repo.
- Do not open a public issue for security findings; use the channels above.
We acknowledge reports within 5 business days and aim to remediate within 30 days, faster for critical issues. Good-faith security research is welcome.
Examples of security issues in scope:
- Authentication / credential handling (.env lifecycle, double-gate bypass).
- SDK containment bypass (anthropic imported outside do_commit()).
- Schema-validation bypass (malformed artifacts accepted by L1 check).
- CI / pre-commit hook bypass.
- Supply-chain issues in third_party/ pins.
- PII / PHI leakage from logs, caches, or published artifacts.
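
The SDK containment item in the list above lends itself to a static check. The following is a minimal sketch, not the project's actual checker: it assumes containment means that the anthropic module may only be imported inside a function named do_commit, and the helper name find_violations is hypothetical. It uses only Python's standard ast module.

```python
import ast

SDK_MODULE = "anthropic"    # module named in the policy
ALLOWED_FUNC = "do_commit"  # function named in the policy


def find_violations(source):
    """Return sorted line numbers of SDK imports located outside do_commit().

    Hypothetical helper: a sketch of one way to check containment statically.
    """
    tree = ast.parse(source)
    violations = []

    def visit(node, inside_allowed):
        # Entering do_commit() (sync or async) marks everything below it as allowed.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            inside_allowed = inside_allowed or node.name == ALLOWED_FUNC

        # Collect the module names this node imports, if it is an import at all.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module or ""]
        else:
            names = []

        if not inside_allowed and any(
            name == SDK_MODULE or name.startswith(SDK_MODULE + ".")
            for name in names
        ):
            violations.append(node.lineno)

        for child in ast.iter_child_nodes(node):
            visit(child, inside_allowed)

    visit(tree, False)
    return sorted(violations)
```

A check like this only sees import statements. Dynamic imports (for example through importlib) would need separate handling, so it should be read as one layer of containment rather than a complete guarantee.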
Existing safeguards:
- Double-gate on live LLM / compute calls: --commit flag + PRISM_*_COMMIT=1 env var. Default dry-run. AST-verified by scripts/check_sdk_containment.py.
- Schema validation on every case-dir artifact (L1 gate).
- Pipeline invariants check (L4): model pins, role↔filename correspondence, egress allowlist, no secret mount, manifest shape, schemas compile.
- Pre-commit hooks block target-specific naming and secrets.
- Physician-review gate on clinical findings: adjudicator's physician_review field; generators never pre-sign.
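
The double-gate in the list above can be sketched as follows. This is an illustration under stated assumptions, not the harness's real entry point: the environment variable name PRISM_EXAMPLE_COMMIT is a hypothetical instance of the PRISM_*_COMMIT pattern, and the function names live_calls_allowed and run are made up for the example.

```python
import argparse

# Hypothetical variable name; the policy only gives the pattern PRISM_*_COMMIT=1.
COMMIT_ENV_VAR = "PRISM_EXAMPLE_COMMIT"


def live_calls_allowed(argv, environ):
    """Return True only when both gates are open.

    Gate 1: the --commit flag is present on the command line.
    Gate 2: the environment variable is set to exactly "1".
    """
    # allow_abbrev=False so that a prefix such as --com does not open the gate.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--commit", action="store_true")
    args, _unknown = parser.parse_known_args(argv)
    return args.commit and environ.get(COMMIT_ENV_VAR) == "1"


def run(argv, environ):
    """Report which path would be taken; dry-run is the default."""
    if not live_calls_allowed(argv, environ):
        return "dry-run"
    # A real harness would make its live LLM / compute call here.
    return "commit"
```

With this sketch, run(["--commit"], {}) and run([], {"PRISM_EXAMPLE_COMMIT": "1"}) both stay in dry-run; only the flag and the variable together return "commit". Requiring two independent signals means a stray flag in a script or a leftover variable in a shell cannot trigger a live call on its own.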
Clinical findings are model-behavior observations, not vulnerabilities. They are routed privately: physician review → Anthropic model-feedback channel. They never go to a public issue tracker, a preprint before review, or social media. No PHI. Synthetic fixtures only.
prism42 does kernel research, not coordinated disclosure. If research surfaces a credible kernel correctness concern, it routes through a private channel maintained by the research lead, off this repo. prism42 intentionally excludes reproduction material, target-specific naming, and maintainer contact data from its public surface.
This policy draws on CERT/CC coordinated-disclosure norms and Redwood AI
Control protocol hygiene. See docs/runaway-ai-kb/ for the full literature
map.