Plumbref helps coding agents check risky repo claims before you rely on them.
It gives coding agents a report gate: before you repeat or act on an answer, the agent breaks it into claims, checks local repository evidence, and returns what is supported by checked source lines, what needs qualification, what not to rely on, and the source lines behind that call. When the answer is risky, qualified, or explicitly requested, Plumbref also writes an inspectable report.
Ask natural reliance-check questions about your repository:
The agent said this only affects onboarding. Check that.
Can I tell support that SSO only depends on Okta?
The agent says this field is safe to rename. What is actually supported?
The goal is to replace vague "are you sure?" loops with source-backed reliance decisions your team can inspect.
Plumbref is local-first. It does not call a model API, require an API key, upload your repository, use a database, depend on a vector store, or run a hosted service.
Plumbref does not independently determine truth. The coding agent still extracts claims, chooses searches, reads evidence, and records judgments. Plumbref structures that workflow, stores the evidence trail, applies template and strict-gate checks, and downgrades answers when required recorded checks are missing.
Treat supported as "supported by the cited local evidence and recorded
checks," not as a guarantee about every deployment, runtime state, or external
system.
Prompts and skills can ask an agent to be careful, but they often produce more prose when you ask "are you sure?" Plumbref gates coding-agent answers against recorded source evidence and turns them into safe-to-rely-on statements with explicit limits.
The agent still extracts claims and reasons over evidence. Plumbref supplies the source-grounded workflow, budgets, redaction, status semantics, and report artifacts.
- Install Plumbref once.
- Add it to your MCP-capable coding agent for a repository.
- Ask repo questions naturally in chat.
- The agent uses Plumbref tools to check claims against source evidence.
- You get a concise answer with safe-to-rely-on claims, qualified claims, do-not-rely-on claims, evidence locations, unchecked areas, and safer wording.
Example inline answer shape:
Safe to rely on:
- The checked function rewrites the report label from "items" to "records".
Say with qualification:
- too_broad: The change only affects formatting.
Safer wording:
- The checked function changes report-label wording, but downstream exports were
not fully traced.
Evidence:
- `src/reports/labels.ts:41-58`
Unchecked:
- Conflicting-code-path search not recorded: generated client exports.
You should not need to manually run verification commands during normal use. The CLI exists mainly for setup, smoke tests, debugging, and report rendering.
- repository: the local codebase being checked
- question: the user's natural-language repo question
- claim: one atomic statement the agent wants to make
- evidence: bounded source snippets read from the repo
- judgment:
supported,too_broad,uncertain,contradicted,not_found, ornot_verifiable - template: a verification playbook for a class of engineering question
- report: the source-backed trail the user can inspect
- MCP server for agent-driven verification
- CLI for setup, doctor checks, templates, smoke tests, and report rendering
- built-in templates for flow explanation, field migration, change impact, downstream consumers, and external integrations
- Markdown and JSON reports
- broad-claim detection for phrases like
only,safe to,no downstream,always,never,all,every, andguarantee - local repository search through ripgrep
- source snippet bounds, redaction patterns, budgets, and cache metrics
Install the latest published package:
pipx install plumbrefOr install directly from GitHub:
pipx install git+https://github.com/ftaboadac/plumbref.gitFor local development:
python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -e ".[dev]"Plumbref uses rg/ripgrep for repository search:
rg --versionFrom the repository you want an agent to verify:
cd /path/to/repo
plumbref init
plumbref doctorplumbref init prints repo-specific MCP configuration and recommended agent
instructions. plumbref doctor checks local readiness: repo root, ripgrep,
config loading, templates, report-path writability, and optional MCP startup.
MCP clients that support stdio servers can launch Plumbref with:
plumbref mcp --repo-root /path/to/repoExample MCP config:
{
"mcpServers": {
"plumbref": {
"command": "plumbref",
"args": ["mcp", "--repo-root", "/path/to/repo"]
}
}
}For agent-specific usage guidance and conversational examples, see Agent Usage Guide.
For the current first-run proof path, see
examples/first-run. It includes captured Codex CLI
transcripts, real Plumbref MCP tool calls, inline answers, generated report
paths, and a careful-prompt-plus-rg baseline.
Templates are versioned verification playbooks. They define evidence categories, required search passes, contradiction searches, budgets, report sections, and unchecked-area prompts for a class of engineering question.
Built-in templates:
generic_verification: fallback protocol for unsupported or unusual casesexplain_flow: explain how a workflow or integration worksfield_migration: check impact of moving or changing a fieldchange_impact: evaluate likely downstream effects of a diff or changed filedownstream_consumers: find direct and likely consumers of a contract or APIexternal_integration: inspect vendor syncs, webhooks, pushes, pulls, and clients
List templates:
plumbref templates --repo-root /path/to/repoInspect one template:
plumbref templates --repo-root /path/to/repo --template-id field_migrationCustom templates can live in .plumbref/templates/, in
~/.config/plumbref/templates/, or in shared template-pack directories listed
in template_paths. Later sources can override earlier templates with the same
ID, so teams can adapt a built-in playbook without forking Plumbref.
Reports are written under:
.cache/plumbref/reports/YYYY-MM-DD/
Reports are local generated receipts. They are intentionally stored under
.cache so they stay out of normal source control and can be deleted without
changing the project.
By default, report_policy = "on_demand" writes report files only when the
user asks for a report or when the result has risk, uncertainty, qualifications,
change impact, or broad/absolute language. Low-risk supported answers can stay
inline in chat without creating report files. Set report_policy = "manual" to
write reports only when explicitly forced, or report_policy = "always" to
write one for every rendered Plumbref result.
Written reports lead with a verification outcome: whether the claim is safe to
rely on from checked evidence, needs qualification, or should not be relied on
as written. They also include broad-claim findings, safer wording, cited
evidence, search traces, limits, and cache metrics.
.cache/plumbref/reports/index.json tracks reports that were actually written.
MCP render responses include inline_answer: the chat-shaped answer agents
should return by default. It summarizes what is safe to rely on, what needs
qualification, what not to rely on, safer wording, evidence locations,
unchecked areas, and verification counts. Markdown and JSON reports are the
inspectable receipt behind that answer, not the primary user-facing surface.
Agents can compare two JSON reports for the same question through the MCP
plumbref_diff_reports tool. The tool returns structured claim changes plus a
Markdown diff, so the agent can summarize what changed inline in chat.
For local debugging, the same diff renderer is available through the CLI:
plumbref diff-reports old-report.json new-report.json --output report-diff.mdThe diff shows claim status changes, evidence-only changes, added claims, removed claims, and unchanged claims. New reports include stable claim IDs for this workflow; older reports can be compared by claim order as a fallback.
Searches are cached by query/options and repository state. Evidence snippets are cached by file path, line range, file hash, and privacy settings. Cache hits and in-session evidence reuse can return compact evidence references without repasting source text.
Current demo artifacts:
Historical dogfood reports live in examples/reports. They are useful for seeing report shape and earlier product exploration, but some line references and quoted README wording may be stale. Do not treat them as current proof until they are regenerated.
In real MCP runs against this repository:
- 3 sessions
- 12 claims checked
- 29 searches run
- 28 bounded evidence snippets read
- 12/12 contradiction passes on judged claims
- 3 unsupported or over-broad claims caught
Plumbref also reduced source text compared with opening every matched file, but it does not claim to always use fewer tokens than a careful expert agent. See Real Workflow Test Results for the measurement details and limitations.
The current public benchmark shows the same pattern more concretely: careful
prompting plus rg can answer the controlled fixtures correctly, while
Plumbref adds structured claim state, contradiction-pass records, durable
Markdown/JSON reports, and rerunnable evidence trails. Do not use the current
fixtures to claim Plumbref is more correct than a careful agent.
Config discovery order:
- explicit
--config <repo-root>/.plumbref.local.toml<repo-root>/.plumbref.toml~/.config/plumbref/config.toml
Minimal example:
ignored_paths = [".git", ".venv", "node_modules", ".cache"]
privacy_patterns = [
"(?i)(api[_-]?key|secret|token|password)\\s*[:=]\\s*['\\\"][^'\\\"]+['\\\"]",
]
default_budget_mode = "normal"
default_output_modes = ["engineer", "json"]
default_template_id = "generic_verification"
report_policy = "on_demand"
template_paths = ["plumbref-template-pack"]redaction_patterns is accepted as an alias for privacy_patterns.
template_paths entries are resolved relative to the repository root unless
they are absolute paths.
Run tests:
python3 -m pytestRun lint:
ruff check .See ROADMAP.md for the implementation roadmap. For public-demo readiness and honest positioning, see Launch Checklist.
- Plumbref does not extract claims by itself.
- Plumbref does not decide truth with an LLM.
- Plumbref search is lexical and repo-local.
supportedmeans supported by the cited source evidence, not globally true for every deployment or runtime state.- Plumbref cannot verify claims that require production data, private services, or external systems unless the relevant evidence exists in the local repository.
- no model API dependency
- no hosted service
- no database
- no vector store
- no UI
- no automatic code review replacement
- no production-data inspection
