deep-research

Protocol-first, gated, multi-agent literature investigation. Every claim a synthesis ships is backed by a row in research_evidence with a DOI and a verbatim quote span. If a claim can't produce its quote, the claim gets cut.

That is the entire point.

Why this exists

LLM-generated literature reviews fail in a specific way: they read beautifully and contain claims that no actual paper makes. The drift is usually paraphrase, occasionally fabrication, and almost never caught by the author of the prompt.

This repo is a workflow that makes the failure mode impossible by construction:

Protocol pre-registration — search strategy, inclusion criteria, effect measure, and analysis plan are written and approved BEFORE any search runs. Drift after this point is logged, not absorbed.
Three concurrent search-oriented roles in Pass-1 — a Scout for coverage, a Skeptic actively hunting refutation, a Methodologist grading design. The Synthesizer doesn't see the corpus yet.
A human spend gate between cheap Pass-1 (abstracts) and expensive Pass-2 (full text). The corpus the human approves is the corpus the Synthesizer gets.
A Synthesizer with hard rules: every claim cites a row, every cited row has a quote_span, every numeric value matches verbatim. If the corpus doesn't support a claim, the doc says so explicitly.
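The Synthesizer's hard rules can be checked mechanically before a claim ships. The following is a minimal sketch, not the repo's implementation: `EvidenceRow`, `Claim`, and `check_claim` are hypothetical names, the row shape is an assumption (only `quote_span` and the DOI requirement come from this README), and "matches verbatim" is approximated by requiring every number in the claim to appear as a token in the quote.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvidenceRow:
    # Hypothetical shape; the real columns are defined in schema/schema.sql.
    row_id: int
    doi: str
    quote_span: str


@dataclass(frozen=True)
class Claim:
    # Hypothetical shape for one sentence the Synthesizer wants to ship.
    text: str
    evidence_row_id: Optional[int]


NUMBER = re.compile(r"\d+(?:\.\d+)?")


def check_claim(claim, rows_by_id):
    """Return a list of rule violations; an empty list means the claim may ship."""
    row = rows_by_id.get(claim.evidence_row_id)
    if row is None:
        return ["claim cites no evidence row"]
    problems = []
    if not row.doi:
        problems.append("cited row has no DOI")
    if not row.quote_span.strip():
        problems.append("cited row has no quote_span")
    # Every number in the claim must appear verbatim in the quote.
    quoted = set(NUMBER.findall(row.quote_span))
    for number in NUMBER.findall(claim.text):
        if number not in quoted:
            problems.append(f"number {number} not found verbatim in quote")
    return problems
```

A claim that rounds 0.91 to 0.9 fails this check even though it reads as a harmless paraphrase, which is the kind of drift the rules are meant to catch.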

The output is a structured document with native heading hierarchy, effect-size table, forest plot, blockquoted citations, and a full reference list. It is meant to be defensible under adversarial re-read.

What ships in v0.2.0

Path	What's in it
`SKILL.md`	The full protocol, agent-runtime-agnostic.
`schema/schema.sql`	PostgreSQL schema (three append-only tables).
`schema/schema_sqlite.sql`	SQLite equivalent for local dev.
`agents/scout.md`	Pass-1 broad-recall role prompt.
`agents/skeptic.md`	Pass-1 refutation-hunting role prompt.
`agents/methodologist.md`	Pass-1 design-grading role prompt.
`agents/synthesizer.md`	Phase 4 no-fabrication role prompt.
`agents/critic.md`	Continuation mode `critique` role prompt.
`lib/scholar.py`	stdlib-urllib adapter over OpenAlex, Semantic Scholar, PubMed, arXiv, Europe PMC, Crossref, and Unpaywall.
`lib/synthesis_doc_builder.py`	`python-docx` + `matplotlib` helper that renders the synthesis doc (forest plot, PRISMA flow, stance heat-table) with a pluggable upload callback.
`manifests/deep-research.v0.4.json`	install-manifest-spec v0.4 declaration.
`examples/cholesterol_primary_prevention/`	A real run, end to end.
`tests/`	stdlib `unittest` smoke tests. Network-free.
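One way to make a table append-only in SQLite is a pair of triggers that reject `UPDATE` and `DELETE`. The sketch below illustrates that idea only. The table is a trimmed-down, hypothetical stand-in for `research_evidence`, and whether the shipped `schema/schema_sqlite.sql` enforces append-only this way is an assumption, so read the schema files for the real definition.

```python
import sqlite3

# Hypothetical, trimmed-down table; the shipped schema defines the real columns.
DDL = """
CREATE TABLE research_evidence (
    id         INTEGER PRIMARY KEY,
    doi        TEXT NOT NULL,
    quote_span TEXT NOT NULL
);
CREATE TRIGGER research_evidence_no_update
BEFORE UPDATE ON research_evidence
BEGIN
    SELECT RAISE(ABORT, 'research_evidence is append-only');
END;
CREATE TRIGGER research_evidence_no_delete
BEFORE DELETE ON research_evidence
BEGIN
    SELECT RAISE(ABORT, 'research_evidence is append-only');
END;
"""


def open_demo_db():
    """Create an in-memory database with the append-only demo table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn
```

With this in place, inserts succeed while any update or delete raises a `sqlite3.DatabaseError`, so corrections have to be written as new rows.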

What's new in v0.2.0

`lib/scholar.py`: six free academic APIs behind one normalized hit schema. stdlib only. Configure the polite-pool contact email at install time via `scholar.configure(contact_email=...)` or the `SCHOLAR_CONTACT_EMAIL` env var. Embedding-based dedup in `multi_search` is pluggable: wire your runtime's embeddings provider via `scholar.set_embedding_deduper(fn)`, or leave it unregistered and fall back to DOI / title-hash dedup.
`lib/synthesis_doc_builder.py`: decoupled upload via dependency injection. Pass an `Uploader` callable (`(local_path, name, mime_type) -> dict`) and the helper hands it the local .docx; with no uploader there is no upload and you keep the file. `matplotlib` and `python-docx` are soft deps gated behind the `[viz]` extra.
`pyproject.toml`: installable package. Core is stdlib-only.
Tests: `tests/test_scholar_smoke.py` and `tests/test_synthesis_doc_builder_smoke.py`. Run with `python -m unittest discover tests`.
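To make the DOI / title-hash fallback concrete, here is a rough sketch of how such a dedup could work. It is not the code in `lib/scholar.py`: `dedup_key` and `dedup_hits` are hypothetical names, and the hit is assumed to be a dict with optional `doi` and `title` keys.

```python
import hashlib
import re


def dedup_key(hit):
    """Prefer a normalized DOI; fall back to a hash of the normalized title."""
    doi = (hit.get("doi") or "").strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    if doi:
        return "doi:" + doi
    # No DOI: collapse case and punctuation so trivial title variants collide.
    title = re.sub(r"[^a-z0-9]+", " ", (hit.get("title") or "").lower()).strip()
    return "title:" + hashlib.sha256(title.encode("utf-8")).hexdigest()


def dedup_hits(hits):
    """Keep the first hit seen for each key, preserving input order."""
    seen = set()
    unique = []
    for hit in hits:
        key = dedup_key(hit)
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique
```

A known limit of this approach is that a record with a DOI and a copy of the same paper without one get different keys, which is one reason an embedding-based deduper can be worth registering.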

A working reference of both modules also lives upstream in the Yep agent codebase.

Agent-runtime compatibility

The protocol is runtime-agnostic. The reference implementation uses the Claude Agent SDK, but the workflow only requires:

Capability	What's needed
Subagent fan-out	Spawn 3 sibling agents in parallel with isolated contexts.
Persistent KV	Read/write the three `research_*` tables. PostgreSQL or SQLite.
Scholarly HTTP	Reach OpenAlex / Semantic Scholar / PubMed / arXiv / Crossref / Unpaywall. PDF fetch + extract-to-text.
Human-in-loop	Two checkpoints where execution blocks until the user approves.
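As a rough illustration of the fan-out and human-in-loop capabilities, the sketch below runs three roles in parallel and then blocks on an approval prompt. It assumes a caller-supplied `run_role(role, protocol)` function that stands in for whatever your runtime uses to spawn a subagent; `run_pass_one` and `gate` are hypothetical names, and threads here only model the parallelism, not real context isolation. The gate is simplified to approve or stop, without the revise option.

```python
from concurrent.futures import ThreadPoolExecutor

ROLES = ("scout", "skeptic", "methodologist")


def run_pass_one(protocol, run_role):
    """Run the three Pass-1 roles in parallel; each gets its own shallow copy of the protocol."""
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        futures = {role: pool.submit(run_role, role, dict(protocol)) for role in ROLES}
        # result() re-raises any exception from a role, so failures are not silent.
        return {role: future.result() for role, future in futures.items()}


def gate(summary, ask=input):
    """Block until a human answers; anything but an explicit 'approve' stops the run."""
    answer = ask(f"{summary}\nType 'approve' to continue: ")
    return answer.strip().lower() == "approve"
```

Passing `ask` as a parameter keeps the gate testable without a terminal. In production it would be replaced by whatever approval channel your runtime exposes.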

Pipe the agents/*.md prompts through whatever orchestration layer your agent runtime provides.

Quickstart

Install (the [viz] extra pulls python-docx + matplotlib + numpy for the synthesis doc builder; omit it if you only need the protocol + scholar adapter):
```
pip install "deep-research[viz]"
# or, from a clone:
pip install -e ".[viz]"
```

Pick your database:

```
# PostgreSQL
psql "$DATABASE_URL" < schema/schema.sql

# or SQLite (local dev)
sqlite3 deep_research.db < schema/schema_sqlite.sql
```

Read SKILL.md end to end. The protocol is short. The gates are not optional.
Pick a question. Draft the protocol JSON per the schema in SKILL.md. Save it. Walk through Gate 1 with a human.
On approval, fan out the three Pass-1 subagents with the prompts in agents/. Each writes its research_searches audit rows and research_evidence candidate rows. Wire lib.scholar.scholar(...) in for the search calls.
Roll up the corpus. Walk through Gate 2. Approve, revise, or abort.
On approval, run Pass-2 retrieval. Stage extracted text.
Run the Synthesizer. Read the hard-rules block at the top of agents/synthesizer.md first. The Synthesizer's job is to NOT make anything up; that job is harder than it sounds. Use lib.synthesis_doc_builder.build_synthesis_doc(inputs, uploader=...) for the artifact — pass your own uploader for Drive/S3 wiring, or omit it to keep the local .docx.
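For the uploader in the last step, any callable with the `(local_path, name, mime_type) -> dict` shape will do. The sketch below is a stand-in that "uploads" by copying the file into a local directory. `make_local_uploader` is a hypothetical helper, and the keys of the returned dict are illustrative, since this README only states that the callable returns a dict.

```python
import shutil
from pathlib import Path


def make_local_uploader(dest_dir):
    """Build an uploader that 'uploads' by copying the file into dest_dir."""
    dest = Path(dest_dir)

    def upload(local_path, name, mime_type):
        dest.mkdir(parents=True, exist_ok=True)
        target = dest / name
        shutil.copyfile(local_path, target)
        # Illustrative keys only; a Drive or S3 uploader would return its own metadata.
        return {"name": name, "mime_type": mime_type, "location": str(target)}

    return upload
```

It would then be passed in the call shape shown above, for example `build_synthesis_doc(inputs, uploader=make_local_uploader("out"))`. A Drive or S3 uploader would keep the same signature and replace the copy with the provider's upload call.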

Worked example: cholesterol primary prevention

See examples/cholesterol_primary_prevention/.

This was a real run of the protocol on the question:

Does pharmacological LDL-lowering reduce all-cause mortality in strict primary prevention (no prior cardiovascular events, no known cardiovascular disease)?

The example ships the approved protocol, the run log (subagent counts per phase, retraction sweep result, top-graded candidates), and the synthesis document with all citations intact.

Continuation modes

Once a project is complete, the user can revisit it under one of four modes without spinning up a fresh project:

refresh — pull literature since the last cutoff.
deepen — same question + corpus; retry paywalled rows.
rescope — same corpus, new sub-question.
critique — adversarial re-read of the synthesis against its quotes.

Lineage is preserved. The prior synthesis stays as immutable record. See the "Continuing a prior project" section in SKILL.md.
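One simple way to model preserved lineage is an immutable project record that points at its parent. The sketch below is an assumption about how that could look, not the structure described in SKILL.md: `Project`, `Mode`, and `continue_project` are hypothetical names, and only the four mode strings come from this README.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    REFRESH = "refresh"
    DEEPEN = "deepen"
    RESCOPE = "rescope"
    CRITIQUE = "critique"


@dataclass(frozen=True)
class Project:
    # frozen=True makes instances immutable, mirroring the immutable prior synthesis.
    project_id: str
    parent_id: Optional[str] = None
    mode: Optional[Mode] = None


def continue_project(parent, mode, new_id):
    """Create a child project that points at its parent; the parent is never mutated."""
    # Mode(mode) raises ValueError for anything outside the four known modes.
    return Project(project_id=new_id, parent_id=parent.project_id, mode=Mode(mode))
```

Walking `parent_id` links back from any continuation recovers the full chain of prior projects.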

License

Apache-2.0. See LICENSE.

Author

Maintained by Dimitri T (@drknowhow). This protocol was extracted from the Yep agent and generalized for reuse by other agent owners.
