Evidence-bound research pipeline
Dual-source retrieval · citation-native synthesis · critique loop · prompt audit
github.com/ArttuAn/multi-agent-research-system
This repository implements CiteGraph (the GitHub slug predates the name).
Illustrative mockup of the demo layout — run `streamlit run app.py` for the live UI.
CiteGraph is a narrow take on “agent research”: it is built for claims you can trace. The graph merges the open web (Tavily) and scholarly metadata (OpenAlex) into one numbered source index (`[Wn]` / `[Pn]`), forces the writer to cite inline, runs a structured critique (hallucination risk, issues, revision guidance), and can loop until approval or an iteration cap. Every run also produces a prompt-trace PDF and LangSmith traces for audit.
Default demo topic: “AI regulation in Europe 2026” (edit freely in the UI).
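For example, every claim in a synthesized draft must carry one of those tokens (illustrative sentence; `[W1]`/`[P2]` point into the run’s own source index):

> EU obligations for general-purpose AI models phase in between 2025 and 2027 [W1][P2].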
| Typical stack | CiteGraph |
|---|---|
| One RAG corpus or vague “search the web” | Dual corpus: news/pages and OpenAlex works, unified index |
| Free-form report | Citation-native draft tied to [Wn]/[Pn] tokens |
| Optional “self-critique” prose | Schema-locked critic (approved, risk, issues, guidance) + conditional revise edge |
| Black-box runs | Per-step prompt audit PDF, Streamlit step I/O, LangSmith spans per agent |
| Ad-hoc agent code | Repeatable agent folders with skills, hooks, guardrails, episodic memory (AGENTS.md) |
CiteGraph uses LangGraph for explicit graph state, conditional edges, and iteration, patterns closer to production agent orchestration than a single ReAct loop. CrewAI is a reasonable alternative for role-based crews; this repo standardizes on LangGraph for graph-native control flow.
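For orientation, here is a minimal sketch of that wiring pattern — stub nodes and assumed state fields only, not the repo’s actual `research_system/graph.py`:

```python
# Sketch of the LangGraph pattern described above — stub nodes, assumed
# state fields; the real graph lives in research_system/graph.py.
from typing import TypedDict

from langgraph.graph import END, StateGraph

class State(TypedDict, total=False):
    topic: str
    draft: str
    approved: bool
    iteration: int
    max_iterations: int

def stub(name: str):
    def node(state: State) -> dict:
        # A real node would invoke the matching agent runnable here.
        if name == "critique":
            return {"iteration": state.get("iteration", 0) + 1}
        return {}
    return node

def should_revise(state: State) -> str:
    # Conditional edge: loop back to synthesize until approval or the cap.
    done = (state.get("approved", False)
            or state.get("iteration", 0) >= state.get("max_iterations", 3))
    return "finalize" if done else "revise"

g = StateGraph(State)
for name in ("gather_web", "gather_papers", "prepare_sources",
             "synthesize", "critique", "finalize"):
    g.add_node(name, stub(name))
g.set_entry_point("gather_web")
g.add_edge("gather_web", "gather_papers")
g.add_edge("gather_papers", "prepare_sources")
g.add_edge("prepare_sources", "synthesize")
g.add_edge("synthesize", "critique")
g.add_conditional_edges("critique", should_revise,
                        {"revise": "synthesize", "finalize": "finalize"})
g.add_edge("finalize", END)
app = g.compile()
```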
Orchestration is LangGraph (`research_system/graph.py`). Each graph node delegates to a LangChain runnable in `research_system/agents/`, named for LangSmith (`run_name` + `tags`):
| LangGraph node | LangChain agent | Role |
|---|---|---|
| `gather_web` | `WebResearchAgent` | Tavily retrieval (`RunnableLambda`) |
| `gather_papers` | `PaperResearchAgent` | OpenAlex works search (no API key) |
| `prepare_sources` | `SourceBundlerAgent` | Builds shared `[Wn]` / `[Pm]` source index |
| `synthesize` | `SynthesisAgent` | `ChatPromptTemplate → ChatOpenAI → StrOutputParser` |
| `critique` | `CritiqueAgent` | Prompt → structured output (`CritiqueResult`) |
| `finalize` | `FinalizeAgent` | Assembles final markdown (no LLM) |
Import from code: `from research_system.agents import web_research_agent, synthesis_agent, ...`.
Per-agent layout (skills, memory, hooks, guardrails): see `AGENTS.md` and each `research_system/agents/*/AGENT.md`.
```mermaid
flowchart LR
    A[gather_web] --> B[gather_papers]
    B --> C[prepare_sources]
    C --> D[synthesize]
    D --> E[critique]
    E -->|revise| D
    E -->|finalize| F[finalize]
```
- `gather_web`: Tavily search API (advanced depth).
- `gather_papers`: OpenAlex works search API (no API key). Set `OPENALEX_MAILTO` in `.env` to your email for the polite pool (better rate limits). The client retries with backoff on HTTP 429.
- `prepare_sources`: Builds a numbered index (`[W1]…`, `[P1]…`) so the writer and critic share the same evidence bundle.
- `synthesize`: LLM emits markdown with mandatory inline `[Wn]`/`[Pm]` citations.
- `critique`: Structured output (`approved`, `hallucination_risk`, `issues`, `revision_guidance`; schema sketched below) comparing the draft to the source index only.
- `finalize`: Appends the critique summary to the delivered report.
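What that structured output looks like, roughly — the field names come from the list above, while the types and any risk scale are assumptions (the real model lives under `research_system/agents/`):

```python
# Hypothetical shape of CritiqueResult — field names from the docs above,
# types assumed; check research_system/agents/ for the real schema.
from pydantic import BaseModel

class CritiqueResult(BaseModel):
    approved: bool
    hallucination_risk: str   # e.g. "low" / "medium" / "high" (assumed scale)
    issues: list[str]
    revision_guidance: str

# LangChain can bind such a schema to the critic model directly, e.g.:
#   critic = ChatOpenAI(model="gpt-4o-mini").with_structured_output(CritiqueResult)
```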
- Create a project and API key at smith.langchain.com.
- In `.env` (or Streamlit secrets), use either the names from the LangSmith UI or the LangChain-style names (the SDK checks `LANGSMITH_*` first, then `LANGCHAIN_*`):
```
# Same as LangSmith “Tracing” onboarding:
LANGSMITH_TRACING=true
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_PROJECT=citegraph
LANGSMITH_ENDPOINT=https://api.smith.langchain.com

# Equivalent legacy names:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_api_key
LANGCHAIN_PROJECT=citegraph
```

`LANGSMITH_PROJECT` / `LANGCHAIN_PROJECT` must match the project name you selected in the LangSmith sidebar (e.g. `citegraph`), or traces will land in another project or the default.
- Run the app or `run_research(...)`. Traces show the LangGraph run with nested agent spans (`SynthesisAgent`, `CritiqueAgent`, etc.) and LLM calls.
Versus the LangSmith docs snippet: you do not need `@traceable` on every function here. This repo uses LangChain runnables (`RunnableLambda`, `ChatPromptTemplate | ChatOpenAI`, etc.); they emit runs automatically when tracing is enabled.
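For example, wrapping a plain function as a runnable is enough to get a named span (a sketch; the function body and tag values are illustrative):

```python
from langchain_core.runnables import RunnableLambda

def gather_web(state: dict) -> dict:
    # A real agent would call Tavily here and merge results into state.
    return state

# run_name becomes the span name in LangSmith; tags become filters.
web_agent = RunnableLambda(gather_web).with_config(
    run_name="WebResearchAgent", tags=["citegraph"]
)
```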
`langsmith` is listed in `requirements.txt`. The Streamlit app clears LangSmith’s env-var cache after each `load_dotenv` so `.env` edits apply on refresh; if traces still look stale, restart Streamlit once.
```
python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS/Linux
pip install -r requirements.txt
```

Create a `.env` file in the project root (it is gitignored). Use the variables below and in the LangSmith section above.
Required:

- `TAVILY_API_KEY` — Tavily
- `OPENAI_API_KEY` — OpenAI API

Optional:

- `OPENAI_MODEL` (default `gpt-4o-mini`)
- `OPENALEX_MAILTO` (your email; recommended for OpenAlex)
- LangSmith: `LANGSMITH_TRACING=true`, `LANGSMITH_API_KEY`, `LANGSMITH_PROJECT`, … or the equivalent `LANGCHAIN_*` names (see the LangSmith section)
- `AGENT_MEMORY_DIR` — directory for the episodic JSONL store (default `data/agent_memory/` under the project)
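Putting those together, a minimal `.env` might look like this (placeholder values):

```
TAVILY_API_KEY=...
OPENAI_API_KEY=...
OPENAI_MODEL=gpt-4o-mini
OPENALEX_MAILTO=you@example.com
# Add the LANGSMITH_* block from the LangSmith section to enable tracing.
```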
```
pytest tests/ -v
```

Coverage (no live APIs): OpenAlex/Tavily clients (mocked HTTP), graph compile and node list, `should_revise` routing, execution-trace merge / step records, source bundler + finalize agents, prompt formatters, `CritiqueResult` schema, LangSmith feed latency helper. CI runs the same suite on push/PR.
```
streamlit run app.py
```

To deploy:

- Push this repo to GitHub.
- New app → select the repo, main file `app.py`, Python 3.11+.
- Under Secrets, add:
```toml
TAVILY_API_KEY = "..."
OPENAI_API_KEY = "..."
# Optional — better OpenAlex rate limits:
# OPENALEX_MAILTO = "you@example.com"
```

To run the pipeline from Python:

```python
from research_system.graph import run_research

state = run_research("AI regulation in Europe 2026", max_iterations=3)
print(state["final_report"])
```

Each run accumulates `state["prompt_trace"]`: per-step `before` (Tavily/OpenAlex request shape, exact synthesis/critique system + human strings) and `after` (results or model-output previews). Export:
```python
from pathlib import Path

from research_system.graph import run_research
from research_system.prompt_trace_pdf import build_prompt_trace_pdf

state = run_research("AI regulation in Europe 2026", max_iterations=3)
pdf_bytes = build_prompt_trace_pdf(state["topic"], state.get("prompt_trace") or [])
Path("trace.pdf").write_bytes(pdf_bytes)
```

The PDF builder tries to cache DejaVu Sans under `data/fonts/` (gitignored) for Unicode; if the download fails, it falls back to Latin-1 with replacements.
MIT License — see LICENSE for the full text.