Generic framework for building co-scientist-style lab workflows.
This project combines two ideas:
- Google Co-Scientist style multi-agent scientific reasoning: generate -> review -> rank -> evolve -> summarize.
- Experiment-as-Code (EaC) style execution: declarative experiment specs -> static checks -> execution plan -> runtime checks.
The goal is to keep the framework generic first, so a lab can be plugged in through config and adapters instead of rewriting the core for every domain.
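The reasoning loop named above (generate -> review -> rank -> evolve -> summarize) can be sketched in a few lines. This is a minimal illustration, not the project's orchestrator: `Hypothesis`, `run_loop`, and the toy callables are hypothetical names, and keeping the top half of the pool each round is an assumed selection rule.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    score: float = 0.0
    reviews: List[str] = field(default_factory=list)


def _review_and_rank(pool, review, rank) -> None:
    """Attach a review and a score to every hypothesis, then sort best-first."""
    for h in pool:
        h.reviews.append(review(h))
        h.score = rank(h)
    pool.sort(key=lambda h: h.score, reverse=True)


def run_loop(
    goal: str,
    generate: Callable[[str], List[Hypothesis]],
    review: Callable[[Hypothesis], str],
    rank: Callable[[Hypothesis], float],
    evolve: Callable[[Hypothesis], Hypothesis],
    rounds: int = 2,
) -> str:
    pool = generate(goal)
    if not pool:
        raise ValueError("generate() returned no hypotheses")
    for _ in range(rounds):
        _review_and_rank(pool, review, rank)
        # Assumed selection rule: keep the top half, add one evolved variant each.
        survivors = pool[: max(1, len(pool) // 2)]
        pool = survivors + [evolve(h) for h in survivors]
    _review_and_rank(pool, review, rank)
    best = pool[0]
    # Summarize: here just a one-line report of the winner.
    return f"Goal: {goal}. Best hypothesis: {best.text} (score {best.score:.1f})"


# Toy stand-ins for real agents: longer text scores higher.
summary = run_loop(
    goal="demo",
    generate=lambda goal: [Hypothesis("A"), Hypothesis("BB"), Hypothesis("CCC"), Hypothesis("DDDD")],
    review=lambda h: "ok",
    rank=lambda h: float(len(h.text)),
    evolve=lambda h: Hypothesis(h.text + "+"),
)
print(summary)
```

Passing each stage in as a callable keeps the loop independent of any particular LLM or heuristic, which matches the plug-in goal stated above.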
- Domain-agnostic core: hypotheses, reviews, tournaments, experiment specs, and lab state are generic models.
- Clean boundaries: scientific reasoning, experiment compilation, scheduling, and device execution are separated by ports.
- Declarative labs: each lab declares capabilities, resources, safety constraints, and preferred evidence sources in YAML.
- Human-in-the-loop by default: the framework proposes, validates, and plans; it does not silently run risky physical actions.
- Portable artifacts: hypotheses, reviews, experiment specs, and plans are serializable JSON/YAML so they can be versioned and moved across labs.
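One way to make serialized artifacts versionable is to derive a content hash from a canonical JSON form. The sketch below assumes hypothetical `StepSpec` and `ExperimentSpec` dataclasses and a SHA-256 digest; this README does not say how the project actually computes its hashes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class StepSpec:  # hypothetical shape, for illustration only
    id: str
    capability: str
    depends_on: List[str] = field(default_factory=list)


@dataclass
class ExperimentSpec:  # hypothetical shape, for illustration only
    name: str
    steps: List[StepSpec]
    params: Dict[str, float] = field(default_factory=dict)


def to_canonical_json(spec: ExperimentSpec) -> str:
    # Sorted keys and fixed separators yield identical text for identical content.
    return json.dumps(asdict(spec), sort_keys=True, separators=(",", ":"))


def content_hash(spec: ExperimentSpec) -> str:
    return hashlib.sha256(to_canonical_json(spec).encode("utf-8")).hexdigest()


steps = [StepSpec("prep", "pipetting"), StepSpec("read", "plate_reader", ["prep"])]
spec_a = ExperimentSpec("binding-assay", steps, {"temp_c": 25.0, "volume_ul": 50.0})
# Same content, different key insertion order.
spec_b = ExperimentSpec("binding-assay", steps, {"volume_ul": 50.0, "temp_c": 25.0})
spec_c = ExperimentSpec("binding-assay-v2", steps, {"temp_c": 25.0, "volume_ul": 50.0})

print(content_hash(spec_a) == content_hash(spec_b))
print(content_hash(spec_a) == content_hash(spec_c))
```

Because the hash depends only on content, two labs holding the same spec file can confirm they are running the same experiment definition.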
- Co-Scientist: Nature 10.1038/s41586-026-10644-y; arXiv 2502.18864.
- Experiment-as-Code Labs: arXiv 2605.04375.
A Concept → module map lives in docs/ARCHITECTURE.md.
- Core data models for hypotheses, reviews, experiment specs, execution plans, lab state, and resources.
- Ports/interfaces for literature providers, hypothesis agents, reviewers, rankers, compilers, schedulers, and executors.
- A supervisor orchestrator that coordinates the reasoning loop.
- An EaC compiler that turns declarative experiment specs into a simple execution plan with static checks.
- A minimal CLI to inspect lab configs and compile example experiment specs.
- A generic protein-lab example config showing how to register constraints and capabilities.
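The ports idea can be illustrated with `typing.Protocol`: the core depends only on a method signature, and any adapter that matches it can be plugged in. The class and method names below are assumptions made for this sketch and are not taken from `ports.py`.

```python
from typing import List, Protocol


class HypothesisAgent(Protocol):
    """Port: anything that can propose hypotheses for a goal."""

    def generate(self, goal: str, n: int) -> List[str]: ...


class Reviewer(Protocol):
    """Port: anything that can critique a hypothesis."""

    def review(self, hypothesis: str) -> str: ...


class NullHypothesisAgent:
    """No-op style adapter: deterministic output, no LLM involved."""

    def generate(self, goal: str, n: int) -> List[str]:
        return [f"{goal} (variant {i + 1})" for i in range(n)]


class NullReviewer:
    def review(self, hypothesis: str) -> str:
        return f"no-op review of: {hypothesis}"


def first_pass(agent: HypothesisAgent, reviewer: Reviewer, goal: str) -> List[str]:
    # The core only sees the ports, never the concrete adapter classes.
    return [reviewer.review(h) for h in agent.generate(goal, n=2)]


reviews = first_pass(NullHypothesisAgent(), NullReviewer(), "stabilize enzyme X")
for line in reviews:
    print(line)
```

Structural typing means an adapter does not need to inherit from the port; matching the method signature is enough, so LLM-backed adapters could later replace the null ones without touching `first_pass`.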
- ExecutionPlan distinguishes errors (block execution) from warnings and carries status + spec_hash.
- Declarative SafetyHooks (deny / require_approval) replace string rules for blocking checks.
- Compiler enforces dependency graph (missing-step + cycle detection) and required capabilities per step.
- CLI: validate-lab, validate-spec, dry-run (non-zero exit on errors).
- adapters/null/ provides heuristic / no-op agent implementations for smoke runs without LLMs.
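The compiler checks listed above (missing steps, cycles, required capabilities, errors versus warnings) might look roughly like this. It is a sketch under assumptions: `Step`, `Plan`, `compile_plan`, and the status strings "ready" and "blocked" are hypothetical, and Kahn's algorithm is one common choice for cycle detection, not necessarily the one the project uses.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Step:
    id: str  # assumed unique within a spec
    capability: str
    depends_on: List[str] = field(default_factory=list)


@dataclass
class Plan:
    order: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)      # block execution
    warnings: List[str] = field(default_factory=list)    # informational only

    @property
    def status(self) -> str:
        return "blocked" if self.errors else "ready"


def compile_plan(steps: List[Step], lab_capabilities: Set[str]) -> Plan:
    plan = Plan()
    known = {s.id for s in steps}
    for s in steps:
        if s.capability not in lab_capabilities:
            plan.errors.append(f"step {s.id}: lab lacks capability {s.capability}")
        for dep in s.depends_on:
            if dep not in known:
                plan.errors.append(f"step {s.id}: depends on missing step {dep}")
    # Kahn's algorithm: repeatedly emit steps whose known dependencies are done.
    pending = {s.id: {d for d in s.depends_on if d in known} for s in steps}
    ready = [sid for sid, deps in pending.items() if not deps]
    while ready:
        sid = ready.pop(0)
        plan.order.append(sid)
        for other, deps in pending.items():
            if sid in deps:
                deps.remove(sid)
                if not deps:
                    ready.append(other)
    if len(plan.order) < len(steps):
        stuck = sorted(known - set(plan.order))
        plan.errors.append(f"steps in or behind a dependency cycle: {', '.join(stuck)}")
    if not steps:
        plan.warnings.append("spec has no steps")
    return plan


caps = {"pipetting", "plate_reader"}
good = compile_plan([Step("prep", "pipetting"), Step("read", "plate_reader", ["prep"])], caps)
bad = compile_plan(
    [Step("a", "pipetting", ["b"]), Step("b", "pipetting", ["a"]), Step("c", "sequencing", ["zzz"])],
    caps,
)
print(good.status, good.order)
print(bad.status, bad.errors)
```

Collecting every error before returning, instead of raising on the first one, lets a single dry run report all problems in a spec at once.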
- Real LLM adapters
- Real literature search adapters
- Real device/instrument drivers
- Real scheduling heuristics
- Persistent run store / experiment database
- Async workflow execution engine
This is intentional. v0.1 is the generic skeleton, not a fake demo that pretends to run a wet lab.
- src/agentic_lab_eac/models.py: core entities and DTOs
- src/agentic_lab_eac/ports.py: clean-architecture interfaces
- src/agentic_lab_eac/orchestrator.py: supervisor loop
- src/agentic_lab_eac/eac.py: declarative experiment compiler and scheduler stubs
- src/agentic_lab_eac/cli.py: local CLI
- examples/labs/: lab capability definitions
- examples/goals/: example experiment specs
- docs/: architecture notes and research grounding
git clone https://github.com/Grizaceo/agentic-lab-eac.git agentic-lab-eac
cd agentic-lab-eac
pip install -e ".[dev]"

Python 3.10+ required. Only one runtime dependency (PyYAML).
# inspect a lab
agentic-lab-eac show-lab examples/labs/protein_lab.yaml
# validate without compiling
agentic-lab-eac validate-spec examples/labs/protein_lab.yaml \
examples/goals/proteinlab_hit_to_hypothesis.yaml
# compile a declarative experiment spec into an execution plan
agentic-lab-eac dry-run examples/labs/protein_lab.yaml \
examples/goals/proteinlab_hit_to_hypothesis.yaml

dry-run prints the plan as JSON and exits non-zero if any compile errors are
found (missing resources, capability mismatches, dependency cycles, denied
safety hooks).
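The link between safety hooks and the exit code can be sketched as follows. Everything here is an assumption made for illustration: the `SafetyHook` fields, matching hooks by capability name, and the rule that an unapproved require_approval hook blocks like a deny. The real CLI's JSON layout and hook semantics may differ.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class SafetyHook:  # hypothetical shape
    name: str
    capability: str  # assumed match rule: hook applies to steps using this capability
    action: str      # "deny" or "require_approval"


def check_hooks(
    step_capabilities: Dict[str, str],
    hooks: List[SafetyHook],
    approved: Optional[Set[str]] = None,
) -> List[str]:
    """Return blocking errors for steps that hit a deny or an unapproved hook."""
    approved = approved or set()
    errors: List[str] = []
    for step_id, capability in step_capabilities.items():
        for hook in hooks:
            if hook.capability != capability:
                continue
            if hook.action == "deny":
                errors.append(f"step {step_id}: denied by hook {hook.name}")
            elif hook.action == "require_approval":
                if step_id not in approved:
                    errors.append(f"step {step_id}: needs approval ({hook.name})")
            else:
                raise ValueError(f"unknown hook action: {hook.action}")
    return errors


def dry_run(step_capabilities, hooks, approved=None) -> int:
    errors = check_hooks(step_capabilities, hooks, approved)
    print(json.dumps({"steps": sorted(step_capabilities), "errors": errors}, indent=2))
    return 1 if errors else 0  # non-zero exit code signals a blocked plan


hooks = [
    SafetyHook("no-biohazard", "biohazard_handling", "deny"),
    SafetyHook("centrifuge-signoff", "centrifuge", "require_approval"),
]
blocked_code = dry_run({"mix": "pipetting", "spin": "centrifuge", "lyse": "biohazard_handling"}, hooks)
ok_code = dry_run({"mix": "pipetting", "spin": "centrifuge"}, hooks, approved={"spin"})
print(blocked_code, ok_code)
```

A script wrapping the CLI could pass the returned code to `sys.exit`, so a CI job or shell pipeline stops before any blocked plan reaches an executor.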
- Add real literature adapter interfaces for PubMed/arXiv/local corpus.
- Add pluggable reviewer/ranker adapters backed by Hermes tool calls or external LLMs.
- Add per-lab constraint packs and experiment DSL validation.
- Add persistent run history and negative-results memory.
- Add adapter packages for specific labs: protein-lab, financial-lab, cyber-lab.