agentic-lab-eac

Generic framework for building co-scientist-style lab workflows.

This project combines two ideas:

  • Google Co-Scientist style multi-agent scientific reasoning: generate -> review -> rank -> evolve -> summarize.
  • Experiment-as-Code (EaC) style execution: declarative experiment specs -> static checks -> execution plan -> runtime checks.
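
The reasoning half of that pairing can be pictured as a small loop. The sketch below is illustrative only: `Hypothesis`, `run_reasoning_loop`, and the toy callables are hypothetical names, not the project's API, and the scoring rule is a placeholder heuristic rather than a real reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Hypothesis:
    """Hypothetical stand-in for a generic hypothesis model."""
    text: str
    score: float = 0.0
    reviews: list[str] = field(default_factory=list)


def run_reasoning_loop(
    goal: str,
    generate: Callable[[str], list[Hypothesis]],
    review: Callable[[Hypothesis], tuple[str, float]],
    evolve: Callable[[Hypothesis], Hypothesis],
    rounds: int = 2,
    keep: int = 2,
) -> dict:
    """generate -> review -> rank -> evolve -> summarize, with injected steps."""
    pool = generate(goal)
    for round_index in range(rounds):
        for hyp in pool:                      # review: attach a note and a score
            note, score = review(hyp)
            hyp.reviews.append(note)
            hyp.score = score
        pool.sort(key=lambda h: h.score, reverse=True)  # rank (stable sort)
        pool = pool[:keep]
        if round_index < rounds - 1:          # evolve survivors, except in the final round
            pool = pool + [evolve(h) for h in pool]
    return {"goal": goal, "top": [h.text for h in pool]}  # summarize


# Toy callables so the loop runs without any LLM (assumed behaviour, for illustration).
def toy_generate(goal: str) -> list[Hypothesis]:
    return [Hypothesis("a"), Hypothesis("bb"), Hypothesis("ccc")]


def toy_review(hyp: Hypothesis) -> tuple[str, float]:
    return (f"length={len(hyp.text)}", float(len(hyp.text)))


def toy_evolve(hyp: Hypothesis) -> Hypothesis:
    return Hypothesis(hyp.text + "+")


summary = run_reasoning_loop("demo goal", toy_generate, toy_review, toy_evolve)
```

Passing the steps in as callables keeps the loop itself free of domain knowledge, which is the property the framework is aiming for.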

The goal is to keep the framework generic, so that a lab plugs in through config and adapters rather than by rewriting the core for each domain.

Design principles

  1. Domain-agnostic core
    • hypotheses, reviews, tournaments, experiment specs, and lab state are generic models.
  2. Clean boundaries
    • scientific reasoning, experiment compilation, scheduling, and device execution are separated by ports.
  3. Declarative labs
    • each lab declares capabilities, resources, safety constraints, and preferred evidence sources in YAML.
  4. Human-in-the-loop by default
    • the framework proposes, validates, and plans; it does not silently run risky physical actions.
  5. Portable artifacts
    • hypotheses, reviews, experiment specs, and plans are serializable JSON/YAML so they can be versioned and moved across labs.
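
Principles 2 and 5 can be illustrated together: a port is an interface the core depends on, and the artifact it returns round-trips through JSON. This is a minimal sketch under assumed names (`Review`, `Reviewer`, `LengthReviewer`, and `critique` are hypothetical and do not mirror the contents of `ports.py` or `models.py`).

```python
import json
from dataclasses import asdict, dataclass
from typing import Protocol


@dataclass(frozen=True)
class Review:
    """Hypothetical portable artifact: plain fields only, so it serializes cleanly."""
    hypothesis_id: str
    verdict: str
    notes: str


class Reviewer(Protocol):
    """Hypothetical port: the core depends on this shape, not on a concrete adapter."""
    def review(self, hypothesis_id: str, text: str) -> Review: ...


class LengthReviewer:
    """Heuristic stand-in adapter that satisfies the port without calling an LLM."""
    def review(self, hypothesis_id: str, text: str) -> Review:
        words = len(text.split())
        verdict = "accept" if words >= 5 else "needs_detail"
        return Review(hypothesis_id, verdict, f"{words} words")


def critique(reviewer: Reviewer, hypothesis_id: str, text: str) -> str:
    """Core-side code: calls the port and emits a JSON artifact."""
    return json.dumps(asdict(reviewer.review(hypothesis_id, text)), sort_keys=True)


payload = critique(LengthReviewer(), "h1", "Mutation X stabilizes the binding loop")
restored = Review(**json.loads(payload))  # the artifact survives a JSON round trip
```

Because `Reviewer` is a structural `Protocol`, an adapter does not need to inherit from it; any object with a matching `review` method can be passed in.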

References

  • Co-Scientist: Nature 10.1038/s41586-026-10644-y; arXiv 2502.18864.
  • Experiment-as-Code Labs: arXiv 2605.04375.

A concept → module map lives in docs/ARCHITECTURE.md.

What is implemented in v0.1

  • Core data models for hypotheses, reviews, experiment specs, execution plans, lab state, and resources.
  • Ports/interfaces for literature providers, hypothesis agents, reviewers, rankers, compilers, schedulers, and executors.
  • A supervisor orchestrator that coordinates the reasoning loop.
  • An EaC compiler that turns declarative experiment specs into a simple execution plan with static checks.
  • A minimal CLI to inspect lab configs and compile example experiment specs.
  • A generic protein-lab example config showing how to register constraints and capabilities.

Hardening shipped on top of v0.1

  • ExecutionPlan distinguishes errors (which block execution) from warnings, and carries a status and a spec_hash.
  • Declarative SafetyHooks (deny / require_approval) replace string rules for blocking checks.
  • The compiler enforces the dependency graph (missing-step and cycle detection) and the capabilities each step requires.
  • CLI: validate-lab, validate-spec, dry-run (non-zero exit on errors).
  • adapters/null/ provides heuristic / no-op agent implementations for smoke runs without LLMs.
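
To make the hardening items above concrete, here is a rough sketch of how such checks could fit together. It is not the project's compiler: the dict layout (`steps`, `depends_on`, `requires`, `safety_hooks`, `effect`), the `Plan` class, and the status strings are all assumptions, and treating `require_approval` as a warning is a choice made for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical plan: errors block execution, warnings do not."""
    spec_hash: str
    order: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        return "blocked" if self.errors else "ready"


def compile_spec(lab: dict, spec: dict) -> Plan:
    # Hash a canonical JSON form so the same spec always yields the same hash.
    canonical = json.dumps(spec, sort_keys=True).encode("utf-8")
    plan = Plan(spec_hash=hashlib.sha256(canonical).hexdigest())
    steps = {step["id"]: step for step in spec["steps"]}

    for sid, step in steps.items():
        for dep in step.get("depends_on", []):
            if dep not in steps:                       # missing-step detection
                plan.errors.append(f"{sid}: depends on missing step '{dep}'")
        for cap in step.get("requires", []):
            if cap not in lab.get("capabilities", []):  # capability check
                plan.errors.append(f"{sid}: lab lacks capability '{cap}'")
        for hook in lab.get("safety_hooks", []):
            if hook["action"] != step.get("action"):
                continue
            if hook["effect"] == "deny":
                plan.errors.append(f"{sid}: action '{hook['action']}' is denied")
            elif hook["effect"] == "require_approval":
                plan.warnings.append(f"{sid}: action '{hook['action']}' needs approval")

    # Kahn-style topological ordering; whatever cannot be ordered is in a cycle.
    pending = {
        sid: {d for d in step.get("depends_on", []) if d in steps}
        for sid, step in steps.items()
    }
    while pending:
        ready = sorted(sid for sid, deps in pending.items() if not deps)
        if not ready:
            plan.errors.append("dependency cycle among: " + ", ".join(sorted(pending)))
            break
        for sid in ready:
            plan.order.append(sid)
            del pending[sid]
        for deps in pending.values():
            deps.difference_update(ready)
    return plan


lab = {
    "capabilities": ["pipetting", "plate_reader"],
    "safety_hooks": [{"action": "heat", "effect": "require_approval"}],
}
spec = {
    "steps": [
        {"id": "read", "action": "measure", "requires": ["plate_reader"], "depends_on": ["mix"]},
        {"id": "mix", "action": "pipette", "requires": ["pipetting"]},
    ]
}
plan = compile_spec(lab, spec)
```

Collecting every problem into `errors` and `warnings`, instead of raising on the first one, lets a single compile report all issues at once.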

What is not implemented yet

  • Real LLM adapters
  • Real literature search adapters
  • Real device/instrument drivers
  • Real scheduling heuristics
  • Persistent run store / experiment database
  • Async workflow execution engine

This is intentional. v0.1 is the generic skeleton, not a fake demo that pretends to run a wet lab.

Repository layout

  • src/agentic_lab_eac/models.py — core entities and DTOs
  • src/agentic_lab_eac/ports.py — clean-architecture interfaces
  • src/agentic_lab_eac/orchestrator.py — supervisor loop
  • src/agentic_lab_eac/eac.py — declarative experiment compiler and scheduler stubs
  • src/agentic_lab_eac/cli.py — local CLI
  • examples/labs/ — lab capability definitions
  • examples/goals/ — example experiment specs
  • docs/ — architecture notes and research grounding

Install

git clone https://github.com/Grizaceo/agentic-lab-eac.git agentic-lab-eac
cd agentic-lab-eac
pip install -e ".[dev]"

Requires Python 3.10+. The only runtime dependency is PyYAML.

Quick start

# inspect a lab
agentic-lab-eac show-lab examples/labs/protein_lab.yaml

# validate without compiling
agentic-lab-eac validate-spec examples/labs/protein_lab.yaml \
                              examples/goals/proteinlab_hit_to_hypothesis.yaml

# compile a declarative experiment spec into an execution plan
agentic-lab-eac dry-run examples/labs/protein_lab.yaml \
                        examples/goals/proteinlab_hit_to_hypothesis.yaml

dry-run prints the plan as JSON and exits non-zero if any compile errors are found (missing resources, capability mismatches, dependency cycles, denied safety hooks).
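
Because the exit code carries the pass/fail signal, dry-run can gate a script or CI job. The wrapper below is a sketch: `dry_run` and its `runner` parameter are hypothetical helpers, and it makes no assumption about the keys in the JSON output, or about whether JSON is still printed when errors are found.

```python
import json
import subprocess
from typing import Any, Callable, Optional


def dry_run(
    lab_path: str,
    spec_path: str,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> tuple[bool, Optional[Any]]:
    """Run the dry-run command; return (succeeded, parsed plan or None)."""
    proc = runner(
        ["agentic-lab-eac", "dry-run", lab_path, spec_path],
        capture_output=True,
        text=True,
    )
    try:
        plan = json.loads(proc.stdout)
    except (json.JSONDecodeError, TypeError):
        plan = None  # stdout was empty or not JSON; rely on the exit code alone
    return proc.returncode == 0, plan
```

The `runner` argument exists only so the wrapper can be exercised without the CLI installed; in normal use the default `subprocess.run` is what executes the command.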

Next intended milestones

  1. Add real literature adapter interfaces for PubMed/arXiv/local corpus.
  2. Add pluggable reviewer/ranker adapters backed by Hermes tool calls or external LLMs.
  3. Add per-lab constraint packs and experiment DSL validation.
  4. Add persistent run history and negative-results memory.
  5. Add adapter packages for specific labs: protein-lab, financial-lab, cyber-lab.
