GitHub - Hongcheng-Dong/Claw-AI-Lab: One dashboard. An entire research team.

Claw AI Lab: An Autonomous Multi-Agent Research Team

🔥 Updates

[2026.04.02]: Preview v1.1.0 — powered by Claw-Code Harness.
[2026.03.25]: Preview v1.0.0 - initial release.

🤔 What Is This?

Claw AI Lab is a lab-native multi-agent research platform for interactive and scalable AI-driven science. It enables users to create a full AI research lab from a single prompt, with customizable roles, research directions, and collaborative workflows, rather than relying on a single-agent or fixed serial pipeline. Claw orchestrates multiple agents and projects in parallel through a FIFO-based scheduling framework, maximizing compute utilization while supporting cross-project knowledge sharing and mutual improvement. Crucially, the system keeps humans in the loop: users can intervene whenever needed, provide feedback under ambiguity, inject new ideas, and iteratively refine the research process through rollback and continuation. Combined with a simple UI that reduces everything to prompts and clicks, Claw transforms automated research into a more intuitive, steerable, and laboratory-like experience.
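
The README does not describe the scheduler's internals, so the following is only a toy illustration of what FIFO dispatch onto a fixed pool of workers can look like. The `Job` class, the `run_fifo` function, and the project names and durations are all hypothetical and are not taken from the Claw codebase.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    project: str   # hypothetical project name
    duration: int  # simulated run time, in ticks


def run_fifo(jobs, num_workers):
    """Simulate FIFO dispatch of jobs onto a fixed pool of workers.

    Returns a list of (start_tick, worker_id, job) in dispatch order.
    """
    queue = deque(jobs)            # arrival order is preserved
    free_at = [0] * num_workers    # tick at which each worker becomes idle
    schedule = []
    while queue:
        job = queue.popleft()      # always take the oldest waiting job
        # Give it to whichever worker frees up first (lowest id wins ties).
        worker = min(range(num_workers), key=lambda w: free_at[w])
        start = free_at[worker]
        free_at[worker] = start + job.duration
        schedule.append((start, worker, job))
    return schedule


jobs = [Job("project-a", 3), Job("project-b", 1),
        Job("project-c", 2), Job("project-d", 2)]
schedule = run_fifo(jobs, num_workers=2)
for start, worker, job in schedule:
    print(f"t={start} worker={worker} {job.project}")
```

With two workers, `project-c` starts as soon as the short `project-b` job finishes, so no worker sits idle while jobs are waiting.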

We welcome contributions from the community to make this project better together!
You are warmly invited to scroll to the bottom of the page to join our group for beta testing and discussion.

🖥️ Claw AI Lab Dashboard

Launch projects, monitor agents, and inspect every artifact — all from a single interface.
Real-time event stream · Multi-project overview · One-click rollback & resume · Artifact inspector

✨ Key Features

🖥️	Interactive UI	Real-time web dashboard with event stream, data shelf, and multi-project monitoring
🧬	Claw Code Harness	Reads your local codebases, datasets & checkpoints — writes runnable code back to disk
🔬	End-to-End Pipeline	One prompt → paper + code + figures + experiment logs, fully autonomous
🤝	Three Research Modes	Explore · Discussion (multi-agent debate) · Reproduce

🏆 Generated Project Showcase

Each project autonomously produces a full research deliverable: Paper · Code · Figures · Experiment Logs

OATH: Quantifying Video Hallucination via Occlusion Debt
Lab Explore · CV · Video Generation Evaluation

Best method achieves 0.1714 primary error vs CLIP-T baseline 0.2393 (↓28%)
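
As a quick check of the quoted reduction, the relative change can be recomputed from the two error values reported above. The variable names are illustrative only.

```python
baseline = 0.2393  # CLIP-T baseline primary error, as reported above
best = 0.1714      # best method primary error, as reported above

# Relative reduction = (baseline - best) / baseline
relative_reduction = (baseline - best) / baseline
print(f"{relative_reduction:.1%}")  # prints 28.4%
```

This matches the "↓28%" figure after rounding.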

Reproducing PhyCustom on FLUX
Reproduce · Image Gen · Multi-Concept Customization

5 methods × 3 seeds = 15 runs; output-space decoupling edges at 0.2813

🏆 Discussion Mode Showcase

Multi-agent discussion on: "What is the most deployable direction for Video Action Models in Embodied AI?"

Agent A — World Model + MPC (Model Predictive Control) is the most industrially stable path.

Agent B — "Train with video, infer with action" is the most deployable policy paradigm.

Agent C — Execution monitoring & SOP (Standard Operating Procedure) automation lands fastest as a product.

Consensus: The most deployable form is not a single end-to-end model, but a layered, modular system — use video supervision during training to learn rich dynamics, output actions directly at inference for low latency, and layer planning/MPC/safety modules on top for closed-loop robustness and recovery.

Top 3 Research Directions (ranked by deployability)

#	Direction	Deployability
1	Layered Video-Action Stack — video-action joint training + direct action inference + MPC safety	Highest — best balance of latency, interpretability & safety
2	Video-to-Plan / SOP — demo videos → step sequences & skill graphs for existing robots	High — smallest embodiment gap, clearest commercial path
3	Execution Monitor — real-time step tracking, anomaly detection, re-planning triggers	High — fastest to production; critical for industrial reliability

Key Contradictions Resolved

Debate	Resolution
World Model + MPC vs. Direct Action?	Combine both — world model for representation, direct action for control, MPC for safety
Human video: valuable or too much gap?	Pre-training yes; direct low-level transfer not yet reliable
Is monitoring a "real" action model?	Not the backbone, but fastest to reach production value

→ Full Transcript · → Consensus Synthesis

🚀 Quick Start

1. Install

git clone https://github.com/Claw-AI-Lab/Claw-AI-Lab.git
cd Claw-AI-Lab

# Create python environment
conda create -n clawailab python=3.11
conda activate clawailab

# Backend
cd backend/agent
pip install -e ".[all]"
pip install websockets

# Frontend
cd ../../frontend
npm install
cd ..

# ML dependencies
# You can add more packages based on your research project
pip install torch torchvision diffusers transformers accelerate safetensors datasets \
            huggingface_hub opencv-python pandas matplotlib scikit-image scipy einops tqdm

2. Configure

Fill in the following fields in examples/config_template.yaml:

llm:
  api_key: "your-api-key"
  primary_model: "gpt-5.4"
  coding_model: "gpt-5.4"
  image_model: "gemini-3-pro-image-preview"
  fallback_models:
    - "qwen3.5-plus"
    - "qwen-plus"

sandbox:
  python_path: "/path/to/your/python3"

Many thanks to KOKONI for supporting this project. An api_key can be obtained here.
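
Before starting the services, it can help to confirm that the template placeholders were actually replaced. The sketch below is not part of Claw AI Lab: `unfilled_fields` is a hypothetical helper, and it assumes the YAML file has already been parsed into a plain dict (for example with a YAML library such as PyYAML). The dict is written inline here so the example runs with the standard library alone.

```python
# Placeholder values copied from the quick-start template; a config that still
# contains them has not been filled in yet.
PLACEHOLDERS = {
    ("llm", "api_key"): "your-api-key",
    ("sandbox", "python_path"): "/path/to/your/python3",
}


def unfilled_fields(config):
    """Return dotted names of fields that are missing, empty, or still placeholders."""
    problems = []
    for (section, key), placeholder in PLACEHOLDERS.items():
        value = (config.get(section) or {}).get(key)
        if not value or value == placeholder:
            problems.append(f"{section}.{key}")
    return problems


# Stand-in for the parsed contents of the config file.
config = {
    "llm": {"api_key": "your-api-key", "primary_model": "gpt-5.4"},
    "sandbox": {"python_path": "/usr/bin/python3"},
}
print(unfilled_fields(config))  # prints ['llm.api_key']
```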

3. Run

./start.sh              # Start all services
./start.sh stop         # Stop
./start.sh restart      # Restart
./start.sh status       # Status check
./start.sh fresh        # Clean restart (reset all data)

Open http://localhost:5903/ → Submit your research topic and let the agents work.

💡 Tips to Get the Best Results

#	Recommendation	Why
1	Prepare local codebases, datasets & checkpoints — enter their paths when submitting a project	Avoids download delays and network failures during runs
2	Use a strong coding model like GPT 5.4	Significantly better code quality and fewer iteration cycles
3	Review the `IMPORTANT` fields in Configuration Details	Misconfigured API keys or resource limits are the #1 cause of failed runs
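
For tip 1, a short pre-flight check can catch mistyped paths before a project is submitted. This is a sketch under assumptions rather than anything shipped with the project: `check_local_paths` is a hypothetical helper, and the labels simply mirror the `datasets_dir`, `checkpoints_dir`, and `codebases_dir` fields of the config template, which are documented as absolute paths.

```python
import os
import tempfile


def check_local_paths(paths):
    """Map each label to a problem string for paths that are not usable directories."""
    problems = {}
    for label, path in paths.items():
        if not path:
            continue  # empty means "not provided", so there is nothing to check
        if not os.path.isabs(path):
            problems[label] = "not an absolute path"
        elif not os.path.isdir(path):
            problems[label] = "directory does not exist"
    return problems


# A temporary directory stands in for a real datasets folder.
with tempfile.TemporaryDirectory() as tmp:
    result = check_local_paths({
        "datasets_dir": tmp,
        "checkpoints_dir": "relative/checkpoints",
        "codebases_dir": "",
    })
print(result)  # prints {'checkpoints_dir': 'not an absolute path'}
```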

⚙️ Configuration Details

Every field in examples/config_template.yaml is explained below. Fields marked IMPORTANT are the ones you almost always need to set.


# === Project ===
project:
  name: "my-project"              # Project identifier, used for directory naming and UI display
  mode: "full-auto"               # Pipeline mode: "full-auto" runs all stages without human gates

# === Research ===
research:
  topic: "Your research topic"    # The research topic or paper to reproduce (required)
  domains:                        # Research domains for literature search scope
    - "deep-learning"
  daily_paper_count: 5            # Number of papers to retrieve per search query
  quality_threshold: 3.0          # Minimum relevance score (1-5) for literature screening
  reference_papers: []            # List of reference paper titles or arXiv IDs

# === Notifications ===
notifications:
  channel: "console"              # Notification channel: "console" | "discord" | "slack"
  on_stage_start: true            # Notify when a stage begins
  on_gate_required: true          # Notify when human approval is needed

# === Knowledge Base ===
knowledge_base:
  backend: "markdown"             # Storage format: "markdown" | "obsidian"
  root: "docs/kb"                 # Root directory for knowledge base files

# === OpenClaw Bridge ===
openclaw_bridge:
  use_message: false              # Enable progress notifications via messaging platforms
  use_memory: false               # Enable cross-session knowledge persistence
  use_web_fetch: false            # Enable live web search during literature review

# === LLM ===
llm:
  provider: "openai-compatible"   # LLM provider: "openai-compatible" | "openai" | "deepseek" | "acp"
  api_key: "sk-your-key"          # ⚠️ **IMPORTANT** API key (or use api_key_env to read from environment)
  api_key_env: "RESEARCHCLAW_API_KEY"  # Environment variable name for API key (fallback)
  primary_model: "gpt-5.4"        # ⚠️ **IMPORTANT** Main model for research, analysis, and writing
  coding_model: "gpt-5.4"         # ⚠️ **IMPORTANT** Model for code generation (S11)
  image_model: "gemini-3-pro-image-preview"  # ⚠️ **IMPORTANT** Model for figure generation in paper
  fallback_models:                # Fallback model chain — used when primary model fails
    - "qwen3.5-plus"
    - "qwen-plus"

# === Security ===
security:
  hitl_required_stages: []        # Stage numbers requiring human approval (e.g. [5, 9, 20])

# === Experiment ===
experiment:
  mode: "sandbox"                 # Execution mode: "sandbox" (local Python) | "docker" | "simulated"
  time_budget_sec: 2400           # ⚠️ **IMPORTANT** Max wall-clock time per experiment run (seconds)
  max_iterations: 3               # Number of iterative refinement cycles in S15 (Edit-Run-Eval loop)
  metric_key: "primary_metric"    # Name of the primary evaluation metric
  metric_direction: "minimize"    # Optimization direction: "minimize" | "maximize"
  datasets_dir: ""                # ⚠️ **IMPORTANT** Absolute path to datasets directory
  checkpoints_dir: ""             # ⚠️ **IMPORTANT** Absolute path to model weights directory
  codebases_dir: ""               # Absolute path to reference codebases directory
  shared_results_dir: ""          # Directory for cross-project shared results
  paper_length: "long"            # Paper length: "short" (~4 pages) | "long" (~8 pages)
  sandbox:
    python_path: "/path/to/python3"  # ⚠️ **IMPORTANT** Python interpreter for running experiments
  sanity_check_max_iterations: 100   # Max fix attempts in S12 code testing

# === Prompts ===
prompts:
  custom_file: ""                 # Path to custom prompts YAML file (empty = use defaults)
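
The comments above suggest how a few of these fields might be interpreted: `api_key_env` as a fallback for `api_key`, `fallback_models` as an ordered chain tried after `primary_model`, and `metric_direction` as the rule for comparing results. The sketch below illustrates that reading only. All function names are hypothetical, the precedence order is an assumption based on the "(fallback)" comment, and `flaky` is a stand-in for a real LLM request, not a Claw AI Lab API.

```python
import os


def resolve_api_key(llm_cfg, environ=os.environ):
    """Prefer an explicit api_key; otherwise read the variable named by api_key_env."""
    key = llm_cfg.get("api_key")
    if key:
        return key
    env_name = llm_cfg.get("api_key_env")
    return environ.get(env_name) if env_name else None


def model_chain(llm_cfg):
    """Primary model first, then the fallbacks in their listed order."""
    return [llm_cfg["primary_model"], *llm_cfg.get("fallback_models", [])]


def call_with_fallback(llm_cfg, call):
    """Try each model in turn; `call` is a hypothetical function that raises on failure."""
    last_error = None
    for model in model_chain(llm_cfg):
        try:
            return model, call(model)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def is_improvement(new, best, direction):
    """Compare two metric values according to metric_direction."""
    if direction == "minimize":
        return new < best
    if direction == "maximize":
        return new > best
    raise ValueError(f"unknown metric_direction: {direction!r}")


llm = {"api_key": "", "api_key_env": "RESEARCHCLAW_API_KEY",
       "primary_model": "gpt-5.4", "fallback_models": ["qwen3.5-plus", "qwen-plus"]}


def flaky(model):
    """Stand-in for an LLM request in which the primary model is unavailable."""
    if model == "gpt-5.4":
        raise TimeoutError("primary unavailable")
    return f"reply from {model}"


print(resolve_api_key(llm, {"RESEARCHCLAW_API_KEY": "sk-from-env"}))  # prints sk-from-env
print(call_with_fallback(llm, flaky)[0])                              # prints qwen3.5-plus
print(is_improvement(0.17, 0.24, "minimize"))                         # prints True
```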

🙏 Acknowledgement

We learned from and reused code from the following projects: AutoResearchClaw, AutoResearch, claw-code.

We thank the authors for their contributions to the community!

📄 License

MIT — see LICENSE for details.

📌 Citation

If you find Claw AI Lab useful, please cite:

@misc{wu2026clawailab,
  author       = {Wu, Fan and Chen, Cheng and Tan, Zhenshan and Zhang, Taiyu and
                  Gao, Dingcheng and Zhu, Lanyun and Zhu, Qi and Tan, Yi and Ji, Deyi and 
                  Lin, Guosheng and Chen, Tianrun and Ye, Deheng and Liu, Fayao},
  title        = {Claw AI Lab: An Autonomous Multi-Agent Research Team},
  year         = {2026},
  url          = {https://github.com/Claw-AI-Lab/Claw-AI-Lab},
  note         = {GitHub repository}
}
