Let's Bring Brain To Robots
Visible robotics demos driven by VLM policies, OpenClaw, and AI coding agents.
Roboclaws is a thin demo repo for making AI-driven robotics behavior reviewable: frames, maps, tool traces, scores, and public/private evaluation boundaries are published as HTML reports instead of buried in terminal logs.
It answers three practical questions:
- How can an AI agent drive a robot?
- What context and tools does the agent need?
- What did the agent actually do in the simulated or robot-backed world?
Roboclaws treats reusable robot behavior as skills first and MCP tools as a bounded public robot capability surface.
| Principle | Practice |
|---|---|
| Start from open-ended goals | A user asks for work such as "clean the room" or "take useful photos"; an agent selects or creates a skill to do it. |
| Keep strategy in skills | Skills own prompt strategy, scripts, examples, checks, and task-specific loops such as photo capture or cleanup. |
| Keep MCP bounded | MCP tools expose semantic robot capabilities like observe, move, pick, place, and done; they should not hide a whole task behind one opaque call. |
| Profile public capabilities | Semantic profiles describe the public tool contract for a backend/domain, currently ai2thor_navigation_v1 and molmospaces_cleanup_v1. |
| Label privileged help | Simulator or demo helpers such as full object inventory and target-relative teleport are useful, but they stay labeled as privileged tools, not canonical robot abilities. |
| Protect private evaluation truth | Hidden mess sets, acceptable destinations, private manifests, and scoring truth stay out of public profile metadata and agent-facing skill inputs. |
| Let reports improve skills | Traces, artifacts, and evals feed the skill lifecycle: improve, split, merge, prune, or promote behavior only when the boundary is stable. |
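As a rough illustration of the "keep MCP bounded", "label privileged help", and "protect private evaluation truth" rows, the sketch below models a profile whose agent-facing view lists tool names and privilege labels but never scoring data. The `ToolSpec` and `Profile` classes, the `full_object_inventory` tool name, and the placeholder private values are assumptions made for this example, not the repository's actual types or data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolSpec:
    """One entry in a hypothetical public tool contract."""
    name: str
    privileged: bool = False  # True for simulator/demo helpers


@dataclass
class Profile:
    """Hypothetical semantic profile: public tools plus private eval data."""
    name: str
    tools: list[ToolSpec]
    private_eval: dict = field(default_factory=dict)  # never shown to agents

    def public_view(self) -> dict:
        """Agent-facing metadata: tool names and labels, no scoring truth."""
        return {
            "profile": self.name,
            "tools": [
                {"name": t.name, "privileged": t.privileged} for t in self.tools
            ],
        }


profile = Profile(
    name="molmospaces_cleanup_v1",
    # Bounded semantic capabilities named in the table above.
    tools=[ToolSpec(n) for n in ("observe", "move", "pick", "place", "done")]
    # Hypothetical privileged helper; the real tool name may differ.
    + [ToolSpec("full_object_inventory", privileged=True)],
    # Placeholder private data, invented for this sketch only.
    private_eval={"hidden_mess_set": ["mug_3"], "acceptable_destinations": {}},
)

view = profile.public_view()
canonical = [t["name"] for t in view["tools"] if not t["privileged"]]
print(canonical)
```

Privileged helpers stay visible but explicitly labeled, while private evaluation data has no path into the public view at all.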
The working abstraction ladder is:
open-ended goal
-> agent skill
-> composite action
-> semantic capability
-> environment primitive
-> execution backend
Default decision: improve or add a skill. Promote behavior into MCP only when multiple skills need it, the input/output shape is stable, public/private boundaries are clear, and traces can preserve the important substeps. The detailed profile and skill reference is docs/human/mcp-skills-and-semantic-profiles.md.
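The promotion rule above can be read as a simple all-conditions check. The sketch below is one hedged way to write it down; the `PromotionCandidate` fields and the reading of "multiple skills" as two or more are assumptions for illustration, not a documented project policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionCandidate:
    """Hypothetical checklist for one behavior considered for MCP."""
    skills_needing_it: int       # how many skills would use the capability
    io_shape_stable: bool        # input/output shape has stopped changing
    boundary_clear: bool         # public/private split is unambiguous
    traces_keep_substeps: bool   # traces still show the important substeps


def decide(candidate: PromotionCandidate) -> str:
    """Return the default decision unless every promotion condition holds."""
    ready = (
        candidate.skills_needing_it >= 2  # assumption: "multiple" means >= 2
        and candidate.io_shape_stable
        and candidate.boundary_clear
        and candidate.traces_keep_substeps
    )
    return "promote to MCP" if ready else "improve or add a skill"


print(decide(PromotionCandidate(1, True, True, True)))
print(decide(PromotionCandidate(3, True, True, True)))
```

Any single failing condition falls back to the default of improving or adding a skill.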
Install the project once:
    uv sync --extra dev --extra openclaw

For MolmoSpaces/MuJoCo cleanup demos, include the heavier extra:

    uv sync --extra dev --extra molmospaces

The public command grammar is:

    just task::run <task> <driver> [report|profile] [key=value ...]

For full command routing, profiles, and maintainer-only recipes, read just/README.md.
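To make the shape of the command grammar concrete, here is a small sketch that splits the arguments after `just task::run` into task, driver, an optional third positional, and `key=value` options. The `parse_task_run` function and the `mode` field name are hypothetical and exist only for this example; the repository's real routing is described in just/README.md and may behave differently.

```python
def parse_task_run(argv: list[str]) -> dict:
    """Split '<task> <driver> [report|profile] [key=value ...]' into parts."""
    if len(argv) < 2:
        raise ValueError(
            "expected: <task> <driver> [report|profile] [key=value ...]"
        )
    task, driver, *rest = argv

    # The optional third positional is any token without '='.
    mode = None
    if rest and "=" not in rest[0]:
        mode = rest.pop(0)

    # Everything left must be key=value pairs.
    options = {}
    for token in rest:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        options[key] = value

    return {"task": task, "driver": driver, "mode": mode, "options": options}


parsed = parse_task_run("coverage vlm visual agents=2 steps=100".split())
print(parsed)
```

Option values stay as strings in this sketch; any type conversion would be up to the task that consumes them.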
GitHub Actions publishes the report site at miaodx.com/roboclaws. If a link looks stale, check the CI workflow: Pages republishes from successful main runs.
| Demo | What it proves | Run it locally | Live CI report |
|---|---|---|---|
| AI2-THOR territory | Multiple robots compete for reachable cells in an iTHOR scene. | Local VLM route is being repaired; use mock/OpenClaw reports for now. | mock, Kimi smoke, OpenClaw |
| AI2-THOR coverage | Multiple robots cooperate to cover as much of the room as possible. | just task::run coverage vlm visual agents=2 steps=100 | mock, Kimi smoke, OpenClaw |
| OpenClaw navigation | OpenClaw Gateway agents control robots through the shared Roboclaws APIs. | just task::run ai2thor-nav openclaw visual | openclaw/demo/report.html |
| Coding-agent MCP control | Docker-backed Codex or Claude Code drives the robot directly through MCP tools. | just task::run ai2thor-nav codex visual or just task::run ai2thor-nav claude visual | Local-only today; reports write to output/runs/<stamp>/. |
| Photo task | A robot navigates the room and photographs chairs/sofas. | just task::run photo-chairs openclaw visual | Local/OpenClaw report artifact. |
| MolmoSpaces cleanup | A cleanup agent tidies a generated household mess while private scoring stays hidden. | just task::run molmo-cleanup direct world-labels seed=7 generated_mess_count=5 | Molmo live index, Kimi K2.6, MiMo v2.5 Pro, MiMo v2 Omni |
| MolmoSpaces live agent | Docker-backed Claude Code or Codex connects to the cleanup MCP server and produces the same cleanup report shape. | just task::run molmo-cleanup claude world-labels seed=7 generated_mess_count=5 | Same Molmo live index; CI currently runs Claude Code through Kimi/MiMo provider profiles. |
| Railway appliance | Single-container hosted demo with UI, viewer, Gateway, and AI2-THOR. | DEMO_PASSWORD=demo just appliance::run local | Local appliance surface. |
| Maintainer gate | Fast mock confidence check before shipping repo changes. | just agent::verify mock | CI status: workflow |
See ARCHITECTURE.md for the code map and the full operating mode contract.
| Need | Read |
|---|---|
| Code map and operating modes | ARCHITECTURE.md |
| Human setup/runbooks/domain docs | docs/human/README.md |
| Detailed MCP profile reference | docs/human/mcp-skills-and-semantic-profiles.md |
| Skill library convention | skills/README.md |
| Public command grammar | just/README.md |
| Local keys and report artifacts | docs/human/local-runtime.md |
| Coding-agent navigation guide | docs/human/coding-agent-nav-server.md |
| MolmoSpaces settings | docs/human/molmospaces-settings.md |
| Current project focus | STATUS.md |
| Agent operating rules | AGENTS.md |
- Roboharness - visual testing harness for AI coding agents in robot simulation
- Robowbc - whole-body-control experiments
- OpenClaw - open-source personal AI assistant
- ROSClaw - OpenClaw to ROS 2 bridge
- AI2-THOR - interactive 3D indoor simulation
MIT