Experimental runtime for safe tool-using agents.
A modular agent runtime for experimenting with tool-using LLM agents: ReAct-style execution, sandboxed tool calls, model routing, browser automation, memory services, and multi-service orchestration.
graph TD
User["User / Client"]
User --> Gateway["API Gateway"]
Gateway --> Runtime["Agent Runtime\n(ReAct Loop)"]
Runtime --> Sandbox["Tool Sandbox"]
Runtime --> Router["Model Router"]
Runtime --> Browser["Browser Automation"]
Runtime --> Memory["Memory Service"]
subgraph Security Layer
Sandbox
end
Sandbox --> FS["File System"]
Sandbox --> Shell["Shell"]
Sandbox --> HTTP["HTTP"]
Router --> Gemini["Gemini"]
Router --> Anthropic["Anthropic"]
Router --> OpenAI["OpenAI"]
Memory --> VectorStore["Vector Store"]
Memory --> History["Conversation History"]
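The Runtime node in the diagram is described as a ReAct loop. The following is a minimal sketch of that control flow, not the project's actual implementation: `runAgent`, `scriptedModel`, and the `tools` registry are hypothetical names, and the model is a stub that replays a fixed script instead of calling an LLM.

```javascript
// Minimal ReAct-style loop: the model proposes an action, the runtime runs the
// matching tool, and the observation is appended to the history for the next turn.
// All names below are hypothetical and chosen for illustration.

// Stand-in for a model call: replays a fixed list of actions.
function scriptedModel(steps) {
  let i = 0;
  return async (_history) => steps[Math.min(i++, steps.length - 1)];
}

// Hypothetical tool registry; a real runtime would dispatch these to the sandbox.
const tools = {
  add: async ({ a, b }) => a + b,
};

async function runAgent(task, model, toolset, maxSteps = 5) {
  const history = [{ role: "user", content: task }];
  for (let step = 0; step < maxSteps; step++) {
    const action = await model(history);
    if (action.type === "finish") {
      return { answer: action.answer, steps: step + 1, history };
    }
    const tool = toolset[action.tool];
    // Unknown tools become observations so the model can recover on the next turn.
    const observation = tool
      ? await tool(action.input)
      : `error: unknown tool "${action.tool}"`;
    history.push({ role: "assistant", content: action });
    history.push({ role: "tool", content: observation });
  }
  // Hard step budget so a confused model cannot loop forever.
  return { answer: null, steps: maxSteps, history };
}
```

The step budget is the one design choice worth noting: returning `answer: null` when it is exhausted lets the caller decide whether to retry, repair, or report a failure.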
brain-agent-platform/
├── gateway/ # HTTP + WebSocket gateway, session routing, rate limiting
├── runtime/ # ReAct agent loop, model routing, cost tracking
├── enterprise-runtime/ # Higher-level cognitive modules and adaptive routing experiments
├── agents/ # Critic, decomposer, repair, and strategy modules
├── core/ # Shared runtime primitives: budget, state, executor, etc.
├── sandbox/ # Tool execution sandbox: filesystem, shell, git, browser-related tools
├── browser-relay/ # Playwright browser automation service
├── desktop-relay/ # Desktop control and screenshot/vision relay
├── brain-os/ # Local OS daemon for process/window/filesystem control
├── os-control/ # OS control abstraction layer
├── vision-service/ # Optional Python vision pipeline
├── cron-service/ # Scheduled task service
├── memory-service/ # Memory management API
├── messaging-service/ # Messaging gateway experiments
├── device-relay/ # Camera/screen/location relay experiments
├── ui/ # React web chat interface
├── shared/ # Shared config, logging, and error utilities
├── tools/ # Tool implementations
├── validators/ # Goal checking, quality checks, schemas
├── protocols/ # LLM action schemas
├── tests/ # Unit/integration/e2e tests
├── scripts/ # Smoke tests and utilities
└── docs/ # Runbooks, component maps, integration notes
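The tree lists model routing and cost tracking as responsibilities of `runtime/`. One way these could fit together is ordered fallback across providers with a running cost total. This is a sketch under assumptions: `createRouter`, the provider object shape, and the per-token prices are hypothetical and do not reflect the real router or real pricing.

```javascript
// Ordered-fallback router with simple cost accounting.
// Provider objects are hypothetical stubs of the form
// { name, usdPer1kTokens, complete(prompt) -> Promise<{ text, tokens }> }.
function createRouter(providers) {
  const usage = { calls: 0, costUsd: 0, failures: [] };

  async function complete(prompt) {
    for (const provider of providers) {
      try {
        const { text, tokens } = await provider.complete(prompt);
        usage.calls += 1;
        usage.costUsd += (tokens / 1000) * provider.usdPer1kTokens;
        return { provider: provider.name, text };
      } catch (err) {
        // Record the failure and fall through to the next provider.
        usage.failures.push({ provider: provider.name, error: err.message });
      }
    }
    throw new Error("all providers failed");
  }

  return { complete, usage };
}
```

A real router would wrap the Gemini, Anthropic, and OpenAI clients behind the same `complete` shape and would likely choose the order per request rather than use a fixed list.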
| Service | Port | Purpose |
|---|---|---|
| Gateway | 4000 | HTTP/WebSocket entry point, session routing, rate limiting |
| Runtime | 3001 | Agent loop, model routing, tool orchestration, cost tracking |
| Browser Relay | 3003 | Playwright browser automation pool |
| Desktop Relay | 4003 | Screenshot, input, and optional vision integration |
| Brain OS Daemon | 8787 | Local process/window/filesystem control behind approval policy |
| Vision Service | 5100 | Optional object detection / vision pipeline |
| Cron Service | 4004 | Scheduled tasks |
| UI | 5173 | React web chat interface |
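For scripts that need to reach these services, the table can be captured as a small lookup. The ports below are copied from the table; the `localhost` default, the `serviceUrl` helper, and the environment-variable naming scheme (for example `GATEWAY_URL`) are assumptions made for illustration.

```javascript
// Default local ports, taken from the service table.
const SERVICE_PORTS = {
  gateway: 4000,
  runtime: 3001,
  browserRelay: 3003,
  desktopRelay: 4003,
  brainOsDaemon: 8787,
  visionService: 5100,
  cronService: 4004,
  ui: 5173,
};

// Resolve a base URL for a service. An environment variable such as
// BROWSER_RELAY_URL (hypothetical naming) overrides the local default.
function serviceUrl(name, env = process.env) {
  const port = SERVICE_PORTS[name];
  if (port === undefined) {
    throw new Error(`unknown service: ${name}`);
  }
  const envKey = name.replace(/([A-Z])/g, "_$1").toUpperCase() + "_URL";
  return env[envKey] || `http://localhost:${port}`;
}
```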
Requirements:
- Node.js 18+
- npm
- Python 3.8+ for optional vision-service components
# 1. Start OS daemon
cd brain-os
npm install
npm run build
BRAIN_APPROVAL_TOKEN=DEV-OK npm start
# 2. Start runtime and gateway from the repository root (each server is long-running; use a separate terminal per process)
cd ..
npm install
node runtime/server.js
node gateway/server.js
# 3. Run smoke test
BRAIN_APPROVAL_TOKEN=DEV-OK ./scripts/e2e_smoke.sh
See docs/RUN_LOCAL.md for the full local setup.
npm test
npm run test:unit
npm run test:integration
BRAIN_APPROVAL_TOKEN=DEV-OK ./scripts/e2e_smoke.sh
The most important tests to keep strong are:
- sandbox permission boundaries;
- tool execution safety;
- gateway/runtime integration;
- model routing behavior;
- end-to-end agent task execution;
- failure recovery and retry paths.
The project is designed around the assumption that tool-using agents need explicit execution boundaries.
Key design goals:
- require approval tokens for sensitive local actions;
- isolate service responsibilities;
- keep filesystem, shell, browser, and OS-control tools behind sandbox modules;
- make tool execution auditable;
- treat destructive operations as policy-gated actions;
- separate runtime orchestration from tool execution.
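Several of these goals (approval tokens, policy-gated destructive actions, auditable execution) can be combined in one small executor. The sketch below is an assumption about how that might look, not the project's sandbox code: `POLICY`, `createExecutor`, and the tool names are hypothetical. The only detail taken from this README is that an approval token such as the `BRAIN_APPROVAL_TOKEN` value is supplied by the operator.

```javascript
// Hypothetical policy table: which tools need an approval token.
const POLICY = {
  "fs.read": { requiresApproval: false },
  "fs.delete": { requiresApproval: true },
  "shell.exec": { requiresApproval: true },
};

function createExecutor({ approvalToken, tools, auditLog = [] }) {
  return {
    auditLog,
    execute(toolName, input, presentedToken) {
      const rule = POLICY[toolName];
      const entry = { tool: toolName, input, at: new Date().toISOString() };

      // Default-deny: anything missing from the policy table or registry never runs.
      if (!rule || !tools[toolName]) {
        auditLog.push({ ...entry, outcome: "denied:unknown-tool" });
        return { ok: false, reason: "unknown tool" };
      }
      // Sensitive tools need a configured token and a matching presented token.
      // (A plain comparison is used for brevity; it is not timing-safe.)
      if (rule.requiresApproval && (!approvalToken || presentedToken !== approvalToken)) {
        auditLog.push({ ...entry, outcome: "denied:approval-required" });
        return { ok: false, reason: "approval required" };
      }
      const result = tools[toolName](input);
      auditLog.push({ ...entry, outcome: "allowed" });
      return { ok: true, result };
    },
  };
}
```

Every call appends to `auditLog`, including denied ones, so the log records what the agent attempted as well as what it was allowed to do.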
This is not presented as production-hardened security software. It is a working research/prototype platform for studying agent runtime design and safer tool execution patterns.
- The lint, format, and docs scripts are stubs (echo-only) and need real implementations.
- The platform is a prototype. Security boundaries are designed but not production-hardened.
- Browser relay requires local Playwright installation.
- No production authentication or multi-tenant support.
- Cost tracking is experimental.
- Enterprise runtime modules are early-stage experiments.
- Add CI status badge and GitHub Actions workflow. ✅
- Add architecture diagram. ✅
- Add example tasks for model routing and browser automation.
- Expand unit tests around sandbox and policy enforcement.
- docs/DESIGN.md: architecture decisions, trade-offs, and lessons learned
- docs/RUN_LOCAL.md: local setup and runbook
- docs/COMPONENT_MAP.md: component map
- docs/INTEGRATION_OS_DAEMON.md: OS daemon integration notes
- Agent runtime architecture with tool execution boundaries.
- Sandboxed tool use with policy-gated actions.
- Multi-service orchestration (gateway, runtime, browser, desktop, memory, cron).
- Local-first development with auditable tool execution.
MIT