XR — The AI Agent You Can Actually Trust

▀▄▀ █▀█
█░█ █▀▄

BYOK · local-first · spend-capped · tamper-evident · memory engine · research engine · voice stack · plugin platform · MCP-ready · multi-agent runtime · supervisor workflows · offline-capable · safe computer control · universal provider engine

You bring the key. We ship none. XR runs on your provider API key or your local model — it costs us $0 to maintain and you $0 to trust.

🚀 Install XR

# Linux / macOS / Termux / WSL
curl -fsSL https://raw.githubusercontent.com/ahmadrrrtx/xr/main/install.sh | bash

# Windows PowerShell
iex (irm https://raw.githubusercontent.com/ahmadrrrtx/xr/main/install.ps1)

# After install — first time setup
xr onboarding        # guided setup wizard (incl. memory + optional voice)
xr doctor            # full health check (incl. memory + research + voice + plugins)
xr plugins search    # discover safe, permissioned plugins
xr agents list       # inspect the built-in multi-agent workforce
xr agents plan "refactor this repo safely"
xr voice setup       # optional local-first voice setup
xr "hello, XR"       # run your first task
xr --tui             # open interactive terminal UI
xr serve             # start local dashboard + chat in browser

✨ What Makes XR Different

	Most AI agents	XR
Provider	locked to vendor	BYOK — any of 20+ providers, or fully local via Ollama, LM Studio, llama.cpp, Jan, LocalAI, vLLM, GPT4All, KoboldCPP, Text Generation WebUI, SGLang
Cost	"soft" warnings	hard ceiling enforced in code (`checkBeforeStep()`)
Security	trust us	deterministic injection benchmark, signed block-rate report
Audit	scrollback only	SHA-256 hash chain — tamper-evident, offline, free
Terminal UI	raw prompts	Claude Code–style TUI — spinner, history, status bar, slash commands
Browser UI	cloud dashboard	self-hosted chat + dashboard at `localhost:3141`
Computer Control	wild west	safe-by-construction — classify → preview → approve → audit
Multi-step planner	hidden prompts	typed Action[] schema validated with Zod, every step previewed
Plan memory	none	cached deterministic plans — second run skips the LLM
Durable memory	silent auto-save, creepy	explicit-by-default — XR only remembers what you ask; live "remember this?" with consent
Memory recall	injects everything, opaque	explainable — shows match-% + why; conservative floor, never floods the prompt
Memory hygiene	grows forever	TTL/expiry + prune + access tracking — see what's stale, delete permanently
Research	answer-first summaries	source-first Research Engine — live discovery, trust/freshness ranking, evidence ledger, claims, contradictions, signed reports
Dashboard	cloud-only	127.0.0.1 only, token-authed, live approvals, no telemetry
Voice	silent cloud listener	Stage 8 Voice Stack — disabled by default, push-to-talk default, local Whisper/Piper/Kokoro/system adapters, explicit cloud consent
Extensibility	arbitrary packages or hardcoded integrations	Stage 10 Plugin Platform + Stage 11 MCP Platform — first-class standardized MCP (tools, resources, prompts), opt-in servers, explicit permissions & trust, health checks, approval-gated invocation, clean lifecycle
Multi-agent orchestration	one big agent with tool spam	Stage 12 Multi-Agent Runtime — supervisor, planner, researcher, builder, reviewer, executor, synthesizer, memory manager, security checker
Runtime	procedural script	AI OS Kernel with DI, Lifecycle management, and a persisted multi-agent workflow store

🖥️ Stage 5 — User Interfaces

Stage 5 gives XR a complete, polished UI layer across every user-facing surface. XR now feels like a real product.

Terminal Interface (Claude Code–style TUI)

xr --tui

▀▄▀ █▀█  XR — The AI Agent You Can Actually Trust
█░█ █▀▄  by @rrrtx · local-first · BYOK · spend-capped · secure

  Project: my-project  Stack: TypeScript, Bun

  provider: ollama  │  model: qwen2.5:7b  │  mode: agent

  Type a message to talk to XR. Use /help to see all commands.
  Quick: /ask  /plan  /model  /status  /dashboard

  xr [agent] ›  _

What the TUI provides:

✦ Claude Code–style star-burst spinner while XR thinks (· ✻ ✽ ✶ ✳ ✢)
✦ Command history navigation (↑/↓ arrows, 200 entries)
✦ Provider / model / mode / budget status bar on every prompt
✦ Structured tool-call display with live status icons
✦ Ctrl+C interrupt → safe recovery (not immediate exit)
✦ xr.md / .xrrc / CLAUDE.md project context auto-load
✦ Graceful non-TTY degradation

All slash commands — grouped by category:

Chat & Tasks         /ask  /plan  /mode  /model  /budget
Navigation           /dashboard  /chat
System               /status  /doctor  /cost  /index  /help  /clear  /exit
Tools                /memory  /shell
Security             /attacks  /verify-log  /export
Local AI             /skills

Browser Chat Interface

xr serve
# Opens: http://localhost:3141/chat?token=<TOKEN>

A full ChatGPT-style chat interface running locally in your browser:

Streaming SSE responses, token-by-token
XR branding — not a generic template
Slash command hint chips: /plan /status /research /ask /memory /budget
Markdown rendering (code blocks, bold, inline code)
Typing indicator, clear button, keyboard shortcuts
Graceful error states with recovery tips

Browser Dashboard

xr serve
# Opens: http://localhost:3141/?token=<TOKEN>

A mission-control dashboard with 12 navigation panels:

Panel	What it shows
Dashboard	4 stat cards (spend, security score, audit chain, skills) + provider health + local AI + memory + recent audit
Chat	Full streaming chat UI
Status	Complete system health grid
Providers	All 12+ providers with status, tier, key configuration
Models	Local runtime status, installed models
Memory	Health cards (total/expired/never-recalled), live search, all entries with inline delete, expiry badges
Research	Research mode quick reference
Plugins	Installed plugins, permissions, enabled state, trust, health, catalog search, enable/disable/remove actions
Voice	Voice control quick reference
Security	Injection lab (run-on-demand), egress list, security posture
Audit Log	Full SHA-256 chain with integrity badge
Settings	Privacy, budget, approval gates, CLI reference

Dashboard UX features:

Command palette: press ? or ⌘K
Keyboard shortcuts: g d = Dashboard, g c = Chat, g s = Security, g a = Audit
Live 30-second auto-refresh
Toast notifications for all async actions
Zero external dependencies — works fully offline

Design System (`src/ui/`)

All terminal styling now routes through a single design-token layer:

src/ui/
  theme.ts      Brand palette, ANSI codes, CSS variables, spinner frames
  spinner.ts    Spinner, ProgressBar, StepTracker
  layout.ts     banner(), kv(), table(), box(), helpPanel(), notify()
  index.ts      Public re-export

XR brand identity:

Primary: #00D4FF cyan
Success / local: #00FF88 green
Warning / cloud: #F59E0B amber
Background: #0A0A0F
Logo: ▀▄▀ █▀█ / █░█ █▀▄ (block-char ASCII art spelling XR)

🧠 Stage 6 — The Memory Engine

Stage 6 makes XR stateful, personal, and durable — without becoming creepy, noisy, or unsafe. XR now remembers your preferences, projects, workflows, and long-lived context in a way that is transparent and controllable.

Design principles

Explicit by default — XR only remembers what you ask it to. No silent auto-save.
Local-first — all memory lives in ~/.xr/xr.db. Nothing leaves your machine.
Explainable — recall shows the match percentage and why each entry surfaced.
Reversible — everything is editable, deletable, exportable, and importable.
Non-creepy — conservative retrieval; session summaries are off by default.
Private — audit log stores content length only, never raw content; secrets redacted.
Kill-switch — XR_MEMORY_DISABLED=1 turns memory off completely.

Memory commands

# Inspect
xr memory                          # status + counts by category
xr memory list [--scope s] [--category c] [--json]
xr memory search "typescript"      # keyword search
xr memory recall "what runtime"    # EXACTLY what XR would surface + WHY (match %)
xr memory health                   # expired, never-recalled, by category
xr memory reindex                  # warm the semantic-recall cache

# Write (explicit, consent-gated)
xr memory add "I prefer TypeScript" --category preference
xr memory add "temp note" --ttl 3600          # expires after 1 hour
xr memory add "tmp" --ttl-days 7              # expires after 7 days
xr memory edit mem_ab12 "new text"
xr memory remove mem_ab12                      # permanent
xr memory clear [--scope s]                    # permanent

# Maintain
xr memory prune                   # permanently delete expired entries
xr memory summarize [--days 30]   # fold old, low-importance entries into compact summaries

# Portability
xr memory export [path]           # JSON bundle
xr memory import <path>           # merge (dedupes; drops stale entries)

Live capture — "remember this?"

In the TUI, chat, and voice, XR intercepts memory intents naturally:

you: remember I prefer dark mode for everything
xr: ✓ remembered: I prefer dark mode for everything (preference)

you: what do you remember about my preferences?
xr: here's what I remember (1):
       • I prefer dark mode for everything

you: don't remember my email address
xr: ✓ got it — I won't remember that.           # no prompt needed (reduces stored data)

you: forget the note about vim
xr: ✓ forgotten 1 entry.

Durable adds ask for consent first (autoSuggest).
"Don't remember" and "forget" never prompt — reducing stored data is always honoured.
"What do you remember" is read-only recall.

Memory categories & scopes

Category	Use for	Surfaced in recall?
`preference`	coding style, provider, tools	✅
`project`	long-running project context	✅
`workflow`	repeated procedures	✅
`fact`	stable long-term facts	✅
`exclusion`	do-not-remember rules	❌ (never surfaced)

Scopes separate personal memory (global) from project memory (per-directory), so your TypeScript preferences don't leak into a Python project.

Retention, expiry & hygiene

Per-entry TTL — --ttl / --ttl-days make an entry auto-expire.
Expired = forgotten — expired entries are excluded from recall and list; they show up only when you pass --include-expired to xr memory list.
xr memory prune — permanently deletes expired entries.
Access tracking — every entry records lastAccessedAt + accessCount, so xr memory health can show you what's never been used.
Import safety — importing an already-expired entry drops it (no silent resurrection of stale memory).
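The expiry rules above can be sketched in a few functions. This is an illustration under assumptions, not XR's storage code: the `MemoryEntry` shape, the `expiresAt` field, and every function name are hypothetical (only `lastAccessedAt` and `accessCount` are named in this README), and deduplication by `id` is an assumed merge rule.

```typescript
// Hypothetical entry shape; only lastAccessedAt and accessCount appear in the README.
interface MemoryEntry {
  id: string;
  text: string;
  expiresAt: number | null; // epoch milliseconds, null = never expires
  lastAccessedAt: number | null;
  accessCount: number;
}

function isExpired(entry: MemoryEntry, now: number): boolean {
  return entry.expiresAt !== null && entry.expiresAt <= now;
}

// --ttl is given in seconds and --ttl-days in days; both become an absolute expiry time.
function expiryFrom(now: number, opts: { ttlSeconds?: number; ttlDays?: number }): number | null {
  if (opts.ttlSeconds !== undefined) return now + opts.ttlSeconds * 1000;
  if (opts.ttlDays !== undefined) return now + opts.ttlDays * 86400000;
  return null;
}

// Recall and list only see entries that have not expired.
function visibleEntries(entries: MemoryEntry[], now: number): MemoryEntry[] {
  return entries.filter((e) => !isExpired(e, now));
}

// Prune permanently drops expired entries (same filter, but the result is persisted).
function prune(entries: MemoryEntry[], now: number): MemoryEntry[] {
  return visibleEntries(entries, now);
}

// Import skips ids that already exist (assumed dedupe rule) and drops
// entries that are already expired, so stale memory is never resurrected.
function importEntries(existing: MemoryEntry[], incoming: MemoryEntry[], now: number): MemoryEntry[] {
  const knownIds = new Set(existing.map((e) => e.id));
  const fresh = incoming.filter((e) => !isExpired(e, now) && !knownIds.has(e.id));
  return [...existing, ...fresh];
}
```

Passing `now` explicitly keeps every function pure, which makes expiry behaviour easy to test without mocking the clock.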

Explainable retrieval

Recall is never a black box. Every surfaced entry comes with a score and a reason:

🧠 Recall "what typescript runtime" (1)
  mem_177cce4b 27% preference I prefer TypeScript and Bun for backend work
      why: lexical match 27% · scope=global

Retrieval uses semantic embeddings (Ollama nomic-embed-text) when available, with an automatic lexical fallback so it always works — even fully offline.
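To make the lexical fallback concrete, here is a minimal sketch of a token-overlap scorer with a conservative floor. It is an assumption-laden illustration: XR's actual scoring formula and threshold are not documented in this README, and `lexicalRecall`, `tokens`, the `floor` default of 0.2, and the `Scored` shape are all hypothetical. The `limit` default mirrors the `recallLimit` setting in the memory config.

```typescript
// Hypothetical lexical scorer: fraction of query tokens that appear in an entry.
interface Scored {
  id: string;
  text: string;
  score: number; // 0..1
  why: string;
}

function tokens(s: string): Set<string> {
  return new Set(s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 1));
}

function lexicalRecall(
  query: string,
  entries: { id: string; text: string }[],
  floor: number = 0.2, // assumed conservative floor: weak matches are never injected
  limit: number = 5, // assumed to correspond to recallLimit
): Scored[] {
  const q = tokens(query);
  if (q.size === 0) return [];
  return entries
    .map((e) => {
      const t = tokens(e.text);
      let hits = 0;
      q.forEach((word) => {
        if (t.has(word)) hits++;
      });
      const score = hits / q.size;
      return { id: e.id, text: e.text, score, why: `lexical match ${Math.round(score * 100)}%` };
    })
    .filter((r) => r.score >= floor)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Because the reason string is produced alongside the score, every surfaced entry can explain itself, and entries below the floor are dropped rather than padded into the prompt.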

Session summaries (opt-in)

Conversations can fold into compact session summaries, kept in a separate store so the agent never confuses ephemeral chat recaps with durable facts. Off by default (memory.saveSessionSummaries).

Dashboard memory panel

The dashboard Memory panel now includes health cards (total / expired / never-recalled), a live search box, expiry badges, and inline delete. xr doctor includes a memory-health row.

Memory config

// ~/.xr/config.json
{
  "memory": {
    "enabled": true,              // master switch
    "autoSuggest": true,          // offer to remember in chat/voice (asks first)
    "injectInChat": true,         // inject relevant memory into prompts
    "recallLimit": 5,             // max entries surfaced per prompt
    "semanticRecall": true,       // embeddings (with lexical fallback)
    "autoExpireDays": 0,          // 0 = never auto-expire
    "saveSessionSummaries": false,// off by default (non-creepy)
    "sessionSummaryMinTurns": 6
  }
}

🔬 Stage 7 — The Research Engine

Stage 7 turns XR from a chat wrapper into a source-first research system. XR gathers sources before forming conclusions, tracks where every claim came from, marks uncertainty, detects contradictions, and exports signed research artifacts.

Research principles

Source-first, not answer-first — sources are discovered, ranked, fetched, and checked before synthesis.
No fabricated citations — reports only cite collected source IDs like [s1]; unknown source IDs are stripped.
Evidence ledger — every evidence block tracks source, quote, claim kind, confidence, strength, verification state, and extraction time.
Claim ledger — supported, weak, unverified, and contested claims are tracked separately from prose.
Contradiction log — disagreements are surfaced instead of hidden.
Freshness-aware — sources track Last-Modified or apparent dates, freshness labels, last verification, and refresh history.
Safe live web — default egress allow-list remains fail-closed; broad public fetch requires explicit --allow-public-web.
Auditable exports — Markdown reports are signed with a SHA-256 footer and paired with a JSON sidecar.
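Two of the principles above, stripping unknown citations and signing the exported report, are simple enough to sketch. The helper names and the footer layout below are assumptions for illustration; the README only states that unknown source IDs are stripped and that the Markdown export carries a SHA-256 footer.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: keeps [sN] markers only when N refers to a collected source.
function stripUnknownCitations(report: string, knownIds: Set<string>): string {
  return report.replace(/\s*\[(s\d+)\]/g, (match: string, id: string) =>
    knownIds.has(id) ? match : "",
  );
}

// Hypothetical footer format: the digest covers the report body only.
function signReport(markdown: string): string {
  const digest = createHash("sha256").update(markdown, "utf8").digest("hex");
  return `${markdown}\n\nsha256: ${digest}\n`;
}

// Recomputes the digest over the body and compares it with the footer.
function verifyReport(signed: string): boolean {
  const marker = "\n\nsha256: ";
  const at = signed.lastIndexOf(marker);
  if (at < 0) return false;
  const body = signed.slice(0, at);
  const claimed = signed.slice(at + marker.length).trim();
  return createHash("sha256").update(body, "utf8").digest("hex") === claimed;
}
```

A plain digest like this makes edits detectable by anyone holding the file; it does not prove who produced the report, which would need a keyed signature.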

Research commands

# Run research
xr research "topic"                         # quick research
xr research quick "topic"                   # fast source-first pass
xr research deep "topic"                    # deeper source discovery + synthesis
xr research compare "A vs B"                # comparison workflow + matrix
xr research factcheck "claim"               # verify a claim against sources
xr research briefing "topic"                # briefing-style deep report

# Inspect the workflow
xr research plan "topic"                    # collaborative research plan
xr research status [id]                     # session status
xr research sources [id]                    # source list + trust/freshness
xr research evidence [id]                   # evidence ledger + quotes
xr research claims [id]                     # claim ledger
xr research contradictions [id]             # contradiction log
xr research list                            # recent research sessions

# Maintain and export
xr research summarize [id]                  # regenerate synthesis from evidence
xr research refresh [id]                    # re-check sources and refresh changed evidence
xr research export [id] [path]              # signed Markdown + JSON sidecar
xr research remember [id]                   # explicitly save finding to durable memory

Live research safety

By default, XR can search through the configured SearXNG host and fetch only allow-listed domains. To fetch public web pages returned by search, opt in explicitly:

xr research deep "Gemini Deep Research MCP support" --allow-public-web
xr research deep "topic" --allow-public-web --live-sources-only

--allow-public-web still blocks localhost, private IP ranges, link-local addresses, unsafe redirects, non-HTTP(S) URLs, and oversized responses.
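The URL-level part of these checks could look roughly like the sketch below. `isPublicFetchAllowed` is a hypothetical name, not an XR API, and the sketch makes simplifying assumptions: it rejects every IPv6 literal instead of parsing ranges, and it does not cover redirects, response size, or hostnames that resolve to private addresses, all of which have to be enforced at fetch time.

```typescript
// Hypothetical pre-fetch check; fails closed on anything it cannot classify.
function isPublicFetchAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is rejected
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  // IPv6 literals keep their brackets in URL.hostname; this sketch rejects them all.
  if (host.startsWith("[")) return false;

  const m = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(host);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 0 || a === 10 || a === 127) return false; // this-network, private, loopback
    if (a === 169 && b === 254) return false; // link-local
    if (a === 172 && b >= 16 && b <= 31) return false; // private
    if (a === 192 && b === 168) return false; // private
  }
  return true;
}
```

Running the check again on every redirect target, rather than only on the first URL, is what keeps an allowed public page from bouncing the fetcher into a private address.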

Research data model

Every ResearchSession stores:

id, topic, query, mode, status, createdAt, updatedAt
plan with research questions, search queries, strategy, source requirements
ranked sources with trust, relevance, quality, type, freshness, last verification
evidence / notes with quotes, claim kind, confidence, strength, verified flag
claims, contradictions, summary, finalReport
reportVersions, refreshHistory, optional comparison
tags, projectId, lastRefreshedAt, exportPath

Doctor integration

xr doctor
xr doctor --json

Doctor now includes Research Engine health: total sessions, latest session state, and next inspection commands.

🎙️ Stage 8 — The Voice Stack

Stage 8 gives XR a privacy-respecting voice interface: talk to XR naturally, hear XR respond, interrupt it, and safely trigger XR actions by voice. It is designed as a first-class subsystem, not a toy chatbot mode.

Voice principles

Disabled by default — XR never silently turns on your microphone.
Push-to-talk by default — wake-word and always-listen modes are opt-in.
Local-first — prefers local Whisper / whisper.cpp STT and Piper / Kokoro / system TTS.
Explicit cloud consent — Groq/OpenAI STT only run when cloud audio is explicitly allowed.
Safe computer control — voice actions still pass through XR's risk classifier, preview, approval, and audit layers.
Interruption-aware — stop, cancel, repeat, say again, mute voice, and barge-in are handled directly.
Private transcripts — voice history is not persisted unless you choose the local-private transcript policy.
Text fallback — missing microphones, speakers, STT, or TTS degrade safely back to text mode.

Voice commands

xr voice status              # privacy, mode, STT/TTS, device health
xr voice setup               # guided optional setup
xr voice devices             # list microphones and speakers
xr voice test                # record → VAD → STT → TTS loopback
xr voice start               # push-to-talk voice loop
xr voice start --wake-word   # opt-in wake-word transcript gating
xr voice start --always-listen # explicit confirmation required
xr voice stop                # disable voice and always-listen
xr voice config --stt whisper-cli --tts piper
xr speak "hello from XR"     # speak text once
xr listen                    # listen once and print transcript

Supported local-first adapters

Layer	Backends
Microphone	`ffmpeg`, `arecord`, `rec`
Speaker	`ffplay`, `afplay`, `aplay`, `paplay`, `play`, PowerShell
STT	`auto`, local HTTP, `whisper-cli`, `whispercpp`, explicit `groq` / `openai`, `disabled`
TTS	`auto`, local HTTP, `piper`, `kokoro-cli`, `system`, `say`, `espeak`, PowerShell, `disabled`
VAD	local energy VAD now; Silero/openWakeWord-compatible extension points
Wake	transcript-side wake phrase now; external openWakeWord-compatible extension point

Voice config

// ~/.xr/config.json
{
  "voice": {
    "enabled": false,
    "mode": "push-to-talk",
    "inputDevice": "default",
    "outputDevice": "default",
    "sttBackend": "auto",
    "sttModel": "base.en",
    "ttsBackend": "auto",
    "ttsVoice": "default",
    "wakeWord": "hey xr",
    "pushToTalkKey": "enter",
    "alwaysListen": false,
    "interruptionPolicy": "barge-in",
    "confirmationPolicy": "always-risky",
    "transcriptPolicy": "session",
    "fallbackTextMode": true,
    "allowCloudStt": false,
    "allowCloudTts": false
  }
}

Voice can safely trigger XR capabilities

Voice can route to:

open app, open website, type text, click, scroll, press, focus window
research requests: “research local-first voice assistants”
memory: “remember I prefer short answers” / “what do you remember about TypeScript?”
provider/model switching: “switch provider to ollama”, “switch model to qwen2.5:7b”
budget questions: “what is my budget?”
normal XR agent tasks

High-risk actions still require confirmation. Unknown or unsafe states fail closed.

Doctor integration

xr doctor
xr doctor --json

Doctor includes Voice Stack health: capture tools, playback tools, device count, STT/TTS adapter status, mode, and privacy posture.

🏛️ v1.0 Foundation Runtime — AI OS Kernel

XR has evolved into a True AI Operating System. The v1.0 kernel introduces:

Service Container (DI) — lightweight dependency injection managing Agent, Budget, Provider, Plugins with a strictly controlled lifecycle
Lifecycle Management — formal Bootstrap → Start → Stop sequence
Specialized Store Architecture — Session, Audit, Memory, Cost, Skill stores decomposed from the monolithic DB
Command Registry — decoupled CLI commands; adding capabilities requires no core router changes
Event-Driven Core — internal Event Bus for decoupled async service communication
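As a rough picture of the Bootstrap → Start → Stop sequence, the sketch below shows a tiny container that enforces the order of lifecycle calls. The `Kernel` class, the `Service` interface, and the state names are assumptions for illustration; XR's actual kernel API is not shown in this README.

```typescript
// Hypothetical service contract and container.
interface Service {
  name: string;
  start(): void;
  stop(): void;
}

type KernelState = "created" | "bootstrapped" | "started" | "stopped";

class Kernel {
  private services: Service[] = [];
  private state: KernelState = "created";

  register(service: Service): void {
    if (this.state !== "created") throw new Error("services must be registered before bootstrap");
    this.services.push(service);
  }

  bootstrap(): void {
    if (this.state !== "created") throw new Error(`cannot bootstrap from ${this.state}`);
    this.state = "bootstrapped";
  }

  // Services start in registration order.
  start(): void {
    if (this.state !== "bootstrapped") throw new Error(`cannot start from ${this.state}`);
    for (const s of this.services) s.start();
    this.state = "started";
  }

  // Services stop in reverse order, so dependents shut down before their dependencies.
  stop(): void {
    if (this.state !== "started") throw new Error(`cannot stop from ${this.state}`);
    for (const s of [...this.services].reverse()) s.stop();
    this.state = "stopped";
  }
}
```

Rejecting out-of-order calls with an error, instead of silently ignoring them, is one way to keep the lifecycle strictly controlled.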

🎯 Core Features

🖥 Safe Computer Control (v0.8 → v0.8.2)

xr control start                                          # opt-in (off by default)
xr control plan "open github.com and search for ahmadrrrtx" --yes

Four execution layers, all enforced in code:

Action schema — every action is a typed, Zod-validated Action variant
Risk classifier — pure function returns safe | sensitive | destructive
Approval gate — safe runs immediately; sensitive prompts (unless auto-approved with --yes); destructive always prompts
Hash-chained audit — every plan, exec, denial, memory hit is tamper-evident

Approvals work from both the CLI prompt and the dashboard "Approve / Deny" buttons.
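The classifier and approval gate can be pictured with the sketch below. It is not XR's schema: the action variants, their risk assignments, and the function names are hypothetical, and a plain discriminated union stands in for the Zod validation so the example stays dependency-free. Only the three risk levels and the `--yes` behaviour come from this README.

```typescript
// Hypothetical action variants; XR validates its real Action schema with Zod.
type Action =
  | { kind: "open_url"; url: string }
  | { kind: "type_text"; text: string }
  | { kind: "write_file"; path: string }
  | { kind: "delete"; path: string };

type Risk = "safe" | "sensitive" | "destructive";

// Pure function: the same action always yields the same risk, with no I/O.
// The mapping below is an assumed example, not XR's real policy.
function classify(action: Action): Risk {
  switch (action.kind) {
    case "open_url":
      return "safe";
    case "type_text":
    case "write_file":
      return "sensitive";
    case "delete":
      return "destructive";
  }
}

type Decision = "run" | "prompt";

// autoApproveSensitive models --yes: it skips the prompt for sensitive
// actions only. Destructive actions always prompt.
function gate(risk: Risk, autoApproveSensitive: boolean): Decision {
  if (risk === "safe") return "run";
  if (risk === "sensitive" && autoApproveSensitive) return "run";
  return "prompt";
}
```

Keeping `classify` free of side effects means the same function can drive the preview, the approval prompt, and the audit record without the three ever disagreeing.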

🧭 Multi-Step Planner (v0.8.1)

xr control plan "fill the contact form on example.com"         # dry-run
xr control plan "fill the contact form on example.com" --step  # confirm each
xr control plan "fill the contact form on example.com" --yes   # auto-approve sensitive

🌐 Browser Automation (Playwright, v0.8.1)

xr control browser status   # check Playwright availability
xr control browser install  # one-shot: install + chromium (~150 MB)

🧠 Plan Memory (v0.8.2)

xr control plan "open github notifications" --yes  # first run: LLM plans (~$0.002)
xr control plan "open github notifications" --yes  # next run: ⚡ recalled, $0.00
xr control memory list
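One way to get the "second run skips the LLM" behaviour is a cache keyed by a hash of the normalized goal. The sketch below assumes exactly that; `PlanMemory`, `planKey`, the normalization rule, and the string-array `Plan` type are hypothetical stand-ins, not XR's implementation.

```typescript
import { createHash } from "node:crypto";

// Stand-in for XR's typed Action[]; a real plan would carry validated actions.
type Plan = string[];

// Normalizes case and whitespace so trivially different phrasings share a key.
function planKey(goal: string): string {
  const normalized = goal.trim().toLowerCase().replace(/\s+/g, " ");
  return createHash("sha256").update(normalized).digest("hex");
}

class PlanMemory {
  private plans = new Map<string, Plan>();
  llmCalls = 0; // counts cache misses, for illustration

  // planWithLlm is only invoked on a cache miss.
  getPlan(goal: string, planWithLlm: (goal: string) => Plan): Plan {
    const key = planKey(goal);
    const cached = this.plans.get(key);
    if (cached) return cached;
    this.llmCalls++;
    const plan = planWithLlm(goal);
    this.plans.set(key, plan);
    return plan;
  }
}
```

A recalled plan still contains the same actions as the original, so it would pass through the same preview and approval steps as a freshly planned one.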

🧠 Durable Memory (v0.9 → Stage 6)

xr memory add "I prefer TypeScript and Bun" --category preference
xr memory add "this project is called XR" --category project --scope xr
xr memory list             # see everything XR remembers
xr memory recall "what do I prefer?"    # shows match % + WHY
xr memory search "bun"
xr memory health           # expired, never-recalled, by category
xr memory prune            # delete expired entries permanently
xr memory edit mem_ab12 "prefer Bun + Zod"
xr memory remove mem_ab12
xr memory clear            # forget everything (asks first)
xr memory export memories.json
xr memory import memories.json

XR remembers only what you explicitly tell it to. Everything is local-first, inspectable, editable, and permanently deletable. See Stage 6 — The Memory Engine for the full feature set: live capture, explainable recall, TTL/expiry, session summaries, and memory health.

🏠 Stage 4 — Local AI Runtime Manager

xr models                     # local AI status
xr models runtimes            # detect Ollama, LM Studio, Jan, llama.cpp, vLLM...
xr models recommend [use-case] # hardware-aware recommendation
xr models install [model]     # safe Ollama setup/pull with approval
xr models set <runtime> <model>
xr models test [model]        # local inference smoke test

Supported local runtimes: Ollama (auto-install) · LM Studio · llama.cpp · Jan · LocalAI · vLLM · GPT4All · KoboldCPP · Text Generation WebUI · SGLang · any OpenAI-compatible endpoint.

🔬 Research Engine (Stage 7)

xr research "compare Rust vs Go for embedded development"
xr research deep "best self-hosted alternatives to Cloudflare Tunnel" --allow-public-web
xr research compare "OpenAI Deep Research vs Gemini Deep Research"
xr research factcheck "Gemini Deep Research supports MCP servers"
xr research evidence      # inspect evidence ledger
xr research claims        # inspect claim ledger
xr research contradictions
xr research refresh       # re-check source freshness
xr research export        # signed Markdown + JSON sidecar

See Stage 7 — The Research Engine for the full workflow: live source discovery, trust/freshness ranking, evidence extraction, contradiction detection, comparison matrices, refresh history, and signed reports.

🧩 Stage 10 — Plugin Platform

xr plugins search                     # catalog metadata search
xr plugins inspect ./plugins/github   # manifest + permissions, no code execution
xr plugins install ./plugins/github   # shows permissions, asks to approve
xr plugins enable github              # explicit activation
xr plugin github repo ahmadrrrtx/xr   # run a plugin command
xr plugins permissions github         # requested vs granted permissions
xr plugins doctor                     # health/trust check

Plugins are opt-in, permissioned, inspectable, and disableable. XR records entrypoint + whole-tree hashes at install, rejects tampered plugins as untrusted, blocks common ambient-authority imports (node:fs, child_process, raw fetch, process.env, dynamic eval), and forces approval for tools from plugins with sensitive grants. Plugins can ship tools, commands, skills, MCP-backed tools, UI metadata, provider adapters, workflow packs, research packs, voice packs, security packs, business packs, and developer packs without editing XR core.
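The install-time hashing can be illustrated with a tree hash over a plugin's files. This sketch works on an in-memory map of path to contents so it stays self-contained; `treeHash`, `isTrusted`, and the exact byte layout fed to the hash are assumptions, not the format XR records.

```typescript
import { createHash } from "node:crypto";

// Hypothetical tree hash: hash each file, then hash the sorted list of
// (path, fileHash) pairs so the result does not depend on traversal order.
function treeHash(files: Map<string, string>): string {
  const outer = createHash("sha256");
  const paths = Array.from(files.keys()).sort();
  for (const path of paths) {
    const contents = files.get(path) ?? "";
    const fileHash = createHash("sha256").update(contents, "utf8").digest("hex");
    outer.update(`${path}\0${fileHash}\n`);
  }
  return outer.digest("hex");
}

// At load time, a plugin whose current tree hash differs from the one
// recorded at install is treated as untrusted.
function isTrusted(recordedHash: string, currentFiles: Map<string, string>): boolean {
  return treeHash(currentFiles) === recordedHash;
}
```

Hashing the whole tree, not just the entrypoint, is what catches an edit to a helper file that the entrypoint imports.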

Full spec: docs/PLUGINS.md

🤖 JARVIS-Level Vision Loop

xr --computer "open Safari and search for AI agents"

Vision-driven: screenshot → LLM reasons → action loop. For open-ended tasks where the planner doesn't know steps in advance.

💰 Cost Governor — Enforced in Code

xr --budget 0.10 "write me a full React app"

The agent literally cannot exceed your budget. checkBeforeStep() runs before every model call and blocks if the next step would breach the ceiling.
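A minimal sketch of how a pre-step check like this could be structured. Only the name `checkBeforeStep()` comes from XR; the `BudgetGovernor` class, its fields, `record()`, and the idea of passing an estimated step cost are assumptions for illustration.

```typescript
// Hypothetical budget governor; names other than checkBeforeStep() are illustrative.
class BudgetExceededError extends Error {}

class BudgetGovernor {
  private spentUsd = 0;
  private ceilingUsd: number;

  constructor(ceilingUsd: number) {
    this.ceilingUsd = ceilingUsd;
  }

  // Runs before every model call. Throws if the estimated cost of the
  // next step would push total spend past the ceiling.
  checkBeforeStep(estimatedCostUsd: number): void {
    const projected = this.spentUsd + estimatedCostUsd;
    if (projected > this.ceilingUsd) {
      throw new BudgetExceededError(
        `step blocked: projected $${projected.toFixed(4)} exceeds ceiling $${this.ceilingUsd.toFixed(2)}`,
      );
    }
  }

  // Runs after a model call with the cost that was actually incurred.
  record(actualCostUsd: number): void {
    this.spentUsd += actualCostUsd;
  }

  get remainingUsd(): number {
    return Math.max(0, this.ceilingUsd - this.spentUsd);
  }
}
```

Because the check throws before the call is made, an over-budget step never reaches the provider, which is the difference between a hard ceiling and a warning.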

🛡️ Provable Security

xr test --attacks --json    # signed, publishable block-rate report

🔒 Tamper-Evident Audit Log

xr verify-log    # → "✓ Audit chain intact (N entries)"

SHA-256 hash chain on every action — git's trick, $0, offline. Any edit to a recorded entry breaks the chain, and xr verify-log reports it.
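For readers unfamiliar with hash chains, the sketch below shows the general technique: each entry stores the hash of the previous one, so changing any entry invalidates everything after it. The `AuditEntry` shape, the genesis value, and the function names are assumptions; XR's real audit schema is not shown in this README.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape.
interface AuditEntry {
  action: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string; // SHA-256 over prevHash and action
}

const GENESIS = "0".repeat(64);

function hashEntry(prevHash: string, action: string): string {
  return createHash("sha256").update(prevHash + "\n" + action).digest("hex");
}

function appendEntry(chain: AuditEntry[], action: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { action, prevHash, hash: hashEntry(prevHash, action) }];
}

// Returns the index of the first broken entry, or -1 if the chain is intact.
function verifyChain(chain: AuditEntry[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const entry = chain[i];
    if (!entry) return i;
    if (entry.prevHash !== expectedPrev) return i;
    if (entry.hash !== hashEntry(entry.prevHash, entry.action)) return i;
    expectedPrev = entry.hash;
  }
  return -1;
}
```

Verification needs nothing but the log itself, which is why it works offline and costs nothing to run.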

📡 Providers

XR supports 20+ providers. Swap anytime — no restart, no re-config.

Provider	Type	Notes
Ollama	Local	Auto-detect, model pull, free
Claude (Anthropic)	Cloud	claude-opus-4, claude-sonnet-4
OpenAI	Cloud	gpt-4o, o3
Gemini (Google)	Cloud	gemini-2.5-pro
Groq	Cloud	llama-3.3-70b, ultra-fast
DeepSeek	Cloud	deepseek-r2, reasoning
Together AI	Cloud	Open models, batch
Mistral	Cloud	mistral-large-2
Cohere	Cloud	command-r-plus
Cerebras	Cloud	llama-3.3-70b, fast
OpenRouter	Cloud	100+ models, unified
AWS Bedrock	Cloud	Enterprise
+ any OpenAI-compatible endpoint	Local/Cloud	Custom base URL

xr providers list      # all providers + status
xr providers set openai
xr providers add claude   # enter API key (masked, stored in OS keychain)
xr providers test         # test all configured providers live

🔒 Security — Built In, Not Bolted On

Every security feature is code-enforced, not a suggestion:

Feature	How
Hard budget ceiling	`checkBeforeStep()` blocks — no exceptions
Tamper-evident audit	SHA-256 hash chain, offline, `xr verify-log`
Plugin trust	Explicit permissions, tree hashes, static scan, health checks, disable/remove
Injection defense	10-attack benchmark, signed block-rate report
Egress allow-list	Only configured domains receive data
Approval gates	`write_file`, `delete`, `shell`, `send` need consent
API key redaction	Keys never appear in audit log
Local-first	Nothing leaves your machine by default
Dashboard security	127.0.0.1 only, bearer token, `X-Frame-Options: DENY`

🖥️ Install Modes

xr install --mode minimal   # core only
xr install --mode local     # local/free, no API key required
xr install --mode byok      # cloud keys you own
xr install --mode hybrid    # cloud primary + local fallback
xr install --mode full      # all optional packs

🗂 Repository Structure

xr/
├── src/
│   ├── ui/               # Stage 5: design system
│   │   ├── theme.ts      # brand palette, ANSI codes, CSS vars
│   │   ├── spinner.ts    # Spinner, ProgressBar, StepTracker
│   │   ├── layout.ts     # terminal layout primitives
│   │   └── index.ts
│   ├── interfaces/       # all user-facing surfaces
│   │   ├── tui.ts        # interactive terminal UI (Claude Code–style)
│   │   ├── cli.ts        # CLI output helpers
│   │   ├── onboard.ts    # setup wizard
│   │   ├── providers.ts  # provider management UI
│   │   └── models.ts     # local AI model UI
│   ├── commands/         # CLI command handlers
│   │   ├── memory.ts     # xr memory — the Memory Engine (Stage 6)
│   │   ├── help.ts       # xr help [topic]
│   │   ├── budget.ts
│   │   ├── config.ts
│   │   ├── doctor.ts     # includes memory + research + voice + plugin health
│   │   ├── plugins.ts    # xr plugins / xr plugin command adapters
│   │   └── ...
│   ├── daemon/           # local server
│   │   ├── server.ts      # xr serve — dashboard + chat + API
│   │   ├── plugin-api.ts  # plugin management API
│   │   └── dashboard.ts   # full SPA: 12 panels, chat, command palette
│   ├── core/             # agent runtime
│   ├── providers/        # 20+ provider adapters
│   ├── memory/           # durable memory + RAG
│   ├── security/         # injection lab, audit, egress
│   ├── control/          # computer control, planner
│   ├── local/            # local AI runtime manager
│   ├── plugins/          # Stage 10 plugin platform: manifests, registry, host, loader, lifecycle
│   ├── research/         # Stage 7 Research Engine
│   ├── voice/            # Stage 8 Voice Stack
│   ├── cost/             # budget governor
│   └── config/           # configuration
├── bin/                  # CLI entry point
├── plugins/              # built-in plugins
├── skills/               # learned skills
├── docs/                 # PLUGINS.md, etc.
├── website/              # Next.js marketing site
└── test/                 # test suite

⌨️ Quick Reference

# One-shot tasks
xr "write a README for this project"
xr "explain this codebase"          --mode ask
xr "refactor auth module"            --budget 0.25
xr "build a REST API"                --mode plan   # plan only

# Interactive TUI
xr --tui                             # full terminal workspace

# Browser interfaces
xr serve                             # dashboard + chat at localhost:3141

# Local AI
xr models                            # status
xr models recommend                  # hardware-aware recommendation
xr models install                    # install recommended model

# Providers
xr providers list
xr providers set ollama
xr providers add claude

# Memory
xr memory add "I prefer TypeScript" --category preference
xr memory add "temp" --ttl 3600          # expires after 1h
xr memory list
xr memory recall "what do I prefer?"     # match % + why
xr memory health
xr memory prune                         # delete expired
xr memory clear

# Research
xr research "topic"
xr research deep "topic" --allow-public-web
xr research compare "A vs B"
xr research factcheck "claim"
xr research evidence
xr research refresh
xr research export

# Plugins (Stage 10)
xr plugins search
xr plugins inspect ./plugins/hello
xr plugins install ./plugins/hello --yes
xr plugins enable hello
xr plugin hello greet Ahmad
xr plugins doctor

# Voice (Stage 8, opt-in)
xr voice status
xr voice setup
xr voice devices
xr voice test
xr voice start                         # push-to-talk
xr voice start --wake-word             # opt-in wake phrase
xr speak "hello"
xr listen

# Computer control (opt-in)
xr control start
xr control plan "open browser and go to github.com"

# Security
xr verify-log
xr attacks

# Help
xr help                              # full command reference
xr help tui                          # TUI guide
xr help security                     # security guide
xr help providers                    # providers guide
xr help memory                       # memory engine guide

📋 Compatibility

Platform	Status
Linux (Ubuntu, Debian, Fedora, Arch)	✅ Full support
macOS (Apple Silicon + Intel)	✅ Full support
Windows (PowerShell, WSL)	✅ Full support
Android (Termux)	✅ Full support

Runtime: Bun (required) — install with curl -fsSL https://bun.sh/install | bash

🗺 Roadmap

Stage	Name	Status
Stage 1	Core Agent	✅ Done
Stage 2	Security + Audit	✅ Done
Stage 3	Research + Plugins	✅ Done
Stage 4	Local AI Runtime	✅ Done
Stage 5	User Interfaces	✅ Done
Stage 6	Memory Engine	✅ Done
Stage 7	Research Engine	✅ Done
Stage 8	Voice Stack	✅ Done
Stage 9	Computer Control Engine	✅ Done
Stage 10	Plugin Platform	✅ Done
Stage 11	MCP Platform	✅ Done
Stage 12	Multi-Agent + Advanced Orchestration	✅ Done
Stage 13	Advanced Runtime / Visual Orchestration / Background Workers	🔜 Next

🤝 Stage 12 — The Multi-Agent System

Stage 12 turns XR from a powerful single-agent shell into a professional multi-agent operating layer.

XR now supports a supervisor / worker runtime with:

Supervisor / Coordinator
Planner
Researcher
Builder / Developer
Reviewer
Executor
Synthesizer
Memory Manager
Router / Model Selector
Security Checker

What Stage 12 adds

Agent registry with explicit roles, capabilities, permissions, memory scopes, and tool scopes
Persisted workflow graphs (agent_workflows + agent_tasks) for inspectability, resume, and auditability
Task decomposition with dependency tracking, parallel branches, handoffs, and review checkpoints
Role-scoped execution so reviewer agents stay read-only while builder/executor agents get narrow side-effect lanes
Memory boundaries via a dedicated memory-manager brief instead of broad memory injection into every worker
Security and critique gates that can block downstream execution with CHANGES_REQUESTED or REJECTED
Live CLI progress for xr agents run, so you can see which agent is active and what it is doing
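
The dependency tracking, parallel branches, and blocking gates described above can be sketched as a small task graph. This is an illustration only: the Task shape, planWaves, and blockedBy are hypothetical names, and the real agent_workflows / agent_tasks schema may look different. Only the CHANGES_REQUESTED and REJECTED verdicts come from the list above.

```typescript
// Hypothetical shapes for illustration; XR's real agent_tasks schema may differ.
type Verdict = "APPROVED" | "CHANGES_REQUESTED" | "REJECTED";

interface Task {
  id: string;
  agent: string;        // e.g. "planner", "builder", "reviewer"
  dependsOn: string[];  // ids of tasks that must finish first
}

// Groups tasks into waves: every task in a wave has all of its dependencies
// in earlier waves, so tasks inside one wave could run in parallel.
function planWaves(tasks: Task[]): string[][] {
  const remaining = new Map(tasks.map((t) => [t.id, t] as const));
  const done = new Set<string>();
  const waves: string[][] = [];
  while (remaining.size > 0) {
    const ready = [...remaining.values()]
      .filter((t) => t.dependsOn.every((d) => done.has(d)))
      .map((t) => t.id);
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency in task graph");
    }
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

// A gate verdict other than APPROVED blocks every task downstream of the gate.
function blockedBy(tasks: Task[], gateId: string, verdict: Verdict): string[] {
  if (verdict === "APPROVED") return [];
  const blocked = new Set<string>();
  let changed = true;
  while (changed) {
    changed = false;
    for (const t of tasks) {
      if (blocked.has(t.id)) continue;
      if (t.dependsOn.some((d) => d === gateId || blocked.has(d))) {
        blocked.add(t.id);
        changed = true;
      }
    }
  }
  return [...blocked];
}

const tasks: Task[] = [
  { id: "plan", agent: "planner", dependsOn: [] },
  { id: "research", agent: "researcher", dependsOn: ["plan"] },
  { id: "build", agent: "builder", dependsOn: ["plan"] },
  { id: "review", agent: "reviewer", dependsOn: ["research", "build"] },
  { id: "execute", agent: "executor", dependsOn: ["review"] },
];
```

With this sample graph, research and build land in the same wave (a parallel branch), and a REJECTED verdict on review blocks only execute.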

Commands

xr agents list
xr agents status
xr agents status <workflowId>
xr agents inspect <agentId|workflowId>
xr agents plan "implement this safely"
xr agents run "implement this safely" --dry-run
xr agents delegate <workflowId> <agentId> "investigate X"
xr agents review <workflowId>
xr agents synthesize <workflowId>
xr agents stop <workflowId>
xr agents resume <workflowId>

Design principles

Not one giant agent
Supervisor owns the flow
Workers stay narrow
Review is separate from generation
Synthesis is separate from execution
Memory and tools are scoped per role
High-risk execution must pass through security/review gates
Everything important is persisted and inspectable
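
One way to read "Workers stay narrow" and "Memory and tools are scoped per role" is as a deny-by-default check before every tool call. The sketch below assumes a hypothetical registry shape; the field names (toolScopes, sideEffects) and the authorize helper are illustrative and not taken from src/agents/registry.ts.

```typescript
// Hypothetical registry entries; the role names come from the README, but the
// field names and scope assignments are assumptions for illustration.
interface AgentRole {
  id: string;
  toolScopes: ReadonlySet<string>; // scopes this role may use
  sideEffects: boolean;            // false means the role is read-only
}

const registry: Record<string, AgentRole> = {
  reviewer: { id: "reviewer", toolScopes: new Set(["fs:read"]), sideEffects: false },
  builder: { id: "builder", toolScopes: new Set(["fs:read", "fs:write"]), sideEffects: true },
};

interface ToolCall {
  tool: string;
  scope: string;
  mutates: boolean;
}

// Deny by default: unknown agents, out-of-scope tools, and mutating calls
// from read-only roles are all refused.
function authorize(agentId: string, call: ToolCall): { ok: boolean; reason: string } {
  const role = registry[agentId];
  if (!role) return { ok: false, reason: "unknown agent" };
  if (!role.toolScopes.has(call.scope)) return { ok: false, reason: "scope not granted" };
  if (call.mutates && !role.sideEffects) return { ok: false, reason: "role is read-only" };
  return { ok: true, reason: "allowed" };
}
```

Under these assumptions a reviewer asking to write a file is refused for lack of scope, while the same call from a builder is allowed.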

Current architecture

src/agents/registry.ts — built-in agent workforce
src/agents/planner.ts — deterministic workflow/task graph compiler
src/services/multi-agent-service.ts — supervisor runtime
src/state/stores/workflow-store.ts — persisted workflow/task storage
src/commands/agents.ts — CLI surface

🖥️ Stage 9 — Computer Control Engine

Stage 9 turns XR from a chat assistant into a safe, vision-capable AI operator. XR can now drive the keyboard, mouse, applications, and browser on your behalf; every action is schema-validated, risk-classified, approval-gated where needed, and written to the audit log.

Design principles

Safe-by-construction — all actions are Zod-validated against a strict schema.
Human-in-the-loop — risk classifier checks every step. Sensitive actions prompt; destructive actions always prompt.
Vision-assisted — XR uses screenshots and Vision LLMs to navigate modern UIs, not just shell commands.
Transparent — every click, keystroke, and scroll is logged to the tamper-evident audit chain.
Kill-switch — xr control stop disables the engine immediately.
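
The first two principles (strict validation, then a risk check on every step) can be sketched as follows. XR is described as using Zod for validation; to keep this example dependency-free it uses a hand-written parser instead, and the Action union, the classification rules, and the preApproved flag are all assumptions made for illustration.

```typescript
// Illustrative action union; XR's real Action[] schema is not shown in this README.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; direction: "up" | "down"; amount: number };

type Risk = "safe" | "sensitive" | "destructive";

// Hand-written validation standing in for a Zod schema: anything that does
// not match the Action union exactly is rejected.
function parseAction(input: unknown): Action {
  if (typeof input !== "object" || input === null) {
    throw new Error("action must be an object");
  }
  const { kind, x, y, text, direction, amount } = input as Record<string, unknown>;
  if (kind === "click" && Number.isInteger(x) && Number.isInteger(y)) {
    return { kind: "click", x: x as number, y: y as number };
  }
  if (kind === "type" && typeof text === "string") {
    return { kind: "type", text };
  }
  if (kind === "scroll" && (direction === "up" || direction === "down") && Number.isInteger(amount)) {
    return { kind: "scroll", direction, amount: amount as number };
  }
  throw new Error("unrecognized or malformed action");
}

// Hypothetical rules: scrolling is safe, typing something that looks like a
// destructive command is destructive, everything else is sensitive.
const DESTRUCTIVE_TEXT = /rm -rf|mkfs|shutdown/i;

function classify(action: Action): Risk {
  if (action.kind === "scroll") return "safe";
  if (action.kind === "type" && DESTRUCTIVE_TEXT.test(action.text)) return "destructive";
  return "sensitive";
}

// Sensitive actions prompt unless the user pre-approved them for the session
// (an assumption); destructive actions always prompt.
function needsPrompt(risk: Risk, preApproved: boolean): boolean {
  if (risk === "destructive") return true;
  if (risk === "sensitive") return !preApproved;
  return false;
}
```

The point of the split is that no pre-approval setting can silence the prompt for a destructive action.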

Control commands

# Setup & Status
xr control status              # check tools (keyboard, mouse, browser)
xr control start               # enable control engine
xr control stop                # disable control engine
xr control browser install     # setup Playwright + Chromium

# Automation
xr control plan "task"         # plan and execute multi-step workflow
xr control computer "task"     # vision-driven agentic loop (screenshot -> act)
xr control test                # run a safety/capability self-test

# Primitive Actions
xr control app "VS Code"       # launch application
xr control open "https://..."  # open URL or file
xr control click 640,480       # click coordinates
xr control type "hello"        # type text
xr control key "cmd+tab"       # press keys
xr control scroll down 5       # scroll

Vision Agent (Computer-Use)

The xr control computer command starts a screenshot-driven loop (the "JARVIS-level" vision loop) with four phases:

Observe: Captures a high-res screenshot.
Reason: A Vision LLM (like Claude 3.5 Sonnet) analyzes the UI state.
Act: XR sends a mouse/keyboard action.
Repeat: Loop continues until the task is marked DONE.
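
The four phases above map onto a short loop. The sketch below injects observe, reason, and act as functions so it runs without a screen or a model; the LoopDeps interface, the Step type, and the maxSteps bound are assumptions added for illustration, not documented XR options.

```typescript
// What the reasoning step returns: either "done" or the next action to take.
type Step =
  | { done: true }
  | { done: false; action: string }; // e.g. "click 640,480"

// Injected dependencies (hypothetical names) so the loop itself stays testable.
interface LoopDeps {
  observe: () => Promise<string>;                              // screenshot handle
  reason: (screenshot: string, task: string) => Promise<Step>; // vision model call
  act: (action: string) => Promise<void>;                      // mouse/keyboard action
}

// Runs observe -> reason -> act until the model reports DONE or the step
// budget runs out. maxSteps is an assumed safety bound.
async function runLoop(
  task: string,
  deps: LoopDeps,
  maxSteps = 20,
): Promise<{ status: "DONE" | "STEP_LIMIT"; actions: string[] }> {
  const actions: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const screenshot = await deps.observe();
    const step = await deps.reason(screenshot, task);
    if (step.done) return { status: "DONE", actions };
    await deps.act(step.action);
    actions.push(step.action);
  }
  return { status: "STEP_LIMIT", actions };
}
```

Bounding the loop is a design choice worth keeping in any variant: a model that never reports DONE should stop on its own rather than act indefinitely.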

Platform Support

Platform	Keyboard	Mouse	Apps	Browser
macOS	`osascript`	`cliclick`	`open`	Playwright
Linux	`xdotool`	`xdotool`	`xdg-open`	Playwright
Windows	PowerShell	.NET Native	`Start-Process`	Playwright

🔌 Stage 11 — The MCP Platform

Stage 11 turns XR into a true standardized integration platform using the Model Context Protocol (MCP 2025-06-18).

MCP servers expose tools, resources, and prompts in a uniform way. XR can now safely consume any compliant MCP server (GitHub, databases, browsers, filesystems, workflows, etc.) without bespoke glue code.

MCP Principles (enforced in code)

Opt-in only — servers are never auto-discovered or enabled
Inspect before activate — xr mcp inspect shows capabilities + declared permissions
Explicit permissions — 15 permission scopes (fs:read, fs:write, net, secrets, control, shell, etc.)
Always approval-gated — every MCP tool/resource call requires explicit user approval
Audit + budget inheritance — all calls go through XR's existing safety rails
Clean lifecycle — enable / disable / remove actually stops access
Fail closed — broken servers are isolated and never crash XR

Full MCP Command Surface

xr mcp list
xr mcp add github stdio npx @modelcontextprotocol/server-github
xr mcp add postgres http http://127.0.0.1:8765/mcp
xr mcp inspect github
xr mcp enable github
xr mcp disable github
xr mcp tools github
xr mcp resources github
xr mcp prompts github
xr mcp health
xr mcp permissions github
xr mcp remove github
xr mcp doctor

MCP tools, resources, and prompts automatically appear in the agent as namespaced tools, for example:

mcp.github.create_issue
mcp.postgres.query

All are wrapped with requiresApproval: true, full audit, and budget checks.
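
A minimal sketch of what such a wrapper could look like, assuming hypothetical McpTool, Approver, and AuditSink types. Only the mcp.server.tool naming and the requiresApproval: true flag come from the text above; budget checks are left out to keep the example short.

```typescript
// Hypothetical shapes; XR's internal tool interfaces are not shown in this README.
interface McpTool {
  server: string;
  name: string;
  call: (args: unknown) => Promise<unknown>;
}

interface AgentTool {
  name: string; // e.g. "mcp.github.create_issue"
  requiresApproval: true;
  invoke: (args: unknown) => Promise<unknown>;
}

type Approver = (toolName: string, args: unknown) => Promise<boolean>;
type AuditSink = (entry: { tool: string; outcome: "denied" | "ok" | "error" }) => void;

// Wraps a raw MCP tool so that every call is approved first and audited after.
function wrapMcpTool(tool: McpTool, approve: Approver, audit: AuditSink): AgentTool {
  const name = `mcp.${tool.server}.${tool.name}`;
  return {
    name,
    requiresApproval: true,
    async invoke(args) {
      if (!(await approve(name, args))) {
        audit({ tool: name, outcome: "denied" });
        throw new Error(`${name}: approval denied`);
      }
      try {
        const result = await tool.call(args);
        audit({ tool: name, outcome: "ok" });
        return result;
      } catch (err) {
        audit({ tool: name, outcome: "error" });
        throw err; // the failure stays scoped to this one call
      }
    },
  };
}
```

Because the approval check runs inside invoke, a caller cannot reach tool.call without passing through it.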

Supported Transports

stdio (local processes)
http / sse / streamable-http (remote)

Security Model

Never stores raw API keys (only apiKeyEnv)
All calls go through XR approval + egress + audit
Remote servers are treated as higher risk (lower trust) than local stdio servers
Disable/remove actually stops execution
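
The first rule (store apiKeyEnv, never the key) can be illustrated with a small resolver. apiKeyEnv is the only name taken from the text above; the ServerConfig shape, the resolveApiKey helper, and the PG_MCP_TOKEN variable name are assumptions for this sketch.

```typescript
// Hypothetical persisted server entry: it names an environment variable
// instead of holding the secret itself.
interface ServerConfig {
  name: string;
  url: string;
  apiKeyEnv?: string;
}

type Env = Record<string, string | undefined>;

// Looks the key up at call time; the secret never becomes part of the config.
function resolveApiKey(config: ServerConfig, env: Env): string | undefined {
  if (!config.apiKeyEnv) return undefined; // server needs no key
  const key = env[config.apiKeyEnv];
  if (!key) {
    throw new Error(`environment variable ${config.apiKeyEnv} is not set`);
  }
  return key;
}

const saved: ServerConfig = {
  name: "postgres",
  url: "http://127.0.0.1:8765/mcp",
  apiKeyEnv: "PG_MCP_TOKEN", // hypothetical variable name
};
```

In a Bun or Node process the env argument would typically be process.env. Serializing saved to disk exposes only the variable name.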

Doctor & Health

xr doctor
# includes: MCP platform (X installed, Y enabled, Z healthy)

Dashboard

MCP servers are not shown in the dashboard yet; a panel is planned for a future update. The core MCP platform is complete.

See xr mcp --help for the full surface.

🧩 Stage 10 — The Plugin Platform

Stage 10 makes XR extensible without making core a monolith. Plugins are capability bundles that can add tools, skills, integrations, MCP connectors, workflow packs, research packs, voice packs, security packs, business packs, developer packs, and UI metadata through a strict manifest and explicit permission model.

Plugin platform principles

Opt-in install and opt-in enable — XR never silently installs or activates plugins.
Manifest-first — xr plugins inspect shows identity, version, permissions, capabilities, source, trust, skills, and MCP declarations without executing code.
Explicit permissions — a plugin receives only permissions it declares and the user grants.
No ambient authority by default — plugin scanning blocks direct node:fs, child_process, raw fetch, process.env, Bun host APIs, eval/dynamic Function, and symlinks.
Tamper detection — XR records entrypoint and full plugin-tree SHA-256 hashes at install; changed code fails closed as untrusted.
Budget/security inheritance — plugin provider calls, network calls, memory, secrets, MCP tools, and computer-control paths still pass through XR gates.
Recoverable lifecycle — disable/remove actually stops loaded contributions and deletes installed files.
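
The tamper-detection principle above (record a full-tree SHA-256 at install, fail closed on change) could work roughly like this. The sketch hashes an in-memory map of files so it stays self-contained; the treeSha256 and verifyPlugin names and the exact byte layout fed to the hash are assumptions, not XR's actual format.

```typescript
import { createHash } from "node:crypto";

// Relative path -> file contents. A real implementation would walk the
// plugin directory; the in-memory map keeps this sketch self-contained.
type FileTree = Record<string, string>;

function treeSha256(files: FileTree): string {
  const hash = createHash("sha256");
  // Sort paths so the digest does not depend on enumeration order.
  for (const path of Object.keys(files).sort()) {
    const content = files[path];
    // Length-prefix each field so different path/content splits cannot collide.
    hash.update(`${path.length}:${path}`);
    hash.update(`${content.length}:${content}`);
  }
  return hash.digest("hex");
}

type Trust = "trusted" | "untrusted";

// Fail closed: anything other than an exact digest match is untrusted.
function verifyPlugin(recordedSha256: string, current: FileTree): Trust {
  return treeSha256(current) === recordedSha256 ? "trusted" : "untrusted";
}
```

Recording the digest once at install and comparing on every load means any edited, added, or removed file changes the result, not only a modified entrypoint.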

Plugin commands

xr plugins list                         # installed state + health
xr plugins search [query]               # catalog metadata search
xr plugins inspect <id|path>            # manifest + permissions; no code runs
xr plugins install <path|catalog-id>    # review and approve permissions
xr plugins enable <id>
xr plugins disable <id>
xr plugins update <id> [path]           # blocks new permissions until reinstall approval
xr plugins remove <id>
xr plugins permissions <id>             # requested/granted permissions
xr plugins skills                       # skills contributed by enabled plugins
xr plugins doctor                       # health/trust check
xr plugin <id> <command> [args...]      # run plugin CLI command

Plugin manifest surfaces

xr-plugin.json supports:

id, name, version, author, description, type, entrypoint
permissions, capabilities, dependencies, compatibility, apiVersion
source, sourceUrl, updateSource, trustLevel, trust.signature, trust.sha256, trust.treeSha256
uiHooks, commandHooks, toolHooks, mcpServers, skillPaths
plugin types: tool, skill, integration, provider, memory, research, automation, ui, mcp, voice, security, business, developer, workflow

Dashboard integration

The local dashboard Plugins panel now shows installed plugins, version, type, enabled state, health, trust level, requested/granted permissions, capabilities, catalog search, and enable/disable/remove actions.

See docs/PLUGINS.md for the full authoring and security contract.

🤝 Contributing

XR is MIT licensed. Contributions welcome.

git clone https://github.com/ahmadrrrtx/xr
cd xr
bun install
bun test

📄 License

MIT — LICENSE

▀▄▀ █▀█
█░█ █▀▄

XR — built by @rrrtx · xr-gules.vercel.app · MIT

Local-first. Spend-capped. Tamper-evident. Yours.

🔌 Stage 11 — The MCP Platform (Completed)

XR is now a first-class MCP-native platform.

MCP (Model Context Protocol) lets XR connect to any compliant server for tools, resources, and prompts using a single standard — no more bespoke integrations.

Core capabilities delivered:

Full MCP 2025-06-18 client (tools/list, tools/call, resources, prompts)
4 transports: stdio, http, sse, streamable-http
Persistent registry + complete lifecycle (add / enable / disable / remove / inspect)
Explicit permission model (15 scopes) + trust levels
Every MCP invocation is approval-gated, audited, budget-aware, and egress-controlled
MCP tools/resources/prompts surface automatically into the agent as mcp.<server>.<name>
Health checks + xr mcp doctor
Clean fail-closed behavior

Quick start

xr mcp add github stdio npx @modelcontextprotocol/server-github
xr mcp inspect github      # review capabilities + declared permissions first
xr mcp enable github
xr mcp tools github
xr "list my open PRs using the github MCP server"

All safety rails from previous stages still apply. MCP servers are never trusted by default.

See xr mcp --help for the full surface.
