Role-Playing for Agents. A masque is a temporary cognitive identity—bundling lens (how to think), context (who you're helping), and attributes (metadata) into a single assumable primitive.
And, crucially, a measured one. A masque without an audience is just a system prompt. The thesis of this project is that the interesting question is not whether you wore a costume but whether wearing it made the work better, and that an always-seated local audience can tell you, honestly, on your own work. That measurable identity is the differentiated half, described next.
The audience is always seated: a local, always-on observer captures every
session — masque or baseline — and scores it two ways. From session one you get a
7-point house reaction (perfect · great · good · neutral · bad · awful · detracting) — an honest read of how the session went. As your own baseline corpus
thickens, the audience adds lift: how a masque compares to your no-masque
baseline on the same kind of work — "Codesmith runs +1.4 on your refactor
work." A difference, never a vanity number, and it never leaves your machine.
This is the part that makes masques more than prompt presets. See
docs/evaluation.md and docs/otel-setup.md.
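To make "lift" concrete, here is a minimal sketch of the arithmetic, not the project's actual judge. The numeric value assigned to each reaction band and the minimum baseline size are assumptions made for illustration; the real scoring is described in docs/evaluation.md.

```python
from statistics import mean

# Hypothetical numeric mapping for the 7-point house reaction.
# The project may weight the bands differently.
BAND_SCORE = {
    "perfect": 3, "great": 2, "good": 1, "neutral": 0,
    "bad": -1, "awful": -2, "detracting": -3,
}

MIN_BASELINE_SESSIONS = 5  # assumed threshold before lift is reported


def lift(masque_bands, baseline_bands, min_baseline=MIN_BASELINE_SESSIONS):
    """Return masque-minus-baseline mean score, or None if lift is not yet earned."""
    if not masque_bands or len(baseline_bands) < min_baseline:
        return None  # baseline corpus too thin: report the reaction only
    masque_avg = mean(BAND_SCORE[b] for b in masque_bands)
    baseline_avg = mean(BAND_SCORE[b] for b in baseline_bands)
    return round(masque_avg - baseline_avg, 2)


baseline = ["neutral", "good", "neutral", "bad", "good"]   # mean 0.2
codesmith = ["great", "good", "great", "great", "good"]    # mean 1.6
print(lift(codesmith, baseline))      # 1.4
print(lift(codesmith, baseline[:2]))  # None
```

Returning `None` below the threshold mirrors the idea that lift is a difference against an earned baseline, not a number shown from the first session.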
Agents today get configured through scattered mechanisms: system prompts, MCP servers, environment variables, knowledge bases. These are disconnected. Masques unifies them into a single "become this identity" operation.
It's a representation layer you slap on top of any agent: don a masque to adopt its lens and context, do the work, then doff to step backstage and return to baseline. The core needs zero infrastructure — a masque is just YAML, and identity lives in a session file. The roadmap sketches where this could grow — bundled knowledge per masque.
The core loop needs no databases, no services, no credentials — just YAML and a session file:
Don → read masque YAML, inject lens + context, write .claude/masque.session.yaml
Work → operate with the masque's framing
Doff → clear the session, return to baseline Claude
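The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's implementation: the masque is passed in as an already-parsed dict, and the session file layout (name and version only) is hypothetical. Only the `.claude/masque.session.yaml` path comes from the text.

```python
import tempfile
from pathlib import Path

SESSION_FILE = Path(".claude/masque.session.yaml")  # path named in the loop above


def don(masque: dict, session_file: Path = SESSION_FILE) -> str:
    """Write the session file and return the framing text to inject."""
    session_file.parent.mkdir(parents=True, exist_ok=True)
    # Assumed session layout: only the masque name and version are recorded.
    session_file.write_text(
        f'name: "{masque["name"]}"\nversion: "{masque["version"]}"\n',
        encoding="utf-8",
    )
    parts = [masque["lens"].strip()]
    if masque.get("context"):
        parts.append(masque["context"].strip())
    return "\n\n".join(parts)


def doff(session_file: Path = SESSION_FILE) -> None:
    """Clear the session; a missing file already means baseline."""
    session_file.unlink(missing_ok=True)


def is_masqued(session_file: Path = SESSION_FILE) -> bool:
    """True while a masque is donned."""
    return session_file.exists()


# Demo in a throwaway directory so no real session file is touched.
with tempfile.TemporaryDirectory() as tmp:
    session = Path(tmp) / ".claude" / "masque.session.yaml"
    don({"name": "Codesmith", "version": "1.0.0",
         "lens": "Prefer small, tested changes."}, session)
    print(is_masqued(session))  # True
    doff(session)
    print(is_masqued(session))  # False
```

Because state is a single file, doffing is idempotent: removing a file that is already gone leaves the agent at baseline.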
That don/doff loop needs zero infrastructure. The audience that measures it (above) runs locally and stays on your machine:
| Component | Role | Where |
|---|---|---|
| OTEL collector | Always-on capture of every session → local JSONL | Local Docker, seated once (/audience seat) |
| DuckDB judge | Two-layer scoring (reaction + lift) | Local, ephemeral |
| ClickHouse | Remote reputation store (opt-in, deferred — Tier 3) | masques.ai, off by default |
The local collector + DuckDB are the measurable-identity layer. Remote forwarding to masques.ai is strictly opt-in and ships only derived scores — never your prompts, code, or tool I/O.
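One way to enforce the "derived scores only" rule is an allowlist applied before anything is forwarded. The sketch below is illustrative: the field names are assumptions, since the README does not specify the actual Tier-2 payload.

```python
# Assumed field names; the real derived payload is not specified in this README.
FORWARDABLE_FIELDS = {"masque", "task_class", "reaction", "lift"}


def derive_remote_payload(session_record: dict) -> dict:
    """Keep only allowlisted derived fields; everything else stays local."""
    return {k: v for k, v in session_record.items() if k in FORWARDABLE_FIELDS}


record = {
    "masque": "Codesmith",
    "task_class": "refactor",
    "reaction": "great",
    "lift": 1.4,
    "prompt": "rename this module",  # raw content: must stay local
    "tool_io": ["file contents"],    # raw content: must stay local
}
print(sorted(derive_remote_payload(record)))
# ['lift', 'masque', 'reaction', 'task_class']
```

An allowlist is used rather than a denylist so that any new raw field added later stays on the machine by default.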
# Install as a Claude Code plugin
claude plugins add github:ChrisDBaldwin/masques
/don <masque> [intent] # Assume a masque identity
/doff # Return to baseline Claude
/id # Show active masque info
/list # List available masques
/inspect [masque] # View full masque details
/sync-manifest [scope] # Regenerate manifest files
/audience [action] # Manage the audience (seat/dismiss/status/logs)
/performance # Score masque session performance
Masques also ships as an MCP server, so any MCP client (Claude Code, Claude
Desktop, Cursor, the MCP Inspector) can list / inspect / don a masque and
score a session. It's a thin adapter over the same authoritative core the
plugin uses, so both surfaces compose identical identities (no drift).
Local, free, and unauthenticated over stdio. Scoring runs the local DuckDB judge on-device and never leaves your machine.
cd services/mcp
uv tool install --editable . # installs masques-cli + masques-mcp
claude mcp add masques -- masques-mcp # register in Claude Code
claude mcp list # → masques: … ✓ Connected
It exposes five tools (list_masques, inspect_masque, don, doff, score),
one don-<name> prompt per masque, and masque://catalog resources. The
masques-cli command is also what the plugin shells out to, so plugin and server
stay in lockstep. See docs/mcp-server.md.
A hosted catalog on masques.ai with OAuth is designed but not built — the shipping server is local stdio only.
A masque bundles cognitive identity into a single YAML file:
name: string # Required. Human-readable name
version: "x.y.z" # Required. Semantic version
attributes: # Optional. Flexible metadata
  domain: string
  tagline: string
  style: string
  philosophy: string
context: | # Optional. Situational framing
  Who you're helping, what they value, operational environment.
lens: | # Required. Cognitive framing (system prompt fragment)
  How to approach problems. What to prioritize. What to reject.
rubric: | # Optional. How to know it worked: the measurable
  shadow of the lens. The audience's judge reads a session against this to
  assign the Layer-A reaction band; masques without one fall back to a
  generic activity band.
spinnerVerbs: # Optional. Custom activity indicators
  mode: replace # replace | append | prepend
  verbs:
    - "Masque:Verbing"
See Schema Reference for the full specification.
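The required and optional markers in the schema can be checked mechanically. The following is a minimal sketch, assuming the YAML has already been parsed into a dict; `validate_masque` is a hypothetical helper, not part of the masques codebase, and the Schema Reference remains the authority on what is valid.

```python
import re

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # plain x.y.z, as shown in the schema
SPINNER_MODES = {"replace", "append", "prepend"}


def validate_masque(masque: dict) -> list:
    """Return a list of problems; an empty list means the masque looks valid."""
    problems = []
    for field in ("name", "version", "lens"):  # the fields marked Required
        if not masque.get(field):
            problems.append(f"missing required field: {field}")
    version = masque.get("version")
    if version and not SEMVER.match(str(version)):
        problems.append(f"version is not x.y.z: {version!r}")
    mode = (masque.get("spinnerVerbs") or {}).get("mode")
    if mode is not None and mode not in SPINNER_MODES:
        problems.append(f"unknown spinnerVerbs mode: {mode!r}")
    return problems


print(validate_masque({"name": "Codesmith", "version": "1.0.0", "lens": "..."}))
# []
print(validate_masque({"name": "Codesmith", "version": "1.0"}))
# ['missing required field: lens', "version is not x.y.z: '1.0'"]
```

Collecting every problem instead of raising on the first one lets a masque author fix a file in a single pass.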
The OTEL collector is the always-on audience. It receives metrics and logs from every Claude Code session via OTLP and writes them to local JSONL. It is local-only by default, so nothing leaves the machine. Seat it once and leave it running:
cd services/collector
docker compose up -d --build # or: /audience seat (seat the house once)
The DuckDB judge reads the local OTEL exports and emits the two-layer score (see evaluation):
- Layer A, house reaction (always): a 7-point verdict (perfect · great · good · neutral · bad · awful · detracting) from a rubric judge (if the masque carries a rubric) or an activity fallback.
- Layer B, lift (once earned): the masque's delta vs your baseline corpus on the same task-class; never a bare number, never below threshold.
The old activity proxies (tool success, throughput, cost…) are demoted to supporting signals — context, not the verdict.
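As an illustration of one such supporting signal, the sketch below computes a tool success rate from JSONL lines. The record shape is an assumption: it uses a flattened, hypothetical layout (`event`, `success`) rather than the collector's actual export format, which docs/otel-setup.md covers.

```python
import json


def tool_success_rate(jsonl_lines):
    """Share of tool-result records marked successful, or None if there are none."""
    total = ok = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        # Assumed flattened shape: {"event": "tool_result", "success": bool}
        if record.get("event") != "tool_result":
            continue
        total += 1
        ok += bool(record.get("success"))
    return ok / total if total else None


sample = [
    '{"event": "tool_result", "success": true}',
    '{"event": "tool_result", "success": false}',
    '{"event": "api_request", "cost_usd": 0.01}',
    '{"event": "tool_result", "success": true}',
    '{"event": "tool_result", "success": true}',
]
print(tool_success_rate(sample))  # 0.75
```

A figure like this would sit alongside the reaction as context; it does not decide the verdict.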
services/judge/judge.sh # Outputs the two-layer YAML score to stdout
# or: /performance
ClickHouse is the remote reputation store (Tier 3, masques.ai). It is off by default and not wired into the shipping collector, so the local audience never depends on it. When enabled it must forward only the derived Tier-2 signal (scores + coarse metadata), never prompts, code, or tool I/O.
Terminal UI for browsing masques and drafting teams. Built with Zig 0.16+ and libvaxis.
cd tui && zig build run # build and launch
# or: zig build && ./zig-out/bin/masques
- Animated portraits with domain-specific patterns (forge, cybernetic, art, etc.)
- Theatrical mask silhouettes per category — sovereign (executive), cerebral (cognitive), classic (specialist), theatrical (art), geometric (meta)
- Full lens text, attributes, and metadata in the detail panel
- Team drafting with role assignment and YAML export
Navigate with arrow keys, Enter to add to team, Tab to switch focus, 1–6 for category tabs, q to quit.
Masques is the identity layer for any agent. Today it provides cognitive framing (lens, context, attributes) and telemetry-based scoring. The minimal product stops there. Possible future integration:
| Need | Status | Why | Approach |
|---|---|---|---|
| Telemetry | Working | Measure what masques actually do | OTEL → Collector → ClickHouse + DuckDB |
| Knowledge | Planned | Masques should bring their own context | MCP URIs bundled per masque |
Credentials and tools are deliberately out of scope — they're agent primitives the host already handles (vaults, MCP config). Masques stays the identity layer and composes on top of them rather than re-owning them.
The larger "agent marketplace" direction — spawning masques as paid workers with a reputation + payment gate — is deferred. See docs/future/ for that vision.
| Guide | Description |
|---|---|
| Getting Started | Create your first masque in 5 minutes |
| Vision | The theater metaphor and why masques exist |
| Concepts | The five components explained |
| Schema | Full YAML specification |
| MCP Server | Run Masques as an MCP server for any client |
| OTEL Setup | Configuring the telemetry pipeline |
| Evaluation | DuckDB session scoring |
| Evaluations | Testing masque behavioral fidelity |
| Future | Deferred vision — agent marketplace, payments |
| TUI | masques — terminal UI for browsing and team drafting |
Contributions welcome! Please read CONTRIBUTING.md before starting.
The process: Open an issue first, then fork, branch, and submit a PR referencing that issue.
This is a personal project maintained in spare time. For bugs, please open an issue with:
- What you tried and what happened
- A screenshot or GIF of the experience (really helps!)
- Your environment details
A Claude Code plugin for donning cognitive identities, an MCP server exposing the same masques to any MCP client, a library of 35 masques, and a Zig TUI for team drafting. The core is infrastructure-free; telemetry (OTEL → optional ClickHouse + DuckDB scoring) is opt-in. Payment/marketplace infrastructure is deferred (see docs/future/).
Temporary identities. Coherent work. Measured performance.