Masques

Role-Playing for Agents. A masque is a temporary cognitive identity—bundling lens (how to think), context (who you're helping), and attributes (metadata) into a single assumable primitive.

And, crucially, a measured one. A masque without an audience is just a system prompt. The thesis of this project is that the interesting question isn't whether you wore a costume but whether wearing it made the work better, and that an always-seated local audience can tell you, honestly, on your own work. That measurable identity is the differentiated half, described next.

Measurable Identity — the product thesis

The audience is always seated: a local, always-on observer captures every session, masque or baseline, and scores it two ways. From session one you get a 7-point house reaction (perfect · great · good · neutral · bad · awful · detracting), an honest read of how the session went. As your own baseline corpus thickens, the audience adds lift: how a masque compares to your no-masque baseline on the same kind of work, for example "Codesmith runs +1.4 on your refactor work." Lift is a difference, never a vanity number, and it never leaves your machine.

This is the part that makes masques more than prompt presets. See docs/evaluation.md and docs/otel-setup.md.

What Is This?

Agents today get configured through scattered mechanisms: system prompts, MCP servers, environment variables, knowledge bases. These are disconnected. Masques unifies them into a single "become this identity" operation.

It's a representation layer you slap on top of any agent: don a masque to adopt its lens and context, do the work, then doff to step backstage and return to baseline. The core needs zero infrastructure — a masque is just YAML, and identity lives in a session file. The roadmap sketches where this could grow — bundled knowledge per masque.

How It Works

The core loop needs no databases, no services, no credentials — just YAML and a session file:

Don   → read masque YAML, inject lens + context, write .claude/masque.session.yaml
Work  → operate with the masque's framing
Doff  → clear the session, return to baseline Claude
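
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: only the session file path comes from this README, while the session keys (`masque`, `version`, `intent`, `donned_at`) and the function signatures are assumptions, and the step that injects lens and context into the agent is left out.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

SESSION_FILE = Path(".claude") / "masque.session.yaml"  # path named in the README


def don(masque: dict, intent: str = "", root: Path = Path(".")) -> Path:
    """Record the active masque in the session file and return its path."""
    # Hypothetical session layout; the real file may carry other keys.
    session = {
        "masque": masque["name"],
        "version": masque["version"],
        "intent": intent,
        "donned_at": datetime.now(timezone.utc).isoformat(),
    }
    path = root / SESSION_FILE
    path.parent.mkdir(parents=True, exist_ok=True)
    # JSON-quoted scalars are valid YAML, so no YAML library is needed here.
    lines = [f"{key}: {json.dumps(value)}" for key, value in session.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def doff(root: Path = Path(".")) -> bool:
    """Clear the session; return True if a masque was active."""
    path = root / SESSION_FILE
    if path.exists():
        path.unlink()
        return True
    return False
```

Because all state lives in one file, deleting that file is enough to return to baseline.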

That don/doff loop needs zero infrastructure. The audience that measures it (above) runs locally and stays on your machine:

| Component | Role | Where |
|---|---|---|
| OTEL collector | Always-on capture of every session → local JSONL | Local Docker, seated once (/audience seat) |
| DuckDB judge | Two-layer scoring (reaction + lift) | Local, ephemeral |
| ClickHouse | Remote reputation store (opt-in, deferred; Tier 3) | masques.ai, off by default |

The local collector + DuckDB are the measurable-identity layer. Remote forwarding to masques.ai is strictly opt-in and ships only derived scores — never your prompts, code, or tool I/O.

Quick Start

# Install as a Claude Code plugin
claude plugins add github:ChrisDBaldwin/masques

Commands

/don <masque> [intent]    # Assume a masque identity
/doff                     # Return to baseline Claude
/id                       # Show active masque info
/list                     # List available masques
/inspect [masque]         # View full masque details
/sync-manifest [scope]    # Regenerate manifest files
/audience [action]        # Manage the audience (seat/dismiss/status/logs)
/performance              # Score masque session performance

MCP Server

Masques also ships as an MCP server, so any MCP client — Claude Code, Claude Desktop, Cursor, the MCP Inspector — can list / inspect / don a masque and score a session. It's a thin adapter over the same authoritative core the plugin uses, so both surfaces compose identical identities (no drift).

Local, free, and unauthenticated over stdio. Scoring runs the local DuckDB judge on-device and never leaves your machine.

cd services/mcp
uv tool install --editable .              # installs masques-cli + masques-mcp
claude mcp add masques -- masques-mcp     # register in Claude Code
claude mcp list                           # → masques: … ✓ Connected

It exposes five tools (list_masques, inspect_masque, don, doff, score), one don-<name> prompt per masque, and masque://catalog resources. The masques-cli command is also what the plugin shells out to, so plugin and server stay in lockstep. See docs/mcp-server.md.

A hosted catalog on masques.ai with OAuth is designed but not built — the shipping server is local stdio only.

Schema

A masque bundles cognitive identity into a single YAML file:

name: string              # Required. Human-readable name
version: "x.y.z"          # Required. Semantic version

attributes:               # Optional. Flexible metadata
  domain: string
  tagline: string
  style: string
  philosophy: string

context: |                # Optional. Situational framing
  Who you're helping, what they value, operational environment.

lens: |                   # Required. Cognitive framing (system prompt fragment)
  How to approach problems. What to prioritize. What to reject.

rubric: |                 # Optional. How to know it worked — the measurable
  shadow of the lens. The audience's judge reads a session against this to
  assign the Layer-A reaction band; masques without one fall back to a
  generic activity band.

spinnerVerbs:             # Optional. Custom activity indicators
  mode: replace           # replace | append | prepend
  verbs:
    - "Masque:Verbing"

See Schema Reference for the full specification.
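
As a rough illustration of the required and optional fields above, the following Python sketch checks an already-parsed masque (a plain `dict`). The function name `validate_masque` and the strict `x.y.z` pattern are assumptions made for this example; the Schema Reference remains the authority on what is valid.

```python
import re

REQUIRED_FIELDS = ("name", "version", "lens")  # marked "Required" in the schema
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # assumed: plain x.y.z, no pre-release tags
SPINNER_MODES = {"replace", "append", "prepend"}


def validate_masque(masque: dict) -> list[str]:
    """Return a list of problems; an empty list means no problems were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = masque.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {field}")
    version = masque.get("version")
    if isinstance(version, str) and version.strip() and not SEMVER.match(version):
        problems.append(f"version is not x.y.z: {version!r}")
    spinner = masque.get("spinnerVerbs")
    if isinstance(spinner, dict) and "mode" in spinner:
        if spinner["mode"] not in SPINNER_MODES:
            problems.append(
                f"spinnerVerbs.mode must be one of {sorted(SPINNER_MODES)}"
            )
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a masque file at once.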

Services

OTEL Collector

The always-on audience. Receives metrics and logs from every Claude Code session via OTLP and writes them to local JSONL. Local-only by default — nothing leaves the machine. Seated once and left running:

cd services/collector
docker compose up -d --build   # or: /audience seat — seat the house once
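
To get a feel for what the collector leaves on disk, the sketch below counts log records per session in a JSONL export. It assumes each line is one OTLP/JSON logs batch (`resourceLogs` → `scopeLogs` → `logRecords`) and that records carry a `session.id` attribute; the actual file names and record layout written by this project's collector are not stated here, so treat both as assumptions to check against your own export.

```python
import json
from collections import Counter
from pathlib import Path


def count_records_by_session(jsonl_path: Path) -> Counter:
    """Count OTLP log records per session.id attribute in a JSONL export."""
    counts = Counter()
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between batches
            batch = json.loads(line)
            for resource_logs in batch.get("resourceLogs", []):
                for scope_logs in resource_logs.get("scopeLogs", []):
                    for record in scope_logs.get("logRecords", []):
                        # Flatten [{key, value: {stringValue}}] into a dict.
                        attrs = {
                            a["key"]: a.get("value", {}).get("stringValue")
                            for a in record.get("attributes", [])
                        }
                        counts[attrs.get("session.id") or "unknown"] += 1
    return counts
```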

Performance Judge (DuckDB)

Reads the local OTEL exports and emits the two-layer score (see evaluation):

  • Layer A — house reaction (always): a 7-point verdict — perfect · great · good · neutral · bad · awful · detracting — from a rubric judge (if the masque carries a rubric) or an activity fallback.
  • Layer B — lift (once earned): the masque's delta vs your baseline corpus on the same task-class — never a bare number, never below threshold.

The old activity proxies (tool success, throughput, cost…) are demoted to supporting signals — context, not the verdict.

services/judge/judge.sh   # Outputs the two-layer YAML score to stdout
# or: /performance
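
The Layer B idea (a delta against your own baseline, withheld until enough sessions exist) can be sketched as follows. This is not the DuckDB judge's logic: the numeric mapping of the seven bands, the `MIN_SESSIONS` threshold, and the function name `lift` are all assumptions chosen for illustration.

```python
from statistics import mean

# The 7-point house reaction, best to worst, as listed in this README.
BANDS = ("perfect", "great", "good", "neutral", "bad", "awful", "detracting")
# Assumed numeric mapping: perfect = +3, neutral = 0, detracting = -3.
BAND_VALUE = {band: 3 - i for i, band in enumerate(BANDS)}
MIN_SESSIONS = 5  # assumed threshold; the real value is not stated here


def lift(masque_bands, baseline_bands, min_sessions=MIN_SESSIONS):
    """Return masque-minus-baseline mean reaction, or None if under threshold."""
    if len(masque_bands) < min_sessions or len(baseline_bands) < min_sessions:
        return None  # not enough evidence yet: report nothing, not a bare number
    masque_mean = mean(BAND_VALUE[b] for b in masque_bands)
    baseline_mean = mean(BAND_VALUE[b] for b in baseline_bands)
    return round(masque_mean - baseline_mean, 2)
```

Under this assumed mapping, five masque sessions rated great, great, great, good, good against a baseline of four neutral sessions and one good session give a lift of 1.4, while a single session on either side yields `None`.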

ClickHouse (opt-in, deferred)

The remote reputation store (Tier 3, masques.ai). Off by default and not wired into the shipping collector — the local audience never depends on it. When enabled it must forward only the derived Tier-2 signal (scores + coarse metadata), never prompts, code, or tool I/O.
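
One way to honor the "derived signal only" rule is an allowlist applied just before anything is forwarded. The sketch below is illustrative: the field names and the function `derived_signal` are hypothetical, and only the principle (scores and coarse metadata out, prompts, code, and tool I/O never) comes from this README.

```python
# Hypothetical field names; only the allowlist idea comes from the README.
ALLOWED_FIELDS = frozenset({"masque", "task_class", "reaction", "lift"})


def derived_signal(record: dict) -> dict:
    """Keep only allowlisted score fields; drop everything else."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
```

An allowlist is used rather than a denylist so that any raw field added later stays local by default.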

TUI — masques

Terminal UI for browsing masques and drafting teams. Built with Zig 0.16+ and libvaxis.

cd tui && zig build run   # build and launch
# or: zig build && ./zig-out/bin/masques

  • Animated portraits with domain-specific patterns (forge, cybernetic, art, etc.)
  • Theatrical mask silhouettes per category — sovereign (executive), cerebral (cognitive), classic (specialist), theatrical (art), geometric (meta)
  • Full lens text, attributes, and metadata in the detail panel
  • Team drafting with role assignment and YAML export

Navigate with arrow keys, Enter to add to team, Tab to switch focus, 1-6 for category tabs, q to quit.

Roadmap

Masques is the identity layer for any agent. Today it provides cognitive framing (lens, context, attributes) and telemetry-based scoring. The minimal product stops there. Possible future integration:

| Need | Status | Why | Approach |
|---|---|---|---|
| Telemetry | Working | Measure what masques actually do | OTEL → Collector → ClickHouse + DuckDB |
| Knowledge | Planned | Masques should bring their own context | MCP URIs bundled per masque |

Credentials and tools are deliberately out of scope — they're agent primitives the host already handles (vaults, MCP config). Masques stays the identity layer and composes on top of them rather than re-owning them.

The larger "agent marketplace" direction — spawning masques as paid workers with a reputation + payment gate — is deferred. See docs/future/ for that vision.

Documentation

| Guide | Description |
|---|---|
| Getting Started | Create your first masque in 5 minutes |
| Vision | The theater metaphor and why masques exist |
| Concepts | The five components explained |
| Schema | Full YAML specification |
| MCP Server | Run Masques as an MCP server for any client |
| OTEL Setup | Configuring the telemetry pipeline |
| Evaluation | DuckDB session scoring |
| Evaluations | Testing masque behavioral fidelity |
| Future | Deferred vision: agent marketplace, payments |
| TUI | masques: terminal UI for browsing and team drafting |

Contributing

Contributions welcome! Please read CONTRIBUTING.md before starting.

The process: Open an issue first, then fork, branch, and submit a PR referencing that issue.

Support

This is a personal project maintained in spare time. For bugs, please open an issue with:

  • What you tried and what happened
  • A screenshot or GIF of the experience (really helps!)
  • Your environment details

Status

A Claude Code plugin for donning cognitive identities, an MCP server exposing the same masques to any MCP client, a library of 35 masques, and a Zig TUI for team drafting. The core is infrastructure-free; telemetry (OTEL → optional ClickHouse + DuckDB scoring) is opt-in. Payment/marketplace infrastructure is deferred (see docs/future/).


Temporary identities. Coherent work. Measured performance.
