47 changes: 36 additions & 11 deletions SKILL.md
@@ -7,19 +7,28 @@ description: "Skill 全生命周期管理：创建 → 反思优化 → 评测

Full lifecycle management for AI agent skills: create, reflect, evaluate, publish, search, install, merge, review, and uninstall.

## Core Principles
## Lifecycle Flow

Each stage has a validation gate before advancing to the next:

### Concise is Key
1. **Create** → verify triggers fire correctly and scripts run (`--help` smoke test)
2. **Use** → observe real executions for failure signals
3. **Reflect** → after failures or user corrections, identify root cause and patch
4. **Evaluate** → run regression tests to confirm fixes don't break other cases
5. **Maturity check** → confirm publish readiness (3+ successful runs, no recent reflects)
6. **Publish** → push to registry: `python3 scripts/publish.py .claude/skills/<name>/`

The context window is a shared resource. Only add context Claude doesn't already have. Challenge each piece: "Does Claude really need this?" Prefer concise examples over verbose explanations.
Fail any gate → loop back. Example: a reflect fix at step 3 requires re-evaluation at step 4 before proceeding to step 5.
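
The gate-and-loop-back rule can be sketched as a small state machine. This is a minimal illustration, not the skill's actual implementation: the stage names mirror the numbered list above, while `next_stage`, the strictly linear ordering, and the choice of Reflect as the loop-back target are assumptions made for the example.

```python
# Hypothetical sketch: stage names follow the lifecycle list; the loop-back
# target for a failed gate is an assumption, not taken from the skill.
STAGES = ["create", "use", "reflect", "evaluate", "maturity_check", "publish"]


def next_stage(current: str, gate_passed: bool) -> str:
    """Return the stage to run next, given the result of the current gate."""
    index = STAGES.index(current)
    if gate_passed:
        # Advance by one stage; publish is terminal.
        return STAGES[min(index + 1, len(STAGES) - 1)]
    # Failed gate: retry create, otherwise go back to reflect so the fix
    # is re-evaluated before the maturity check.
    return "create" if current == "create" else "reflect"


if __name__ == "__main__":
    print(next_stage("evaluate", gate_passed=False))
    print(next_stage("reflect", gate_passed=True))
```

Under these assumptions a failed evaluation returns to Reflect, and the patched skill must pass Evaluate again before it can reach the maturity check.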

## Core Principles

### Degrees of Freedom

Match specificity to the task's fragility:

- **High freedom** (text instructions): Multiple approaches valid, context-dependent decisions
- **Medium freedom** (pseudocode/scripts with params): Preferred pattern exists, some variation OK
- **Low freedom** (specific scripts): Operations fragile, consistency critical, exact sequence required
- **High freedom** (text instructions): "Choose an appropriate error message" (multiple valid answers)
- **Medium freedom** (pseudocode with params): "Format errors as `ERROR: {context} — {detail}`" (pattern fixed, content varies)
- **Low freedom** (exact scripts): `sys.exit(1)` on failure (no variation allowed)
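
As a rough illustration of the medium- and low-freedom cases, the sketch below fixes the error pattern in one helper and the exit behaviour in another. The function names `format_error` and `fail` are hypothetical; only the `ERROR: {context} — {detail}` pattern and `sys.exit(1)` come from the text above.

```python
import sys


def format_error(context: str, detail: str) -> str:
    """Medium freedom: the pattern is fixed, the content varies."""
    return f"ERROR: {context} — {detail}"


def fail(context: str, detail: str) -> None:
    """Low freedom: always report on stderr and exit with status 1."""
    print(format_error(context, detail), file=sys.stderr)
    sys.exit(1)


if __name__ == "__main__":
    print(format_error("publish", "SKILL.md not found"))
```

The high-freedom case has no code counterpart: choosing the wording of `detail` is left to the caller.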

## Routing

@@ -36,6 +45,23 @@ Match specificity to the task's fragility:
- **Merging skill variants** ("merge", "合并版本", "融合") → read `references/merge.md`
- **Uninstalling a skill** ("uninstall", "删除 skill", "卸载") → use `scripts/uninstall.py --name <skill> --yes`

### Quick Example: Reflect → Evaluate → Publish

```bash
# 1. Reflect: skill failed on "create a meeting" (routed to doc-writer instead of calendar)
#    Fix: narrow trigger in SKILL.md description, add negative example
grep -rl "create" .claude/skills/ | head -5  # impact scan: find colliding triggers

# 2. Evaluate: verify the fix didn't break other cases
python3 .claude/skills/prompt-eval/scripts/run_eval.py \
  --prompts /tmp/skill-system-prompt.md \
  --tests .claude/skills/doc-writer/evals.yaml \
  --task-id reflect-fix-001 --output-dir /tmp/eval-out

# 3. Publish: all tests pass + 3 successful runs + no recent reflects
python3 scripts/publish.py .claude/skills/doc-writer/
```
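
The publish precondition in step 3 could be checked with a helper along these lines. This is a sketch under stated assumptions: the source gives the "3 successful runs" threshold but does not define "recent", so the 7-day `QUIET_PERIOD` and the function name `is_publish_ready` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_SUCCESSFUL_RUNS = 3           # from the maturity check: 3+ successful runs
QUIET_PERIOD = timedelta(days=7)  # assumption: "recent" is not defined in the source


def is_publish_ready(
    successful_runs: int,
    last_reflect: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when the skill has enough runs and no recent reflect."""
    if successful_runs < MIN_SUCCESSFUL_RUNS:
        return False
    if last_reflect is None:
        # The skill has never needed a reflect fix.
        return True
    return now - last_reflect >= QUIET_PERIOD


if __name__ == "__main__":
    now = datetime(2025, 1, 31)
    print(is_publish_ready(3, datetime(2025, 1, 30), now))
```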

## When NOT to Create a Skill

Don't build for hypothetical future needs. Skip if ANY apply:
@@ -59,8 +85,7 @@ Tool design matters more than prompt design. When a skill has `scripts/`, invest

## Writing Guidelines

- **Do** include: non-obvious procedures, domain specifics, gotchas from real failures
- **Don't** include: things Claude already knows, verbose explanations, auxiliary docs
- **Keep** SKILL.md ≤150 lines (routing layer); move scenario details to references/
- **Challenge each line**: "Would removing this cause Claude to make mistakes?" If not, cut it.
- **Prefer examples over explanations**: One concrete pair teaches more than a paragraph
- **Include**: non-obvious procedures, domain specifics, gotchas from real failures
- **Exclude**: things Claude already knows, verbose explanations, auxiliary docs
- **Size**: SKILL.md ≤150 lines; move scenario details to `references/`
- **Litmus test**: "Would removing this line cause Claude to make mistakes?" If not, cut it
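
The size guideline is easy to enforce mechanically. The following is a minimal sketch, not part of the skill's scripts: `within_line_budget` and `check_skill_file` are hypothetical helpers, and only the 150-line limit comes from the guidelines.

```python
from pathlib import Path

MAX_LINES = 150  # limit stated in the writing guidelines


def within_line_budget(text: str, max_lines: int = MAX_LINES) -> bool:
    """Return True when the text has at most max_lines lines."""
    return len(text.splitlines()) <= max_lines


def check_skill_file(path: Path) -> bool:
    """Read a SKILL.md file and check it against the line budget."""
    return within_line_budget(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    print(within_line_budget("line\n" * 150))
```

A check like this could run before publishing so that an oversized SKILL.md is caught early and its scenario details moved to `references/`.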