Audit, fix, and optimize agent skill descriptions so they trigger reliably. Works with OpenClaw, Claude Code, Cursor, and any agentskills.io-compatible platform.
The problem: 65% of skills never trigger because their descriptions don't tell the agent when to fire. Skill Compass finds these and fixes them, automatically where possible and AI-assisted where not.
When first installed (or when you say "audit my skills"), Skill Compass runs a comprehensive ecosystem health check:
- Scan all installed skills: audit every SKILL.md for YAML validity, description quality, and discoverability
- Present a health report: total skills, healthy vs broken, token budget usage, issues found
- Ask whether to auto-fix or fix manually: "I found X issues. Want me to fix them automatically?"
- Conflict detection: flag any skills with overlapping trigger keywords
- Token budget advice: warn if total description chars are approaching the limit
- Record onboarding: subsequent runs skip the overview and go straight to the audit workflow

Example report:

    Skill Ecosystem Health Report
    Total skills: 18
    ✅ Healthy: 15
    ⚠️ Needs attention: 3
    ❌ Broken: 0
    Token budget: 8,420 / 30,000 chars (28% used)
    Issues found:
    • 3 skills with low description scores (<70)
    • 10 skills missing negative constraints
    • 18 skills missing category assignment
    • 1 trigger conflict detected
    Fix all automatically? (y/n)

Onboarding runs only once. Say "full report" anytime to see it again.
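
The token budget line in the health report is plain arithmetic over description lengths. A minimal sketch, assuming the 30,000-character limit shown in the sample report; `budget_usage` is a hypothetical helper for illustration, not a function taken from `audit_skills.py`:

```python
def budget_usage(descriptions, limit=30_000):
    """Return (total_chars, percent_used) for a list of description strings.

    The default limit mirrors the sample report; the real limit may differ
    per platform (assumption).
    """
    total = sum(len(d) for d in descriptions)
    return total, round(100 * total / limit)


# One hypothetical description of 8,420 characters reproduces the sample figures.
total, percent = budget_usage(["x" * 8420])
print(f"Token budget: {total:,} / 30,000 chars ({percent}% used)")
```

With the sample numbers this prints the same budget line as the report above.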
Negative constraints are the single most effective way to prevent false activations. Every skill description must specify both when to trigger and when NOT to trigger.
[Trigger condition] + [Negative constraint] + [Capability declaration] + [Search vocabulary]
Example:
Use when creating, editing, or auditing PowerPoint presentations (.pptx).
Do NOT use for Google Slides, Keynote, or general document editing.
Covers layouts, placeholders, charts, notes, and visual QA.
Triggers on: PPT, PPTX, slides, deck, presentation, 幻灯片, 演示文稿.
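
The four-part formula can be treated as a simple template. The sketch below is illustrative only: `build_description` and its parameter names are hypothetical, and the wording of each part ("Use when", "Covers", "Triggers on") is copied from the example above rather than required by any platform.

```python
def build_description(trigger, negative, capability, vocabulary):
    """Join the four parts in the order the formula gives them."""
    return " ".join([
        f"Use when {trigger}.",                    # trigger condition
        f"Do NOT use for {negative}.",             # negative constraint
        f"Covers {capability}.",                   # capability declaration
        f"Triggers on: {', '.join(vocabulary)}.",  # search vocabulary
    ])


description = build_description(
    trigger="creating, editing, or auditing PowerPoint presentations (.pptx)",
    negative="Google Slides, Keynote, or general document editing",
    capability="layouts, placeholders, charts, notes, and visual QA",
    vocabulary=["PPT", "PPTX", "slides", "deck", "presentation"],
)
print(description)
```

Keeping the parts as separate arguments makes it harder to forget the negative constraint when writing a new skill.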
Without negative constraints, a "search" skill might trigger for code search, file search, and web search. With them:

    // Good: clear boundary
    "Use when the user asks to search the web.
    Do NOT use for searching local files, code, or GitHub issues."

    // Bad: no boundary
    "Use when the user asks to search for information online."

| Template | When to Use | Example |
|---|---|---|
| `Do NOT use for [adjacent domain].` | Two skills cover similar verbs on different targets | Do NOT use for PDF editing or document creation. |
| `Not for [common false trigger phrase].` | Users often phrase requests ambiguously | Not for creating new presentations from scratch. |
| `Do NOT invoke when [specific condition].` | Skill has prerequisites or constraints | Do NOT invoke when the user only wants to read (not edit) the file. |
| `This skill does NOT handle [capability]. Use [other skill] instead.` | Direct hand-off to another skill | This skill does NOT handle image generation. Use the image_generate tool instead. |

Check which skills are missing negative constraints:

    python3 scripts/audit_skills.py --routing

Generate suggested negative constraints for skills missing them:

    python3 scripts/audit_skills.py --negative-samples

Output:

    github (uncategorized)
    Current: Use when the user asks to interact with GitHub...
    Suggested negative constraint:
    "Do NOT use for code-review-and-quality, git-workflow-and-versioning.
    Those have dedicated skills."

Scoring impact: a missing negative constraint now costs 15 points (previously 5). Skills with explicit negatives get a +5 bonus.
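
The scoring impact can be pictured as a single adjustment per description. This is a rough sketch, not the script's actual detector: the phrase list is an assumption drawn from the templates table, and `negative_constraint_adjustment` is a hypothetical name.

```python
import re

# Phrases taken from the templates table; the real detector may recognize more (assumption).
NEGATIVE_PATTERN = re.compile(
    r"\b(?:do not (?:use|invoke)|not for|does not handle)\b",
    re.IGNORECASE,
)


def negative_constraint_adjustment(description):
    """Return +5 when a negative constraint is present, otherwise -15."""
    return 5 if NEGATIVE_PATTERN.search(description) else -15


good = "Use when the user asks to search the web. Do NOT use for searching local files, code, or GitHub issues."
bad = "Use when the user asks to search for information online."
print(negative_constraint_adjustment(good), negative_constraint_adjustment(bad))
```

The 20-point gap between the two results is why adding a single "Do NOT use for..." sentence moves a score so noticeably.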
When skill count grows beyond ~15, flat matching degrades: skills compete with each other, token budget inflates, and false triggers multiply. Hierarchical routing solves this with two-stage dispatch.

    User query
        ↓
    Layer 1: Category matching ("Is this about coding? documents? system?")
        ↓
    Layer 2: Skill matching within winning category ("Which coding skill?")

Skills in different categories never compete with each other.
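
To make the two layers concrete, here is a toy dispatcher. It is a sketch under strong assumptions: real agents match with a language model rather than word overlap, and the `route` function, the `CATEGORIES` structure, and the category keywords are all hypothetical.

```python
def tokens(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,?!\"'()").lower() for w in text.split()} - {""}


def route(query, categories):
    """Two-stage dispatch: pick a category first, then a skill inside it.

    `categories` maps category name -> (category keywords, {skill name: description}).
    Returns (category, skill), or None when no category shares a word with the query.
    """
    q = tokens(query)
    # Layer 1: categories are scored only against their own keywords.
    best_cat = max(categories, key=lambda c: len(q & tokens(categories[c][0])))
    if not q & tokens(categories[best_cat][0]):
        return None
    # Layer 2: only skills inside the winning category compete.
    skills = categories[best_cat][1]
    best_skill = max(skills, key=lambda s: len(q & tokens(skills[s])))
    return best_cat, best_skill


CATEGORIES = {
    "coding": ("code git github review debug", {
        "github": "Use when the user asks to interact with GitHub issues or pull requests.",
        "code-review": "Use when the user asks to review code quality.",
    }),
    "documents": ("document pptx pdf readme slides", {
        "pptx": "Use when creating or editing PowerPoint slides (.pptx).",
        "pdf": "Use when editing PDF files.",
    }),
}

print(route("open a github issue for this bug", CATEGORIES))
```

Because layer 2 only ever sees the skills of one category, a document skill cannot win a coding query no matter how many generic verbs its description shares.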
Add an optional `category` field to each skill's YAML frontmatter:

    ---
    name: github
    category: coding
    description: Use when the user asks to interact with GitHub...
    ---

| Category | Covers |
|---|---|
| `coding` | GitHub, code review, git workflow, debugging, implementation |
| `documents` | Feishu docs, PPTX, PDF, README, writing |
| `system` | Healthcheck, node-connect, auto-updater, security |
| `communication` | Search, weather, translations |
| `creative` | Image generation, music, video, prompt optimization |
| `meta` | Skill compass, skill creator, skill vetter, onboarding |
| `reasoning` | Self-discover, self-refine, cognitive frameworks |
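
Grouping skills by the `category` field needs only a flat frontmatter reader. The sketch below assumes single-line `key: value` pairs and is not the parser used by `audit_skills.py`; `read_frontmatter` and `group_by_category` are hypothetical names.

```python
from collections import defaultdict


def read_frontmatter(text):
    """Parse flat `key: value` pairs between the leading `---` fences.

    Deliberately minimal: no nesting, lists, or multi-line values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("\"'")
    return fields


def group_by_category(skill_files):
    """Map category -> skill names; skills without a category land in 'uncategorized'."""
    groups = defaultdict(list)
    for text in skill_files:
        meta = read_frontmatter(text)
        groups[meta.get("category", "uncategorized")].append(meta.get("name", "?"))
    return dict(groups)


SKILL_MD = """---
name: github
category: coding
description: Use when the user asks to interact with GitHub...
---
"""
groups = group_by_category([SKILL_MD, "---\nname: weather\n---\n"])
print(groups)
```

Skills that omit the field fall into `uncategorized`, which matches how the routing report labels them.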

Analyze hierarchical routing health:

    python3 scripts/audit_skills.py --routing

Output:

    Category         Skills   Chars   % Budget
    ──────────────────────────────────────────
    coding                5    1200      23.0%
    documents             4     980      18.7%
    system                3     720      13.7%
    uncategorized         2     510       9.7% ⚠️

    Negative Constraints: 9/14 have "Do NOT use for..."
    ⚠️ 5 skill(s) missing negative constraints

    Cross-Category Keyword Leaks (1):
    coding <-> documents (32% overlap)
    Shared: file, create, edit, write

- Lower token pressure: only relevant category descriptions need detailed parsing
- Fewer false triggers: skills in different categories never compete
- Scales better: adding skill #30 doesn't degrade skill #5's accuracy
- Built-in disambiguation: category separation provides a baseline boundary even without per-skill negatives
For advanced strategies (sub-categories, multi-level hierarchies), see references/hierarchical-routing.md.

    # ── Core Workflow ──
    # One-command setup: audit + auto-fix + AI prompt (also triggers onboarding)
    python3 scripts/audit_skills.py --init
    # Audit only: scan and score all skills
    python3 scripts/audit_skills.py
    # Auto-fix YAML syntax errors (creates backup automatically)
    python3 scripts/audit_skills.py --fix
    # Generate AI prompt to rewrite bad descriptions
    python3 scripts/audit_skills.py --suggest
    # JSON output for CI/CD pipelines
    python3 scripts/audit_skills.py --json

    # ── Routing & Negative Constraints ──
    # Analyze hierarchical routing health
    python3 scripts/audit_skills.py --routing
    # Generate negative constraint suggestions for skills missing them
    python3 scripts/audit_skills.py --negative-samples

    # ── Backup & Rollback ──
    # Snapshot all SKILL.md files
    python3 scripts/audit_skills.py --backup
    # List available backups
    python3 scripts/audit_skills.py --list-backups
    # Rollback to latest backup
    python3 scripts/audit_skills.py --rollback
    # Rollback to a specific backup
    python3 scripts/audit_skills.py --rollback --backup-id 20260614-151204

    # ── Upstream Feedback ──
    # Generate GitHub issue drafts to share fixes with skill authors
    python3 scripts/audit_skills.py --feedback

`--init` runs the full setup in one pass: audit, auto-fix, and AI prompt generation:

    ┌──────────────┐     ┌──────────────┐     ┌───────────────────┐
    │ 1. Audit     │────▶│ 2. Fix YAML  │────▶│ 3. Suggest prompt │
    │ (find bugs)  │     │ (auto patch) │     │ (AI rewrites)     │
    └──────────────┘     └──────────────┘     └───────────────────┘
           ▲                                            │
           │           Agent applies rewrites           │
           └────────────────────────────────────────────┘
                         Re-audit to verify

| Step | Command | What happens | Needs human? |
|---|---|---|---|
| Audit | `--init` or bare | Scans all SKILL.md files, scores each 0–100 | No |
| Auto-fix | `--fix` or `--init` | Fixes YAML syntax errors, creates backup first | No |
| Suggest | `--suggest` or `--init` | Outputs a prompt with rules + context for each failing skill | No |
| Rewrite | Agent reads the prompt | Rewrites descriptions following the rules | No (agent does it) |
| Verify | bare | Re-run audit to confirm scores improved | No |
Each skill gets a score from 0 to 100 based on six factors:

| Factor | Weight | What it checks |
|---|---|---|
| Trigger condition | ±35 | Does the description say when to fire? ("Use when...") |
| Negative constraint | ±15 | Does it say what NOT to fire on? ("Do NOT use for...") |
| Language strength | ±15 | Directive ("ALWAYS invoke") vs passive ("Helps with") |
| YAML validity | Pass/fail | Does frontmatter parse without errors? |
| Description length | ±15–30 | Under 350 chars, over 30 chars, enough keywords (≥10 words) |
| Conflict detection | Advisory | Do descriptions overlap with other skills? (Jaccard > 0.4) |
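
The conflict check in the last row relies on Jaccard similarity, the size of the intersection of two word sets divided by the size of their union. A minimal sketch, assuming whitespace tokenization on lowercased text (the script's actual tokenizer is not documented here); `jaccard` and `find_conflicts` are hypothetical names.

```python
from itertools import combinations


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over lowercase word sets; 0.0 when both are empty."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def find_conflicts(descriptions, threshold=0.4):
    """Return (name_a, name_b, score) for every pair scoring above the threshold."""
    return [
        (x, y, round(jaccard(descriptions[x], descriptions[y]), 2))
        for x, y in combinations(sorted(descriptions), 2)
        if jaccard(descriptions[x], descriptions[y]) > threshold
    ]


DESCRIPTIONS = {
    "web-search": "use when the user asks to search the web",
    "file-search": "use when the user asks to search local files",
    "weather": "use when the user wants a weather forecast",
}
conflicts = find_conflicts(DESCRIPTIONS)
print(conflicts)
```

The two search skills share 7 of 10 distinct words (0.7) and are flagged, while the shared boilerplate "use when the user" alone keeps the weather skill below the 0.4 threshold.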
Real data: Directive descriptions ("Use when...") achieve ~100% activation rate. Passive descriptions ("Helps with...") only ~37%. (Source: 650-skill activation experiment, Ivan Seleznov, Medium)
| Problem | Prevalence | Who fixes it | How |
|---|---|---|---|
| Missing trigger condition | 65% | Agent (via `--suggest`) | Rewrites with "Use when..." |
| Missing negative constraint | common | Script + Agent | `--negative-samples` suggests, agent applies |
| No category assigned | common | Script (advisory) | `--routing` flags uncategorized skills |
| Weak/passive language | common | Agent (via `--suggest`) | Rewrites to directive form |
| YAML syntax errors | 15% | Script (`--fix`) | Quotes unquoted colons automatically |
| Description too long | 10% | Script (advisory) | Flags for trimming |
| Description overlap | 5% | Agent (judgment call) | Add "Do NOT use for..." constraints |
| Cross-category keyword leak | rare | Script (advisory) | `--routing` flags shared keywords |
| Token budget exceeded | rare | Script (advisory) | Reports total chars across all skills |
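
The YAML fix in the table targets a common failure: a plain (unquoted) value that itself contains `: `, which YAML parsers reject. The sketch below shows one way such a fix could work on flat frontmatter; it is an assumption about the approach, not the code behind `--fix`, and `quote_unquoted_colons` is a hypothetical name.

```python
import re

# Matches a flat `key: value` line; the value is everything after the first ": ".
FIELD = re.compile(r"^(\w[\w-]*): (.*)$")


def quote_unquoted_colons(frontmatter):
    """Wrap plain scalar values that contain ': ' in double quotes."""
    fixed = []
    for line in frontmatter.splitlines():
        match = FIELD.match(line)
        if match:
            key, value = match.groups()
            already_quoted = value[:1] in ('"', "'")
            if ": " in value and not already_quoted:
                # Escape backslashes and quotes so the result is a valid double-quoted scalar.
                escaped = value.replace("\\", "\\\\").replace('"', '\\"')
                line = f'{key}: "{escaped}"'
        fixed.append(line)
    return "\n".join(fixed)


broken = "name: pptx\ndescription: Use when editing slides. Triggers on: PPT, deck."
repaired = quote_unquoted_colons(broken)
print(repaired)
```

Running the function twice leaves already-quoted values untouched, so the fix is safe to re-apply.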
Every --fix and --init automatically creates a backup before modifying files:
| Command | What it does |
|---|---|
| `--backup` | Snapshot all SKILL.md files to ~/.skill-compass-backups/ |
| `--list-backups` | Show all snapshots with timestamps and file counts |
| `--rollback` | Restore everything from the latest backup |
| `--rollback --backup-id <ID>` | Restore from a specific backup |
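
A snapshot of this kind amounts to copying every SKILL.md into a timestamped folder. The following is a sketch under assumptions: the `YYYYMMDD-HHMMSS` id format is inferred from the `--backup-id 20260614-151204` example, the directory layout is a guess, and `snapshot` is a hypothetical function rather than the script's own.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(skills_dir, backup_root, now=None):
    """Copy every SKILL.md under skills_dir into backup_root/<backup-id>/.

    Returns (backup_id, number_of_files_copied).
    """
    skills_dir, backup_root = Path(skills_dir), Path(backup_root)
    backup_id = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    copied = 0
    for source in sorted(skills_dir.rglob("SKILL.md")):
        # Preserve each skill's relative path so a rollback can copy files straight back.
        target = backup_root / backup_id / source.relative_to(skills_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        copied += 1
    return backup_id, copied


# Demo in a throwaway directory so no real skills are touched.
with tempfile.TemporaryDirectory() as tmp:
    skills = Path(tmp) / "skills"
    (skills / "github").mkdir(parents=True)
    (skills / "github" / "SKILL.md").write_text("---\nname: github\n---\n")
    result = snapshot(skills, Path(tmp) / "backups", now=datetime(2026, 6, 14, 15, 12, 4))
    print(result)
```

Keeping the backup root outside the skills directory avoids copying earlier snapshots into later ones.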
`--feedback` compares your improved descriptions against the last backup and generates GitHub issue drafts to share with the original skill authors:

    UPSTREAM FEEDBACK
    Found 1 improved description(s).
    ──────────────────────────────────────
    1. multi-search-engine
    ──────────────────────────────────────
    Repo: https://github.com/Thomaszhou22/multi-search-engine
    Submit with:
    gh issue create -R Thomaszhou22/multi-search-engine \
      --title "Improve description for reliable agent triggering" \
      --body "..."

This creates a community improvement loop: fix locally → share upstream → all users benefit.

Install from ClawHub:

    clawhub install skill-compass-guardian

Or from source:

    git clone https://github.com/Thomaszhou22/skill-compass.git
    cp -r skill-compass ~/.openclaw/skills/

Requires Python 3.8+. No other dependencies.
Repository layout:

    skill-compass/
    ├── SKILL.md                      # Diagnostic workflow + fix guide for agents
    ├── scripts/
    │   └── audit_skills.py           # Automated scanner (Python 3, no deps)
    ├── references/
    │   ├── failure-patterns.md       # 6 failure patterns with real examples
    │   └── hierarchical-routing.md   # Layered dispatch guide & best practices
    └── README.md

- OpenClaw
- Claude Code
- Cursor, Codex CLI, Gemini CLI (anything using SKILL.md)
- Any platform following the agentskills.io standard
MIT-0: free to use, modify, and distribute. No attribution required.