AI Skill & Prompt Security Scanner — detect malware, prompt injection, hidden payloads, and credential leaks in AI skills, plugins, and prompt files.
As AI agents become more powerful with skills, plugins, and MCP servers, the attack surface grows. A malicious skill can:
- 🔴 Steal credentials: read `.env`, API keys, tokens
- 🔴 Execute commands: reverse shells, subprocess abuse
- 🔴 Inject prompts: jailbreak, system override, data exfiltration
- 🔴 Hide payloads: zero-width Unicode, nested base64, obfuscation
- 🔴 Pivot attacks: supply chain poisoning, persistence mechanisms
SkillGuard scans AI skills, prompt files, and plugins for these threats before they run.
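
To make the "hidden payloads" threat concrete, here is a minimal sketch of how zero-width characters and nested base64 could be flagged. It is not SkillGuard's implementation: the function names, the character set, and the 16-character length floor are assumptions chosen for illustration.

```python
import base64
import binascii
import re

# Illustrative subset of invisible code points used to hide text (assumption).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Runs of 16+ base64 characters with optional padding; the floor is an assumption.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_zero_width(text: str) -> list[int]:
    """Return the offsets of zero-width characters in text."""
    return [i for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def base64_depth(blob: str, max_depth: int = 5) -> int:
    """Count how many times blob decodes from base64 into printable ASCII."""
    depth = 0
    current = blob
    while depth < max_depth:
        try:
            decoded = base64.b64decode(current, validate=True).decode("ascii")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            break
        if not decoded or not decoded.isprintable():
            break
        depth += 1
        current = decoded
    return depth


def scan_text(text: str) -> dict:
    """Report zero-width offsets and base64 runs that decode at least twice."""
    nested = [
        match.group(0)
        for match in BASE64_RUN.finditer(text)
        if base64_depth(match.group(0)) >= 2
    ]
    return {"zero_width_offsets": find_zero_width(text), "nested_base64": nested}
```

A decode depth of two or more is treated as suspicious here because legitimate files rarely wrap base64 inside base64; a real scanner would likely also handle zlib and other encodings.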
- 📁 File Scanning: upload `.py`, `.md`, `.js`, `.sh`, `.yaml` files for threat analysis
- 💬 Prompt Scanning: analyze text input for injection patterns
- 🔗 URL Scanning: scan GitHub repos and HuggingFace models remotely
- 🔬 AST Scanner: Python AST-based code structure analysis (`exec`, `eval`, `subprocess`, dynamic imports)
- 🔗 Taint Tracking: credential → network and source → sink data flow analysis (Python)
- 🔍 OSV.dev CVE Lookup: live dependency vulnerability checking (PyPI, npm, RubyGems, Go, Rust, Maven)
- 📋 SARIF Output: CI/CD integration (GitHub Code Scanning compatible)
- 🔒 MCP Audit: audit MCP server tool definitions for security risks
- 🛡️ 263 Patterns: 10 malware categories + 11 injection categories
- 📊 Risk Scoring: 0-100 with dynamic severity levels (LOW / MEDIUM / HIGH / CRITICAL)
- 🔐 Admin Dashboard: scan history with date/type filters
- ⚡ Rate Limiting: 5 scans/minute per IP
- 🎨 Dark UI: terminal aesthetic, developer-first design
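
The "5 scans/minute per IP" limit can be pictured as a sliding window per client. The sketch below is one possible approach, not necessarily what `ratelimit.py` does; the class name `SlidingWindowLimiter` and the injectable `clock` parameter are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window` seconds for each key (e.g. client IP)."""

    def __init__(self, limit: int = 5, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A web handler would call `limiter.allow(client_ip)` before starting a scan and return HTTP 429 when it yields `False`.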
```bash
pip install skillguard
```

```bash
# Scan a file
skillguard scan suspicious_skill.py
# Scan a directory
skillguard scan ./my-skills/
# Scan a prompt
skillguard scan --prompt "ignore all previous instructions"
# Scan a GitHub repo
skillguard scan --url https://github.com/user/repo
```

```bash
# Start the web server
skillguard serve --port 5000
```

Or try the live demo: skillguard.burakgider.com
| Category | Severity | Examples |
|---|---|---|
| Reverse Shell | Critical | bash -i >& /dev/tcp, socket.connect(), nc -e |
| Credential Theft | Critical | os.environ[], os.getenv(), .env access |
| Crypto Miner | Critical | stratum+tcp://, xmrig, coinhive |
| Stealer | Critical | Keylogger, cookie theft, browser data |
| Obfuscation | Warning | eval(atob()), base64.b64decode, getattr(__builtins__) |
| Network Exfil | Critical | Discord/Slack webhooks, webhook.site, Pastebin |
| Suspicious Imports | Warning | pickle, marshal, subprocess, paramiko |
| Hidden Payloads | Critical | Nested base64, zlib+base64, zero-width Unicode |
| Supply Chain | Critical | `curl |
| Persistence | High | Crontab, bashrc, chmod 777, authorized_keys |
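
As a rough illustration of how signature tables like the one above can drive a scanner, the sketch below matches a few of the listed examples line by line. The regexes, the `Pattern` dataclass, and the snake_case category identifiers are assumptions; the real signatures live in `malware.json` and may look different.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    category: str
    severity: str
    regex: re.Pattern


# A handful of signatures based on the examples column; illustrative only.
PATTERNS = [
    Pattern("reverse_shell", "critical", re.compile(r"bash\s+-i\s+>&\s*/dev/tcp")),
    Pattern("reverse_shell", "critical", re.compile(r"\bnc\s+-e\b")),
    Pattern("crypto_miner", "critical", re.compile(r"stratum\+tcp://")),
    Pattern("credential_theft", "critical", re.compile(r"os\.(environ\[|getenv\()")),
    Pattern("obfuscation", "warning", re.compile(r"base64\.b64decode")),
]


def scan_source(source: str) -> list[dict]:
    """Return one finding per (line, pattern) match."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in PATTERNS:
            if pattern.regex.search(line):
                findings.append(
                    {"line": lineno, "category": pattern.category, "severity": pattern.severity}
                )
    return findings
```

Regex matching of this kind is cheap but easy to evade with string concatenation or aliasing, which is the gap code-level analysis is meant to narrow.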
| Category | Severity | Examples |
|---|---|---|
| System Override | Critical | "ignore previous instructions", "override safety" |
| Jailbreak | Critical | DAN, AIM, STAN, forced compliance |
| Data Exfiltration | Critical | System prompt extraction, credential in response |
| Tool Abuse | High | Command execution, file deletion, package install |
| Indirect Injection | High | {{template}}, <system> tags, XSS vectors |
| Prompt Leaking | Warning | Repeat trick, translation trick, rule enumeration |
| Context Hijacking | Critical | Academic/regulatory framing, SDS laundering, anti-disclaimer bypass |
| Skill Poisoning | Critical | Conditional triggers, hidden instructions, nested decode |
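
The severities in these tables feed the 0-100 risk score. The weights and thresholds SkillGuard actually uses are not documented in this README, so the sketch below uses hypothetical values purely to show the shape of such a calculation: sum per-finding weights, clamp to 100, then map the score to a level.

```python
# Hypothetical weights and thresholds; not taken from SkillGuard's source.
SEVERITY_WEIGHTS = {"critical": 40, "high": 25, "warning": 10}
LEVELS = [(75, "CRITICAL"), (50, "HIGH"), (25, "MEDIUM"), (0, "LOW")]


def risk_score(severities: list[str]) -> int:
    """Sum per-finding weights and clamp the result to the 0-100 range."""
    total = sum(SEVERITY_WEIGHTS.get(s.lower(), 0) for s in severities)
    return min(100, total)


def risk_level(score: int) -> str:
    """Map a 0-100 score to LOW / MEDIUM / HIGH / CRITICAL."""
    for threshold, name in LEVELS:
        if score >= threshold:
            return name
    return "LOW"
```

Under these assumed weights, a single Critical finding plus one High finding scores 65 and lands in HIGH, while three Critical findings saturate at 100.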
```bash
curl -X POST http://localhost:5000/api/scan/file \
  -F "file=@suspicious_skill.py"
```

```bash
curl -X POST http://localhost:5000/api/scan/prompt \
  -H "Content-Type: application/json" \
  -d '{"content": "ignore all previous instructions"}'
```

```bash
curl -X POST http://localhost:5000/api/audit/mcp \
  -H "Content-Type: application/json" \
  -d '{"definition": "[{\"name\": \"read_file\", \"description\": \"Read any file\", \"inputSchema\": {\"properties\": {\"path\": {\"type\": \"string\"}}}}]"}'
```

```bash
curl http://localhost:5000/health
```

📄 For Badge API and advanced endpoints, see docs/api-reference.md
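
The same prompt-scan request can be sent from Python with only the standard library. This is a sketch: the endpoint and payload come from the curl example above, but the helper names are hypothetical and the assumption that the server replies with a JSON body is not confirmed here.

```python
import json
import urllib.request


def build_prompt_scan_request(
    content: str, base_url: str = "http://localhost:5000"
) -> urllib.request.Request:
    """Build the POST request for /api/scan/prompt without sending it."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/scan/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan_prompt(
    content: str, base_url: str = "http://localhost:5000", timeout: float = 10.0
) -> dict:
    """Send the request and parse the reply (assumed to be JSON)."""
    request = build_prompt_scan_request(content, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With `skillguard serve --port 5000` running, `scan_prompt("ignore all previous instructions")` should return whatever the API responds with; see docs/api-reference.md for the actual response fields.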
106 tests covering all pattern categories, API endpoints, and security features:
```bash
pip install -e ".[dev]"
pytest tests/ -v
```

| Test Suite | Count | Coverage |
|---|---|---|
| `test_static_scanner.py` | 34 | All malware categories |
| `test_prompt_scanner.py` | 33 | All injection categories |
| `test_api.py` | 16 | File/prompt/admin/badge/export endpoints |
| `test_security.py` | 23 | Rate limiting, auth, path sanitize, risk score |
```
skillguard/
├── src/skillguard/
│   ├── scanners/
│   │   ├── static.py        # File/directory scanner
│   │   └── prompt.py        # Prompt injection scanner
│   ├── patterns/
│   │   ├── malware.json     # 136 malware patterns
│   │   └── injection.json   # 101 injection patterns
│   ├── web/
│   │   ├── app.py           # Flask web server
│   │   └── templates/       # Dark-themed UI
│   ├── mcp_server.py        # MCP Server — 6 tools (stdio transport)
│   ├── auth.py              # Admin authentication
│   ├── ratelimit.py         # Rate limiter
│   ├── logger.py            # SQLite scan logger
│   └── cli.py               # CLI interface
├── tests/                   # 106 pytest tests
├── docs/                    # Documentation
└── pyproject.toml           # Package config
```
- AI Agent Developers — scan skills before loading into agents
- Security Researchers — analyze prompt injection techniques
- MCP Server Operators — validate tools before deployment
- DevSecOps Teams — integrate into CI/CD pipelines
- Open Source Consumers — verify third-party AI plugins
- VS Code Extension — real-time scanning while editing skill files
- GitHub Action — automatic scanning on PR/push
- Pattern DB v2 — community-contributed signatures
- Docker Image — one-command deployment
- Webhook Alerts — Slack/Discord notifications on high-risk scans
Major upgrade inspired by NVIDIA SkillSpector. Adds code-level analysis beyond regex pattern matching.
What's new:
- 🔬 AST Scanner: Python AST-based code structure analysis covering `exec()`, `eval()`, `subprocess`, `os.system`, dynamic imports, and dangerous `getattr()` chains
- 🔗 Taint Tracking: intra-function data flow analysis from credentials and env vars to network requests, code execution, and file writes (Python only, 50 file limit, 3 variable hops)
- 🔍 OSV.dev CVE Lookup: live dependency vulnerability checking against the OSV.dev database (PyPI, npm, RubyGems, Go, Rust, Maven)
- 📋 SARIF Output: GitHub Code Scanning compatible SARIF v2.1.0 export via CLI `--format sarif`
- 26 new injection patterns in 3 new categories: `excessive_agency` (10), `memory_poisoning` (9), `trigger_abuse` (7)
- CLI `--deep` flag: enable AST + Taint + OSV scanning in one command
- CLI `--format sarif`: output SARIF for CI/CD pipelines
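
To show what AST-based analysis adds over regex matching, here is a minimal sketch built on Python's standard `ast` module. It is not the SkillGuard scanner itself: the `DANGEROUS_CALLS` set and both function names are assumptions, and it only flags direct calls by name.

```python
import ast

# Call names flagged in this sketch; the real scanner's rule set may differ.
DANGEROUS_CALLS = {
    "exec", "eval", "__import__",
    "os.system", "subprocess.run", "subprocess.Popen",
}


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'os.system' from a Name/Attribute chain."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        prefix = dotted_name(node.value)
        return f"{prefix}.{node.attr}" if prefix else node.attr
    return ""


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, call name) pairs for calls in DANGEROUS_CALLS."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)
```

Because it works on the parsed tree, this approach ignores matches inside comments and string literals, but it still misses aliased imports such as `import os as o`; resolving those requires tracking bindings, which is where taint tracking comes in.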
Pattern count: 237 → 263 (+26)
New context_hijacking category with 22 patterns targeting sophisticated jailbreak techniques that use legitimate context (academic research, regulatory compliance, safety documentation) to mask harmful requests.
What's new:
- Academic/research context laundering (HuggingFace dataset cards, peer review, ethics team references)
- Regulatory agency legitimacy hijacking (OSHA, EPA, CFR, SDS, DEA citations)
- Anti-disclaimer bypass via authority claims ("No disclaimers needed — this is a regulatory filing")
- Lab process terminology detection (reflux, workup, cyclization, extraction, distillation)
- Controlled precursor chemical names (BMK, pseudoephedrine, anthranilic, red phosphorus)
- Synthesis pathway extraction patterns
- Bulk harmful record generation detection
- Enforcement threat pressure patterns
Tested against real-world jailbreaks:
- Gemini 3.5 Flash jailbreak (Pliny Agent, <15 min pwn) — Score 100, 15 findings
- SDS/regulatory context hijack — Score 100, 33 findings
Pattern count: 136 + 101 = 237 total (10 malware + 8 injection categories; context_hijacking added)
MCP Server v2 — 3 new tools (6 total):
- `scan_url`: Scan GitHub repos and HuggingFace models via stdio
- `audit_mcp`: Security audit of MCP server tool definitions
- `get_patterns`: List all detection patterns with categories
MCP Audit tab — 4th tab on web UI:
- Paste MCP tool definitions (JSON) for security analysis
- Tool ratings: SAFE / CAUTION / UNSAFE
- Detects: filesystem access, network calls, shell execution, credential handling, destructive ops, path traversal
- Export JSON support
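
A keyword-based audit of tool definitions might look like the sketch below. The rule regexes, the risk labels, and the mapping from risks to SAFE / CAUTION / UNSAFE are all hypothetical; only the input shape (a JSON list of tools with `name`, `description`, and `inputSchema.properties`) follows the `/api/audit/mcp` example in this README.

```python
import json
import re

# Hypothetical rules: risk label -> regex over name, description, and parameter names.
RISK_RULES = {
    "filesystem_access": re.compile(r"\b(file|path|directory|read|write)\b", re.I),
    "shell_execution": re.compile(r"\b(shell|exec|command|bash)\b", re.I),
    "network_calls": re.compile(r"\b(url|http|fetch|request)\b", re.I),
    "destructive_ops": re.compile(r"\b(delete|remove|drop|truncate)\b", re.I),
}
# Risks that push a tool straight to UNSAFE in this sketch (assumption).
UNSAFE_RISKS = {"shell_execution", "destructive_ops"}


def audit_tool(tool: dict) -> dict:
    """Rate one tool definition from the words it exposes."""
    params = tool.get("inputSchema", {}).get("properties", {})
    text = " ".join(
        [tool.get("name", "").replace("_", " "), tool.get("description", ""), *params]
    )
    risks = sorted(label for label, rx in RISK_RULES.items() if rx.search(text))
    if any(risk in UNSAFE_RISKS for risk in risks):
        rating = "UNSAFE"
    elif risks:
        rating = "CAUTION"
    else:
        rating = "SAFE"
    return {"name": tool.get("name"), "rating": rating, "risks": risks}


def audit_definition(definition: str) -> list[dict]:
    """Audit every tool in a JSON-encoded list of tool definitions."""
    return [audit_tool(tool) for tool in json.loads(definition)]
```

Under these assumed rules, the `read_file` tool from the API example is rated CAUTION for filesystem access.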
- File, prompt, and URL scanning
- 136 malware patterns (10 categories) + 101 injection patterns (8 categories)
- Web UI with dark terminal aesthetic
- MCP Server (stdio transport)
- 106 pytest tests
Contributions welcome! Especially:
- New pattern signatures
- False positive reports
- Security research on AI-specific threats
- UI/UX improvements
- Fork the repo
- Create a feature branch: `git checkout -b feature/new-patterns`
- Add tests for your changes
- Submit a pull request
MIT License — see LICENSE for details.
Built by PHRIXUS — AI tools, secured by design.