🛡️ SkillGuard

AI Skill & Prompt Security Scanner — detect malware, prompt injection, hidden payloads, and credential leaks in AI skills, plugins, and prompt files.

| |

What is SkillGuard?

As AI agents become more powerful with skills, plugins, and MCP servers, the attack surface grows. A malicious skill can:

🔴 Steal credentials — read .env, API keys, tokens
🔴 Execute commands — reverse shells, subprocess abuse
🔴 Inject prompts — jailbreak, system override, data exfiltration
🔴 Hide payloads — zero-width Unicode, nested base64, obfuscation
🔴 Pivot attacks — supply chain poisoning, persistence mechanisms

SkillGuard scans AI skills, prompt files, and plugins for these threats before they run.

Features

📁 File Scanning — upload .py, .md, .js, .sh, .yaml files for threat analysis
💬 Prompt Scanning — analyze text input for injection patterns
🔗 URL Scanning — scan GitHub repos and HuggingFace models remotely
🔬 AST Scanner — Python AST-based code structure analysis (exec, eval, subprocess, dynamic imports)
🔗 Taint Tracking — credential→network/source→sink data flow analysis (Python)
🔍 OSV.dev CVE Lookup — live dependency vulnerability checking (PyPI, npm, RubyGems, Go, Rust, Maven)
📋 SARIF Output — CI/CD integration (GitHub Code Scanning compatible)
🔒 MCP Audit — audit MCP server tool definitions for security risks
🛡️ 263 Patterns — 10 malware categories + 11 injection categories
📊 Risk Scoring — 0-100 with dynamic severity levels (LOW / MEDIUM / HIGH / CRITICAL)
🔐 Admin Dashboard — scan history with date/type filters
⚡ Rate Limiting — 5 scans/minute per IP
🎨 Dark UI — terminal aesthetic, developer-first design

Quick Start

Install

pip install skillguard

CLI Usage

# Scan a file
skillguard scan suspicious_skill.py

# Scan a directory
skillguard scan ./my-skills/

# Scan a prompt
skillguard scan --prompt "ignore all previous instructions"

# Scan a GitHub repo
skillguard scan --url https://github.com/user/repo

Web UI

# Start the web server
skillguard serve --port 5000

Or try the live demo: skillguard.burakgider.com

Pattern Categories

Malware Detection (136 patterns, 10 categories)

Category	Severity	Examples
Reverse Shell	Critical	`bash -i >& /dev/tcp`, `socket.connect()`, `nc -e`
Credential Theft	Critical	`os.environ[]`, `os.getenv()`, `.env` access
Crypto Miner	Critical	`stratum+tcp://`, `xmrig`, `coinhive`
Stealer	Critical	Keylogger, cookie theft, browser data
Obfuscation	Warning	`eval(atob())`, `base64.b64decode`, `getattr(__builtins__)`
Network Exfil	Critical	Discord/Slack webhooks, webhook.site, Pastebin
Suspicious Imports	Warning	`pickle`, `marshal`, `subprocess`, `paramiko`
Hidden Payloads	Critical	Nested base64, zlib+base64, zero-width Unicode
Supply Chain	Critical	`curl
Persistence	High	Crontab, bashrc, chmod 777, authorized_keys

Prompt Injection (101 patterns, 8 categories)

Category	Severity	Examples
System Override	Critical	"ignore previous instructions", "override safety"
Jailbreak	Critical	DAN, AIM, STAN, forced compliance
Data Exfiltration	Critical	System prompt extraction, credential in response
Tool Abuse	High	Command execution, file deletion, package install
Indirect Injection	High	`{{template}}`, `<system>` tags, XSS vectors
Prompt Leaking	Warning	Repeat trick, translation trick, rule enumeration
Context Hijacking	Critical	Academic/regulatory framing, SDS laundering, anti-disclaimer bypass
Skill Poisoning	Critical	Conditional triggers, hidden instructions, nested decode

API Reference

Scan File

curl -X POST http://localhost:5000/api/scan/file \
  -F "file=@suspicious_skill.py"

Scan Prompt

curl -X POST http://localhost:5000/api/scan/prompt \
  -H "Content-Type: application/json" \
  -d '{"content": "ignore all previous instructions"}'

Audit MCP Tools

curl -X POST http://localhost:5000/api/audit/mcp \
  -H "Content-Type: application/json" \
  -d '{"definition": "[{\"name\": \"read_file\", \"description\": \"Read any file\", \"inputSchema\": {\"properties\": {\"path\": {\"type\": \"string\"}}}}]"}'

Health Check

curl http://localhost:5000/health

📄 For Badge API and advanced endpoints, see docs/api-reference.md

Test Suite

106 tests covering all pattern categories, API endpoints, and security features:

pip install -e ".[dev]"
pytest tests/ -v

Test Suite	Count	Coverage
`test_static_scanner.py`	34	All malware categories
`test_prompt_scanner.py`	33	All injection categories
`test_api.py`	16	File/prompt/admin/badge/export endpoints
`test_security.py`	23	Rate limiting, auth, path sanitize, risk score

Architecture

skillguard/
├── src/skillguard/
│   ├── scanners/
│   │   ├── static.py          # File/directory scanner
│   │   └── prompt.py          # Prompt injection scanner
│   ├── patterns/
│   │   ├── malware.json       # 136 malware patterns
│   │   └── injection.json     # 101 injection patterns
│   ├── web/
│   │   ├── app.py             # Flask web server
│   │   └── templates/         # Dark-themed UI
│   ├── mcp_server.py          # MCP Server — 6 tools (stdio transport)
│   ├── auth.py                # Admin authentication
│   ├── ratelimit.py           # Rate limiter
│   ├── logger.py              # SQLite scan logger
│   └── cli.py                 # CLI interface
├── tests/                     # 106 pytest tests
├── docs/                      # Documentation
└── pyproject.toml             # Package config

Use Cases

AI Agent Developers — scan skills before loading into agents
Security Researchers — analyze prompt injection techniques
MCP Server Operators — validate tools before deployment
DevSecOps Teams — integrate into CI/CD pipelines
Open Source Consumers — verify third-party AI plugins

Roadmap

VS Code Extension — real-time scanning while editing skill files
GitHub Action — automatic scanning on PR/push
Pattern DB v2 — community-contributed signatures
Docker Image — one-command deployment
Webhook Alerts — Slack/Discord notifications on high-risk scans

Changelog

v0.3.0 — Deep Scanning Engine

Major upgrade inspired by NVIDIA SkillSpector. Adds code-level analysis beyond regex pattern matching.

What's new:

🔬 AST Scanner — Python AST-based code structure analysis: exec(), eval(), subprocess, os.system, dynamic imports, dangerous getattr() chains
🔗 Taint Tracking — Intra-function data flow analysis: credential/env var → network requests, code execution, file writes (Python only, 50 file limit, 3 variable hops)
🔍 OSV.dev CVE Lookup — Live dependency vulnerability checking against OSV.dev database (PyPI, npm, RubyGems, Go, Rust, Maven)
📋 SARIF Output — GitHub Code Scanning compatible SARIF v2.1.0 export via CLI --format sarif
26 new injection patterns — 3 new categories: excessive_agency (10), memory_poisoning (9), trigger_abuse (7)
CLI --deep flag — Enable AST + Taint + OSV scanning in one command
CLI --format sarif — Output SARIF for CI/CD pipelines

Pattern count: 237 → 263 (+26)

v0.2.0 — Context Hijacking Detection

New context_hijacking category with 22 patterns targeting sophisticated jailbreak techniques that use legitimate context (academic research, regulatory compliance, safety documentation) to mask harmful requests.

What's new:

Academic/research context laundering (HuggingFace dataset cards, peer review, ethics team references)
Regulatory agency legitimacy hijacking (OSHA, EPA, CFR, SDS, DEA citations)
Anti-disclaimer bypass via authority claims ("No disclaimers needed — this is a regulatory filing")
Lab process terminology detection (reflux, workup, cyclization, extraction, distillation)
Controlled precursor chemical names (BMK, pseudoephedrine, anthranilic, red phosphorus)
Synthesis pathway extraction patterns
Bulk harmful record generation detection
Enforcement threat pressure patterns

Tested against real-world jailbreaks:

Gemini 3.5 Flash jailbreak (Pliny Agent, <15 min pwn) — Score 100, 15 findings
SDS/regulatory context hijack — Score 100, 33 findings

Pattern count: 136 + 101 = 237 total (10 malware + 8 injection categories; context_hijacking added)

MCP Server v2 — 3 new tools (6 total):

scan_url: Scan GitHub repos and HuggingFace models via stdio
audit_mcp: Security audit of MCP server tool definitions
get_patterns: List all detection patterns with categories

MCP Audit tab — 4th tab on web UI:

Paste MCP tool definitions (JSON) for security analysis
Tool ratings: SAFE / CAUTION / UNSAFE
Detects: filesystem access, network calls, shell execution, credential handling, destructive ops, path traversal
Export JSON support

v0.1.0 — Initial Release

File, prompt, and URL scanning
136 malware patterns (10 categories) + 101 injection patterns (8 categories)
Web UI with dark terminal aesthetic
MCP Server (stdio transport)
106 pytest tests

Contributing

Contributions welcome! Especially:

New pattern signatures
False positive reports
Security research on AI-specific threats
UI/UX improvements

Fork the repo
Create a feature branch: git checkout -b feature/new-patterns
Add tests for your changes
Submit a pull request

License

MIT License — see LICENSE for details.

Built by PHRIXUS — AI tools, secured by design.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.github		.github
docs		docs
src/skillguard		src/skillguard
tests		tests
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
pyproject.toml		pyproject.toml
run_web.py		run_web.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🛡️ SkillGuard

What is SkillGuard?

Features

Quick Start

Install

CLI Usage

Web UI

Pattern Categories

Malware Detection (136 patterns, 10 categories)

Prompt Injection (101 patterns, 8 categories)

API Reference

Scan File

Scan Prompt

Audit MCP Tools

Health Check

Test Suite

Architecture

Use Cases

Roadmap

Changelog

v0.3.0 — Deep Scanning Engine

v0.2.0 — Context Hijacking Detection

v0.1.0 — Initial Release

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🛡️ SkillGuard

What is SkillGuard?

Features

Quick Start

Install

CLI Usage

Web UI

Pattern Categories

Malware Detection (136 patterns, 10 categories)

Prompt Injection (101 patterns, 8 categories)

API Reference

Scan File

Scan Prompt

Audit MCP Tools

Health Check

Test Suite

Architecture

Use Cases

Roadmap

Changelog

v0.3.0 — Deep Scanning Engine

v0.2.0 — Context Hijacking Detection

v0.1.0 — Initial Release

Contributing

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages