Governed Reconnaissance, Intelligence Management, Network Inspection & Reporting
A Claude Code orchestrator that coordinates specialized security testing agents for comprehensive penetration testing engagements. GRIMNIR classifies targets, scopes engagements, delegates work to the right specialist, tracks progress, shares intelligence across agents, and compiles unified reports.
┌──────────────────────────┐
│ GRIMNIR │
│ PM / Orchestrator │
│ │
│ Classifies targets │
│ Scopes engagements │
│ Tracks state & intel │
│ Delegates testing │
│ Compiles reports │
└───────────┬──────────────┘
│
┌───────────▼──────────────┐
│ HUGINN │
│ Phase 0 Recon │
│ │
│ Subdomain enum │
│ Port scanning │
│ Tech fingerprinting │
│ Asset map generation │
│ Intel distribution │
└───┬─────────┬─────────┬──┘
│ │ │
┌────────────┘ │ └────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ HUNTER │ │ SPECTER │ │ ARTEMIS │
│ Network & Infra │ │ API & Web Sec │ │ GenAI & LLM Sec │
│ │ │ │ │ │
│ Port Scanning │ │ REST, GraphQL │ │ Prompt Injection│
│ Service Enum │ │ gRPC, Auth │ │ Jailbreaking │
│ Exploitation │ │ Injection │ │ System Prompt │
│ Priv Escalation │ │ Business Logic │ │ Agent Abuse │
│ Lateral Movement│ │ WAF Bypass │ │ RAG Poisoning │
└──────────────────┘ └──────────────────┘ └──────────────────┘
| Agent | Role | Model | Named After |
|---|---|---|---|
| GRIMNIR | Orchestrator / PM | opus | Odin's mask ("the hooded one") |
| HUGINN | Recon & Intel | sonnet | Odin's raven ("thought") |
| Specter | API & Web Security | sonnet | — |
| ARTEMIS | GenAI & LLM Security | sonnet | AI Red Team Engagement, Methodology & Intelligence System |
| Hunter | Network & Infrastructure | sonnet | — |
All specialist agents (Specter, ARTEMIS, Hunter) include an Operating Protocol that defines event-driven rules for state updates (triggered by phase transitions), intel logging (triggered by cross-agent-relevant discoveries), and intel consumption (check routed intel at startup). This ensures agents use the data systems consistently without requiring hard-coded quotas.
# Launch via wrapper (recommended)
grimnir
# Or from the project directory
cd ~/workspace/grimnir && claude

On startup, GRIMNIR shows available commands. You can use slash commands or natural language:
# Slash commands
/engage acme-corp # Start engagement (reads project.md if it exists)
/full-audit acme-corp # Jump straight to testing
# Natural language: GRIMNIR detects intent automatically
"I need to pentest acme-corp, details are in project.md"
"Test the API at https://api.acme.com"
"Resume the acme-corp engagement"

Create `~/workspace/<project>/project.md` before starting. GRIMNIR reads it automatically during /engage:
# Copy the template
mkdir -p ~/workspace/acme-corp
cp ~/workspace/grimnir/templates/project.md ~/workspace/acme-corp/project.md
# Edit with your target details, credentials, scope, etc.

The intake file replaces manual Q&A: GRIMNIR extracts target info, credentials, scope, and priorities from it. A template is at templates/project.md. If no intake file exists, GRIMNIR asks for details interactively.
| Command | Description |
|---|---|
| `/engage <target>` | Start new engagement (classify, scope, plan, initialize state) |
| `/scope <target>` | Define engagement scope and authorization |
| `/recon <target>` | Coordinated reconnaissance (delegates to HUGINN) |
| `/net-audit <target>` | Network/infrastructure audit (Hunter) |
| `/api-audit <target>` | API/web audit (Specter) |
| `/genai-audit <target>` | GenAI/LLM audit (ARTEMIS) |
| `/full-audit <target>` | Combined audit: HUGINN recon, then all applicable specialists |
| `/report` | Generate unified report from all agent findings |
| `/resume <project> [agent]` | Resume a paused or interrupted engagement from checkpoint |
| `/progress` | Engagement progress dashboard |
| `/sitrep` | Strategic situation report |
| `/intel` | Query and manage the cross-agent intel feed |
/engage target
│
├── Classify target type
├── Define scope & authorization
├── Generate test plan
│
▼
/full-audit target
│
├── Phase 0: HUGINN recon (foreground)
│ ├── Passive: subdomains, DNS, certs, URL archives
│ ├── Active: port scan, service detection, HTTP probing
│ ├── Fingerprint: technologies, WAF, headers, TLS
│ ├── Map: crawling, JS analysis, endpoint discovery
│ └── Output: asset-map.json + intel.jsonl
│
├── Phase 1: Specialists (parallel) + HUGINN correlator (background)
│ ├── Specter: API/web testing (skips recon, uses asset map)
│ ├── ARTEMIS: AI/LLM testing (skips recon, uses asset map)
│ └── Hunter: Network testing (skips discovery, uses asset map)
│
├── Phase 2: Cross-domain attack chains
│
└── Phase 3: Unified reporting
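
The phase ordering above can be sketched in a few lines of shell. This is only an illustration of the control flow (foreground recon, parallel specialists, then chaining and reporting). `run_agent` and `full_audit` are hypothetical names, the echo stands in for real delegation, and the background HUGINN correlator is left out for brevity.

```shell
# Placeholder for real delegation; here each agent just reports completion.
run_agent() { echo "$1: done"; }

# Hypothetical sketch of the /full-audit phase ordering.
# Arguments: the specialists selected for this target.
full_audit() (
  run_agent huginn          # Phase 0: recon runs in the foreground and must finish first
  for agent in "$@"; do
    run_agent "$agent" &    # Phase 1: specialists start in parallel
  done
  wait                      # block until every specialist has finished
  echo "chains: done"       # Phase 2: cross-domain attack chains
  echo "report: done"       # Phase 3: unified reporting
)

full_audit specter hunter
```

The order of the specialist lines is not deterministic, since they run concurrently. Only the Phase 0 line is guaranteed to come first and the Phase 3 line last.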
GRIMNIR includes three foundational data layers that enable structured tracking, cross-agent intelligence, and engagement resumability.
All findings are created as JSON with companion markdown via scripts/finding.sh. Each agent has type-specific extensions:
- Specter: endpoint, HTTP method, request/response, response code
- ARTEMIS: success rate, model, guardrail status, attack vector
- Hunter: affected hosts, ports, protocols, CVEs, access level gained
Findings support attack chain tagging — chain_id and chain_position fields link related findings across agents into multi-step attack narratives (e.g., Hunter compromises a host → Specter exploits the web app → ARTEMIS extracts the AI system prompt).
Findings are stored per-agent ($PROJECT_DIR/<agent>/findings/FINDING-NNN.json) and merged at report time. Schema: schemas/finding.schema.json.
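
Because findings live in one directory per agent, the next ID can be derived from the files already present. The following is a minimal sketch, assuming IDs are zero-padded to three digits as the FINDING-NNN pattern suggests. `next_finding_id` is a hypothetical helper and may not match how scripts/finding.sh allocates IDs.

```shell
# Hypothetical helper: print the next FINDING-NNN id for an agent's findings directory.
# Assumes ids are zero-padded to three digits, as the FINDING-NNN pattern suggests.
next_finding_id() (
  dir=$1
  max=0
  for f in "$dir"/FINDING-*.json; do
    [ -e "$f" ] || continue                    # glob did not match: no findings yet
    n=${f##*/FINDING-}                         # strip path and prefix -> "007.json"
    n=${n%.json}                               # strip extension       -> "007"
    case $n in ''|*[!0-9]*) continue ;; esac   # skip names that are not purely numeric
    n=$(printf '%s' "$n" | sed 's/^0*//')      # drop leading zeros so 009 is not read as octal
    [ "${n:-0}" -gt "$max" ] && max=${n:-0}
  done
  printf 'FINDING-%03d\n' $((max + 1))
)
```

Called as `next_finding_id "$PROJECT_DIR/specter/findings"`, it prints FINDING-001 for an empty directory and otherwise one past the highest existing ID.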
scripts/state.sh tracks per-agent phase, status, progress, and finding counts in state.json. Supports:
- Phase transitions with automatic history tracking
- Checkpoints for session resumability (completed tests, pending tests, last action)
- Auto-derived status — overall engagement status computed from agent statuses
Schema: schemas/state.schema.json.
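
To make the auto-derived status concrete, here is one way such a rule could look. The status names (`not_started`, `in_progress`, `paused`, `completed`) and their precedence are assumptions made for illustration. The actual values and logic are defined by scripts/state.sh and schemas/state.schema.json.

```shell
# Hypothetical rule set: derive an overall engagement status from per-agent statuses.
# Status names and precedence are assumptions, not taken from scripts/state.sh.
derive_status() (
  [ "$#" -gt 0 ] || { echo not_started; exit 0; }   # no agents registered yet
  seen_running=0
  seen_paused=0
  all_done=1
  for s in "$@"; do
    case $s in
      completed)   ;;                                # does not change the outcome
      in_progress) seen_running=1; all_done=0 ;;
      paused)      seen_paused=1;  all_done=0 ;;
      *)           all_done=0 ;;                     # not_started or anything unknown
    esac
  done
  if   [ "$all_done" -eq 1 ];     then echo completed
  elif [ "$seen_running" -eq 1 ]; then echo in_progress
  elif [ "$seen_paused" -eq 1 ];  then echo paused
  else                                 echo not_started
  fi
)
```

Under these assumed rules, `derive_status completed in_progress paused` prints `in_progress`: any active agent keeps the whole engagement active.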
scripts/intel.sh manages intel.jsonl — an append-only JSONL feed for cross-agent intelligence sharing. Discovery types include endpoints, services, credentials, technologies, vulnerability hints, subdomains, certificates, and relationships.
Intel items are tagged for routing (needs-huginn, needs-specter, needs-artemis, needs-hunter, high-value) so agents can pick up discoveries from other agents.
HUGINN pre-populates the intel feed during Phase 0 recon so specialists start with rich context.
Schema: schemas/intel.schema.json.
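
The append-only feed and tag-based routing can be illustrated with two small functions. This is a sketch, not the scripts/intel.sh interface: the function names and the JSON field names (`type`, `value`, `tags`) are assumptions, and values are written without JSON escaping, so it only suits simple strings.

```shell
# Illustrative only: field names are assumptions, not schemas/intel.schema.json.
# Appends one JSON object per line; existing lines are never rewritten.
intel_append() (
  file=$1
  type=$2
  value=$3
  shift 3
  tags=""
  for t in "$@"; do
    tags="${tags:+$tags,}\"$t\""     # build a comma-separated list of quoted tags
  done
  printf '{"type":"%s","value":"%s","tags":[%s]}\n' "$type" "$value" "$tags" >> "$file"
)

# Print the items routed to one agent by matching its needs-<agent> tag.
intel_for_agent() (
  file=$1
  agent=$2
  grep -F "\"needs-$agent\"" "$file"
)
```

For example, after `intel_append intel.jsonl endpoint https://api.acme.com/graphql needs-specter high-value`, running `intel_for_agent intel.jsonl specter` prints that line, while `intel_for_agent intel.jsonl hunter` prints nothing.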
scripts/handoff.sh auto-serializes engagement state into structured JSON payloads for agent delegation. It reads state.json, scope.md, asset-map.json, and intel.jsonl, then outputs a single JSON payload containing:
- Engagement ID, target info, scope (in/out)
- Agent-relevant asset map sections (Specter gets endpoints/WAF, Hunter gets hosts/ports, ARTEMIS gets AI indicators)
- Unprocessed intel items routed to the target agent
- Checkpoint data for resuming interrupted work (`--resume`)
- Optional rate limits, proxy config, and deadlines
# Standard handoff
bash scripts/handoff.sh --project-dir $HOME/workspace/acme-corp --agent specter
# Resume from checkpoint
bash scripts/handoff.sh --project-dir $HOME/workspace/acme-corp --agent hunter --resume

HUGINN produces asset-map.json, a structured inventory of the entire attack surface. It includes hosts, subdomains, technologies, endpoints, auth mechanisms, AI indicators, WAF detection, TLS certificates, and JS analysis results.
Specialists receive relevant sections in their handoff payloads, eliminating duplicated recon work.
Schema: schemas/asset-map.schema.json.
All engagement data lives under ~/workspace/$PROJECT/ using absolute paths. This is enforced at the system level, project level, and in every agent definition.
~/workspace/acme-corp/ (example engagement)
├── project.md # Intake file (target, creds, scope, priorities)
├── scope.md # Authorization and scope
├── classification.md # Target classification
├── test-plan.md # Unified test plan
├── state.json # Engagement state (state.sh)
├── intel.jsonl # Cross-agent intel feed (intel.sh)
├── asset-map.json # HUGINN's asset inventory
├── recon/ # HUGINN's recon output (shared)
├── specter/ # Specter's working directory
│ ├── recon/
│ ├── findings/ # FINDING-NNN.json + .md
│ └── reports/
├── artemis/ # ARTEMIS's working directory
│ ├── recon/
│ ├── findings/
│ └── reports/
├── hunter/ # Hunter's working directory
│ ├── recon/
│ ├── findings/
│ └── reports/
├── findings/ # Merged findings (post-dedup)
├── evidence/ # Screenshots, PoCs, captures
├── logs/ # Tool output, scan logs
└── report.md # Unified final report
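
The directory skeleton above could be created by a helper along these lines. `scaffold_engagement` is a hypothetical name. It creates directories only, on the assumption that files such as state.json and intel.jsonl are written by their own scripts, and it rejects relative paths to mirror the absolute-path rule.

```shell
# Hypothetical helper: create the engagement directory skeleton shown above.
# Only directories are created; data files are left to their own scripts.
scaffold_engagement() (
  root=$1
  case $root in
    /*) ;;                                                     # absolute path: accepted
    *)  echo "error: project dir must be absolute: $root" >&2
        exit 1 ;;
  esac
  for agent in specter artemis hunter; do
    mkdir -p "$root/$agent/recon" "$root/$agent/findings" "$root/$agent/reports" || exit 1
  done
  # Shared directories at the engagement root
  mkdir -p "$root/recon" "$root/findings" "$root/evidence" "$root/logs"
)
```

Usage would look like `scaffold_engagement "$HOME/workspace/acme-corp"`.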
grimnir/
├── CLAUDE.md ← Orchestrator identity, workflow, classification rules
├── .claude/
│ ├── agents/ ← Agent definitions (HUGINN, Hunter, Specter, ARTEMIS, report-compiler)
│ ├── commands/ ← Slash commands (/engage, /recon, /intel, etc.)
│ ├── skills/ ← Target classification, engagement management
│ └── settings.json ← Permissions and hooks
├── schemas/ ← JSON schemas (findings, state, intel, asset-map)
├── scripts/ ← Core scripts (finding.sh, state.sh, intel.sh, handoff.sh)
├── docs/ ← Project documentation (landscape analysis, etc.)
├── templates/ ← Report template, finding rendering, project intake template
├── huginn/ ← Reconnaissance & intel agent
│ ├── scripts/ ← 3 scripts (recon-pipeline, asset-map, correlator)
│ └── docs/ ← Recon methodology
├── hunter/ ← Network & infrastructure agent
│ ├── .claude/ ← 6 commands, 10 skills, settings
│ ├── scripts/ ← 6 scripts (discovery, enum, vuln scan, cred spray, SMB, full audit)
│ └── wordlists/ ← Ports, credentials, SNMP communities
├── specter/ ← API & web security agent
│ ├── .claude/ ← 7 subagents, 11 skills, 10 commands
│ ├── scripts/ ← 8 scripts
│ ├── wordlists/ ← 9 wordlists
│ └── templates/ ← Nuclei templates, checklists
├── artemis/ ← GenAI & LLM security agent
│ ├── .claude/ ← 7 subagents, 11 skills, 11 commands
│ ├── scripts/ ← 9 scripts
│ ├── wordlists/ ← 9 wordlists
│ └── references/ ← OWASP, methodology, payloads
└── setup/ ← Provisioning system
├── setup.sh ← Main entry point
├── profiles/ ← lite, standard, full
└── modules/ ← Modular installers
# Raspberry Pi / drop box (2-4GB)
./setup/setup.sh --profile lite
# Pi 5 / VPS (4-8GB)
./setup/setup.sh --profile standard
# Kali workstation (8GB+)
./setup/setup.sh --profile full
# Preview without installing
./setup/setup.sh --dry-run --profile lite

| Profile | Agents | RAM | Metasploit |
|---|---|---|---|
| lite | HUGINN + Hunter (recon only) | 2-4GB | No |
| standard | HUGINN + Hunter + Specter | 4-8GB | Optional |
| full | HUGINN + Hunter + Specter + ARTEMIS | 8GB+ | Yes |
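
The RAM column can be turned into a simple profile suggestion. The sketch below assumes whole gigabytes as input and resolves the overlapping boundaries in the table (4GB and 8GB) in favor of the larger profile. That choice and the `suggest_profile` name are assumptions, not behavior of setup.sh.

```shell
# Hypothetical helper: suggest a setup profile from available RAM in whole GB.
# Boundary handling (4 -> standard, 8 -> full) is an assumption; the table's ranges overlap.
suggest_profile() (
  mem_gb=$1
  case $mem_gb in
    ''|*[!0-9]*) echo "error: RAM must be a whole number of GB" >&2; exit 1 ;;
  esac
  if   [ "$mem_gb" -ge 8 ]; then echo full
  elif [ "$mem_gb" -ge 4 ]; then echo standard
  else                           echo lite
  fi
)
```

It can be combined with the preview flag shown earlier, for example `./setup/setup.sh --dry-run --profile "$(suggest_profile 6)"`.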
| Framework | Agent | Categories |
|---|---|---|
| PTES | Hunter | Network/infrastructure methodology |
| MITRE ATT&CK | Hunter | Adversarial tactics and techniques |
| OWASP API Top 10 (2023) | Specter | API1-API10 |
| OWASP WSTG v5.0 | Specter | WSTG-* |
| OWASP LLM Top 10 (2025) | ARTEMIS | LLM01-LLM10 |
| OWASP Agentic Top 10 (2026) | ARTEMIS | ASI01-ASI10 |
| MITRE ATLAS | ARTEMIS | Adversarial ML tactics |
GRIMNIR automatically classifies targets to determine which agents to deploy. HUGINN always runs Phase 0 recon first.
| Target Type | Specialists | Examples |
|---|---|---|
| IP / CIDR / hostname | Hunter | 10.10.10.0/24, server.corp.local |
| REST API | Specter | OpenAPI/Swagger, JSON endpoints |
| GraphQL / gRPC | Specter | /graphql, HTTP/2 + protobuf |
| Web application | Specter | HTML forms, server-rendered pages |
| LLM chatbot | ARTEMIS | Chat interface, AI responses |
| RAG application | ARTEMIS + Specter | Document upload + AI + API layer |
| Agentic system | ARTEMIS | AI with tool use, autonomous actions |
| Full-stack AI product | Specter + ARTEMIS | API layer + AI layer |
| Infrastructure + apps | Hunter + Specter | Mixed services on hosts |
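
The table above is effectively a lookup from target type to specialists, with HUGINN always prepended. A minimal sketch of that mapping follows. The type keys (`network`, `rest-api`, `rag`, and so on) and both function names are hypothetical labels invented for this example. GRIMNIR's real classification is driven by its orchestrator rules and skills, not by a static table like this.

```shell
# Hypothetical lookup mirroring the table above; the type keys are invented labels.
specialists_for() (
  case $1 in
    network)                    echo "hunter" ;;
    rest-api|graphql|grpc|web)  echo "specter" ;;
    llm-chatbot|agentic)        echo "artemis" ;;
    rag|full-stack-ai)          echo "artemis specter" ;;
    infra-apps)                 echo "hunter specter" ;;
    *) echo "error: unknown target type: $1" >&2; exit 1 ;;
  esac
)

# HUGINN always runs Phase 0 recon first, so it leads every plan.
plan_for() (
  specialists=$(specialists_for "$1") || exit 1
  echo "huginn $specialists"
)
```

With these definitions, `plan_for rag` prints `huginn artemis specter`, and an unrecognized type exits with a non-zero status.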
- Scope enforcement: `scripts/scope.sh` that agents call before requests to validate targets
- Bug bounty export: per-finding submission format for HackerOne/Bugcrowd/Intigriti
- Retest mode: `/retest` command to replay reproduction steps against confirmed findings
- Custom nuclei templates: auto-generate templates from confirmed finding JSON
- Cloud/container module: AWS/Azure/GCP metadata, S3, IAM, container escapes
- Multi-provider support: model-agnostic variant for Gemini and other LLM providers