JAXEN is a single Go binary that harvests Shodan queries into a stateful SQLite asset database and drives AI/ML infrastructure discovery at scale. It issues queries, paginates the full population, and persists every discovered host into `empire.db` with IP, port, org, ISP, hostname, product, version, and Shodan's index-computed favicon hash. Pre-built dork workflows cover six AI/ML exposure categories. The `import` subcommand ingests external tool output with optional Shodan enrichment. Forensic subcommands deep-inspect TLS certificates from live IPs, PEM files, or extracted firmware trees.
A raw Shodan query gives you a list of hits. JAXEN gives you a stateful corpus that survives across sessions, lets you diff against the previous harvest, and feeds the next stage of the assessment chain.
- Single Go binary, no CGo, no C compiler required (pure-Go SQLite via `modernc.org/sqlite`)
- Stateful `empire.db` asset database, unique on IP plus port, idempotent upserts
- Six pre-built AI/ML dork workflows (vector-db, inference, orchestration, gpu, mlops, gateway)
- `--web` mode uses a browser session through shodan-fetch instead of an API key (zero query credits)
- `import` subcommand ingests external tool output, with `--no-lookup` for manual Shodan web exports
- Favicon hash enrichment is passive (no extra request), with product-default matches flagged in notes
- TLS cert forensics from live IPs, PEM files, or extracted firmware directory trees
- CIDR diffing with Slack and Discord webhook alerting on new assets
- `run` subcommand compiles and executes a Go file with direct `empire.db` access (per-survey analysis lives in version control)
Install with `go install`:

    go install -v github.com/nuclide-research/JAXEN@latest

Or build from source:

    git clone https://github.com/nuclide-research/JAXEN
    cd JAXEN
    go build -o jaxen .

Requires Go 1.25 or later.
Configure the Shodan API key:

    export SHODAN_API_KEY="your_key"
    # or write it to ~/.config/shodan/api_key (Shodan CLI location)
    # env var takes precedence; both are checked

Typical invocations:

    jaxen hunt --max 200 --clean 'product:"Ollama" port:11434'
    jaxen ai-hunt vector-db
    jaxen import --no-lookup ips.txt
    jaxen diff --webhook https://hooks.slack.com/...
    jaxen aimap 192.0.2.10

Full command surface:

    jaxen hunt [--clean] [--export] [--passive <domain>] [--web] [--pages N]
               [--max N] [--delay N] "<query>"
    jaxen ai-hunt [category]
    jaxen menlo-hunt [--org name]
    jaxen buckets [--workers N] [--timeout N] <org-name>
    jaxen profile [--org name] [<ip>]
    jaxen cert-parse <path>
    jaxen pivot <url>
    jaxen aimap <ip|cidr> [hostname]
    jaxen analyze [--fast]
    jaxen graph
    jaxen diff [--webhook <url>] [old.json] [new.json]
    jaxen list [--org <filter>]
    jaxen import [--no-lookup] [--delay N] [--source name] <file>
    jaxen run <file.go> [args...]
    jaxen nuke <ip> [ip...]
    jaxen cheatsheet
    jaxen --version

`hunt` flags:

| Flag | Default | Effect |
|---|---|---|
| `--clean` | off | strip CDN/cloud noise (Google, Amazon, Microsoft, Cloudflare, Akamai, Fastly) |
| `--export` | off | write summary.csv |
| `--passive <domain>` | | expand query via crt.sh CT logs for the domain |
| `--web` | off | use browser session instead of API key; auto-enabled when no key |
| `--pages N` | 1 | pages to fetch in `--web` mode (10 IPs/page) |
| `--max N` | 50 | maximum hosts to paginate (100/page; raises query-credit cost) |
| `--delay N` | 1.0 | seconds between pages when paginating |

`import` flags:

| Flag | Effect |
|---|---|
| `--no-lookup` | store entries at port 0 without Shodan enrichment (for manual exports) |
| `--delay N` | seconds between Shodan API calls |
| `--source name` | tag the notes field with the originating tool name |

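Because API results arrive 100 per page, `--max` translates directly into a number of page requests. The sketch below shows that arithmetic. It assumes whole pages are fetched and the result is then truncated to `--max`; `plannedPages` is a hypothetical name and this is not JAXEN's code.

```go
package main

import "fmt"

// pageSize is the API page size stated in the hunt flag table (100/page).
const pageSize = 100

// plannedPages (hypothetical helper) returns how many pages would be
// requested to collect up to max hosts when total hosts are available.
func plannedPages(max, total int) int {
	want := max
	if total < want {
		want = total // cannot fetch more than the query matches
	}
	if want <= 0 {
		return 0
	}
	return (want + pageSize - 1) / pageSize // ceiling division
}

func main() {
	const total = 16473 // total taken from the example run in this README
	for _, max := range []int{50, 200, 1000} {
		fmt.Printf("--max %d -> %d page(s)\n", max, plannedPages(max, total))
	}
}
```

With the default `--max 50` a single page suffices; `--max 200` needs two pages, which matches the two `page` lines in the example run.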
Six pre-built Shodan dork workflows covering the AI/ML exposure surface:

| Category | Targets |
|---|---|
| `vector-db` | ChromaDB, Qdrant, Weaviate, Milvus |
| `inference` | Ollama, vLLM, LocalAI, LM Studio |
| `orchestration` | Flowise, Langflow, Dify, Open WebUI, AnythingLLM |
| `gpu` | NVIDIA DCGM, Triton inference |
| `mlops` | MLflow, Kubeflow, Label Studio |
| `gateway` | LiteLLM proxy, OpenRouter, PromptLayer |

`jaxen ai-hunt all` runs every category sequentially.
Two tables in `empire.db`:

    assets (
      id INTEGER PRIMARY KEY,
      ip TEXT NOT NULL,
      port INTEGER NOT NULL,
      org TEXT,
      isp TEXT,
      hostname TEXT,
      product TEXT,
      version TEXT,
      favicon_hash TEXT,
      first_seen TEXT,
      last_seen TEXT,
      status TEXT DEFAULT 'active',
      notes TEXT,
      UNIQUE(ip, port)
    )

    cloud_assets (
      id INTEGER PRIMARY KEY,
      org TEXT,
      provider TEXT,
      bucket_name TEXT,
      url TEXT,
      status_code INTEGER,
      public INTEGER DEFAULT 0,
      first_seen TEXT,
      UNIQUE(url)
    )

`favicon_hash` stores Shodan's index-computed `http.favicon.hash` passively (no extra request). When the hash matches a known product default, JAXEN appends `FAVICON-PRESENT DEFAULT-FAVICON:<product>` to `notes`.
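The upsert semantics implied by `UNIQUE(ip, port)` and the favicon note can be modelled in memory as below. This is a sketch under stated assumptions, not JAXEN's implementation: the map stands in for the `assets` table, `upsert` and `knownDefaults` are hypothetical names, and the hash `123456789` with product `ExampleProduct` is a placeholder rather than a real favicon hash. In SQLite the same effect is usually achieved with `INSERT ... ON CONFLICT(ip, port) DO UPDATE`.

```go
package main

import (
	"fmt"
	"strings"
)

// assetKey mirrors the UNIQUE(ip, port) constraint on the assets table.
type assetKey struct {
	IP   string
	Port int
}

// asset carries a subset of the assets columns, enough for the sketch.
type asset struct {
	Product     string
	FaviconHash string
	FirstSeen   string
	LastSeen    string
	Notes       string
}

// knownDefaults maps a favicon hash to a product name.
// The entry is a placeholder, not a real product hash.
var knownDefaults = map[string]string{"123456789": "ExampleProduct"}

// upsert inserts a new row or refreshes an existing one. Re-running it with
// the same input leaves one row per (ip, port) and never duplicates the note.
func upsert(db map[assetKey]*asset, ip string, port int, product, favicon, now string) {
	k := assetKey{IP: ip, Port: port}
	a, ok := db[k]
	if !ok {
		a = &asset{FirstSeen: now} // first_seen is set once, on insert
		db[k] = a
	}
	a.Product, a.FaviconHash, a.LastSeen = product, favicon, now
	if p, match := knownDefaults[favicon]; match {
		tag := "FAVICON-PRESENT DEFAULT-FAVICON:" + p
		if !strings.Contains(a.Notes, tag) {
			a.Notes = strings.TrimSpace(a.Notes + " " + tag)
		}
	}
}

func main() {
	db := map[assetKey]*asset{}
	upsert(db, "192.0.2.10", 11434, "Ollama", "123456789", "2025-01-01T00:00:00Z")
	upsert(db, "192.0.2.10", 11434, "Ollama", "123456789", "2025-02-01T00:00:00Z")
	a := db[assetKey{IP: "192.0.2.10", Port: 11434}]
	fmt.Println(len(db), a.FirstSeen, a.LastSeen, a.Notes)
}
```

The second call updates `LastSeen` only, which is the property that lets repeated harvests of the same query run safely against one database.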
`hunt` writes `recon_dump.json`:

    {
      "query": string,
      "timestamp": string,
      "total_results_available": int,
      "returned": int,
      "hosts": [{ ...Shodan HostData fields... }]
    }

The previous run rotates to `recon_dump.old.json` automatically, giving `diff` a baseline without manual management.
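To illustrate what a baseline comparison between the two dumps involves, here is a minimal Go sketch that reports hosts present in the new dump but absent from the old one. It is a simplification and not the `diff` subcommand itself: it compares `ip:port` pairs rather than CIDR ranges, `newAssets` is a hypothetical name, and the JSON field names `ip_str` and `port` are assumed from Shodan's host records because the dump format above elides them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// host keeps only the fields this sketch needs.
// ip_str and port are assumed field names, see the note above.
type host struct {
	IP   string `json:"ip_str"`
	Port int    `json:"port"`
}

// dump mirrors the documented top-level shape of recon_dump.json.
type dump struct {
	Query string `json:"query"`
	Hosts []host `json:"hosts"`
}

// newAssets returns the sorted ip:port pairs found only in the new dump.
func newAssets(oldJSON, newJSON []byte) ([]string, error) {
	var before, after dump
	if err := json.Unmarshal(oldJSON, &before); err != nil {
		return nil, fmt.Errorf("old dump: %w", err)
	}
	if err := json.Unmarshal(newJSON, &after); err != nil {
		return nil, fmt.Errorf("new dump: %w", err)
	}
	seen := make(map[string]bool, len(before.Hosts))
	for _, h := range before.Hosts {
		seen[fmt.Sprintf("%s:%d", h.IP, h.Port)] = true
	}
	var added []string
	for _, h := range after.Hosts {
		k := fmt.Sprintf("%s:%d", h.IP, h.Port)
		if !seen[k] {
			seen[k] = true // also suppresses duplicates within the new dump
			added = append(added, k)
		}
	}
	sort.Strings(added)
	return added, nil
}

func main() {
	// Inline stand-ins for recon_dump.old.json and recon_dump.json.
	oldDump := []byte(`{"query":"q","hosts":[{"ip_str":"192.0.2.10","port":11434}]}`)
	newDump := []byte(`{"query":"q","hosts":[{"ip_str":"192.0.2.10","port":11434},{"ip_str":"198.51.100.7","port":11434}]}`)
	added, err := newAssets(oldDump, newDump)
	if err != nil {
		fmt.Println("diff failed:", err)
		return
	}
	fmt.Println("new assets:", added)
}
```

In practice the two byte slices would come from `os.ReadFile` on the dump files rather than inline literals.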
Example run:

    $ jaxen hunt --max 200 --clean 'product:"Ollama" port:11434'
    [*] querying Shodan: product:"Ollama" port:11434
    [+] page 1: cum= 100 / total=16473
    [+] page 2: cum= 200 / total=16473
    [*] truncated to --max 200 of 16473 available; raise --max to capture more
    [*] --clean: dropped 12 CDN/cloud, 188 remain
    [*] total available: 16473 | returning: 188
    [+] saved 48291 bytes -> recon_dump.json
    [+] empire.db: 188 assets upserted (141 with index favicon, 3 product-default matches)

JAXEN is a harvest and correlation tool. It queries Shodan and writes to a local SQLite database. It does not fingerprint services at the HTTP response level (that is aimap). It does not run exploit chains. The `hunt` subcommand costs Shodan query credits; bound credit consumption with `--max` and `--clean`. Only target systems you own or have explicit written authorization to test.
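The `--clean` line in the example run reports how many hosts were dropped as CDN/cloud noise. A filter of that kind could look like the sketch below. How JAXEN actually matches providers is not documented here, so the case-insensitive substring match on org and ISP is an assumption, and `record`, `isNoise`, and `clean` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// noisyProviders lists the providers named in the --clean flag description.
var noisyProviders = []string{"google", "amazon", "microsoft", "cloudflare", "akamai", "fastly"}

// record is a reduced host entry with only the fields the filter inspects.
type record struct {
	IP  string
	Org string
	ISP string
}

// isNoise reports whether the org or ISP names one of the listed providers.
// Assumption: a case-insensitive substring match.
func isNoise(r record) bool {
	label := strings.ToLower(r.Org + " " + r.ISP)
	for _, p := range noisyProviders {
		if strings.Contains(label, p) {
			return true
		}
	}
	return false
}

// clean splits the input into kept records and a count of dropped ones.
func clean(in []record) (kept []record, dropped int) {
	for _, r := range in {
		if isNoise(r) {
			dropped++
			continue
		}
		kept = append(kept, r)
	}
	return kept, dropped
}

func main() {
	hosts := []record{
		{IP: "192.0.2.10", Org: "Example University", ISP: "Example Net"},
		{IP: "198.51.100.7", Org: "Amazon.com, Inc.", ISP: "Amazon.com"},
		{IP: "203.0.113.5", Org: "Example Hosting", ISP: "Cloudflare, Inc."},
	}
	kept, dropped := clean(hosts)
	fmt.Printf("--clean: dropped %d CDN/cloud, %d remain\n", dropped, len(kept))
}
```

Substring matching is deliberately broad; a stricter implementation might match on ASN instead, at the cost of maintaining an ASN list.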
- aimap — AI/ML infrastructure fingerprint scanner, the deep-enum stage after the harvest
- shodan-fetch — authenticated Shodan web UI scraper, no API credits
- scanner — active TCP+TLS banner stage between passive discovery and aimap
- VisorLog — finding ledger and ingest pipeline for AI-infra reports
- VisorGraph — cert-pivot to operator attribution
MIT. Part of the NuClide toolchain. Contact: nuclide-research.com