JAXEN

Stateful Shodan harvest platform for AI/ML infrastructure discovery.



JAXEN is a single Go binary that harvests Shodan queries into a stateful SQLite asset database and drives AI/ML infrastructure discovery at scale. It issues queries, paginates through the results, and persists every discovered host into empire.db with IP, port, org, ISP, hostname, product, version, and the favicon hash computed by Shodan's index. Pre-built dork workflows cover six AI/ML exposure categories. The import subcommand ingests external tool output with optional Shodan enrichment. Forensic subcommands deep-inspect TLS certificates from live IPs, PEM files, or extracted firmware trees.

A raw Shodan query gives you a list of hits. JAXEN gives you a stateful corpus that survives across sessions, lets you diff against the previous harvest, and feeds the next stage of the assessment chain.

Features

  • Single Go binary, no CGo, no C compiler required (pure-Go SQLite via modernc.org/sqlite)
  • Stateful empire.db asset database, one row per unique IP plus port pair, idempotent upserts
  • Six pre-built AI/ML dork workflows (vector-db, inference, orchestration, gpu, mlops, gateway)
  • --web mode uses a browser session through shodan-fetch instead of an API key (zero query credits)
  • import subcommand ingests external tool output, with --no-lookup for manual Shodan web exports
  • Favicon hash enrichment is passive (no extra request); product-default matches are flagged in notes
  • TLS cert forensics from live IPs, PEM files, or extracted firmware directory trees
  • CIDR diffing with Slack and Discord webhook alerting on new assets
  • run subcommand compiles and executes a Go file with direct empire.db access, so per-survey analysis can live in version control
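
The webhook alerting listed above can be pictured with a small sketch. Slack incoming webhooks read the message from a "text" field and Discord webhooks read it from "content"; how JAXEN itself tells the two apart and how it words the alert is not documented here, so the URL check, the message format, and the buildAlert name are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// buildAlert returns a JSON body announcing newly seen assets.
// Assumption: the target service is inferred from the webhook URL.
// Slack incoming webhooks expect {"text": ...}; Discord expects {"content": ...}.
func buildAlert(webhookURL string, newAssets []string) ([]byte, error) {
	msg := fmt.Sprintf("%d new asset(s): %s", len(newAssets), strings.Join(newAssets, ", "))
	field := "text" // Slack-style body unless the URL looks like Discord
	if strings.Contains(webhookURL, "discord.com/api/webhooks") {
		field = "content"
	}
	return json.Marshal(map[string]string{field: msg})
}

func main() {
	// Placeholder URL and asset; nothing is sent over the network here.
	body, err := buildAlert("https://hooks.slack.com/services/EXAMPLE", []string{"192.0.2.10:11434"})
	if err != nil {
		fmt.Println("build alert:", err)
		return
	}
	fmt.Println(string(body))
}
```

The body would then be sent with an HTTP POST and a Content-Type of application/json.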

Installation

go install -v github.com/nuclide-research/JAXEN@latest

Or build from source:

git clone https://github.com/nuclide-research/JAXEN
cd JAXEN
go build -o jaxen .

Requires Go 1.25 or later.

Configure the Shodan API key:

export SHODAN_API_KEY="your_key"
# or write it to ~/.config/shodan/api_key (Shodan CLI location)
# env var takes precedence; both are checked
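
The lookup order described in the comments above (environment variable first, then the Shodan CLI key file) can be sketched as follows. This is an illustration of that precedence, not JAXEN's actual code; resolveAPIKey is a hypothetical name, and trimming whitespace from the file contents is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolveAPIKey returns envValue when it is non-empty, otherwise the
// trimmed contents of keyFile. The environment variable takes precedence.
func resolveAPIKey(envValue, keyFile string) (string, error) {
	if k := strings.TrimSpace(envValue); k != "" {
		return k, nil
	}
	data, err := os.ReadFile(keyFile)
	if err != nil {
		return "", fmt.Errorf("no API key in environment or %s: %w", keyFile, err)
	}
	if k := strings.TrimSpace(string(data)); k != "" {
		return k, nil
	}
	return "", errors.New("API key file is empty")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot locate home directory:", err)
		return
	}
	keyFile := filepath.Join(home, ".config", "shodan", "api_key")
	key, err := resolveAPIKey(os.Getenv("SHODAN_API_KEY"), keyFile)
	if err != nil {
		fmt.Println("no key found:", err)
		return
	}
	// Never print the key itself; report only that one was found.
	fmt.Printf("key loaded (%d chars)\n", len(key))
}
```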

Usage

jaxen hunt --max 200 --clean 'product:"Ollama" port:11434'
jaxen ai-hunt vector-db
jaxen import --no-lookup ips.txt
jaxen diff --webhook https://hooks.slack.com/...
jaxen aimap 192.0.2.10

Full command surface

jaxen hunt [--clean] [--export] [--passive <domain>] [--web] [--pages N]
           [--max N] [--delay N] "<query>"
jaxen ai-hunt [category]
jaxen menlo-hunt [--org name]
jaxen buckets [--workers N] [--timeout N] <org-name>
jaxen profile [--org name] [<ip>]
jaxen cert-parse <path>
jaxen pivot <url>
jaxen aimap <ip|cidr> [hostname]
jaxen analyze [--fast]
jaxen graph
jaxen diff [--webhook <url>] [old.json] [new.json]
jaxen list [--org <filter>]
jaxen import [--no-lookup] [--delay N] [--source name] <file>
jaxen run <file.go> [args...]
jaxen nuke <ip> [ip...]
jaxen cheatsheet
jaxen --version

hunt flags

Flag                Default  Effect
--clean             off      strip CDN/cloud noise (Google, Amazon, Microsoft, Cloudflare, Akamai, Fastly)
--export            off      write summary.csv
--passive <domain>  -        expand query via crt.sh CT logs for the domain
--web               off      use browser session instead of API key; auto-enabled when no key
--pages N           1        pages to fetch in --web mode (10 IPs/page)
--max N             50       maximum hosts to paginate (100/page; raises query-credit cost)
--delay N           1.0      seconds between pages when paginating
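
To see how --max and --delay interact, the arithmetic can be sketched as below, using the 100 hosts per API page stated in the table. The function names are hypothetical, and the sketch assumes a pause between consecutive pages but none after the last one, which may differ from what JAXEN does.

```go
package main

import (
	"fmt"
	"time"
)

const apiPageSize = 100 // hosts per API page, per the --max description

// planPages returns how many pages are needed to collect
// min(maxHosts, totalAvailable) hosts, and that host count.
func planPages(maxHosts, totalAvailable int) (pages, hosts int) {
	hosts = maxHosts
	if totalAvailable < hosts {
		hosts = totalAvailable
	}
	if hosts <= 0 {
		return 0, 0
	}
	pages = (hosts + apiPageSize - 1) / apiPageSize // ceiling division
	return pages, hosts
}

// totalWait is the time spent sleeping for a given page count.
// Assumption: one pause between consecutive pages, none after the last.
func totalWait(pages int, delaySeconds float64) time.Duration {
	if pages < 2 {
		return 0
	}
	return time.Duration(float64(pages-1) * delaySeconds * float64(time.Second))
}

func main() {
	pages, hosts := planPages(200, 16473)
	fmt.Printf("pages=%d hosts=%d wait=%s\n", pages, hosts, totalWait(pages, 1.0))
}
```

With --max 200 against 16473 available results, this gives two pages, matching a run that fetches 100 and then 200 cumulative hosts.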

import flags

Flag           Effect
--no-lookup    store entries at port 0 without Shodan enrichment (for manual exports)
--delay N      seconds between Shodan API calls
--source name  tag the notes field with the originating tool name
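
As a rough picture of what --no-lookup implies, the sketch below turns a list of IPs into entries stored at port 0 and tagged with a source name. The input format (one IP per line, with # comments), the "source:" notes prefix, and the example tool name are assumptions for illustration; the formats JAXEN actually accepts are not listed here.

```go
package main

import (
	"bufio"
	"fmt"
	"net/netip"
	"strings"
)

// entry mirrors the subset of asset columns that import can fill
// without a Shodan lookup.
type entry struct {
	IP    string
	Port  int
	Notes string
}

// parseIPText reads one IP per line, skipping blanks, comments,
// invalid addresses, and duplicates. Every entry gets port 0.
func parseIPText(text, source string) ([]entry, error) {
	var out []entry
	seen := make(map[netip.Addr]bool)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		addr, err := netip.ParseAddr(line)
		if err != nil || seen[addr] {
			continue // not an IP, or already recorded
		}
		seen[addr] = true
		out = append(out, entry{IP: addr.String(), Port: 0, Notes: "source:" + source})
	}
	return out, sc.Err()
}

func main() {
	sample := "192.0.2.10\n# exported by hand\n192.0.2.11\n192.0.2.10\n"
	entries, err := parseIPText(sample, "example-tool")
	if err != nil {
		fmt.Println("parse:", err)
		return
	}
	for _, e := range entries {
		fmt.Printf("%s port=%d notes=%q\n", e.IP, e.Port, e.Notes)
	}
}
```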

AI hunt categories

Six pre-built Shodan dork workflows covering the AI/ML exposure surface:

Category       Targets
vector-db      ChromaDB, Qdrant, Weaviate, Milvus
inference      Ollama, vLLM, LocalAI, LM Studio
orchestration  Flowise, Langflow, Dify, Open WebUI, AnythingLLM
gpu            NVIDIA DCGM, Triton inference
mlops          MLflow, Kubeflow, Label Studio
gateway        LiteLLM proxy, OpenRouter, PromptLayer

jaxen ai-hunt all runs every category sequentially.

Database schema

Two tables in empire.db:

assets (
    id         INTEGER PRIMARY KEY,
    ip         TEXT NOT NULL,
    port       INTEGER NOT NULL,
    org        TEXT,
    isp        TEXT,
    hostname   TEXT,
    product    TEXT,
    version    TEXT,
    favicon_hash TEXT,
    first_seen TEXT,
    last_seen  TEXT,
    status     TEXT DEFAULT 'active',
    notes      TEXT,
    UNIQUE(ip, port)
)

cloud_assets (
    id          INTEGER PRIMARY KEY,
    org         TEXT,
    provider    TEXT,
    bucket_name TEXT,
    url         TEXT,
    status_code INTEGER,
    public      INTEGER DEFAULT 0,
    first_seen  TEXT,
    UNIQUE(url)
)

favicon_hash stores Shodan's index-computed http.favicon.hash passively (no extra request). When the hash matches a known product default, JAXEN appends FAVICON-PRESENT DEFAULT-FAVICON:<product> to notes.
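
The notes tagging described above might look roughly like this. The hash value and product name in the lookup table are placeholders, not real fingerprints, and the space separator and the duplicate check (so repeated upserts do not append the tag twice) are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// knownDefaults maps a favicon hash (stored as TEXT in the schema)
// to the product that ships it by default. Placeholder data only.
var knownDefaults = map[string]string{
	"-1000000001": "ExampleProductA",
}

// tagNotes appends the default-favicon marker when the hash matches
// a known product default, and leaves notes unchanged otherwise.
func tagNotes(notes, faviconHash string) string {
	product, ok := knownDefaults[faviconHash]
	if !ok {
		return notes
	}
	tag := "FAVICON-PRESENT DEFAULT-FAVICON:" + product
	if strings.Contains(notes, tag) {
		return notes // already tagged on an earlier upsert
	}
	if notes == "" {
		return tag
	}
	return notes + " " + tag
}

func main() {
	fmt.Println(tagNotes("source:example-tool", "-1000000001"))
}
```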

hunt output

hunt writes recon_dump.json:

{
  "query":                  string,
  "timestamp":              string,
  "total_results_available": int,
  "returned":               int,
  "hosts":                  [{ ...Shodan HostData fields... }]
}

The previous run rotates to recon_dump.old.json automatically, giving diff a baseline without manual management.
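
Given the two dump files, finding newly appeared assets reduces to a set difference on IP and port. The sketch below assumes each entry in hosts carries Shodan's ip_str and port fields and compares by exact ip:port; it does not attempt the CIDR-level comparison JAXEN advertises, and newAssets is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"sort"
	"strconv"
)

// host holds the two Shodan banner fields needed to identify an asset.
type host struct {
	IP   string `json:"ip_str"`
	Port int    `json:"port"`
}

// dump is the subset of recon_dump.json used for diffing.
type dump struct {
	Query string `json:"query"`
	Hosts []host `json:"hosts"`
}

// newAssets returns the sorted ip:port pairs present in newJSON
// but absent from oldJSON.
func newAssets(oldJSON, newJSON []byte) ([]string, error) {
	var oldDump, newDump dump
	if err := json.Unmarshal(oldJSON, &oldDump); err != nil {
		return nil, fmt.Errorf("old dump: %w", err)
	}
	if err := json.Unmarshal(newJSON, &newDump); err != nil {
		return nil, fmt.Errorf("new dump: %w", err)
	}
	seen := make(map[string]bool)
	for _, h := range oldDump.Hosts {
		seen[net.JoinHostPort(h.IP, strconv.Itoa(h.Port))] = true
	}
	var added []string
	for _, h := range newDump.Hosts {
		k := net.JoinHostPort(h.IP, strconv.Itoa(h.Port))
		if !seen[k] {
			seen[k] = true // also collapses duplicates within the new dump
			added = append(added, k)
		}
	}
	sort.Strings(added)
	return added, nil
}

func main() {
	// Inline samples stand in for recon_dump.old.json and recon_dump.json.
	oldJSON := []byte(`{"hosts":[{"ip_str":"192.0.2.10","port":11434}]}`)
	newJSON := []byte(`{"hosts":[{"ip_str":"192.0.2.10","port":11434},{"ip_str":"192.0.2.11","port":8000}]}`)
	added, err := newAssets(oldJSON, newJSON)
	if err != nil {
		fmt.Println("diff:", err)
		return
	}
	fmt.Println(added)
}
```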

Example

$ jaxen hunt --max 200 --clean 'product:"Ollama" port:11434'
[*] querying Shodan: product:"Ollama" port:11434
[+] page   1: cum=  100 / total=16473
[+] page   2: cum=  200 / total=16473
[*] truncated to --max 200 of 16473 available; raise --max to capture more
[*] --clean: dropped 12 CDN/cloud, 188 remain
[*] total available: 16473  |  returning: 188
[+] saved 48291 bytes -> recon_dump.json
[+] empire.db: 188 assets upserted (141 with index favicon, 3 product-default matches)
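
The --clean step in the run above drops CDN and cloud hosts after pagination. A minimal sketch of such a filter follows, assuming a case-insensitive substring match on the org field against the six providers named in the flag table; JAXEN's real matching rule (org, ISP, ASN, or something else) is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// noiseOrgs lists the providers named for --clean, lowercased.
var noiseOrgs = []string{"google", "amazon", "microsoft", "cloudflare", "akamai", "fastly"}

type host struct {
	IP  string
	Org string
}

// isNoise reports whether org mentions one of the noise providers.
func isNoise(org string) bool {
	o := strings.ToLower(org)
	for _, n := range noiseOrgs {
		if strings.Contains(o, n) {
			return true
		}
	}
	return false
}

// clean returns the hosts that survive filtering and the drop count.
func clean(hosts []host) (kept []host, dropped int) {
	for _, h := range hosts {
		if isNoise(h.Org) {
			dropped++
			continue
		}
		kept = append(kept, h)
	}
	return kept, dropped
}

func main() {
	hosts := []host{
		{IP: "192.0.2.10", Org: "Example Hosting Ltd"},
		{IP: "192.0.2.11", Org: "Amazon.com, Inc."},
		{IP: "192.0.2.12", Org: "Cloudflare, Inc."},
	}
	kept, dropped := clean(hosts)
	fmt.Printf("dropped %d CDN/cloud, %d remain\n", dropped, len(kept))
}
```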

Scope

JAXEN is a harvest and correlation tool. It queries Shodan and writes to a local SQLite database. It does not fingerprint services at the HTTP response level (that is aimap). It does not run exploit chains. The hunt subcommand costs Shodan query credits; bound credit consumption with --max and --clean. Only target systems you own or have explicit written authorization to test.

Our other projects

  • aimap — AI/ML infrastructure fingerprint scanner, the deep-enum stage after the harvest
  • shodan-fetch — authenticated Shodan web UI scraper, no API credits
  • scanner — active TCP+TLS banner stage between passive discovery and aimap
  • VisorLog — finding ledger and ingest pipeline for AI-infra reports
  • VisorGraph — cert-pivot to operator attribution

License

MIT. Part of the NuClide toolchain. Contact: nuclide-research.com
