feat: --from-nuclei adapter with response-body ingestion #2

Merged
noahpotti merged 1 commit into main from feat/from-nuclei on Jun 22, 2026

Conversation

@noahpotti

Contributor

What

Adds geiger --from-nuclei -, a third scanner-ingest adapter alongside --from-gitleaks / --from-trufflehog. It reads nuclei JSONL output, re-types each value with geiger's own recognizer, and ranks by blast radius — closing the loop with the geiger-nuclei-templates extract-template pack:

nuclei -t templates/ -l targets.txt -j -irr | geiger --live --from-nuclei -

How it works

  • Re-typing, not trust. nuclei casts a wide net (it extracts anything credential-shaped); geiger is the authority — it re-validates every value with the full gitleaks detector and drops non-credentials.
  • matched-at as the source label drives three things for free: title provenance (from https://host/.env), cross-URL dedup + "also exposed in" correlation (one key exposed at N URLs collapses to one finding), and a new internet-exposed endpoint exposure class (FlagWarn).
  • Response-body ingestion. When nuclei includes the response body (-irr), geiger parses it through its existing .env/JSON/INI parsers, reassembling multi-field credentials — an AWS key+secret pair, a connection string — that the flat extracted-results array can't represent. Verified end-to-end: a .env exposing both AWS halves yields a fully SigV4-signable credential, not a lone access key.

Changes

  • internal/pipeline/batch.go: FromNuclei + nucleiBody
  • internal/pipeline/exposure.go: http(s):// internet-exposed endpoint class
  • cmd/geiger/main.go: --from-nuclei flag, readSources branch, header case
  • tests (batch_test.go, exposure_test.go) + README
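
For reviewers who want a feel for the exposure.go change, the URL exposure class could look roughly like this. This is a sketch under assumptions: `exposure`, `classifySource`, and the `Warn` field are hypothetical stand-ins (the real code uses FlagWarn, whose type is not shown in this PR description), and the "local source" fallback is invented for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// exposure is a hypothetical result type; Warn stands in for FlagWarn.
type exposure struct {
	Class string
	Warn  bool
}

// classifySource treats a source label that parses as an http(s) URL with a
// host as an internet-exposed endpoint; anything else falls through.
func classifySource(label string) exposure {
	u, err := url.Parse(label)
	if err == nil && u.Host != "" && (u.Scheme == "http" || u.Scheme == "https") {
		return exposure{Class: "internet-exposed endpoint", Warn: true}
	}
	return exposure{Class: "local source"} // fallback name is an assumption
}

func main() {
	for _, s := range []string{"https://host/.env", "config/prod.env"} {
		e := classifySource(s)
		fmt.Printf("%-20s class=%q warn=%v\n", s, e.Class, e.Warn)
	}
}
```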

Tests

go test ./... passes. New coverage: value typing, multi-URL correlation, response-body AWS-pair reassembly, non-body extracted values, and the URL exposure class.
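
The response-body AWS-pair case can be pictured with the sketch below. It is illustrative only and does not reproduce nucleiBody or geiger's .env parser: `bodyOf`, `parseDotEnv`, `awsPair`, and `awsPairFromEnv` are hypothetical names, and it assumes the raw response string starts with the HTTP status line and headers, separated from the body by a blank line.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// awsPair is a hypothetical container for the two halves of an AWS credential.
type awsPair struct {
	AccessKeyID     string
	SecretAccessKey string
}

// bodyOf strips the status line and headers from a raw HTTP response.
// Input that does not look like a raw response is returned unchanged.
func bodyOf(raw string) string {
	if !strings.HasPrefix(raw, "HTTP/") {
		return raw
	}
	if _, body, ok := strings.Cut(raw, "\r\n\r\n"); ok {
		return body
	}
	return raw
}

// parseDotEnv reads KEY=VALUE lines, ignoring blanks and # comments and
// stripping an optional "export " prefix and surrounding quotes.
func parseDotEnv(body string) map[string]string {
	kv := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		line = strings.TrimPrefix(line, "export ")
		key, val, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		kv[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(val), `"'`)
	}
	return kv
}

// awsPairFromEnv returns a pair only when both halves are present, which is
// the information a flat list of single extracted values cannot carry.
func awsPairFromEnv(kv map[string]string) (awsPair, bool) {
	id, secret := kv["AWS_ACCESS_KEY_ID"], kv["AWS_SECRET_ACCESS_KEY"]
	if id == "" || secret == "" {
		return awsPair{}, false
	}
	return awsPair{AccessKeyID: id, SecretAccessKey: secret}, true
}

func main() {
	raw := "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n" +
		"# app config\n" +
		"AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n" +
		"AWS_SECRET_ACCESS_KEY=\"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\"\n"
	if pair, ok := awsPairFromEnv(parseDotEnv(bodyOf(raw))); ok {
		fmt.Printf("reassembled pair for %s (secret length %d)\n", pair.AccessKeyID, len(pair.SecretAccessKey))
	}
}
```

Splitting off the headers first matters because a header such as Set-Cookie can contain `=` and would otherwise be read as a KEY=VALUE line.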

Known follow-on (not in this PR)

The AWS instance-metadata creds JSON shape (AccessKeyId/SecretAccessKey/Token) isn't reassembled by the generic path — only the live --metadata harvester maps it. Closing the SSRF-to-IMDS case fully would reuse imds.awsCredFromJSON as a recognizer.
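
To make the follow-on concrete, a recognizer for that JSON shape might look something like the sketch below. This is not imds.awsCredFromJSON, whose signature is not shown here; `imdsCreds` and `credsFromIMDSJSON` are hypothetical names, and only the AccessKeyId/SecretAccessKey/Token field names come from the description above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// imdsCreds mirrors the three fields named in the follow-on note.
type imdsCreds struct {
	AccessKeyID     string `json:"AccessKeyId"`
	SecretAccessKey string `json:"SecretAccessKey"`
	Token           string `json:"Token"`
}

// credsFromIMDSJSON is a hypothetical recognizer: it accepts a body only if it
// is JSON carrying both the key ID and the secret. Token is kept when present.
func credsFromIMDSJSON(body []byte) (imdsCreds, error) {
	var c imdsCreds
	if err := json.Unmarshal(body, &c); err != nil {
		return imdsCreds{}, err
	}
	if c.AccessKeyID == "" || c.SecretAccessKey == "" {
		return imdsCreds{}, errors.New("not an instance-metadata credential document")
	}
	return c, nil
}

func main() {
	body := []byte(`{"AccessKeyId":"ASIAEXAMPLEKEYID0000","SecretAccessKey":"example-secret","Token":"example-token"}`)
	c, err := credsFromIMDSJSON(body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("key=%s token present=%v\n", c.AccessKeyID, c.Token != "")
}
```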

Ingest nuclei JSONL: re-type each extracted value with geiger's recognizer,
use matched-at as provenance (title, cross-URL dedup/correlation, and a new
"internet-exposed endpoint" exposure class). Also parse the response body
when present so multi-field credentials (AWS key+secret, connection strings)
reassemble — the flat extracted-results array can only carry one value.

- internal/pipeline/batch.go: FromNuclei + nucleiBody
- internal/pipeline/exposure.go: http(s) URL exposure class
- cmd/geiger/main.go: --from-nuclei flag wiring
- tests + README
@noahpotti merged commit 012f099 into main on Jun 22, 2026
6 checks passed
@noahpotti deleted the feat/from-nuclei branch on June 22, 2026 at 14:55