bumblebee

Bumblebee is a read-only inventory collector for package, extension, and developer-tool metadata on macOS and Linux developer endpoints.

It answers a narrow supply-chain response question: when an advisory names a package, extension, or version, which developer machines show a match in their on-disk metadata right now?

SBOMs help answer what shipped, and EDR helps answer what ran or touched the network, but supply-chain response often needs a different view: messy local state across lockfiles, package-manager metadata, extension manifests, and supported developer-tool configs.

Bumblebee turns that scattered on-disk state into structured NDJSON component records. When given an exposure catalog, it also flags exact matches, giving responders a fast, read-only exposure check once they already know what they are looking for.

Scope

Single static binary, Go 1.25+, zero non-stdlib dependencies.
Three scan profiles (baseline, project, deep) for different populations and cadences.
Reads only the lockfiles, package-manager install metadata, extension manifests, and supported MCP JSON configs listed in docs/inventory-sources.md. No package manager execution (npm ls, pip show, go list, ...) and no source-file reads. MCP host configs can carry environment values and credentials in their env blocks; Bumblebee parses these configs for the server inventory it needs but does not emit those values in its records.

Coverage

| Family | Emitted `ecosystem` | Sources |
| --- | --- | --- |
| npm | `npm` | `package-lock.json`, `npm-shrinkwrap.json`, `node_modules/.package-lock.json`, `node_modules/<pkg>/package.json` |
| pnpm | `npm` | `pnpm-lock.yaml`, `.pnpm/.../package.json` |
| Yarn | `npm` | `yarn.lock` (Classic + Berry) |
| Bun | `npm` | `bun.lock`; `bun.lockb` presence as diagnostic |
| PyPI | `pypi` | `.dist-info/METADATA`, `INSTALLER`, `direct_url.json`, `.egg-info/PKG-INFO` |
| Go modules | `go` | `go.sum`, `go.mod` |
| RubyGems | `rubygems` | `Gemfile.lock`, installed `*.gemspec` |
| Composer | `packagist` | `composer.lock`, `vendor/composer/installed.json` |
| MCP | `mcp` | JSON host configs: `mcp.json`, `.mcp.json`, `claude_desktop_config.json`, `mcp_config.json`, `mcp_settings.json`, `cline_mcp_settings.json`, plus `~/.gemini/settings.json` (Gemini CLI / Code Assist). Non-JSON configs (Codex `config.toml`, Continue YAML) are not parsed in v0.1. |
| Editor extensions | `editor-extension` | VS Code, Cursor, Windsurf, VSCodium manifests |
| Browser extensions | `browser-extension` | Chromium-family (`manifest.json`) and Firefox (`extensions.json`) per profile |

Per-ecosystem detail: docs/inventory-sources.md.

Install

Requires Go 1.25+. Zero non-stdlib dependencies.

# Install the latest tagged release into $GOBIN.
go install github.com/perplexityai/bumblebee/cmd/bumblebee@latest

# Or pin a specific tag.
go install github.com/perplexityai/bumblebee/cmd/bumblebee@v0.1.1

To build from a checkout:

go build -o bumblebee ./cmd/bumblebee
go test ./...

Stamp an explicit version at build time:

go build -ldflags "-X main.Version=v0.1.1" -o bumblebee ./cmd/bumblebee

bumblebee version prints the version plus the VCS revision, build time, and Go runtime, so a record emitted in production can be traced back to a specific build. Version precedence, highest first: -ldflags override, module version recorded by go install, then the in-tree default tracked in VERSION.

Self-test

After installing, run a built-in end-to-end check against embedded fixtures:

bumblebee selftest
# selftest OK (2 findings in 1ms)

The fixtures live inside the binary, use deliberately fake package names (bumblebee-selftest-evil@0.0.0), and make no network calls. A non-zero exit means the local install can no longer detect what it should, which makes selftest a fast pre-deployment smoke test for fleet rollouts.

Profiles

Bumblebee is a one-shot scanner: each invocation performs a single scan and exits. Cadence is the runner's responsibility (cron, launchd, systemd, MDM, etc.). Each record carries profile and a per-root root_kind so receivers can keep populations separate.

| Profile | Scans | Use for |
| --- | --- | --- |
| `baseline` | Common global/user package roots, language toolchains, editor extensions, browser extensions, and MCP configs. | Recurring lightweight inventory via an external runner. |
| `project` | Configured development directories, such as `~/code`, `~/src`, or `~/work`. | Recurring inventory for known project workspaces. |
| `deep` | Explicit `--root` paths, including broad roots like `$HOME`. | On-demand incident or campaign checks, usually with `--ecosystem`, `--exposure-catalog`, and `--findings-only`. |

baseline and project refuse bare-home roots; only deep walks them.

Quick start

# Baseline global inventory.
bumblebee scan --profile baseline > inventory.ndjson

# Daily project sweep with explicit roots.
bumblebee scan --profile project \
  --root "$HOME/code" \
  --root "$HOME/Developer"

# Limit a run to selected emitted ecosystems.
bumblebee scan --profile baseline \
  --ecosystem npm,pypi \
  --ecosystem go

# On-demand exposure scan against a published advisory.
bumblebee scan --profile deep \
  --root "$HOME" \
  --exposure-catalog ./catalog.json \
  --max-duration 10m

Preview the resolved roots without scanning:

bumblebee roots --profile baseline
# prints "<root_kind>\t<path>" lines

--root is a filesystem path to scan. It is repeatable, required for deep, and optional for the other profiles.
--ecosystem is repeatable and accepts comma-separated values.
--exposure-catalog accepts a JSON file or a directory of *.json catalogs. A directory is merged non-recursively, and all of its files must share the same schema_version.
--findings-only requires --exposure-catalog and suppresses package records while keeping findings.

bumblebee scan --help lists every flag.

Output

Records are NDJSON, one per line. Diagnostics go to stderr as NDJSON. Each run ends with a scan_summary record; receivers use it to decide whether to promote a run to current state. See docs/transport.md for HTTPS/file output and docs/state-model.md for the receiver-side current-state model.

Package record:

{
  "record_type": "package",
  "record_id": "package:...",
  "schema_version": "0.1.0",
  "scanner_name": "bumblebee",
  "scanner_version": "v0.1.1",
  "run_id": "9b1f0c2e4d5a6b7c8d9e0f1a2b3c4d5e",
  "scan_time": "2026-05-15T18:22:01.482Z",
  "endpoint": {
    "hostname": "alex-mbp",
    "os": "darwin",
    "arch": "arm64",
    "username": "alex",
    "uid": "501",
    "device_id": "MDM-7F4A2B"
  },
  "profile": "project",
  "ecosystem": "npm",
  "package_name": "@tanstack/query-core",
  "normalized_name": "@tanstack/query-core",
  "version": "5.59.20",
  "project_path": "/Users/alex/code/web-app",
  "root_kind": "project_root",
  "package_manager": "pnpm",
  "source_type": "pnpm-lockfile",
  "source_file": "/Users/alex/code/web-app/pnpm-lock.yaml",
  "has_lifecycle_scripts": false,
  "confidence": "high"
}

confidence:

high — exact identity and version came from canonical metadata.
medium — identity is reliable, but version or source is partial.
low — config/path/spec reference only; not proof of an installed exact version.
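
A receiver may want to act only on records at or above a chosen confidence. The Go sketch below ranks the three documented values and applies a threshold. The ranking helper meetsConfidence is hypothetical, and treating unknown values as failing the check is an assumption of this sketch, not documented Bumblebee behavior.

```go
package main

import "fmt"

// confidenceRank orders the three documented confidence values.
var confidenceRank = map[string]int{"low": 1, "medium": 2, "high": 3}

// meetsConfidence reports whether got is at least as strong as floor.
// Unknown values on either side fail the check (an assumption of this sketch).
func meetsConfidence(got, floor string) bool {
	g, okGot := confidenceRank[got]
	f, okFloor := confidenceRank[floor]
	if !okGot || !okFloor {
		return false
	}
	return g >= f
}

func main() {
	for _, c := range []string{"high", "medium", "low"} {
		fmt.Printf("%s meets medium: %v\n", c, meetsConfidence(c, "medium"))
	}
}
```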

Finding record (exposure-catalog match):

{
  "record_type": "finding",
  "record_id": "finding:...",
  "schema_version": "0.1.0",
  "scanner_name": "bumblebee",
  "scanner_version": "v0.1.1",
  "run_id": "3a8c7d1e9f0b2a4c6d8e0f1a2b3c4d5e",
  "scan_time": "2026-05-15T18:22:01.482Z",
  "endpoint": {
    "hostname": "alex-mbp",
    "os": "darwin",
    "arch": "arm64",
    "username": "alex",
    "uid": "501",
    "device_id": "MDM-7F4A2B"
  },
  "profile": "deep",
  "finding_type": "package_exposure",
  "severity": "critical",
  "catalog_id": "advisory-2026-0042",
  "catalog_name": "example-pkg 1.2.3 (compromised release)",
  "ecosystem": "npm",
  "package_name": "example-pkg",
  "normalized_name": "example-pkg",
  "version": "1.2.3",
  "root_kind": "deep_home_root",
  "project_path": "/Users/alex/code/web-app",
  "source_type": "pnpm-lockfile",
  "source_file": "/Users/alex/code/web-app/pnpm-lock.yaml",
  "confidence": "high",
  "evidence": "exact name+version match (version=1.2.3)"
}

record_id is a content-addressed hash of a canonical identity tuple per record type, stable across runs. Per-record-type field lists and dedupe guidance: docs/state-model.md.
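
To make the idea of a content-addressed ID concrete, here is a minimal Go sketch that hashes a record type plus an ordered identity tuple. Everything specific in it is an assumption: the identityID name, the choice of SHA-256, the length-prefixed encoding, and the example tuple are illustrative only. The real tuple per record type is defined in docs/state-model.md.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// identityID hashes a record type and an ordered identity tuple.
// Hypothetical: Bumblebee's actual tuple and hash function may differ.
func identityID(recordType string, tuple ...string) string {
	h := sha256.New()
	for _, part := range append([]string{recordType}, tuple...) {
		// Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
		fmt.Fprintf(h, "%d:%s;", len(part), part)
	}
	return recordType + ":" + hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Illustrative tuple only; it omits per-run values such as run_id and scan_time.
	id := identityID("package", "alex-mbp", "npm", "example-pkg", "1.2.3")
	fmt.Println(id)
}
```

Because no per-run value enters the hash, the same inputs produce the same ID on every run, which is the property that lets a receiver deduplicate records across scans.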

Exposure Catalog Format

Minimal JSON, exact (ecosystem, name, version) matching only:

{
  "schema_version": "0.1.0",
  "entries": [
    {
      "id": "advisory-2026-0042",
      "name": "example-pkg 1.2.3 (compromised release)",
      "ecosystem": "npm",
      "package": "example-pkg",
      "versions": ["1.2.3"],
      "severity": "critical"
    }
  ]
}

The catalog must be a JSON object with schema_version and entries keys. Bare top-level arrays are rejected. Unsupported future schema_version values are rejected. Multiple catalog files can be loaded together by pointing --exposure-catalog at a directory; see the flag description above.
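
The shape rules and exact matching described above can be sketched in Go as follows. This is not Bumblebee's loader: parseCatalog, index, and key are hypothetical names, and accepting only schema_version "0.1.0" is an assumption based on the single value shown in this README. Whether Bumblebee matches on the raw or the normalized package name is not covered here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type entry struct {
	ID        string   `json:"id"`
	Name      string   `json:"name"`
	Ecosystem string   `json:"ecosystem"`
	Package   string   `json:"package"`
	Versions  []string `json:"versions"`
	Severity  string   `json:"severity"`
}

type catalog struct {
	SchemaVersion string  `json:"schema_version"`
	Entries       []entry `json:"entries"`
}

// parseCatalog rejects bare arrays, missing keys, and unknown schema versions.
func parseCatalog(data []byte) (*catalog, error) {
	// Decoding into a map fails for a top-level array.
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, fmt.Errorf("catalog must be a JSON object: %w", err)
	}
	for _, k := range []string{"schema_version", "entries"} {
		if _, ok := raw[k]; !ok {
			return nil, fmt.Errorf("catalog is missing %q", k)
		}
	}
	var c catalog
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	if c.SchemaVersion != "0.1.0" { // assumption: the only supported version
		return nil, fmt.Errorf("unsupported schema_version %q", c.SchemaVersion)
	}
	return &c, nil
}

// key is the exact-match tuple (ecosystem, name, version).
type key struct{ ecosystem, name, version string }

// index expands each entry's versions into one lookup key per version.
func index(c *catalog) map[key]entry {
	idx := map[key]entry{}
	for _, e := range c.Entries {
		for _, v := range e.Versions {
			idx[key{e.Ecosystem, e.Package, v}] = e
		}
	}
	return idx
}

func main() {
	data := []byte(`{"schema_version":"0.1.0","entries":[{"id":"advisory-2026-0042",
	 "ecosystem":"npm","package":"example-pkg","versions":["1.2.3"],"severity":"critical"}]}`)
	c, err := parseCatalog(data)
	if err != nil {
		fmt.Println("catalog error:", err)
		return
	}
	hit, ok := index(c)[key{"npm", "example-pkg", "1.2.3"}]
	fmt.Println(ok, hit.ID, hit.Severity)
}
```

A map keyed on the full tuple keeps matching exact by construction: a different version of the same package simply produces no lookup hit.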

Sample exposure catalogs

The threat_intel/ directory holds maintained exposure catalogs built from public threat-intelligence reporting on recent supply-chain campaigns, assembled with Perplexity Computer and updated via PRs as new campaigns are reported. See threat_intel/README.md for the current catalog list and review guidance.

License

Apache License 2.0. See LICENSE.
