Hitansh1601/osint-entity-mapper

OSINT Entity Mapper

Automated open-source intelligence tool that investigates a target domain in under 60 seconds — querying 5 public data sources and modelling all discovered entity relationships in a Neo4j graph database.


What It Does

Give it a domain. It automatically:

  • Pulls WHOIS registration data — registrar, org, country, creation date, contact emails
  • Enumerates DNS records — A, MX, NS, TXT (A records feed directly into Shodan)
  • Discovers subdomains via SSL certificate transparency logs on crt.sh — finds infrastructure that was never publicly advertised
  • Fingerprints each IP on Shodan — open ports, OS, organisation, detected CVEs
  • Searches GitHub for the organisation's public repositories
  • Writes all entity relationships into a Neo4j graph database using Cypher
  • Auto-generates a structured 6-section intelligence report in Markdown
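The hand-off from DNS to Shodan described above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: `investigate`, `dns_lookup`, and `shodan_lookup` are hypothetical names, not the functions in `main.py`, and the lookups are injected as callables so the sketch runs without network access.

```python
from typing import Callable, Dict, List


def investigate(
    domain: str,
    dns_lookup: Callable[[str], Dict[str, List[str]]],
    shodan_lookup: Callable[[str], dict],
) -> dict:
    """Run the DNS step, then fingerprint every A record it returned."""
    records = dns_lookup(domain)
    hosts = {}
    for ip in records.get("A", []):
        try:
            hosts[ip] = shodan_lookup(ip)
        except Exception as exc:
            # One failed IP (not indexed, quota hit) should not abort the run.
            hosts[ip] = {"error": str(exc)}
    return {"domain": domain, "dns": records, "hosts": hosts}
```

Passing stand-in callables that return fixed data makes it easy to exercise the flow offline before wiring in the real lookups.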

Graph Model

Organization  ──[OWNS]──►  Domain  ──[RESOLVES_TO]──►  IP
                              │
                         [HAS_CERT]
                              │
                              ▼
                         Certificate

Node types: Domain · IP · Organization · Certificate
Relationships: OWNS · RESOLVES_TO · HAS_CERT
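As a rough illustration of how this model could be written with Cypher, the sketch below builds parameterised `MERGE` statements for the three relationships. The property names (`name`, `address`) and the function `build_statements` are assumptions made for the example; the actual schema in `neo4j_handler.py` may differ.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, Dict[str, str]]


def build_statements(
    org: str, domain: str, ips: List[str], cert_names: List[str]
) -> List[Statement]:
    """Return (query, parameters) pairs; MERGE keeps reruns from duplicating nodes."""
    statements: List[Statement] = [(
        "MERGE (o:Organization {name: $org}) "
        "MERGE (d:Domain {name: $domain}) "
        "MERGE (o)-[:OWNS]->(d)",
        {"org": org, "domain": domain},
    )]
    for ip in ips:
        statements.append((
            "MERGE (d:Domain {name: $domain}) "
            "MERGE (i:IP {address: $ip}) "   # 'address' is an assumed property name
            "MERGE (d)-[:RESOLVES_TO]->(i)",
            {"domain": domain, "ip": ip},
        ))
    for name in cert_names:
        statements.append((
            "MERGE (d:Domain {name: $domain}) "
            "MERGE (c:Certificate {name: $cert}) "
            "MERGE (d)-[:HAS_CERT]->(c)",
            {"domain": domain, "cert": name},
        ))
    return statements
```

Each pair can be passed to a driver session as `session.run(query, params)`. Using parameters rather than string formatting keeps discovered values out of the query text.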


Data Sources

| Source | What It Returns | Method |
|--------|-----------------|--------|
| WHOIS | Registrar, org, country, creation date, emails | python-whois |
| DNS | A, MX, NS, TXT records | dnspython |
| crt.sh | SSL subdomains (certificate transparency) | Public API |
| Shodan | Open ports, OS, org, CVEs per IP | Shodan free API |
| GitHub | Public repos, languages, star counts | GitHub REST API |
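To make the crt.sh row concrete, here is a minimal sketch of turning its JSON output into a subdomain list. It assumes the `output=json` response is a list of objects with a `name_value` field holding newline-separated names; `extract_subdomains` and `fetch_crtsh` are hypothetical helpers, not the code in `modules/crtsh_lookup.py`.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, List


def extract_subdomains(entries: Iterable[dict], domain: str) -> List[str]:
    """Collect unique hostnames under `domain` from crt.sh JSON entries."""
    suffix = "." + domain.lower()
    found = set()
    for entry in entries:
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard prefix
            if name.endswith(suffix):
                found.add(name)
    return sorted(found)


def fetch_crtsh(domain: str, timeout: float = 30.0) -> List[dict]:
    """Query crt.sh for certificates matching %.domain (network call)."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    with urllib.request.urlopen(f"https://crt.sh/?{query}", timeout=timeout) as resp:
        return json.load(resp)
```

Keeping the parsing separate from the network call means the parsing can be checked against saved responses when crt.sh is slow or unavailable.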

Sample Output

==================================================
  OSINT ENTITY MAPPER
  Target: python.org
==================================================

[*] Running WHOIS lookup...
    ↳ Org: Python Software Foundation
[*] Running DNS lookup...
    ↳ A records: ['151.101.0.223', '151.101.64.223']
[*] Running Shodan on 151.101.0.223...
    ↳ Ports: [80, 443]
[*] Pulling SSL certificate data from crt.sh...
    ↳ Subdomains found: 20
[*] Searching GitHub...
    ↳ Repos found: 10

[*] Generating intelligence report...
[+] Report saved: report_python.org_20260525_090000.md

==================================================
  [+] INVESTIGATION COMPLETE
  [+] Graph: console.neo4j.io
==================================================

Prerequisites

Before you start you need four things:

| Requirement | Where to Get It | Free? |
|-------------|-----------------|-------|
| Python 3.8+ | python.org | ✅ Free |
| Neo4j AuraDB instance | aura.neo4j.io | ✅ Free tier |
| Shodan API key | account.shodan.io | ✅ Free tier |
| GitHub Personal Access Token | github.com/settings/tokens | ✅ Free |

Installation & Setup

Step 1 — Clone the Repository

git clone https://github.com/Hitansh1601/osint-entity-mapper.git
cd osint-entity-mapper

Step 2 — Install Dependencies

pip install -r requirements.txt

Verify everything installed correctly:

python -c "import whois, dns, neo4j, shodan, requests, pdfplumber, pandas; print('All libraries OK')"

Step 3 — Create Your Neo4j AuraDB Instance

  1. Go to aura.neo4j.io and sign up for free
  2. Click Create Instance → select Free tier → name it osint-mapper
  3. Save the credentials immediately — the password is shown only once:
    • Connection URI (starts with neo4j+s://)
    • Username (neo4j)
    • Password

Step 4 — Get Your API Keys

Shodan:

  1. Register at shodan.io
  2. Go to account.shodan.io
  3. Copy the API key shown on your account page

GitHub Token:

  1. Go to github.com/settings/tokens
  2. Click Generate new token (classic)
  3. Check only the public_repo scope
  4. Copy the token — shown only once

Step 5 — Create config.py

Create a file called config.py in the project root (this file is gitignored — never committed):

# config.py — DO NOT COMMIT THIS FILE
SHODAN_API_KEY = "your_shodan_api_key_here"
GITHUB_TOKEN   = "your_github_token_here"
NEO4J_URI      = "neo4j+s://xxxxxxxx.databases.neo4j.io"
NEO4J_USER     = "neo4j"
NEO4J_PASSWORD = "your_auradb_password_here"

Replace each placeholder with your real credentials.
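A quick way to catch a forgotten placeholder before a run is a check like the one below. This is an optional sketch, not part of the repository: `unset_keys` and `PLACEHOLDER_MARKERS` are hypothetical names, and the markers are simply the text fragments used in the template above.

```python
from typing import Dict, List

# Fragments that appear in the template values shown above.
PLACEHOLDER_MARKERS = ("your_", "xxxxxxxx")


def unset_keys(settings: Dict[str, str]) -> List[str]:
    """Return the names of settings that are empty or still hold template text."""
    return [
        key
        for key, value in settings.items()
        if not value or any(marker in value for marker in PLACEHOLDER_MARKERS)
    ]
```

For example, build the dictionary with `{name: getattr(config, name) for name in ("SHODAN_API_KEY", "GITHUB_TOKEN", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")}` after `import config`, and stop early if `unset_keys` returns anything.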

Step 6 — Test the Connection

Verify Neo4j is reachable before running the full tool:

python -c "
from neo4j import GraphDatabase
from config import NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
driver.verify_connectivity()
print('Neo4j connection: OK')
driver.close()
"

Expected output: Neo4j connection: OK


Usage

Basic Investigation

python main.py <domain>

Examples

python main.py python.org
python main.py github.com
python main.py tesla.com

View the Graph

After running an investigation:

  1. Go to console.neo4j.io
  2. Connect to your AuraDB instance
  3. Run this Cypher query:
     MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 100
  4. Switch to Graph view to see the visual entity map
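The same relationships can also be listed from Python instead of the console. The sketch below is one possible approach, assuming a driver object that behaves like the official `neo4j` driver (`driver.session()` as a context manager, `session.run(query, **params)`); `list_edges` is a hypothetical helper and returns labels and relationship types rather than full nodes.

```python
from typing import Any, Dict, List

EDGE_QUERY = (
    "MATCH (n)-[r]->(m) "
    "RETURN labels(n) AS source, type(r) AS rel, labels(m) AS target "
    "LIMIT $limit"
)


def list_edges(driver: Any, limit: int = 100) -> List[Dict[str, Any]]:
    """Run the edge query through a neo4j-style driver and return plain dicts."""
    with driver.session() as session:
        result = session.run(EDGE_QUERY, limit=limit)
        # Copy the fields out while the session is still open.
        return [
            {"source": rec["source"], "rel": rec["rel"], "target": rec["target"]}
            for rec in result
        ]
```

With the credentials from config.py, a driver created via `GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))` can be passed straight in; remember to call `driver.close()` afterwards.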

Project Structure

osint-entity-mapper/
├── main.py               ← Orchestrates the full investigation pipeline
├── neo4j_handler.py      ← All Neo4j graph database operations
├── report_generator.py   ← Builds the structured Markdown intelligence report
├── config.py             ← API credentials (gitignored — never committed)
├── requirements.txt      ← Python dependencies
├── .gitignore            ← Keeps credentials out of version control
├── README.md
└── modules/
    ├── __init__.py       ← Makes modules/ a Python package
    ├── whois_lookup.py   ← WHOIS registration data
    ├── dns_lookup.py     ← DNS record enumeration
    ├── crtsh_lookup.py   ← SSL certificate transparency (subdomains)
    ├── shodan_lookup.py  ← Infrastructure fingerprinting
    └── github_lookup.py  ← GitHub organisation repositories

Intelligence Report Structure

Each investigation generates a report_<domain>_<timestamp>.md file with:

  1. WHOIS Registration Data — registrar, org, country, creation date, emails
  2. DNS Records — A, MX, NS records
  3. Infrastructure (Shodan) — per-IP: ports, OS, org, country, CVEs
  4. SSL Certificate Intelligence — subdomains discovered via crt.sh
  5. GitHub Presence — public repos with language and star counts
  6. Entity Relationship Graph — description of Neo4j relationships written
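The naming scheme and section order above can be sketched as follows. The timestamp format is inferred from the sample filename `report_python.org_20260525_090000.md`; the heading layout and the names `report_filename` and `report_skeleton` are assumptions for illustration, not the output of `report_generator.py`.

```python
from datetime import datetime
from typing import Optional

SECTIONS = [
    "WHOIS Registration Data",
    "DNS Records",
    "Infrastructure (Shodan)",
    "SSL Certificate Intelligence",
    "GitHub Presence",
    "Entity Relationship Graph",
]


def report_filename(domain: str, when: Optional[datetime] = None) -> str:
    """Build report_<domain>_<timestamp>.md using a YYYYMMDD_HHMMSS timestamp."""
    when = when or datetime.now()
    return f"report_{domain}_{when:%Y%m%d_%H%M%S}.md"


def report_skeleton(domain: str) -> str:
    """Return an empty Markdown outline with the six numbered sections."""
    lines = [f"# OSINT Report: {domain}", ""]  # title wording is an assumption
    for number, title in enumerate(SECTIONS, start=1):
        lines += [f"## {number}. {title}", ""]
    return "\n".join(lines)
```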

Troubleshooting

| Error | Cause | Fix |
|-------|-------|-----|
| ModuleNotFoundError: config | Wrong working directory | Run from project root: cd osint-entity-mapper |
| ModuleNotFoundError: modules | Missing __init__.py | touch modules/__init__.py |
| AuthError | Wrong Neo4j password | Re-check password in config.py |
| ServiceUnavailable | AuraDB instance paused | Go to aura.neo4j.io → click Resume |
| Shodan returns error | IP not indexed or quota hit | Normal — tool handles gracefully |
| crt.sh returns 0 results | Temporary downtime | Wait 30 seconds and rerun |
| Graph empty after run | URI mismatch | Confirm URI in config.py matches your AuraDB instance |
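The "wait 30 seconds and rerun" advice for crt.sh could be automated with a small retry wrapper. This is a sketch rather than existing behaviour of the tool: `retry_if_empty` is a hypothetical helper, and the `sleep` parameter is injected only so the wait can be skipped in tests.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_if_empty(
    fetch: Callable[[], List[T]],
    attempts: int = 3,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call fetch() until it returns a non-empty list or attempts run out."""
    results: List[T] = []
    for attempt in range(1, attempts + 1):
        try:
            results = fetch()
        except OSError:
            # urllib and socket errors derive from OSError; treat them as empty.
            results = []
        if results:
            break
        if attempt < attempts:
            sleep(delay)
    return results
```

A call such as `retry_if_empty(lambda: my_crtsh_lookup("python.org"))`, where `my_crtsh_lookup` stands for whatever function performs the query, would then retry up to three times with a 30 second pause between attempts.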

Ethical Use

This tool queries only publicly available data sources (Shodan and GitHub require your own API key or token, but the data they return is public):

  • WHOIS, DNS, and crt.sh are public by design
  • Shodan indexes data from its own independent internet scans
  • The GitHub REST API exposes public repository data and is rate-limited

No target systems are contacted directly. All reconnaissance is entirely passive.
Use responsibly and only on domains you are authorised to investigate.

Author

Hitansh Waghela
Final-year Computer Engineering student · OSINT Engineer · Blue Team
GitHub · LinkedIn · TryHackMe
