Automated open-source intelligence tool that investigates a target domain in under 60 seconds — querying 5 public data sources and modelling all discovered entity relationships in a Neo4j graph database.
Give it a domain. It automatically:
- Pulls WHOIS registration data — registrar, org, country, creation date, contact emails
- Enumerates DNS records — A, MX, NS, TXT (A records feed directly into Shodan)
- Discovers subdomains via SSL certificate transparency logs on crt.sh — finds infrastructure that was never publicly advertised
- Fingerprints each IP on Shodan — open ports, OS, organisation, detected CVEs
- Searches GitHub for the organisation's public repositories
- Writes all entity relationships into a Neo4j graph database using Cypher
- Auto-generates a structured 6-section intelligence report in Markdown

    Organization ──[OWNS]──► Domain ──[RESOLVES_TO]──► IP
                               │
                           [HAS_CERT]
                               │
                               ▼
                          Certificate

Node types: Domain · IP · Organization · Certificate
Relationships: OWNS · RESOLVES_TO · HAS_CERT
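
The `OWNS` and `RESOLVES_TO` relationships above map naturally onto Cypher `MERGE` statements. The following is a minimal sketch of that idea, not the code in `neo4j_handler.py`: the function name `build_graph_statements` and the `name` and `address` property keys are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, Dict[str, str]]


def build_graph_statements(org: str, domain: str, ips: List[str]) -> List[Statement]:
    """Return (cypher, params) pairs for one investigated domain.

    The `name` and `address` property keys are illustrative assumptions.
    """
    statements: List[Statement] = [
        (
            "MERGE (o:Organization {name: $org}) "
            "MERGE (d:Domain {name: $domain}) "
            "MERGE (o)-[:OWNS]->(d)",
            {"org": org, "domain": domain},
        )
    ]
    # One statement per resolved address links the domain to its IP node.
    for ip in ips:
        statements.append(
            (
                "MERGE (d:Domain {name: $domain}) "
                "MERGE (i:IP {address: $ip}) "
                "MERGE (d)-[:RESOLVES_TO]->(i)",
                {"domain": domain, "ip": ip},
            )
        )
    return statements


if __name__ == "__main__":
    for cypher, params in build_graph_statements(
        "Python Software Foundation", "python.org", ["151.101.0.223"]
    ):
        print(cypher, params)
```

Each pair could then be passed to a Neo4j driver session as `session.run(cypher, params)`. Using `MERGE` rather than `CREATE` means rerunning an investigation on the same domain should not duplicate nodes or relationships.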
| Source | What It Returns | Method |
|---|---|---|
| WHOIS | Registrar, org, country, creation date, emails | python-whois |
| DNS | A, MX, NS, TXT records | dnspython |
| crt.sh | SSL subdomains (certificate transparency) | Public API |
| Shodan | Open ports, OS, org, CVEs per IP | Shodan free API |
| GitHub | Public repos, languages, star counts | GitHub REST API |
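
As an illustration of the crt.sh step, the sketch below turns certificate transparency entries into a unique subdomain list. It assumes the JSON output of crt.sh (for example `https://crt.sh/?q=%25.python.org&output=json`), where each entry carries a `name_value` field holding one or more hostnames separated by newlines. The helper name `extract_subdomains` is hypothetical and is not taken from `crtsh_lookup.py`.

```python
from typing import Dict, Iterable, List


def extract_subdomains(entries: Iterable[Dict[str, str]], domain: str) -> List[str]:
    """Collect unique hostnames strictly below `domain` from crt.sh entries."""
    domain = domain.lower().rstrip(".")
    found = set()
    for entry in entries:
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            # Keep real subdomains only; the apex domain itself is skipped.
            if name.endswith("." + domain):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    sample = [
        {"name_value": "www.python.org\n*.python.org"},
        {"name_value": "docs.python.org"},
        {"name_value": "WWW.python.org"},
    ]
    print(extract_subdomains(sample, "python.org"))
```

The entries could come from something like `requests.get(url, timeout=30).json()`. Parsing is kept separate from fetching so it can be tested offline.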

    ==================================================
    OSINT ENTITY MAPPER
    Target: python.org
    ==================================================
    [*] Running WHOIS lookup...
        ↳ Org: Python Software Foundation
    [*] Running DNS lookup...
        ↳ A records: ['151.101.0.223', '151.101.64.223']
    [*] Running Shodan on 151.101.0.223...
        ↳ Ports: [80, 443]
    [*] Pulling SSL certificate data from crt.sh...
        ↳ Subdomains found: 20
    [*] Searching GitHub...
        ↳ Repos found: 10
    [*] Generating intelligence report...
    [+] Report saved: report_python.org_20260525_090000.md
    ==================================================
    [+] INVESTIGATION COMPLETE
    [+] Graph: console.neo4j.io
    ==================================================

Before you start you need four things:
| Requirement | Where to Get It | Free? |
|---|---|---|
| Python 3.8+ | python.org | ✅ |
| Neo4j AuraDB instance | aura.neo4j.io | ✅ Free tier |
| Shodan API key | account.shodan.io | ✅ Free tier |
| GitHub Personal Access Token | github.com/settings/tokens | ✅ |

    git clone https://github.com/Hitansh1601/osint-entity-mapper.git
    cd osint-entity-mapper
    pip install -r requirements.txt

Verify everything installed correctly:

    python -c "import whois, dns, neo4j, shodan, requests, pdfplumber, pandas; print('All libraries OK')"

Neo4j AuraDB:
- Go to aura.neo4j.io and sign up for free
- Click Create Instance → select Free tier → name it `osint-mapper`
- Save the credentials immediately, because the password is shown only once:
  - Connection URI (starts with `neo4j+s://`)
  - Username (`neo4j`)
  - Password
Shodan:
- Register at shodan.io
- Go to account.shodan.io
- Copy the API key shown on your account page
GitHub Token:
- Go to github.com/settings/tokens
- Click Generate new token (classic)
- Check only the `public_repo` scope
- Copy the token, which is shown only once
Create a file called config.py in the project root (this file is gitignored — never committed):

    # config.py - DO NOT COMMIT THIS FILE
    SHODAN_API_KEY = "your_shodan_api_key_here"
    GITHUB_TOKEN = "your_github_token_here"
    NEO4J_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
    NEO4J_USER = "neo4j"
    NEO4J_PASSWORD = "your_auradb_password_here"

Replace each placeholder with your real credentials.
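
A forgotten placeholder tends to surface later as a confusing authentication error, so it can help to check for template values up front. The sketch below is one hedged way to do that. `unset_settings` and `PLACEHOLDER_MARKERS` are hypothetical names, and the markers simply mirror the template strings shown above.

```python
from typing import Dict, List

# Substrings that only appear in the unedited template values (assumption).
PLACEHOLDER_MARKERS = ("your_", "xxxxxxxx")


def unset_settings(settings: Dict[str, str]) -> List[str]:
    """Return names of settings that are empty or still hold a template value."""
    return sorted(
        name
        for name, value in settings.items()
        if not value or any(marker in value for marker in PLACEHOLDER_MARKERS)
    )


if __name__ == "__main__":
    template = {
        "SHODAN_API_KEY": "your_shodan_api_key_here",
        "NEO4J_URI": "neo4j+s://xxxxxxxx.databases.neo4j.io",
        "NEO4J_USER": "neo4j",
    }
    print(unset_settings(template))
```

In practice the dictionary would be built from the values imported from `config.py`, and a non-empty result would be a reason to stop before any lookups run.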
Verify Neo4j is reachable before running the full tool:

    python -c "
    from neo4j import GraphDatabase
    from config import NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD
    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
    driver.verify_connectivity()
    print('Neo4j connection: OK')
    driver.close()
    "

Should print: `Neo4j connection: OK`

    python main.py <domain>

For example:

    python main.py python.org
    python main.py github.com
    python main.py tesla.com

After running an investigation:
- Go to console.neo4j.io
- Connect to your AuraDB instance
- Run this Cypher query:

      MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 100

- Switch to Graph view to see the visual entity map

    osint-entity-mapper/
    ├── main.py               ← Orchestrates the full investigation pipeline
    ├── neo4j_handler.py      ← All Neo4j graph database operations
    ├── report_generator.py   ← Builds the structured Markdown intelligence report
    ├── config.py             ← API credentials (gitignored, never committed)
    ├── requirements.txt      ← Python dependencies
    ├── .gitignore            ← Keeps credentials out of version control
    ├── README.md
    └── modules/
        ├── __init__.py       ← Makes modules/ a Python package
        ├── whois_lookup.py   ← WHOIS registration data
        ├── dns_lookup.py     ← DNS record enumeration
        ├── crtsh_lookup.py   ← SSL certificate transparency (subdomains)
        ├── shodan_lookup.py  ← Infrastructure fingerprinting
        └── github_lookup.py  ← GitHub organisation repositories

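
To show how the pieces in this layout could fit together, here is a rough sketch of an orchestration function. It is not the actual `main.py`: the `investigate` name, the dictionary of lookup callables, and the shape of each return value are all assumptions. The one behaviour taken from this README is that A records from the DNS step are what get sent to Shodan. What the real tool passes to the GitHub lookup is not stated here, so the sketch passes the domain.

```python
from typing import Any, Callable, Dict, List

Lookup = Callable[[str], Any]


def investigate(domain: str, lookups: Dict[str, Lookup]) -> Dict[str, Any]:
    """Run each lookup in order; A records from DNS are passed on to Shodan."""
    results: Dict[str, Any] = {
        "whois": lookups["whois"](domain),
        "dns": lookups["dns"](domain),
    }
    a_records: List[str] = results["dns"].get("A", [])
    # One Shodan query per resolved IP address.
    results["shodan"] = {ip: lookups["shodan"](ip) for ip in a_records}
    results["crtsh"] = lookups["crtsh"](domain)
    results["github"] = lookups["github"](domain)
    return results


if __name__ == "__main__":
    # Stub lookups stand in for the real modules so the flow runs offline.
    stubs: Dict[str, Lookup] = {
        "whois": lambda d: {"org": "Example Org"},
        "dns": lambda d: {"A": ["192.0.2.1", "192.0.2.2"]},
        "shodan": lambda ip: {"ports": [80, 443]},
        "crtsh": lambda d: ["www." + d],
        "github": lambda d: [],
    }
    print(investigate("example.com", stubs))
```

Passing the lookups in as callables keeps the flow testable without network access or API keys.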
Each investigation generates a report_<domain>_<timestamp>.md file with:
- WHOIS Registration Data — registrar, org, country, creation date, emails
- DNS Records — A, MX, NS records
- Infrastructure (Shodan) — per-IP: ports, OS, org, country, CVEs
- SSL Certificate Intelligence — subdomains discovered via crt.sh
- GitHub Presence — public repos with language and star counts
- Entity Relationship Graph — description of Neo4j relationships written
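
The report name in the sample output (`report_python.org_20260525_090000.md`) suggests a `YYYYMMDD_HHMMSS` timestamp. A minimal sketch of building such a name follows. `report_filename` is a hypothetical helper, not necessarily how `report_generator.py` does it.

```python
from datetime import datetime
from typing import Optional


def report_filename(domain: str, when: Optional[datetime] = None) -> str:
    """Build report_<domain>_<timestamp>.md with a YYYYMMDD_HHMMSS timestamp."""
    when = when or datetime.now()
    return f"report_{domain}_{when:%Y%m%d_%H%M%S}.md"


if __name__ == "__main__":
    print(report_filename("python.org", datetime(2026, 5, 25, 9, 0, 0)))
```

Because the timestamp sorts lexicographically, repeated investigations of the same domain list in chronological order.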
| Error | Cause | Fix |
|---|---|---|
| `ModuleNotFoundError: config` | Wrong working directory | Run from project root: `cd osint-entity-mapper` |
| `ModuleNotFoundError: modules` | Missing `__init__.py` | `touch modules/__init__.py` |
| `AuthError` | Wrong Neo4j password | Re-check password in `config.py` |
| `ServiceUnavailable` | AuraDB instance paused | Go to aura.neo4j.io → click Resume |
| Shodan returns error | IP not indexed or quota hit | Normal — tool handles gracefully |
| crt.sh returns 0 results | Temporary downtime | Wait 30 seconds and rerun |
| Graph empty after run | URI mismatch | Confirm URI in config.py matches your AuraDB instance |
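
The "wait 30 seconds and rerun" advice for crt.sh can also be automated. The following is a hedged sketch of a small retry wrapper. `fetch_with_retry` and its parameters are assumptions and do not describe how the tool currently behaves.

```python
import time
from typing import Callable, List


def fetch_with_retry(
    fetch: Callable[[], List[str]],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call `fetch` until it returns a non-empty list or attempts run out."""
    results: List[str] = []
    for attempt in range(1, attempts + 1):
        try:
            results = fetch()
        except Exception as exc:  # broad on purpose: this is only a sketch
            print(f"attempt {attempt} failed: {exc}")
            results = []
        if results:
            return results
        if attempt < attempts:
            sleep(delay_seconds)  # wait before trying again
    return results


if __name__ == "__main__":
    replies = iter([[], ["www.example.com"]])
    # delay_seconds=0 keeps the demo instant.
    print(fetch_with_retry(lambda: next(replies), delay_seconds=0))
```

The `sleep` parameter exists only so the delay can be replaced in tests. An empty list after the final attempt is returned as-is, so the caller can still report zero results.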
This tool queries only publicly available data sources (the Shodan and GitHub credentials authenticate you to those services, not to the target):
- WHOIS, DNS, and crt.sh are public by design
- Shodan indexes data from its own independent internet scans
- The GitHub REST API is public and rate-limited
No target systems are contacted directly. All reconnaissance is entirely passive.
Use responsibly and only on domains you are authorised to investigate.
Hitansh Waghela
Final-year Computer Engineering student · OSINT Engineer · Blue Team
GitHub · LinkedIn · TryHackMe