RPDS – Real-Time Phishing & Web Threat Detection System

KAAD Cyber Intelligence Core · Mini Brain #1

Architect: KAAD · Model Version: KAAD-1.1.0

🏗 Project Structure

rpds/
├── backend/                    ← FastAPI Python backend
│   ├── config.py               ← Central configuration
│   ├── logging_config.py       ← Structured JSON logging (structlog)
│   ├── connectivity.py         ← Cached online/offline checker
│   ├── circuit_breaker.py      ← Async circuit breaker
│   ├── url_validator.py        ← URL validation & normalisation
│   ├── feature_extractor.py    ← Structural feature extractor
│   ├── main.py                 ← FastAPI app entry point
│   ├── api/
│   │   ├── routes.py           ← /analyze, /health, /status
│   │   ├── schemas.py          ← Pydantic v2 models
│   │   └── middleware.py       ← CORS + request logging
│   ├── engines/
│   │   ├── whitelist_engine.py ← Trusted domain bypass
│   │   ├── blacklist_engine.py ← OpenPhish feed + cache
│   │   ├── cnn_engine.py       ← Char-CNN inference (PyTorch CPU)
│   │   ├── tree_engine.py      ← LightGBM/RandomForest
│   │   └── orchestrator.py     ← KAAD Core: adaptive fusion
│   ├── training/               ← Standalone offline training scripts
│   │   ├── dataset_utils.py    ← Deduplicate, balance, split
│   │   ├── train_cnn.py        ← CNN training (FocalLoss + 5-fold CV)
│   │   ├── train_tree.py       ← Tree model training
│   │   ├── calibrate.py        ← Temperature scaling calibration
│   │   └── tune_threshold.py   ← ROC threshold tuning
│   ├── models/                 ← Trained model files (generated by training)
│   └── data/
│       ├── raw/                ← Place your phishing dataset CSV here
│       └── whitelists/whitelist.txt
├── frontend/                   ← Next.js 14 UI
│   ├── app/
│   │   ├── layout.tsx
│   │   ├── page.tsx            ← Main scan page
│   │   └── globals.css         ← Cyber dark theme
│   ├── components/
│   │   ├── StatusBar.tsx       ← Online/offline + KAAD branding
│   │   ├── ScanInput.tsx       ← Multi-URL scanner input
│   │   ├── RiskGauge.tsx       ← Animated SVG arc gauge
│   │   ├── EngineCards.tsx     ← Per-engine score cards
│   │   ├── ThreatReasoningPanel.tsx ← Explainability layer
│   │   ├── SystemLog.tsx       ← AI system log panel
│   │   └── ResultCard.tsx      ← Full result per URL
│   └── lib/api.ts              ← Typed fetch client
├── smoke_test.py               ← Quick orchestrator verification
├── start_backend.bat           ← Windows: install + start backend
└── start_frontend.bat          ← Windows: install + start frontend

🚀 Quick Start

Step 1 — Install Python deps & start backend

start_backend.bat

Or manually:

cd backend
pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Step 2 — Start frontend

start_frontend.bat

Or manually:

cd frontend
npm install
npm run dev

Open http://localhost:3000 in your browser.

🤖 Training Your Own Models (Optional but recommended for full accuracy)

The system works without trained models: it falls back to structural risk analysis (URL entropy, TLD risk scoring, suspicious-keyword detection, and similar features). The CNN and Tree engines show as "N/A" until their models are trained.
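
To make the structural fallback concrete, here is a minimal sketch of the kind of features named above. The lookup tables, the function names, and the output keys other than `entropy` and `tld_risk_score` are assumptions for illustration; the project's actual `feature_extractor.py` is not shown in this README and may differ.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical lookup tables; the project's real lists are not shown here.
HIGH_RISK_TLDS = {"tk": 1.0, "ml": 0.9, "xyz": 0.6}
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def structural_features(url: str) -> dict:
    """Extract a few structural signals from a URL without any trained model."""
    host = (urlsplit(url).hostname or "").lower()
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    lowered = url.lower()
    return {
        "entropy": round(shannon_entropy(url), 3),
        "tld_risk_score": HIGH_RISK_TLDS.get(tld, 0.0),
        "keyword_hits": [k for k in SUSPICIOUS_KEYWORDS if k in lowered],
        "url_length": len(url),
    }


if __name__ == "__main__":
    print(structural_features("https://suspicious-site.tk/login"))
```

A random-looking URL yields a higher entropy value than a short, dictionary-like one, which is why entropy is a common phishing signal.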

1. Get a dataset

Download a phishing URL dataset:

PhishTank — Free CSV download
UCI Phishing URLs

Place the CSV in backend/data/raw/.

2. Prepare dataset

cd backend
python -m training.dataset_utils data/raw/phishing.csv url label 1
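
The project structure describes `dataset_utils.py` as "Deduplicate, balance, split". The sketch below illustrates those three steps with the standard library only. The function name `prepare`, the downsampling strategy, and the 80/20 split are assumptions, and the sketch does not mirror the command-line arguments shown above, whose meaning this README does not document.

```python
import random


def prepare(rows, seed=42, val_frac=0.2):
    """Deduplicate, balance, and split (url, label) pairs; labels are 0 or 1."""
    # Deduplicate on the URL, keeping the first label seen for each one.
    seen = {}
    for url, label in rows:
        seen.setdefault(url.strip(), int(label))

    by_label = {0: [], 1: []}
    for url, label in seen.items():
        by_label[label].append(url)

    # Balance by downsampling the majority class to the minority size.
    rng = random.Random(seed)
    n = min(len(by_label[0]), len(by_label[1]))
    balanced = []
    for label, urls in by_label.items():
        rng.shuffle(urls)
        balanced.extend((u, label) for u in urls[:n])

    # Shuffle, then cut into train and validation parts (not stratified).
    rng.shuffle(balanced)
    cut = int(len(balanced) * (1 - val_frac))
    return balanced[:cut], balanced[cut:]


if __name__ == "__main__":
    sample = [("a", 1), ("a", 1), ("b", 0), ("c", 0), ("d", 0), ("e", 1)]
    train, val = prepare(sample)
    print(len(train), len(val))
```

Deduplicating before the split matters: a URL that appears in both the training and validation parts would inflate the measured accuracy.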

3. Train Char-CNN

python -m training.train_cnn
# Saves: backend/models/cnn_model.pt
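
The project structure notes that `train_cnn.py` uses FocalLoss on a character-level CNN. As a rough illustration of those two ideas, the sketch below encodes a URL as fixed-length character indices and computes the standard binary focal loss for a single prediction in plain Python. The vocabulary, `max_len`, and the `gamma`/`alpha` defaults are assumptions; the real training script uses PyTorch and its settings are not shown in this README.

```python
import math
import string

# Hypothetical vocabulary: index 0 is padding, index 1 is "unknown character".
VOCAB = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(VOCAB)}


def encode_url(url: str, max_len: int = 200) -> list:
    """Map a URL to a fixed-length list of character ids (truncate or pad)."""
    ids = [CHAR_TO_ID.get(ch, 1) for ch in url.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one example; p is the predicted P(phishing)."""
    p_t = p if y == 1 else 1.0 - p          # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)         # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(encode_url("http://a.b", max_len=16))
    print(focal_loss(0.95, 1), focal_loss(0.30, 1))
```

The `(1 - p_t) ** gamma` factor shrinks the loss of confidently correct examples, so training effort concentrates on the hard ones.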

4. Train Tree model

python -m training.train_tree
# Saves: backend/models/tree_model.pkl

5. Calibrate + tune threshold

python -m training.calibrate
python -m training.tune_threshold
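
For readers unfamiliar with these two steps, here is a small sketch under stated assumptions. Temperature scaling divides the model's logits by a single scalar T chosen to minimise validation negative log-likelihood; the grid search below stands in for whatever optimiser `calibrate.py` actually uses. For threshold tuning, the sketch picks the cut-off that maximises Youden's J (TPR minus FPR), which is one common ROC-based rule and not necessarily the one in `tune_threshold.py`.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll(logits, labels, temperature: float) -> float:
    """Mean negative log-likelihood of labels after scaling logits by 1/T."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), 1e-12), 1.0 - 1e-12)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits, labels) -> float:
    """Grid-search T in [0.25, 6.0] for the lowest validation NLL."""
    grid = [0.25 + 0.05 * i for i in range(116)]
    return min(grid, key=lambda t: nll(logits, labels, t))


def best_threshold(scores, labels) -> float:
    """Threshold maximising TPR - FPR; needs at least one label of each class."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / pos - fp / neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t


if __name__ == "__main__":
    print(fit_temperature([4, 4, 4, -4, -4, -4, 4, -4], [1, 1, 1, 0, 0, 0, 0, 1]))
    print(best_threshold([0.1, 0.2, 0.35, 0.8, 0.9], [0, 0, 1, 1, 1]))
```

A fitted T above 1 means the raw model was overconfident; scaling softens its probabilities without changing the ranking of URLs.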

Restart the backend server. The engines load the newly trained model files automatically at startup.

📡 API Reference

Base URL: http://localhost:8000

| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/analyze` | POST | Analyze 1–10 URLs |
| `/api/v1/health` | GET | System health + engine status |
| `/api/v1/status` | GET | Version + mode info |
| `/docs` | GET | Swagger UI |

Analyze Request

POST /api/v1/analyze
{ "urls": ["https://suspicious-site.tk/login"] }
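
The request above can be sent from Python with the standard library alone. The sketch below assumes the backend is running at the base URL given in this section and checks the documented limits (1 to 10 URLs, at most 2048 characters each) on the client side before sending. The helper names are illustrative and not part of the project.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/v1/analyze"
MAX_URLS = 10           # documented per-request limit
MAX_URL_LENGTH = 2048   # documented per-URL limit


def build_analyze_request(urls) -> urllib.request.Request:
    """Validate the URL list and build the POST request without sending it."""
    if not 1 <= len(urls) <= MAX_URLS:
        raise ValueError(f"expected 1 to {MAX_URLS} URLs, got {len(urls)}")
    for u in urls:
        if len(u) > MAX_URL_LENGTH:
            raise ValueError(f"URL longer than {MAX_URL_LENGTH} characters")
    body = json.dumps({"urls": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(urls, timeout: float = 10.0):
    """Send the request and return the decoded JSON response."""
    request = build_analyze_request(urls)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the backend from the Quick Start to be running.
    print(analyze(["https://suspicious-site.tk/login"]))
```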

Response

{
  "engine": "RPDS – KAAD CORE",
  "architect": "KAAD",
  "mode": "online",
  "url": "https://suspicious-site.tk/login",
  "final_verdict": "HIGH RISK",
  "final_score": 0.73,
  "confidence_class": "HIGH",
  "engines": {
    "whitelist": {"hit": false, "matched": null},
    "blacklist": {"hit": false, "score": 0.0, "available": true},
    "cnn":       {"score": 0.0, "available": false},
    "tree":      {"score": 0.0, "available": false, "model_type": "none"}
  },
  "threat_reasoning": [
    {"factor": "High-risk TLD (risk=1.0)", "impact": "TLD commonly abused for phishing campaigns"},
    {"factor": "Suspicious keyword detected", "impact": "Common phishing pattern"}
  ],
  "structural_analysis": { "entropy": 3.8, "tld_risk_score": 1.0, ... },
  "threat_signature": "sha256hex...",
  "model_version": "KAAD-1.1.0",
  "analysis_time_ms": 1.3
}
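
The response reports each engine's `available` flag alongside a single `final_score`. This README does not describe how the orchestrator combines them, so the following is only an illustrative guess at what an "adaptive fusion" could look like: a whitelist hit forces a safe score, a blacklist hit forces the maximum, and otherwise the score is a weighted average over whichever engines are available. The weights, the `structural_score` input, and the function name are all hypothetical.

```python
# Hypothetical weights; the real orchestrator's values are not documented here.
WEIGHTS = {"cnn": 0.5, "tree": 0.3, "structural": 0.2}


def fuse(engines: dict, structural_score: float) -> float:
    """Combine engine outputs into one score in [0, 1] (illustrative only)."""
    if engines["whitelist"]["hit"]:
        return 0.0   # trusted domain: bypass the other engines
    if engines["blacklist"]["hit"]:
        return 1.0   # known phishing URL: maximum risk

    # Start from the structural score, which needs no trained model.
    scores = {"structural": structural_score}
    for name in ("cnn", "tree"):
        if engines[name]["available"]:
            scores[name] = engines[name]["score"]

    # Renormalise the weights over the engines that actually contributed.
    total_weight = sum(WEIGHTS[name] for name in scores)
    return sum(WEIGHTS[name] * s for name, s in scores.items()) / total_weight


if __name__ == "__main__":
    engines = {
        "whitelist": {"hit": False, "matched": None},
        "blacklist": {"hit": False, "score": 0.0, "available": True},
        "cnn": {"score": 0.0, "available": False},
        "tree": {"score": 0.0, "available": False, "model_type": "none"},
    }
    print(fuse(engines, structural_score=0.73))
```

Renormalising over available engines keeps the score on the same 0 to 1 scale whether or not the CNN and Tree models have been trained.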

🔒 Stability Guarantees

| Rule | Implementation |
|---|---|
| Models load once | `@app.on_event("startup")` |
| No retraining during inference | Training scripts are standalone |
| Blacklist timeout + circuit breaker | `circuit_breaker.py` |
| Every engine call wrapped in try/except | Returns a degraded result instead of crashing |
| Offline fallback | When offline, falls back to the CNN + Tree engines only |
| Max 10 URLs / request | Pydantic `max_length=10` |
| Max URL length 2048 | Validator + Pydantic |
| Global exception handler | Returns a JSON error with status 500 (never a stack trace) |
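
The "timeout + circuit breaker" rule can be pictured with the sketch below. It is a generic asyncio circuit breaker, not the contents of `circuit_breaker.py`: the class name, parameters, and thresholds are assumptions. After a number of consecutive failures the breaker opens and rejects calls immediately until a cool-down has passed, so a slow or dead blacklist feed cannot stall every request.

```python
import asyncio
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, call_timeout=5.0):
        self.max_failures = max_failures
        self.reset_after = reset_after      # seconds to stay open
        self.call_timeout = call_timeout    # per-call timeout in seconds
        self.failures = 0
        self.opened_at = None

    async def call(self, func, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
        try:
            result = await asyncio.wait_for(func(*args), timeout=self.call_timeout)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0   # a success closes the circuit fully
        return result


async def _demo():
    async def failing_feed():
        raise ConnectionError("feed unreachable")

    breaker = CircuitBreaker(max_failures=2, reset_after=60.0, call_timeout=1.0)
    for _ in range(3):
        try:
            await breaker.call(failing_feed)
        except (ConnectionError, CircuitOpenError) as exc:
            print(type(exc).__name__)


if __name__ == "__main__":
    asyncio.run(_demo())
```

A caller would typically catch `CircuitOpenError` and return a degraded result (for example, marking the blacklist engine as unavailable) instead of failing the request.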

🎯 Target Performance (after training on quality dataset)

| Metric | Target |
|---|---|
| Accuracy | > 99% |
| Precision | > 98% |
| Recall | > 97% |
| False Positive Rate | < 1% |
| ROC-AUC | > 0.995 |
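
To check a trained model against these targets, the metrics can be computed from validation labels and predictions as sketched below. The function names are illustrative, and the ROC-AUC here uses the simple pairwise-ranking definition, which is fine for small validation sets but quadratic in size.

```python
def classification_metrics(y_true, y_pred) -> dict:
    """Accuracy, precision, recall, and false positive rate for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }


def roc_auc(scores, labels) -> float:
    """Probability that a random phishing URL outscores a random benign one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 0, 1, 1]))
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```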
