gopaljilab/Clinical-Insight-Engine

🩺 Clinical Insight Engine

Clinical Decision Support for Preventive Diabetes Risk Assessment

Interpretable ML + Modern React, built for clinicians and patients alike

Python TypeScript React Express PostgreSQL scikit-learn GSSoC

A full-stack clinical decision support system that surfaces early diabetes risk signals from routine patient data.
Combines an interpretable ML model with a modern React frontend, presenting results tailored for both clinicians and patients.

Warning

Medical Disclaimer: This system is intended for educational and research purposes only. It does not provide medical diagnoses and should not be used as a substitute for professional medical advice.


💡 Why Clinical Insight Engine?

Diabetes affects over 500 million adults worldwide, yet early risk signals are often buried in routine clinical data. Clinical Insight Engine bridges that gap:

| Problem | Our Approach |
| --- | --- |
| Risk models are opaque black boxes | Interpretable Logistic Regression with per-feature impact scores |
| Results are one-size-fits-all | Dual-view output: detailed for clinicians, simplified for patients |
| Predictions lack context | Confidence-aware assessments with actionable follow-up recommendations |
| Patient data sits in silos | Longitudinal tracking with full assessment history |

✨ Key Features

🧾 Risk Assessment Form

Collects clinically relevant inputs:

Age · Gender · Hypertension · Heart Disease · Smoking History · BMI · HbA1c · Blood Glucose

👥 Dual-View Results

🩻 Clinician View

  • Exact risk percentage (0–100%)
  • Top contributing factors with impact scores
  • Model confidence indicators
  • Suggested clinical follow-up actions
  • Interactive factor contribution charts
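
For a logistic regression on standardized inputs, one common way to derive per-feature impact scores is to multiply each coefficient by the patient's standardized value; the sum of those products plus the intercept is the log-odds, which the sigmoid maps to a risk percentage. The sketch below illustrates that idea only. The coefficients, intercept, and the name `risk_breakdown` are hypothetical and are not taken from analyze.py.

```python
import math


def risk_breakdown(coefficients, intercept, standardized_features):
    """Return (risk_percent, contributions ranked by absolute impact).

    Each contribution is coefficient * standardized value, i.e. that
    feature's share of the log-odds. All numbers passed in are illustrative.
    """
    contributions = {
        name: coefficients[name] * value
        for name, value in standardized_features.items()
    }
    log_odds = intercept + sum(contributions.values())
    risk_percent = 100.0 / (1.0 + math.exp(-log_odds))
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return risk_percent, ranked


# Hypothetical coefficients and one patient's standardized inputs.
coefs = {"hba1cLevel": 1.2, "bloodGlucoseLevel": 0.9, "bmi": 0.4, "age": 0.5}
patient = {"hba1cLevel": 1.0, "bloodGlucoseLevel": 0.5, "bmi": -0.5, "age": 0.0}
risk, factors = risk_breakdown(coefs, -1.45, patient)
```

With these made-up numbers the contributions cancel the intercept, so the risk comes out near 50%, and HbA1c ranks first because it has the largest absolute contribution.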

πŸ§‘β€βš•οΈ Patient View

  • Simplified category: LOW / MODERATE / HIGH
  • Plain-language explanation of risk drivers
  • Personalized preventive lifestyle guidance
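
The simplified category can be thought of as a bucketing of the clinician-facing percentage. A minimal sketch follows, assuming cut-offs of 30% and 60%; these thresholds and the function name `risk_category` are placeholders, since this README does not state the project's actual boundaries.

```python
def risk_category(risk_percent, moderate_from=30.0, high_from=60.0):
    """Map a 0-100 risk percentage to LOW / MODERATE / HIGH.

    The 30% and 60% cut-offs are placeholders, not the project's thresholds.
    """
    if not 0.0 <= risk_percent <= 100.0:
        raise ValueError("risk_percent must be between 0 and 100")
    if risk_percent >= high_from:
        return "HIGH"
    if risk_percent >= moderate_from:
        return "MODERATE"
    return "LOW"
```

Keeping the thresholds as parameters makes it easy to adjust them without touching the mapping logic.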

🕒 Assessment History

  • Stores assessments with full timestamps
  • Enables longitudinal patient risk tracking over time
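
As an illustration of what timestamped history makes possible, the sketch below computes how a patient's risk changed between the earliest and latest stored assessment. The record keys `createdAt` and `riskPercent` are assumptions made for this example; the real column names live in shared/schema.ts and may differ.

```python
from datetime import datetime


def risk_trend(assessments):
    """Return the change in risk (percentage points) from the earliest
    to the latest assessment, or None with fewer than two records.

    Each record is a dict with hypothetical keys "createdAt" (ISO 8601
    string) and "riskPercent" (float).
    """
    if len(assessments) < 2:
        return None
    ordered = sorted(assessments, key=lambda a: datetime.fromisoformat(a["createdAt"]))
    return ordered[-1]["riskPercent"] - ordered[0]["riskPercent"]


history = [
    {"createdAt": "2024-03-01T09:30:00", "riskPercent": 41.0},
    {"createdAt": "2024-01-15T14:00:00", "riskPercent": 48.5},
    {"createdAt": "2024-06-20T11:15:00", "riskPercent": 36.5},
]
change = risk_trend(history)  # negative means the risk went down
```

Sorting by the parsed timestamp rather than by insertion order keeps the result correct even when records arrive out of order, as in the sample list.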

📊 Data Visualization

  • Interactive bar charts for factor contributions
  • Diabetes correlation heatmap for data exploration

πŸ— Architecture

graph TB
    subgraph Client["🖥️ Client - React + TypeScript"]
        UI["Risk Assessment Form"]
        CV["Clinician View"]
        PV["Patient View"]
        VIZ["Data Visualizations"]
        HIST["Assessment History"]
    end

    subgraph Server["⚙️ Server - Express.js"]
        API["REST API Routes"]
        VAL["Zod Validation"]
        ORM["Drizzle ORM"]
        PY["Python Bridge"]
    end

    subgraph ML["🧠 ML Pipeline - Python"]
        PROC["Data Preprocessing"]
        MODEL["Logistic Regression"]
        INTERP["Feature Interpretation"]
        CACHE["Model Cache (pickle)"]
    end

    subgraph DB["🗄️ PostgreSQL"]
        ASSESS["Assessments Table"]
    end

    Client -->|"HTTP Requests"| API
    API --> VAL --> ORM
    API --> PY -->|"spawn process"| ML
    ORM --> DB
    ML -->|"risk scores + factors"| PY
    CACHE -.->|"load cached model"| MODEL
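
The diagram shows the Express server spawning a Python process and receiving risk scores and factors back. One simple way such a bridge can exchange data is JSON over stdin and stdout. The sketch below shows the Python side under that assumption; `handle_request`, `main`, `fake_predict`, and the `ok`/`result`/`error` envelope are hypothetical names for illustration, not the project's actual protocol.

```python
import io
import json


def handle_request(raw_json, predict):
    """Decode one JSON request, run `predict`, and encode a JSON reply.

    `predict` is any callable taking the patient dict and returning a
    JSON-serializable result. Errors are reported instead of raised so the
    parent process always receives valid JSON.
    """
    try:
        patient = json.loads(raw_json)
        return json.dumps({"ok": True, "result": predict(patient)})
    except (ValueError, KeyError, TypeError) as exc:
        return json.dumps({"ok": False, "error": str(exc)})


def main(stdin, stdout, predict):
    """Read the whole request from stdin and write one reply line to stdout."""
    stdout.write(handle_request(stdin.read(), predict) + "\n")


def fake_predict(patient):
    # Placeholder standing in for the real model call.
    return {"riskPercent": 42.0, "age": patient["age"]}


# Simulate what a parent process would pipe in and read back.
out = io.StringIO()
main(io.StringIO('{"age": 52}'), out, fake_predict)
reply = json.loads(out.getvalue())
```

In a real script the entry point would call `main(sys.stdin, sys.stdout, predict)`. Passing the streams in as arguments keeps the handler testable without spawning a process.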

🛠 Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Frontend | React 18 + TypeScript | UI framework with type safety |
| Frontend | Vite | Lightning-fast dev server & bundler |
| Frontend | Tailwind CSS | Utility-first styling with dark mode |
| Frontend | TanStack Query | Server state & cache management |
| Frontend | React Hook Form + Zod | Form handling with schema validation |
| Frontend | Recharts | Interactive data visualizations |
| Frontend | Framer Motion | Smooth UI animations |
| Backend | Express.js | REST API server |
| Backend | Drizzle ORM | Type-safe database queries |
| Backend | PostgreSQL 14+ | Relational data storage |
| Backend | Zod | Runtime schema validation |
| ML Pipeline | Python 3.10+ | ML runtime environment |
| ML Pipeline | scikit-learn | Logistic Regression model |
| ML Pipeline | pandas / NumPy | Data manipulation & preprocessing |
| ML Pipeline | pickle | Model & scaler caching |

🚀 Getting Started

Prerequisites

| Tool | Version | Check | Download |
| --- | --- | --- | --- |
| Node.js | 18+ LTS | node -v | nodejs.org |
| npm | 9+ | npm -v | bundled with Node |
| Python | 3.10+ | python3 --version | python.org |
| PostgreSQL | 14+ | psql --version | postgresql.org |
| Git | Any | git --version | git-scm.com |
| Docker | 20+ | docker --version | docker.com |
| Docker Compose | 2+ | docker compose version | bundled with Docker |

🐳 Fast Setup with Docker (Recommended)

If you have Docker installed, you can skip the manual installation of Node.js, Python, and PostgreSQL entirely. Running the application requires just a single command.

1. Launching the App

Simply run the following command in the project root:

docker compose up

This command will:

  • Spin up a PostgreSQL 16 database container with persistent storage.
  • Build the app container including Node.js 20 and a Python 3 virtual environment with all scikit-learn/pandas dependencies.
  • Wait for the database to be healthy, then run migrations (npm run db:push).
  • Automatically seed the database with sample clinical assessments (in development mode).
  • Launch the full-stack server with live-reloading (HMR) enabled.

Once started, open your browser and navigate to:

2. Stop the App

To stop the services while preserving your data:

docker compose down

To stop the services and completely reset the database (deleting persistent volumes):

docker compose down -v

3. Rebuilding after Updates

If you update package.json or requirements.txt dependencies, trigger a clean rebuild:

docker compose up --build

βš™οΈ Manual Installation & Setup

1. 📥 Clone & Install

git clone https://github.com/gopaljilab/Clinical-Insight-Engine.git
cd Clinical-Insight-Engine
npm install

2. πŸ” Environment Configuration

Linux / macOS

cp .env.example .env

Windows (PowerShell)

Copy-Item .env.example .env

Windows (Command Prompt)

copy .env.example .env

If .env.example doesn't exist, create .env manually and add:

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

🧪 Developer Authentication Setup (optional)

For local frontend authentication testing, create a .env.local file (git-ignored):

NODE_ENV=development
NEXT_PUBLIC_APP_URL=http://localhost:3000

DEV_CLINICIAN_EMAIL=developer@cardioguard.local
DEV_CLINICIAN_PASSWORD=DevSecurePassword123!

NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY=your_local_32_character_secret_key_here

Rules of thumb:

  • 🔒 .env → database & server secrets only
  • 🔒 .env.local → local seeded credentials only (never commit)
  • Restart the dev server after editing .env.local so Vite reloads variables
  • Never paste demo credentials into UI, docs, screenshots, or PRs

🖥️ Local Login Workflow

  1. Start the app with npm run dev
  2. Open http://localhost:5173
  3. Click Login or Go to App
  4. Enter your .env.local seeded credentials
  5. Complete the simulated OTP step
  6. You'll be redirected to /dashboard

In development mode, the login form shows a small amber notice reminding you to use local seeded credentials. This banner and the DEV_* variables are never exposed in production builds.

3. 🗄 Database Setup

🐧 Linux (Ubuntu / Debian)
# Install PostgreSQL
sudo apt update && sudo apt install postgresql postgresql-contrib

# Start & enable the service
sudo systemctl start postgresql
sudo systemctl enable postgresql

# Create database & set password
sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
sudo -u postgres psql -c "CREATE DATABASE clinical_insight_engine;"

🍎 macOS (Homebrew)
# Install PostgreSQL
brew install postgresql

# Start the service
brew services start postgresql

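# Create database & set password
# (Homebrew installs may not create a "postgres" role; if the commands below
# fail with "role does not exist", run `createuser -s postgres` first)
psql postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';"
psql postgres -c "CREATE DATABASE clinical_insight_engine;"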

🪟 Windows
  1. Download and install PostgreSQL from postgresql.org/download/windows
  2. During installation, use:
    • Username: postgres
    • Password: postgres
    • Port: 5432
  3. Create a database named clinical_insight_engine using pgAdmin or the PostgreSQL CLI.
  4. Update your .env file:
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

Push the database schema:

npm run db:push

The server runs a PostgreSQL preflight check on startup. If you see "Database startup check failed", verify that:

  • The PostgreSQL service is running
  • DATABASE_URL in .env is correct
  • The schema push above (npm run db:push) has been run
  • Port 5432 is not blocked
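
Two of the checks above, the DATABASE_URL format and port reachability, can be scripted. The following is a minimal sketch using only the Python standard library; `parse_database_url` and `port_is_open` are hypothetical helper names and are unrelated to the server's own preflight check.

```python
import socket
from urllib.parse import urlparse


def parse_database_url(url):
    """Split a postgresql:// URL into host, port, and database name."""
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return parts.hostname, parts.port or 5432, parts.path.lstrip("/")


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


host, port, database = parse_database_url(
    "postgresql://postgres:postgres@localhost:5432/clinical_insight_engine"
)
```

Calling `port_is_open(host, port)` then tells you whether anything is listening there. An open port does not prove the credentials or database name are valid, so treat it as a first diagnostic step only.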

4. 🐍 Python Environment

🐧 Linux / 🍎 macOS
# Create virtual environment
python3 -m venv .venv

# Activate
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

🪟 Windows (PowerShell)
# Create virtual environment
py -m venv .venv

# Activate
.\.venv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

5. 📊 Dataset Preparation

If the dataset already exists in the project:

# Linux / macOS
cp attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv

# Windows (PowerShell)
Copy-Item attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv

If the dataset is missing, generate synthetic data:

# Linux / macOS
python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"

# Windows
py -c "from analyze import create_synthetic_data; create_synthetic_data()"

6. 🚀 Launch

# Start the full-stack dev server
npm run dev

| Service | URL |
| --- | --- |
| Frontend | http://localhost:5173 |
| Backend API | http://localhost:3000 |

7. 🛑 Shutting Down

Stop the dev server:

Ctrl + C

Deactivate the Python environment:

deactivate

πŸ“ Project Structure

Clinical-Insight-Engine/
│
├── client/                        # React frontend
│   └── src/
│       ├── components/            # Reusable UI components
│       ├── pages/                 # Route-level page components
│       ├── hooks/                 # Custom React hooks
│       │   ├── use-assessments.ts # TanStack Query hooks for API calls
│       │   └── use-toast.ts       # Toast notification state
│       ├── lib/                   # Utilities & API client
│       │   ├── queryClient.ts     # Global fetch config + React Query setup
│       │   └── utils.ts           # cn() Tailwind class merge utility
│       └── utils/
│           ├── search_filters.ts  # Patient search & filter logic
│           └── date_fix.ts        # Safe date parser helper
│
├── server/                        # Express.js backend
│   ├── index.ts                   # Server entry point & startup
│   ├── routes.ts                  # API route definitions
│   ├── storage.ts                 # Data access layer (DB queries)
│   ├── db.ts                      # Drizzle ORM + PostgreSQL pool
│   ├── static.ts                  # Serves built React frontend
│   ├── vite.ts                    # Vite dev server integration (HMR)
│   └── db_fix.ts                  # Clean process exit on DB errors
│
├── shared/                        # Shared between client & server
│   ├── schema.ts                  # Drizzle DB schema + Zod types
│   └── routes.ts                  # Shared API request/response schemas
│
├── script/
│   └── build.ts                   # esbuild + Vite production build script
│
├── attached_assets/               # Static assets (dataset, images)
│   └── diabetes_dataset.csv
│
├── analyze.py                     # ML pipeline: training & inference
├── main.py                        # Python entry point
├── diabetes_dataset.csv           # Training dataset (root copy)
├── correlation_heatmap.png        # Diabetes feature correlation heatmap
├── patient.json                   # Sample patient input for CLI prediction
│
├── drizzle.config.ts              # Drizzle ORM configuration
├── vite.config.ts                 # Vite bundler configuration
├── tailwind.config.ts             # Tailwind CSS configuration
├── tsconfig.json                  # TypeScript configuration
├── postcss.config.js              # PostCSS configuration
├── components.json                # shadcn/ui component registry
├── pyproject.toml                 # Python project metadata
├── requirements.txt               # Python dependencies
├── package.json                   # Node.js dependencies & scripts
├── package-lock.json              # Locked dependency versions
├── uv.lock                        # uv Python lock file
│
├── README.md                      # Project documentation
├── ANALYSIS_README.md             # ML analysis documentation
├── CONTRIBUTING.md                # Contribution guidelines
└── CODE_OF_CONDUCT.md             # Community code of conduct

📡 API Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Application health check endpoint for monitoring |
| POST | /api/assessments | Submit a new risk assessment |
| GET | /api/assessments | Retrieve assessment history |
| GET | /api/assessments/:id | Get a specific assessment by ID |

Example Request

# Health Check
curl -X GET http://localhost:3000/health

# Submit Assessment
curl -X POST http://localhost:3000/api/assessments \
  -H "Content-Type: application/json" \
  -d '{
    "gender": "Female",
    "age": 52,
    "hypertension": true,
    "heartDisease": false,
    "smokingHistory": "former",
    "bmi": 30.1,
    "hba1cLevel": 6.4,
    "bloodGlucoseLevel": 148
  }'
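
The same submission can be made from Python. The sketch below builds the POST request with the standard library and checks the body against the 256kb request limit documented in this README; `build_assessment_request` and `MAX_BODY_BYTES` are hypothetical names, and the response shape is not specified here, so the example stops short of parsing specific fields.

```python
import json
import urllib.request

MAX_BODY_BYTES = 256 * 1024  # mirrors the 256kb limit described in this README


def build_assessment_request(payload, base_url="http://localhost:3000"):
    """Build (but do not send) a POST request for /api/assessments."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("payload exceeds the documented request limit")
    return urllib.request.Request(
        f"{base_url}/api/assessments",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_assessment_request({
    "gender": "Female",
    "age": 52,
    "hypertension": True,
    "heartDisease": False,
    "smokingHistory": "former",
    "bmi": 30.1,
    "hba1cLevel": 6.4,
    "bloodGlucoseLevel": 148,
})

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=10) as response:
#     assessment = json.loads(response.read())
```

Separating request construction from sending makes the payload easy to inspect or log before it reaches the server.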

🧠 ML Pipeline

The machine learning pipeline (analyze.py) implements an interpretable risk assessment model:

graph LR
    A["📂 Raw Data"] --> B["🧹 Cleaning & Validation"]
    B --> C["⚙️ Feature Engineering"]
    C --> D["📏 StandardScaler"]
    D --> E["📊 Logistic Regression"]
    E --> F["🎯 Risk Score 0–100%"]
    E --> G["📋 Feature Importance"]
    F --> H["💾 Cached Model"]
    G --> H

| Step | Details |
| --- | --- |
| Data Cleaning | Filters unrealistic values (BMI < 10, glucose < 50, HbA1c < 3) and replaces them with medians |
| Encoding | Gender → binary; smoking history → one-hot encoding |
| Scaling | StandardScaler on age, BMI, HbA1c, blood glucose |
| Model | LogisticRegression with balanced class weights |
| Caching | Trained model + scaler serialized via pickle for fast inference |
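
To make the cleaning, encoding, and scaling steps concrete, here is a minimal standard-library sketch of each. It is an illustration under assumptions, not the code in analyze.py: the helper names are hypothetical, the median is computed over the valid values only (the pipeline may do this differently), and the smoking categories listed are examples.

```python
import statistics


def replace_unrealistic(values, minimum):
    """Replace values below `minimum` with the median of the valid values."""
    valid = [v for v in values if v >= minimum]
    median = statistics.median(valid)
    return [v if v >= minimum else median for v in values]


def standardize(values):
    """Scale to zero mean and unit variance (population standard deviation,
    which is what scikit-learn's StandardScaler uses)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def one_hot(value, categories):
    """Encode one categorical value as a list of 0/1 indicators."""
    return [1 if value == category else 0 for category in categories]


bmi = replace_unrealistic([22.0, 3.0, 30.0, 26.0], minimum=10)
scaled = standardize([2.0, 4.0, 6.0])
smoking = one_hot("former", ["never", "former", "current"])
```

In the sample call, the BMI of 3.0 falls below the threshold of 10 and is replaced by 26.0, the median of the three remaining values.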

Train the Model (Optional)

# Linux/macOS
python3 analyze.py

# Windows
py analyze.py
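
Training is optional because the model and scaler are cached with pickle. A load-or-train pattern along the following lines is one way to get that behavior; `load_or_train`, `fake_train`, and the `model.pkl` file name are assumptions for this sketch, not the names used in analyze.py.

```python
import os
import pickle
import tempfile


def load_or_train(cache_path, train):
    """Return the cached object if present, otherwise train and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            return pickle.load(handle)
    model = train()
    with open(cache_path, "wb") as handle:
        pickle.dump(model, handle)
    return model


calls = []


def fake_train():
    # Stand-in for an expensive training run.
    calls.append(1)
    return {"coefficients": [0.5, 1.2], "intercept": -0.3}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    first = load_or_train(path, fake_train)   # trains and writes the cache
    second = load_or_train(path, fake_train)  # loads from the cache
```

Unpickling can execute arbitrary code, so a cache like this should only ever load files the application wrote itself.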

🔬 Single-Patient Prediction (CLI)

Create a patient JSON file:

{
  "gender": "Female",
  "age": 52,
  "hypertension": true,
  "heartDisease": false,
  "smokingHistory": "former",
  "bmi": 30.1,
  "hba1cLevel": 6.4,
  "bloodGlucoseLevel": 148
}

Run prediction:

# Linux/macOS
python3 analyze.py predict_file patient.json

# Windows
py analyze.py predict_file patient.json
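
Before running a prediction, it can help to confirm that the patient file has the expected fields. The sketch below checks for the eight keys shown in the sample JSON above and their basic types; `validate_patient` is a hypothetical helper, and the CLI may apply its own, different validation.

```python
import json

# Field names and types as they appear in the sample patient file.
REQUIRED_FIELDS = {
    "gender": str,
    "age": (int, float),
    "hypertension": bool,
    "heartDisease": bool,
    "smokingHistory": str,
    "bmi": (int, float),
    "hba1cLevel": (int, float),
    "bloodGlucoseLevel": (int, float),
}


def validate_patient(patient):
    """Return a list of problems; an empty list means the record looks usable."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in patient:
            problems.append(f"missing field: {field}")
        elif not isinstance(patient[field], expected) or (
            # bool is a subclass of int, so reject it for numeric fields
            expected is not bool and isinstance(patient[field], bool)
        ):
            problems.append(f"wrong type for {field}")
    return problems


sample = json.loads("""{
  "gender": "Female", "age": 52, "hypertension": true,
  "heartDisease": false, "smokingHistory": "former",
  "bmi": 30.1, "hba1cLevel": 6.4, "bloodGlucoseLevel": 148
}""")
issues = validate_patient(sample)
```

Returning a list of problems rather than raising on the first one lets a user fix every issue in the file in a single pass.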

🔑 Environment Variables

| Variable | File | Description |
| --- | --- | --- |
| DATABASE_URL | .env | PostgreSQL connection string |
| NODE_ENV | .env.local | Set to development for local dev features |
| SESSION_SECRET | .env | Required in production for signed Express sessions |
| DEV_CLINICIAN_EMAIL | .env.local | Seeded clinician email (dev only) |
| DEV_CLINICIAN_PASSWORD | .env.local | Seeded clinician password (dev only) |
| NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY | .env.local | Local encryption key (dev only) |

Security: .env.local is git-ignored and should never be committed. Production builds do not expose dev credentials.

Request limits: JSON and URL-encoded API payloads are limited to 256kb by default. Add route-specific upload handling before increasing this global limit.

Production sessions: When the app runs behind a TLS-terminating reverse proxy or load balancer, Express trusts one proxy hop in production so secure session cookies are issued from X-Forwarded-Proto: https requests.


❓ Troubleshooting

"PostgreSQL is unreachable"
  • Verify PostgreSQL is running: sudo systemctl status postgresql (Linux) or brew services list (macOS)
  • Confirm DATABASE_URL in .env matches your local credentials
  • Ensure port 5432 is not blocked by another process
  • Check that the clinical_insight_engine database exists
"Database startup check failed"
  • Run npm run db:push to create/update the required tables
  • Verify your .env file is in the project root (not inside server/ or client/)
Python model errors
  • Ensure the virtual environment is activated: source .venv/bin/activate
  • Verify dependencies: pip install -r requirements.txt
  • If diabetes_dataset.csv is missing, copy it: cp attached_assets/diabetes_dataset.csv ./
  • Or generate synthetic data: python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"
Port conflicts
  • The dev server defaults to port 5173 (Vite)
  • If occupied, Vite will automatically pick the next available port
  • Check for processes: lsof -i :5173 (Linux/macOS) or netstat -ano | findstr :5173 (Windows)

🗺 Roadmap

  • 📈 Longitudinal patient risk tracking across visits
  • 💡 Counterfactual reasoning: "What single change reduces risk most?"
  • 🔬 Cohort discovery and population-level insights
  • 🏥 Integration with Electronic Health Records (EHR)
  • ⚖️ Advanced bias detection and ML fairness metrics
  • ☁️ Cloud deployment (Vercel / Render)

🤝 Contributing

We love contributions! Whether it's a bug fix, a new feature, or improved docs, every PR makes a difference.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feat/amazing-feature)
  3. Commit your changes (git commit -m 'feat: add amazing feature')
  4. Push to the branch (git push origin feat/amazing-feature)
  5. Open a Pull Request

Please read our Contributing Guide and Code of Conduct before submitting.


👥 Contributors

👤 Author - GitHub

Gopal Gupta
Computer Science Engineer · Full-Stack Developer · Data Science & ML Enthusiast

Built with ❤️ for better preventive healthcare

⭐ Star this repo if you find it useful; it helps others discover the project!

GSSoC Drizzle Migrations Policy

  • All schema changes must go through drizzle-kit generate.

✨ README Improvement Notes

📌 Formatting Enhancements Needed

  • Improve heading hierarchy for better readability
  • Ensure consistent spacing between sections
  • Use proper Markdown formatting for code blocks and lists
  • Align all installation and usage steps properly

🚀 Suggested Structure Upgrade

  • Introduction
  • Features
  • Tech Stack
  • Installation
  • Usage
  • Project Structure
  • Contribution Guidelines
  • License

πŸ› οΈ Documentation Improvements

  • Add badges (optional): build, license, contributors
  • Add screenshots for better UI understanding
  • Standardize code blocks for commands

🎯 Goal

Improve onboarding experience for new contributors and users by making README more structured, readable, and professional.
