Interpretable ML + Modern React, built for clinicians and patients alike
A full-stack clinical decision support system that surfaces early diabetes risk signals from routine patient data.
Combines an interpretable ML model with a modern React frontend, presenting results tailored for both clinicians and patients.
Warning
Medical Disclaimer: This system is intended for educational and research purposes only. It does not provide medical diagnoses and should not be used as a substitute for professional medical advice.
- Why Clinical Insight Engine?
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Project Structure
- API Reference
- ML Pipeline
- Single-Patient Prediction
- Environment Variables
- Troubleshooting
- Roadmap
- Contributing
- Contributors
Diabetes affects over 500 million adults worldwide, yet early risk signals are often buried in routine clinical data. Clinical Insight Engine bridges that gap:
| Problem | Our Approach |
|---|---|
| Risk models are opaque black boxes | Interpretable Logistic Regression with per-feature impact scores |
| Results are one-size-fits-all | Dual-view output: detailed for clinicians, simplified for patients |
| Predictions lack context | Confidence-aware assessments with actionable follow-up recommendations |
| Patient data sits in silos | Longitudinal tracking with full assessment history |
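
As a rough illustration of what "per-feature impact scores" can mean for a logistic regression, the sketch below multiplies each coefficient by the patient's standardized value and passes the sum through a sigmoid. The coefficients, intercept, and feature names are hypothetical and are not taken from the trained model in analyze.py.

```python
import math

# Hypothetical coefficients for standardized features (illustrative values only,
# not read from the project's trained model).
COEFFICIENTS = {"age": 0.6, "bmi": 0.4, "hba1c": 1.2, "blood_glucose": 0.9}
INTERCEPT = -1.5


def sigmoid(z: float) -> float:
    """Map log-odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def explain(features: dict[str, float]) -> tuple[float, dict[str, float]]:
    """Return (risk probability, per-feature contribution to the log-odds)."""
    contributions = {name: COEFFICIENTS[name] * value for name, value in features.items()}
    log_odds = INTERCEPT + sum(contributions.values())
    return sigmoid(log_odds), contributions


# Standardized inputs: 0.0 is the training mean, 1.0 is one standard deviation above it.
patient = {"age": 0.5, "bmi": 1.0, "hba1c": 1.5, "blood_glucose": 1.0}
risk, impact = explain(patient)

# Print the factors from largest to smallest absolute contribution.
for name, value in sorted(impact.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>14}: {value:+.2f}")
print(f"risk: {risk:.1%}")
```

Because the contributions add up to the log-odds, a clinician-facing view can show exactly how much each factor pushed the score up or down.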
Collects clinically relevant inputs:
Age · Gender · Hypertension · Heart Disease · Smoking History · BMI · HbA1c · Blood Glucose
Clinician View · Patient View
- Stores assessments with full timestamps
- Enables longitudinal patient risk tracking over time
- Interactive bar charts for factor contributions
- Diabetes correlation heatmap for data exploration
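
One way to picture longitudinal tracking over timestamped assessments is sketched below. The `Assessment` record and its field names are hypothetical; the real table columns may differ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Assessment:
    # Hypothetical record shape, used only for this illustration.
    created_at: datetime
    risk_score: float  # percentage, 0-100


def risk_trend(history):
    """Sort assessments by time and report the change since the previous visit."""
    ordered = sorted(history, key=lambda a: a.created_at)
    rows = []
    for index, current in enumerate(ordered):
        change = 0.0 if index == 0 else current.risk_score - ordered[index - 1].risk_score
        rows.append((current.created_at.date().isoformat(), current.risk_score, change))
    return rows


# Assessments may arrive out of order; the trend is always computed chronologically.
history = [
    Assessment(datetime(2024, 6, 1), 48.0),
    Assessment(datetime(2024, 1, 15), 35.0),
    Assessment(datetime(2024, 9, 20), 41.0),
]
trend = risk_trend(history)
for day, score, change in trend:
    print(f"{day}  risk={score:.0f}%  change={change:+.0f}")
```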
graph TB
subgraph Client["Client - React + TypeScript"]
UI["Risk Assessment Form"]
CV["Clinician View"]
PV["Patient View"]
VIZ["Data Visualizations"]
HIST["Assessment History"]
end
subgraph Server["Server - Express.js"]
API["REST API Routes"]
VAL["Zod Validation"]
ORM["Drizzle ORM"]
PY["Python Bridge"]
end
subgraph ML["ML Pipeline - Python"]
PROC["Data Preprocessing"]
MODEL["Logistic Regression"]
INTERP["Feature Interpretation"]
CACHE["Model Cache (pickle)"]
end
subgraph DB["PostgreSQL"]
ASSESS["Assessments Table"]
end
Client -->|"HTTP Requests"| API
API --> VAL --> ORM
API --> PY -->|"spawn process"| ML
ORM --> DB
ML -->|"risk scores + factors"| PY
CACHE -.->|"load cached model"| MODEL
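
The "load cached model" edge in the diagram can be pictured as a load-or-train step. The sketch below is a minimal stand-in, assuming a single pickle file; the file name and the `train_model` placeholder are hypothetical, and the real logic in analyze.py may differ.

```python
import pickle
import tempfile
from pathlib import Path


def train_model() -> dict:
    """Stand-in for training; a real pipeline would fit a model and scaler here."""
    return {"coefficients": [0.6, 0.4], "intercept": -1.5}


def load_or_train(cache_path: Path) -> tuple[dict, bool]:
    """Return (model, loaded_from_cache)."""
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh), True
    # No cache yet: train once and store the result for later runs.
    model = train_model()
    with cache_path.open("wb") as fh:
        pickle.dump(model, fh)
    return model, False


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "model_cache.pkl"  # hypothetical file name
    first, first_cached = load_or_train(cache)    # trains and writes the cache
    second, second_cached = load_or_train(cache)  # reads the cache
    print(first_cached, second_cached)
```

Pickle files can execute arbitrary code when loaded, so a cache like this should only ever be read from a location the application itself controls.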
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 18 + TypeScript | UI framework with type safety |
| | Vite | Lightning-fast dev server & bundler |
| | Tailwind CSS | Utility-first styling with dark mode |
| | TanStack Query | Server state & cache management |
| | React Hook Form + Zod | Form handling with schema validation |
| | Recharts | Interactive data visualizations |
| | Framer Motion | Smooth UI animations |
| Backend | Express.js | REST API server |
| | Drizzle ORM | Type-safe database queries |
| | PostgreSQL 14+ | Relational data storage |
| | Zod | Runtime schema validation |
| ML Pipeline | Python 3.10+ | ML runtime environment |
| | scikit-learn | Logistic Regression model |
| | pandas / NumPy | Data manipulation & preprocessing |
| | pickle | Model & scaler caching |
| Tool | Version | Check | Download |
|---|---|---|---|
| Node.js | 18+ LTS | `node -v` | nodejs.org |
| npm | 9+ | `npm -v` | bundled with Node |
| Python | 3.10+ | `python3 --version` | python.org |
| PostgreSQL | 14+ | `psql --version` | postgresql.org |
| Git | Any | `git --version` | git-scm.com |
| Docker | 20+ | `docker --version` | docker.com |
| Docker Compose | 2+ | `docker compose version` | bundled with Docker |
If you have Docker installed, you can skip the manual installation of Node.js, Python, and PostgreSQL entirely and start the application with a single command, run from the project root:

docker compose up

This command will:
- Spin up a PostgreSQL 16 database container with persistent storage.
- Build the app container including Node.js 20 and a Python 3 virtual environment with all scikit-learn/pandas dependencies.
- Wait for the database to be healthy, then run migrations (`npm run db:push`).
- Automatically seed the database with sample clinical assessments (in development mode).
- Launch the full-stack server with live-reloading (HMR) enabled.

Once started, open your browser and navigate to:
- Web App & REST API: http://localhost:3000

To stop the services while preserving your data:

docker compose down

To stop the services and completely reset the database (deleting persistent volumes):

docker compose down -v

If you update package.json or requirements.txt dependencies, trigger a clean rebuild:
docker compose up --build

git clone https://github.com/gopaljilab/Clinical-Insight-Engine.git
cd Clinical-Insight-Engine
npm install

Linux / macOS

cp .env.example .env

Windows (PowerShell)

Copy-Item .env.example .env

Windows (Command Prompt)

copy .env.example .env

If .env.example doesn't exist, create .env manually and add:

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

Developer Authentication Setup (optional)
For local frontend authentication testing, create a .env.local file (git-ignored):
NODE_ENV=development
NEXT_PUBLIC_APP_URL=http://localhost:3000
DEV_CLINICIAN_EMAIL=developer@cardioguard.local
DEV_CLINICIAN_PASSWORD=DevSecurePassword123!
NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY=your_local_32_character_secret_key_here

Rules of thumb:
- `.env`: database & server secrets only
- `.env.local`: local seeded credentials only (never commit)
- Restart the dev server after editing `.env.local` so Vite reloads variables
- Never paste demo credentials into UI, docs, screenshots, or PRs

- Start the app with `npm run dev`
- Open `http://localhost:5173`
- Click Login or Go to App
- Enter your `.env.local` seeded credentials
- Complete the simulated OTP step
- You'll be redirected to `/dashboard`

In development mode, the login form shows a small amber notice reminding you to use local seeded credentials. This banner and the `DEV_*` variables are never exposed in production builds.
Linux (Ubuntu / Debian)
# Install PostgreSQL
sudo apt update && sudo apt install postgresql postgresql-contrib
# Start & enable the service
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Create database & set password
sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
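sudo -u postgres psql -c "CREATE DATABASE clinical_insight_engine;"

macOS (Homebrew)
# Install PostgreSQL
brew install postgresql
# Start the service
brew services start postgresql
# Create database & set password
# Note: Homebrew installs usually create a superuser named after your macOS account
# rather than "postgres". If ALTER USER reports that the role does not exist,
# run `createuser -s postgres` first.
psql postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';"
psql postgres -c "CREATE DATABASE clinical_insight_engine;"

Windows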
- Download and install PostgreSQL from postgresql.org/download/windows
- During installation, use:
  - Username: `postgres`
  - Password: `postgres`
  - Port: `5432`
- Create a database named `clinical_insight_engine` using pgAdmin or the PostgreSQL CLI.
- Update your `.env` file:

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

Push the database schema:

npm run db:push

The server runs a PostgreSQL preflight check on startup. If you see `Database startup check failed`, verify that:
- PostgreSQL service is running
- `DATABASE_URL` in `.env` is correct
- The migration above has been run
- Port `5432` is not blocked
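
The checklist above can be partly automated. The sketch below is not the server's own preflight check; it only parses a `DATABASE_URL` value and tests whether the host and port accept TCP connections. The helper names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a PostgreSQL connection string into its main parts."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to the PostgreSQL default port
        "database": parts.path.lstrip("/"),
    }


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


info = parse_database_url(
    "postgresql://postgres:postgres@localhost:5432/clinical_insight_engine"
)
print(info)
print("reachable:", port_is_open(info["host"], info["port"]))
```

A reachable port only shows that a service is listening; wrong credentials or a missing database still need to be checked with `psql`.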
Linux / macOS
# Create virtual environment
python3 -m venv .venv
# Activate
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt

Windows (PowerShell)
# Create virtual environment
py -m venv .venv
# Activate
.\.venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt

If the dataset already exists in the project:
# Linux / macOS
cp attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv
# Windows (PowerShell)
Copy-Item attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv

If the dataset is missing, generate synthetic data:
# Linux / macOS
python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"
# Windows
py -c "from analyze import create_synthetic_data; create_synthetic_data()"

# Start the full-stack dev server
npm run dev

| Service | URL |
|---|---|
| Frontend | http://localhost:5173 |
| Backend API | http://localhost:3000 |
Stop the dev server:
Ctrl + C
Deactivate the Python environment:
deactivate

Clinical-Insight-Engine/
│
├── client/                          # React frontend
│   └── src/
│       ├── components/              # Reusable UI components
│       ├── pages/                   # Route-level page components
│       ├── hooks/                   # Custom React hooks
│       │   ├── use-assessments.ts   # TanStack Query hooks for API calls
│       │   └── use-toast.ts         # Toast notification state
│       ├── lib/                     # Utilities & API client
│       │   ├── queryClient.ts       # Global fetch config + React Query setup
│       │   └── utils.ts             # cn() Tailwind class merge utility
│       └── utils/
│           ├── search_filters.ts    # Patient search & filter logic
│           └── date_fix.ts          # Safe date parser helper
│
├── server/                          # Express.js backend
│   ├── index.ts                     # Server entry point & startup
│   ├── routes.ts                    # API route definitions
│   ├── storage.ts                   # Data access layer (DB queries)
│   ├── db.ts                        # Drizzle ORM + PostgreSQL pool
│   ├── static.ts                    # Serves built React frontend
│   ├── vite.ts                      # Vite dev server integration (HMR)
│   └── db_fix.ts                    # Clean process exit on DB errors
│
├── shared/                          # Shared between client & server
│   ├── schema.ts                    # Drizzle DB schema + Zod types
│   └── routes.ts                    # Shared API request/response schemas
│
├── script/
│   └── build.ts                     # esbuild + Vite production build script
│
├── attached_assets/                 # Static assets (dataset, images)
│   └── diabetes_dataset.csv
│
├── analyze.py                       # ML pipeline: training & inference
├── main.py                          # Python entry point
├── diabetes_dataset.csv             # Training dataset (root copy)
├── correlation_heatmap.png          # Diabetes feature correlation heatmap
├── patient.json                     # Sample patient input for CLI prediction
│
├── drizzle.config.ts                # Drizzle ORM configuration
├── vite.config.ts                   # Vite bundler configuration
├── tailwind.config.ts               # Tailwind CSS configuration
├── tsconfig.json                    # TypeScript configuration
├── postcss.config.js                # PostCSS configuration
├── components.json                  # shadcn/ui component registry
├── pyproject.toml                   # Python project metadata
├── requirements.txt                 # Python dependencies
├── package.json                     # Node.js dependencies & scripts
├── package-lock.json                # Locked dependency versions
├── uv.lock                          # uv Python lock file
│
├── README.md                        # Project documentation
├── ANALYSIS_README.md               # ML analysis documentation
├── CONTRIBUTING.md                  # Contribution guidelines
└── CODE_OF_CONDUCT.md               # Community code of conduct
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/health` | Application health check endpoint for monitoring |
| `POST` | `/api/assessments` | Submit a new risk assessment |
| `GET` | `/api/assessments` | Retrieve assessment history |
| `GET` | `/api/assessments/:id` | Get a specific assessment by ID |
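
For scripted use, the `POST /api/assessments` endpoint can also be called from Python's standard library. This is a minimal sketch: the helper names are hypothetical, the base URL assumes the local dev backend on port 3000, and the shape of the response is not documented here, so it is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumes the local backend address


def build_assessment_request(patient: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for /api/assessments without sending it."""
    body = json.dumps(patient).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/assessments",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_assessment(patient: dict):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_assessment_request(patient), timeout=10) as resp:
        return json.load(resp)


patient = {
    "gender": "Female",
    "age": 52,
    "hypertension": True,
    "heartDisease": False,
    "smokingHistory": "former",
    "bmi": 30.1,
    "hba1cLevel": 6.4,
    "bloodGlucoseLevel": 148,
}
request = build_assessment_request(patient)
print(request.get_method(), request.full_url)
# submit_assessment(patient)  # uncomment once the dev server is running
```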
# Health Check
curl -X GET http://localhost:3000/health
# Submit Assessment
curl -X POST http://localhost:3000/api/assessments \
-H "Content-Type: application/json" \
-d '{
"gender": "Female",
"age": 52,
"hypertension": true,
"heartDisease": false,
"smokingHistory": "former",
"bmi": 30.1,
"hba1cLevel": 6.4,
"bloodGlucoseLevel": 148
}'

The machine learning pipeline (analyze.py) implements an interpretable risk assessment model:
graph LR
A["Raw Data"] --> B["Cleaning & Validation"]
B --> C["Feature Engineering"]
C --> D["StandardScaler"]
D --> E["Logistic Regression"]
E --> F["Risk Score 0-100%"]
E --> G["Feature Importance"]
F --> H["Cached Model"]
G --> H
| Step | Details |
|---|---|
| Data Cleaning | Flags unrealistic values (BMI < 10, glucose < 50, HbA1c < 3) and replaces them with medians |
| Encoding | Gender → binary; smoking history → one-hot encoding |
| Scaling | StandardScaler on age, BMI, HbA1c, blood glucose |
| Model | LogisticRegression with balanced class weights |
| Caching | Trained model + scaler serialized via pickle for fast inference |
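
The steps in the table can be sketched in plain Python to show what each one does to the data. This is an illustration rather than the code in analyze.py: the smoking categories are hypothetical, the median is assumed to be computed over the valid values only, and the class-weight formula is the one scikit-learn documents for `class_weight="balanced"`.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Lower bounds from the Data Cleaning row; readings below them count as unrealistic.
MIN_VALID = {"bmi": 10, "blood_glucose": 50, "hba1c": 3}
SMOKING_CATEGORIES = ["never", "former", "current"]  # hypothetical category list


def replace_with_median(values, minimum):
    """Swap readings below the threshold for the median of the valid readings."""
    fill = median(v for v in values if v >= minimum)
    return [v if v >= minimum else fill for v in values]


def standardize(values):
    """Zero mean, unit variance (population standard deviation, as StandardScaler uses)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(value, categories):
    """Encode a category as a 0/1 vector with one position per category."""
    return [1 if value == c else 0 for c in categories]


def balanced_class_weights(labels):
    """n_samples / (n_classes * class_count), so rarer classes get larger weights."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * n) for c, n in counts.items()}


bmi = [22.0, 3.0, 30.0, 26.0]  # 3.0 is below the BMI threshold
cleaned = replace_with_median(bmi, MIN_VALID["bmi"])
scaled = standardize(cleaned)
encoded = one_hot("former", SMOKING_CATEGORIES)
weights = balanced_class_weights([0, 0, 0, 1])

print(cleaned)
print([round(z, 3) for z in scaled])
print(encoded)
print(weights)
```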
# Linux/macOS
python3 analyze.py
# Windows
py analyze.py

Create a patient JSON file:
{
"gender": "Female",
"age": 52,
"hypertension": true,
"heartDisease": false,
"smokingHistory": "former",
"bmi": 30.1,
"hba1cLevel": 6.4,
"bloodGlucoseLevel": 148
}

Run prediction:
# Linux/macOS
python3 analyze.py predict_file patient.json
# Windows
py analyze.py predict_file patient.json

| Variable | File | Description |
|---|---|---|
| `DATABASE_URL` | `.env` | PostgreSQL connection string |
| `NODE_ENV` | `.env.local` | Set to `development` for local dev features |
| `SESSION_SECRET` | `.env` | Required in production for signed Express sessions |
| `DEV_CLINICIAN_EMAIL` | `.env.local` | Seeded clinician email (dev only) |
| `DEV_CLINICIAN_PASSWORD` | `.env.local` | Seeded clinician password (dev only) |
| `NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY` | `.env.local` | Local encryption key (dev only) |
Security: `.env.local` is git-ignored and should never be committed. Production builds do not expose dev credentials.

Request limits: JSON and URL-encoded API payloads are limited to `256kb` by default. Add route-specific upload handling before increasing this global limit.

Production sessions: When the app runs behind a TLS-terminating reverse proxy or load balancer, Express trusts one proxy hop in production so that secure session cookies are issued for requests carrying `X-Forwarded-Proto: https`.
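
A client can check a payload against the request limit before sending it. The sketch below assumes that `256kb` means 256 × 1024 bytes; the helper names are hypothetical.

```python
import json

MAX_BODY_BYTES = 256 * 1024  # assumption: "256kb" is interpreted as 256 KiB


def payload_size(payload: dict) -> int:
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_body_limit(payload: dict, limit: int = MAX_BODY_BYTES) -> bool:
    """Return True if the serialized payload stays within the limit."""
    return payload_size(payload) <= limit


small = {"age": 52, "bmi": 30.1}
large = {"notes": "x" * 300_000}  # deliberately oversized example
print(payload_size(small), fits_body_limit(small))
print(payload_size(large), fits_body_limit(large))
```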
"PostgreSQL is unreachable"
- Verify PostgreSQL is running: `sudo systemctl status postgresql` (Linux) or `brew services list` (macOS)
- Confirm `DATABASE_URL` in `.env` matches your local credentials
- Ensure port `5432` is not blocked by another process
- Check that the `clinical_insight_engine` database exists

"Database startup check failed"
- Run `npm run db:push` to create/update the required tables
- Verify your `.env` file is in the project root (not inside `server/` or `client/`)

Python model errors
- Ensure the virtual environment is activated: `source .venv/bin/activate`
- Verify dependencies: `pip install -r requirements.txt`
- If `diabetes_dataset.csv` is missing, copy it: `cp attached_assets/diabetes_dataset.csv ./`
- Or generate synthetic data: `python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"`

Port conflicts
- The dev server defaults to port 5173 (Vite)
- If occupied, Vite will automatically pick the next available port
- Check for processes: `lsof -i :5173` (Linux/macOS) or `netstat -ano | findstr :5173` (Windows)
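
As a cross-platform alternative to `lsof` and `netstat`, the sketch below checks whether a local port is already taken and scans upward for a free one, loosely mirroring the fallback behaviour described above. It only probes IPv4 loopback, which is an assumption; a dev server bound to another interface would not be detected.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def next_free_port(start: int, host: str = "127.0.0.1", attempts: int = 20) -> int:
    """Scan upward from `start` and return the first port nothing is listening on."""
    for port in range(start, start + attempts):
        if not port_in_use(port, host):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")


print("5173 in use:", port_in_use(5173))
print("first free port from 5173:", next_free_port(5173))
```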
- Longitudinal patient risk tracking across visits
- Counterfactual reasoning: "What single change reduces risk most?"
- Cohort discovery and population-level insights
- Integration with Electronic Health Records (EHR)
- Advanced bias detection and ML fairness metrics
- Cloud deployment (Vercel / Render)
We love contributions! Whether it's a bug fix, a new feature, or improved docs, every PR makes a difference.
- Fork the repository
- Create your feature branch (`git checkout -b feat/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feat/amazing-feature`)
- Open a Pull Request
Please read our Contributing Guide and Code of Conduct before submitting.
Gopal Gupta
Computer Science Engineer · Full-Stack Developer · Data Science & ML Enthusiast

Built with ❤️ for better preventive healthcare

⭐ Star this repo if you find it useful; it helps others discover the project!
- All schema changes must go through drizzle-kit generate.
