- Overview
- Live Demo
- Quick Start
- API Reference
- Repository Structure
- Model Performance Snapshot
- Reproducibility
- Security & Privacy
- Limitations
- License
This project predicts the probability of loan approval from applicant and loan attributes, returns per-feature explanation values (SHAP contributions), and can recommend a revised interest rate to improve approval probability where applicable. It is intended as a demonstration and research codebase for interpretable ML and simple deployment patterns.
Core components:
- Notebook pipeline to reproduce dataset preparation, feature engineering, model training and explainability.
- Serialized model + explainer artifacts.
- Inference service: FastAPI (app/api.py).
- Web UI: Streamlit (app/main.py).
Live demo: https://credit-risk-analysis-ui.onrender.com/
Tested environment: Python 3.12.12 (use a venv or equivalent).
- Unix / macOS:
  python -m venv .venv
  source .venv/bin/activate
  python -m pip install --upgrade pip
- Windows (PowerShell):
  python -m venv .venv
  .venv\Scripts\Activate.ps1
  python -m pip install --upgrade pip
- For development (notebooks, training, explainers, app):
  pip install -r requirements-dev.txt
- For runtime (only serving the app):
  pip install -r requirements.txt

Run the app:
  bash run.sh
- FastAPI: http://localhost:8000
- FastAPI Swagger UI: http://localhost:8000/docs
- Streamlit UI: http://localhost:8501
- Backend
  uvicorn app.api:app --host 0.0.0.0 --port 8000 --reload
- Frontend
  export API_URL="http://localhost:8000/"
  streamlit run app/main.py --server.port 8501 --server.address 0.0.0.0

Endpoint: POST /
Accepts: application/json, a single applicant + loan record.
Returns: model prediction probability, per-feature explanation contributions, and optional recommendation fields.

Example request:
{
"person_age": 35,
"person_education": "Bachelor",
"person_income": 50000.0,
"person_home_ownership": "Mortgage",
"loan_amount": 10000.0,
"loan_intent": "Personal",
"loan_interest_rate": 12.0,
"credit_score": 650,
"previous_loan_defaults": "No"
}

Example response:
{
"person_age":1.93,
"person_education":5.88,
"person_income":6.97,
"person_home_ownership":-0.44,
"loan_amount":1.94,
"loan_intent":12.61,
"loan_interest_rate":0.39,
"credit_score":6.05,
"previous_loan_defaults":22.83,
"current_pred":0.5817,
"new_pred":0.8374,
"rec_rate":14.02
}

Note:
- Feature contributions are explanation values (SHAP-style), not raw feature values.
- new_pred and rec_rate may be null when no recommendation applies.
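A minimal client sketch for the endpoint above, using only the Python standard library. It assumes the service is reachable at http://localhost:8000/ and that the response has the flat shape of the example response; the helper names `split_response`, `top_contributions`, and `predict` are illustrative and not part of the project.

```python
import json
import urllib.request

# Keys that carry prediction/recommendation values rather than
# per-feature contributions (taken from the example response).
PREDICTION_KEYS = ("current_pred", "new_pred", "rec_rate")

# The example response from this README, used to exercise the helpers offline.
EXAMPLE_RESPONSE = {
    "person_age": 1.93, "person_education": 5.88, "person_income": 6.97,
    "person_home_ownership": -0.44, "loan_amount": 1.94, "loan_intent": 12.61,
    "loan_interest_rate": 0.39, "credit_score": 6.05,
    "previous_loan_defaults": 22.83,
    "current_pred": 0.5817, "new_pred": 0.8374, "rec_rate": 14.02,
}


def split_response(body: dict) -> tuple[dict, dict]:
    """Separate per-feature contributions from prediction fields."""
    contributions = {k: v for k, v in body.items() if k not in PREDICTION_KEYS}
    # Missing or null recommendation fields come back as None.
    prediction = {k: body.get(k) for k in PREDICTION_KEYS}
    return contributions, prediction


def top_contributions(contributions: dict, n: int = 3) -> list[tuple[str, float]]:
    """Rank features by absolute contribution, largest first."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


def predict(record: dict, api_url: str = "http://localhost:8000/") -> dict:
    """POST one applicant + loan record as JSON and return the parsed response."""
    request = urllib.request.Request(
        api_url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


contributions, prediction = split_response(EXAMPLE_RESPONSE)
top_three = top_contributions(contributions)
```

With the backend running, `split_response(predict(record))` gives the same two dictionaries for a live request; check `prediction["rec_rate"] is None` before displaying a recommendation.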
Credit-Risk-Analysis/
├── app/
│ ├── api.py # FastAPI backend (prediction + explanations)
│ └── main.py # Streamlit frontend
├── datasets/
│ ├── raw_data.csv
│ ├── cleaned_data.csv
│ ├── train_data.csv
│ ├── validation_set.csv
│ ├── test_set.csv
│ └── sampled_train_data.csv # SMOTE-balanced training set
├── models/
│ ├── best_model.pkl # Final LightGBM model
│ ├── power_transformer.pkl
│ ├── shap_explainer.pkl
│ └── lime_config.pkl
├── notebooks/
│ ├── 01_dataset_selection.ipynb
│ ├── 02_cleaning_and_eda.ipynb
│ ├── 03_feature_engineering.ipynb
│ ├── 04_model_training.ipynb
│ ├── 05_model_selection.ipynb
│ └── 06_explainability.ipynb
├── utils/helpers.py
├── requirements.txt
├── requirements-dev.txt
├── run.sh
└── LICENSE
From notebooks/05_model_selection.ipynb:
- Validation (tuned models):
  - LightGBM: weighted F1 0.92, accuracy 0.92
  - XGBoost: weighted F1 0.92, accuracy 0.92
  - LightGBM selected for efficiency/deployment fit
- Held-out test set (n=4391):
  - Accuracy: 0.93
  - Weighted F1: 0.93
  - Class 1 (approved) F1: 0.84
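For readers unfamiliar with the metric, the sketch below shows how a weighted F1 score is usually defined: the per-class F1 scores averaged with each class weighted by its share of the true labels. It is a plain-Python illustration on made-up labels, not the code used in the notebooks, and the function name `weighted_f1` is hypothetical.

```python
from collections import Counter


def weighted_f1(y_true: list[int], y_pred: list[int]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Toy labels (1 = approved), invented purely for illustration.
toy_true = [0, 0, 0, 1]
toy_pred = [0, 0, 1, 1]
toy_score = weighted_f1(toy_true, toy_pred)
```

Because the weights follow class frequency, a high weighted F1 can coexist with a lower F1 on the minority class, which is why the class 1 score is reported separately.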
- Data source: Loan Approval Classification Dataset (Kaggle)
- Training seed: 100 (configurable via environment variable RANDOM_STATE)
- Training pipeline: notebooks 01 → 06
- Artifacts: models/ directory
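One way the seed override could be read is sketched below. The variable name RANDOM_STATE and the default of 100 come from the list above; the helper `get_random_state` is an assumption for illustration, and the notebooks may read the variable differently.

```python
import os
import random


def get_random_state(default: int = 100) -> int:
    """Read the training seed from RANDOM_STATE, falling back to the default."""
    return int(os.environ.get("RANDOM_STATE", default))


seed = get_random_state()
# Seeds Python's built-in RNG only; libraries that take a seed argument
# need the same value passed to them explicitly.
random.seed(seed)
```

Setting `RANDOM_STATE=7` in the shell before launching the notebooks would then make `get_random_state()` return 7.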
To retrain:
- Run notebooks in order
- Export final model and transformers
- Replace artifacts in models/
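The export and replace steps might look like the sketch below. It assumes the artifacts are plain pickle files, as the .pkl names in the repository tree suggest; the notebooks may use a different serializer such as joblib. The helpers `export_artifacts` and `load_artifact` are hypothetical, and the demo writes a placeholder object to a temporary directory rather than touching models/.

```python
import pickle
import tempfile
from pathlib import Path


def export_artifacts(artifacts: dict, models_dir: str = "models") -> list[Path]:
    """Pickle each object to <models_dir>/<name>.pkl and return the written paths."""
    out_dir = Path(models_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, obj in artifacts.items():
        path = out_dir / f"{name}.pkl"
        with path.open("wb") as fh:
            pickle.dump(obj, fh)
        written.append(path)
    return written


def load_artifact(name: str, models_dir: str = "models"):
    """Load <models_dir>/<name>.pkl back into memory."""
    with (Path(models_dir) / f"{name}.pkl").open("rb") as fh:
        return pickle.load(fh)


# Round trip with a stand-in object; a real run would pass the trained
# model, power transformer, and SHAP explainer instead.
with tempfile.TemporaryDirectory() as tmp:
    paths = export_artifacts({"best_model": {"placeholder": True}}, tmp)
    restored = load_artifact("best_model", tmp)
```

In a real retraining run the dictionary keys would match the existing file names (best_model, power_transformer, shap_explainer) so the serving code picks up the new artifacts unchanged.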
This project uses public dataset(s) for demonstration.
Do not use for real lending decisions without compliance and fairness review.
Remove PII if adapting to real data.

- Educational/demo project
- Dataset bias possible
- Recommendation logic heuristic
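To make "heuristic" concrete, here is one shape such a recommendation step could take: try a handful of candidate interest rates, re-score the record, and keep the best one only if it beats the current prediction. This is a guess at the general pattern, not the project's actual logic, which this README does not describe. The function `recommend_rate`, the candidate grid, and `toy_model` are all hypothetical.

```python
from typing import Callable, Optional


def recommend_rate(
    record: dict,
    predict_proba: Callable[[dict], float],
    candidate_rates: list[float],
    min_gain: float = 0.0,
) -> tuple[float, Optional[float], Optional[float]]:
    """Return (current_pred, new_pred, rec_rate).

    new_pred and rec_rate are None when no candidate improves the
    approval probability by more than min_gain.
    """
    current_pred = predict_proba(record)
    best_pred, best_rate = current_pred, None
    for rate in candidate_rates:
        trial = {**record, "loan_interest_rate": rate}  # copy, do not mutate
        pred = predict_proba(trial)
        if pred > best_pred and pred - current_pred > min_gain:
            best_pred, best_rate = pred, rate
    if best_rate is None:
        return current_pred, None, None
    return current_pred, best_pred, best_rate


def toy_model(record: dict) -> float:
    """Stand-in scorer for the demo only; NOT the trained LightGBM model."""
    return min(1.0, record["loan_interest_rate"] / 20)


applicant = {"loan_interest_rate": 12.0}
result = recommend_rate(applicant, toy_model, [10.0, 14.0, 16.0])
no_result = recommend_rate(applicant, toy_model, [10.0])
```

Returning None for both recommendation fields when nothing helps mirrors the nullable new_pred and rec_rate fields in the API response.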
Distributed under the terms of the LICENSE file in this repository.