Credit Risk Analysis

Overview

This project predicts the probability of loan approval from applicant and loan attributes, returns per-feature explanation values (SHAP contributions), and can recommend a revised interest rate to improve approval probability where applicable. It is intended as a demonstration and research codebase for interpretable ML and simple deployment patterns.

Core components:

- Notebook pipeline to reproduce dataset preparation, feature engineering, model training, and explainability.
- Serialized model and explainer artifacts.
- Inference service: FastAPI (app/api.py).
- Web UI: Streamlit (app/main.py).

Live Demo

https://credit-risk-analysis-ui.onrender.com/

Quick Start

Tested environment: Python 3.12.12 (use a venv or equivalent).

1. Create & activate a virtual environment

Unix / macOS:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

Windows (PowerShell):

python -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip

2. Install dependencies

For development (notebooks, training, explainers, app):

pip install -r requirements-dev.txt

For runtime (only serving the app):

pip install -r requirements.txt

3. Run everything (single command)

bash run.sh

Once both services are up, they are available at:

- FastAPI: http://localhost:8000
- FastAPI Swagger UI: http://localhost:8000/docs
- Streamlit UI: http://localhost:8501

4. Run services manually (optional)

Backend

uvicorn app.api:app --host 0.0.0.0 --port 8000 --reload

Frontend (API_URL points the UI at the backend; the export syntax below is for Unix shells)

export API_URL="http://localhost:8000/"
streamlit run app/main.py --server.port 8501 --server.address 0.0.0.0

API Reference

Endpoint: POST /

Accepts: application/json containing a single applicant and loan record.

Returns: model prediction probability, per-feature explanation contributions, and optional recommendation fields.

Example request

{
  "person_age": 35,
  "person_education": "Bachelor",
  "person_income": 50000.0,
  "person_home_ownership": "Mortgage",
  "loan_amount": 10000.0,
  "loan_intent": "Personal",
  "loan_interest_rate": 12.0,
  "credit_score": 650,
  "previous_loan_defaults": "No"
}

Example response

{
  "person_age": 1.93,
  "person_education": 5.88,
  "person_income": 6.97,
  "person_home_ownership": -0.44,
  "loan_amount": 1.94,
  "loan_intent": 12.61,
  "loan_interest_rate": 0.39,
  "credit_score": 6.05,
  "previous_loan_defaults": 22.83,
  "current_pred": 0.5817,
  "new_pred": 0.8374,
  "rec_rate": 14.02
}

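Note

- The feature-named fields in the response are explanation values (SHAP-style contributions), not raw feature values.
- current_pred is the model's predicted probability for the submitted record.
- new_pred and rec_rate are the recommendation fields and may be null when a recommendation is not applicable.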
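
As a rough sketch of how a client might call the endpoint and separate the fields described above, the following uses only the Python standard library. The base URL assumes the local backend from Quick Start, and the helper names (score_applicant, split_response, top_contributions) are hypothetical; they are not part of this repository.

```python
import json
import urllib.request

# Assumption: the backend from Quick Start is running locally on port 8000.
API_URL = "http://localhost:8000/"

# Response keys that are not per-feature contributions (taken from the example response).
NON_FEATURE_KEYS = ("current_pred", "new_pred", "rec_rate")


def score_applicant(record, api_url=API_URL, timeout=10.0):
    """POST one applicant + loan record as JSON and return the decoded response."""
    request = urllib.request.Request(
        api_url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def split_response(response):
    """Separate feature contributions from prediction and recommendation fields."""
    contributions = {k: v for k, v in response.items() if k not in NON_FEATURE_KEYS}
    # .get() yields None for any missing field, matching the "may be null" note.
    summary = {k: response.get(k) for k in NON_FEATURE_KEYS}
    return contributions, summary


def top_contributions(contributions, n=3):
    """Return the n features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Same record as the example request above.
    record = {
        "person_age": 35,
        "person_education": "Bachelor",
        "person_income": 50000.0,
        "person_home_ownership": "Mortgage",
        "loan_amount": 10000.0,
        "loan_intent": "Personal",
        "loan_interest_rate": 12.0,
        "credit_score": 650,
        "previous_loan_defaults": "No",
    }
    contributions, summary = split_response(score_applicant(record))
    print(summary)
    print(top_contributions(contributions))
```

Checking new_pred and rec_rate for None before displaying them keeps a client robust to records for which no recommendation is returned.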

Repository Structure

Credit-Risk-Analysis/
├── app/
│   ├── api.py                     # FastAPI backend (prediction + explanations)
│   └── main.py                    # Streamlit frontend
├── datasets/
│   ├── raw_data.csv
│   ├── cleaned_data.csv
│   ├── train_data.csv
│   ├── validation_set.csv
│   ├── test_set.csv
│   └── sampled_train_data.csv     # SMOTE-balanced training set
├── models/
│   ├── best_model.pkl             # Final LightGBM model
│   ├── power_transformer.pkl
│   ├── shap_explainer.pkl
│   └── lime_config.pkl
├── notebooks/
│   ├── 01_dataset_selection.ipynb
│   ├── 02_cleaning_and_eda.ipynb
│   ├── 03_feature_engineering.ipynb
│   ├── 04_model_training.ipynb
│   ├── 05_model_selection.ipynb
│   └── 06_explainability.ipynb
├── utils/helpers.py
├── requirements.txt
├── requirements-dev.txt
├── run.sh
└── LICENSE

Model Performance Snapshot

From notebooks/05_model_selection.ipynb:

Validation (tuned models):
- LightGBM: weighted F1 0.92, accuracy 0.92
- XGBoost: weighted F1 0.92, accuracy 0.92

LightGBM was selected for its efficiency and deployment fit, since both tuned models score the same on validation.

Held-out test set (n=4391):
- Accuracy: 0.93
- Weighted F1: 0.93
- Class 1 (approved) F1: 0.84
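
For readers unfamiliar with the metric, weighted F1 averages the per-class F1 scores, weighting each class by its number of true instances (its support). The following is a minimal pure-Python sketch on made-up labels, not the project's data; the notebooks may well compute it differently, for example with scikit-learn's f1_score(average="weighted"). The function name weighted_f1 is illustrative.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n_true / total  # weight each class by its share of the data
    return score


if __name__ == "__main__":
    # Toy labels: three rejected (0), one approved (1).
    print(weighted_f1([0, 0, 0, 1], [0, 0, 1, 1]))
```

Because each class is weighted by its support, a weighted F1 of 0.93 alongside a class 1 F1 of 0.84 indicates that the other class scores higher and carries more of the weight.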

Reproducibility

- Data source: Loan Approval Classification Dataset (Kaggle)
- Training seed: 100 (configurable via the RANDOM_STATE environment variable)
- Training pipeline: notebooks 01 → 06
- Artifacts: models/ directory
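
One way the seed override described above could be read is sketched below. The helper name get_random_state and the handling of empty values are assumptions for illustration; the notebooks may read the variable differently.

```python
import os
import random


def get_random_state(env=None, default=100):
    """Read the seed from RANDOM_STATE, falling back to the documented default of 100."""
    env = os.environ if env is None else env
    raw = env.get("RANDOM_STATE")
    if raw is None or raw.strip() == "":
        return default
    return int(raw)  # raises ValueError if the value is not an integer


if __name__ == "__main__":
    seed = get_random_state()
    random.seed(seed)  # seeds Python's built-in RNG only
    print(f"Using seed {seed}")
```

The same integer would then be passed to whatever library-level seed arguments the training code uses, so that a run with RANDOM_STATE unset uses the documented seed of 100.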

To retrain:

1. Run the notebooks in order (01 → 06).
2. Export the final model and transformers.
3. Replace the artifacts in models/.

Security & Privacy

- This project uses a public dataset for demonstration.
- Do not use it for real lending decisions without a compliance and fairness review.
- Remove PII before adapting the project to real data.

Limitations

- This is an educational/demo project.
- The dataset may carry bias, which the model can inherit.
- The recommendation logic is heuristic.

License

Distributed under the terms of the LICENSE file in this repository.
