This project develops a modular machine learning framework for predicting bio-oil yield from Hydrothermal Liquefaction (HTL) experiments.
Unlike notebook-based implementations, the project provides a reusable software architecture for preprocessing, training, hyperparameter optimization, explainability, and visualization.
- Modular ML framework
- Six regression algorithms
- Hyperparameter optimization
- SHAP explainability
- Partial Dependence Plots
- Permutation Importance
- Model comparison dashboard
- Automated experiment logging
- Publication-quality visualizations
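As a rough illustration of the hyperparameter optimization step listed above, the sketch below runs an exhaustive grid search over a small search space. The README does not say which search strategy the framework uses, so grid search is an assumption here. `grid_search`, `toy_validation_error`, and the parameter grid are hypothetical names and values, not part of the project code.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def grid_search(
    objective: Callable[[Params], float],
    space: Dict[str, List[float]],
) -> Tuple[Params, float]:
    """Evaluate every combination in `space` and keep the lowest score."""
    names = list(space)
    best_params: Params = {}
    best_score = float("inf")
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params: Params) -> float:
    # Hypothetical stand-in for "train a model and return its validation MAE".
    return (params["max_depth"] - 6) ** 2 + 100 * (params["learning_rate"] - 0.1) ** 2


# Illustrative grid only; the real search space is not given in this README.
space = {"max_depth": [3, 6, 9], "learning_rate": [0.01, 0.1, 0.3]}
best_params, best_score = grid_search(toy_validation_error, space)
```

In a real run the objective would fit a regressor on the training split and score it on held-out data. The loop structure stays the same.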
| Property | Value |
|---|---|
| Samples | 2284 |
| Features | 29 |
| Task | Regression |
| Target | Bio-Oil Yield (%) |
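The results are reported on a test set, so the 2284 samples have to be split somewhere in the workflow. A minimal sketch of a reproducible shuffle split follows. The 80/20 ratio, the seed, and the helper name `train_test_split_indices` are assumptions for illustration; the README does not state the actual split.

```python
import random
from typing import List, Tuple


def train_test_split_indices(
    n_samples: int, test_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Shuffle row indices with a fixed seed and split them into train and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 2284 is the sample count from the table above; the 20% test share is assumed.
train_idx, test_idx = train_test_split_indices(2284)
```

Fixing the seed keeps the split identical across runs, which makes model comparisons on the same test rows meaningful.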
Dataset
│
▼
Preprocessing
│
▼
Model Training
│
▼
Hyperparameter Optimization
│
▼
Explainability
│
▼
Dashboard
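One way to picture the workflow above is as an ordered list of stage functions that pass a shared context along. The sketch below is only that picture, not the project's implementation: `run_pipeline`, the two placeholder stages, and the toy rows are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Tuple[str, Callable[[Context], Context]]


def run_pipeline(stages: List[Stage], context: Context) -> Context:
    """Run each stage in order and record its name in a simple log."""
    for name, stage in stages:
        context = stage(context)
        context.setdefault("log", []).append(name)
    return context


def preprocess(ctx: Context) -> Context:
    # Placeholder preprocessing: drop rows that contain a missing value.
    ctx["rows"] = [row for row in ctx["rows"] if None not in row]
    return ctx


def train(ctx: Context) -> Context:
    # Placeholder "model": the mean of the target (last column of each row).
    targets = [row[-1] for row in ctx["rows"]]
    ctx["model"] = sum(targets) / len(targets)
    return ctx


# Toy rows of (temperature, yield); made-up values, not from the dataset.
stages: List[Stage] = [("preprocessing", preprocess), ("model_training", train)]
result = run_pipeline(stages, {"rows": [[300.0, 35.0], [None, 20.0], [350.0, 45.0]]})
```

Further stages such as hyperparameter optimization or explainability would be appended to `stages` in the same way.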
| Model | Test R² | Test MAE (% yield) |
|---|---|---|
| 🥇 Tuned XGBoost | 0.8689 | 4.2589 |
| Tuned Random Forest | 0.8659 | 4.2796 |
| XGBoost | 0.8634 | 4.4094 |
| Random Forest | 0.8624 | 4.4111 |
| Extra Trees | 0.8619 | 4.1621 |
| CatBoost | 0.8370 | 5.1441 |
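For readers unfamiliar with the two metrics in the table, the sketch below computes them from their standard definitions: MAE is the mean absolute difference between observed and predicted values, and R² is one minus the ratio of residual to total sum of squares. The yield values are made up for illustration and are not taken from the project dataset.

```python
from statistics import mean
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up bio-oil yields (%), for illustration only.
observed = [30.0, 42.0, 25.0, 51.0]
predicted = [33.0, 40.0, 27.0, 48.0]

mae = mean_absolute_error(observed, predicted)  # 2.5
r2 = r2_score(observed, predicted)              # 1 - 26/414
```

Because the target is a yield in percent, MAE reads directly as an average error in percentage points of yield.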
Top predictive features:
- Lipids
- Temperature
- Higher Heating Value
- Proteins
- Fatty Acids
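Permutation importance is one of the explainability methods the framework lists. The README does not say which method produced the ranking above. A from-scratch sketch of the idea follows: shuffle one feature column at a time and measure how much the error grows. The toy model, the data, and the function names are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List


def mae(y_true: List[float], y_pred: List[float]) -> float:
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def permutation_importance(
    predict: Callable[[List[float]], float],
    X: List[List[float]],
    y: List[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mae(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mae(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(mean(increases))
    return importances


def toy_model(row: List[float]) -> float:
    # Toy model that uses feature 0 and ignores feature 1.
    return 2.0 * row[0]


X = [[float(i), float(i % 3)] for i in range(20)]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

The ignored feature gets an importance of zero because shuffling it never changes a prediction, while the used feature gets a positive score.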
src/
├── core/
├── models/
├── experiments/
├── visualization/
└── legacy/
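Install the dependencies:

    pip install -r requirements.txt

Train a model (XGBoost shown here):

    python -m src.models.xgboost

Build the model comparison dashboard:

    python -m src.visualization.model_dashboard

Planned future work:

- Bayesian Optimization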
- LightGBM
- Deep Learning
- Streamlit Deployment
- Multi-output HTL prediction
Rohan
Indian Institute of Technology Kharagpur
