A multi-step ML web application that evaluates the green retrofit potential of existing buildings — from a facade photo and questionnaire to a scored PDF report in one click.
Most buildings were not designed to meet today's energy or carbon standards, but assessing their retrofit potential requires expensive expert audits. RetroGrade replaces that initial scoping stage with an automated pipeline: a building owner or architect uploads a facade photo and fills in a short questionnaire, and within seconds receives a 0–100 Feasibility Score across five categories (envelope, orientation, renewables, passive design, climate fit), plus a professionally formatted PDF report. The score directly indicates whether the building should be retrofitted, rebuilt, or left as-is — with supporting recommendations.
- Step 01: Image Segmentation (`Step-01-image-processing/integrated_analyzer.py`)
  - SegFormer (ADE20K) + CLIP classify every pixel of the facade photo
  - Outputs: Window-to-Wall Ratio (WWR), dominant facade material, vegetation coverage (%), shading coverage (%), and a colour-coded overlay image
  - Optional: SAM (Segment Anything Model) provides a precise building silhouette mask
- Step 01b: Geocoding (`backend/climate.py`)
  - Nominatim geocodes the user-supplied location string → latitude / longitude
- Step 01c: Climate Data (`backend/climate.py`)
  - Two parallel Open-Meteo API calls (hourly 2022 + daily 2020–2022 averages)
  - Outputs: Köppen climate class → mapped to `mediterranean`/`temperate`/`cold`, heating and cooling degree days (HDD/CDD), max solar irradiance, wind and daylight indices
- Step 02: Energy & Carbon Regression (`backend/main.py`)
  - A pre-trained `MultiOutputRegressor(RandomForestRegressor)` loaded from `Step-02-regression-model/models/step1_energy_carbon_regressor.joblib`
  - Inputs: 13 building features (type, age, orientation, glazing, insulation, HVAC, roof, solar panels, smart systems, rainwater harvesting, WWR, material, climate)
  - Outputs: `predicted_energy_kwh_m2` and `predicted_carbon_kgco2_m2`
- Step 03: Feasibility Score (`Step-03-feasibility-score-calculator/scorer/`)
  - Rule-based engine with compound scoring rules and interaction penalties
  - Returns a 0–100 score, a 5-category breakdown, a list of recommendations, and a detailed per-criterion breakdown used in the report
  - Users can override category weights via the frontend sliders
- Step 04: PDF Report (`Step-04-report-generation/retrograde_report_1.py`)
  - 8-section ReportLab PDF (cover → about → viability score → building profile → image processing → climate → energy & carbon → solutions → impact → score detail)
  - Saved to `reports/{session_id}.pdf` and served via `GET /api/report/{session_id}`
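
Step 01's Window-to-Wall Ratio can be illustrated with a small sketch. It assumes the segmentation step yields a 2D grid of per-pixel class labels and takes WWR as window pixels divided by window plus wall pixels. The label names and the `window_to_wall_ratio` helper are hypothetical; the actual analyser works on ADE20K classes and may compute the ratio differently (for example, restricted to the SAM building mask).

```python
from collections import Counter
from typing import Sequence

# Hypothetical label names for illustration only.
WINDOW, WALL = "window", "wall"


def window_to_wall_ratio(mask: Sequence[Sequence[str]]) -> float:
    """Return window pixels / (window + wall pixels) for a per-pixel label grid."""
    counts = Counter(label for row in mask for label in row)
    facade_pixels = counts[WINDOW] + counts[WALL]
    if facade_pixels == 0:
        return 0.0  # no facade pixels found; avoid dividing by zero
    return counts[WINDOW] / facade_pixels


# Tiny 3x4 "image": sky pixels are ignored, 2 window and 6 wall pixels remain.
mask = [
    ["sky", "sky", "sky", "sky"],
    ["wall", "window", "window", "wall"],
    ["wall", "wall", "wall", "wall"],
]
wwr = window_to_wall_ratio(mask)
print(f"WWR = {wwr:.2f}")
```

Counting labels rather than measuring physical area means the result depends on camera angle and cropping, which is one reason the score treats WWR as a scoping estimate rather than a survey value.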
All CSV column updates are batched and written once at the end to RetroGrade-Data.csv.
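
The batching described above might look like the following sketch: each step adds its outputs to a pending dictionary, and the submission's row is rewritten once at the end. The `session_id` key column, the other column names, the placeholder values, and the `write_batched_updates` helper are assumptions for illustration, not the backend's actual schema.

```python
import csv
import tempfile
from pathlib import Path


def write_batched_updates(csv_path: Path, session_id: str, updates: dict) -> None:
    """Merge `updates` into the row matching `session_id`, rewriting the file once."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Append any columns that do not exist yet.
    for column in updates:
        if column not in fieldnames:
            fieldnames.append(column)

    for row in rows:
        if row["session_id"] == session_id:
            row.update(updates)

    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)


# Demo against a temporary file so the real data store is never touched.
csv_path = Path(tempfile.mkdtemp()) / "RetroGrade-Data.csv"
csv_path.write_text("session_id,building_type\nabc123,office\n", encoding="utf-8")

pending = {}                                  # filled in as each step finishes
pending["wwr"] = 0.25                         # placeholder Step 01 output
pending["predicted_energy_kwh_m2"] = 142.0    # placeholder Step 02 output
pending["feasibility_score"] = 68             # placeholder Step 03 output

write_batched_updates(csv_path, "abc123", pending)
result = csv_path.read_text(encoding="utf-8")
print(result)
```

Writing once keeps a failed run from leaving a half-updated row, at the cost of holding all step outputs in memory until the end.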
Green_Design_ML/
├── backend/
│   ├── main.py                       # FastAPI app: full Steps 00–04 pipeline
│   ├── climate.py                    # Geocoding (Nominatim) + climate data (Open-Meteo)
│   └── requirements.txt              # Python dependencies for the backend
├── frontend/
│   ├── src/
│   │   ├── App.jsx                   # Route setup (Landing → Questionnaire → Success)
│   │   ├── main.jsx                  # React entry point
│   │   └── pages/
│   │       ├── Landing.jsx           # Hero page
│   │       ├── Questionnaire.jsx     # 4-step form: building info, systems, features, image
│   │       └── Success.jsx           # Score display, category bars, PDF download
│   ├── vite.config.js                # Vite config: proxies /api/* and /uploads/* to :8000
│   └── package.json
├── src/                              # ML pipeline modules (imported by backend/main.py)
│   ├── Step-01-image-processing/
│   │   └── integrated_analyzer.py    # SegFormer x2 + CLIP (+ optional SAM)
│   ├── Step-02-regression-model/
│   │   ├── src/train_regression.py   # Run once to train and save the model
│   │   ├── models/                   # git-ignored; regenerate with train_regression.py
│   │   └── data/                     # Training CSV (not used for live scoring)
│   ├── Step-03-feasibility-score-calculator/
│   │   └── scorer/
│   │       ├── engine.py             # calculate_viability_score() entry point
│   │       ├── criteria.py           # Per-criterion scoring functions + compound rules
│   │       ├── inputs.py             # BuildingInputs dataclass
│   │       ├── weights.py            # Scoring constants and lookup tables
│   │       └── confidence.py         # Data completeness + contradiction detection
│   └── Step-04-report-generation/
│       └── retrograde_report_1.py    # generate_report(data_dict, output_path)
├── data/
│   └── RetroGrade-Data.csv           # Shared data store, one row per submission
├── uploads/                          # Facade images + overlay images (git-ignored, runtime)
├── reports/                          # Generated PDFs (git-ignored, runtime)
├── .env.example                      # Environment variable template
└── .gitignore
- Python 3.10+
- Node.js 18+
- pip
# 1. Clone the repository
git clone <repo-url>
cd Green_Design_ML
# 2. Install backend dependencies
cd backend
pip install -r requirements.txt
# 3. Install heavy ML dependencies (install once — not in requirements.txt)
pip install torch transformers open_clip_torch opencv-python
# 4. Install frontend dependencies
cd ../frontend
npm install

The model file is git-ignored and must be generated before Step 02 will work:

# Run from the repository root
cd src/Step-02-regression-model
python src/train_regression.py
# Saves to: models/step1_energy_carbon_regressor.joblib

Start both servers simultaneously; Vite proxies API calls to the backend:
# Terminal 1 — Backend
cd backend
uvicorn main:app --reload --port 8000
# Terminal 2 — Frontend
cd frontend
npm run dev   # http://localhost:5173

On first startup the image models (~1.5 GB) load into memory, which takes ~60 seconds. If the ML dependencies are not installed, Step 01 is skipped gracefully and image-derived values (WWR, material, vegetation, shading) default to zero/None.
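
The graceful skip described above could be implemented along these lines. This is a sketch, not the backend's actual code: the `ml_dependencies_available` and `analyse_facade` helpers and the result keys are hypothetical, and the split between zero (numeric values) and None (material) is an assumption based on the "zero/None" wording.

```python
import importlib.util

# Import names for torch, transformers, open_clip_torch and opencv-python.
ML_MODULES = ("torch", "transformers", "open_clip", "cv2")

# Hypothetical fallback values used when Step 01 is skipped.
IMAGE_DEFAULTS = {
    "wwr": 0.0,
    "material": None,
    "vegetation_pct": 0.0,
    "shading_pct": 0.0,
}


def ml_dependencies_available(modules=ML_MODULES) -> bool:
    """Return True only if every module can be located (nothing is imported)."""
    return all(importlib.util.find_spec(name) is not None for name in modules)


def analyse_facade(image_path: str, modules=ML_MODULES) -> dict:
    """Run Step 01 if possible, otherwise return neutral defaults."""
    if not ml_dependencies_available(modules):
        return dict(IMAGE_DEFAULTS)  # copy, so callers cannot mutate the defaults
    # The real pipeline would call into integrated_analyzer.py here.
    raise NotImplementedError("plug in the Step 01 analyser")


# Simulate a machine without the ML stack by asking for a module that cannot exist.
result = analyse_facade("facade.jpg", modules=("retrograde_missing_module_xyz",))
print(result)
```

Checking with `find_spec` avoids paying the import cost of the heavy packages just to discover that one of them is missing.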
All external APIs used (Open-Meteo, Nominatim) are public and unauthenticated. No API keys
are required for the current version. Copy .env.example to .env if you need to override
defaults.
| Variable | Required | Description |
|---|---|---|
| `BACKEND_PORT` | No | Backend port (default: 8000) |
| `VITE_PORT` | No | Vite dev server port (default: 5173) |
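
Reading these variables with their defaults could look like the sketch below. The `read_port` helper is hypothetical, and the snippet assumes the values are already present in the process environment (for example, exported by the shell or loaded from `.env` by whatever starts the backend). Deriving a CORS origin from `VITE_PORT` is likewise an assumption about how the two settings might be combined.

```python
import os


def read_port(name: str, default: int) -> int:
    """Read a TCP port from the environment, falling back to `default`."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"{name}={port} is not a valid TCP port")
    return port


backend_port = read_port("BACKEND_PORT", 8000)
vite_port = read_port("VITE_PORT", 5173)

# Origin the backend would need to allow for the Vite dev server.
frontend_origin = f"http://localhost:{vite_port}"
print(backend_port, frontend_origin)
```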
| Package | Version | Purpose |
|---|---|---|
| `fastapi` | latest | HTTP API server and routing |
| `uvicorn` | latest | ASGI server for FastAPI |
| `torch` + `transformers` | latest | SegFormer image segmentation model |
| `open_clip_torch` | latest | CLIP material/facade classification |
| `opencv-python` | latest | Image blending for segmentation overlay |
| `scikit-learn` | latest | RandomForest regression model |
| `joblib` | latest | Model serialisation and loading |
| `reportlab` | latest | PDF report generation |
| `react` | 18 | Frontend UI |
| `vite` | 5 | Frontend build tool + dev proxy |
| Score | Colour | Verdict |
|---|---|---|
| > 90 | Blue | Retrofitting Not Required |
| 50–90 | Green | Retrofit Recommended |
| < 50 | Red | Rebuilding Recommended |
The five scoring categories (Envelope & Energy 30%, Orientation & Solar 20%, Renewable Access 20%, Passive Design 15%, Climate-Design 15%) can be reweighted using sliders in the frontend before submission.
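
As a sketch of how reweighting and the verdict bands fit together: the default weights and thresholds below come from this section, while the 0–100 per-category sub-scores, the renormalisation of slider values, the category keys, and the function names are assumptions. The actual engine also applies compound rules and interaction penalties, which are omitted here.

```python
DEFAULT_WEIGHTS = {
    "envelope_energy": 0.30,
    "orientation_solar": 0.20,
    "renewable_access": 0.20,
    "passive_design": 0.15,
    "climate_design": 0.15,
}


def weighted_score(category_scores: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Combine 0-100 category scores into a single 0-100 score.

    Weights are renormalised, so slider values need not sum to 1.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(category_scores[name] * w for name, w in weights.items()) / total


def verdict(score: float) -> str:
    """Map a score to the verdict bands in the table above."""
    if score > 90:
        return "Retrofitting Not Required"
    if score >= 50:
        return "Retrofit Recommended"
    return "Rebuilding Recommended"


# Illustrative sub-scores, not real model output.
scores = {
    "envelope_energy": 40,
    "orientation_solar": 80,
    "renewable_access": 60,
    "passive_design": 50,
    "climate_design": 70,
}
default_total = weighted_score(scores)

# Slider values on a 0-100 scale that emphasise the envelope.
slider_weights = {
    "envelope_energy": 50,
    "orientation_solar": 20,
    "renewable_access": 10,
    "passive_design": 10,
    "climate_design": 10,
}
custom_total = weighted_score(scores, slider_weights)
print(f"{default_total:.1f} {verdict(default_total)}")
print(f"{custom_total:.1f} {verdict(custom_total)}")
```

Renormalising inside `weighted_score` keeps the result on the 0–100 scale regardless of how the sliders are set, so the verdict bands stay meaningful.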
- Image models load on first request (~60 s cold start); plan accordingly for demos
- SAM checkpoint (~2.4 GB, `sam_vit_h_4b8939.pth`) is optional; without it the analyser falls back to ADE20K sky-class masking for background separation
- Climate data uses 2020–2022 historical records; anomalies from 2023 onwards are not reflected
- Regression model must be trained locally before Step 02 works (`train_regression.py`)
- `green_architecture_dataset.csv` is training data only and is never used for live scoring
- Vite may start on ports 5174/5175 if 5173 is occupied; add the new port to `allow_origins` in `backend/main.py` if CORS errors appear