Galaxy Classifier AI is a machine learning system that classifies galaxies as Spiral or Elliptical from features extracted from their images. The project includes an AdaBoost implementation written from scratch and an interactive Streamlit dashboard for live predictions and research insights.
- ML Framework: Custom AdaBoost, scikit-learn
- Image Processing: OpenCV, scikit-image
- Frontend: Streamlit
- Data: Pandas, NumPy, Matplotlib, Seaborn
- Upload galaxy images and get instant classification
- View extracted features (color stats, entropy, shape metrics)
- Feature Analysis: Distribution plots, feature importance
- Boosting Internals: Alpha decay, error evolution visualizations
- Model Comparison: Benchmark against Random Forest, Gradient Boosting, SVM
- Manual AdaBoost: Implemented from scratch with weighted decision stumps
- Feature Extraction: 11 features (RGB means, RGB stds, entropy, area, perimeter, circularity, eccentricity)
- Grid Search: Hyperparameter tuning with detailed logging
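The "Manual AdaBoost" item above can be illustrated with a minimal sketch. This is not the code in `src/galaxy_adaboost.py`; it only shows the general technique (discrete AdaBoost over weighted decision stumps) in plain Python with no dependencies. The function names, the label encoding (+1 for Spiral, -1 for Elliptical), and the toy data are all assumptions made for this example.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """A decision stump: one feature, one threshold, output in {+1, -1}."""
    return polarity if row[feature] <= threshold else -polarity


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None  # (weighted_error, feature, threshold, polarity)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, n_rounds=10):
    """Train an ensemble of (alpha, feature, threshold, polarity) stumps."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote weight
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(
        alpha * stump_predict(row, feature, threshold, polarity)
        for alpha, feature, threshold, polarity in ensemble
    )
    return 1 if score >= 0 else -1


# Toy 1-D data (hypothetical): no single stump separates it.
X_toy = [[1], [2], [3], [4], [5], [6]]
y_toy = [1, 1, -1, -1, 1, 1]  # assumed encoding: +1 = Spiral, -1 = Elliptical
model = adaboost_fit(X_toy, y_toy, n_rounds=3)
print([adaboost_predict(model, row) for row in X_toy])
```

The per-round `alpha` values and weighted errors stored during a loop like this are the kind of quantities that alpha and error-evolution plots would draw on.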
- Clone & Install:
      git clone https://github.com/OsherBerGit/Galaxy-Machine-Learning-Project.git
      cd Galaxy-Machine-Learning-Project
      pip install -r requirements.txt
- Download Data: Get images_training_rev1.zip and training_solutions_rev1.csv from Kaggle Galaxy Zoo and extract them to data/
- Prepare & Train:
      python src/1_prepare_data.py
      python src/2_feature_extraction.py
      python src/8_train_final_manual_model.py
- Run App:
      streamlit run main.py
├── main.py # Streamlit dashboard
├── src/
│ ├── galaxy_adaboost.py # Manual AdaBoost implementation
│ ├── 1_prepare_data.py # Data preprocessing
│ ├── 2_feature_extraction.py
│ └── ... # Analysis scripts (3-7)
├── data/ # Dataset & results
├── models/ # Trained models
└── plots/ # Generated visualizations
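The 11 features listed earlier include per-channel color statistics, entropy, and circularity. As a rough illustration of what such quantities measure, here is a small standard-library sketch. It is not the code in `src/2_feature_extraction.py`, which is built on OpenCV and scikit-image per the stack list. The function names are hypothetical, and the definitions used (Shannon entropy of the intensity histogram, circularity as 4πA/P²) are common conventions that the project may or may not follow exactly.

```python
import math
import statistics
from collections import Counter


def channel_stats(pixels):
    """Per-channel mean and population std for a list of (r, g, b) tuples.

    Returns six values: mean_r, mean_g, mean_b, std_r, std_g, std_b.
    """
    channels = list(zip(*pixels))  # three tuples: all R, all G, all B
    means = [statistics.fmean(c) for c in channels]
    stds = [statistics.pstdev(c) for c in channels]
    return means + stds


def shannon_entropy(gray_pixels):
    """Shannon entropy in bits of a flat sequence of grayscale intensities."""
    total = len(gray_pixels)
    counts = Counter(gray_pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def circularity(area, perimeter):
    """4*pi*A / P^2: 1.0 for a perfect circle, lower for elongated shapes."""
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * area / (perimeter ** 2)


# Hypothetical inputs: a 2-pixel "image" and a circle of radius 2.
print(channel_stats([(0, 0, 0), (2, 4, 6)]))
print(shannon_entropy([0, 0, 255, 255]))
print(circularity(area=math.pi * 2 ** 2, perimeter=2 * math.pi * 2))
```

Area, perimeter, and eccentricity would normally come from a segmented galaxy region rather than being passed in by hand; that step is left out here.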
Data source: Galaxy Zoo Challenge on Kaggle