A beginner-friendly machine learning classification project that predicts Iris flower species using a Decision Tree Classifier.
It walks through a complete ML workflow on the Iris dataset: data loading, exploratory data analysis (EDA), visualization, model training, prediction, evaluation, and performance analysis.
The system:
- Loads and explores data
- Generates visual analysis
- Trains a classification model
- Evaluates prediction performance
- Produces graphs and insights
Key features:
- Complete machine learning workflow
- Beginner-friendly implementation
- Uses Decision Tree Classification
- Multiple visualizations included
- Performance evaluation and reporting
- Feature importance analysis
| Property | Details |
|---|---|
| Dataset | Iris Dataset |
| Source | Scikit-learn |
| Total Records | 150 |
| Features | Sepal Length, Sepal Width, Petal Length, Petal Width |
| Classes | Setosa, Versicolor, Virginica |
Decision Tree Classifier
A supervised machine learning algorithm used to classify flower species based on feature values.

    from sklearn.tree import DecisionTreeClassifier

    # Limit the tree to three levels of splits
    model = DecisionTreeClassifier(max_depth=3)

Model depth is restricted to reduce complexity and minimize overfitting.
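
To illustrate what the depth limit does, here is a minimal sketch that fits an unrestricted tree and a `max_depth=3` tree on the full Iris data and compares their size. The variable names and `random_state=0` are assumptions chosen for illustration; they are not taken from `classification.py`.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: keeps splitting until every leaf is pure.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Depth-limited tree, matching the max_depth=3 setting used in this project.
small_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

print("unrestricted:", full_tree.get_depth(), "levels,", full_tree.get_n_leaves(), "leaves")
print("max_depth=3: ", small_tree.get_depth(), "levels,", small_tree.get_n_leaves(), "leaves")
```

A tree of depth 3 can have at most 8 leaves, so it cannot memorize individual training samples the way a fully grown tree can.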
Load Dataset
↓
Data Exploration
↓
Data Visualization
↓
Train-Test Split
↓
Model Training
↓
Prediction
↓
Evaluation
↓
Performance Analysis
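
The workflow above could be sketched roughly as follows. This is not the contents of `classification.py`: the 80/20 split, the stratification, and `random_state=42` are assumptions, so the printed accuracies may differ from the reported scores. The visualization step is left out here to keep the sketch short.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load dataset (as_frame=True returns pandas objects)
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Data exploration: summary statistics for the four features
print(X.describe())

# Train-test split (80/20 split and seed are assumed, not taken from the project)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Model training
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Prediction
y_pred = model.predict(X_test)

# Evaluation
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, y_pred)
print(f"Training accuracy: {train_acc:.2%}")
print(f"Testing accuracy:  {test_acc:.2%}")
```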
| Metric | Score |
|---|---|
| Training Accuracy | 95.83% |
| Testing Accuracy | 100.00% |
Additional evaluation includes:
- Classification Report
- Confusion Matrix
- Feature Importance
- Overfitting Check
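
The additional evaluation steps can be reproduced with standard Scikit-learn calls. The sketch below assumes the same 80/20 split and `random_state=42` as a placeholder configuration, and it treats the overfitting check as a simple comparison of training and testing accuracy, which is one common approach and may not be exactly what the project script does.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Classification report: precision, recall, and F1 per species
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# Confusion matrix: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Feature importance: share of impurity reduction attributed to each feature
importances = dict(zip(iris.feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")

# Overfitting check: a large positive gap suggests the tree memorized the training set
gap = model.score(X_train, y_train) - model.score(X_test, y_test)
print(f"Train-test accuracy gap: {gap:+.4f}")
```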
| File | Purpose |
|---|---|
| histogram.png | Feature distribution |
| boxplot.png | Feature spread analysis |
| heatmap.png | Correlation analysis |
| pairplot.png | Feature relationship analysis |
| scatterplot.png | Petal comparison |
All graphs are stored inside the graphs/ folder.
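
As a rough illustration of how such files might be produced, the sketch below writes four of the five graphs using only pandas and Matplotlib. The function name `save_graphs` is hypothetical, and the project itself may use Seaborn for these plots; the pair plot is omitted because it is usually generated with Seaborn's `pairplot`.

```python
import os

import matplotlib
matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris


def save_graphs(out_dir="graphs"):
    """Save a few exploratory plots of the Iris features into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    iris = load_iris(as_frame=True)
    df, target = iris.data, iris.target

    # Histogram: distribution of each feature
    df.hist(figsize=(8, 6))
    plt.savefig(os.path.join(out_dir, "histogram.png"))
    plt.close("all")

    # Box plot: spread and outliers per feature
    df.plot(kind="box", figsize=(8, 6))
    plt.savefig(os.path.join(out_dir, "boxplot.png"))
    plt.close("all")

    # Heatmap: pairwise feature correlations
    corr = df.corr()
    fig, ax = plt.subplots(figsize=(6, 5))
    im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr)))
    ax.set_xticklabels(corr.columns, rotation=45, ha="right")
    ax.set_yticks(range(len(corr)))
    ax.set_yticklabels(corr.columns)
    fig.colorbar(im)
    fig.tight_layout()
    fig.savefig(os.path.join(out_dir, "heatmap.png"))
    plt.close(fig)

    # Scatter plot: petal length vs. petal width, colored by species
    fig, ax = plt.subplots(figsize=(6, 5))
    ax.scatter(df["petal length (cm)"], df["petal width (cm)"], c=target)
    ax.set_xlabel("petal length (cm)")
    ax.set_ylabel("petal width (cm)")
    fig.savefig(os.path.join(out_dir, "scatterplot.png"))
    plt.close(fig)

    return sorted(os.listdir(out_dir))
```

Calling `save_graphs()` with no arguments writes the files into a `graphs/` folder in the current directory, creating it if needed.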
Data-Classification-Using-AI/
│
├── classification.py
├── README.md
├── requirements.txt
│
└── graphs/
    ├── histogram.png
    ├── boxplot.png
    ├── heatmap.png
    ├── pairplot.png
    └── scatterplot.png
Clone repository:

    git clone https://github.com/anmol396/Data-Classification-Using-AI.git

Move to project folder:

    cd Data-Classification-Using-AI

Install dependencies:

    pip install -r requirements.txt

Run the project:

    python classification.py

| Category | Technology |
|---|---|
| Language | Python |
| Data Processing | Pandas, NumPy |
| Machine Learning | Scikit-learn |
| Visualization | Matplotlib, Seaborn |
After completing this project, you will be able to:
- Understand classification workflow
- Build Decision Tree models
- Perform exploratory data analysis
- Evaluate model performance
- Generate visual insights
- Interpret feature importance
Possible future improvements:
- Add a Random Forest comparison
- Add hyperparameter tuning
- Build an interactive dashboard
- Deploy the model
- Support additional datasets
Developed as part of Project 2 – Data Classification Using AI