
Data-Classification-Using-AI



A beginner-friendly machine learning classification project that predicts Iris flower species using a Decision Tree Classifier.

This project demonstrates a complete ML workflow including data loading, exploratory data analysis (EDA), visualization, model training, evaluation, prediction, and performance analysis.


Project Overview

The project runs end to end on the Iris dataset that ships with Scikit-learn.

The system:

  • Loads and explores data
  • Generates visual analysis
  • Trains a classification model
  • Evaluates prediction performance
  • Produces graphs and insights

Key Highlights

  • Complete machine learning workflow
  • Beginner-friendly implementation
  • Uses Decision Tree Classification
  • Multiple visualizations included
  • Performance evaluation and reporting
  • Feature importance analysis

Dataset Information

| Property | Details |
| --- | --- |
| Dataset | Iris Dataset |
| Source | Scikit-learn |
| Total Records | 150 |
| Features | Sepal Length, Sepal Width, Petal Length, Petal Width |
| Classes | Setosa, Versicolor, Virginica |
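
As a rough illustration of the loading step, the dataset described above can be pulled from Scikit-learn into a pandas DataFrame. This is a minimal sketch, not necessarily how `classification.py` does it; the variable names `iris` and `df` and the added `species` column are assumptions made for this example.

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects instead of NumPy arrays
iris = load_iris(as_frame=True)

# iris.frame holds the four feature columns plus an integer "target" column
df = iris.frame

# Map the integer labels (0, 1, 2) to species names for readability.
# The "species" column is an addition for this sketch, not part of the dataset.
label_to_name = {i: str(name) for i, name in enumerate(iris.target_names)}
df["species"] = df["target"].map(label_to_name)

print(df.shape)
print(df["species"].value_counts())
```

Working from a DataFrame keeps the later exploration and plotting steps simple, since pandas and Seaborn both accept it directly.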

Classification Model

Algorithm Used

Decision Tree Classifier

A supervised learning algorithm that classifies flower species by applying a sequence of threshold tests to the feature values.

Model Configuration

DecisionTreeClassifier(
    max_depth=3
)

Tree depth is capped at 3 to keep the model simple and reduce the risk of overfitting.
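
To show how this configuration fits into training, here is a minimal sketch of the split, fit, and scoring steps. The README does not state the split ratio or any random seed, so the 80/20 split and both `random_state=42` values below are assumptions, and the scores it prints may differ from the reported ones.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assumption: 80% train / 20% test with a fixed seed for repeatability
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_depth=3 comes from the README; random_state is an assumption
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# score() returns mean accuracy on the given data
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"Training accuracy: {train_acc:.2%}")
print(f"Testing accuracy:  {test_acc:.2%}")
```

Fixing the seeds makes the split and the fitted tree reproducible from run to run, which matters on a dataset this small.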


Project Workflow

Load Dataset
↓
Data Exploration
↓
Data Visualization
↓
Train-Test Split
↓
Model Training
↓
Prediction
↓
Evaluation
↓
Performance Analysis

Model Performance

| Metric | Score |
| --- | --- |
| Training Accuracy | 95.83% |
| Testing Accuracy | 100.00% |

Additional evaluation includes:

  • Classification Report
  • Confusion Matrix
  • Feature Importance
  • Overfitting Check
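
The four evaluation items above could be produced along these lines. This is a sketch under assumptions: the split ratio, the seeds, and the choice to express the overfitting check as a train-minus-test accuracy gap are not confirmed by the README.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumption: 80/20 split and fixed seeds (not stated in the README)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Classification report: per-class precision, recall, and F1
labels = [0, 1, 2]
print(classification_report(
    y_test, y_pred, labels=labels, target_names=iris.target_names
))

# Confusion matrix: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(cm)

# Feature importance: impurity-based scores that sum to 1
for name, importance in zip(iris.feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")

# Overfitting check (assumed form): a large positive gap suggests overfitting
gap = model.score(X_train, y_train) - model.score(X_test, y_test)
print(f"Train minus test accuracy: {gap:+.4f}")
```

With the figures reported in the table, the gap is negative because test accuracy exceeds training accuracy, which can happen on a small held-out set.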

Generated Visualizations

| File | Purpose |
| --- | --- |
| histogram.png | Feature distribution |
| boxplot.png | Feature spread analysis |
| heatmap.png | Correlation analysis |
| pairplot.png | Feature relationship analysis |
| scatterplot.png | Petal comparison |

All graphs are stored inside the graphs/ folder.
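
For orientation, three of these graphs could be generated roughly as follows. This is a sketch, not the project's actual plotting code: the figure sizes, bin count, and color map are assumptions, as is reading "Petal comparison" as petal length against petal width. Running it inside the repository would overwrite the existing files in `graphs/`.

```python
import os

import matplotlib
matplotlib.use("Agg")  # render straight to files, no display needed
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

df = load_iris(as_frame=True).frame
features = df.drop(columns="target")

out_dir = "graphs"
os.makedirs(out_dir, exist_ok=True)

# Histogram: distribution of each of the four features
features.hist(bins=15, figsize=(8, 6))
plt.tight_layout()
plt.savefig(os.path.join(out_dir, "histogram.png"))
plt.close("all")

# Heatmap: pairwise correlation between features
plt.figure(figsize=(6, 5))
sns.heatmap(features.corr(), annot=True, cmap="coolwarm")
plt.tight_layout()
plt.savefig(os.path.join(out_dir, "heatmap.png"))
plt.close("all")

# Scatter plot: petal length vs. petal width, colored by class label
plt.figure(figsize=(6, 5))
plt.scatter(df["petal length (cm)"], df["petal width (cm)"], c=df["target"])
plt.xlabel("petal length (cm)")
plt.ylabel("petal width (cm)")
plt.tight_layout()
plt.savefig(os.path.join(out_dir, "scatterplot.png"))
plt.close("all")
```

Closing each figure after saving keeps memory use flat and prevents one plot from drawing onto another.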


Project Structure

Data-Classification-Using-AI/
│
├── classification.py
├── README.md
├── requirements.txt
│
└── graphs/
    ├── histogram.png
    ├── boxplot.png
    ├── heatmap.png
    ├── pairplot.png
    └── scatterplot.png

Installation Guide

Clone repository:

git clone https://github.com/anmol396/Data-Classification-Using-AI.git

Move to project folder:

cd Data-Classification-Using-AI

Install dependencies:

pip install -r requirements.txt

Run Project

python classification.py

Technologies Used

| Category | Technology |
| --- | --- |
| Language | Python |
| Data Processing | Pandas, NumPy |
| Machine Learning | Scikit-learn |
| Visualization | Matplotlib, Seaborn |

Learning Outcomes

After completing this project, you will be able to:

  • Understand the classification workflow
  • Build Decision Tree models
  • Perform exploratory data analysis
  • Evaluate model performance
  • Generate visual insights
  • Interpret feature importance

Future Improvements

  • Add Random Forest comparison
  • Hyperparameter tuning
  • Interactive dashboard
  • Model deployment
  • Support additional datasets

Developed as part of Project 2 – Data Classification Using AI
