A machine learning project that predicts wine quality using physicochemical properties of Portuguese "Vinho Verde" red wine.
The dataset contains 1,599 samples of red wine with 11 physicochemical features and 1 quality rating:
- Fixed acidity
- Volatile acidity
- Citric acid
- Residual sugar
- Chlorides
- Free sulfur dioxide
- Total sulfur dioxide
- Density
- pH
- Sulphates
- Alcohol
- Quality: score from 0 to 10 (converted to binary: Good ≥7, Bad <7)
- Random Forest
  - Test Accuracy: 92.5%
  - Cross-validation Score: 86.4% (±0.016)
  - Optimized with GridSearchCV
- Logistic Regression
  - Test Accuracy: 89.4%
  - Cross-validation Score: 87.0% (±0.019)
  - Optimized with GridSearchCV
- Decision Tree
  - Test Accuracy: 90.3%
  - Cross-validation Score: 79.8% (±0.060)
- Data Preprocessing: Label binarization for quality classification
- Feature Selection: Top 5 most important features identified
- Hyperparameter Optimization: GridSearchCV for model tuning
- Cross-validation: 5-fold CV for robust evaluation
- Visualization: Correlation heatmaps and feature analysis
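The binarization and GridSearchCV steps listed above can be sketched as follows. This is an illustrative example on synthetic data, not the project's actual code: the column names, parameter grid, and sample size are assumptions, and the real notebook works with the 1,599-sample Vinho Verde dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the 11 physicochemical feature columns.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 11)),
                 columns=[f"feature_{i}" for i in range(11)])

# Label binarization: raw 0-10 quality scores -> Good (>=7) vs Bad (<7).
quality = rng.integers(3, 9, size=200)
y = (quality >= 7).astype(int)

# Hyperparameter optimization with GridSearchCV (grid is illustrative).
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```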
- Alcohol (17.7% importance)
- Sulphates (11.7% importance)
- Volatile Acidity (11.0% importance)
- Citric Acid (9.7% importance)
- Density (9.3% importance)
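A ranking like the one above can be obtained from a fitted Random Forest's `feature_importances_` attribute. The sketch below uses synthetic data whose labels are driven mostly by the alcohol column, so alcohol should rank among the top features; the column names mirror the dataset, but the data and model settings are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

cols = ["fixed_acidity", "volatile_acidity", "citric_acid", "residual_sugar",
        "chlorides", "free_sulfur_dioxide", "total_sulfur_dioxide",
        "density", "pH", "sulphates", "alcohol"]
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(300, 11)), columns=cols)
# Synthetic labels driven mostly by alcohol content.
y = (X["alcohol"] + rng.normal(scale=0.5, size=300) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)
# Rank features by importance and keep the top 5.
top5 = pd.Series(rf.feature_importances_, index=cols).nlargest(5)
print(top5)
```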
```bash
# Install dependencies
pip install numpy pandas matplotlib seaborn scikit-learn

# Clone the repository
git clone https://github.com/beater35/ml-classification-wine-quality.git
cd ml-classification-wine-quality

# Load and run the Jupyter notebook
jupyter notebook classification_model_comparison_wine_quality.ipynb
```

```python
# Example prediction with the top 5 features
# Order: [alcohol, sulphates, volatile_acidity, citric_acid, density]
input_data = [10.0, 0.47, 0.65, 0.0, 0.9946]

prediction = final_rf_model.predict([input_data])
if prediction[0] == 1:
    print("Good Quality Wine")
else:
    print("Bad Quality Wine")
```

| Model | Test Accuracy | CV Score (Mean ± Std) |
|---|---|---|
| Random Forest | 92.5% | 86.4% ± 1.6% |
| Logistic Regression | 89.4% | 87.0% ± 1.9% |
| Decision Tree | 90.3% | 79.8% ± 6.0% |
```
wine-quality-classification/
├── classification_model_comparison_wine_quality.ipynb
└── README.md
```
- Data Exploration: Statistical analysis and visualization
- Preprocessing: Binary classification setup (Good/Bad quality)
- Model Training: Three different algorithms tested
- Hyperparameter Tuning: GridSearchCV optimization
- Feature Selection: Importance-based feature ranking
- Model Evaluation: Cross-validation and test accuracy
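The evaluation step above can be sketched with `cross_val_score`, which performs the 5-fold CV and returns one accuracy score per fold. The data here is a synthetic stand-in generated with `make_classification`; the real project evaluates on the wine dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic 11-feature dataset as a stand-in for the wine data.
X, y = make_classification(n_samples=300, n_features=11, random_state=42)

# 5-fold cross-validation: one accuracy score per fold.
scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5)
print(f"mean={scores.mean():.3f} std={scores.std():.3f}")
```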
The Random Forest Classifier achieved the best performance with 92.5% test accuracy after hyperparameter optimization and feature selection. The model successfully identifies wine quality based on physicochemical properties, with alcohol content being the most influential factor.
- Dataset: Kaggle – Wine Quality Dataset