Quantitative consumer insights analysis for Hip-Hop music data
Built with Python · Pandas · Scikit-learn · Plotly · Streamlit
This project applies quantitative research methods to Hip-Hop music data to extract consumer insights — combining audio feature analysis, NLP sentiment scoring, and machine learning to understand what drives track popularity.
| Metric | Value |
|---|---|
| ML Model R² (test set) | ~0.70 |
| CV R² (5-fold) | ~0.68 ± 0.04 |
| Top popularity driver | Danceability |
| Sentiment range | -0.43 (Dark) → +0.75 (Positive) |
| Tracks analysed | 500+ |
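
The test-set R², 5-fold CV R², and top-driver metrics above can be computed along the following lines. This is a minimal sketch, not the project's actual pipeline: the data is synthetic, the feature names and the toy popularity formula are assumptions made for illustration, and the model uses default hyperparameters.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data: NOT the project's dataset. Feature names and
# the popularity formula are assumptions made purely for illustration.
rng = np.random.default_rng(0)
feature_names = ["danceability", "energy", "valence"]
X = rng.uniform(0.0, 1.0, size=(500, len(feature_names)))
noise = rng.normal(0.0, 5.0, size=500)
y = 60.0 * X[:, 0] + 20.0 * X[:, 1] + noise  # toy "popularity" score

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

# R² on the held-out test set
test_r2 = r2_score(y_test, model.predict(X_test))

# 5-fold cross-validated R², reported as mean ± standard deviation
cv_scores = cross_val_score(
    GradientBoostingRegressor(random_state=0), X, y, cv=5, scoring="r2"
)

# Rank features by impurity-based importance to find the top driver
top_driver = feature_names[int(np.argmax(model.feature_importances_))]

print(f"Test R²: {test_r2:.2f}")
print(f"CV R² (5-fold): {cv_scores.mean():.2f} ± {cv_scores.std():.2f}")
print(f"Top driver: {top_driver}")
```

The numbers printed by this sketch describe the synthetic data only and should not be compared with the values in the table. Impurity-based importances are one common way to rank drivers; permutation importance is an alternative if features are correlated.
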
| Layer | Tools | Status |
|---|---|---|
| Data processing | `pandas`, `numpy`, `scikit-learn` | ✅ |
| Statistical analysis | `scipy`, `statsmodels` | ✅ |
| NLP | `vaderSentiment` | ✅ |
| Machine Learning | `GradientBoostingRegressor`, `PCA` | ✅ |
| Visualisation | `plotly` | ✅ |
| Dashboard | `streamlit` | ✅ |
| Data source | Kaggle Spotify Hip-Hop Dataset | ✅ |
To run the actual Streamlit app, launch `appstreamlit.py` (for example with `streamlit run appstreamlit.py`) together with the data we shared.