🧠 Mental Wellness Analysis and Support Strategy

Machine Learning Approach to Understanding Mental Health in Tech

A Data Science Project by Javin Chutani

👨‍💻 About

Author: Javin Chutani
Project Type: Machine Learning & Data Analytics
Status: Active

📖 Overview

This project explores mental health challenges faced by employees in the tech industry using machine learning and data analytics. By analyzing survey data from tech workers, the project develops predictive models and insights to help organizations create better mental health support systems.

🎯 Key Objectives

Classification Task - Predict whether an individual is likely to seek mental health treatment based on workplace and personal factors
Regression Task - Predict an individual's age from survey responses to support age-targeted interventions
Clustering Analysis - Segment tech employees into distinct groups based on mental health indicators for tailored HR policies

✨ Features

🔍 Exploratory Data Analysis (EDA) - Comprehensive visualization and statistical analysis
🤖 Machine Learning Models
- Random Forest Classifier
- XGBoost Classifier
- Logistic Regression
- Random Forest Regressor
- K-Means Clustering
📊 Interactive Dashboard - Streamlit web application for model predictions and insights
📈 Model Performance Metrics - ROC curves, confusion matrices, and detailed evaluation
🎨 Data Visualizations - Univariate, bivariate, and multivariate analysis
🐳 Docker Support - Containerized deployment for easy setup

📂 Project Structure

188nmv/
│
├── 📁 Images/                          # Visualization outputs
│   ├── bivariate1.png
│   ├── bivariate2.png
│   ├── cluster0.png
│   ├── cluster1.png
│   ├── cluster2.png
│   ├── cluster3.png
│   ├── dimred.png
│   ├── multivariate1.png
│   ├── multivariate2.png
│   ├── ROC Curve - Classification.png
│   ├── univariate1.png
│   └── univariate2.png
│
├── 📁 Models & Dataset/                # Trained models and processed data
│   ├── classification_model.pkl
│   ├── regression_model.pkl
│   └── df.pkl
│
├── 📁 Notebooks/                       # Jupyter notebooks for analysis
│   ├── EDA.ipynb                       # Exploratory Data Analysis
│   ├── classification_model.ipynb      # Classification model training
│   ├── regression_model.ipynb          # Regression model training
│   └── clustering.ipynb                # Clustering analysis
│
├── 📁 .devcontainer/                   # Development container config
│   └── devcontainer.json
│
├── 📄 app.py                           # Streamlit web application
├── 📄 survey.csv                       # Raw dataset
├── 📄 requirements.txt                 # Python dependencies
├── 📄 Dockerfile                       # Docker configuration
├── 📄 .dockerignore                    # Docker ignore file
└── 📄 README.md                        # Project documentation
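
The serialized artifacts in `Models & Dataset/` can be loaded along these lines. This is a minimal sketch, assuming the files were written with `joblib.dump` (Joblib is the serialization library this project uses). The helper name `load_artifact` is hypothetical and is not taken from `app.py`.

```python
from pathlib import Path

import joblib

# Folder and file names follow the project tree; the helper itself is hypothetical.
MODEL_DIR = Path("Models & Dataset")


def load_artifact(name: str, model_dir: Path = MODEL_DIR):
    """Load one serialized artifact (a model or the processed DataFrame)."""
    path = model_dir / name
    if not path.exists():
        # Fail early with a readable message instead of a low-level I/O error.
        raise FileNotFoundError(f"Expected artifact not found: {path}")
    return joblib.load(path)
```

For example, `load_artifact("classification_model.pkl")` would return the trained classifier when run from the repository root.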

🚀 Getting Started

Prerequisites

Python 3.11 or higher
pip package manager
Docker (optional, for containerized deployment)

Option 1: Local Installation

Clone the repository

git clone https://github.com/javin1106/188nmv.git
cd 188nmv

Install dependencies
```
pip install -r requirements.txt
```
Run the Streamlit app
```
streamlit run app.py
```
Access the application
- Open your browser and navigate to http://localhost:8501

Option 2: Using Docker

Pull from Docker Hub (Recommended)

The Docker image uses Python 3.11 as the base image for improved performance and compatibility.

# Pull the latest image
docker pull javin1106/mental-health-app:latest

# Run the container
docker run -p 8501:8501 javin1106/mental-health-app:latest

Or Build Locally

# Build the image
docker build -t mental-health-app .

# Run the container
docker run -p 8501:8501 mental-health-app

Docker Compose (if available)

docker-compose up

Access the application:

Open your browser and navigate to http://localhost:8501

📊 Dataset

Source: Mental Health in Tech Survey - Kaggle

The dataset contains responses from tech employees regarding:

Demographics (age, gender, country)
Work environment characteristics
Mental health history
Workplace mental health benefits
Attitudes toward mental health treatment

🧪 Methodology

1. Data Preprocessing

Handling missing values
Feature engineering
Encoding categorical variables
Data normalization and scaling
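
The preprocessing steps above can be sketched as a single scikit-learn pipeline. This is an illustration under stated assumptions, not the notebook code: the sample rows are made up, the column names (`Age`, `family_history`, `remote_work`) are assumed, and the imputation strategies are one reasonable choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows for illustration only; column names are assumptions.
sample = pd.DataFrame({
    "Age": [29, 41, np.nan, 35],
    "family_history": ["Yes", "No", "No", np.nan],
    "remote_work": ["No", "Yes", "Yes", "No"],
})

numeric_cols = ["Age"]
categorical_cols = ["family_history", "remote_work"]

preprocessor = ColumnTransformer(
    transformers=[
        # Numeric columns: fill gaps with the median, then standardize.
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]), numeric_cols),
        # Categorical columns: fill gaps with the mode, then one-hot encode.
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

features = preprocessor.fit_transform(sample)
```

Keeping imputation, scaling, and encoding inside one `ColumnTransformer` means the same transformations are applied at training and prediction time.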

2. Exploratory Data Analysis

Univariate, bivariate, and multivariate analysis
Correlation analysis
Distribution plots and statistical summaries
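
A small pandas sketch of the three kinds of analysis listed above. The rows are made up and the column names (`Age`, `family_history`, `treatment`) are assumptions; the notebooks may use different columns and plotting code.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
sample = pd.DataFrame({
    "Age": [29, 41, 35, 23, 52, 31],
    "family_history": ["Yes", "No", "No", "Yes", "Yes", "No"],
    "treatment": ["Yes", "No", "No", "Yes", "Yes", "Yes"],
})

# Univariate: statistical summary and class balance.
age_summary = sample["Age"].describe()
treatment_share = sample["treatment"].value_counts(normalize=True)

# Bivariate: share of treatment seekers within each family-history group.
history_vs_treatment = pd.crosstab(
    sample["family_history"], sample["treatment"], normalize="index"
)

# Correlation analysis: encode Yes/No as 1/0, then compute pairwise correlations.
encoded = sample.assign(
    family_history=(sample["family_history"] == "Yes").astype(int),
    treatment=(sample["treatment"] == "Yes").astype(int),
)
correlations = encoded.corr()
```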

3. Model Development

Classification Model

Target Variable: Treatment seeking behavior
Algorithms: Logistic Regression, Random Forest, XGBoost
Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, ROC-AUC
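
The evaluation described above can be sketched as follows. This is a minimal example on synthetic data from `make_classification`, not the survey itself, and the hyperparameters are arbitrary assumptions. XGBoost is left out to keep the sketch dependency-light; an `XGBClassifier` could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded survey features and the treatment label.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # ROC-AUC needs the predicted probability of the positive class.
    y_prob = model.predict_proba(X_test)[:, 1]
    scores[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
    }
```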

Regression Model

Target Variable: Age prediction
Algorithms: Linear Regression, Random Forest Regressor
Evaluation Metrics: RMSE, MAE, R² Score
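
The regression metrics above can be computed as in the sketch below. It uses synthetic data from `make_regression` in place of the survey, and the hyperparameters are arbitrary assumptions rather than the values used in the notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded survey features and the age target.
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressors = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

results = {}
for name, reg in regressors.items():
    reg.fit(X_train, y_train)
    pred = reg.predict(X_test)
    results[name] = {
        # RMSE is the square root of the mean squared error.
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of how much a few large misses contribute to the overall error.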

Clustering Analysis

Algorithm: K-Means Clustering
Purpose: Segmentation of employees based on mental health patterns
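
A minimal K-Means sketch of this segmentation step, run on synthetic blobs rather than the survey data. The choice of four clusters is an assumption inferred from the image names `cluster0.png` through `cluster3.png`; the silhouette score is added here as one common way to sanity-check the segmentation and is not confirmed to be part of the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the encoded mental health indicators.
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=7)

# K-Means is distance-based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=7)
labels = kmeans.fit_predict(X_scaled)

# Silhouette ranges from -1 to 1; higher values mean better-separated clusters.
silhouette = silhouette_score(X_scaled, labels)

# Number of samples assigned to each cluster.
cluster_sizes = {int(c): int((labels == c).sum()) for c in sorted(set(labels))}
```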

📈 Results

The models demonstrate strong predictive capabilities in identifying:

Employees likely to seek mental health treatment
Key workplace factors influencing mental wellness
Distinct employee segments requiring different support strategies

For detailed results and insights, please refer to the Technical Report.

🌐 Live Demo

Experience the interactive dashboard: Launch App

🛠️ Technologies Used

Python - Core programming language
Pandas & NumPy - Data manipulation and analysis
Scikit-learn - Machine learning algorithms
XGBoost - Gradient boosting framework
Matplotlib & Seaborn - Data visualization
Streamlit - Web application framework
Joblib - Model serialization
Docker - Containerization and deployment

📝 Key Insights

Family history of mental health issues is a strong predictor of treatment seeking
Remote work policies impact mental wellness differently across demographics
Company size and benefits significantly influence employee mental health
Age-specific interventions can improve support program effectiveness

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

📜 License

This project is open source and available for educational and research purposes.

🙏 Acknowledgements

Dataset Source: OSMI Mental Health in Tech Survey
Kaggle Community - For making valuable datasets accessible

📧 Contact

Javin Chutani

Made with ❤️ for improving mental health awareness in tech

⭐ Star this repository if you find it helpful!
