GgauravJ05/Data_Visualization

Data Visualization in Python (Matplotlib & Seaborn) 📊✨

Welcome to the Data Visualization with Python repository! This project is a step-by-step guide to the fundamentals and advanced techniques of data visualization using two of Python's most widely used plotting libraries: Matplotlib and Seaborn.

Whether you are a beginner looking to understand basic charts or an intermediate developer wanting to create complex subplots, heatmaps, and publication-ready figures, this repository covers the entire learning path.


📂 Repository Structure

The project follows a clean, standardized layout designed for ease of use and clean version control:

Data_Visualization/
│
├── 📁 data/                        # Placeholder for external datasets (CSV, JSON, etc.)
│
├── 📁 notebooks/                   # Step-by-step Jupyter Notebook tutorials
│   ├── 📓 1-matplotlib_tutorial.ipynb  # Part 1 & Part 2: Lines, Bars, Scatters, Pies, Histograms, Box/Stack/Subplots
│   └── 📓 2-seaborn_tutorial.ipynb     # Part 2: Seaborn basics, Relational, Categorical, Distribution Plots & Heatmaps
│
├── 📁 outputs/                     # Directory where generated and saved plots will be stored
│
├── 📄 .gitattributes               # Git attribute configurations
├── 📄 .gitignore                   # Standard gitignore for Python and Jupyter artifacts
├── 📄 LICENSE                      # Standard MIT open-source license
├── 📄 requirements.txt             # Dependencies needed to run the notebooks
└── 📄 README.md                    # Main repository guide (you are here!)

🎯 Topics Covered

The tutorial is organized into three sections, ordered sequentially to build your skills:

📓 1. Matplotlib Basics (matplotlib_tutorial.ipynb)

  • What is Data Visualization? & How to Plot Data.
  • Introduction to Matplotlib & Important Plot Methods.
  • Line Plots – Multiple datasets, format strings (colors/styles/markers), and grid settings.
  • Vertical & Horizontal Bar Charts – Bar labels, multiple datasets, and categories.
  • Scatter Plots – Customizations (colors/sizes), overlays, and annotations.
  • Pie Charts – Exploded slices, custom color palettes, shadows, and percentages.
  • Styling & Saving Plots – Implementing legends, grid lines, titles, and high-res exports (savefig()).
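
As a rough illustration of the line-plot and bar-chart topics above, the sketch below draws two series with format strings and a labelled bar chart. It is not taken from the notebooks: the data values and variable names are invented for the example, and `Axes.bar_label` assumes Matplotlib 3.4 or newer.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, invented for illustration only.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
visits_web = [120, 135, 128, 150, 170]
visits_app = [80, 95, 110, 105, 130]

fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# A format string packs marker, line style, and color into one short code:
# "o-b" = circle markers, solid line, blue; "s--r" = squares, dashed, red.
ax_line.plot(days, visits_web, "o-b", label="Web")
ax_line.plot(days, visits_app, "s--r", label="App")
ax_line.set_title("Daily Visits")
ax_line.set_xlabel("Day")
ax_line.set_ylabel("Visits")
ax_line.grid(True, linestyle=":", alpha=0.6)
ax_line.legend()

# Vertical bars with a value label above each bar.
bars = ax_bar.bar(days, visits_web, color="#1f77b4")
labels = ax_bar.bar_label(bars, padding=2)  # needs Matplotlib 3.4+
ax_bar.set_title("Web Visits per Day")

fig.tight_layout()
```

In a notebook the figure renders at the end of the cell; in a plain script, call `plt.show()` afterwards.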

📓 2. Advanced Matplotlib (1-matplotlib_tutorial.ipynb)

  • Histograms – Understanding frequency distributions, plotting multiple datasets, and adding threshold markers (axvline).
  • Box Plots – Visualizing data quartiles, medians, and outliers, along with common box plot operations.
  • Stack Plots – Displaying cumulative changes over time for multiple variables.
  • Subplots – Creating grids of plots (multiple axes in a single figure) using plt.subplots().
  • Modern Matplotlib Styles – Incorporating styling rules and modern design aesthetics.
  • Practice Task (Weekly Temperature) – Practical exercise putting all Matplotlib concepts to work.
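
The topics above can be combined in a single figure. The following is a minimal sketch, not the notebook's own code: it places a histogram with an `axvline` threshold, a box plot, and a stack plot on a `plt.subplots()` grid. All data is randomly generated or invented, and the threshold value of 50 is an assumption chosen only to show the marker.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the figure is repeatable

# Hypothetical data, invented for illustration only.
scores = rng.normal(loc=70, scale=10, size=200)
groups = [rng.normal(loc=m, scale=5, size=50) for m in (60, 70, 80)]
months = np.arange(1, 7)
sales = {"A": [3, 4, 5, 6, 7, 8], "B": [2, 2, 3, 3, 4, 5]}

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram with a vertical threshold marker.
axes[0, 0].hist(scores, bins=15, color="#1f77b4", edgecolor="white")
axes[0, 0].axvline(50, color="red", linestyle="--", label="Threshold (assumed)")
axes[0, 0].legend()
axes[0, 0].set_title("Histogram")

# Box plot: one box per group, showing median, quartiles, and outliers.
axes[0, 1].boxplot(groups)
axes[0, 1].set_xticks([1, 2, 3])
axes[0, 1].set_xticklabels(["G1", "G2", "G3"])
axes[0, 1].set_title("Box Plot")

# Stack plot: the series are stacked, so the top edge is the running total.
axes[1, 0].stackplot(months, sales["A"], sales["B"], labels=list(sales))
axes[1, 0].legend(loc="upper left")
axes[1, 0].set_title("Stack Plot")

# Hide the unused fourth panel.
axes[1, 1].axis("off")

fig.tight_layout()
```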

📓 3. Seaborn Tutorial (2-seaborn_tutorial.ipynb)

Part 2: Statistical Visualizations with Seaborn

  • Introduction to Seaborn – Why Seaborn is used, and how it simplifies plotting complex data frames.
  • Creating Plots with Seaborn – Custom styles, grid configurations, and built-in theme presets.
  • Relational Plots (relplot) – Visualizing statistical relationships (scatter plots and line plots) with color/size semantics.
  • Categorical Plots (catplot) – Bar plots, box plots, violin plots, and strip plots grouped by category.
  • Distribution Plots (displot) – KDE (Kernel Density Estimate) plots, histograms, and cumulative distributions.
  • Heatmaps (heatmap) – Visualizing correlation matrices, pivot tables, and relational grid weights with color bars.
  • Best Practices for Data Visualization – Final tips on layout structure, font scaling, palette choices, and statistical communication.

🚀 Getting Started

Follow these simple steps to run this project locally on your machine:

Prerequisites

Make sure you have Python 3.8+ installed. You can check your Python version by running:

python --version

Installation

  1. Clone this repository:

    git clone https://github.com/GgauravJ05/Data_Visualization.git
    cd Data_Visualization
  2. Create and activate a virtual environment (optional but recommended):

    • On macOS/Linux:
      python -m venv venv
      source venv/bin/activate
    • On Windows:
      python -m venv venv
      venv\Scripts\activate
  3. Install the required dependencies:

    pip install -r requirements.txt
  4. Launch Jupyter Notebook:

    jupyter notebook

    Navigate to the notebooks/ directory and open matplotlib_tutorial.ipynb, 1-matplotlib_tutorial.ipynb, or 2-seaborn_tutorial.ipynb to begin!
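
Before opening the notebooks, it can help to confirm that the install step worked. The snippet below is an optional sketch that prints the installed version of each library; the package list is an assumption based on the libraries this README mentions, so treat requirements.txt as the authoritative list.

```python
from importlib import metadata

# Package names assumed from the libraries this README mentions;
# requirements.txt remains the authoritative list.
PACKAGES = ["matplotlib", "seaborn", "numpy", "pandas", "notebook"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:<12} {version or 'NOT INSTALLED'}")
```

Run it inside the activated virtual environment; any line reading NOT INSTALLED suggests rerunning `pip install -r requirements.txt`.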


💡 Quick Code Examples

1. Advanced Subplots in Matplotlib

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), dpi=300)

# Left Plot: Sine
ax1.plot(x, np.sin(x), color='#2ca02c', linewidth=2)
ax1.set_title('Sine Wave')
ax1.grid(True, linestyle=':', alpha=0.6)

# Right Plot: Cosine
ax2.plot(x, np.cos(x), color='#d62728', linewidth=2, linestyle='--')
ax2.set_title('Cosine Wave')
ax2.grid(True, linestyle=':', alpha=0.6)

plt.tight_layout()
# Assumes an outputs/ folder exists in the current working directory
plt.savefig('outputs/matplotlib_subplots.png')
plt.show()

2. Relational Plots in Seaborn

import seaborn as sns
import matplotlib.pyplot as plt

# Load an example dataset (downloaded from Seaborn's online data repository on first use)
tips = sns.load_dataset("tips")

# Create a relational scatter plot
plt.figure(figsize=(8, 5))
sns.scatterplot(
    data=tips, 
    x="total_bill", 
    y="tip", 
    hue="smoker", 
    style="time", 
    size="size",
    palette="Set2"
)

plt.title('Tips Analysis by Bill Amount & Smoker Status', fontsize=12, fontweight='bold')
plt.savefig('outputs/seaborn_relational_plot.png', bbox_inches='tight')
plt.show()

🎨 Visualization Best Practices

When styling your plots, always adhere to the following rules:

  • Colors & Contrast: Avoid high-contrast primary colors. Use custom hex codes or Seaborn's preset palettes (like muted, deep, or custom color maps).
  • Readability: Use plt.tight_layout() or bbox_inches='tight' to avoid cutting off axes labels when exporting figures.
  • Information Hierarchy: Highlight critical points (anomalies or limits) using line markers like axvline or text annotations.
  • Density plots: Use Seaborn's kdeplot or semi-transparent markers (alpha) when plotting overlapping scatter distributions.
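
The sketch below applies several of these rules in one place: semi-transparent markers for overlapping points, an `axvline` with an annotation to flag a limit, and a tight export. The data, the threshold value, and the output file name are all hypothetical, chosen only for the example.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1)
# Hypothetical data, invented for illustration only.
x = rng.normal(0, 1, 500)
y = x + rng.normal(0, 0.5, 500)

fig, ax = plt.subplots(figsize=(6, 4))

# A low alpha lets dense regions show up as darker areas.
ax.scatter(x, y, alpha=0.3, color="#4c72b0", edgecolors="none")

# Flag a limit with a vertical line and a short annotation.
limit = 2.0  # assumed threshold
ax.axvline(limit, color="#c44e52", linestyle="--")
ax.annotate("limit", xy=(limit, y.max()), xytext=(5, -10),
            textcoords="offset points", color="#c44e52")

ax.set_xlabel("x")
ax.set_ylabel("y")
fig.tight_layout()

# Create the output folder if needed, then export with a tight bounding box
# so axis labels are not clipped.
out_dir = Path("outputs")
out_dir.mkdir(exist_ok=True)
out_file = out_dir / "best_practices_example.png"  # hypothetical file name
fig.savefig(out_file, dpi=300, bbox_inches="tight")
```

Passing `dpi` to `savefig` rather than to the figure keeps the on-screen size modest while still producing a high-resolution file.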

📚 References & Learning Resources

For more detailed information, API details, and plotting guides, refer to the official Matplotlib and Seaborn documentation.


📜 License

This project is licensed under the MIT License - see the LICENSE file for details.
