ivansiase/Spotify_Snowflake_ETLProject

🎵 Spotify Data Pipeline with AWS & Snowflake

This project demonstrates a modern data engineering pipeline that pulls data from the Spotify API, processes it with Python running on AWS Lambda, and automates ingestion into Snowflake using Snowpipe.


🚀 Project Overview

The pipeline runs on a schedule: it extracts data from Spotify, transforms it in Python, and loads it into Snowflake for further analysis.

✅ Key Components

  • Spotify API Integration: Extracts music-related data (e.g. tracks, playlists, artists).
  • AWS Lambda: Hosts the Python script to automate extraction and transformation.
  • AWS CloudWatch Trigger: Automatically invokes the Lambda function on a schedule.
  • Amazon S3: Stores the transformed data files (.csv or .json).
  • Snowpipe: Automatically ingests the files from S3 into Snowflake.
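The extraction half of this flow (Lambda writes the raw Spotify response to S3) can be sketched roughly as follows. This is a minimal illustration, not the code in `Extract_Spotify_Data.py`: the function names, the file-name pattern, and the bucket name are assumptions, and the S3 client is passed in so the sketch runs without AWS credentials. In a real Lambda function the client would typically come from `boto3.client("s3")`.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Folder name taken from the project structure; the file-name pattern is an assumption.
RAW_PREFIX = "raw-data/to_process/"


def build_raw_key(now: datetime) -> str:
    """Return a timestamped S3 key for one raw extract."""
    return f"{RAW_PREFIX}spotify_raw_{now:%Y%m%dT%H%M%S}.json"


def store_raw_extract(s3_client, bucket: str, payload: dict,
                      now: Optional[datetime] = None) -> str:
    """Serialize the API response as JSON, write it to S3, and return the key used."""
    now = now or datetime.now(timezone.utc)
    key = build_raw_key(now)
    # put_object is the boto3 S3 client call; any object exposing it works here.
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
    )
    return key
```

Writing each run to its own timestamped key keeps scheduled invocations from overwriting one another, and it leaves a clear set of files for the transformation step to pick up.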

🛠️ Technologies Used

  • Programming: Python 3
  • Cloud Services: AWS Lambda, S3, IAM, CloudWatch
  • Data Warehouse: Snowflake
  • Data Integration: Spotify Web API, Snowpipe
  • Others: boto3, requests

📁 Project Structure

spotify-data-pipeline/
├── AWS lambda/
│   ├── Extract_Spotify_Data.py
│   └── Transform_Spotify_Data.py
├── snowflake/
│   ├── Snowflake_SQL.sql
│   └── Storage_Integration.sql
├── S3-Sample-Output/
├── raw-data/                         ← stores unprocessed data pulled from Spotify API
│   ├── processed/
│   └── to_process/
├── transform-data/                   ← stores cleaned, transformed datasets ready for loading into Snowflake
│   ├── album_data/
│   ├── artist_data/
│   └── songs_data/
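To illustrate how one raw file might be split into the three `transform-data/` folders, here is a hedged sketch. It assumes the raw payload has the shape of a Spotify Web API playlist-items response (`items[].track` with nested `album` and `artists`); the function names and output column names are hypothetical, and the repository's `Transform_Spotify_Data.py` may do this differently.

```python
import csv
import io


def split_playlist(payload: dict) -> dict:
    """Split a playlist-items payload into album, artist, and song rows."""
    albums, artists, songs = {}, {}, []
    for item in payload.get("items", []):
        track = item.get("track")
        if not track:  # skip entries without track data
            continue
        album = track["album"]
        # Keying by id de-duplicates albums and artists shared across tracks.
        albums[album["id"]] = {
            "album_id": album["id"],
            "album_name": album["name"],
            "release_date": album["release_date"],
            "total_tracks": album["total_tracks"],
        }
        for artist in track["artists"]:
            artists[artist["id"]] = {
                "artist_id": artist["id"],
                "artist_name": artist["name"],
            }
        songs.append({
            "song_id": track["id"],
            "song_name": track["name"],
            "duration_ms": track["duration_ms"],
            "popularity": track["popularity"],
            "added_at": item["added_at"],
            "album_id": album["id"],
            # Only the first listed artist is kept on the song row.
            "artist_id": track["artists"][0]["id"] if track["artists"] else None,
        })
    return {
        "album_data": list(albums.values()),
        "artist_data": list(artists.values()),
        "songs_data": songs,
    }


def rows_to_csv(rows: list) -> str:
    """Render a list of uniform dicts as CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Each of the three lists can then be written as its own CSV file under the matching folder, which gives Snowpipe one stable file layout per target table.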

📽️ Demo Video

You can watch the demo video here: Spotify_Pipeline.mkv


📚 Acknowledgement

This project was inspired by the course "Data Warehouse for Data Engineering with Snowflake" by Darshil Parmar, which provided the foundational concepts and structure for integrating Spotify, AWS, and Snowflake.

All implementation, customization, and additional enhancements in this repository are my own work and reflect my personal understanding and learning.
