Skip to content

Ppius6/dbt-ecommerce

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Looker E-commerce Analytics - dbt Project

A comprehensive dbt project for transforming and modeling Looker e-commerce data using DuckDB as the data warehouse.

Project Overview

This project implements a complete data transformation pipeline for e-commerce analytics, covering customer behavior, product performance, order fulfillment, and inventory management. The project follows dbt best practices with a layered architecture consisting of staging models, marts, snapshots, and seeds.

Project Structure

looker_ecommerce/
├── models/
│   ├── staging/           # Raw data transformations and standardization
│   │   ├── stg_looker__distribution_centers.sql
│   │   ├── stg_looker__events.sql
│   │   ├── stg_looker__inventory_items.sql
│   │   ├── stg_looker__order_items.sql
│   │   ├── stg_looker__orders.sql
│   │   ├── stg_looker__products.sql
│   │   └── stg_looker__users.sql
│   └── marts/             # Business-level analytics models
│       ├── customers.sql
│       └── dim_products.sql
├── seeds/                 # Static CSV data files
│   ├── events.csv
│   ├── inventory_items.csv
│   ├── looker__distribution_centers.csv
│   ├── order_items.csv
│   ├── orders.csv
│   ├── products.csv
│   └── users.csv
├── snapshots/            # Historical change tracking
│   ├── inventory_snapshot.sql
│   └── orders_snapshots.sql
├── tests/                # Custom data quality tests
├── macros/              # Reusable SQL macros
└── dbt_project.yml      # Project configuration

Data Models

Staging Layer

The staging layer (models/staging/) provides cleaned and standardized versions of source data:

  • stg_looker__distribution_centers: Distribution center locations and details
  • stg_looker__events: User event tracking data
  • stg_looker__inventory_items: Inventory item details and costs
  • stg_looker__order_items: Individual line items from orders
  • stg_looker__orders: Order header information and status
  • stg_looker__products: Product catalog with pricing and categorization
  • stg_looker__users: Customer demographic information

Marts Layer

The marts layer (models/marts/) contains business-level analytics models:

  • customers: Comprehensive customer analytics including:

    • Total amount spent per customer
    • Number of items purchased
    • Order counts by status (shipped, complete, processing, cancelled, returned)
    • First and last order timestamps
    • Customer demographics
  • dim_products: Product dimension table featuring:

    • Product details (name, category, brand, department, SKU)
    • Cost and retail pricing
    • Cost of goods sold (COGS)
    • Total sales amount
    • Distribution center associations

Snapshots

Historical change tracking for slowly changing dimensions:

  • inventory_snapshot: Tracks inventory item changes over time using timestamp strategy
  • orders_snapshot: Captures order status changes using timestamp strategy

Getting Started

Prerequisites

  • Python 3.12+
  • dbt-duckdb adapter installed
  • Virtual environment activated

Setup

  1. Activate the virtual environment:

    source dbt-env/bin/activate
  2. Install dependencies (if not already installed):

    pip install dbt-duckdb
  3. Load seed data:

    dbt seed
  4. Run the models:

    dbt run
  5. Execute tests:

    dbt test
  6. Run snapshots:

    dbt snapshot

🔧 Common Commands

# Run all models
dbt run

# Run specific model
dbt run --select customers

# Run models and dependencies
dbt run --select +customers

# Test all models
dbt test

# Generate documentation
dbt docs generate

# Serve documentation
dbt docs serve

# Full refresh (rebuild all incremental models)
dbt run --full-refresh

# Run only staging models
dbt run --select staging

# Run only marts models
dbt run --select marts

Key Metrics & Insights

This project enables analysis of:

  • Customer Metrics: Lifetime value, purchase frequency, order status distribution
  • Product Performance: Sales revenue, COGS, profit margins
  • Order Analytics: Order status tracking, fulfillment rates
  • Inventory Management: Stock levels, inventory turnover
  • Distribution: Geographic distribution of inventory and fulfillment

Database

This project uses DuckDB as the data warehouse, providing:

  • Fast analytical queries
  • Local file-based storage
  • ACID compliance
  • SQL compatibility

Database file: dev.duckdb

Data Quality

The project includes:

  • Source freshness checks
  • Data validation tests in _looker__models.yml and _looker__sources.yml
  • Custom test cases to ensure data integrity

Contributing

When adding new models:

  1. Follow the naming convention: stg_ for staging, descriptive names for marts
  2. Add appropriate tests in the corresponding YAML files
  3. Document model purposes and column descriptions
  4. Update this README with new model information

Resources

About

A dbt-based e-commerce analytics pipeline using DuckDB. Transforms raw Looker e-commerce data into staging models and dimensional marts with SCD Type 2 snapshots for product tracking.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors