A comprehensive dbt project for transforming and modeling Looker e-commerce data using DuckDB as the data warehouse.
This project implements a complete data transformation pipeline for e-commerce analytics, covering customer behavior, product performance, order fulfillment, and inventory management. The project follows dbt best practices with a layered architecture consisting of staging models, marts, snapshots, and seeds.
looker_ecommerce/
├── models/
│ ├── staging/ # Raw data transformations and standardization
│ │ ├── stg_looker__distribution_centers.sql
│ │ ├── stg_looker__events.sql
│ │ ├── stg_looker__inventory_items.sql
│ │ ├── stg_looker__order_items.sql
│ │ ├── stg_looker__orders.sql
│ │ ├── stg_looker__products.sql
│ │ └── stg_looker__users.sql
│ └── marts/ # Business-level analytics models
│ ├── customers.sql
│ └── dim_products.sql
├── seeds/ # Static CSV data files
│ ├── events.csv
│ ├── inventory_items.csv
│ ├── looker__distribution_centers.csv
│ ├── order_items.csv
│ ├── orders.csv
│ ├── products.csv
│ └── users.csv
├── snapshots/ # Historical change tracking
│ ├── inventory_snapshot.sql
│ └── orders_snapshots.sql
├── tests/ # Custom data quality tests
├── macros/ # Reusable SQL macros
└── dbt_project.yml # Project configuration
The staging layer (models/staging/) provides cleaned and standardized versions of source data:
- stg_looker__distribution_centers: Distribution center locations and details
- stg_looker__events: User event tracking data
- stg_looker__inventory_items: Inventory item details and costs
- stg_looker__order_items: Individual line items from orders
- stg_looker__orders: Order header information and status
- stg_looker__products: Product catalog with pricing and categorization
- stg_looker__users: Customer demographic information
The marts layer (models/marts/) contains business-level analytics models:
-
customers: Comprehensive customer analytics including:
- Total amount spent per customer
- Number of items purchased
- Order counts by status (shipped, complete, processing, cancelled, returned)
- First and last order timestamps
- Customer demographics
-
dim_products: Product dimension table featuring:
- Product details (name, category, brand, department, SKU)
- Cost and retail pricing
- Cost of goods sold (COGS)
- Total sales amount
- Distribution center associations
Historical change tracking for slowly changing dimensions:
- inventory_snapshot: Tracks inventory item changes over time using timestamp strategy
- orders_snapshot: Captures order status changes using timestamp strategy
- Python 3.12+
- dbt-duckdb adapter installed
- Virtual environment activated
-
Activate the virtual environment:
source dbt-env/bin/activate -
Install dependencies (if not already installed):
pip install dbt-duckdb
-
Load seed data:
dbt seed
-
Run the models:
dbt run
-
Execute tests:
dbt test -
Run snapshots:
dbt snapshot
# Run all models
dbt run
# Run specific model
dbt run --select customers
# Run models and dependencies
dbt run --select +customers
# Test all models
dbt test
# Generate documentation
dbt docs generate
# Serve documentation
dbt docs serve
# Full refresh (rebuild all incremental models)
dbt run --full-refresh
# Run only staging models
dbt run --select staging
# Run only marts models
dbt run --select martsThis project enables analysis of:
- Customer Metrics: Lifetime value, purchase frequency, order status distribution
- Product Performance: Sales revenue, COGS, profit margins
- Order Analytics: Order status tracking, fulfillment rates
- Inventory Management: Stock levels, inventory turnover
- Distribution: Geographic distribution of inventory and fulfillment
This project uses DuckDB as the data warehouse, providing:
- Fast analytical queries
- Local file-based storage
- ACID compliance
- SQL compatibility
Database file: dev.duckdb
The project includes:
- Source freshness checks
- Data validation tests in
_looker__models.ymland_looker__sources.yml - Custom test cases to ensure data integrity
When adding new models:
- Follow the naming convention:
stg_for staging, descriptive names for marts - Add appropriate tests in the corresponding YAML files
- Document model purposes and column descriptions
- Update this README with new model information
- dbt Documentation
- dbt-duckdb Adapter
- DuckDB Documentation
- dbt Discourse - Q&A and community support
- dbt Slack Community - Live discussions