Paulchen-git/my-exp-tracker

My Experiment Tracker

A lightweight, self-hosted experiment tracking system inspired by Weights & Biases (W&B). Built from scratch, this tool helps you log, visualize, and manage machine learning experiments with a simple API and intuitive CLI.

Features

  • 📊 Metric Logging: Log training metrics, validation scores, and custom metrics with step tracking
  • 🎯 Project Management: Organize experiments into projects
  • 🔄 Run Tracking: Track individual experiment runs with configurations
  • 🖥️ Web Dashboard: Visualize experiments in a clean, modern interface
  • 🐍 Python Client: Easy-to-use Python SDK for logging metrics
  • 💻 CLI Interface: Command-line tool for managing projects and runs
  • 🐳 Docker Support: One-command deployment with Docker Compose
  • 📦 PostgreSQL Backend: TimescaleDB for efficient time-series data storage

Architecture

┌────────────────────────────────────────────────────┐
│                   User Machines                    │
│   ┌──────────────────┐      ┌──────────────────┐   │
│   │    Python CLI    │      │   Python Code    │   │
│   │ (experiments.py) │      │    w/ Tracker    │   │
│   └────────┬─────────┘      └────────┬─────────┘   │
│            │                         │             │
└────────────┼─────────────────────────┼─────────────┘
             │      HTTP REST API      │
             │                         │
┌────────────▼─────────────────────────▼─────────────┐
│                    Tracker Host                    │
│                                                    │
│   ┌────────────────────────────────────────────┐   │
│   │  Dashboard (Port 8080)                     │   │
│   │  - Project management UI                   │   │
│   │  - Experiment visualization                │   │
│   │  - Metric charts                           │   │
│   └──────────────────────┬─────────────────────┘   │
│                          │                         │
│   ┌──────────────────────▼─────────────────────┐   │
│   │  API Server (Port 8500)                    │   │
│   │  - REST endpoints for projects/runs        │   │
│   │  - Metric ingestion                        │   │
│   │  - Data validation & business logic        │   │
│   └──────────────────────┬─────────────────────┘   │
│                          │                         │
│   ┌──────────────────────▼─────────────────────┐   │
│   │  PostgreSQL + TimescaleDB                  │   │
│   │  - Project metadata                        │   │
│   │  - Run configurations                      │   │
│   │  - Time-series metrics                     │   │
│   └────────────────────────────────────────────┘   │
│                                                    │
└────────────────────────────────────────────────────┘

Quick Start

For Users: Using an Already Deployed Tracker

If your tracking server is already deployed (e.g., running at http://tracker.example.com:8080), follow these steps to start logging experiments.

Installation

Clone the repository and install in development mode:

git clone https://github.com/yourusername/my-exp-tracker.git
cd my-exp-tracker
pip install -e .

This installs the my-exp-tracker package along with all dependencies, making the track CLI command available globally.

Using the Python Client

The easiest way to log metrics during an experiment is using the ExperimentTracker SDK:

from my_exp_tracker import ExperimentTracker

# Initialize tracker
# Tip: Get RUN_ID from the web dashboard after creating a new run
tracker = ExperimentTracker(
    tracker_url="http://tracker.example.com:8500",
    run_id=123,  # Your run ID
    api_token="your-secret-token"  # Required for authentication
)

# Log metrics during training
for step in range(1000):
    loss = calculate_loss()  # Your loss calculation
    accuracy = calculate_accuracy()  # Your accuracy calculation
    
    tracker.log_metrics(
        metric_name="train/loss",
        metric_value=loss,
        step=step
    )
    tracker.log_metrics(
        metric_name="train/accuracy",
        metric_value=accuracy,
        step=step
    )

Using the CLI

Manage experiments from the command line:

# List all projects
track project list --tracker-url http://tracker.example.com:8500 --api-token your-secret-token

# Create a new project
track project create \
  --tracker-url http://tracker.example.com:8500 \
  --api-token your-secret-token \
  --name "My Cool Experiment" \
  --description "Testing new architecture"

# Create a new run in a project
track run create \
  --tracker-url http://tracker.example.com:8500 \
  --api-token your-secret-token \
  --project-id 1 \
  --name "Run 1" \
  --config "lr=0.001,batch_size=64"

# List all runs in a project
track run list \
  --tracker-url http://tracker.example.com:8500 \
  --api-token your-secret-token \
  --project-id 1
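The --config flag takes comma-separated key=value pairs. How the tracker itself parses that string is not documented here, so the following is only a sketch of one plausible interpretation. The names parse_config and _coerce are hypothetical helpers, not part of the package.

```python
from typing import Dict, Union

Scalar = Union[int, float, str]


def _coerce(value: str) -> Scalar:
    """Try int, then float, and fall back to the raw string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_config(raw: str) -> Dict[str, Scalar]:
    """Parse 'lr=0.001,batch_size=64' into a dict (hypothetical helper)."""
    config: Dict[str, Scalar] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        config[key.strip()] = _coerce(value.strip())
    return config


print(parse_config("lr=0.001,batch_size=64"))
```

Coercing values to numbers is a design choice of this sketch; the server may well store every value as a string.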

Setting Default Tracker URL and API Token

To avoid passing --tracker-url and --api-token every time, set environment variables:

export TRACKER_URL=http://tracker.example.com:8500
export API_TOKEN=your-secret-token

# Now you can omit the flags
track project list
track project create --name "My Project"

Or create a .env file in your project directory with these variables, and they'll be loaded automatically.
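The exact lookup order is not spelled out above. As a minimal sketch, assuming that real environment variables take precedence over .env entries, the resolution could look like this. The names load_dotenv_text and resolve_settings are illustrative, not the package's API.

```python
import os
from typing import Dict, Mapping, Optional


def load_dotenv_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comment lines."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_settings(
    dotenv: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Return TRACKER_URL and API_TOKEN, preferring the process environment."""
    environ = os.environ if environ is None else environ
    return {
        name: environ.get(name, dotenv.get(name))
        for name in ("TRACKER_URL", "API_TOKEN")
    }


dotenv = load_dotenv_text(
    "TRACKER_URL=http://tracker.example.com:8500\n"
    "API_TOKEN=your-secret-token\n"
)
print(resolve_settings(dotenv, environ={"API_TOKEN": "from-shell"}))
```

This sketch does not strip trailing inline comments from values, so a line such as KEY=value  # note would keep the comment text.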

Web Dashboard

Open your browser and navigate to the dashboard, which is served on port 8080 (the API listens on port 8500):

http://tracker.example.com:8080

The dashboard provides:

  • 📋 Project overview
  • 📊 Real-time metric visualization

For Developers/Deployers: Setting Up Your Own Tracker

Prerequisites

  • Docker and Docker Compose

Quick Deploy with Docker Compose

This is the easiest way to get started:

# Clone the repository
git clone https://github.com/yourusername/my-exp-tracker.git
cd my-exp-tracker

# Create environment configuration
cp .env.example src/my_exp_tracker/.env

# Edit .env to set secure credentials for production
# ⚠️ IMPORTANT: Change API_TOKEN to a strong, randomly generated value
# Generate: openssl rand -base64 32
nano src/my_exp_tracker/.env

# Start all services
docker-compose -f src/my_exp_tracker/docker-compose.yml up -d

# Verify services are running
docker-compose -f src/my_exp_tracker/docker-compose.yml ps

The tracker will be available at:

  • Dashboard: http://localhost:8080
  • API: http://localhost:8500

Environment Variables

Create a .env file from .env.example (the deploy commands above place it at src/my_exp_tracker/.env):

# Database credentials
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_secure_password_here  # CHANGE THIS!
POSTGRES_DB=myexp_tracker_db
POSTGRES_HOST=myexp-tracker-db
POSTGRES_PORT=5432

# API Configuration
API_HOST=myexp-tracker-api
API_PORT=8500

# API Authentication Token (Bearer token)
# ⚠️ REQUIRED: Set to a strong, random value
# Generate: openssl rand -base64 32
API_TOKEN=your_strong_random_secret_token_here

# Tracker URL
TRACKER_URL=http://localhost:8500

# Dashboard Configuration
DASHBOARD_PORT=8080

⚠️ Security: In production, make sure you:

  • Set a strong API_TOKEN (20+ characters, randomly generated)
  • Set a strong POSTGRES_PASSWORD
  • Use HTTPS instead of HTTP
  • Restrict network access with a firewall or reverse proxy
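Where openssl is not available, Python's standard secrets module can generate a comparable token. A small sketch; generate_api_token is an illustrative name, not something the project ships.

```python
import secrets


def generate_api_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of OS entropy."""
    return secrets.token_urlsafe(num_bytes)


# Paste the printed line into your .env file.
print(f"API_TOKEN={generate_api_token()}")
```

With 32 random bytes the token is well above the 20-character minimum recommended above.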

Usage Examples

Example 1: Training a Neural Network with Tracking

Warning: a run must exist before you can log metrics to it, and every run belongs to a project. Create the project first, then the run (both via the CLI; see the Using the CLI section), before running this script.

import os
import torch
from torch import nn
from my_exp_tracker import ExperimentTracker

# Get configuration from environment
TRACKER_URL = os.environ.get("TRACKER_URL", "http://localhost:8500")
RUN_ID = int(os.environ.get("RUN_ID", "1"))
API_TOKEN = os.environ.get("API_TOKEN")  # Must be set in the environment (for example via .env)

# Initialize tracker
tracker = ExperimentTracker(tracker_url=TRACKER_URL, run_id=RUN_ID, api_token=API_TOKEN)

# Your training loop
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for epoch in range(100):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)
    
    optimizer.zero_grad()
    pred = model(x)
    loss = loss_fn(pred, y)
    loss.backward()
    optimizer.step()
    
    # Log metrics to tracker
    tracker.log_metrics(
        metric_name="train/loss",
        metric_value=float(loss),
        step=epoch
    )
    
    print(f"Epoch {epoch}: Loss = {loss:.4f}")

Example 2: Using Configuration Files

# Create a run with configuration
track run create \
  --tracker-url http://localhost:8500 \
  --project-id 1 \
  --name "Experiment with Config" \
  --config-path configs/run.yaml

Example 3: Batch Experiments (a sweep feature is planned to replace bash scripts like this one)

#!/bin/bash

# Run multiple experiments with different learning rates
for lr in 0.001 0.01 0.1; do
    echo "Starting experiment with lr=$lr"
    
    # Create a new run
    RUN_OUTPUT=$(track run create \
      --tracker-url http://localhost:8500 \
      --project-id 1 \
      --name "LR Sweep $lr" \
      --config "lr=$lr,batch_size=32")
    
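    # Extract the run ID (assumes the CLI prints JSON containing "id": <number>)
    RUN_ID=$(echo "$RUN_OUTPUT" | grep -o '"id": [0-9]*' | grep -o '[0-9]*')
    if [ -z "$RUN_ID" ]; then
        echo "Could not parse a run ID from: $RUN_OUTPUT" >&2
        continue
    fi

    python train.py --lr "$lr" --run-id "$RUN_ID"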
done
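The grep-based extraction in the script assumes that track run create prints JSON containing an "id" field. Under that same assumption, the ID can be parsed more robustly in Python. This is a sketch, and extract_run_id is a hypothetical helper.

```python
import json
import re
from typing import Optional


def extract_run_id(cli_output: str) -> Optional[int]:
    """Return the run ID found in CLI output, or None if there is none.

    Assumes the output is, or contains, JSON with an "id" field.
    """
    try:
        payload = json.loads(cli_output)
        if isinstance(payload, dict) and isinstance(payload.get("id"), int):
            return payload["id"]
    except json.JSONDecodeError:
        pass
    # Fall back to the same pattern the bash script greps for.
    match = re.search(r'"id":\s*(\d+)', cli_output)
    return int(match.group(1)) if match else None


print(extract_run_id('{"id": 7, "name": "LR Sweep 0.01"}'))
```

Returning None instead of raising lets a sweep script skip a failed run creation and continue with the next learning rate.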

Example 4: Logging Multiple Metrics

from my_exp_tracker import ExperimentTracker

tracker = ExperimentTracker(
    tracker_url="http://localhost:8500",
    run_id=123,
    api_token="your-secret-token"  # Required for authentication
)

# Log different types of metrics
metrics = {
    "train/loss": 0.45,
    "train/accuracy": 0.92,
    "val/loss": 0.48,
    "val/accuracy": 0.90,
    "learning_rate": 0.001,
    "gpu_memory_mb": 4096,
}

step = 100  # Training step (or epoch) these values belong to

for metric_name, metric_value in metrics.items():
    tracker.log_metrics(
        metric_name=metric_name,
        metric_value=metric_value,
        step=step
    )
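The loop in this example can be wrapped in a small helper so that every metric in a dictionary is logged at the same step. The sketch below relies only on the log_metrics(metric_name, metric_value, step) signature shown in this README. The names log_metrics_dict and RecordingTracker are illustrative, and RecordingTracker is a stand-in that records calls instead of contacting a server.

```python
from typing import List, Mapping, Protocol, Tuple


class MetricLogger(Protocol):
    """Anything exposing the log_metrics signature used in this README."""

    def log_metrics(self, metric_name: str, metric_value: float, step: int) -> None: ...


def log_metrics_dict(tracker: MetricLogger, metrics: Mapping[str, float], step: int) -> int:
    """Log each entry of metrics at the same step; return how many were sent."""
    for name, value in metrics.items():
        tracker.log_metrics(metric_name=name, metric_value=float(value), step=step)
    return len(metrics)


class RecordingTracker:
    """Stand-in for ExperimentTracker that records calls locally."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, float, int]] = []

    def log_metrics(self, metric_name: str, metric_value: float, step: int) -> None:
        self.calls.append((metric_name, metric_value, step))


stub = RecordingTracker()
log_metrics_dict(stub, {"train/loss": 0.45, "val/loss": 0.48}, step=10)
print(stub.calls)
```

In real use, an ExperimentTracker instance would be passed in place of the stub.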

API Reference

Authentication

All API endpoints except /health require Bearer token authentication, passed in the Authorization header:

Authorization: Bearer YOUR_API_TOKEN

The API_TOKEN is set in your .env file and should be a strong, randomly generated secret. Clients must provide this token with every request, or they will receive a 401 Unauthorized response.
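How the server validates the header is not shown in this README. A generic check for a single shared Bearer token might look like the sketch below; is_authorized is a hypothetical function, not the project's implementation.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True if the header is 'Bearer <token>' carrying the expected token."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return False
    # compare_digest avoids leaking how many characters matched through timing.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


print(is_authorized("Bearer your-secret-token", "your-secret-token"))
```

A request for which this returns False would be answered with 401 Unauthorized.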

Public Endpoints (no authentication required):

  • GET /health - Health check for monitoring
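Because /health needs no token, it is convenient for waiting until the API is up, for example right after starting the containers. The following sketch polls the endpoint using only the standard library. The function names, retry count, and delay are assumptions, and the probe parameter exists so the logic can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll url until probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example call (needs a running tracker):
# wait_until_healthy("http://localhost:8500/health")
```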

Endpoints

Projects

  • GET /projects - List all projects
  • POST /projects - Create a new project
  • GET /projects/<id> - Get project details
  • DELETE /projects/<id> - Delete a project

Runs

  • GET /runs - List all runs (optionally filter by project_id)
  • POST /runs - Create a new run
  • GET /runs/<id> - Get run details
  • DELETE /runs/<id> - Delete a run

Metrics

  • POST /metrics - Log a metric
  • GET /metrics/<run_id> - Get all metrics for a run

Example API Requests

# Set your API token
API_TOKEN="your-secret-token"

# Health check (no auth required)
curl http://localhost:8500/health

# Create a project (requires auth)
curl -X POST http://localhost:8500/projects \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Project", "description": "ML Experiments"}'

# Create a run (requires auth)
curl -X POST http://localhost:8500/runs \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"project_id": 1, "name": "Run 1", "config": "lr=0.001"}'

# Log a metric (requires auth)
curl -X POST http://localhost:8500/metrics \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"run_id": 1, "metric_name": "train/loss", "value": 0.45, "step": 100}'

# Get metrics for a run (requires auth)
curl -H "Authorization: Bearer $API_TOKEN" http://localhost:8500/metrics/1
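The same metric request can be built from Python without extra dependencies. This sketch mirrors the curl call for POST /metrics, including its JSON field names, and only constructs the request. Sending it requires a running tracker. build_metric_request is an illustrative name.

```python
import json
import urllib.request


def build_metric_request(
    base_url: str,
    api_token: str,
    run_id: int,
    metric_name: str,
    value: float,
    step: int,
) -> urllib.request.Request:
    """Build (but do not send) a POST /metrics request with Bearer auth."""
    body = json.dumps(
        {"run_id": run_id, "metric_name": metric_name, "value": value, "step": step}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/metrics",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


request = build_metric_request(
    "http://localhost:8500", "your-secret-token", 1, "train/loss", 0.45, 100
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=5)
```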

Project Structure

my-exp-tracker/
├── README.md                        # This file
├── pyproject.toml                   # Project configuration
├── requirements.txt                 # Production dependencies
├── requirements-dev.txt             # Development dependencies
├── .env.example                     # Environment variable template
├── configs/
│   ├── run.yaml                     # Example run configuration
│   └── sweep.yaml                   # Example sweep configuration
├── scripts/
│   └── dummy_experiment.py          # Example training script
├── src/my_exp_tracker/
│   ├── __init__.py
│   ├── cli.py                       # CLI entry point
│   ├── docker-compose.yml           # Docker Compose configuration
│   ├── api/
│   │   ├── app.py                   # Flask API application
│   │   ├── Dockerfile               # API container configuration
│   │   ├── requirements.txt          # API dependencies
│   │   ├── db/
│   │   │   ├── __init__.py
│   │   │   └── connection.py        # Database connection pool
│   │   ├── services/
│   │   │   ├── projects.py          # Project business logic
│   │   │   ├── runs.py              # Run business logic
│   │   │   ├── metrics.py           # Metrics business logic
│   │   │   ├── errors.py            # Custom exceptions
│   │   │   └── constants.py         # Constants
│   │   └── validation/
│   │       ├── projects.py          # Project validation
│   │       ├── runs.py              # Run validation
│   │       └── metrics.py           # Metric validation
│   ├── dashboard/
│   │   ├── app.py                   # Flask dashboard application
│   │   ├── Dockerfile               # Dashboard container
│   │   ├── requirements.txt          # Dashboard dependencies
│   │   ├── api_client.py            # HTTP client for API
│   │   ├── static/
│   │   │   ├── app.js               # Dashboard JavaScript
│   │   │   └── style.css            # Dashboard styling
│   │   └── templates/
│   │       └── index.html           # Dashboard HTML
│   ├── database/
│   │   └── init.sql                 # Database schema initialization
│   ├── tracker/
│   │   └── tracker.py               # Python SDK
│   └── utils/
│       └── utils.py                 # Utility functions

Development

Setting Up Development Environment

# Clone the repository
git clone https://github.com/yourusername/my-exp-tracker.git
cd my-exp-tracker

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements-dev.txt
pip install -e .

Security Considerations

For Users

  • Keep your TRACKER_URL and API_TOKEN private
  • Don't log sensitive information in metrics
  • Use HTTPS in production

For Deployers

  • Change default credentials before deploying
  • Use environment variables for sensitive data
  • Run behind a reverse proxy with HTTPS
  • Implement authentication/authorization if needed
  • Regularly backup your database
  • Monitor and log API access
  • Use strong database passwords
  • Restrict database network access
  • Keep dependencies updated

Do NOT commit to version control

  • .env files with actual credentials
  • Database passwords
  • API keys
  • Any sensitive configuration

Use .env.example as a template instead.


Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.


Support

For issues, questions, or suggestions:

  • Open an issue on GitHub
  • Check existing documentation
  • Review example scripts in /scripts/

Roadmap

  • Run comparison and visualization
  • Experiment grouping (sweeps)
  • Data export (CSV, JSON)
  • Advanced filtering and search
  • Metric aggregation and statistics

Acknowledgments

Inspired by Weights & Biases. Built as a learning exercise and as a self-hosted alternative for ML experiment tracking.
