dhruv-bhalodia/Model-Version-Control

mlvc — Semantic Version Control for ML Models

Git-style version control for machine learning models, with ML-specific intelligence built in: metric comparison, anomaly detection, data drift monitoring, and a natural language query layer over your version history.


What it does

When you train a new model, you mlvc commit it. mlvc stores the artifact in S3-compatible storage (MinIO locally), records every metric, hyperparameter, dataset, and environment snapshot in PostgreSQL, and lets you mlvc diff, mlvc rollback, mlvc deploy, and mlvc query your history — just like Git, but for models.


Features

Feature                 What it does
Commit                  Version any model file (sklearn, PyTorch, XGBoost, etc.) with metrics, hyperparameters, dataset info, and author
Log                     Full version history with metric columns
Show                    Detailed view of any version
Diff                    Side-by-side metric/feature/hyperparameter comparison between two versions. Detects precision-recall tradeoffs automatically
Rollback                Record a rollback intent with reason + audit trail. Gives you the artifact URI; you deploy it yourself
Deploy                  Mark a version as active in production / staging / canary
Anomaly detection       Warns on commit if any metric falls outside the 2σ historical baseline
Slice evaluation        Evaluate model performance per demographic/feature slice on commit
Drift detection         Window-based data drift via Kafka + Evidently AI (PSI, KS tests). Alerts to Slack
Natural language query  Ask questions about your version history in plain English using an LLM
React dashboard         Web UI for all of the above

What rollback does NOT do

mlvc rollback is an audit trail command, not an auto-deployment trigger. It records who rolled back, from which version, to which version, and why. It then prints the artifact URI for the target version — you pull that artifact and redeploy it in your own pipeline. mlvc never touches your serving infrastructure.


Architecture

mlvc/
├── cli/          Click commands (init, commit, log, show, diff, rollback, deploy, query)
├── core/         CommitManager, MetricsEngine, BaselineTracker, SliceEvaluator, RollbackManager
├── storage/      ArtifactStore (S3/MinIO)  •  MetadataStore (PostgreSQL)  •  Cache (Redis)
├── monitoring/   DriftDetector (Kafka + Evidently)  •  ModelSidecar  •  Slack alerts
├── query/        QueryEngine (LLM + RAG over version history)
└── api/          FastAPI REST API

dashboard/        React + Vite web UI

Core services (required): PostgreSQL, Redis, MinIO
Optional services: Kafka (drift monitoring), LLM API key (natural language query)

All optional services degrade gracefully — everything else keeps working if they are not configured.


Prerequisites

  • Python 3.10+
  • Docker + Docker Compose
  • Node.js 18+ (only if you want to run the dashboard)

Quick Start

1. Clone and install

git clone https://github.com/yourusername/mlvc.git
cd mlvc
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

2. Configure

cp .env.example .env
# Edit .env if needed — defaults work for the local Docker setup

3. Start infrastructure

# Core services — required
docker compose -f docker-compose.yml up -d postgres redis

# Additional services (MinIO is required for artifact storage; Kafka and Neo4j are optional)
docker compose -f docker-compose.dev.yml up -d

Wait a few seconds for PostgreSQL to be ready, then run migrations:

alembic upgrade head

Create the MinIO bucket (first time only):

# Option A: via the MinIO container
docker exec <minio-container-name> sh -c "
  mc alias set local http://localhost:9000 minioadmin minioadmin &&
  mc mb local/mlvc-local-dev
"

# Option B: open http://localhost:9001 and create the bucket manually

4. Initialize a project

mlvc init --name fraud-detector --description "Credit card fraud model"

This creates .mlvc/config.json in your working directory and registers the project in the database. Subsequent commands auto-detect the project from this file.
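As a rough illustration of that auto-detection, a helper script could read the same file. This is only a sketch: the layout of .mlvc/config.json is not documented here, so the "name" key and the `detect_project` function are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Optional


def detect_project(start: Path, key: str = "name") -> Optional[str]:
    """Return the project name stored in <start>/.mlvc/config.json, or None.

    The "name" key is an assumed field; check your own config.json
    for the key that mlvc actually writes.
    """
    config_path = start / ".mlvc" / "config.json"
    if not config_path.is_file():
        return None  # not an initialized mlvc working directory
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    return config.get(key)
```

Calling `detect_project(Path.cwd())` from a training script would then give the same project that the CLI picks up, provided the assumed key matches.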

5. Commit your first model

mlvc commit model.pkl \
  --message "Baseline logistic regression" \
  --accuracy 0.923 \
  --f1 0.871 \
  --precision 0.884 \
  --recall 0.859 \
  --dataset "transactions_v1" \
  --hyperparams '{"C": 1.0, "max_iter": 200}' \
  --features '["amount", "merchant_category", "hour_of_day"]'

You will see the version number, hash, and any anomaly warnings.
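If you commit from a training script rather than a shell, the same command can be assembled in Python. The sketch below uses only the flags shown above; `build_commit_command` is a hypothetical helper and not part of mlvc.

```python
import json
from typing import Dict, List, Sequence


def build_commit_command(
    model_path: str,
    message: str,
    metrics: Dict[str, float],
    hyperparams: Dict[str, object],
    features: Sequence[str],
    dataset: str,
) -> List[str]:
    """Assemble an `mlvc commit` argument list from in-memory values."""
    cmd = ["mlvc", "commit", model_path, "--message", message]
    # Only the metrics with a dedicated flag in the example above are mapped.
    for name in ("accuracy", "f1", "precision", "recall"):
        if name in metrics:
            cmd += [f"--{name}", str(metrics[name])]
    cmd += ["--dataset", dataset]
    # JSON-encode structured values, as the CLI expects JSON strings.
    cmd += ["--hyperparams", json.dumps(hyperparams)]
    cmd += ["--features", json.dumps(list(features))]
    return cmd
```

Passing the returned list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with the JSON arguments; it requires mlvc on your PATH and the core services running.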


CLI Reference

mlvc init

mlvc init --name <project-name> [--description "..."] [--author "..."]

Registers the project in the database and writes .mlvc/config.json.


mlvc commit

mlvc commit <model_path> --message "..." [options]

Option              Description
--message, -m       Why was this version created? (required)
--project           Project name (auto-detected from .mlvc/config.json)
--accuracy          Accuracy metric (0–1)
--f1                F1 score
--precision         Precision
--recall            Recall
--auc-roc           AUC-ROC
--metrics-json      Any additional metrics as JSON: '{"mse": 0.04}'
--dataset           Dataset name
--dataset-version   Dataset version string
--hyperparams       Hyperparameters as JSON: '{"lr": 0.001}'
--features          Feature list as JSON array: '["age", "income"]'
--parent            Parent version identifier (for fine-tuned models)
--eval-dataset      Path to CSV for slice evaluation
--target-column     Target column in eval dataset
--slice-columns     Comma-separated columns to slice by
--author            Author name (defaults to $USER)

On commit, mlvc:

  1. Detects the model format (sklearn/joblib, PyTorch, XGBoost, etc.)
  2. SHA-256 hashes the file and deduplicates in MinIO — no wasted storage if you commit the same weights twice
  3. Stores all metadata in PostgreSQL
  4. Updates metric baselines and warns if any metric is outside the 2σ historical range
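Steps 2 and 4 above can be sketched in a few lines of standard-library Python. This is an illustration of the two ideas (content hashing and a 2σ check), not mlvc's actual code; the function names, the use of the sample standard deviation, and the minimum-history rule are all assumptions.

```python
import hashlib
import statistics
from typing import Sequence


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model artifacts never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outside_two_sigma(history: Sequence[float], value: float) -> bool:
    """Return True if `value` is more than 2 standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # assumption: too little history to estimate a spread
    mean = statistics.fmean(history)
    sigma = statistics.stdev(history)  # sample standard deviation (an assumed choice)
    return abs(value - mean) > 2 * sigma
```

Because identical bytes always produce the same digest, using the digest as the storage key is one way to get the deduplication described in step 2.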

mlvc log

mlvc log --project fraud-detector [--limit 20]

Lists version history with version number, hash, author, date, commit message, and key metrics.


mlvc show

mlvc show v3 --project fraud-detector
# Also accepts: version number (3), hash prefix (abc12345), or "latest"

Shows full details: all metrics, hyperparameters, feature list, environment snapshot, dataset info.
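The accepted identifier forms can be told apart with a small classifier like the one below. It is a sketch only: how mlvc itself resolves an ambiguous input (for example an all-digit hash prefix) is not documented here, so the precedence used, the minimum prefix length, and the name `classify_version_ref` are assumptions.

```python
import re
from typing import Optional, Tuple, Union


def classify_version_ref(ref: str) -> Tuple[str, Optional[Union[int, str]]]:
    """Classify a version reference as 'latest', 'number', or 'hash_prefix'."""
    text = ref.strip().lower()
    if text == "latest":
        return ("latest", None)
    # "v3" and "3" both mean version number 3.
    match = re.fullmatch(r"v?(\d+)", text)
    if match:
        return ("number", int(match.group(1)))
    # Anything else that looks like hex is treated as a hash prefix.
    if re.fullmatch(r"[0-9a-f]{4,64}", text):
        return ("hash_prefix", text)
    raise ValueError(f"unrecognized version reference: {ref!r}")
```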


mlvc diff

mlvc diff v3 v4 --project fraud-detector

Side-by-side comparison of:

  • All metrics with direction arrows and regression flags
  • Features added / removed
  • Hyperparameters changed

If precision went up while recall went down, mlvc recognises this as an intentional tradeoff and does not flag either as a regression.
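The tradeoff rule above can be expressed roughly as follows. This sketch assumes every metric is higher-is-better and uses a hypothetical `diff_metrics` helper; mlvc's real regression logic and thresholds may differ.

```python
from typing import Dict, Tuple


def diff_metrics(
    old: Dict[str, float], new: Dict[str, float], tolerance: float = 0.0
) -> Tuple[Dict[str, Dict[str, object]], bool]:
    """Compare shared metrics and flag drops, exempting a precision-up/recall-down tradeoff."""
    rows: Dict[str, Dict[str, object]] = {}
    for name in sorted(set(old) & set(new)):
        delta = new[name] - old[name]
        rows[name] = {
            "old": old[name],
            "new": new[name],
            "delta": delta,
            "regression": delta < -tolerance,  # assumes higher is better
        }
    precision, recall = rows.get("precision"), rows.get("recall")
    tradeoff = bool(
        precision and recall and precision["delta"] > 0 and recall["delta"] < 0
    )
    if tradeoff:
        # Precision rose while recall fell: treat as a tradeoff, not a regression.
        recall["regression"] = False
    return rows, tradeoff
```

Metrics outside the precision/recall pair are still flagged independently, so a drop in F1 during a tradeoff would remain visible in this sketch.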


mlvc rollback

mlvc rollback v2 --reason "v3 caused 15% precision drop" --project fraud-detector

Records the rollback in the audit trail with who did it, when, and why. Prints the artifact URI of v2. Pull that file and redeploy it yourself — mlvc does not touch your serving infrastructure.

Optional flags:

  • --from-version — which version is being replaced (default: latest)
  • --author — who performed the rollback (default: $USER)
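To make the "audit trail only" idea concrete, the record written by a rollback might look something like the dataclass below. The field names are hypothetical and do not reflect mlvc's actual database schema; the point is that the record describes an intent and carries no deployment side effects.

```python
import getpass
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class RollbackRecord:
    """An immutable audit entry: who rolled back, from what, to what, and why."""

    project: str
    from_version: str
    to_version: str
    reason: str
    # Defaults mirror the CLI behaviour described above: current user, current time.
    author: str = field(default_factory=getpass.getuser)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


record = RollbackRecord(
    project="fraud-detector",
    from_version="v3",
    to_version="v2",
    reason="v3 caused 15% precision drop",
    author="alice",  # example value; omit to fall back to the current user
)
print(asdict(record)["to_version"])
```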

mlvc deploy

mlvc deploy v4 --project fraud-detector --env production --notes "A/B test winner"
# Environments: production (default), staging, canary

Marks a version as the active deployment for an environment. The previous active deployment for that environment is automatically retired in the tracking table.
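The "one active deployment per environment" rule can be modelled with an in-memory table, as in this sketch. The `Deployment` and `DeploymentTable` names are invented for the example; mlvc keeps this state in PostgreSQL and its schema is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    version: str
    environment: str
    active: bool = True


class DeploymentTable:
    """Tracks deployments, keeping at most one active row per environment."""

    def __init__(self) -> None:
        self.rows: List[Deployment] = []

    def deploy(self, version: str, environment: str = "production") -> Deployment:
        # Retire whatever is currently active in this environment only.
        for row in self.rows:
            if row.environment == environment and row.active:
                row.active = False
        new_row = Deployment(version=version, environment=environment)
        self.rows.append(new_row)
        return new_row

    def active_version(self, environment: str) -> Optional[str]:
        for row in self.rows:
            if row.environment == environment and row.active:
                return row.version
        return None
```

Retired rows are kept rather than deleted, which preserves the deployment history for later queries.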


mlvc query

mlvc query "which version had the best F1 score?" --project fraud-detector
mlvc query "why did we roll back v3?" --project fraud-detector

Sends your version history as context to an LLM and returns a plain-English answer. Requires an LLM API key (see Configuration). Without a key it falls back to keyword search over version metadata.
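A keyword fallback of the kind mentioned above could work along these lines. This is an assumed, minimal ranking by word overlap; the README does not describe how mlvc's fallback scores results, and `keyword_search` is a hypothetical name.

```python
import re
from typing import Dict, List


def keyword_search(
    question: str, versions: List[Dict[str, str]], top_k: int = 3
) -> List[Dict[str, str]]:
    """Rank version metadata by how many question words appear in it."""
    terms = set(re.findall(r"[a-z0-9]+", question.lower()))
    scored = []
    for version in versions:
        text = " ".join(str(value) for value in version.values()).lower()
        words = set(re.findall(r"[a-z0-9]+", text))
        score = len(terms & words)
        if score:
            scored.append((score, version))
    # Sort by score only; the stable sort keeps input order for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [version for _, version in scored[:top_k]]
```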


Dashboard (Web UI)

The dashboard is a React + Vite app with six pages:

Page            What you see
Projects        All registered projects
Version list    All versions for a project with metric columns and sparklines
Version detail  Full metadata, metrics, hyperparameters, environment for one version
Diff            Side-by-side metric comparison between any two versions
Lineage         Interactive graph showing parent–child version relationships and rollbacks
Drift           Drift events table per version

Running the dashboard

# Start the API server
uvicorn mlvc.api.app:app --reload --port 8000

# In a separate terminal
cd dashboard
npm install
npm run dev     # opens at http://localhost:5173

Configuration

All settings live in .env with the MLVC_ prefix.

# Required
MLVC_DATABASE_URL=postgresql://mlvc:localpass@localhost:5432/mlvc
MLVC_S3_BUCKET=mlvc-local-dev

# S3 / MinIO credentials
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
AWS_ENDPOINT_URL=http://localhost:9000    # remove this line for real AWS S3
AWS_DEFAULT_REGION=us-east-1

# Optional: LLM for mlvc query (uses Gemini API)
MLVC_GEMINI_API_KEY=your_gemini_api_key_here
MLVC_GEMINI_MODEL=gemma-4-31b-it

# Optional: Slack drift alerts
MLVC_SLACK_WEBHOOK_URL=https://hooks.slack.com/...

# Optional: Kafka drift detection
MLVC_KAFKA_BOOTSTRAP_SERVERS=localhost:9092

# Optional: Redis query cache
MLVC_REDIS_URL=redis://localhost:6379

# Drift thresholds
MLVC_PSI_THRESHOLD=0.20
MLVC_KS_THRESHOLD=0.15
MLVC_ACCURACY_DROP_THRESHOLD=0.05
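For scripts that need the same thresholds, the MLVC_-prefixed variables can be read with the standard library. This sketch is not mlvc's own settings loader; `DriftThresholds` and `load_thresholds` are hypothetical, and the fallback values simply repeat the example .env above rather than any confirmed built-in defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DriftThresholds:
    psi: float = 0.20
    ks: float = 0.15
    accuracy_drop: float = 0.05


def load_thresholds(env: Optional[Mapping[str, str]] = None) -> DriftThresholds:
    """Read drift thresholds from MLVC_* variables, falling back to the example values."""
    env = os.environ if env is None else env
    return DriftThresholds(
        psi=float(env.get("MLVC_PSI_THRESHOLD", 0.20)),
        ks=float(env.get("MLVC_KS_THRESHOLD", 0.15)),
        accuracy_drop=float(env.get("MLVC_ACCURACY_DROP_THRESHOLD", 0.05)),
    )
```

Accepting an explicit mapping makes the loader easy to test without touching the real environment.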

Optional Features

Natural Language Query

Set MLVC_GEMINI_API_KEY to your Gemini API key. The query engine uses gemma-4-31b-it by default (configurable via MLVC_GEMINI_MODEL). It builds a context from your version history and sends it to the model. Responses are Redis-cached for 5 minutes.

Install the query extra first:

pip install -e ".[query]"
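The 5-minute response caching can be pictured with a small time-to-live cache. The sketch below keeps entries in memory as a stand-in for Redis, so it runs without any services; the class name, the cache-key scheme, and the injectable clock are assumptions for illustration.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache with a fixed time-to-live."""

    def __init__(
        self, ttl_seconds: float = 300.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_compute(
        self, project: str, question: str, compute: Callable[[], str]
    ) -> str:
        # Key on project + question so identical queries share one cached answer.
        key = hashlib.sha256(f"{project}\n{question}".encode("utf-8")).hexdigest()
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the LLM call
        answer = compute()
        self._entries[key] = (now + self._ttl, answer)
        return answer
```

With a real Redis client the same pattern maps onto a `get` followed by a `setex` with a 300-second expiry.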

Data Drift Monitoring

Requires Kafka (pip install -e ".[kafka]"). Wrap your model with the ModelSidecar to auto-publish predictions:

from mlvc.monitoring.sidecar import ModelSidecar

sidecar = ModelSidecar(model=your_model, version_id="<version-uuid>")
prediction = sidecar.predict(inputs)   # publishes to Kafka in the background

Run the drift detector as a background process:

python -m mlvc.monitoring.drift_detector

It consumes prediction events in 1000-sample windows, runs PSI and KS tests via Evidently AI, and writes drift events to the drift_events table. If MLVC_SLACK_WEBHOOK_URL is set, it also sends a Slack alert.
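For intuition about the PSI score compared against MLVC_PSI_THRESHOLD, here is a from-scratch calculation for a single numeric feature. mlvc delegates the real computation to Evidently AI; this sketch only shows the formula, and the equal-width binning, bin count, and epsilon floor are assumptions.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population Stability Index between a reference window and a current window."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference feature

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            idx = int((value - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(count / len(values), eps) for count in counts]

    exp_frac, act_frac = fractions(expected), fractions(actual)
    return sum((a - e) * math.log(a / e) for a, e in zip(act_frac, exp_frac))
```

Identical windows score 0, and the score grows as the current window's distribution moves away from the reference.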


Running Tests

Unit tests require no infrastructure:

pytest tests/unit/

Integration tests require a running PostgreSQL instance:

pytest tests/integration/

Should you commit .mlvc/ to Git?

No. The .mlvc/config.json file is local project state (like .git/ itself). It is in .gitignore. Each developer runs mlvc init --name <project> in their working directory to connect to the shared database.


Contributing

See CONTRIBUTING.md.
