mlvc — Semantic Version Control for ML Models

Git-style version control for machine learning models, with ML-specific intelligence built in: metric comparison, anomaly detection, data drift monitoring, and a natural language query layer over your version history.

What it does

When you train a new model, you commit it with `mlvc commit`. mlvc stores the artifact in S3-compatible storage (MinIO locally), records every metric, hyperparameter, dataset, and environment snapshot in PostgreSQL, and lets you run `mlvc diff`, `mlvc rollback`, `mlvc deploy`, and `mlvc query` against your history, much like Git, but for models.

Features

Feature	What it does
Commit	Version any model file (sklearn, PyTorch, XGBoost, etc.) with metrics, hyperparameters, dataset info, and author
Log	Full version history with metric columns
Show	Detailed view of any version
Diff	Side-by-side metric/feature/hyperparam comparison between two versions. Detects precision-recall tradeoffs automatically
Rollback	Record a rollback intent with reason + audit trail. Gives you the artifact URI — you deploy it yourself
Deploy	Mark a version as active in production / staging / canary
Anomaly detection	Warns on commit if any metric falls outside the 2σ historical baseline
Slice evaluation	Evaluate model performance per demographic/feature slice on commit
Drift detection	Window-based data drift detection via Kafka + Evidently AI (PSI, KS tests). Sends alerts to Slack
Natural language query	Ask questions about your version history in plain English using an LLM
React dashboard	Web UI for all of the above

What rollback does NOT do

mlvc rollback is an audit trail command, not an auto-deployment trigger. It records who rolled back, from which version, to which version, and why. It then prints the artifact URI for the target version — you pull that artifact and redeploy it in your own pipeline. mlvc never touches your serving infrastructure.

Architecture

mlvc/
├── cli/          Click commands (init, commit, log, show, diff, rollback, deploy, query)
├── core/         CommitManager, MetricsEngine, BaselineTracker, SliceEvaluator, RollbackManager
├── storage/      ArtifactStore (S3/MinIO)  •  MetadataStore (PostgreSQL)  •  Cache (Redis)
├── monitoring/   DriftDetector (Kafka + Evidently)  •  ModelSidecar  •  Slack alerts
├── query/        QueryEngine (LLM + RAG over version history)
└── api/          FastAPI REST API

dashboard/        React + Vite web UI

Core services (required): PostgreSQL, Redis, MinIO

Optional services: Kafka (drift monitoring), LLM API key (natural language query)

All optional services degrade gracefully — everything else keeps working if they are not configured.

Prerequisites

Python 3.10+
Docker + Docker Compose
Node.js 18+ (only if you want to run the dashboard)

Quick Start

1. Clone and install

git clone https://github.com/yourusername/mlvc.git
cd mlvc
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

2. Configure

cp .env.example .env
# Edit .env if needed — defaults work for the local Docker setup

3. Start infrastructure

# Core services — required
docker compose -f docker-compose.yml up -d postgres redis

# Additional services (MinIO, Kafka, Neo4j); MinIO is required for artifact storage
docker compose -f docker-compose.dev.yml up -d

Wait a few seconds for PostgreSQL to be ready, then run migrations:

alembic upgrade head

Create the MinIO bucket (first time only):

# Option A: via the MinIO container
docker exec <minio-container-name> sh -c "
  mc alias set local http://localhost:9000 minioadmin minioadmin &&
  mc mb local/mlvc-local-dev
"

# Option B: open http://localhost:9001 and create the bucket manually

4. Initialize a project

mlvc init --name fraud-detector --description "Credit card fraud model"

This creates .mlvc/config.json in your working directory and registers the project in the database. Subsequent commands auto-detect the project from this file.
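As a rough illustration of the auto-detection, the sketch below looks for .mlvc/config.json starting from a given directory. Searching parent directories (the way Git locates .git/) is an assumption; the text above only says the file is written to the working directory. The name `find_project_config` is hypothetical and not part of mlvc's API, and the keys stored in the file are not specified here.

```python
import json
from pathlib import Path
from typing import Optional


def find_project_config(start: Path) -> Optional[dict]:
    """Return the parsed .mlvc/config.json nearest to `start`, or None.

    Assumption: parent directories are searched too, as Git does for .git/.
    """
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".mlvc" / "config.json"
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    return None
```

Calling `find_project_config(Path.cwd())` from anywhere inside the project would then return the stored settings, which is one way a `--project` option could be given a default.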

5. Commit your first model

mlvc commit model.pkl \
  --message "Baseline logistic regression" \
  --accuracy 0.923 \
  --f1 0.871 \
  --precision 0.884 \
  --recall 0.859 \
  --dataset "transactions_v1" \
  --hyperparams '{"C": 1.0, "max_iter": 200}' \
  --features '["amount", "merchant_category", "hour_of_day"]'

You will see the version number, hash, and any anomaly warnings.
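The anomaly warnings come from the 2σ baseline check listed under Features. A minimal sketch of that kind of check, assuming the baseline is the mean and sample standard deviation of a metric's past values (`anomaly_warning` is an illustrative name, not the BaselineTracker API):

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def anomaly_warning(name: str, history: Sequence[float], value: float,
                    n_sigma: float = 2.0) -> Optional[str]:
    """Return a warning if `value` lies more than n_sigma standard
    deviations from the mean of `history`, otherwise None."""
    if len(history) < 2:
        return None  # not enough history to estimate a spread
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if abs(value - mu) > n_sigma * sigma:
        return (f"{name}={value:.3f} is outside the {n_sigma:g}σ baseline "
                f"({mu:.3f} ± {n_sigma * sigma:.3f})")
    return None
```

With a history of accuracies around 0.91, a new value of 0.85 would produce a warning while 0.915 would not.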

CLI Reference

`mlvc init`

mlvc init --name <project-name> [--description "..."] [--author "..."]

Registers the project in the database and writes .mlvc/config.json.

`mlvc commit`

mlvc commit <model_path> --message "..." [options]

Option	Description
`--message, -m`	Why was this version created? (required)
`--project`	Project name (auto-detected from `.mlvc/config.json`)
`--accuracy`	Accuracy metric (0–1)
`--f1`	F1 score
`--precision`	Precision
`--recall`	Recall
`--auc-roc`	AUC-ROC
`--metrics-json`	Any additional metrics as JSON: `'{"mse": 0.04}'`
`--dataset`	Dataset name
`--dataset-version`	Dataset version string
`--hyperparams`	Hyperparameters as JSON: `'{"lr": 0.001}'`
`--features`	Feature list as JSON array: `'["age", "income"]'`
`--parent`	Parent version identifier (for fine-tuned models)
`--eval-dataset`	Path to CSV for slice evaluation
`--target-column`	Target column in eval dataset
`--slice-columns`	Comma-separated columns to slice by
`--author`	Author name (defaults to `$USER`)
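The three slice options (`--eval-dataset`, `--target-column`, `--slice-columns`) describe a per-slice evaluation. The sketch below shows the general idea for accuracy only. It assumes rows have already been read from the CSV into dictionaries and that predictions come from a caller-supplied `predict` function; both the function name `accuracy_by_slice` and that calling convention are hypothetical, not SliceEvaluator's actual interface.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Mapping, Sequence, Tuple


def accuracy_by_slice(
    rows: Iterable[Mapping[str, str]],
    predict: Callable[[Mapping[str, str]], str],
    target_column: str,
    slice_columns: Sequence[str],
) -> Dict[Tuple[str, str], float]:
    """Compute accuracy separately for every (column, value) slice."""
    counts = defaultdict(lambda: [0, 0])  # slice -> [correct, total]
    for row in rows:
        correct = predict(row) == row[target_column]
        for column in slice_columns:
            bucket = counts[(column, row[column])]
            bucket[0] += int(correct)
            bucket[1] += 1
    return {key: c / t for key, (c, t) in counts.items()}
```

A slice whose accuracy is far below the overall figure is the kind of result this evaluation is meant to surface.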

On commit, mlvc:

Detects the model format (sklearn/joblib, PyTorch, XGBoost, etc.)
SHA-256 hashes the file and deduplicates in MinIO — no wasted storage if you commit the same weights twice
Stores all metadata in PostgreSQL
Updates metric baselines and warns if any metric is outside the 2σ historical range
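The hashing and deduplication step can be pictured as content-addressed storage: the object key is derived from the file's SHA-256, so identical weights map to the same key. The sketch below uses an in-memory dictionary as a stand-in for the bucket; the `artifacts/<hash>` key layout and the class name are assumptions, not mlvc's real ArtifactStore.

```python
import hashlib
from pathlib import Path
from typing import Dict, Tuple


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class InMemoryArtifactStore:
    """Stand-in for an S3/MinIO bucket, keyed by content hash."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, path: Path) -> Tuple[str, bool]:
        """Store the file; return (key, uploaded). Skips known content."""
        key = f"artifacts/{sha256_of_file(path)}"  # hypothetical key layout
        if key in self.objects:
            return key, False  # same bytes already stored
        self.objects[key] = Path(path).read_bytes()
        return key, True
```

Committing the same file twice returns the same key and reports `uploaded=False` the second time.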

`mlvc log`

mlvc log --project fraud-detector [--limit 20]

Lists version history with version number, hash, author, date, commit message, and key metrics.

`mlvc show`

mlvc show v3 --project fraud-detector
# Also accepts: version number (3), hash prefix (abc12345), or "latest"

Shows full details: all metrics, hyperparameters, feature list, environment snapshot, dataset info.
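The identifier forms accepted by `mlvc show` (v3, 3, a hash prefix, or latest) could be resolved along these lines. This is a sketch under assumptions: versions are held in a plain dictionary from version number to hex hash, numeric forms are tried before hash prefixes, and an ambiguous prefix resolves to nothing. `resolve_version` is a hypothetical helper.

```python
from typing import Dict, Optional


def resolve_version(identifier: str, versions: Dict[int, str]) -> Optional[int]:
    """Map 'v3', '3', a hash prefix, or 'latest' to a version number.

    `versions` maps version number -> full hex hash.
    """
    if not versions:
        return None
    ident = identifier.strip().lower()
    if ident == "latest":
        return max(versions)
    number = ident[1:] if ident.startswith("v") else ident
    if number.isdigit() and int(number) in versions:
        return int(number)
    # Fall back to a hash prefix; refuse ambiguous prefixes.
    matches = [n for n, digest in versions.items() if digest.startswith(ident)]
    return matches[0] if len(matches) == 1 else None
```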

`mlvc diff`

mlvc diff v3 v4 --project fraud-detector

Side-by-side comparison of:

All metrics with direction arrows and regression flags
Features added / removed
Hyperparameters changed

If precision went up while recall went down, mlvc recognises this as an intentional tradeoff and does not flag either as a regression.
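One way to express that rule in code, as a minimal sketch: it assumes every metric is higher-is-better and implements only the direction stated above (precision up, recall down). The function name and the `tolerance` parameter are illustrative, not part of MetricsEngine.

```python
from typing import Dict, List


def flag_regressions(old: Dict[str, float], new: Dict[str, float],
                     tolerance: float = 0.0) -> List[str]:
    """Return metrics that got worse, skipping a precision-recall tradeoff."""
    deltas = {m: new[m] - old[m] for m in old.keys() & new.keys()}
    # Tradeoff as described above: precision improved while recall dropped.
    tradeoff = deltas.get("precision", 0.0) > 0 and deltas.get("recall", 0.0) < 0
    regressions = []
    for metric, delta in sorted(deltas.items()):
        if delta >= -tolerance:
            continue  # unchanged or improved
        if tradeoff and metric in ("precision", "recall"):
            continue  # treated as an intentional tradeoff
        regressions.append(metric)
    return regressions
```

In this sketch a drop in F1 is still flagged even when precision and recall are excused.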

`mlvc rollback`

mlvc rollback v2 --reason "v3 caused 15% precision drop" --project fraud-detector

Records the rollback in the audit trail with who did it, when, and why. Prints the artifact URI of v2. Pull that file and redeploy it yourself — mlvc does not touch your serving infrastructure.

Optional flags:

--from-version — which version is being replaced (default: latest)
--author — who performed the rollback (default: $USER)
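To make the "audit trail only" behaviour concrete, here is a small sketch of what recording a rollback might involve: append a record, return the target artifact URI, and do nothing else. The record fields mirror the description above; the in-memory list, the `latest_version` argument, and the URI values passed in are assumptions for illustration, not RollbackManager's real interface.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RollbackRecord:
    from_version: str
    to_version: str
    reason: str
    author: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def record_rollback(audit_log: List[RollbackRecord],
                    artifact_uris: Dict[str, str],
                    to_version: str, reason: str, latest_version: str,
                    from_version: Optional[str] = None,
                    author: Optional[str] = None) -> str:
    """Append an audit record and return the target artifact URI.

    Nothing is deployed here; the caller redeploys the artifact.
    """
    if to_version not in artifact_uris:
        raise KeyError(f"unknown version: {to_version}")
    audit_log.append(RollbackRecord(
        from_version=from_version or latest_version,  # default: latest
        to_version=to_version,
        reason=reason,
        author=author or os.environ.get("USER", "unknown"),
    ))
    return artifact_uris[to_version]
```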

`mlvc deploy`

mlvc deploy v4 --project fraud-detector --env production --notes "A/B test winner"
# Environments: production (default), staging, canary

Marks a version as the active deployment for an environment. The previous active deployment for that environment is automatically retired in the tracking table.
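The retire-then-activate behaviour can be sketched with an in-memory list standing in for the tracking table. The class and field names below are hypothetical; only the three environment names and the one-active-deployment-per-environment rule come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

ENVIRONMENTS = ("production", "staging", "canary")


@dataclass
class Deployment:
    version: str
    env: str
    notes: str = ""
    active: bool = True


class DeploymentTracker:
    """In-memory stand-in for the deployment tracking table."""

    def __init__(self) -> None:
        self.rows: List[Deployment] = []

    def deploy(self, version: str, env: str = "production",
               notes: str = "") -> Deployment:
        if env not in ENVIRONMENTS:
            raise ValueError(f"unknown environment: {env}")
        for row in self.rows:
            if row.env == env and row.active:
                row.active = False  # retire the previous deployment
        deployment = Deployment(version=version, env=env, notes=notes)
        self.rows.append(deployment)
        return deployment

    def current(self, env: str) -> Optional[Deployment]:
        """Return the active deployment for `env`, if any."""
        return next((r for r in self.rows if r.env == env and r.active), None)
```

Retired rows stay in the list, so the history of what was deployed where remains queryable.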

`mlvc query`

mlvc query "which version had the best F1 score?" --project fraud-detector
mlvc query "why did we roll back v3?" --project fraud-detector

Sends your version history as context to an LLM and returns a plain-English answer. Requires an LLM API key (see Configuration). Without a key it falls back to keyword search over version metadata.

Dashboard (Web UI)

The dashboard is a React + Vite app with six pages:

Page	What you see
Projects	All registered projects
Version list	All versions for a project with metric columns and sparklines
Version detail	Full metadata, metrics, hyperparameters, environment for one version
Diff	Side-by-side metric comparison between any two versions
Lineage	Interactive graph showing parent–child version relationships and rollbacks
Drift	Drift events table per version

Running the dashboard

# Start the API server
uvicorn mlvc.api.app:app --reload --port 8000

# In a separate terminal
cd dashboard
npm install
npm run dev     # opens at http://localhost:5173

Configuration

All settings live in .env with the MLVC_ prefix.

# Required
MLVC_DATABASE_URL=postgresql://mlvc:localpass@localhost:5432/mlvc
MLVC_S3_BUCKET=mlvc-local-dev

# S3 / MinIO credentials
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
AWS_ENDPOINT_URL=http://localhost:9000    # remove this line for real AWS S3
AWS_DEFAULT_REGION=us-east-1

# Optional: LLM for mlvc query (uses Gemini API)
MLVC_GEMINI_API_KEY=your_gemini_api_key_here
MLVC_GEMINI_MODEL=gemma-4-31b-it

# Optional: Slack drift alerts
MLVC_SLACK_WEBHOOK_URL=https://hooks.slack.com/...

# Optional: Kafka drift detection
MLVC_KAFKA_BOOTSTRAP_SERVERS=localhost:9092

# Optional: Redis query cache
MLVC_REDIS_URL=redis://localhost:6379

# Drift thresholds
MLVC_PSI_THRESHOLD=0.20
MLVC_KS_THRESHOLD=0.15
MLVC_ACCURACY_DROP_THRESHOLD=0.05
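For readers who want to consume these values from their own scripts, the sketch below reads the three drift thresholds from the environment using the MLVC_ prefix, falling back to the values shown above. It is an illustration only; `load_thresholds` is a hypothetical helper and says nothing about how mlvc itself parses its settings.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the example .env above.
DEFAULTS = {
    "PSI_THRESHOLD": 0.20,
    "KS_THRESHOLD": 0.15,
    "ACCURACY_DROP_THRESHOLD": 0.05,
}


def load_thresholds(env: Mapping[str, str] = os.environ) -> Dict[str, float]:
    """Read MLVC_-prefixed thresholds, using defaults for missing keys."""
    return {
        name.lower(): float(env.get(f"MLVC_{name}", default))
        for name, default in DEFAULTS.items()
    }
```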

Optional Features

Natural Language Query

Set MLVC_GEMINI_API_KEY to your Gemini API key. The query engine uses gemma-4-31b-it by default (configurable via MLVC_GEMINI_MODEL). It builds a context from your version history and sends it to the model. Responses are Redis-cached for 5 minutes.
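The five-minute caching can be illustrated with a small time-to-live cache. This sketch keeps entries in a dictionary rather than Redis, and the cache key (a hash of project plus question) is an assumption; the class name `TTLCache` is hypothetical.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_compute(self, project: str, question: str,
                       compute: Callable[[], str]) -> str:
        """Return a cached answer, or call `compute` and cache the result."""
        key = hashlib.sha256(f"{project}\n{question}".encode()).hexdigest()
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh
        answer = compute()
        self._entries[key] = (now + self._ttl, answer)
        return answer
```

Repeating the same question within the TTL returns the stored answer without another LLM call.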

Install the query extra first:

pip install -e ".[query]"

Data Drift Monitoring

Requires Kafka (pip install -e ".[kafka]"). Wrap your model with the ModelSidecar to auto-publish predictions:

from mlvc.monitoring.sidecar import ModelSidecar

sidecar = ModelSidecar(model=your_model, version_id="<version-uuid>")
prediction = sidecar.predict(inputs)   # publishes to Kafka in the background

Run the drift detector as a background process:

python -m mlvc.monitoring.drift_detector

It consumes prediction events in 1,000-sample windows, runs PSI and KS tests via Evidently AI, and writes drift events to the drift_events table. If MLVC_SLACK_WEBHOOK_URL is set, it also sends a Slack alert.
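For intuition about what the PSI test measures, here is a hand-written Population Stability Index over equal-width bins. It is not Evidently's implementation or API, just the textbook formula, sum over bins of (current - reference) * ln(current / reference), with a small epsilon to avoid empty bins. The binning scheme and function names are assumptions.

```python
import math
from typing import List, Sequence


def _bin_fractions(values: Sequence[float], lo: float, hi: float,
                   bins: int, eps: float) -> List[float]:
    """Fraction of values per equal-width bin, floored at eps."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(reference: Sequence[float],
                               current: Sequence[float],
                               bins: int = 10, eps: float = 1e-6) -> float:
    """PSI of `current` against `reference`; both must be non-empty."""
    lo, hi = min(reference), max(reference)
    ref = _bin_fractions(reference, lo, hi, bins, eps)
    cur = _bin_fractions(current, lo, hi, bins, eps)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A window whose PSI exceeds MLVC_PSI_THRESHOLD (0.20 in the example configuration) would count as drifted under this reading.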

Running Tests

Unit tests require no infrastructure:

pytest tests/unit/

Integration tests require a running PostgreSQL:

pytest tests/integration/

Should you commit `.mlvc/` to Git?

No. The .mlvc/config.json file is local project state (like .git/ itself). It is in .gitignore. Each developer runs mlvc init --name <project> in their working directory to connect to the shared database.

Contributing

See CONTRIBUTING.md.
