Git-style version control for machine learning models, with ML-specific intelligence built in: metric comparison, anomaly detection, data drift monitoring, and a natural language query layer over your version history.
When you train a new model, you mlvc commit it. mlvc stores the artifact in S3-compatible storage (MinIO locally), records every metric, hyperparameter, dataset, and environment snapshot in PostgreSQL, and lets you mlvc diff, mlvc rollback, mlvc deploy, and mlvc query your history — just like Git, but for models.
| Feature | What it does |
|---|---|
| Commit | Version any model file (sklearn, PyTorch, XGBoost, etc.) with metrics, hyperparameters, dataset info, and author |
| Log | Full version history with metric columns |
| Show | Detailed view of any version |
| Diff | Side-by-side metric/feature/hyperparam comparison between two versions. Detects precision-recall tradeoffs automatically |
| Rollback | Record a rollback intent with reason + audit trail. Gives you the artifact URI — you deploy it yourself |
| Deploy | Mark a version as active in production / staging / canary |
| Anomaly detection | Warns on commit if any metric falls outside the 2σ historical baseline |
| Slice evaluation | Evaluate model performance per demographic/feature slice on commit |
| Drift detection | Window-based data drift via Kafka + Evidently AI (PSI, KS tests). Alerts to Slack |
| Natural language query | Ask questions about your version history in plain English using an LLM |
| React dashboard | Web UI for all of the above |
mlvc rollback is an audit trail command, not an auto-deployment trigger. It records who rolled back, from which version, to which version, and why. It then prints the artifact URI for the target version — you pull that artifact and redeploy it in your own pipeline. mlvc never touches your serving infrastructure.
mlvc/
├── cli/ Click commands (init, commit, log, show, diff, rollback, deploy, query)
├── core/ CommitManager, MetricsEngine, BaselineTracker, SliceEvaluator, RollbackManager
├── storage/ ArtifactStore (S3/MinIO) • MetadataStore (PostgreSQL) • Cache (Redis)
├── monitoring/ DriftDetector (Kafka + Evidently) • ModelSidecar • Slack alerts
├── query/ QueryEngine (LLM + RAG over version history)
└── api/ FastAPI REST API
dashboard/ React + Vite web UI
Core services (required): PostgreSQL, Redis, MinIO
Optional services: Kafka (drift monitoring), LLM API key (natural language query)
All optional services degrade gracefully — everything else keeps working if they are not configured.
- Python 3.10+
- Docker + Docker Compose
- Node.js 18+ (only if you want to run the dashboard)
git clone https://github.com/yourusername/mlvc.git
cd mlvc
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
cp .env.example .env
# Edit .env if needed; the defaults work for the local Docker setup
# Core services (required)
docker compose -f docker-compose.yml up -d postgres redis
# Additional services (MinIO, Kafka, Neo4j)
docker compose -f docker-compose.dev.yml up -d
Wait a few seconds for PostgreSQL to be ready, then run migrations:
alembic upgrade head
Create the MinIO bucket (first time only):
# Option A: via the MinIO container
docker exec <minio-container-name> sh -c "
mc alias set local http://localhost:9000 minioadmin minioadmin &&
mc mb local/mlvc-local-dev
"
# Option B: open http://localhost:9001 and create the bucket manually
mlvc init --name fraud-detector --description "Credit card fraud model"
This creates .mlvc/config.json in your working directory and registers the project in the database. Subsequent commands auto-detect the project from this file.
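To make the auto-detection step concrete, here is a minimal sketch of how a command could locate the project file. It is an illustration only: `find_project_config` is a hypothetical helper, and walking up through parent directories (the way Git finds `.git/`) is an assumption; mlvc may only look in the current working directory.

```python
import json
from pathlib import Path
from typing import Optional


def find_project_config(start: Path) -> Optional[dict]:
    """Return the parsed .mlvc/config.json nearest to `start`, or None."""
    start = start.resolve()
    # Assumption: check the starting directory first, then each parent,
    # similar to how Git locates .git/. mlvc itself may behave differently.
    for directory in (start, *start.parents):
        candidate = directory / ".mlvc" / "config.json"
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    return None
```

Calling `find_project_config(Path.cwd())` from any subdirectory of the project would then return the stored settings, which is what lets later commands omit `--project`.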
mlvc commit model.pkl \
--message "Baseline logistic regression" \
--accuracy 0.923 \
--f1 0.871 \
--precision 0.884 \
--recall 0.859 \
--dataset "transactions_v1" \
--hyperparams '{"C": 1.0, "max_iter": 200}' \
--features '["amount", "merchant_category", "hour_of_day"]'
You will see the version number, hash, and any anomaly warnings.
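The anomaly warning is based on a 2σ historical baseline. The sketch below shows one plausible way such a check could work; `outside_baseline` is a hypothetical name, and the use of the sample standard deviation and a minimum of two prior values are assumptions rather than mlvc's documented behaviour.

```python
from statistics import mean, stdev
from typing import Sequence


def outside_baseline(history: Sequence[float], new_value: float,
                     n_sigma: float = 2.0) -> bool:
    """Return True if new_value lies more than n_sigma deviations from the mean."""
    if len(history) < 2:
        # Assumption: with fewer than two prior values there is no
        # meaningful standard deviation, so nothing is flagged.
        return False
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation (assumed)
    return abs(new_value - mu) > n_sigma * sigma


previous_accuracy = [0.90, 0.91, 0.92, 0.91, 0.90]
print(outside_baseline(previous_accuracy, 0.80))   # far below the baseline
print(outside_baseline(previous_accuracy, 0.915))  # within the usual range
```

A symmetric check like this also flags unusually high values, which can be useful for catching data leakage as well as regressions.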
mlvc init --name <project-name> [--description "..."] [--author "..."]
Registers the project in the database and writes .mlvc/config.json.
mlvc commit <model_path> --message "..." [options]
| Option | Description |
|---|---|
| --message, -m | Why was this version created? (required) |
| --project | Project name (auto-detected from .mlvc/config.json) |
| --accuracy | Accuracy metric (0–1) |
| --f1 | F1 score |
| --precision | Precision |
| --recall | Recall |
| --auc-roc | AUC-ROC |
| --metrics-json | Any additional metrics as JSON: '{"mse": 0.04}' |
| --dataset | Dataset name |
| --dataset-version | Dataset version string |
| --hyperparams | Hyperparameters as JSON: '{"lr": 0.001}' |
| --features | Feature list as JSON array: '["age", "income"]' |
| --parent | Parent version identifier (for fine-tuned models) |
| --eval-dataset | Path to CSV for slice evaluation |
| --target-column | Target column in eval dataset |
| --slice-columns | Comma-separated columns to slice by |
| --author | Author name (defaults to $USER) |
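The slice options (`--eval-dataset`, `--target-column`, `--slice-columns`) imply a per-group metric computation. The following is a rough sketch of that idea for a single slice column and accuracy only; `accuracy_by_slice` and the sample rows are hypothetical, and mlvc's SliceEvaluator may compute different metrics or handle several columns at once.

```python
from collections import defaultdict
from typing import Any, Mapping, Sequence


def accuracy_by_slice(rows: Sequence[Mapping[str, Any]],
                      predictions: Sequence[Any],
                      target_column: str,
                      slice_column: str) -> dict:
    """Group rows by slice_column and compute accuracy within each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, predicted in zip(rows, predictions):
        key = row[slice_column]
        total[key] += 1
        if row[target_column] == predicted:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Hypothetical evaluation rows, as they might be read from a CSV file.
rows = [
    {"region": "EU", "is_fraud": 1},
    {"region": "EU", "is_fraud": 0},
    {"region": "US", "is_fraud": 1},
]
print(accuracy_by_slice(rows, [1, 1, 1], "is_fraud", "region"))
```

Per-slice numbers like these make it visible when an overall metric hides poor performance on one group.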
On commit, mlvc:
- Detects the model format (sklearn/joblib, PyTorch, XGBoost, etc.)
- SHA-256 hashes the file and deduplicates in MinIO — no wasted storage if you commit the same weights twice
- Stores all metadata in PostgreSQL
- Updates metric baselines and warns if any metric is outside the 2σ historical range
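The hashing and deduplication step in the list above can be sketched with the standard library. This is an illustration under stated assumptions: `InMemoryArtifactStore` is a hypothetical stand-in for the real S3/MinIO store, and using the hex digest as the object key is an assumed layout, not mlvc's documented one.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class InMemoryArtifactStore:
    """Hypothetical content-addressed store; a dict stands in for the bucket."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, path):
        """Store the file under its hash; skip the write if it already exists."""
        key = sha256_of_file(path)
        uploaded = key not in self._objects
        if uploaded:
            self._objects[key] = Path(path).read_bytes()
        return key, uploaded
```

Because the key is derived from the file contents, committing identical weights a second time resolves to the same key and no second copy is written.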
mlvc log --project fraud-detector [--limit 20]
Lists version history with version number, hash, author, date, commit message, and key metrics.
mlvc show v3 --project fraud-detector
# Also accepts: version number (3), hash prefix (abc12345), or "latest"
Shows full details: all metrics, hyperparameters, feature list, environment snapshot, dataset info.
mlvc diff v3 v4 --project fraud-detector
Side-by-side comparison of:
- All metrics with direction arrows and regression flags
- Features added / removed
- Hyperparameters changed
If precision went up while recall went down, mlvc recognises this as an intentional tradeoff and does not flag either as a regression.
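The tradeoff rule above can be sketched as a small classifier over metric deltas. `classify_changes` is a hypothetical function, and it assumes every metric is "higher is better", which would not hold for error metrics such as MSE; the real MetricsEngine may use different rules or thresholds.

```python
from typing import Mapping


def classify_changes(old: Mapping[str, float], new: Mapping[str, float],
                     tolerance: float = 0.0) -> dict:
    """Label each shared metric as improved, regression, unchanged, or tradeoff."""
    deltas = {name: new[name] - old[name] for name in old if name in new}
    # Precision up while recall is down is treated as a tradeoff, not a regression.
    is_tradeoff = (deltas.get("precision", 0.0) > tolerance
                   and deltas.get("recall", 0.0) < -tolerance)
    labels = {}
    for name, delta in deltas.items():
        if is_tradeoff and name in ("precision", "recall"):
            labels[name] = "tradeoff"
        elif delta < -tolerance:
            labels[name] = "regression"
        elif delta > tolerance:
            labels[name] = "improved"
        else:
            labels[name] = "unchanged"
    return labels


v3 = {"precision": 0.88, "recall": 0.86, "f1": 0.87}
v4 = {"precision": 0.91, "recall": 0.82, "f1": 0.86}
print(classify_changes(v3, v4))
```

In this sketch the drop in F1 is still reported as a regression, so the tradeoff rule only suppresses flags on the two metrics involved.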
mlvc rollback v2 --reason "v3 caused 15% precision drop" --project fraud-detector
Records the rollback in the audit trail with who did it, when, and why. Prints the artifact URI of v2. Pull that file and redeploy it yourself; mlvc does not touch your serving infrastructure.
Optional flags:
- --from-version: which version is being replaced (default: latest)
- --author: who performed the rollback (default: $USER)
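Since rollback is a record-keeping operation, it can be pictured as appending an immutable entry and returning a URI. The sketch below assumes that shape; `RollbackRecord`, `record_rollback`, and the example URI are hypothetical and do not reflect mlvc's actual schema or RollbackManager API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Mapping


@dataclass(frozen=True)
class RollbackRecord:
    """One audit-trail entry: who rolled back, from what, to what, and why."""
    project: str
    from_version: str
    to_version: str
    reason: str
    author: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def record_rollback(audit_log: List[RollbackRecord],
                    artifact_uris: Mapping[str, str],
                    record: RollbackRecord) -> str:
    """Append the record and return the target artifact URI. Nothing is deployed."""
    audit_log.append(record)
    return artifact_uris[record.to_version]


log: List[RollbackRecord] = []
uris = {"v2": "s3://mlvc-local-dev/example-key"}  # hypothetical URI
entry = RollbackRecord("fraud-detector", "v3", "v2",
                       "v3 caused 15% precision drop", "alice")
print(record_rollback(log, uris, entry))
```

Keeping the record frozen mirrors the audit-trail intent: entries are added, never edited.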
mlvc deploy v4 --project fraud-detector --env production --notes "A/B test winner"
# Environments: production (default), staging, canary
Marks a version as the active deployment for an environment. The previous active deployment for that environment is automatically retired in the tracking table.
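The "one active deployment per environment" rule can be illustrated with an in-memory tracker. This is a sketch only: `Deployment` and `DeploymentTracker` are hypothetical names, the status strings are assumptions, and the real tracking table lives in PostgreSQL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    version: str
    env: str
    notes: str = ""
    status: str = "active"  # assumed status values: "active" or "retired"


class DeploymentTracker:
    ENVIRONMENTS = ("production", "staging", "canary")

    def __init__(self) -> None:
        self.rows: List[Deployment] = []

    def deploy(self, version: str, env: str = "production",
               notes: str = "") -> Deployment:
        if env not in self.ENVIRONMENTS:
            raise ValueError(f"unknown environment: {env}")
        # Retire whatever is currently active in this environment only.
        for row in self.rows:
            if row.env == env and row.status == "active":
                row.status = "retired"
        new_row = Deployment(version=version, env=env, notes=notes)
        self.rows.append(new_row)
        return new_row

    def active(self, env: str) -> Optional[str]:
        for row in self.rows:
            if row.env == env and row.status == "active":
                return row.version
        return None
```

Retired rows stay in the list, so the history of what ran where is preserved alongside the current state.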
mlvc query "which version had the best F1 score?" --project fraud-detector
mlvc query "why did we roll back v3?" --project fraud-detector
Sends your version history as context to an LLM and returns a plain-English answer. Requires an LLM API key (see Configuration). Without a key it falls back to keyword search over version metadata.
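The keyword fallback could be as simple as counting overlapping terms between the question and each version's metadata. The sketch below takes that approach; `keyword_search`, the stopword list, and the scoring are all assumptions for illustration, not the QueryEngine's actual logic.

```python
import re
from typing import List, Mapping

# Assumed stopword list; a real implementation would likely use a larger one.
STOPWORDS = {"the", "a", "an", "which", "why", "did", "we", "had", "what", "is"}


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(question: str, versions: List[Mapping[str, object]],
                   limit: int = 3) -> list:
    """Rank versions by how many question terms appear in their metadata."""
    terms = _tokens(question) - STOPWORDS
    scored = []
    for version in versions:
        text = " ".join(str(value) for value in version.values())
        score = len(terms & _tokens(text))
        if score:
            scored.append((score, version))
    # Sort on the score only; the sort is stable, so ties keep history order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [version for _, version in scored[:limit]]


history = [
    {"version": "v2", "message": "Baseline logistic regression"},
    {"version": "v3", "message": "Rolled back after precision drop"},
]
print(keyword_search("why did we roll back v3?", history))
```

Exact-token matching misses inflections ("roll" versus "rolled"), which is one reason a fallback like this returns matching records rather than a written answer.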
The dashboard is a React + Vite app with six pages:
| Page | What you see |
|---|---|
| Projects | All registered projects |
| Version list | All versions for a project with metric columns and sparklines |
| Version detail | Full metadata, metrics, hyperparameters, environment for one version |
| Diff | Side-by-side metric comparison between any two versions |
| Lineage | Interactive graph showing parent–child version relationships and rollbacks |
| Drift | Drift events table per version |
# Start the API server
uvicorn mlvc.api.app:app --reload --port 8000
# In a separate terminal
cd dashboard
npm install
npm run dev # opens at http://localhost:5173
All settings live in .env with the MLVC_ prefix.
# Required
MLVC_DATABASE_URL=postgresql://mlvc:localpass@localhost:5432/mlvc
MLVC_S3_BUCKET=mlvc-local-dev
# S3 / MinIO credentials
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
AWS_ENDPOINT_URL=http://localhost:9000 # remove this line for real AWS S3
AWS_DEFAULT_REGION=us-east-1
# Optional: LLM for mlvc query (uses Gemini API)
MLVC_GEMINI_API_KEY=your_gemini_api_key_here
MLVC_GEMINI_MODEL=gemma-4-31b-it
# Optional: Slack drift alerts
MLVC_SLACK_WEBHOOK_URL=https://hooks.slack.com/...
# Optional: Kafka drift detection
MLVC_KAFKA_BOOTSTRAP_SERVERS=localhost:9092
# Optional: Redis query cache
MLVC_REDIS_URL=redis://localhost:6379
# Drift thresholds
MLVC_PSI_THRESHOLD=0.20
MLVC_KS_THRESHOLD=0.15
MLVC_ACCURACY_DROP_THRESHOLD=0.05
Set MLVC_GEMINI_API_KEY to your Gemini API key. The query engine uses gemma-4-31b-it by default (configurable via MLVC_GEMINI_MODEL). It builds a context from your version history and sends it to the model. Responses are Redis-cached for 5 minutes.
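The five-minute response cache behaves like a key-value store with a 300-second expiry. The sketch below mimics that behaviour in memory so it runs without Redis; `TTLCache` is a hypothetical class, and keying the cache on the question text is an assumption.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache with a fixed time-to-live."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
```

With a cache like this, repeating the same question within five minutes returns the stored answer instead of calling the model again.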
Install the query extra first:
pip install -e ".[query]"
Drift monitoring requires Kafka (pip install -e ".[kafka]"). Wrap your model with the ModelSidecar to auto-publish predictions:
from mlvc.monitoring.sidecar import ModelSidecar
sidecar = ModelSidecar(model=your_model, version_id="<version-uuid>")
prediction = sidecar.predict(inputs)  # publishes to Kafka in the background
Run the drift detector as a background process:
python -m mlvc.monitoring.drift_detector
It consumes prediction events in 1000-sample windows, runs PSI + KS tests via Evidently AI, and writes drift events to the drift_events table. It can optionally send a Slack alert.
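To show what a windowed PSI check involves, here is a hand-rolled sketch using only the standard library. It is not how mlvc computes drift (that is delegated to Evidently AI); `fixed_windows` and `psi` are hypothetical helpers, and equal-width binning over the reference range is an assumed choice.

```python
import math
from typing import Iterable, Iterator, List, Sequence


def fixed_windows(samples: Iterable[float], size: int = 1000) -> Iterator[List[float]]:
    """Yield consecutive full windows; a trailing partial window is dropped."""
    window: List[float] = []
    for sample in samples:
        window.append(sample)
        if len(window) == size:
            yield window
            window = []


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index over equal-width bins of the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(count / len(values), eps) for count in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
print(psi(reference, reference))      # identical distributions
print(psi(reference, shifted) > 0.20) # compare against MLVC_PSI_THRESHOLD
```

A score above the configured threshold (0.20 by default in the settings above) is what would be recorded as a drift event for that window.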
Unit tests require no infrastructure:
pytest tests/unit/
Integration tests require a running PostgreSQL:
pytest tests/integration/
Do not commit .mlvc/config.json to Git. The file is local project state (like .git/ itself) and is listed in .gitignore. Each developer runs mlvc init --name <project> in their working directory to connect to the shared database.
See CONTRIBUTING.md.