feat(cli): add migrate-model command for legacy index migration #232

Open
moabualruz wants to merge 8 commits into yoanbernabeu:main from moabualruz:feat/migrate-model-command

moabualruz commented Apr 23, 2026

Summary

Adds grepai migrate-model <provider/model>, a metadata-only command that stamps every chunk whose EmbedModel field is empty with the given provider/model tag.

Depends on #200 (multi-model index support). Design discussion: #233.

What it does

grepai migrate-model ollama/nomic-embed-text
# Migrated 14 382 chunks with model tag "ollama/nomic-embed-text".
  • Validates provider/model format (must contain /, both parts non-empty)
  • Currently supports the GOB backend only (returns a clear error for postgres/qdrant)
  • No embedding API calls — pure metadata update
  • Required workflow when enabling store.multi_model: true on an existing index so legacy chunks become searchable under the new per-model filtering
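The behaviour listed above can be sketched in a few lines of Go. This is a minimal in-memory illustration, not the code in cli/migrate_model.go: the `Chunk` type and the `validateModelTag` and `stampUntagged` helpers are hypothetical names, and only the `EmbedModel` field comes from this PR. The sketch splits on the first "/", so a tag such as `a/b/c` is accepted; whether the real command does the same is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Chunk is a hypothetical stand-in for the store's chunk type; only the
// EmbedModel field is taken from the PR description.
type Chunk struct {
	ID         string
	EmbedModel string
}

// validateModelTag checks the "provider/model" shape: the tag must contain
// "/" and the text on both sides of the first "/" must be non-empty.
func validateModelTag(tag string) error {
	provider, model, ok := strings.Cut(tag, "/")
	if !ok || provider == "" || model == "" {
		return errors.New("model tag must have the form provider/model")
	}
	return nil
}

// stampUntagged sets EmbedModel on every chunk whose tag is empty and
// returns how many chunks were changed. Already-tagged chunks are untouched,
// and no embedding is recomputed.
func stampUntagged(chunks []Chunk, tag string) (int, error) {
	if err := validateModelTag(tag); err != nil {
		return 0, err
	}
	migrated := 0
	for i := range chunks {
		if chunks[i].EmbedModel == "" {
			chunks[i].EmbedModel = tag
			migrated++
		}
	}
	return migrated, nil
}

func main() {
	chunks := []Chunk{
		{ID: "a"},
		{ID: "b", EmbedModel: "openai/text-embedding-3-small"},
		{ID: "c"},
	}
	tag := "ollama/nomic-embed-text"
	n, err := stampUntagged(chunks, tag)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("Migrated %d chunks with model tag %q.\n", n, tag)
}
```

Because only empty tags are overwritten, running the sketch twice with the same tag changes nothing the second time.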

Files changed

| File | Change |
| --- | --- |
| cli/migrate_model.go | New command (106 lines) |
| docs/src/content/docs/backends/embedders.md | Model Tagging tip referencing this command |
| CHANGELOG.md | Unreleased entry |

Related

Split from #199 (closed) as requested by maintainer. Part 3 of 3:

  1. fix(pathutil): resolve symlinks and junctions gracefully on all platforms #231 — symlink/junction fix
  2. feat(store): add embedding model tagging with multi-model index support #200 — multi-model index support (prerequisite)
  3. This PR — migrate-model command

Design and open questions tracked in #233.

Add failing tests that define the expected behavior for embedding model
tagging on chunks and the multi_model configuration option. Tests cover:
- EmbedModel filter returns only matching chunks
- EmbedModel filter excludes chunks with empty tags (strict)
- No filter returns all chunks (zero behavioral change)
- MultiModel defaults to false
- EmbedModelTag() helper returns provider/model format
- EmbedModelTag() guards against empty provider or model
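
The first three filter expectations in this list can be illustrated with a small standalone sketch. It is not GOBStore.Search itself: `Chunk` and `filterByModel` are hypothetical names, and the real search also ranks by vector similarity, which is omitted here.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for store.Chunk; only EmbedModel is
// taken from the PR description.
type Chunk struct {
	ID         string
	EmbedModel string
}

// filterByModel mirrors the tested semantics:
//   - an empty filter returns every chunk (no behavioural change),
//   - a non-empty filter keeps exact matches only, so untagged chunks
//     (empty EmbedModel) are excluded.
func filterByModel(chunks []Chunk, embedModel string) []Chunk {
	if embedModel == "" {
		return chunks
	}
	var out []Chunk
	for _, c := range chunks {
		if c.EmbedModel == embedModel {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	chunks := []Chunk{
		{ID: "a", EmbedModel: "ollama/nomic-embed-text"},
		{ID: "b"}, // legacy chunk without a tag
		{ID: "c", EmbedModel: "openai/text-embedding-3-small"},
	}
	fmt.Println(len(filterByModel(chunks, "")))
	fmt.Println(len(filterByModel(chunks, "ollama/nomic-embed-text")))
}
```

The strict exclusion of untagged chunks is the reason a migration step is needed before enabling per-model filtering on an existing index.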
…ict filtering

- Add EmbedModel string field to store.Chunk for provider/model tagging
- Add EmbedModel filter to SearchOptions; when set, GOBStore.Search
  returns only chunks with exact match (empty tags excluded)
- Add MultiModel bool to StoreConfig (defaults to false)
- Add Config.EmbedModelTag() helper that returns "provider/model"
  with empty guards for missing provider or model
- Add embedModelTag field and SetEmbedModelTag method to Indexer
- Pass embedModelTag through createStoreChunks to stamp on every chunk
- Wire in cli/watch.go: call SetEmbedModelTag when multi_model is true
- Update existing indexer tests for new createStoreChunks signature
- Add embedModelTag field and SetEmbedModelFilter method to Searcher
- Pass EmbedModel in SearchOptions for vector-only search
- Filter allChunks by model tag in hybridSearch before text search
- Wire in cli/search.go and mcp/server.go when multi_model is true
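
As a rough illustration of the Config.EmbedModelTag() helper named in this commit, the sketch below assumes a config shape with `Embedder.Provider` and `Embedder.Model` fields and assumes the guard returns an empty string when either part is missing. Both are assumptions made for the example; the PR text only states that the helper returns "provider/model" and guards against empty values.

```go
package main

import "fmt"

// EmbedderConfig and Config are hypothetical shapes used for illustration;
// the real field names in the project's config package may differ.
type EmbedderConfig struct {
	Provider string
	Model    string
}

type Config struct {
	Embedder EmbedderConfig
}

// EmbedModelTag returns "provider/model". When either part is missing it
// returns "" (an assumed guard behaviour) so that callers never stamp a
// half-formed tag such as "ollama/" on a chunk.
func (c *Config) EmbedModelTag() string {
	if c.Embedder.Provider == "" || c.Embedder.Model == "" {
		return ""
	}
	return c.Embedder.Provider + "/" + c.Embedder.Model
}

func main() {
	cfg := &Config{Embedder: EmbedderConfig{Provider: "ollama", Model: "nomic-embed-text"}}
	fmt.Printf("%q\n", cfg.EmbedModelTag())

	incomplete := &Config{Embedder: EmbedderConfig{Provider: "ollama"}}
	fmt.Printf("%q\n", incomplete.EmbedModelTag())
}
```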

When multi_model is enabled, watch and search commands now check for
chunks with empty EmbedModel before proceeding. If untagged chunks are
found, the command exits with a message directing the user to run
'grepai migrate-model <provider/model>'.

Shared helper countUntaggedChunks in cli/multi_model.go.
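
A possible shape for this guard is sketched below. Only the name countUntaggedChunks comes from the commit message; its signature here, the `Chunk` type, the `checkTagged` wrapper, and the error wording are assumptions made for the example.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for store.Chunk.
type Chunk struct {
	ID         string
	EmbedModel string
}

// countUntaggedChunks returns how many chunks have an empty EmbedModel.
// The real helper in cli/multi_model.go may take a store rather than a slice.
func countUntaggedChunks(chunks []Chunk) int {
	n := 0
	for _, c := range chunks {
		if c.EmbedModel == "" {
			n++
		}
	}
	return n
}

// checkTagged is a hypothetical pre-flight check: with multi-model mode off
// it is a no-op; with it on, any untagged chunk produces an error that
// points the user at the migration command.
func checkTagged(chunks []Chunk, multiModel bool) error {
	if !multiModel {
		return nil
	}
	if n := countUntaggedChunks(chunks); n > 0 {
		return fmt.Errorf("%d chunks have no embedding model tag; run 'grepai migrate-model <provider/model>' first", n)
	}
	return nil
}

func main() {
	chunks := []Chunk{
		{ID: "a", EmbedModel: "ollama/nomic-embed-text"},
		{ID: "b"}, // legacy chunk without a tag
	}
	if err := checkTagged(chunks, true); err != nil {
		fmt.Println("blocked:", err)
		return
	}
	fmt.Println("index is fully tagged")
}
```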

New command `grepai migrate-model <provider/model>` stamps every chunk
whose EmbedModel field is empty with the given tag. Metadata-only operation;
no embedding API calls. Validates that the provider/model argument contains "/".
Currently supports only the GOB backend.

@moabualruz (Author)

Design discussion for the multi-model feature (prerequisite for this PR) opened in #233. This command is part of the migration story described there.
