Add embedding generation APIs#244
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## trunk #244 +/- ##
============================================
- Coverage 88.12% 87.39% -0.74%
- Complexity 1213 1258 +45
============================================
Files 60 61 +1
Lines 3934 4124 +190
============================================
+ Hits 3467 3604 +137
- Misses 467 520 +53
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
I know we already have #132 open to add embedding support and while that PR is fairly old now, it has already gone through an initial review. I've not compared the code there to here but curious if the approach taken here is different than that and if so, what the benefits are here as compared to that? Ideally we proceed with a single approach to solve the same problem to not overtax contributors here. |
|
Thanks for pointing that out. I honestly missed #132 before opening this, and there's definitely overlap. My apologies for the duplicated effort. The main distinction I see is that this PR is provider-agnostic, while #132 includes OpenAI-specific implementation that I think belongs in the OpenAI provider package, not Happy to close this if maintainers want #132 to remain canonical, or take ownership here if this narrower direction makes more sense. |
I'd defer to @JasonTheAdams and/or @felixarntz on that, I think ideal to have a single PR we're working from in order to not duplicate effort but I also know that original PR has sat for months and likely needs work to get it across the finish line so may be worth closing that out and proceeding here instead. |
JasonTheAdams
left a comment
There was a problem hiding this comment.
This is great! Thanks for the PR, @chubes4! One key suggestion:
Let's introduce an Embedding Value Object that's iterable and immutable, takes a list<int|float> of values and an integer for dimensions. This will contain all the validations (cleaning up EmbeddingResult and make it a first-class type.
Summary
Fixes #242.
Testing
composer phpcscomposer phpstancomposer test:unit -- --filter 'testGenerateEmbeddingResult|testGenerateEmbeddingReturnsFirstVector|testGenerateEmbeddingsReturnsBatchVectors|testGenerateEmbeddingResultWithInvalidModel|testGenerateEmbedding|testGenerateEmbeddings|EmbeddingResultTest'php -d error_reporting='E_ALL & ~E_DEPRECATED' vendor/bin/phpunit --testsuite unitAI assistance