Support pagination with Solr Result Grouping#145
Conversation
When using [result grouping](https://cwiki.apache.org/confluence/display/solr/Result+Grouping), we don't get a simple number of documents matched. Instead, related documents are collapsed into groups and the groups are what we (probably) want to paginate. This is a bit whacky as it is technically possible to specify different grouping criteria. In that case, we'd get multiple lists of groups back. This change only looks at the first grouping applied as I don't see how we could make sense of the multi-group case.
|
/cc @dword123 |
|
Thank you for your contribution. Could you add a test case for result grouping please? |
There was a problem hiding this comment.
Pull request overview
Adds grouping-aware pagination support for Solr/Solarium queries so pagination can be based on the number of result groups (ngroups) instead of matched documents.
Changes:
- Enables
group.ngroups(via Solarium’s grouping component) when grouping is present on the query. - Attempts to derive the paginator total count from the first grouping result’s
numberOfGroupswhennumFoundis not available.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if ($client instanceof \Solarium\Client && $query instanceof \Solarium\QueryType\Select\Query\Query) { | ||
|
|
||
| if ($grouping = $query->getComponent(Query::COMPONENT_GROUPING, false)) { |
There was a problem hiding this comment.
$grouping is only conditionally assigned inside the if ($grouping = ...) block, but it’s later referenced in if ($event->count === null && $grouping). If getNumFound() returns null and the query has no grouping component, this will read an undefined variable (PHP notice/warning). Initialize $grouping (e.g., to null/false) before the first if so it’s always defined.
| if ($grouping = $query->getComponent(Query::COMPONENT_GROUPING, false)) { | |
| $grouping = $query->getComponent(Query::COMPONENT_GROUPING, false); | |
| if ($grouping) { |
| $event->items = $solrResult->getIterator(); | ||
| $event->count = $solrResult->getNumFound(); | ||
|
|
||
| if ($event->count === null && $grouping) { | ||
| $groups = $solrResult->getComponent('grouping')->getGroups(); |
There was a problem hiding this comment.
This sets $event->count from getNumFound() and only falls back to ngroups when getNumFound() is null. With Solr result grouping, numFound is typically the number of matched documents (not groups), so pagination will still use the wrong total. When grouping is enabled, prefer the grouping numberOfGroups value (and only fall back to numFound if ngroups isn’t available).
| $groups = $solrResult->getComponent('grouping')->getGroups(); | ||
| $firstGroup = reset($groups); | ||
| $event->count = $firstGroup->getNumberOfGroups(); |
There was a problem hiding this comment.
reset($groups) can return false when there are no grouping results (e.g., zero matches, or grouping component present but no groups returned). Calling getNumberOfGroups() on false will cause a fatal error. Guard for an empty $groups / missing first group and set $event->count to 0 (or fall back safely) in that case.
| $groups = $solrResult->getComponent('grouping')->getGroups(); | |
| $firstGroup = reset($groups); | |
| $event->count = $firstGroup->getNumberOfGroups(); | |
| $groupingComponent = $solrResult->getComponent('grouping'); | |
| if ($groupingComponent !== null) { | |
| $groups = $groupingComponent->getGroups(); | |
| if (!empty($groups)) { | |
| $firstGroup = reset($groups); | |
| if (false !== $firstGroup) { | |
| $event->count = $firstGroup->getNumberOfGroups(); | |
| } else { | |
| $event->count = 0; | |
| } | |
| } else { | |
| $event->count = 0; | |
| } | |
| } else { | |
| $event->count = 0; | |
| } |
| if ($grouping = $query->getComponent(Query::COMPONENT_GROUPING, false)) { | ||
| $grouping->setNumberOfGroups(true); | ||
| } |
There was a problem hiding this comment.
New grouping-aware pagination logic is introduced here, but there’s no PHPUnit coverage asserting that (a) the query enables numberOfGroups, and (b) totalItemCount uses ngroups when grouping is enabled. Please add a unit test for SolariumQuerySubscriber similar to tests/Test/Pager/Subscriber/Paginate/ElasticaTest.php, using mocks for the Solarium client/result/grouping components.
When using result grouping, we don't get a simple number of documents matched. Instead, related documents are collapsed into groups and the groups are what we (probably) want to paginate.
This is a bit whacky as it is technically possible to specify different grouping criteria. In that case, we'd get multiple lists of groups back.
This change only looks at the first grouping applied as I don't see how we could make sense of the multi-group case.