fix: history query with non-matching @id returns all genesis facts #1252
Open
bplatz wants to merge 1 commit into
…resolvable

When a history query's pattern binds the subject (or predicate) to a literal IRI/Sid that doesn't exist in the persisted store, `BinaryFilter` ends up with `s_id`/`p_id` = None. Since the persisted pass only constrains by those ids (object filtering is not applied there), the unconstrained filter walked every leaflet and emitted every base row whose `t` fell in range, surfacing as a history query for a non-matching `@id` returning every genesis fact at t:1.

Guard the persisted pass: when a literal-bound component fails to resolve to a store id, skip the persisted branch entirely. Novelty still runs (so subjects present only in unindexed data continue to match), where `flake_matches_range_eq` correctly short-circuits to empty for absent subjects.

Adds two regression tests covering both the indexed and novelty-only paths.
History queries with a literal `@id` that doesn't match any subject silently dropped the subject filter and returned every base row at `t:1`. On larger ledgers this blew past Lambda's 6 MB response limit and surfaced as a generic upstream 502; on smaller ledgers it returned plausible-looking garbage.
Root cause
`BinaryHistoryScanOperator::collect_history_flakes` builds a `BinaryFilter` from the pattern. When the subject literal doesn't exist in the persisted store, `store.find_subject_id(iri)` returns `None`, so `filter.s_id = None`. The persisted pass only constrains by `s_id`/`p_id` (object filtering isn't applied there), so an unconstrained filter walked every leaflet and emitted every base row whose `t` fell in range.

Novelty was unaffected: it filters via `flake_matches_range_eq` keyed on the snapshot-space `s_sid`, which correctly short-circuits to empty.

Fix
Before opening the persisted branch, detect literal-bound pattern components (subject `Ref::Iri`/`Ref::Sid`, same for predicate) that failed store resolution and skip the persisted pass entirely. Novelty still runs so subjects present only in unindexed data continue to match.
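
For reviewers who want the shape of the guard without opening the diff, here is a minimal standalone sketch. It is not the code in this PR: `Store`, `Ref`, `scan`, `persisted_filter`, `persisted_pass`, and `sample_store` are simplified, hypothetical stand-ins, only the subject component is modeled, and the novelty pass is left out. It shows how a filter with `s_id = None` matches every row, and how treating an unresolved literal as "skip the persisted pass" avoids that.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real pattern and filter types.
enum Ref {
    Var,         // unbound variable: legitimately matches any subject
    Iri(String), // literal IRI: must resolve to a store id to match anything
}

struct BinaryFilter {
    s_id: Option<u64>,
    p_id: Option<u64>,
}

struct Store {
    subjects: HashMap<String, u64>,
    rows: Vec<(u64, u64, i64)>, // (s_id, p_id, t)
}

impl Store {
    fn find_subject_id(&self, iri: &str) -> Option<u64> {
        self.subjects.get(iri).copied()
    }

    // Persisted pass: constrains only by the ids present in the filter,
    // so a filter with both ids set to None matches every row.
    fn scan(&self, filter: &BinaryFilter) -> Vec<(u64, u64, i64)> {
        self.rows
            .iter()
            .copied()
            .filter(|(s, p, _)| {
                filter.s_id.map_or(true, |id| id == *s)
                    && filter.p_id.map_or(true, |id| id == *p)
            })
            .collect()
    }
}

// Returns None when a literal-bound subject fails to resolve, which the
// caller treats as "skip the persisted pass" rather than "no constraint".
fn persisted_filter(store: &Store, subject: &Ref) -> Option<BinaryFilter> {
    let s_id = match subject {
        Ref::Var => None,
        Ref::Iri(iri) => Some(store.find_subject_id(iri)?),
    };
    Some(BinaryFilter { s_id, p_id: None })
}

fn persisted_pass(store: &Store, subject: &Ref) -> Vec<(u64, u64, i64)> {
    match persisted_filter(store, subject) {
        Some(filter) => store.scan(&filter),
        None => Vec::new(), // unresolved literal: nothing persisted can match
    }
}

// Small fixture: two known subjects, three base rows at t = 1.
fn sample_store() -> Store {
    let mut subjects = HashMap::new();
    subjects.insert("ex:alice".to_string(), 1);
    subjects.insert("ex:bob".to_string(), 2);
    Store {
        subjects,
        rows: vec![(1, 10, 1), (2, 10, 1), (2, 11, 1)],
    }
}
```

In this sketch the difference between "unbound" and "bound but unresolved" is carried by the outer `Option` returned from `persisted_filter`, so the scan itself never sees a literal that silently degraded into a wildcard.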