fix: history query with non-matching @id returns all genesis facts#1252

Open: bplatz wants to merge 1 commit into main from fix/history-missing-iri

@bplatz bplatz commented May 23, 2026

History queries with a literal @id that doesn't match any subject silently
dropped the subject filter and returned every base row at t:1. On larger
ledgers this blew past Lambda's 6 MB response limit and surfaced as a generic
upstream 502; on smaller ledgers it returned plausible-looking garbage.

Root cause

BinaryHistoryScanOperator::collect_history_flakes builds a BinaryFilter from
the pattern. When the subject literal doesn't exist in the persisted store,
store.find_subject_id(iri) returns None, so filter.s_id = None. The
persisted pass only constrains by s_id/p_id (object filtering isn't applied
there), so an unconstrained filter walked every leaflet and emitted every base
row whose t fell in range.
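The failure mode can be reproduced in a small model. This is only a sketch under stated assumptions: `Row`, `scan_persisted`, the free-standing `find_subject_id`, and the shape of `BinaryFilter` here are hypothetical simplifications that borrow names from the description above, not the actual types in the codebase. It shows how a `None` in `s_id` behaves as a wildcard rather than as "no match".

```rust
use std::ops::RangeInclusive;

// Hypothetical, simplified stand-ins; the real types in the codebase differ.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    s_id: u64,
    p_id: u32,
    t: i64,
}

#[derive(Debug, Default)]
struct BinaryFilter {
    s_id: Option<u64>,
    p_id: Option<u32>,
}

impl BinaryFilter {
    // `None` means "unconstrained", so an unresolved literal matches every row.
    fn matches(&self, row: &Row) -> bool {
        self.s_id.map_or(true, |s| s == row.s_id)
            && self.p_id.map_or(true, |p| p == row.p_id)
    }
}

// Assumed lookup: returns None when the IRI is not in the persisted store.
fn find_subject_id(known: &[(&str, u64)], iri: &str) -> Option<u64> {
    known.iter().find(|(k, _)| *k == iri).map(|(_, id)| *id)
}

// Walks every row and keeps those in the t range that pass the filter.
fn scan_persisted(
    rows: &[Row],
    filter: &BinaryFilter,
    t_range: RangeInclusive<i64>,
) -> Vec<Row> {
    rows.iter()
        .filter(|r| t_range.contains(&r.t) && filter.matches(r))
        .cloned()
        .collect()
}
```

In this model, building the filter from `find_subject_id(&known, "ex:missing")` leaves `s_id` as `None`, and `scan_persisted` then returns every row in the `t` range, which mirrors the behavior described above.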

Novelty was unaffected: it filters via flake_matches_range_eq keyed on the
snapshot-space s_sid, which correctly short-circuits to empty.

Fix

Before opening the persisted branch, detect literal-bound pattern components
(a subject bound as Ref::Iri or Ref::Sid, and likewise for the predicate) that
failed store resolution, and skip the persisted pass entirely. Novelty still
runs, so subjects present only in unindexed data continue to match.
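The guard can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `Store`, `literal_unresolved`, `collect_history`, and the two-variant `Ref` are hypothetical, only the subject position is modeled, `Ref::Sid` is omitted, and novelty is matched by comparing IRIs as a stand-in for the real `flake_matches_range_eq` check.

```rust
// Hypothetical, simplified model of the guard; not the actual implementation.
#[derive(Debug, Clone, PartialEq)]
enum Ref {
    Var(String), // unbound variable: no constraint
    Iri(String), // literal-bound subject
}

struct Store {
    subjects: Vec<(String, u64)>,
}

impl Store {
    fn find_subject_id(&self, iri: &str) -> Option<u64> {
        self.subjects
            .iter()
            .find(|(k, _)| k.as_str() == iri)
            .map(|(_, id)| *id)
    }
}

// True when the component is literal-bound but has no id in the store.
fn literal_unresolved(component: &Ref, store: &Store) -> bool {
    match component {
        Ref::Iri(iri) => store.find_subject_id(iri).is_none(),
        Ref::Var(_) => false,
    }
}

// Returns the `t` of every matching row: persisted rows first, then novelty.
fn collect_history(
    subject: &Ref,
    store: &Store,
    persisted: &[(u64, i64)],  // (subject id, t)
    novelty: &[(String, i64)], // (subject IRI, t)
) -> Vec<i64> {
    let mut out = Vec::new();
    let s_id = match subject {
        Ref::Iri(iri) => store.find_subject_id(iri),
        Ref::Var(_) => None,
    };
    // Guard: an unresolved literal must not degrade into an unconstrained scan.
    if !literal_unresolved(subject, store) {
        out.extend(
            persisted
                .iter()
                .filter(|(s, _)| s_id.map_or(true, |id| id == *s))
                .map(|(_, t)| *t),
        );
    }
    // Novelty always runs, so subjects that exist only in unindexed data match.
    out.extend(
        novelty
            .iter()
            .filter(|(s, _)| match subject {
                Ref::Iri(iri) => iri == s,
                Ref::Var(_) => true,
            })
            .map(|(_, t)| *t),
    );
    out
}
```

With this shape, a literal subject that is absent from both the store and novelty yields an empty result, while a subject present only in novelty still matches through the second pass.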

…resolvable

When a history query's pattern binds the subject (or predicate) to a literal
IRI/Sid that doesn't exist in the persisted store, `BinaryFilter` ends up with
`s_id`/`p_id` = None. Since the persisted pass only constrains by those ids
(object filtering is not applied there), the unconstrained filter walked every
leaflet and emitted every base row whose `t` fell in range. This surfaced as a
history query for a non-matching `@id` returning every genesis fact at t:1.

Guard the persisted pass: when a literal-bound component fails to resolve to a
store id, skip the persisted branch entirely. Novelty still runs (so subjects
present only in unindexed data continue to match), where
`flake_matches_range_eq` correctly short-circuits to empty for absent subjects.

Adds two regression tests covering both the indexed and novelty-only paths.
@bplatz bplatz requested review from aaj3f and zonotope May 23, 2026 11:22