Goal
Establish the convention that every ingested source ships its own fixture set under internal/sources/<name>/testdata/, covering both adapter behaviour and matcher realism on data shaped like that source.
Why
The source-agnostic fixtures in internal/identity/testdata/ (see #73) test the matcher algorithm. They cannot test:
Whether the source's adapter correctly translates upstream rows into identity.Record values (handling that source's quirks: name conventions, category strings, address-tag patterns, missing fields).
Whether the matcher gets that source's actual data right at acceptable precision/recall.
Centralising those tests under internal/identity would hide source-specific concerns in a shared file. Each source owns the tests for its own quirks.
Scope
Convention (to be applied per source when each source is added):
internal/sources/<name>/testdata/adapter_fixtures.json — given this upstream row, the adapter produces this identity.Record.
internal/sources/<name>/testdata/match_corpus.json — records sampled from the source paired with nearby OSM places + ground-truth labels. Used to measure aggregate precision/recall for that source.
Out of scope
Building these for any specific source today. This issue documents the expectation; each concrete source's implementation issue picks them up.
Acceptance
internal/sources/README.md points to that source as the pattern to follow.