draft: D1 strawman — system-reference primitive + fidelity enum (RFC #4616) by EvgenyAndroid · Pull Request #4622 · adcontextprotocol/adcp

EvgenyAndroid · 2026-05-16T21:28:21Z

This is a draft / discussion artifact, NOT a merge bid. Filed proactively per the process proposed in #4616 (comment-4468168781) so the WG has something concrete to react to on the shape of decision D1 (shared primitive vs per-dimension types).

Closing this and re-drafting is the expected path if the WG counter-proposes a fundamentally different shape. No sunk cost — three files + a changeset.

Updated 2026-05-16 round 1 (commit 5a0ebfe7): folded @lukasz-pubx's review + #4616 D1 vote:

Renamed id → value (cross-axis neutrality — primitive is used for taxonomies & measurement currencies, not just identity systems)

Added converted to the fidelity enum (deterministic & lossless conversion case)

Added new core/system-reference-conversion.json schema factoring "how preserved?" (enum) separately from "via what?" ({from, to, method, method_details?})

Updated 2026-05-16 round 2 (commit ac4c6183): @lukasz-pubx's follow-up — three new method values + a vendor-identity field:

Method enum extends from 4 → 7: id_graph | name_match | crosswalk | upscaled | inferred | projected | custom

Added method_provider? to the conversion structure (opaque-by-convention string for vendor identity: LiveRamp, ID5, IAB, Nielsen, Comscore, etc.)

Updated 2026-05-17 round 3 (commit 9716bb15): @bokelley's third-sibling framing on #4616 (issuecomment-4470049814) — D1 covers measurement AS WELL AS signals; the primitive's design was already cross-axis but descriptions buried measurement as secondary. Promoting to first-class:

Schema descriptions rewritten to explicitly enumerate four primary axes (identity / taxonomy / geographic / measurement), with measurement currencies AND methodology sources called out distinctly.

New examples: nielsen_p_18_49:P18-49:2025 (measurement currency), measurement_source:set_top_box (the canonical use case for #2041, Measurement source attribution on delivery reports), STB-to-currency projection (measurement-side conversion).

Description-only / examples-only change. Structural shape unchanged.

Customer-set update 2026-05-18: #4475 closed by @bokelley on 5/17 — "we have geo_metro already, and I think it makes sense to extend this to include MSAs vs creating a new concept here. The only one of these that doesn't fit is zip, but we also have geo_postal_areas which covers that." Per-dimension geo schemas already cover #4475's row. Re-centering D1's customer set on rows where per-dimension schemas DON'T already exist — see Cross-references below. D1's value prop sharpens here: stop dimension-schema proliferation before it starts on newly-modeled rows (ID graphs, measurement methodology, taxonomy versioning), rather than replace per-dimension schemas that already work. No structural change to the strawman.

Updated 2026-05-20 round 4 (commit ed382184): @SimonaNemes endorsed round 2 on #4616 (issuecomment-4493606229) AND offered a sharpening of the inferred vs projected distinction — both can use ML; the distinction is the LEVEL at which uncertainty operates:

inferred = entity-level attribution ("given clues, who/what is this entity?" — uncertainty per-record)

projected = population-level estimation ("given this sample, what should we expect at scale?" — uncertainty in the estimate, not individuals)

enumDescriptions rewritten with the entity-level vs population-level framing and cross-references. Paired example added: same device_graph input data routed through inferred (per-record IAB attribution confidence 0.78) and projected (population-level membership 95% CI ±0.4pp). Description + example only; no structural change.

Updated 2026-05-24 round 5 (commit 40cedb4f): @bokelley's WG-acceptance comment on #4616 (issuecomment-4526559566) — "Yes on D1 as a shared primitive for newly-modeled epistemic rows" — with four substantive tightening notes addressed:

Version semantics REVERSED: omitted version now means UNKNOWN / unpinned, NOT a wildcard. Exact equality requires (system, value, version) match for versioned systems; row-level schemas MAY declare a system version-insensitive (UID2 / RampID etc.). Comparators MUST NOT treat omitted version as "matches any version."

converted fidelity tightened — reserved for deterministic AND row-semantics-preserving mappings. Deterministic ≠ lossless preservation; sellers MUST NOT advertise converted when conversion changes the row's meaningful semantics.

upscaled and crosswalk cautions — both should typically pair with approximated fidelity (undefined inverse → lost granularity for upscaled; deterministic ≠ semantically lossless for crosswalk). converted only when the row explicitly says the lost granularity / semantic difference does not matter.

Interop caveat added to primitive description: "the primitive shape alone does NOT create interoperability — consuming row-level schemas MUST constrain or document which system values are meaningful for that row."

Plus scoping clarification: D1 explicitly applies to newly-modeled rows only and does NOT replace per-dimension schemas that already work (geo_metros, geo_postal_areas). Per bokelley: "After that, I'm comfortable using it as the D1 foundation for #2041, the identity-substrate RFC, and the #4472 split."

Updated 2026-05-24 round 6 (commit b49275923): Picks up the @bokelley / @lukasz-pubx / @SimonaNemes / Addie ads.txt-pattern thread on #4616 — protocol carries structured-where-verifiable, links-to-doc-where-it's-a-claim. The strawman primitives are already in the "keep structured" half; round 6 adds the link-out anchor:

method_doc_url? on system-reference-conversion.json — optional URI pointing at the seller's published methodology document (vendor identity-graph page, published crosswalk spec, IAB migration map). Strictly informational on the wire; buyers MAY follow out-of-band to verify but MUST NOT branch on its content programmatically. Consuming row-level schemas MAY require this field in their row's binding if methodology disclosure matters.

Description note that consuming rows adopting the primitive MAY add their own row-level last_updated field on the row itself for signal-record freshness (verifiable, per @SimonaNemes — seller published this record on this date, even if underlying methodology freshness isn't).

Net: +1 optional field, +1 description sentence, +1 example field. Gives downstream row-level RFCs (#4472, #2041, identity-substrate, @tescoboy's product-level) a canonical place to anchor link-out fields without each row reinventing.

Updated 2026-05-25 round 7 (commit ad712795): @bokelley's 5/25 line-level reviews went deeper than the round-5 tightening notes — abstraction-level questions about whether the primitive justifies its surface area. Description-only sharpening (no schema shape change):

Primitive description now leads with the union-axis value proposition — D1 earns its keep on rows where a single field can carry any of N systems with the same comparator semantics (identity substrate, measurement source, PBA taxonomy). For single-axis rows, inline per-dimension fields remain simpler. Explicitly does NOT replace existing per-dimension schemas (per @bokelley's "RampID is defined elsewhere" point).

Conversion structure description clarifies single-party observable scope — the structure describes ONE party's observable conversion (signals seller's in-agent translation, measurement vendor's projection), NOT the multi-hop chain (publisher / SSP / DSP / agency graphs) per @bokelley's review.

Naming note added acknowledging @lukasz-pubx's system → type suggestion but keeping system (less overloaded across AdCP).

Net: 3 description rewrites, zero shape changes. The schema is unchanged; descriptions catch up to what the structure actually means.

What's in the PR

Three reusable schema primitives, no row-level adoption:

1. `static/schemas/source/core/system-reference.json`

The canonical {system, value, version?, name?} shape for any value defined against an external identity / taxonomy / geographic / measurement system. Used wherever a row-level RFC needs to reference a Nielsen DMA market, an IAB Audience Taxonomy node, a UID2 identity, etc.

Design choices spelled out for review:

Field named value, not id. The primitive is cross-axis: identity systems issue IDs, taxonomies issue values/terms, measurement systems issue methodology labels. id is identity-axis-coded and collides with AdCP's existing *_id fields that point at AdCP-issued entities with AdCP lifecycle. value is the cross-vocabulary least common denominator.
system is an open string at the primitive level. Per-use constraints (closed enums, vendor allowlists) belong in the consuming schema's oneOf or enum, not here. Rationale: a closed enum at the primitive level forces every new system addition through a primitive bump; per-use constraints let row-RFCs add systems independently. Recommended convention: kebab-case or snake_case stable identifiers.
version? is RECOMMENDED, not REQUIRED. Some systems (UID2, RampID) are version-less; others (Nielsen DMA, IAB Audience Taxonomy) have boundary/semantic drift between versions. As of round 5, an omitted version means UNKNOWN / unpinned; it is neither "latest stable" nor a wildcard. Comparators MUST NOT treat omitted version as matching any version, and exact equality requires (system, value, version) to match for versioned systems. Row-level schemas MAY declare a system version-insensitive. A "required wherever the system has version history" rule is the kind of constraint that belongs in the row-RFC, not the primitive.
name? is informational only. Never used for equality or routing. Strictly for UI display. The canonical reference is (system, value, version?).
additionalProperties: false — primitives are tight; extensions belong on the consuming schema, not on the reference itself.
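
The comparator rules above can be sketched in a few lines. This is a minimal illustration, assuming the round-5 version semantics (omitted version is unknown, not a wildcard). The names `SystemReference`, `exact_match`, and `version_insensitive` are hypothetical and not part of the schema, and the system, value, and version strings are placeholders. Treating two unpinned references to a versioned system as "not provably equal" is a choice made by this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SystemReference:
    """Hypothetical in-memory mirror of {system, value, version?, name?}."""
    system: str
    value: str
    version: Optional[str] = None
    name: Optional[str] = None  # display only; never used for equality


def exact_match(a: SystemReference, b: SystemReference,
                version_insensitive: FrozenSet[str] = frozenset()) -> bool:
    """Return True only when both references provably name the same thing.

    `version_insensitive` stands in for the systems a row-level schema has
    declared version-less (an assumption of this sketch).
    """
    if (a.system, a.value) != (b.system, b.value):
        return False
    if a.system in version_insensitive:
        return True
    # An omitted version is UNKNOWN, so it cannot prove equality.
    if a.version is None or b.version is None:
        return False
    return a.version == b.version


# Placeholder values for illustration only.
dma_pinned = SystemReference("nielsen_dma", "501", "2024", name="New York")
dma_unpinned = SystemReference("nielsen_dma", "501")
uid_a = SystemReference("uid2", "abc")
uid_b = SystemReference("uid2", "abc", name="same value, different label")

print(exact_match(dma_pinned, dma_pinned))                # True
print(exact_match(dma_pinned, dma_unpinned))              # False
print(exact_match(uid_a, uid_b, frozenset({"uid2"})))     # True
```

Note that `name` never participates in the comparison, which matches the "informational only" rule for that field.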

2. `static/schemas/source/enums/system-reference-fidelity.json`

exact | converted | approximated | unsupported — generalizes the market_fidelity mechanism originally proposed in #4475 (now closed; mechanism preserved here as a reusable primitive for the remaining customer rows) to all reference-system axes. Used as a property on deployments[] entries or analogous per-destination structures.

exact — destination uses the same system (and version when both are present); reference frame preserved byte-for-byte.
converted — destination uses a DIFFERENT system but the seller asserts the conversion is deterministic and, per the round-5 tightening, preserves the row's meaningful semantics. Deterministic alone is not sufficient. Candidate examples, valid only where the consuming row treats the mapping as semantics-preserving: Nielsen DMA → Comscore Market via canonical county crosswalk; UID2 → ID5 via direct identity-graph link; IAB v2 → v3 via published migration map. Functionally equivalent to exact for buyers who only need preservation; structurally distinct because the system axis changed. Buyers reasoning about the federation chain read the deployment's conversion block (file 3 below).
approximated — destination maps to a different system or different version; close but not byte-equivalent. Buyer-decision point. For modeled audiences whose statistical validity depends on the reference frame, approximated activation may invalidate the model.
unsupported — destination cannot resolve the system. Activation will fail or silently degrade. Buyer MUST NOT proceed without explicit opt-in to seller's advertised fallback.

Fidelity is per-deployment, not per-signal — the signal's reference is canonical; fidelity is a function of where it's activated.
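
As a rough illustration of how a buyer agent might act on the four values, the sketch below maps a deployment's fidelity to a proceed/hold decision. It is an assumption-laden example: `Fidelity`, `activation_decision`, the two opt-in flags, and the `destination` key are hypothetical names, and a real agent would apply its own policy at the approximated decision point.

```python
from enum import Enum


class Fidelity(str, Enum):
    EXACT = "exact"
    CONVERTED = "converted"
    APPROXIMATED = "approximated"
    UNSUPPORTED = "unsupported"


def activation_decision(fidelity: Fidelity,
                        accepts_approximation: bool = False,
                        opted_into_fallback: bool = False) -> str:
    """Map a per-deployment fidelity value to 'proceed' or 'hold'."""
    if fidelity in (Fidelity.EXACT, Fidelity.CONVERTED):
        # Reference frame preserved (same system, or an asserted
        # semantics-preserving conversion).
        return "proceed"
    if fidelity is Fidelity.APPROXIMATED:
        # Buyer-decision point: proceed only if the buyer accepts drift.
        return "proceed" if accepts_approximation else "hold"
    # unsupported: needs explicit opt-in to the seller's advertised fallback.
    return "proceed" if opted_into_fallback else "hold"


# Hypothetical deployments[] entries; only the fidelity value is from the PR.
deployments = [
    {"destination": "dsp_a", "fidelity": "exact"},
    {"destination": "dsp_b", "fidelity": "approximated"},
    {"destination": "dsp_c", "fidelity": "unsupported"},
]
decisions = {d["destination"]: activation_decision(Fidelity(d["fidelity"]))
             for d in deployments}
print(decisions)
```

Because fidelity is evaluated per deployment, the same signal can yield "proceed" on one destination and "hold" on another.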

3. `static/schemas/source/core/system-reference-conversion.json`

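{from, to, method, method_provider?, method_details?, method_doc_url?} describing how a deployment converts between systems. REQUIRED when fidelity is converted; OPTIONAL surfacing when approximated (lets buyers see WHY a deployment is approximated, not just that it is). method_doc_url? (added in round 6) is an optional, strictly informational URI pointing at the seller's published methodology document; the remaining fields are: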

from — signal's original reference system (a full system-reference)
to — system the destination natively uses (a full system-reference)
method enum — id_graph | name_match | crosswalk | upscaled | inferred | projected | custom (semantics in the schema's enumDescriptions)
method_provider? — optional opaque-by-convention vendor identity (LiveRamp, ID5, IAB, Nielsen, etc.). Consumers MAY branch on well-known providers; not a closed enum
method_details? — free-text vendor-specific elaboration; strictly informational; buyer agents MUST NOT branch on this field

Factors the question correctly: the enum says "how preserved?"; the conversion structure says "via what?". Buyers who want the simple signal stop at the enum; buyers who want to reason about the federation chain (e.g. for ID-graph translation correctness) read the conversion block.
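
The pairing rules between the enum and the conversion block can be checked mechanically. The sketch below is one possible consumer-side check, not the schema's own validation: `check_deployment` and the `fidelity` / `conversion` key names on a deployment entry are assumptions made for illustration, and the identifier values are placeholders. It also flags the round-5 caution that upscaled and crosswalk typically pair with approximated.

```python
from typing import List

METHODS = {"id_graph", "name_match", "crosswalk", "upscaled",
           "inferred", "projected", "custom"}


def check_deployment(deployment: dict) -> List[str]:
    """Return errors and cautions; an empty list means nothing was flagged."""
    issues: List[str] = []
    fidelity = deployment.get("fidelity")
    conversion = deployment.get("conversion")

    if fidelity == "converted" and conversion is None:
        issues.append("error: conversion is required when fidelity is 'converted'")
    if conversion is None:
        return issues

    for key in ("from", "to", "method"):
        if key not in conversion:
            issues.append(f"error: conversion.{key} is missing")

    method = conversion.get("method")
    if method is not None and method not in METHODS:
        issues.append(f"error: unknown method {method!r}")
    if fidelity == "converted" and method in ("upscaled", "crosswalk"):
        issues.append(f"caution: {method!r} usually pairs with 'approximated'")
    return issues


ok = {
    "fidelity": "converted",
    "conversion": {
        "from": {"system": "uid2", "value": "abc"},
        "to": {"system": "id5", "value": "xyz"},
        "method": "id_graph",
        "method_provider": "LiveRamp",
    },
}
missing_block = {"fidelity": "converted"}
risky = {
    "fidelity": "converted",
    "conversion": {
        "from": {"system": "zip", "value": "10001"},
        "to": {"system": "nielsen_dma", "value": "501"},
        "method": "upscaled",
    },
}

for entry in (ok, missing_block, risky):
    print(check_deployment(entry))
```

The check reads only structured fields; it never inspects method_details or the content behind method_doc_url, since buyer agents are told not to branch on either.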

What's NOT in this PR (deliberately)

No row-level adoption. #4472 (RFC: Structured metadata fields for Projected Broadcast Audiences (PBAs) signals) doesn't adopt system-reference for taxonomy or ranking_reference. #2041 (Measurement source attribution on delivery reports) doesn't adopt it for source_type. Identity-substrate RFC isn't filed yet. Each row adopts independently in its own RFC against whatever shape D1 settles on. (#4475, RFC: Structured geographic market identifiers for signals with market-bounded audiences, the originally cited market-identifier customer, was closed on 5/17 in favor of extending existing geo_metro / geo_postal_areas; see customer-set update above.)
No closed system enum. This PR doesn't enumerate nielsen_dma | comscore_market | …. That's a per-row decision (some rows want closed enums for safety; others want open strings for extensibility).
No attestation hook. D3 (verifiability) is research-tier per the umbrella's suggested sequencing. The primitive has room to add attestations: [...] as a sibling field later without breaking, but doesn't ship it.

Open questions for WG review

Shape: is {system, value, version?, name?} the right field set? Alternatives raised in adjacent work include {kind, identifier, vocabulary?} and {type, code, scheme?}. Field names matter for downstream codegen.
system as open string vs closed enum at the primitive level. Open string is the strawman's choice (extensibility at row-RFC level); a closed enum at the primitive level is the more restrictive alternative. Trade-off explained above.
method enum scope. Round 2 extended the enum from 4 → 7 values per @lukasz-pubx feedback. Current set: id_graph | name_match | crosswalk | upscaled | inferred | projected | custom. Anything still missing for ID-graph / measurement / probabilistic cases?
Naming of the fidelity enum: system-reference-fidelity is the strawman's choice. system-fidelity, reference-fidelity, or domain-prefixed names (market-fidelity, taxonomy-fidelity per-dimension) are alternatives. Strawman picks the canonical one. (Originally mirrored the market_fidelity naming from #4475; that RFC is now closed but the mechanism generalizes cleanly to the remaining customer rows.)
3.1.x vs 4.0 landing window. The process proposal in #4616 (RFC: Signal epistemic model) argues for 3.1.x as foundational schema (additive, patch-eligible, unblocks 4.0 row-RFCs). Open to WG counter-proposal.

Test plan

All three JSON files parse as valid draft-07 schema
No existing schema modified — diff is purely additive
Changeset entry follows the #2506 / #4271 precedent
CI: schema-validation, json-schema-validation, lint-schema-links (run on PR open)

Cross-references

RFC: Signal epistemic model — framing the questions behind in-flight signal RFCs #4616 (umbrella — the framing this strawman responds to; @bokelley's 5/17 update extends scope to measurement)

Signals row-level RFCs (D1 customers, pending):

RFC: Structured metadata fields for Projected Broadcast Audiences (PBAs) signals #4472 (PBA audience_model — adopts system-reference for taxonomy / ranking_reference in its own PR once D1 lands)
@lszczesiak's pending identity-substrate RFC (would adopt system-reference for identity + use the fidelity enum + conversion structure for ID-graph translation)
Federation across signals-only agents (us + Affinity Answers, the two signals agents currently in the AAO registry): cross-provider signal discovery requires shared reference-frame primitives to compare catalogs; see #4616 (issuecomment-4471004955)

Closed / handled by existing per-dimension schemas (no longer a D1 customer):

RFC: Structured geographic market identifiers for signals with market-bounded audiences #4475 (market identifiers) — closed 2026-05-17. geo_metro already covers DMA/MSA; geo_postal_areas covers zip aggregates. Per-dimension geo schemas already exist, so D1 doesn't apply to this row.

Measurement row-level RFC (D1 customer, pending — added per @bokelley's third-sibling framing 5/17):

Measurement source attribution on delivery reports #2041 (measurement source attribution — adopts system-reference for source instead of a flat source_type enum; methodology blend becomes a conversion block)

Measurement prior art (already shipped, informs D3 direction):

feat(schemas): completion_source qualifier — seller_attested vs vendor_attested completion_rate (closes #3861) #3877 (completion_source: seller_attested | vendor_attested — pattern for "who attested?" as a structured field; precedent for the D3 attestation hook)
feat(schemas): delivery_measurement.vendors as BrandRef[] (closes #3860) #3885 (delivery_measurement.vendors as BrandRef[] — publishing/authorization-half infrastructure on the measurement side)
feat(schemas): measurement capability block + brand.json metric_categories (#3612) #3652 (measurement capability block — vocabulary-row partial shipped)

Adjacent (not blocked here):

Measurement vendor data latency profiles #2043 (measurement vendor data latency — delivery SLA, not epistemic; closed)
epic: broadcast platform support #1919 (epic: broadcast platform support — closed parent)

Precedents (PR shape):

schemas(push-notification-config): state legacy/9421 precedence in authentication field #2506 (precedent — same shape of edit on push-notification-config.json)
fix(schema,skill): align HMAC framing to RFC 9421 default in reporting-webhook, auth-scheme, call-adcp-agent SKILL.md #4271 (precedent — atomic schema + skill PR with changeset, merged 5/9)

I have read the IPR Policy

…rawman for adcontextprotocol#4616

NOT A MERGE BID: discussion artifact for the WG to react to on the shape of decision D1 (shared {system, id, version?} primitive vs per-dimension types) from the signal epistemic-model umbrella issue adcontextprotocol#4616.

Two files added, no existing schema modified:

static/schemas/source/core/system-reference.json
The canonical {system, id, version?, name?} shape for any value defined against an external identity / taxonomy / geographic / measurement system. `system` is intentionally an open string at the primitive level; per-use constraints (closed enums, vendor allowlists) belong in the consuming schema's oneOf or enum, not here. `version` is optional but recommended whenever the system has a versioned definition history. `name` is informational only.

static/schemas/source/enums/system-reference-fidelity.json
exact | approximated | unsupported. Generalizes the market_fidelity enum proposed in adcontextprotocol#4475 to all reference-system axes (markets, ID graphs, taxonomies, measurement currencies). Used as a property on deployments[] entries or analogous per-destination structures.

Plus changeset entry following adcontextprotocol#2506 / adcontextprotocol#4271 precedent.

Non-normative on its own: neither primitive is referenced by any existing schema in this PR. Adoption happens row-by-row in the follow-up RFCs (adcontextprotocol#4472, adcontextprotocol#4475, identity-substrate) against whatever shape the WG settles on.

If the WG counter-proposes, this PR is one file + one enum + this changeset: close and re-draft. No sunk cost.

Closes (when decided): the D1 thread of adcontextprotocol#4616.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

EvgenyAndroid · 2026-05-16T21:28:34Z

I have read the IPR Policy

lukasz-pubx · 2026-05-16T22:38:51Z

+  "title": "System Reference Fidelity",
+  "description": "How faithfully a deployment can honor a value defined against an external reference system (see /schemas/core/system-reference.json). Surfaces the cross-system fidelity question at the deployment layer — buyer agents read this BEFORE activating a signal so they can make an informed decision about geographic, taxonomic, or identity drift introduced by the destination's mapping. Generalizes the `market_fidelity` mechanism proposed in #4475 to all reference-system axes (markets, ID graphs, taxonomies, measurement currencies). Used as a property on `deployments[]` entries or analogous per-destination structures; not used on the signal definition itself (the signal's reference is canonical; fidelity is a function of where it's activated).",
+  "type": "string",
+  "enum": [


Should there be a converted (or translated) option, e.g. if Nielsen has been (or can be) converted exactly into Comscore, or UID2 has been translated to ID5, similar to temp from C to F? I'm a bit torn, but I don't think approximated covers this, especially if you can guarantee it or carry a high degree of confidence.

Pushed converted + a new system-reference-conversion.json schema in 5a0ebfe. Factored your two questions ("should there be a converted option?" and "from what to what / how was it done?") as separate concerns:

Fidelity enum answers "how preserved?": exact | converted | approximated | unsupported. converted = different system, deterministic & lossless (your C→F analogy).

system-reference-conversion.json answers "via what?": {from, to, method, method_details?} with method enum.

The conversion structure is REQUIRED when fidelity is converted, OPTIONAL when approximated (so buyers can see WHY a deployment is approximated, not just that it is). Splitting the two means buyers who only need the simple signal stop at the enum; buyers who need to reason about the federation chain read the conversion block.

One question on the method enum — strawman picks four values:

id_graph — translation via identity-graph link (UID2 → ID5 via LiveRamp / TTD / similar)

name_match — shared external identifier present in both systems (e.g. hashed email)

crosswalk — published 1:1 mapping with no information loss (Nielsen DMA → Comscore Market via the canonical county crosswalk; IAB v2 → v3 via published migration map)

custom — vendor-specific mapping

Does that cover the ID-graph cases you have in mind? Anything missing that you'd want represented — e.g. inferred (ML-predicted match), aggregation (geographic upscaling), sampled (panel projection)? Easier to lock the method-enum scope now than after row-level RFCs adopt the primitive.

Thanks @EvgenyAndroid, AdCP never sleeps ;) To answer your follow up questions:

The upscaled for geographic aggregation is a good one for sure, inferred and projected (for sampled data) as well.

For the ID-graph case I would consider adding one more optional field, method_provider?, giving {from, to, method, method_provider?, method_details?}, so that the vendor can be specified.

All four accepted, pushed in ac4c618. Method enum extends to 7 values, ordered by semantic grouping:

| Group | Methods |
| --- | --- |
| Identity-domain | id_graph, name_match |
| Structural-mapping | crosswalk, upscaled (new) |
| Probabilistic | inferred (new), projected (new) |
| Escape hatch | custom |

Semantics for the three new values (in the schema's enumDescriptions):

upscaled — deterministic aggregation of finer-grained units into a coarser system (zip → DMA, county → state, postcode → MSA). Forward-lossless; inverse undefined. Important nuance I added in the description: for modeled audiences whose statistical validity depends on the finer granularity, deployment fidelity SHOULD be approximated even when the method is upscaled. The method describes the transform; fidelity describes preservation for the use case.

inferred — ML-predicted or rule-based inference (device graph → IAB taxonomy node; behavior → demographic membership). Probabilistic, no per-record correctness guarantee. method_provider SHOULD identify the model or vendor.

projected — statistical projection from a sample to a population (panel-projected reach, currency-projected impression counts, audience extrapolation). Used primarily for measurement currencies and modeled audiences. method_details SHOULD describe sample size, weighting methodology, and confidence intervals where relevant.

method_provider? added as an optional opaque-by-convention string. Consumers MAY branch on well-known names (LiveRamp, ID5, IAB, Nielsen, Comscore, TTD) without it being a closed enum — new providers enter additively, same pattern as the top-level system axis. Most meaningful for id_graph / inferred where the vendor IS the meaningful identifier; less load-bearing for crosswalk where the published mapping is canonical regardless of who hosts it.

Updated examples now cover all 7 method values with realistic method_provider values — see commit ac4c6183 for the diff.

Anything else missing? With these additions the strawman is starting to feel covered for the ID-graph + measurement-currency cases; happy to keep iterating if you can think of method types the seven values still don't capture cleanly.

lukasz-pubx · 2026-05-16T22:40:39Z

+        "omb_msa"
+      ]
+    },
+    "id": {


I have a feeling that using the word id is confusing, should this be simply value?

Agreed — pushed the rename in 5a0ebfe. Same reasoning at three levels:

AdCP's existing *_id fields all point at AdCP-issued entities (account_id, brand_id, media_buy_id, signal_agent_segment_id) with AdCP lifecycle. Reusing id for external-system values invites buyer-agent code to conflate the two — same word, materially different semantics.

For taxonomies (IAB Audience Taxonomy paths like 4-1-2-3) and measurement currencies (methodology labels), value reads more naturally than id. IAB docs themselves say "values" / "terms."

value is cross-axis-neutral; id is identity-axis-coded and prejudices the primitive's read for non-identity uses.

The (system, value) tuple now replaces (system, id) throughout the schema description, the version description, the name description, the required array, and all examples.

@lukasz-pubx

Responding to @lukasz-pubx's review on this PR and his D1 vote in adcontextprotocol#4616:

1. id→value rename: "I have a feeling that using the word `id` is confusing" (PR comment r3253676569 + adcontextprotocol#4616 vote). Agreed. The primitive is intentionally cross-axis (markets, taxonomies, ID graphs, measurement currencies); `id` is identity-axis-coded and creates a connotation collision with existing AdCP `*_id` fields that point at AdCP-issued entities. `value` is system-axis-neutral and matches how taxonomy authors (IAB) refer to taxonomy terms.

2. Add `converted` to the fidelity enum (PR comment r3253674753 + adcontextprotocol#4616 vote). Covers the case where the destination uses a different system but the conversion is deterministic and lossless: Nielsen DMA → Comscore Market via canonical crosswalk, UID2 → ID5 via direct identity-graph link. Materially different from `approximated` (which implies drift) and from `exact` (which requires same system). Lukasz's framing: "similar to temp from C to F."

3. New `system-reference-conversion.json` (response to "I'm not sure if in this case it should be indicated somehow from what to what was the conversion and/or how was it done i.e. ID Graph / by name / custom" in adcontextprotocol#4616). Factors the question correctly: the fidelity enum says "how preserved?"; the conversion structure says "via what?". Buyers who want the simple signal stop at the enum; buyers who want to reason about the federation chain read the conversion block. Method enum: id_graph | name_match | crosswalk | custom. REQUIRED when fidelity is `converted`; OPTIONAL surfacing when `approximated` (so buyers can see WHY).

All three changes preserve the strawman's "shape only, no row-level adoption" framing. Still a draft / discussion artifact, still cheap to re-draft if the WG counter-proposes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


…lukasz-pubx

Round 2 of @lukasz-pubx's feedback on adcontextprotocol#4616 / adcontextprotocol#4622: three new method enum values + an optional method_provider field on the conversion structure.

Method enum extends from 4 → 7 values, ordered by semantic grouping:

identity-domain: id_graph, name_match
structural-mapping: crosswalk, upscaled (new)
probabilistic: inferred (new), projected (new)
escape hatch: custom

New value semantics:

upscaled: Deterministic aggregation of finer-grained units into a coarser system (zip→DMA, county→state). Forward-lossless; inverse undefined. For modeled audiences where statistical validity depends on the finer granularity, deployment fidelity SHOULD be `approximated` even when the method is `upscaled`.

inferred: ML-predicted or rule-based inference (device-graph → IAB taxonomy node; behavior → demographic membership). Probabilistic; no per-record correctness guarantee. method_provider SHOULD identify the model or vendor.

projected: Statistical projection from a sample to a population (panel-projected reach, currency-projected impressions, audience extrapolation). Used primarily for measurement currencies and modeled audiences. method_details SHOULD describe sample size + weighting + confidence intervals where relevant.

method_provider (new optional field): Opaque-by-convention string identifying the vendor or organization providing the conversion method (e.g. `LiveRamp`, `ID5`, `IAB`, `Nielsen`). Consumers MAY branch on well-known provider names but the field is not constrained to a closed enum; new providers enter the ecosystem additively. Particularly meaningful for `id_graph` and `inferred` methods; less load-bearing for `crosswalk` where the published mapping is canonical regardless of who hosts it.

Updated examples cover all 7 method values with realistic method_provider values.

Strawman remains shape-only: no row-level adoption, still draft, still cheap to re-draft.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lukasz-pubx

LGTM 👍

@bokelley

…umbrella update

Round 3 of @bokelley's third-sibling framing on adcontextprotocol#4616 (issuecomment-4470049814): the epistemic-model table maps 1:1 onto delivery measurement, not just signals. The strawman's shape was already cross-axis, but the descriptions buried measurement as a secondary axis. Promoting it to first-class.

Changes are description + examples only; no structural change to the primitive, the fidelity enum, or the conversion schema.

system-reference.json:
- Description rewritten to explicitly enumerate the four primary axes (identity / taxonomy / geographic / measurement). Calls out measurement currencies (Nielsen P18-49) AND methodology sources (panel / set_top_box / ACR / census / server_logs / SDK per the construction-methodology row in adcontextprotocol#4616) as distinct measurement-side use cases.
- `system` examples list extended with `nielsen_p_18_49` and `measurement_source` to surface the measurement axis.
- Two new top-level examples: a measurement currency (`nielsen_p_18_49:P18-49:2025`) and a measurement-source row (`measurement_source:set_top_box`); the latter is the canonical adcontextprotocol#2041 use case.

system-reference-fidelity.json:
- Description expanded to name two consumer types: buyer agents activating signals, and measurement consumers accepting delivery reports. Adds "methodology sources" to the generalized axis list.

system-reference-conversion.json:
- Description expanded to name the two surfaces (signal deployments + delivery measurement). References adcontextprotocol#3877's `completion_source` qualifier as shipped prior art for the D3 seller-attested vs vendor-attested split.
- New example: STB measurement projected to Nielsen P18-49 currency via Samba TV methodology, the measurement-side analog of the existing signal-side panel projection example.

Changeset updated to reflect the broadened scope.

Strawman remains shape-only: no row-level adoption, no structural change to the primitive. adcontextprotocol#2041 / adcontextprotocol#4472 / adcontextprotocol#4475 / identity-substrate RFCs still adopt independently against whatever shape D1 settles on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@SimonaNemes

@SimonaNemes endorsed round 2 of the strawman on adcontextprotocol#4616 (issuecomment-4493606229) and offered a sharpening of the inferred / projected distinction. Both methods can use ML; the distinction is the LEVEL at which uncertainty operates:

- inferred = ENTITY-LEVEL attribution from observed signals. Answers: "given these clues, who/what is this entity?" Uncertainty lies in the correctness of attributes assigned to individual entities.
- projected = POPULATION-LEVEL estimation from a sample. Answers: "given this sample, what should we expect at scale?" Uncertainty lies in the estimate itself, NOT in any individual entity's attributes.

Same underlying data may drive both; the distinction is the question being answered and where the uncertainty lives.

Changes:
- enumDescriptions for `inferred` + `projected` rewritten with the entity-level vs population-level framing and explicit cross-reference to each other so consumers see the contrast.
- New paired example: same device_graph input data routed through `inferred` (entity-level attribution with per-record confidence 0.78) and through `projected` (population-level estimate with 95% CI ±0.4pp). Makes the distinction concrete.
- Changeset updated to reflect the round-4 sharpening + cite @SimonaNemes's framing.

Strawman remains shape-only: no row-level adoption, no structural change. Sharpening the prose so the distinction survives consumer interpretation across implementer teams.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
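The paired example described in this commit might look roughly like the following. It is only a sketch: the `from` / `to` / `method` / `method_details` keys follow the conversion structure named in this PR, but the `device_graph` value, both target references, the `iab_audience_taxonomy` system string, and the `UNCERTAINTY_LEVEL` lookup are hypothetical. Carrying the confidence figures inside free-text `method_details` is likewise an assumption, not the schema's actual example.

```python
# Same (hypothetical) device_graph input routed through both methods.
source = {"system": "device_graph", "value": "household-cohort-a"}

inferred_conversion = {
    "from": source,
    "to": {"system": "iab_audience_taxonomy", "value": "auto-intenders"},
    "method": "inferred",
    # Uncertainty attaches to each individual record's attributes.
    "method_details": "Entity-level attribution; per-record confidence 0.78.",
}

projected_conversion = {
    "from": source,
    "to": {"system": "nielsen_p_18_49", "value": "P18-49", "version": "2025"},
    "method": "projected",
    # Uncertainty attaches to the population estimate, not to any record.
    "method_details": "Population-level estimate; 95% CI ±0.4pp.",
}

# Illustrative lookup: where the uncertainty lives for each method.
UNCERTAINTY_LEVEL = {"inferred": "entity", "projected": "population"}


def uncertainty_level(conversion: dict) -> str:
    """Return the level at which a conversion's uncertainty operates."""
    return UNCERTAINTY_LEVEL.get(conversion["method"], "not_applicable")
```

Both records share the same `from` reference; only the question being answered, and therefore the method, differs.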

@bokelley

@bokelley posted WG-level acceptance of D1 on adcontextprotocol#4616 (issuecomment-4526559566): "Yes on D1 as a shared primitive for newly-modeled epistemic rows", with four substantive tightening notes before he's comfortable merging. All four addressed in this revision; the changes are normative tightening, not structural.

1. VERSION SEMANTICS REVERSED (system-reference.json)
   Strawman previously said "treat omitted version as a wildcard match against any version." Bokelley: this recreates the fuzzy matching problem D1 was supposed to avoid. Reversed. Omitted `version` now means UNKNOWN / unpinned, NOT a wildcard. Exact equality requires (system, value, version) to all match for versioned systems. Row-level schemas MAY declare a system version-insensitive (UID2 / RampID etc.); otherwise omitted version is a buyer-decision point.

2. CONVERTED FIDELITY TIGHTENED (system-reference-fidelity.json)
   Reserved for deterministic AND row-semantics-preserving mappings. Deterministic ≠ lossless preservation: a crosswalk may be mathematically deterministic but lose semantic information the row's downstream consumers depend on (DMA vs Comscore Market methodologies; IAB v2 nodes that split or merge in v3). Sellers MUST NOT advertise `converted` when the conversion changes the row's meaningful semantics.

3. UPSCALED + CROSSWALK CAUTIONS (system-reference-conversion.json)
   `upscaled` typically pairs with `approximated` fidelity: an undefined inverse means lost granularity. Only `converted` if the row explicitly declares granularity doesn't matter. Same caution for `crosswalk`: deterministic mapping is not automatically lossless semantic preservation.

4. INTEROP CAVEAT (system-reference.json description)
   Explicit note: the primitive alone does NOT create interop. Consuming row-level schemas MUST constrain or document which `system` values are meaningful for that row. D1's value is consistent SHAPE across rows, not a universal vocabulary.

Plus updated the description to scope D1 explicitly to newly-modeled rows (per bokelley's "do not replace per-dimension schemas that already work" guardrail; geo_metros / geo_postal_areas stay where they are) and added @tescoboy's product-level audience-construction-metadata row as the fifth D1 customer per bokelley's endorsement.

Strawman remains shape-only: no row-level adoption. After this round bokelley said he's "comfortable using it as the D1 foundation for adcontextprotocol#2041, the identity-substrate RFC, and the adcontextprotocol#4472 split."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
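The reversed version semantics in note 1 can be sketched as a three-way comparator. This is a minimal illustration, not the schema's normative algorithm: the `SystemReference` dataclass mirrors the `{system, value, version?, name?}` shape, while `compare`, the `"exact"` / `"unknown"` / `"mismatch"` result labels, and the `VERSION_INSENSITIVE` set are hypothetical names chosen for the example. Treating `name` as display-only is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SystemReference:
    system: str
    value: str
    version: Optional[str] = None
    name: Optional[str] = None  # assumed display-only; never compared


# Hypothetical row-level declaration of version-insensitive systems.
VERSION_INSENSITIVE = frozenset({"uid2", "ramp_id"})


def compare(a: SystemReference, b: SystemReference,
            version_insensitive: frozenset = VERSION_INSENSITIVE) -> str:
    """Return "exact", "unknown", or "mismatch" (labels are illustrative)."""
    if (a.system, a.value) != (b.system, b.value):
        return "mismatch"
    if a.system in version_insensitive:
        # The row declared this system version-insensitive.
        return "exact"
    if a.version is None or b.version is None:
        # Omitted version means unknown / unpinned, NOT a wildcard.
        return "unknown"
    return "exact" if a.version == b.version else "mismatch"


pinned = SystemReference("nielsen_p_18_49", "P18-49", "2025")
unpinned = SystemReference("nielsen_p_18_49", "P18-49")
```

Under these assumptions, comparing `pinned` with `unpinned` yields `"unknown"` rather than a match, which is the buyer-decision point the note describes.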

@bokelley

Round 6 picks up the ads.txt-pattern thread on adcontextprotocol#4616 between @bokelley, @lukasz-pubx, @SimonaNemes, and Addie.

Brian's philosophical question: "Should the protocol carry the weight of fully describing signals and methodology OR should this be something that the signal provider handles separately (thinking about how a media kit carries weight on the inventory side)?"

Addie's recommendation in the thread:
- Geo identifiers + activation fidelity: KEEP structured (bilaterally verifiable; ads.txt pattern).
- Methodology fields: COLLAPSE to a pointer (link out to attested document). One field like methodology_url or audience_model_ref.

The strawman primitives (system-reference + fidelity enum + conversion structure) already sit in the "keep structured" half of Addie's recommendation: bilaterally verifiable identifiers and activation-time facts. The conversion structure adds the round-6 anchor field for the "link out" half:

method_doc_url (optional URI): points at the seller's published methodology document (vendor's identity-graph page, published crosswalk specification, IAB v2-to-v3 migration map, etc.). Strictly informational; buyer agents MAY follow out-of-band to verify but MUST NOT branch on its content programmatically.

Why at the primitive layer:
- Gives downstream row-level RFCs (adcontextprotocol#4472, adcontextprotocol#2041, identity-substrate, tescoboy product-level) a canonical place to anchor link-out fields. Without it, each row reinvents the field name.
- Picks up Addie's "collapse to a pointer" recommendation as an OPTION at the primitive layer; row-level RFCs that want to require methodology disclosure mark this required in their binding.
- Complements existing method_details (free-text inline): doc_url is the canonical-source pointer; method_details is the triage-time elaboration.

Plus a description note that consuming rows MAY add their own row-level last_updated field on the row itself (per @SimonaNemes: signal-record freshness IS verifiable even though underlying methodology freshness isn't).

Updated one example (LiveRamp UID2→ID5 conversion) to demonstrate the field in context.

Strawman remains shape-only; no row-level adoption. Net delta: +1 optional field, +1 description sentence, +1 example field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
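As a rough illustration of the "strictly informational" rule for `method_doc_url`, a buyer agent could drop the field before any programmatic decision and keep it only for out-of-band review. The field names come from the conversion structure described in this PR; the `to` value, the example.com URL, the `method_details` text, and the `decision_view` helper are placeholders invented for this sketch.

```python
# Hypothetical conversion record modeled on the LiveRamp UID2→ID5 example.
conversion = {
    "from": {"system": "uid2", "value": "AAAA..."},
    "to": {"system": "id5", "value": "ID5-PLACEHOLDER"},
    "method": "id_graph",
    "method_provider": "LiveRamp",
    "method_details": "Illustrative free-text elaboration.",
    "method_doc_url": "https://example.com/methodology",  # placeholder URI
}

# Fields an agent may surface to humans but must not branch on.
INFORMATIONAL_ONLY = frozenset({"method_doc_url"})


def decision_view(record: dict) -> dict:
    """Return a copy of the record without informational-only fields."""
    return {k: v for k, v in record.items() if k not in INFORMATIONAL_ONLY}


view = decision_view(conversion)
```

Routing every programmatic check through `decision_view` is one way to make the MUST NOT structural rather than a matter of reviewer discipline.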

bokelley · 2026-05-25T01:23:13Z

+      "method_details": "Canonical county-union mapping per Nielsen DMA 2024 spec + Comscore Market 2025-Q1 definition."
+    },
+    {
+      "from": { "system": "uid2", "value": "AAAA..." },


this conversion is not per-signal, it's happening upon signal activation. The problem is that there is not necessarily a signal conversion happening - in the programmatic context, LiveRamp has a graph, the publisher may have a graph, the SSP has a graph, the DSP has a graph. Crap, the agency has a graph. We don't have any idea how many times this conversion is happening, and honestly neither do the participants.

Main point is that we need to separate "what is this signal" from "how is this signal being connected to the delivery surface".

I'm not sure I understand why it matters that the SSP, DSP, and agency each have a graph. Isn't this information given from the signals provider's perspective? We can't control all the hops.

You're right that real chains have multiple parties (LiveRamp graph + publisher graph + SSP graph + DSP graph + agency graph), and no single party sees them all. The strawman as written reads like it's trying to model the full chain — which it can't, and shouldn't pretend to.

Round 7 (ad712795) tightens the description to single-party observable scope explicitly: this structure describes ONE party's observable conversion claim — typically the signals seller's in-agent translation, or the measurement vendor's projection. Downstream conversions are out of scope; the protocol intentionally doesn't pretend the seller can speak to what happens after activation handoff.

Lukasz's framing in this thread is the right one — "we can't control all the hops" is exactly the design constraint. The structure is useful for the conversions one party CAN describe (signal-side identity translation before destination handoff, measurement vendor's panel-to-currency projection). It's not useful for, and shouldn't try to model, the multi-hop chain that follows.

If even single-party conversion descriptions don't earn their keep, then this entire schema collapses to "trust the seller's free-text" and the structure is dead weight. I think the single-party case is genuinely useful (especially for measurement projection and signal-provider in-agent translation), but if you read the multi-hop reality as making even that case suspect, happy to drop the entire conversion structure in round 8 and keep just system-reference + system-reference-fidelity.

bokelley · 2026-05-25T01:24:52Z

+  "description": "Reference to a value within a named external system. Four primary axes of use: (1) **identity** — UID2, ID5, RampID, hashed-email, MAID; (2) **taxonomy** — IAB Audience Taxonomy, IAB Content Taxonomy, vendor-specific category trees; (3) **geographic** — Nielsen DMA, Comscore Market, OMB MSA, postcode aggregates (NOTE: where per-dimension geographic schemas already exist in AdCP — `geo_metros`, `geo_postal_areas` — they are the right shape for current geo targeting; D1 does not replace them. D1 applies to geographic surfaces that lack a per-dimension schema today); (4) **measurement** — measurement currencies (Nielsen P18-49), methodology sources (panel / set_top_box / ACR / census / server_logs / SDK per the construction-methodology row in #4616), vendor measurement schemes. Provides a single canonical shape `{system, value, version?, name?}` reused across signals, deployments, buy-terms, delivery reporting, and measurement attribution wherever a value is defined against an external reference frame. The `system` axis is intentionally an open string at the primitive level; per-use constraints (closed enums, vendor allowlists) belong in the consuming schema's `oneOf` or `enum`, not here. **Important: the primitive shape alone does NOT create interoperability — consuming row-level schemas MUST constrain or document which `system` values are meaningful for that row.** D1's value is consistent SHAPE across rows, not a universal cross-row vocabulary. Decision D1 from the signal-and-measurement epistemic-model umbrella (see issue #4616); per @bokelley's WG-acceptance comment, D1 is scoped to newly-modeled rows (identity substrate, taxonomy / ranking references, measurement source / methodology, product-level seller-built audience metadata) and does NOT replace per-dimension schemas that already work.",
+  "type": "object",
+  "properties": {
+    "system": {


calling these "systems" is confusing. Ramp ID is clearly an identifier and is defined elsewhere in the protocol. Taxonomies like are too. Not clear to me how this generalized system helps us. We do have enums with the various geo metros providers which makes sense: we know that geo metro must be a supported type there.

We could rename system to type? But I think system illustrates the fact that it's "more complicated" than just a label.

Fair pushback. The primitive's value isn't in wrapping RampID — it's in giving union-axis rows a single shape when the row can carry any of N systems.

Round 7 (ad712795) leads the description with that:

The primitive's purpose is union-axis rows (identity substrate, measurement source, PBA taxonomy) where a single field can carry any of N systems with the same comparator semantics. For single-axis rows where only one system applies, inline per-dimension fields remain simpler — the primitive is overhead. The strawman explicitly does NOT replace per-dimension schemas like geo_metros, geo_postal_areas, or existing ramp_id.

Where the primitive earns its keep (and where it doesn't):

Case	Without primitive	With primitive
Single-axis row: signal always references RampID → `ramp_id` field, full stop	inline `ramp_id: "..."` (simple) ✅	`identity: {system: "ramp_id", value: "..."}` (wrapper overhead) ❌
Union-axis row: identity substrate can be UID2 OR RampID OR ID5 OR custom	`oneOf: [{ uid2: ... }, { ramp_id: ... }, { id5: ... }, { custom: ... }]` (N inline shapes per row)	`identity: SystemReference` (ONE shape) ✅
Union-axis row: measurement source can be set_top_box OR ACR OR census OR panel	Same N-way `oneOf` repeated in #2041	Same SystemReference ✅
Union-axis row: PBA ranking reference can be IAB v3 OR vendor-X OR custom audience model	Same N-way `oneOf` repeated in #4472	Same SystemReference ✅
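To make the union-axis rows in the table concrete, here is a rough sketch of how a consuming row might bind the single shape while still constraining which `system` values it accepts. `ALLOWED_IDENTITY_SYSTEMS` and `validate_identity` are hypothetical names, and the allowlist contents simply echo the UID2 / RampID / ID5 / custom case from the table; a real row-level schema would define its own set.

```python
# Hypothetical row-level allowlist for an identity-substrate row.
ALLOWED_IDENTITY_SYSTEMS = frozenset({"uid2", "ramp_id", "id5", "custom"})


def validate_identity(ref: dict,
                      allowed: frozenset = ALLOWED_IDENTITY_SYSTEMS) -> list:
    """Return a list of problems; an empty list means the reference is valid."""
    errors = []
    # One shape for every system: both keys must be non-empty strings.
    for key in ("system", "value"):
        if not isinstance(ref.get(key), str) or not ref[key]:
            errors.append(f"missing or empty '{key}'")
    # The row, not the primitive, decides which systems are meaningful.
    if not errors and ref["system"] not in allowed:
        errors.append(f"system '{ref['system']}' not allowed for this row")
    return errors
```

The same function serves every allowed system, whereas the `oneOf` alternative needs one branch per inline shape.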

The motivating cases that opened #4616 (#4472, #2041, identity-substrate, @tescoboy's product-level) are all union-axis. If the union-axis case turns out to be thinner than the thread assumed — i.e., if each row really wants a single dominant system and the union case is hypothetical — then the primitive's surface area isn't justified and the right answer is per-row inline shapes plus the row defining its own enum. In that case I'd close this PR and let row-RFCs proceed independently.

But I think the union case is real for at least identity-substrate (no consensus on which graph wins) and measurement-source (the construction-methodology row is genuinely heterogeneous). Curious whether you read those rows as union-axis or as eventually-single-axis when the WG picks winners.

On @lukasz-pubx's system → type rename suggestion: round 7 acknowledges it in a naming note but keeps system — type is overloaded across AdCP for general type discrimination on discriminated unions, and system carries the connotation we want (named external reference frame with its own lifecycle).

Agreed system carries the "more complicated than a label" connotation. Keeping it — round 7 (ad712795) adds a naming note in the description acknowledging the type alternative was considered but type is overloaded across AdCP for general type discrimination on discriminated unions, while system cleanly says "named external reference frame with its own lifecycle." Thanks for the framing.

@bokelley

…views (round 7)

@bokelley posted two substantive line-level reviews on 2026-05-25 that went deeper than the round-5 tightening notes; they raised abstraction-level questions about whether the primitive justifies its surface area.

1. system-reference.json:8: "calling these 'systems' is confusing. RampID is clearly an identifier and is defined elsewhere in the protocol. Taxonomies like are too. Not clear to me how this generalized system helps us."

2. system-reference-conversion.json:78: "this conversion is not per-signal, it's happening upon signal activation. The problem is that there is not necessarily a signal conversion happening - in the programmatic context, LiveRamp has a graph, the publisher may have a graph, the SSP has a graph, the DSP has a graph. Crap, the agency has a graph. We don't have any idea how many times this conversion is happening, and honestly neither do they."

Description-only sharpening (no schema shape changes):

system-reference.json: Description rewritten to lead with the UNION-AXIS value proposition. The primitive's purpose is union-axis rows (identity substrate, measurement source, PBA taxonomy) where a single field can carry any of N systems with the same comparator semantics. Per @bokelley's "RampID is defined elsewhere" point: yes, and that's fine. D1 does NOT replace per-dimension schemas where they exist (geo_metros, ramp_id, existing taxonomy schemas). For single-axis rows, inline per-dimension fields are simpler. D1 earns its keep specifically on union-axis rows that don't yet have a shape. Without it, each union-axis row independently reinvents oneOf discriminators across N inline shapes; with it, ONE comparator + ONE extension story + ONE schema slot. Also addresses @lukasz-pubx's system → type rename suggestion in a naming note, keeping `system` because the connotation of "named external reference frame with its own lifecycle" is what we want vs. `type` which is overloaded across AdCP for general type discrimination on discriminated unions.

system-reference-conversion.json: Description rewritten to clarify SINGLE-PARTY OBSERVABLE SCOPE. Real programmatic chains have multiple parties (publisher / SSP / DSP / agency / vendor) each potentially performing their own conversions; no single party observes the full chain. The structure describes ONE party's observable conversion (signals seller's in-agent translation in the deployment case; measurement vendor's projection in the reporting case), NOT the multi-hop chain. Downstream conversions are out of scope, observed by other parties; the protocol intentionally doesn't pretend the seller can speak to them. Buyer agents reading this should understand they're seeing one party's view, not the full provenance from origin to activation.

Strawman remains shape-only: no row-level adoption, no field additions / removals / renames. The schema is unchanged; descriptions catch up to what the structure actually means.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

EvgenyAndroid mentioned this pull request May 16, 2026

RFC: Signal epistemic model — framing the questions behind in-flight signal RFCs #4616

Open

lukasz-pubx reviewed May 16, 2026

evgen and others added 2 commits May 16, 2026 19:06

lukasz-pubx approved these changes May 16, 2026

evgen and others added 4 commits May 17, 2026 07:08

bokelley reviewed May 25, 2026

Group	Methods
Identity-domain	`id_graph`, `name_match`
Structural-mapping	`crosswalk`, `upscaled` (new)
Probabilistic	`inferred` (new), `projected` (new)
Escape hatch	`custom`
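The grouping in this table can be written down as a small lookup, which makes it easy to check that all seven method values are covered. This is an illustrative sketch only: the `ConversionMethod` enum and `METHOD_GROUP` mapping are hypothetical names, and the group labels are just snake_case versions of the table's row headings.

```python
from enum import Enum


class ConversionMethod(str, Enum):
    """The seven method values listed for the conversion structure."""
    ID_GRAPH = "id_graph"
    NAME_MATCH = "name_match"
    CROSSWALK = "crosswalk"
    UPSCALED = "upscaled"
    INFERRED = "inferred"
    PROJECTED = "projected"
    CUSTOM = "custom"


# Group labels mirror the table rows; the mapping itself is illustrative.
METHOD_GROUP = {
    ConversionMethod.ID_GRAPH: "identity_domain",
    ConversionMethod.NAME_MATCH: "identity_domain",
    ConversionMethod.CROSSWALK: "structural_mapping",
    ConversionMethod.UPSCALED: "structural_mapping",
    ConversionMethod.INFERRED: "probabilistic",
    ConversionMethod.PROJECTED: "probabilistic",
    ConversionMethod.CUSTOM: "escape_hatch",
}


def group_of(method: str) -> str:
    """Look up the group for a raw method string such as "projected"."""
    return METHOD_GROUP[ConversionMethod(method)]
```

Passing a string outside the enum raises `ValueError`, so unknown methods fail loudly instead of silently falling into a default group.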

EvgenyAndroid commented May 16, 2026 • edited

What's in the PR

1. static/schemas/source/core/system-reference.json

2. static/schemas/source/enums/system-reference-fidelity.json

3. static/schemas/source/core/system-reference-conversion.json

What's NOT in this PR (deliberately)

Open questions for WG review

Test plan

Cross-references

3 participants
