RFC: Signal epistemic model — framing the questions behind in-flight signal RFCs

## Summary

AdCP's signal schema today models the **publishing** and **authorization** of a signal — who published it, who's allowed to resell it, what's discoverable about it. It does not model the **epistemic content** of the signal: what's being counted, how it was built, against what reference system, in whose vocabulary, with what verifiability.

Two in-flight RFCs (#4472, #4475) and a follow-up question from @lszczesiak on #4475 are each picking off one piece of this gap. This issue exists to frame the full picture so each row-shaped RFC can be evaluated in context, and so the WG can make a small number of shared decisions once instead of re-litigating them per-RFC.

This is not a competing RFC. It does not propose schemas. It exists to organize.

## What we already have

The publishing half of the framework is solid:

- [`signal-source`](static/schemas/source/enums/signal-source.json) — `catalog` (verifiable via the provider's adagents.json) vs `agent` (trust-based)
- [`signal-catalog-type`](static/schemas/source/enums/signal-catalog-type.json) — `marketplace` (resold third-party) / `custom` (agent-built composite) / `owned` (first-party)
- [`signal-definition`](static/schemas/source/core/signal-definition.json) — published via adagents.json; carries `value_type`, `restricted_attributes`, `policy_categories`, `allowed_values`, `range`
- `adagents.json` `authorized_agents` — the chain-of-authority primitive

A buyer agent can today verify *who is authorized to sell what*. That part is done and good.

## What's not modeled

| Layer | Question it answers | Status |
|---|---|---|
| Publisher identity | Who stands behind this signal? | ✅ `adagents.json` |
| Authorization chain | Who can sell / activate it? | ✅ `authorized_agents` |
| Signal definition | What does it claim to be? | ✅ structurally; ❌ shared vocabulary |
| **Identity substrate** | What unit is being counted — household, person, device, browser, hashed email, ID-graph node? | ❌ Not modeled |
| **Construction methodology** | Deterministic match, panel projection, modeled, lookalike off what seed, at what freshness, with what cell confidence? | ❌ Not modeled (half of #4472) |
| **Reference frame** | Against whose licensed system are boundaries defined — Nielsen DMA, IAB taxonomy v3, RampID, currency, panel? | ❌ Not modeled (#4475) |
| **Cross-system fidelity** | What survives the trip from discovery to activation when the destination uses a different reference frame? | ❌ Not modeled (deployment-side half of #4475) |
| **Verifiability of content** | Can a third party validate claimed counts, freshness, or identity unit? | ❌ Only authorization is verifiable today |

## How the in-flight RFCs slot in

- **#4472** — proposes `audience_model` covering model vintage, seed data range, ranking reference, cell confidence, market, station, daypart, demographic. Mixes the **construction methodology** row (general — applies to any modeled audience) with **broadcast-specific taxonomy** (station/daypart). Worth splitting.
- **#4475** — proposes structured market identifier `{ system, id, name }` on the signal item and `market_fidelity` (`exact` / `approximated` / `unsupported`) on `deployments[]`. Picks the **reference frame** row and the **cross-system fidelity** row, scoped to geography.
- **@lszczesiak's comment on #4475** — observes that the same fidelity mechanic applies to ID-graph translation. That's the **identity substrate** row.

Each is correct on its own row. The umbrella exists so they get factored consistently rather than producing three incompatible per-dimension types.
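
As a rough illustration of how a buyer agent might consume the fields #4475 proposes, the sketch below filters deployments by `market_fidelity`. It is not a schema proposal. The class names, the `platform` field, the `usable_deployments` helper, and every sample value are hypothetical; only `system`, `id`, `name`, `deployments[]`, `market_fidelity`, and its three values come from #4475 as summarized above.

```python
from dataclasses import dataclass
from typing import List, Literal

# The three fidelity values named in #4475.
Fidelity = Literal["exact", "approximated", "unsupported"]


@dataclass(frozen=True)
class Market:
    """Structured market identifier: { system, id, name }."""
    system: str
    id: str
    name: str


@dataclass(frozen=True)
class Deployment:
    """One entry of deployments[]; `platform` is a hypothetical field."""
    platform: str
    market_fidelity: Fidelity


@dataclass(frozen=True)
class SignalItem:
    """Hypothetical container pairing a market with its deployments."""
    market: Market
    deployments: List[Deployment]


def usable_deployments(
    signal: SignalItem, allow_approximated: bool = False
) -> List[Deployment]:
    """Keep deployments whose market fidelity is acceptable to the buyer."""
    accepted = {"exact", "approximated"} if allow_approximated else {"exact"}
    return [d for d in signal.deployments if d.market_fidelity in accepted]


# Illustrative placeholder values only.
signal = SignalItem(
    market=Market(system="example_market_system", id="001", name="Example Market"),
    deployments=[
        Deployment(platform="platform_a", market_fidelity="exact"),
        Deployment(platform="platform_b", market_fidelity="approximated"),
        Deployment(platform="platform_c", market_fidelity="unsupported"),
    ],
)

strict = [d.platform for d in usable_deployments(signal)]
lenient = [d.platform for d in usable_deployments(signal, allow_approximated=True)]
print(strict, lenient)
```

Whether `approximated` is acceptable is a buyer-side policy choice in this sketch; the RFCs do not prescribe one.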

## Three decisions the WG needs to make once

Every row-level RFC will keep re-asking these until they're settled:

### 1. Reference-system shape

Do we adopt a single primitive — roughly `{ system, id, version? }` — reused across markets, ID graphs, taxonomies, measurement currencies, panel frames? Or do we ship per-dimension types (`market`, `id_graph`, `taxonomy_version`, ...) that happen to look similar?

A shared primitive forces a consistent enum-extension story across signals, deployments, buy terms, and delivery reporting. Per-dimension types are easier to ship one at a time but invite drift.
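
To make the trade-off concrete, here is a minimal sketch of the shared-primitive option, assuming the rough `{ system, id, version? }` shape described above. The `ReferenceSystem` class, the `same_frame` helper, and all sample values are hypothetical and exist only to show one type and one comparison rule serving several dimensions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceSystem:
    """Shared primitive, roughly { system, id, version? }."""
    system: str
    id: str
    version: Optional[str] = None


def same_frame(a: ReferenceSystem, b: ReferenceSystem) -> bool:
    """True when both references are defined against the same system and version."""
    return a.system == b.system and a.version == b.version


# Hypothetical placeholder values; one type covers every dimension.
market = ReferenceSystem(system="example_market_system", id="001")
other_market = ReferenceSystem(system="example_market_system", id="002")
id_graph = ReferenceSystem(system="example_id_graph", id="node-1")
taxonomy_new = ReferenceSystem(system="example_taxonomy", id="123", version="3")
taxonomy_old = ReferenceSystem(system="example_taxonomy", id="123", version="2")

print(same_frame(market, other_market))        # same system, different ids
print(same_frame(market, id_graph))            # different systems
print(same_frame(taxonomy_new, taxonomy_old))  # same system, different versions
```

With per-dimension types, each of `market`, `id_graph`, and `taxonomy_version` would need its own equivalent of `same_frame`, which is where the drift risk comes from.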

### 2. Vocabulary policy

When a signal claims "Female 25-54," whose vocabulary is that?

- **Adopt** a canonical taxonomy (e.g., IAB Audience Taxonomy) and require providers to map to it
- **Carry** the vocabulary by reference (`{ taxonomy: { system, version }, value }`) — the same shape as the reference-frame decision above
- **Accept fragmentation** and rely on description text (the status quo)

Without a decision here, structured fields just relocate the fragmentation from `description` to typed strings.
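
The "carry" option can be sketched as follows, assuming the `{ taxonomy: { system, version }, value }` shape from the list above. The class names, the `group_by_vocabulary` helper, and the provider taxonomies are hypothetical. The point is only that identical value strings become distinguishable once the vocabulary travels with them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Taxonomy:
    system: str
    version: str


@dataclass(frozen=True)
class VocabularyClaim:
    """The "carry" shape: { taxonomy: { system, version }, value }."""
    taxonomy: Taxonomy
    value: str


def group_by_vocabulary(
    claims: List[VocabularyClaim],
) -> Dict[Tuple[str, str], List[str]]:
    """Group claimed values by the (system, version) they are expressed in."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for claim in claims:
        groups[(claim.taxonomy.system, claim.taxonomy.version)].append(claim.value)
    return dict(groups)


# Hypothetical providers using the same label under different vocabularies.
claims = [
    VocabularyClaim(Taxonomy("provider_a_segments", "2024"), "Female 25-54"),
    VocabularyClaim(Taxonomy("provider_b_segments", "1"), "Female 25-54"),
    VocabularyClaim(Taxonomy("provider_a_segments", "2024"), "Male 18-34"),
]

groups = group_by_vocabulary(claims)
for vocabulary, values in groups.items():
    print(vocabulary, values)
```

Here "Female 25-54" lands in two separate groups, so a buyer agent has no reason to assume the two providers mean the same audience. Under the status quo both would be indistinguishable description text.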

### 3. Verifiability model

Authorization is verifiable today (third-party fetch of adagents.json). Content claims — count, freshness, identity unit, model vintage — are self-reported and unfalsifiable. Is the long-term direction:

- **Declarative-only** (what every RFC in flight currently assumes)
- **Attestation-capable** — a primitive for signed claims from independent measurers, layered on top of the declarative fields

The row-level RFCs don't need to solve this, but the umbrella should declare which direction we're heading so we don't paint ourselves into a corner.

## Non-goals

- Not a redesign of `adagents.json`, `signal-definition`, or `signal-source` / `signal-catalog-type`. Those are the foundation; this fits underneath them.
- Not a proposal for new schemas — that's what the row-shaped RFCs (#4472, #4475, future ones) are for.
- Not a proposal to block the in-flight RFCs. They should proceed. The umbrella is context for evaluating them, not a gating dependency.

## Suggested sequencing

1. **Now** — WG agreement on the reference-system shape (Decision 1). This unblocks #4475 and any future RFC on identity substrate / taxonomy version.
2. **Now-ish** — split #4472 along the construction-methodology / broadcast-taxonomy seam. Land the general half against the construction-methodology row.
3. **Next** — identity-substrate RFC (@lszczesiak's prompt), using whatever shape Decision 1 produces.
4. **Later** — vocabulary policy (Decision 2) once we have ≥2 row-level RFCs in flight that hit the same vocabulary surface.
5. **Research** — verifiability / attestation model (Decision 3). Not for this cycle.

## Related and adjacent

- The same `{ system, id, version }` shape is already showing up in adjacent work — measurement currency in buy terms negotiation, ID-graph references in identity-match flows. Worth deciding the shape once across all of them.
- Cross-links: #4472, #4475
