RFC: Signal epistemic model — framing the questions behind in-flight signal RFCs #4616

@bokelley

Summary

AdCP's signal schema today models the publishing and authorization of a signal — who published it, who's allowed to resell it, what's discoverable about it. It does not model the epistemic content of the signal: what's being counted, how it was built, against what reference system, in whose vocabulary, with what verifiability.

Two in-flight RFCs (#4472, #4475) and a follow-up question from @lszczesiak on #4475 are each picking off one piece of this gap. This issue exists to frame the full picture so each row-shaped RFC can be evaluated in context, and so the WG can make a small number of shared decisions once instead of re-litigating them per-RFC.

This is not a competing RFC. It does not propose schemas. It exists to organize.

What we already have

The publishing half of the framework is solid:

  • signal-source: catalog (verifiable via the provider's adagents.json) vs agent (trust-based)
  • signal-catalog-type: marketplace (resold third-party) / custom (agent-built composite) / owned (first-party)
  • signal-definition — published via adagents.json; carries value_type, restricted_attributes, policy_categories, allowed_values, range
  • adagents.json authorized_agents — the chain-of-authority primitive

A buyer agent can today verify who is authorized to sell what. That part is done and good.

What's not modeled

| Layer | Question it answers | Status |
| --- | --- | --- |
| Publisher identity | Who stands behind this signal? | adagents.json |
| Authorization chain | Who can sell / activate it? | authorized_agents |
| Signal definition | What does it claim to be? | ✅ structurally; ❌ shared vocabulary |
| Identity substrate | What unit is being counted: household, person, device, browser, hashed email, ID-graph node? | ❌ Not modeled |
| Construction methodology | Deterministic match, panel projection, modeled, lookalike off what seed, at what freshness, with what cell confidence? | ❌ Not modeled (half of #4472) |
| Reference frame | Against whose licensed system are boundaries defined: Nielsen DMA, IAB taxonomy v3, RampID, currency, panel? | ❌ Not modeled (#4475) |
| Cross-system fidelity | What survives the trip from discovery to activation when the destination uses a different reference frame? | ❌ Not modeled (deployment-side half of #4475) |
| Verifiability of content | Can a third party validate claimed counts, freshness, or identity unit? | ❌ Only authorization is verifiable today |

How the in-flight RFCs slot in

Each is correct on its own row. The umbrella exists so they get factored consistently rather than producing three incompatible per-dimension types.

Three decisions the WG needs to make once

Every row-level RFC will keep re-asking these until they're settled:

1. Reference-system shape

Do we adopt a single primitive — roughly `{ system, id, version? }` — reused across markets, ID graphs, taxonomies, measurement currencies, panel frames? Or do we ship per-dimension types (`market`, `id_graph`, `taxonomy_version`, ...) that happen to look similar?

A shared primitive forces a consistent enum-extension story across signals, deployments, buy terms, and delivery reporting. Per-dimension types are easier to ship one at a time but invite drift.
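
To make the trade-off concrete, here is a minimal TypeScript sketch of the single-primitive option. The type name `ReferenceRef`, the helper `sameReference`, and the example `system` strings are hypothetical illustrations, not proposed schema; only the `{ system, id, version? }` shape comes from the question above.

```typescript
// Hypothetical shared primitive following the `{ system, id, version? }` shape.
// The type name and the example values below are assumptions for illustration.
interface ReferenceRef {
  system: string;   // which reference system defines the identifier
  id: string;       // identifier within that system
  version?: string; // optional: not every system is versioned
}

// Two references point at the same thing only when system, id, and version
// all agree. A missing version matches only another missing version.
function sameReference(a: ReferenceRef, b: ReferenceRef): boolean {
  return a.system === b.system && a.id === b.id && a.version === b.version;
}

// The same shape reused for a market and for a taxonomy node (placeholder ids).
const marketRef: ReferenceRef = { system: "example_market_system", id: "m-001" };
const taxonomyRef: ReferenceRef = {
  system: "example_taxonomy",
  id: "t-042",
  version: "1.0",
};
```

With one shape, a single comparison routine and a single enum-extension rule for `system` would cover every dimension. The per-dimension alternative would need one such routine per type.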

2. Vocabulary policy

When a signal claims "Female 25-54," whose vocabulary is that?

  • Adopt a canonical taxonomy (e.g., IAB Audience Taxonomy) and require providers to map to it
  • Carry the vocabulary by reference (`{ taxonomy: { system, version }, value }`) — the same shape as the reference-frame decision above
  • Accept fragmentation and rely on description text (the status quo)

Without a decision here, structured fields just relocate the fragmentation from `description` to typed strings.
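
As a rough sketch of the carry-by-reference option, the TypeScript below treats two claims as the same segment only when they name the same taxonomy and version. The names `VocabClaim`, `TaxonomyRef`, and `sameSegment`, and the provider taxonomy strings, are hypothetical; only the `{ taxonomy: { system, version }, value }` shape is taken from the option above.

```typescript
// Hypothetical types mirroring `{ taxonomy: { system, version }, value }`.
interface TaxonomyRef {
  system: string;
  version: string;
}

interface VocabClaim {
  taxonomy: TaxonomyRef;
  value: string;
}

// Values are comparable only within the same taxonomy system and version.
function sameVocabulary(a: VocabClaim, b: VocabClaim): boolean {
  return (
    a.taxonomy.system === b.taxonomy.system &&
    a.taxonomy.version === b.taxonomy.version
  );
}

// Identical strings under different vocabularies are NOT treated as equal.
function sameSegment(a: VocabClaim, b: VocabClaim): boolean {
  return sameVocabulary(a, b) && a.value === b.value;
}

// Placeholder taxonomy names; both providers use the same label text.
const providerA: VocabClaim = {
  taxonomy: { system: "provider_a_taxonomy", version: "2" },
  value: "Female 25-54",
};
const providerB: VocabClaim = {
  taxonomy: { system: "provider_b_taxonomy", version: "7" },
  value: "Female 25-54",
};
```

Here `sameSegment(providerA, providerB)` is false even though the label text matches. That is the distinction a bare typed string cannot express.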

3. Verifiability model

Authorization is verifiable today (third-party fetch of adagents.json). Content claims — count, freshness, identity unit, model vintage — are self-reported and unfalsifiable. Is the long-term direction:

  • Declarative-only (what every RFC in flight currently assumes)
  • Attestation-capable — a primitive for signed claims from independent measurers, layered on top of the declarative fields

The row-level RFCs don't need to solve this, but the umbrella should declare which direction we're heading so we don't paint ourselves into a corner.
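
One way to picture "attestation-capable, layered on top of the declarative fields" is an optional list attached to an otherwise self-reported claim. The sketch below is purely illustrative: `ContentClaim`, `Attestation`, and `claimStatus` are hypothetical names, and signature verification is deliberately left out.

```typescript
// Hypothetical shapes; none of these names come from the AdCP schema.
interface Attestation {
  attester: string;  // the independent measurer making the signed claim
  signature: string; // opaque here; verifying it is out of scope for this sketch
}

interface ContentClaim {
  field: "count" | "freshness" | "identity_unit" | "model_vintage";
  value: string | number;
  attestations?: Attestation[]; // always absent in a declarative-only world
}

type ClaimStatus = "self_reported" | "attested";

// Reports whether any attestation is attached. It does NOT check that the
// attestation is valid; it only separates the two directions structurally.
function claimStatus(claim: ContentClaim): ClaimStatus {
  return claim.attestations !== undefined && claim.attestations.length > 0
    ? "attested"
    : "self_reported";
}
```

Because the attestation list is optional, a declarative-only claim stays valid under this shape, so row-level RFCs could ship declarative fields now without foreclosing the second direction.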

Non-goals

Suggested sequencing

  1. Now — WG agreement on the reference-system shape (Decision 1). This unblocks RFC #4475, "Structured geographic market identifiers for signals with market-bounded audiences", and any future RFC on identity substrate / taxonomy version.
  2. Now-ish — split RFC #4472, "Structured metadata fields for Projected Broadcast Audiences (PBAs) signals", along the construction-methodology / broadcast-taxonomy seam. Land the general half against the construction-methodology row.
  3. Next — identity-substrate RFC (@lszczesiak's prompt), using whatever shape Decision 1 produces.
  4. Later — vocabulary policy (Decision 2) once we have ≥2 row-level RFCs in flight that hit the same vocabulary surface.
  5. Research — verifiability / attestation model (Decision 3). Not for this cycle.

Related and adjacent
