RFC 026: Semi-structured variant logical values #52

@dannymeijer

Use this issue to track RFC 026 for semi-structured variant logical values.

If accepted, we typically ask for a PR adding an RFC under docs/rfcs/: copy TEMPLATE.md, name it NNN_short_slug.md, and follow that structure. For workflow and conventions, see Writing InQL RFCs.

InQL RFCs cover the relational layer, dataset carriers, Substrait contract, query {} syntax, execution context, and related docs.
Pure Incan language/compiler changes without InQL impact usually belong in the Incan repository instead.

Area

  • Specification (RFCs)
  • Package & tests
  • Documentation

Summary

RFC 026 defines a first-class semi-structured variant logical value model for InQL. Variant values are distinct from ordinary str or bytes payloads and give predicates such as typeof, is_array, is_object, is_integer, is_timestamp, and is_null_value a precise semantic home.

Motivation

RFC 022 added string-backed JSON, CSV, URL, and hash format helpers. During review, the variant-style predicate names were identified as important but unsafe to expose as JSON-text parser shortcuts. Without a variant logical value model, accepting those names would either bake in string-specific behavior or let backend adapters define incompatible semantics. This RFC records the required future work instead of leaving a vague "future scope" note in the format-function reference.

Proposal sketch

Add RFC 026 under docs/rfcs/026_semi_structured_variant_values.md. The RFC proposes:

  • a semi-structured variant logical value family with at least null, boolean, integer, floating point, string, timestamp, array, and object kinds;
  • predicate and inspection helpers that operate on variant expressions, not raw strings;
  • explicit separation between SQL null and semi-structured variant null;
  • explicit variant parse/cast helpers for starting from JSON text;
  • Prism/Substrait/backend rules that preserve variant type identity or reject unsupported operations.
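
To make the kind list and the SQL-null versus variant-null split concrete, here is a minimal plain-Python model. It is not InQL or Incan code and not the RFC's design: the `Variant` and `VariantKind` names, the kind strings, and the rule that a SQL null input (modeled as `None`) yields a SQL null result are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Any, Optional


class VariantKind(Enum):
    # Kind names are illustrative; the issue only lists the kinds informally.
    NULL = "null"
    BOOLEAN = "boolean"
    INTEGER = "integer"
    FLOAT = "floating point"
    STRING = "string"
    TIMESTAMP = "timestamp"
    ARRAY = "array"
    OBJECT = "object"


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant value: a kind tag plus a payload."""
    kind: VariantKind
    payload: Any = None


def to_variant(value: Any) -> Variant:
    """Wrap a plain Python value in a Variant."""
    if value is None:
        return Variant(VariantKind.NULL)
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return Variant(VariantKind.BOOLEAN, value)
    if isinstance(value, int):
        return Variant(VariantKind.INTEGER, value)
    if isinstance(value, float):
        return Variant(VariantKind.FLOAT, value)
    if isinstance(value, str):
        return Variant(VariantKind.STRING, value)
    if isinstance(value, datetime):
        return Variant(VariantKind.TIMESTAMP, value)
    if isinstance(value, list):
        return Variant(VariantKind.ARRAY, tuple(to_variant(v) for v in value))
    if isinstance(value, dict):
        return Variant(VariantKind.OBJECT, {k: to_variant(v) for k, v in value.items()})
    raise TypeError(f"no variant kind for {type(value).__name__}")


def typeof(v: Optional[Variant]) -> Optional[str]:
    """Kind name of a variant; None stands in for SQL null (assumed to propagate)."""
    return None if v is None else v.kind.value


def is_null_value(v: Optional[Variant]) -> Optional[bool]:
    """True only for a variant null; SQL null in gives SQL null out (assumption)."""
    return None if v is None else v.kind is VariantKind.NULL
```

In this model a string that happens to contain JSON text stays a string: `typeof(to_variant("[1, 2]"))` is `"string"`, while `typeof(to_variant([1, 2]))` is `"array"`. Likewise `is_null_value(to_variant(None))` is `True`, but `is_null_value(None)` is `None`, which keeps the two notions of null apart.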

Example shape from the draft RFC:

from pub::inql.functions import col, is_array, is_null_value, parse_variant_json, typeof, variant_get

events_with_payload = events.with_column("payload_value", parse_variant_json(col("payload")))

projected = (
    events_with_payload
        .with_column("payload_kind", typeof(col("payload_value")))
        .with_column("is_items_array", is_array(variant_get(col("payload_value"), "$.items")))
        .with_column("deleted_was_json_null", is_null_value(variant_get(col("payload_value"), "$.deleted_at")))
)
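
As a rough trace of what the example above is meant to compute, here is a plain-Python stand-in for the same helpers applied to a single row. It only reuses the helper names from the draft example; the bodies, the `VARIANT_NULL` sentinel, the dotted-member-only path support, and the choice that a missing path yields SQL null (`None`) are assumptions, not behavior confirmed by the RFC.

```python
import json
from typing import Any, Optional


class _VariantNull:
    """Sentinel for a JSON null inside a variant, distinct from None (SQL null)."""

    def __repr__(self) -> str:
        return "VARIANT_NULL"


VARIANT_NULL = _VariantNull()


def _wrap(value: Any) -> Any:
    """Replace JSON nulls with the sentinel, recursing into arrays and objects."""
    if value is None:
        return VARIANT_NULL
    if isinstance(value, list):
        return [_wrap(v) for v in value]
    if isinstance(value, dict):
        return {k: _wrap(v) for k, v in value.items()}
    return value


def parse_variant_json(text: Optional[str]) -> Any:
    """Explicit parse step: JSON text in, variant-like value out."""
    if text is None:  # SQL null input stays SQL null (assumption)
        return None
    return _wrap(json.loads(text))


def variant_get(value: Any, path: str) -> Any:
    """Follow a path such as "$.a.b"; only dotted member access is modeled."""
    if value is None:
        return None
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = value
    for key in (part for part in path[1:].split(".") if part):
        if not isinstance(current, dict) or key not in current:
            return None  # missing path: modeled as SQL null (assumption)
        current = current[key]
    return current


def is_array(value: Any) -> Optional[bool]:
    return None if value is None else isinstance(value, list)


def is_null_value(value: Any) -> Optional[bool]:
    return None if value is None else value is VARIANT_NULL


payload_value = parse_variant_json('{"items": [1, 2], "deleted_at": null}')
is_items_array = is_array(variant_get(payload_value, "$.items"))
deleted_was_json_null = is_null_value(variant_get(payload_value, "$.deleted_at"))
```

For this row, `is_items_array` and `deleted_was_json_null` are both `True`. Asking for a key that is absent, such as `"$.missing"`, gives `None` from `is_null_value` in this model, so "the key held a JSON null" and "there was no value at all" remain distinguishable.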

Alternatives considered

  • Make typeof and is_array parse strings directly. Rejected because it couples predicate semantics to JSON text parsing and conflicts with typed variant values.
  • Change RFC 022 JSON helpers to return variants. Rejected because it would silently change the meaning of string-backed payload helpers.
  • Expose backend-native variant functions directly. Rejected because backend-native null rules, path syntax, and type names differ.
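
The first rejected alternative can be illustrated with a small plain-Python sketch of a `typeof` that parses strings directly. The function name `typeof_by_parsing` and the fallback to `"string"` for unparseable text are hypothetical; the point is only to show how the answer ends up depending on JSON text parsing rather than on the column's type.

```python
import json


def typeof_by_parsing(text: str) -> str:
    """The rejected shortcut: guess a kind by parsing a str value as JSON."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return "string"  # unparseable text falls back to "string" (assumption)
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "floating point"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"
```

Under this shortcut the plain text `hello` and the JSON text `"hello"` both report `"string"`, a str value `123` reports `"integer"`, and a str value `null` reports `"null"`, even though every input is an ordinary string. A typed variant value avoids that ambiguity because the kind is carried by the value rather than inferred from its text.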

Impact / compatibility

This RFC is additive. Existing RFC 022 helpers remain string-backed. Authors who want variant semantics would opt into variant-returning helpers or explicit casts. Backend adapters must preserve SQL-null versus variant-null behavior and variant logical type identity, or reject unsupported operations.
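
The "preserve or reject" rule for backend adapters could look roughly like the following plain-Python capability check. Everything here is hypothetical (`BackendCapabilities`, `UnsupportedVariantOperation`, `check_variant_ops`, and the operation names as strings); it sketches the idea of failing early rather than describing how Prism or any adapter is implemented.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


class UnsupportedVariantOperation(Exception):
    """Raised when a backend cannot run a variant operation with the required semantics."""


@dataclass(frozen=True)
class BackendCapabilities:
    """Hypothetical declaration of which variant operations a backend supports."""
    name: str
    variant_ops: FrozenSet[str] = frozenset()


def check_variant_ops(ops: Iterable[str], backend: BackendCapabilities) -> None:
    """Reject a plan up front instead of silently changing variant semantics."""
    missing = sorted(set(ops) - backend.variant_ops)
    if missing:
        raise UnsupportedVariantOperation(
            f"backend {backend.name!r} does not support: {', '.join(missing)}"
        )
```

A backend declared with `frozenset({"typeof", "is_array"})` would accept a plan that uses only `typeof`, while a plan that also uses `is_null_value` would raise `UnsupportedVariantOperation` before execution.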

Implementation notes (optional)

The draft RFC is linked from PR #49. Implementation should happen separately from RFC 022 because it needs type metadata, registry metadata, Prism validation, Substrait extension type handling, backend capability checks, and focused tests for null and type-predicate behavior.

Checklist

  • I checked for an existing RFC or issue covering this.
  • I can describe how this impacts existing code and how to migrate (if needed).

Labels

  • RFC: RFC design and planning
  • documentation: Improvements or additions to documentation
  • package: Library source, tests, incan.toml
  • specification: docs/rfcs/ normative RFCs