feat(executor): physical planning bridge to EPIC-03 backend + local query executor (STORY-04.6.2) #414

Merged: khaines merged 6 commits into main from khaines/feat-04.6.2-physical-planning on Jul 3, 2026

Conversation

@khaines khaines commented Jul 3, 2026

Owner

Summary

STORY-04.6.2 (#174) — the physical-planning bridge that maps Core's LogicalPlan → Engine's EPIC-03 executable operators, making a DeltaSharp query run end-to-end for the first time.

  • PhysicalPlanner + PhysicalPlan model (src/DeltaSharp.Executor/Physical/): optimized/analyzed LogicalPlan → tree of physical operators, one strategy per supported M1 node.
  • Execution via Engine's IExecutionBackend (interpreted vectorized default per ADR-0001), pulling ColumnBatches.
  • ColumnBatchRow materialization (RowMaterializer): null-aware, DataType-mapped per ADR-0002.
  • LocalQueryExecutor : IQueryExecutor registered into SparkSession via a [ModuleInitializer], so a session in any app referencing DeltaSharp.Executor executes for real.
  • In-memory relation/scan fixture for end-to-end tests (the public read-door createDataFrame is STORY-04.1.2 "Read door and DataFrame creation from local inputs", #158, deferred).
  • Design doc: docs/engineering/design/physical-planning.md (doc-first).
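
The [ModuleInitializer] registration mentioned above can be illustrated with a minimal sketch. SessionHooks and the string-returning factory are hypothetical stand-ins for the SparkSession hook and LocalQueryExecutor, not the PR's actual types; here both sides live in one module, so the runtime runs the initializer before the first statement executes.

```csharp
using System;
using System.Runtime.CompilerServices;

// The module initializer below has already run by the time this line executes.
Console.WriteLine(SessionHooks.ExecutorFactory());

// Hypothetical stand-in for the session-side registration hook.
static class SessionHooks
{
    // Default used when no executor assembly has registered itself.
    public static Func<string> ExecutorFactory = () => "unregistered";
}

// Hypothetical stand-in for the executor-side registration class.
static class ExecutorRegistration
{
    // Runs once when the containing module is initialized; no explicit call is needed.
    [ModuleInitializer]
    internal static void Register() =>
        SessionHooks.ExecutorFactory = () => "LocalQueryExecutor";
}
```

The design choice this illustrates is that the session never names the executor type: the executor assembly pushes its factory into the hook as a side effect of being loaded.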

LogicalPlan node → EPIC-03 operator mapping

| LogicalPlan node | Physical node | Engine operator |
| --- | --- | --- |
| relation / scan | ScanPlan | InMemoryScanOperator |
| Project | ProjectPlan | ProjectOperator |
| Filter | FilterPlan | FilterOperator |
| Aggregate | AggregatePlan | AggregateOperator |
| Join (equi) | JoinPlan | JoinOperator |
| Sort | SortPlan | SortOperator |
| Limit | LimitPlan | (bridge) batch truncation via SelectionVector |
| Distinct | lowered → ProjectPlan(AggregatePlan(group-by-all, COUNT(*))) | AggregateOperator |
| Union | UnionPlan | (bridge) batch concat (+ identity reschema ProjectOperator) |
| unsupported / theta / cross join | UnsupportedPlanException (deterministic) | |
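
The Distinct lowering in the table (group by every column, compute a COUNT(*) probe, then project the probe away) can be checked for equivalence with a small LINQ sketch. The tuple rows are hypothetical; the real bridge operates on ColumnBatches through the Engine operators, not on in-memory tuples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical two-column rows with duplicates.
var rows = new List<(string Dept, int Level)>
{
    ("eng", 1), ("eng", 1), ("ops", 2), ("eng", 2), ("ops", 2),
};

// Step 1 (the AggregatePlan role): group by all columns and compute COUNT(*) as a probe.
var aggregated = rows
    .GroupBy(r => r)
    .Select(g => (Key: g.Key, Count: g.LongCount()));

// Step 2 (the ProjectPlan role): keep the grouping columns, drop the probe.
var lowered = aggregated.Select(a => a.Key).ToList();

// The lowered form yields the same set of rows as a direct distinct.
var direct = rows.Distinct().ToList();
Console.WriteLine(lowered.OrderBy(r => r).SequenceEqual(direct.OrderBy(r => r))); // True
```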

AC → test map

  • AC: supported node → executable operator: PhysicalPlanShapeTests (shape per node) + EndToEndExecutionTests (FilterThenProject, GroupByAgg, InnerJoin, OrderByDescThenLimit, Limit, Distinct, Union, compose), asserting exact Row values + schema over the real backend.
  • AC: unsupported node → deterministic diagnostic: UnsupportedPlanTests (cross join, no-condition cartesian, theta predicate; determinism check).
  • Count parity: EndToEndExecutionTests.Count_MatchesCollectCount_*.
  • Backend parity (ADR-0001): InterpretedAndDefaultBackends_ProduceIdenticalRows.
  • Registration seam: SessionRegistrationTests (session resolves LocalQueryExecutor; executes end-to-end through it).

Validation

  • dotnet build -c Release -warnaserror — clean (0/0).
  • dotnet test: all green (Executor 29, Engine 2781, Core 573; ×net8.0/net10.0). No regressions.
  • dotnet format --verify-no-changes — clean.
  • Executor is net10.0, non-packable (no PublicAPI baseline there).

⚠️ #172/#173 rebase reconciliation points (seam is stubbed here; reconcile on merged origin/main)

Closes #174

…uery executor (STORY-04.6.2)

Rebased onto merged main (#172 optimizer, #173/#177 actions+Row+seam). This lane
now consumes the REAL merged Core seam instead of its earlier stand-ins:

- LocalQueryExecutor implements DeltaSharp.Execution.IQueryExecutor
  (Collect(LogicalPlan)->IReadOnlyList<Row>, Count(LogicalPlan)->long).
- RowMaterializer builds the real DeltaSharp.Row from ColumnBatch results using
  the analyzed plan's output StructType, null-aware and DataType-mapped.
- ExecutorRegistration wires the real SparkSession.RegisterQueryExecutorFactory
  hook via InternalsVisibleTo("DeltaSharp.Executor").

The #414 Core-seam stand-ins (Row, IQueryExecutor, UnregisteredQueryExecutor,
SparkSession hook edits, InternalsVisibleTo, PublicAPI.Unshipped Row entries)
were dropped; those types are owned by merged #173. This PR adds only the
Executor-side physical-planning bridge and its tests — no Core changes.

The optimizer is intentionally NOT wired into execution (respects #415);
LocalQueryExecutor.Collect receives the analyzed plan and does physical planning
only.

Closes #174

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>
khaines and others added 3 commits July 3, 2026 07:15
…dup-name diagnostic, coverage + docs (STORY-04.6.2 council)

MEDIUM: dup-name -> UnsupportedPlanException (#419); decimal via new decimal (scale-preserving) + reject scale > 28 / > 96-bit; Date/Timestamp -> DateOnly/DateTime (UTC) round-trip; batch-ownership invariant (#420); LogicalOutput self-check strengthened (name/type, #421); type-matrix + selection + all-null/empty + unsupported-expr tests; nits (dead AnsiMode, TryGetBatches nullable out, no-op discard, ConcurrentDictionary); doc: Row -> #177 provenance + xrefs + backend-parity reframe

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>
…ccurate backend-parity framing + timestamp guard (STORY-04.6.2 council r2)

MEDIUM: PlanDistinct now derives a collision-proof internal probe name (UniqueProbeName, reserved '__distinct_count' with a numeric suffix on the improbable child-schema collision) instead of a hardcoded 'count', so df.GroupBy(x).Count().Distinct() dedups and returns rows (Spark parity) rather than throwing SchemaValidationException on the intermediate [x, count, count] schema.

MEDIUM: reframed the backend-parity check (physical-planning.md §10, LocalQueryExecutor.OptionsFor, EndToEndExecutionTests) as a real interpreted-vs-compiled EXPRESSION-evaluation differential: both selections share InterpretedOperators dispatch, Default resolves to CompiledBackend (ADR-0001 codegen tier, STORY-03.4.2) which fuses scalar expressions via Expression.Compile under dynamic code (identical under AOT). Dropped every closed-#148 citation; operator-level codegen referenced as out of scope (ADR-0001 Follow-ups / EPIC-13, #309/#310).

Hardening: ReadTimestamp now guards the epoch-micros -> DateTime conversion (checked *10 + range check) and throws a deterministic UnsupportedPlanException instead of a raw ArgumentOutOfRangeException / silent mis-decode, mirroring the decimal path. Added a multi-batch accumulation test enforcing the PhysicalRuntime.Run batch-ownership invariant (2+ source batches -> all rows in global order after drain+dispose). Tightened PhysicalRuntime wording (fresh, independently-owned output; ExecutionContext owns an inert lazy spill store today, #420).
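
The "checked *10 + range check" guard described above could look roughly like the following sketch. It is not the PR's code: NotSupportedException stands in for UnsupportedPlanException so the example stays self-contained, and the message wording is invented.

```csharp
using System;

// Converts epoch microseconds to a UTC DateTime, failing deterministically
// when the value cannot be represented.
static DateTime ReadTimestamp(long epochMicros)
{
    long ticksSinceEpoch;
    try
    {
        // 1 microsecond = 10 ticks (100 ns each); checked turns overflow into an exception.
        ticksSinceEpoch = checked(epochMicros * 10);
    }
    catch (OverflowException)
    {
        throw new NotSupportedException($"Timestamp {epochMicros} us overflows the tick range.");
    }

    // Range check against what DateTime can hold, expressed relative to the Unix epoch.
    long minOffset = DateTime.MinValue.Ticks - DateTime.UnixEpoch.Ticks;
    long maxOffset = DateTime.MaxValue.Ticks - DateTime.UnixEpoch.Ticks;
    if (ticksSinceEpoch < minOffset || ticksSinceEpoch > maxOffset)
    {
        throw new NotSupportedException($"Timestamp {epochMicros} us is outside the DateTime range.");
    }

    // DateTime.UnixEpoch has Kind == Utc, and AddTicks preserves the kind.
    return DateTime.UnixEpoch.AddTicks(ticksSinceEpoch);
}

Console.WriteLine(ReadTimestamp(1_000_000).ToString("O")); // 1970-01-01T00:00:01.0000000Z
```

Checking the range before calling AddTicks means the caller never sees a raw ArgumentOutOfRangeException, which is the property the commit message describes.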

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>
…p out-of-range guard

Round-3 council fix for PR #414 (STORY-04.6.2 / #174).

Writer and QueryExec re-score seats both independently held at 4/5 on the
round-2 backend-parity reframe, which overstated the executor end-to-end
check (InterpretedAndDefaultBackends_ProduceIdenticalRows) as a "genuine
interpreted-vs-compiled expression differential." That is false against the
code: both CompiledBackend.Open and InterpretedVectorizedBackend.Open delegate
to the same InterpretedOperators.Open, which always builds interpreted
ExpressionEvaluators (backend name only attributes exceptions). CompiledBackend's
Expression.Compile scalar fusion (STORY-03.4.2) is not wired into the operator
Open() path, so both selections currently run byte-identical interpreted code —
a plumbing/smoke cross-check, not a differential. The genuine expression-level
differential lives in the Engine BackendParityOracle (#154), which calls
BuildExpressionEvaluator directly. Reverts the three framing sites to the
accurate wording (design doc §10, LocalQueryExecutor.OptionsFor comment,
EndToEndExecutionTests comment), keeping the corrected forward trackers
(EPIC-13 / #309/#310, no stale #148).

Also documents the already-implemented TimestampType out-of-range guard
(RowMaterializer.ReadTimestamp -> deterministic UnsupportedPlanException, backed
by Timestamp_OutOfDateTimeRange_ThrowsDeterministicUnsupported) in design doc
§7/§8, mirroring the decimal not-representable diagnostic (Writer Finding 2).

Docs/comments only — no code behavior change. Build clean both TFMs (-warnaserror),
format clean, Executor 47/47, PublicAPI delta empty.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>
khaines and others added 2 commits July 3, 2026 08:45
…och-day (raw-exception leak)

Red-team MISS-FOUND on PR #414 (STORY-04.6.2 / #174).

RowMaterializer.ReadDate had the same raw-exception-leak class the council fixed
for Timestamp and Decimal, but the guard was never applied to Date: an epoch-day
whose date falls outside DateOnly's representable range (e.g. int.MaxValue,
~5.9M years past 1970) made UnixEpochDate.AddDays(epochDay) throw a raw
System.ArgumentOutOfRangeException straight to the caller, breaking the bridge's
contract that an unrepresentable materialized value must surface a deterministic
UnsupportedPlanException (as Timestamp and Decimal already do).

Fix mirrors the ReadTimestamp guard exactly: catch the ArgumentOutOfRangeException
from DateOnly.AddDays and rethrow OutOfRangeDate -> deterministic
UnsupportedPlanException naming the offending epoch-day and DateOnly. Materializer
sweep confirms this was the only remaining unguarded path: Boolean/Byte/numeric
are exact primitive reads, Binary is a byte copy, String uses
Encoding.UTF8.GetString (replacement fallback, never throws on bad bytes), and
Decimal/Timestamp were already guarded.
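
The claim that the String path never throws on bad bytes rests on Encoding.UTF8 using a replacement fallback. A small sketch contrasting it with a strict encoder, using the same malformed bytes the red-team sweep mentions:

```csharp
using System;
using System.Text;

byte[] malformed = { 0xFF, 0xFE, 0xFD };

// Encoding.UTF8 substitutes U+FFFD for invalid sequences instead of throwing.
string lenient = Encoding.UTF8.GetString(malformed);
Console.WriteLine(lenient.Contains('\uFFFD')); // True

// A strict encoder (throwOnInvalidBytes: true) raises on the same input.
var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
bool strictThrew = false;
try
{
    strict.GetString(malformed);
}
catch (ArgumentException)
{
    // DecoderFallbackException derives from ArgumentException.
    strictThrew = true;
}
Console.WriteLine(strictThrew); // True
```

The no-throw property therefore depends on which encoder instance is used; swapping in a strict UTF8Encoding would reintroduce an unguarded exception path.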

Adds Date_OutOfDateOnlyRange_ThrowsDeterministicUnsupported (int.MaxValue) and
Date_MinIntEpochDay_ThrowsDeterministicUnsupported (int.MinValue) proving the
guard, and documents the Date out-of-range diagnostic in design doc §7/§8
alongside the Timestamp/Decimal ones.

Executor-only, no Core/Engine change. Build clean both TFMs (-warnaserror),
format clean, Executor 49/49 (was 47), PublicAPI delta empty.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>
…-over-eager)

Codifies the positive-boundary property the council + red-team verified for the
ReadDate out-of-range guard (Quality re-score nit): the extreme IN-RANGE
epoch-days materialize to the correct calendar date, while one day past either
bound is a deterministic UnsupportedPlanException.

DateOnly's representable window (0001-01-01..9999-12-31) is exactly epoch-days
[-719162, 2932896] (days since 1970-01-01); the test asserts both extremes
round-trip to DateOnly.Min/MaxValue and that maxEpochDay+1 / minEpochDay-1 each
throw. Guards against a future refactor making the guard over-eager (rejecting
valid dates) — the exact concern several council seats verified analytically
(QueryExec: full-int-range proof; Columnar: exhaustive 2^32 sweep, 0 silent
in-range returns).
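
The [-719162, 2932896] window quoted above follows directly from DateOnly's day numbers, as this short sketch shows. It is independent of the PR's test code and uses only the public DateOnly API.

```csharp
using System;

var unixEpochDate = new DateOnly(1970, 1, 1);

// Express DateOnly's representable window as days since 1970-01-01.
int minEpochDay = DateOnly.MinValue.DayNumber - unixEpochDate.DayNumber;
int maxEpochDay = DateOnly.MaxValue.DayNumber - unixEpochDate.DayNumber;
Console.WriteLine($"[{minEpochDay}, {maxEpochDay}]"); // [-719162, 2932896]

// Both extremes round-trip to the type's own bounds.
Console.WriteLine(unixEpochDate.AddDays(minEpochDay) == DateOnly.MinValue); // True
Console.WriteLine(unixEpochDate.AddDays(maxEpochDay) == DateOnly.MaxValue); // True

// One day past either bound is rejected by DateOnly itself.
foreach (int epochDay in new[] { minEpochDay - 1, maxEpochDay + 1 })
{
    try
    {
        unixEpochDate.AddDays(epochDay);
        Console.WriteLine($"{epochDay}: accepted");
    }
    catch (ArgumentOutOfRangeException)
    {
        Console.WriteLine($"{epochDay}: out of range");
    }
}
```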

Test-only, no production change (production behavior certified at 0b84922:
unanimous 7x5/5 council + red-team NO-MISS-CERTIFIED). Build clean both TFMs
(-warnaserror), Executor 50/50 (was 49), format clean, PublicAPI delta empty.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Ken Haines <1144092+khaines@users.noreply.github.com>

khaines commented Jul 3, 2026

Owner Author

🟢 Review-Fix-Loop — PASS (unanimous 7×5/5 + red-team NO-MISS-CERTIFIED, orchestrator anti-forgery re-verified)

PR #414: feat(executor): physical planning bridge to EPIC-03 backend + local query executor (STORY-04.6.2 / Closes #174)
Run identity: base origin/main a810e31 → final HEAD 9443cd5 · Executor-only (Core/Engine untouched, PublicAPI delta empty) · 23 files, first end-to-end DeltaSharp query execution (Collect/Count over an in-memory relation).

This PR delivers the physical-planning bridge that lowers the Core logical plan → EPIC-03 columnar operators and materializes results back to Core Rows — the last of Batch N (after #411 optimizer + #412 actions/Row). The council found no wrong-result bug (execution is correct), but drove out a series of robustness/determinism and doc-accuracy defects across multiple fix rounds, culminating in a genuine red-team MISS (a raw-exception leak) that the voting council had missed.


Progression

| Round | HEAD | Event | Council state |
| --- | --- | --- | --- |
| R0 | dda7aee | Initial bridge (rebased on merged #411/#412) | |
| R1 | 7469001 | Fixed: dup-name → deterministic UnsupportedPlanException (#419); decimal scale-preserving + overflow guard; Date→DateOnly/Timestamp→UTC DateTime; 12-type matrix; nits | 5/7 |
| R2 | b03ecc4 | Fixed: PlanDistinct UniqueProbeName (groupBy(x).count().distinct() collision); timestamp guard; backend-parity reframe | 5/7 |
| R2b | b3f7d1b | Fixed: reverted backend-parity over-correction → accurate smoke-test framing; timestamp-guard doc §7/§8 | 7/7 ✅ |
| R3 | 0b84922 | RED-TEAM MISS-FOUND (HIGH) → guarded ReadDate raw-exception leak + materializer sweep + 2 tests + doc re-verify | |
| R3b | 9443cd5 | Added exact in-range Date boundary regression test (Quality nit) | 7/7 ✅ |
| Gate | 9443cd5 | Red-team NO-MISS-CERTIFIED + orchestrator anti-forgery (mutation C7) | PASS |

Council composition (verified)

| Seat | Role | Model | agent_type | Verdict |
| --- | --- | --- | --- | --- |
| Architect | Deep reasoning / subtle bugs | claude-opus-4.8 | general-purpose | 5/5 |
| Balanced | Maintainability / pragmatism | claude-opus-4.8 | general-purpose | 5/5 |
| Security | Isolation / robustness / supply-chain | claude-opus-4.8 | cloud-native-security-sme | 5/5 |
| Quality | Testability / reliability | gpt-5.5 | general-purpose | 5/5 |
| Columnar ⭐ | Materialization / ColumnBatch (scout specialist) | claude-opus-4.8 | dotnet-vectorized-columnar-compute-engineer | 5/5 |
| QueryExec ⭐ | Execution correctness (scout specialist) | claude-opus-4.8 | query-execution-engine-engineer | 5/5 |
| Writer ⭐ | Design-doc accuracy (scout specialist) | claude-opus-4.8 | general-purpose (technical-writer lens) | 5/5 |
| Red-team | Decorrelated adversarial gate | gemini-3.1-pro-preview | general-purpose (shell) | NO-MISS-CERTIFIED |

Spine = Opus 4.8 ×5 + GPT-5.5 (Quality); red-team decorrelated on Gemini (a family used by no voting seat). All execution-eligible (C7) claims were RUN in scratch clones outside the worktree.


Findings (all resolved)

🟠 HIGH — Raw-exception leak in ReadDate (red-team MISS-FOUND, the council's own miss)

RowMaterializer.ReadDate did not guard out-of-range epoch-days: UnixEpochDate.AddDays(int.MaxValue) (~5.9M years past 1970) leaked a raw System.ArgumentOutOfRangeException to the caller — an asymmetry vs the already-guarded Timestamp and Decimal paths, breaking the bridge's contract that every unmaterializable value surfaces a deterministic UnsupportedPlanException. The 7-seat council missed it; the decorrelated red-team proved it by execution.
Fix (0b84922): wrapped AddDays in try/catch(ArgumentOutOfRangeException) → OutOfRangeDate → deterministic UnsupportedPlanException, mirroring ReadTimestamp. Materializer sweep confirmed this was the only remaining unguarded path (primitives = exact reads; Binary = byte copy; String = Encoding.UTF8.GetString replacement-fallback, never throws; Decimal/Timestamp already guarded).
Tests: Date_OutOfDateOnlyRange_… (int.MaxValue), Date_MinIntEpochDay_… (int.MinValue), Date_ExactInRangeBoundaries_RoundTrip_AndOnePastEachBoundThrows (exact [-719162, 2932896] window round-trips; ±1 past each bound throws).
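
A minimal sketch of the guard pattern this finding describes, assuming a reader that receives the raw epoch-day as an int. NotSupportedException is a stand-in for the PR's UnsupportedPlanException, and the message text is invented for illustration.

```csharp
using System;

// Maps days-since-1970 to DateOnly, converting the framework's range error
// into one deterministic, descriptive exception type.
static DateOnly ReadDate(int epochDay)
{
    var unixEpochDate = new DateOnly(1970, 1, 1);
    try
    {
        return unixEpochDate.AddDays(epochDay);
    }
    catch (ArgumentOutOfRangeException)
    {
        // Stand-in for UnsupportedPlanException: name the offending value and the target type.
        throw new NotSupportedException(
            $"Date epoch-day {epochDay} is outside the range representable by DateOnly.");
    }
}

Console.WriteLine(ReadDate(0)); // 1970-01-01, formatted per the current culture

foreach (int extreme in new[] { int.MaxValue, int.MinValue })
{
    try
    {
        ReadDate(extreme);
    }
    catch (NotSupportedException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
```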

🟡 MEDIUM — groupBy(x).count().distinct() probe-name collision

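PlanDistinct hard-coded the name count for its COUNT(*) probe, so a child schema already carrying a count column (for example the output of groupBy(x).count()) produced an intermediate [x, count, count] schema and threw a raw StructType SchemaValidationException.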
Fix: UniqueProbeName — an ordinal name set (consistent with StructType's own ordinal dup-check) that emits __distinct_count else __distinct_count_{N}, provably collision-proof and bounded by column count. Test: Distinct_OverUserColumnNamedCount_Works (+ QueryExec independently wrote 5 edge cases incl. a user column literally named __distinct_count_0).
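
One way such a collision-proof name could be derived is sketched below. The signature and the choice to start the suffix at 0 are assumptions made for illustration; only the reserved base name and the ordinal name set come from the description above.

```csharp
using System;
using System.Collections.Generic;

// Returns a probe column name that cannot collide with any child column.
static string UniqueProbeName(IEnumerable<string> childColumnNames)
{
    const string reserved = "__distinct_count";

    // Ordinal comparison, matching how the schema itself checks for duplicates.
    var taken = new HashSet<string>(childColumnNames, StringComparer.Ordinal);
    if (!taken.Contains(reserved))
    {
        return reserved;
    }

    // At most taken.Count candidates can be occupied, so this loop
    // terminates within taken.Count + 1 iterations.
    for (int suffix = 0; ; suffix++)
    {
        string candidate = $"{reserved}_{suffix}";
        if (!taken.Contains(candidate))
        {
            return candidate;
        }
    }
}

Console.WriteLine(UniqueProbeName(new[] { "x", "count" }));                               // __distinct_count
Console.WriteLine(UniqueProbeName(new[] { "__distinct_count", "__distinct_count_0" })); // __distinct_count_1
```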

🟡 MEDIUM — Duplicate output column names leaked SchemaValidationException

Select(col, col) / equi-join sharing a column name leaked a raw Engine exception instead of a deterministic diagnostic.
Fix: SchemaOf pre-detects duplicates (Ordinal HashSet) → deterministic UnsupportedPlanException naming the duplicate; full Spark duplicate-name support deferred to #419. Tests: SelectSameColumnTwice_…, EquiJoinWithSharedColumnName_….
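
The duplicate pre-detection can be sketched with an ordinal HashSet, where Add returning false signals a repeat. TryFindDuplicate is a hypothetical helper name, not the PR's SchemaOf.

```csharp
using System;
using System.Collections.Generic;

// Reports the first output column name that appears more than once.
static bool TryFindDuplicate(IEnumerable<string> outputNames, out string duplicate)
{
    var seen = new HashSet<string>(StringComparer.Ordinal);
    foreach (string name in outputNames)
    {
        // HashSet.Add returns false when the element is already present.
        if (!seen.Add(name))
        {
            duplicate = name;
            return true;
        }
    }

    duplicate = string.Empty;
    return false;
}

// Select(col, col): the same name twice is a duplicate.
Console.WriteLine(TryFindDuplicate(new[] { "id", "id" }, out string dup) ? $"duplicate: {dup}" : "ok");

// Ordinal comparison is case-sensitive, so these two names are distinct.
Console.WriteLine(TryFindDuplicate(new[] { "id", "ID" }, out _) ? "duplicate" : "ok");
```

Running the check before building the schema lets the bridge raise its own diagnostic that names the column, rather than letting the schema constructor fail first.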

🟡 MEDIUM — Decimal/Date/Timestamp materialization correctness

Decimal wasn't scale-preserving (decimal(5,2) 100.00 rendered 100) and could overflow; Date/Timestamp surfaced the raw epoch int/long.
Fix: decimal via new decimal(lo,mid,hi,neg,scale) (96-bit split, scale-preserving; scale>28/>96-bit → deterministic UnsupportedPlanException); Date→DateOnly, Timestamp→UTC DateTime (+ range guard) — inverses of lit(). Tests: 12-type matrix + LitDate/LitTimestamp round-trip + out-of-range guards.
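
A sketch of the scale-preserving decimal construction named above, assuming the columnar value arrives as an unscaled integer plus a scale. BigInteger is used here to sidestep the negation edge case of fixed-width types; that is a choice made for this example, not necessarily the PR's. NotSupportedException again stands in for UnsupportedPlanException.

```csharp
using System;
using System.Globalization;
using System.Numerics;

// Builds a System.Decimal from an unscaled integer and a scale without losing trailing zeros.
static decimal ToDecimal(BigInteger unscaled, byte scale)
{
    if (scale > 28)
    {
        throw new NotSupportedException($"Scale {scale} exceeds System.Decimal's maximum of 28.");
    }

    bool negative = unscaled.Sign < 0;
    BigInteger magnitude = BigInteger.Abs(unscaled);

    // System.Decimal stores a 96-bit magnitude.
    BigInteger maxMagnitude = (BigInteger.One << 96) - 1;
    if (magnitude > maxMagnitude)
    {
        throw new NotSupportedException("Value does not fit in 96 bits.");
    }

    // Split the magnitude into three 32-bit words; the int casts reinterpret the bits.
    int lo = unchecked((int)(uint)(magnitude & uint.MaxValue));
    int mid = unchecked((int)(uint)((magnitude >> 32) & uint.MaxValue));
    int hi = unchecked((int)(uint)((magnitude >> 64) & uint.MaxValue));

    return new decimal(lo, mid, hi, negative, scale);
}

// decimal(5,2) value 100.00 is stored as unscaled 10000 with scale 2.
Console.WriteLine(ToDecimal(10000, 2).ToString(CultureInfo.InvariantCulture)); // 100.00
```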

🟡 MEDIUM — Backend-parity doc/comment over-correction (doc-vs-code accuracy)

A round-2 reframe wrongly claimed the executor end-to-end check was a "genuine interpreted-vs-compiled expression differential." Verified false: both CompiledBackend.Open and InterpretedVectorizedBackend.Open delegate to the same InterpretedOperators.Open (interpreted evaluators); CompiledBackend.BuildExpressionEvaluator has no operator-path callers (only the Engine BackendParityOracle, #154).
Fix (b3f7d1b): reverted all 3 sites (doc §10, OptionsFor comment, test comment) to the accurate smoke/plumbing cross-check framing; genuine differential located in the Engine oracle (#154); forward operator-fusion wiring cited as EPIC-13 / #309/#310 (stale #148 removed).

🔵 LOW / nits (fixed)

Positive-boundary regression test (9443cd5, Quality); §7/§8 Date & Timestamp out-of-range diagnostics documented; dead AnsiMode/TryGetBatches nits; LogicalOutput self-check strengthened to reject same-id name/type mismatch; Row→#177 provenance & cross-refs.


Design-doc conformance

docs/engineering/design/physical-planning.md (341 lines) — ✅ Full. Writer re-verified line-by-line (node→operator table, Distinct lowering, LogicalOutput §5, materialization §7, diagnostics §8, backend-selection §10) against code with zero unresolved divergences.

Validation evidence (final HEAD 9443cd5)

  • dotnet build -c Release -warnaserror (net8.0;net10.0) → 0 warnings / 0 errors
  • dotnet test Executor.Tests → 50/50 (Core unaffected, 684×2 on merged main)
  • dotnet format --verify-no-changes → clean · PublicAPI delta vs origin/main → empty
  • CI green: build-test-format ✓ · dco ✓ · pack-validate ✓ · publish-aot ✓ · build-samples ✓

Red-team gate + orchestrator anti-forgery

  • Red-team (gemini-3.1-pro-preview, shell): NO-MISS-CERTIFIED with a fully-populated Falsification-Attempts block. It certified the Date fix (exact DateOnly.MaxValue epoch-day 3652058 → 9999-12-31; int extremes → deterministic) and adversarially swept every reader (String malformed UTF-8 0xFF 0xFE 0xFD → \uFFFD, no-throw; Decimal Int128.MinValue negation → bit-preserving deterministic guard; Byte signed cast; all primitives/Binary). C7: build 0/0, 50/50, custom adversarial tests green.
  • Orchestrator anti-forgery (independent, out-of-tree): re-ran the shipped suite in a /tmp git archive clone → 50/50, then mutated the guard away and confirmed all 3 Date tests redden (Failed: 3 — raw ArgumentOutOfRangeException from ReadDate:91 reproduced) — proving the tests are non-vacuous and the guard is load-bearing. The red-team's certification is independently corroborated, not forged.

Deferrals (all tracked + verified open)

Commits (6)

dda7aee bridge · 7469001 materialization + dup-name diagnostic + docs · b03ecc4 Distinct collision + timestamp guard · b3f7d1b backend-parity framing + timestamp doc · 0b84922 ReadDate guard (red-team fix) · 9443cd5 Date boundary regression test


Recommendation: APPROVE (PASS). Unanimous 7×5/5 across all voting seats, decorrelated red-team NO-MISS-CERTIFIED, orchestrator anti-forgery independently re-verified, CI green, every deferral filed and open. Merge gate satisfied.

@khaines khaines merged commit ed1aa5f into main Jul 3, 2026
5 checks passed
@khaines khaines deleted the khaines/feat-04.6.2-physical-planning branch July 3, 2026 17:06

Development

Successfully merging this pull request may close these issues.

[Story] STORY-04.6.2: Physical planning bridge to EPIC-03 backend
