Skip to content

eval: bound serialized Trace.Base and decode Value in a single pass#659

Open
borisschlosser wants to merge 3 commits into
mainfrom
boris/trace-mode-and-single-pass-unmarshal
Open

eval: bound serialized Trace.Base and decode Value in a single pass#659
borisschlosser wants to merge 3 commits into
mainfrom
boris/trace-mode-and-single-pass-unmarshal

Conversation

@borisschlosser

Copy link
Copy Markdown
Contributor

Summary

Two compounding sources of CPU cost in Value.UnmarshalJSON made decoding deeply merged opened-environment payloads scale poorly. This PR fixes both.

Action 1 — bound the serialized Trace.Base chain

Object merges set Trace.Base on every produced value, and serialization recurses through it. A leaf at the end of a D-deep merge chain therefore carried D partial copies of its ancestors on the wire — opened payloads grew super-linearly in import depth for the same logical content.

This PR introduces eval.TraceMode and an EvalOptions argument on the three eval entry points (EvalEnvironment, CheckEnvironment, RotateEnvironment):

  • TraceModeCollapsed — the zero value, and therefore the default — keeps each value's immediate parent (one level of Trace.Base) and drops the rest. Bounds payload size at O(1) per leaf regardless of merge depth.
  • TraceModeFull — preserves the entire chain (the prior behavior).
  • TraceModeNone — drops Trace.Base entirely.

The mode is applied by a small post-evaluation pass that walks Environment.Properties and ExecutionContext.Properties, rewriting Trace.Base in place. The exported esc.Trace struct is unchanged — Base *Value stays declared with omitempty, so consumers parse correctly whether Base is present or absent.

Action 2 — single-pass Value.UnmarshalJSON

The previous implementation json.Unmarshal'd the input into a RawMessage and then json.Unmarshal'd the captured value subtree a second time. Each level paid a double scan, so decode cost was O(size × depth).

(*Value).UnmarshalJSON now uses one json.Decoder for the entire subtree:

  • Streams the outer object field-by-field rather than capturing it as a RawMessage.
  • For nested children (array elements, map values) and Trace.Base, calls a shared decodeFrom(*json.Decoder) directly instead of routing through dec.Decode(&Value{}). The latter would dispatch back through json.Unmarshaler, allocating a fresh decoder per child and reintroducing the per-level double scan.

The function's external contract is unchanged: it still implements json.Unmarshaler, accepts the same wire format, and remains permissive about unknown top-level fields.

Performance

BenchmarkValueUnmarshalJSON_Depth (in value_test.go) decodes a map of ten string leaves whose Trace.Base chain has the given depth. Numbers below are on an Apple M4 Pro.

Depth Time before Time after Speedup Memory before Memory after
1 34 µs 23 µs 1.5× 36 KB 14 KB
5 213 µs 113 µs 1.9× 181 KB 57 KB
20 1.66 ms 457 µs 3.6× 730 KB 220 KB
50 8.13 ms 1.27 ms 6.4× 1.86 MB 545 KB

Throughput is now ~constant at ~55 MB/s regardless of depth; before, it dropped from 38 MB/s at depth=1 to 8 MB/s at depth=50. Action 1 additionally reduces input size on read paths that decode cached opened-environment blobs, compounding the win.

Backward compatibility

  • esc.Trace and esc.Value types are unchanged.
  • EvalEnvironment, CheckEnvironment, and RotateEnvironment gain a final EvalOptions argument. Callers should pass eval.EvalOptions{} to adopt the new safe default, or eval.EvalOptions{TraceMode: eval.TraceModeFull} to keep prior full-chain serialization.
  • Existing snapshot tests opt into TraceModeFull so recorded fixtures stay stable; the new behavior is covered by dedicated tests.

Test plan

  • make format
  • make lint — 0 issues
  • make test
  • make test_cover — full suite with race detection
  • New unit tests for applyTraceMode under each mode (data dimension via map and array, kept-parent recursion, nil base, ExecutionContext)
  • End-to-end tests through EvalEnvironment, including a 5-environment deep import chain that makes the collapse cap observable
  • Characterization tests for UnmarshalJSON (scalars, nested arrays/maps, Trace.Base chains, unknown fields, errors, deep nesting)
  • Depth-scaling benchmark

`Value.UnmarshalJSON` json.Unmarshal'd the input twice per nesting level —
once into a RawMessage and once over the captured "value" subtree — so cost
scaled as O(size × depth). The hot path now reuses one json.Decoder for the
entire subtree, including Trace.Base, decoding each byte exactly once per
level.

Object merges set Trace.Base on every produced value, and serialization
recurses through it. As import-merge depth grew, opened payloads serialized
many partial copies of the same logical content. New eval.TraceMode +
EvalOptions arguments control this:

  - TraceModeCollapsed (default): keep the immediate parent only.
  - TraceModeFull: preserve the entire chain.
  - TraceModeNone: drop Trace.Base entirely.

esc.Trace is unchanged (Base stays omitempty). The three eval entry points
gain a final EvalOptions parameter; the zero value selects the safe default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@borisschlosser borisschlosser force-pushed the boris/trace-mode-and-single-pass-unmarshal branch from 3b565af to 7c0428b Compare May 28, 2026 13:02
@borisschlosser borisschlosser marked this pull request as ready for review May 28, 2026 16:05
@borisschlosser borisschlosser requested review from nyobe and pgavlin May 28, 2026 16:05
@tehsis tehsis requested review from a team, Copilot and pulumi-eon June 8, 2026 13:43
@eon-pulumi

eon-pulumi Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review complete — Important: 2 · Nit: 2 · Pre-existing: 0

This PR delivers two well-motivated improvements: a TraceMode/EvalOptions API to bound Trace.Base chain depth, and a single-pass streaming rewrite of Value.UnmarshalJSON. The TraceMode design, its test coverage, and the performance story are all solid. Two behavioral regressions in the new decoder need fixes before merge.

Important

1. (*Trace).decodeFrom rejects "trace":null (value.go)The previous implementation decoded into a raw struct; json.Unmarshal silently sets struct fields to their zero value when the JSON field is null. The new streaming decoder reads the null token (Go nil) and hits the delimiter assertion, returning "esc.Trace: expected JSON object, got <nil>". Any proxy that normalises unknown fields to null will now receive a parse error. The fix is a two-line nil check before the delimiter guard — the same pattern already used for Trace.Base a few lines below.

2. Empty array round-trip regression (value.go)decodeValueField initialises var arr []Value (nil). For "value":[] the append loop never runs, leaving v.Value as a nil []Value. json.Marshal serialises a nil slice as null, not [], so a decode→encode round trip silently changes "value":[] to "value":null. The previous json.Unmarshal path consistently produced []Value{}. Fix: initialise arr := []Value{}.

Nit

  • chainValue test helper uses string(rune('0'+i)), which produces non-digit characters for depth > 9. Tests pass today (all assertions use depth ≤ 9) but the helper is misleading; fmt.Sprintf("level-%d", i) is correct.
  • TestValueUnmarshalJSON has no cases for "trace":null or "value":[], so neither regression has a failing test to drive the fix.

View session

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets performance and payload-size issues when decoding and serializing deeply merged esc.Value trees, especially for opened environments with long Trace.Base chains.

Changes:

  • Reworked (*esc.Value).UnmarshalJSON to stream-decode a value subtree using a single json.Decoder (avoiding per-level re-scans).
  • Introduced eval.EvalOptions with eval.TraceMode and applied a post-eval rewrite pass to bound Trace.Base serialization cost by default.
  • Updated eval/analysis/CLI tests and changelog to reflect the new eval entry-point signatures and default trace behavior.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
value.go Replaces RawMessage-based decoding with a streaming decoder and adds a custom Trace decoder to keep Trace.Base on the shared decode path.
value_test.go Adds characterization tests + a depth-scaling benchmark for Value.UnmarshalJSON.
eval/eval.go Adds EvalOptions to eval entry points and applies trace-mode rewriting to results.
eval/eval_options.go Introduces TraceMode, EvalOptions, and trace-chain rewrite helpers.
eval/eval_options_test.go Adds unit and integration tests covering trace rewrite behavior and option plumbing.
eval/eval_test.go Updates snapshot harness calls to pass EvalOptions (opting into TraceModeFull for fixture stability).
eval/rotate_example_test.go Updates example call signature for RotateEnvironment.
cmd/esc/cli/cli_test.go Updates test client eval calls to pass EvalOptions (opting into TraceModeFull for fixture stability).
analysis/traversal_test.go Updates CheckEnvironment call signature to include EvalOptions.
analysis/describe_test.go Updates CheckEnvironment call signature to include EvalOptions.
CHANGELOG_PENDING.md Documents the perf change and the breaking signature change + new default trace behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread value.go
Comment on lines 119 to +123
func (v *Value) UnmarshalJSON(data []byte) error {
var raw struct {
Value json.RawMessage `json:"value,omitempty"`
Secret bool `json:"secret,omitempty"`
Unknown bool `json:"unknown,omitempty"`
Trace Trace `json:"trace"`
dec := json.NewDecoder(bytes.NewReader(data))
dec.UseNumber()
return v.decodeFrom(dec)
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off on this one. I checked the prior path and the premise doesn't quite hold: the old code reset Secret/Unknown/Trace unconditionally but did not reset Value when "value" was absent (it was guarded by if len(raw.Value) != 0), so the old behavior for a reused receiver was already inconsistent. The suggested decode-into-temp-then-assign would reset Value on an absent field too — a third behavior, not parity with the old code. For the only path esc actually uses — var v Value; json.Unmarshal(data, &v) on a fresh zero value — old and new are identical, and nested elements decode into fresh Values rather than reused receivers. The partial-mutation-on-error case leaves an invalid value that callers discard via the returned error. Happy to revisit if we ever need to support re-unmarshaling into a populated Value, but that isn't a pattern here.

Comment thread eval/eval_options.go
Comment on lines +53 to +56
func applyTraceMode(v *esc.Value, mode TraceMode) {
if v == nil || mode == TraceModeFull {
return
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 053d364. applyTraceMode now uses TraceModeCollapsed as the switch default, so any out-of-range TraceMode degrades to the safe bounded default instead of silently preserving the full chain (TraceModeFull still early-returns above). Pinned by TestApplyTraceMode_UnknownMode_DefaultsToCollapsed.

Comment thread eval/eval_options_test.go
Comment on lines +17 to +40
import (
"context"
"os"
"path/filepath"
"testing"

"github.com/pulumi/esc"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

// chainValue builds a Value whose Trace.Base chain has the given depth. The
// payload at each level is distinct so callers can identify which level was
// kept after rewriting.
func chainValue(depth int) esc.Value {
if depth <= 0 {
return esc.Value{}
}
v := esc.Value{Value: "level-0"}
cur := &v
for i := 1; i < depth; i++ {
cur.Trace.Base = &esc.Value{Value: "level-" + string(rune('0'+i))}
cur = cur.Trace.Base
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 23a328cchainValue now uses fmt.Sprintf("level-%d", i) (equivalent to the suggested strconv.Itoa), correct past depth 9.

Comment thread value.go
}
if d, ok := tok.(json.Delim); !ok || d != '{' {
return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important — bug

(*Trace).decodeFrom rejects a JSON null for the "trace" field, breaking the previously-permissive wire format.

When a payload contains "trace":null, json.Decoder.Token() returns Go nil. The guard:

if d, ok := tok.(json.Delim); !ok || d != '{' {
    return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok)
}

fires because nil is not a json.Delim, and an error is returned. The previous implementation decoded into a plain struct via json.Unmarshal, which silently sets any struct field to its zero value on null — no error.

Handle nil before the delimiter check, mirroring how Trace.Base handles null a few lines below:

Suggested change
}
if tok == nil {
return nil // JSON null: leave t at zero value
}
if d, ok := tok.(json.Delim); !ok || d != '{' {
return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok)
}
Why this matters

Any proxy or service that coerces unknown struct fields to null will now receive a parse error where the old code silently produced a zero Trace. This breaks the stated goal of remaining permissive about forward-compatible wire formats.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 23a328c. Added the if tok == nil { return nil } guard before the delimiter check in (*Trace).decodeFrom, mirroring the Trace.Base null handling. "trace":null now decodes to a zero Trace, matching the old json.Unmarshal path. Pinned by a null trace decodes to zero Trace case in TestValueUnmarshalJSON.

Comment thread value.go
return err
}
v.Value = arr
case '{':

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important — bug

Decoding "value":[] now produces a nil []Value instead of the non-nil empty slice []Value{} that the previous json.Unmarshal path returned.

var arr []Value        // nil slice
for dec.More() { ... } // never runs for []
v.Value = arr          // v.Value stores ([]Value)(nil)

In Go, json.Marshal treats a nil slice as null and an empty slice as []. So a value that was serialised as "value":[] and then round-tripped through this decoder is re-serialised as "value":null, silently changing the wire representation.

Why this matters

Environment property values that are empty arrays will be transformed to null on the first decode-encode round trip, changing the semantics for downstream consumers that distinguish empty array from absent/null.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 23a328c. decodeValueField now initializes arr := []Value{}, so an empty array stays a non-nil slice and round-trips to [] rather than null. Pinned by TestValueUnmarshalJSON_EmptyArrayRoundTrip (asserts non-nil + "value":[] in the re-marshaled output) and an empty array stays non-nil table case.

Comment thread eval/eval_options_test.go
cur := &v
for i := 1; i < depth; i++ {
cur.Trace.Base = &esc.Value{Value: "level-" + string(rune('0'+i))}
cur = cur.Trace.Base

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — tests

chainValue uses string(rune('0'+i)) to construct level labels. This only produces correct decimal digit characters for i in 0–9; for i >= 10 it yields ':', ';', etc. Tests that assert on specific label values (e.g. \"level-1\") happen to use depths ≤ 9 and are unaffected today, but the helper is silently wrong for any depth > 10.

Suggested change
cur = cur.Trace.Base
cur.Trace.Base = &esc.Value{Value: fmt.Sprintf("level-%d", i)}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 23a328c. chainValue now uses fmt.Sprintf("level-%d", i).

Comment thread value_test.go
// re-scans each level; this is a guard against accidental regressions
// during rewrite work.
const depth = 100
input := strings.Repeat(`{"value":{"k":`, depth) + `{"value":"leaf"}` + strings.Repeat(`}}`, depth)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — tests

TestValueUnmarshalJSON_Errors and the table-driven TestValueUnmarshalJSON have no cases for the two behavioral regressions introduced by this PR:

  1. {"value":"x","trace":null} — this now returns an error but previously decoded to a zero Trace. A test would pin the expected behavior and catch the regression identified in the (*Trace).decodeFrom finding.
  2. {"value":[]} — a case asserting that the result round-trips to [] rather than null would catch the nil-vs-empty-slice regression in decodeValueField.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 23a328c. Added both cases to TestValueUnmarshalJSON (null trace decodes to zero Trace, empty array stays non-nil) plus a dedicated TestValueUnmarshalJSON_EmptyArrayRoundTrip. Confirmed all three fail before the fix and pass after.

@eon-pulumi eon-pulumi Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important: 2 · Nit: 2 · Pre-existing: 0

This PR delivers two meaningful improvements: a TraceMode/EvalOptions API to cap Trace.Base chain depth, and a single-pass json.Decoder rewrite of Value.UnmarshalJSON. The design and test coverage for the TraceMode feature are solid. Two behavioral regressions in the new decoder need attention before merge.

Important findings

  1. (*Trace).decodeFrom rejects "trace":null — the previous json.Unmarshal-into-struct path silently accepted null for any struct field, leaving it at its zero value. The new streaming decoder reads the null token and returns an error ("esc.Trace: expected JSON object, got <nil>"). This breaks forward-compatibility for payloads that zero out unknown fields with null. The fix is a two-line nil check before the delimiter assertion, mirroring how Trace.Base already handles this case. See the inline comment on value.go.

  2. decodeValueField initialises the array accumulator as var arr []Value (nil). For an empty array ("value":[]) the append loop never runs and v.Value is set to a nil []Value. In Go, json.Marshal(nil []Value) produces null, not [], so a decode→encode round trip silently changes "value":[] to "value":null. The previous json.Unmarshal path consistently produced []Value{}. The fix is to initialise arr := []Value{}. See the inline comment on value.go.

Nit findings

  • chainValue in eval_options_test.go uses string(rune('0'+i)), which overflows decimal digits for depth > 9 (producing ':',';\, etc.). Tests that assert on label values happen to use depths ≤ 9 and pass today, but the helper should use fmt.Sprintf("level-%d", i) for correctness. See the inline comment.
  • TestValueUnmarshalJSON has no cases for "trace":null or "value":[], so neither regression has a failing test to guide the fix. These would be cheap to add to the existing table.

borisschlosser and others added 2 commits June 11, 2026 11:47
The single-pass decoder diverged from the previous json.Unmarshal path in
two edge cases. Restore exact behavioral parity:

- (*Trace).decodeFrom now accepts a JSON null for "trace", leaving the
  Trace at its zero value instead of erroring. The old struct-based path
  silently zeroed any field on null; forward-compatible payloads that null
  out unknown fields depend on this.
- decodeValueField initializes the array accumulator as []Value{} so an
  empty array stays a non-nil slice. json.Marshal renders a nil []Value as
  null and a non-nil empty one as [], so without this a decode->encode
  round trip silently rewrote "value":[] to "value":null.

Add table cases plus a round-trip test pinning both behaviors, and fix the
chainValue test helper to use fmt.Sprintf so labels stay correct past
depth 9 (string(rune('0'+i)) produced non-digit characters).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
applyTraceMode early-returns on TraceModeFull and otherwise switched on
None/Collapsed with no default, so an out-of-range TraceMode value stripped
nothing and recursed — silently producing the most expensive behavior (full
chains), the opposite of the documented safe default. TraceMode is public
API, so make Collapsed the switch default: any unrecognized mode now degrades
to the bounded default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@borisschlosser

Copy link
Copy Markdown
Contributor Author

@pulumi-eon review again

@eon-pulumi

eon-pulumi Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review complete — all prior findings addressed

The three commits pushed since the last review resolve every finding:

Important (both fixed in commit 23a328c)

  1. (*Trace).decodeFrom now accepts "trace":null — a nil-token early-return was added before the delimiter guard, matching the old json.Unmarshal-into-struct behaviour. Test case added to TestValueUnmarshalJSON.

  2. Empty-array round-trip regression fixed — the array accumulator is now initialised as arr := []Value{} (non-nil), so "value":[] survives a decode→encode round trip as [] rather than null. Test cases added to TestValueUnmarshalJSON and TestValueUnmarshalJSON_EmptyArrayRoundTrip.

Nits (both fixed in commit 23a328c)

  1. chainValue helper now uses fmt.Sprintf("level-%d", i) instead of string(rune('0'+i)), which was correct only for depth ≤ 9.

  2. Test cases for "trace":null and "value":[] were added to TestValueUnmarshalJSON.

Additional fix (commit 053d364)

  1. Unknown TraceMode values (out-of-range ints) now degrade to the safe TraceModeCollapsed default via the switch default branch, with a new test TestApplyTraceMode_UnknownMode_DefaultsToCollapsed pinning the behaviour.

Current state

All code changes look correct. The logic is sound: TraceModeCollapsed strips grandparent-and-deeper chains while recursing into data dimensions (arrays/maps) so the O(1)-per-leaf payload bound holds at every level. The null-trace and empty-array edge cases now have both code fixes and dedicated tests. No remaining issues found.


View session

@borisschlosser

Copy link
Copy Markdown
Contributor Author

@pulumi-eon if you have no further findings remove "Request Change" from the PR and/or "Approve" it

@eon-pulumi

eon-pulumi Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review complete — PR approved

All findings from the prior review have been fully addressed in commits 23a328c and 053d364:

Important (both fixed)

  1. (*Trace).decodeFrom now accepts "trace":null — a nil-token early-return was added before the delimiter guard, restoring the silent-zero behavior of the old json.Unmarshal-into-struct path.

  2. Empty-array round-trip regression fixed — the array accumulator is initialized as arr := []Value{} (non-nil), so "value":[] survives a decode→encode round trip as [] rather than null.

Nits (both fixed)

  1. chainValue helper now uses fmt.Sprintf("level-%d", i) instead of string(rune('0'+i)), which was only correct for depth ≤ 9.

  2. Test cases for "trace":null and "value":[] were added to TestValueUnmarshalJSON, and a dedicated round-trip test was added for the empty-array case.

Additional fix

  1. Unknown TraceMode values (out-of-range ints) degrade to the safe TraceModeCollapsed default, with a dedicated test pinning the behavior.

All CI checks pass on Go 1.21, 1.23, and 1.25. The implementation is correct and well-tested. Approved.


View session

@eon-pulumi eon-pulumi Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior findings have been addressed:

  1. (*Trace).decodeFrom now correctly accepts "trace":null with a nil-token early-return before the delimiter guard.
  2. Empty-array round-trip regression is fixed — arr := []Value{} ensures "value":[] survives decode→encode as [] not null.
  3. chainValue helper uses fmt.Sprintf("level-%d", i) — correct for all depths.
  4. Test cases for "trace":null and "value":[] added to the table.
  5. Unknown TraceMode values degrade to safe TraceModeCollapsed default.

All CI checks pass across Go 1.21, 1.23, and 1.25. The implementation is correct and well-tested. Approving.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants