eval: bound serialized Trace.Base and decode Value in a single pass#659
eval: bound serialized Trace.Base and decode Value in a single pass#659borisschlosser wants to merge 3 commits into
Conversation
`Value.UnmarshalJSON` json.Unmarshal'd the input twice per nesting level — once into a RawMessage and once over the captured "value" subtree — so cost scaled as O(size × depth). The hot path now reuses one json.Decoder for the entire subtree, including Trace.Base, decoding each byte exactly once per level. Object merges set Trace.Base on every produced value, and serialization recurses through it. As import-merge depth grew, opened payloads serialized many partial copies of the same logical content. New eval.TraceMode + EvalOptions arguments control this: - TraceModeCollapsed (default): keep the immediate parent only. - TraceModeFull: preserve the entire chain. - TraceModeNone: drop Trace.Base entirely. esc.Trace is unchanged (Base stays omitempty). The three eval entry points gain a final EvalOptions parameter; the zero value selects the safe default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3b565af to
7c0428b
Compare
Review complete — Important: 2 · Nit: 2 · Pre-existing: 0This PR delivers two well-motivated improvements: a Important1. 2. Empty array round-trip regression ( Nit
|
There was a problem hiding this comment.
Pull request overview
This PR targets performance and payload-size issues when decoding and serializing deeply merged esc.Value trees, especially for opened environments with long Trace.Base chains.
Changes:
- Reworked
(*esc.Value).UnmarshalJSONto stream-decode a value subtree using a singlejson.Decoder(avoiding per-level re-scans). - Introduced
eval.EvalOptionswitheval.TraceModeand applied a post-eval rewrite pass to boundTrace.Baseserialization cost by default. - Updated eval/analysis/CLI tests and changelog to reflect the new eval entry-point signatures and default trace behavior.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| value.go | Replaces RawMessage-based decoding with a streaming decoder and adds a custom Trace decoder to keep Trace.Base on the shared decode path. |
| value_test.go | Adds characterization tests + a depth-scaling benchmark for Value.UnmarshalJSON. |
| eval/eval.go | Adds EvalOptions to eval entry points and applies trace-mode rewriting to results. |
| eval/eval_options.go | Introduces TraceMode, EvalOptions, and trace-chain rewrite helpers. |
| eval/eval_options_test.go | Adds unit and integration tests covering trace rewrite behavior and option plumbing. |
| eval/eval_test.go | Updates snapshot harness calls to pass EvalOptions (opting into TraceModeFull for fixture stability). |
| eval/rotate_example_test.go | Updates example call signature for RotateEnvironment. |
| cmd/esc/cli/cli_test.go | Updates test client eval calls to pass EvalOptions (opting into TraceModeFull for fixture stability). |
| analysis/traversal_test.go | Updates CheckEnvironment call signature to include EvalOptions. |
| analysis/describe_test.go | Updates CheckEnvironment call signature to include EvalOptions. |
| CHANGELOG_PENDING.md | Documents the perf change and the breaking signature change + new default trace behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (v *Value) UnmarshalJSON(data []byte) error { | ||
| var raw struct { | ||
| Value json.RawMessage `json:"value,omitempty"` | ||
| Secret bool `json:"secret,omitempty"` | ||
| Unknown bool `json:"unknown,omitempty"` | ||
| Trace Trace `json:"trace"` | ||
| dec := json.NewDecoder(bytes.NewReader(data)) | ||
| dec.UseNumber() | ||
| return v.decodeFrom(dec) | ||
| } |
There was a problem hiding this comment.
Holding off on this one. I checked the prior path and the premise doesn't quite hold: the old code reset Secret/Unknown/Trace unconditionally but did not reset Value when "value" was absent (it was guarded by if len(raw.Value) != 0), so the old behavior for a reused receiver was already inconsistent. The suggested decode-into-temp-then-assign would reset Value on an absent field too — a third behavior, not parity with the old code. For the only path esc actually uses — var v Value; json.Unmarshal(data, &v) on a fresh zero value — old and new are identical, and nested elements decode into fresh Values rather than reused receivers. The partial-mutation-on-error case leaves an invalid value that callers discard via the returned error. Happy to revisit if we ever need to support re-unmarshaling into a populated Value, but that isn't a pattern here.
| func applyTraceMode(v *esc.Value, mode TraceMode) { | ||
| if v == nil || mode == TraceModeFull { | ||
| return | ||
| } |
There was a problem hiding this comment.
Fixed in 053d364. applyTraceMode now uses TraceModeCollapsed as the switch default, so any out-of-range TraceMode degrades to the safe bounded default instead of silently preserving the full chain (TraceModeFull still early-returns above). Pinned by TestApplyTraceMode_UnknownMode_DefaultsToCollapsed.
| import ( | ||
| "context" | ||
| "os" | ||
| "path/filepath" | ||
| "testing" | ||
|
|
||
| "github.com/pulumi/esc" | ||
| "github.com/stretchr/testify/assert" | ||
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| // chainValue builds a Value whose Trace.Base chain has the given depth. The | ||
| // payload at each level is distinct so callers can identify which level was | ||
| // kept after rewriting. | ||
| func chainValue(depth int) esc.Value { | ||
| if depth <= 0 { | ||
| return esc.Value{} | ||
| } | ||
| v := esc.Value{Value: "level-0"} | ||
| cur := &v | ||
| for i := 1; i < depth; i++ { | ||
| cur.Trace.Base = &esc.Value{Value: "level-" + string(rune('0'+i))} | ||
| cur = cur.Trace.Base | ||
| } |
There was a problem hiding this comment.
Fixed in 23a328c — chainValue now uses fmt.Sprintf("level-%d", i) (equivalent to the suggested strconv.Itoa), correct past depth 9.
| } | ||
| if d, ok := tok.(json.Delim); !ok || d != '{' { | ||
| return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok) | ||
| } |
There was a problem hiding this comment.
Important — bug
(*Trace).decodeFrom rejects a JSON null for the "trace" field, breaking the previously-permissive wire format.
When a payload contains "trace":null, json.Decoder.Token() returns Go nil. The guard:
if d, ok := tok.(json.Delim); !ok || d != '{' {
return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok)
}fires because nil is not a json.Delim, and an error is returned. The previous implementation decoded into a plain struct via json.Unmarshal, which silently sets any struct field to its zero value on null — no error.
Handle nil before the delimiter check, mirroring how Trace.Base handles null a few lines below:
| } | |
| if tok == nil { | |
| return nil // JSON null: leave t at zero value | |
| } | |
| if d, ok := tok.(json.Delim); !ok || d != '{' { | |
| return fmt.Errorf("esc.Trace: expected JSON object, got %v", tok) | |
| } |
Why this matters
Any proxy or service that coerces unknown struct fields to null will now receive a parse error where the old code silently produced a zero Trace. This breaks the stated goal of remaining permissive about forward-compatible wire formats.
There was a problem hiding this comment.
Fixed in 23a328c. Added the if tok == nil { return nil } guard before the delimiter check in (*Trace).decodeFrom, mirroring the Trace.Base null handling. "trace":null now decodes to a zero Trace, matching the old json.Unmarshal path. Pinned by a null trace decodes to zero Trace case in TestValueUnmarshalJSON.
| return err | ||
| } | ||
| v.Value = arr | ||
| case '{': |
There was a problem hiding this comment.
Important — bug
Decoding "value":[] now produces a nil []Value instead of the non-nil empty slice []Value{} that the previous json.Unmarshal path returned.
var arr []Value // nil slice
for dec.More() { ... } // never runs for []
v.Value = arr // v.Value stores ([]Value)(nil)In Go, json.Marshal treats a nil slice as null and an empty slice as []. So a value that was serialised as "value":[] and then round-tripped through this decoder is re-serialised as "value":null, silently changing the wire representation.
Why this matters
Environment property values that are empty arrays will be transformed to null on the first decode-encode round trip, changing the semantics for downstream consumers that distinguish empty array from absent/null.
There was a problem hiding this comment.
Fixed in 23a328c. decodeValueField now initializes arr := []Value{}, so an empty array stays a non-nil slice and round-trips to [] rather than null. Pinned by TestValueUnmarshalJSON_EmptyArrayRoundTrip (asserts non-nil + "value":[] in the re-marshaled output) and an empty array stays non-nil table case.
| cur := &v | ||
| for i := 1; i < depth; i++ { | ||
| cur.Trace.Base = &esc.Value{Value: "level-" + string(rune('0'+i))} | ||
| cur = cur.Trace.Base |
There was a problem hiding this comment.
Nit — tests
chainValue uses string(rune('0'+i)) to construct level labels. This only produces correct decimal digit characters for i in 0–9; for i >= 10 it yields ':', ';', etc. Tests that assert on specific label values (e.g. \"level-1\") happen to use depths ≤ 9 and are unaffected today, but the helper is silently wrong for any depth > 10.
| cur = cur.Trace.Base | |
| cur.Trace.Base = &esc.Value{Value: fmt.Sprintf("level-%d", i)} |
There was a problem hiding this comment.
Fixed in 23a328c. chainValue now uses fmt.Sprintf("level-%d", i).
| // re-scans each level; this is a guard against accidental regressions | ||
| // during rewrite work. | ||
| const depth = 100 | ||
| input := strings.Repeat(`{"value":{"k":`, depth) + `{"value":"leaf"}` + strings.Repeat(`}}`, depth) |
There was a problem hiding this comment.
Nit — tests
TestValueUnmarshalJSON_Errors and the table-driven TestValueUnmarshalJSON have no cases for the two behavioral regressions introduced by this PR:
{"value":"x","trace":null}— this now returns an error but previously decoded to a zeroTrace. A test would pin the expected behavior and catch the regression identified in the(*Trace).decodeFromfinding.{"value":[]}— a case asserting that the result round-trips to[]rather thannullwould catch the nil-vs-empty-slice regression indecodeValueField.
There was a problem hiding this comment.
Done in 23a328c. Added both cases to TestValueUnmarshalJSON (null trace decodes to zero Trace, empty array stays non-nil) plus a dedicated TestValueUnmarshalJSON_EmptyArrayRoundTrip. Confirmed all three fail before the fix and pass after.
There was a problem hiding this comment.
Important: 2 · Nit: 2 · Pre-existing: 0
This PR delivers two meaningful improvements: a TraceMode/EvalOptions API to cap Trace.Base chain depth, and a single-pass json.Decoder rewrite of Value.UnmarshalJSON. The design and test coverage for the TraceMode feature are solid. Two behavioral regressions in the new decoder need attention before merge.
Important findings
-
(*Trace).decodeFromrejects"trace":null— the previousjson.Unmarshal-into-struct path silently accepted null for any struct field, leaving it at its zero value. The new streaming decoder reads the null token and returns an error ("esc.Trace: expected JSON object, got <nil>"). This breaks forward-compatibility for payloads that zero out unknown fields with null. The fix is a two-line nil check before the delimiter assertion, mirroring howTrace.Basealready handles this case. See the inline comment onvalue.go. -
decodeValueFieldinitialises the array accumulator asvar arr []Value(nil). For an empty array ("value":[]) the append loop never runs andv.Valueis set to a nil[]Value. In Go,json.Marshal(nil []Value)producesnull, not[], so a decode→encode round trip silently changes"value":[]to"value":null. The previousjson.Unmarshalpath consistently produced[]Value{}. The fix is to initialisearr := []Value{}. See the inline comment onvalue.go.
Nit findings
chainValueineval_options_test.gousesstring(rune('0'+i)), which overflows decimal digits for depth > 9 (producing':',';\, etc.). Tests that assert on label values happen to use depths ≤ 9 and pass today, but the helper should usefmt.Sprintf("level-%d", i)for correctness. See the inline comment.TestValueUnmarshalJSONhas no cases for"trace":nullor"value":[], so neither regression has a failing test to guide the fix. These would be cheap to add to the existing table.
The single-pass decoder diverged from the previous json.Unmarshal path in
two edge cases. Restore exact behavioral parity:
- (*Trace).decodeFrom now accepts a JSON null for "trace", leaving the
Trace at its zero value instead of erroring. The old struct-based path
silently zeroed any field on null; forward-compatible payloads that null
out unknown fields depend on this.
- decodeValueField initializes the array accumulator as []Value{} so an
empty array stays a non-nil slice. json.Marshal renders a nil []Value as
null and a non-nil empty one as [], so without this a decode->encode
round trip silently rewrote "value":[] to "value":null.
Add table cases plus a round-trip test pinning both behaviors, and fix the
chainValue test helper to use fmt.Sprintf so labels stay correct past
depth 9 (string(rune('0'+i)) produced non-digit characters).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
applyTraceMode early-returns on TraceModeFull and otherwise switched on None/Collapsed with no default, so an out-of-range TraceMode value stripped nothing and recursed — silently producing the most expensive behavior (full chains), the opposite of the documented safe default. TraceMode is public API, so make Collapsed the switch default: any unrecognized mode now degrades to the bounded default. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@pulumi-eon review again |
Review complete — all prior findings addressedThe three commits pushed since the last review resolve every finding: Important (both fixed in commit
Nits (both fixed in commit
Additional fix (commit
Current state All code changes look correct. The logic is sound: |
|
@pulumi-eon if you have no further findings remove "Request Change" from the PR and/or "Approve" it |
Review complete — PR approvedAll findings from the prior review have been fully addressed in commits Important (both fixed)
Nits (both fixed)
Additional fix
All CI checks pass on Go 1.21, 1.23, and 1.25. The implementation is correct and well-tested. Approved. |
There was a problem hiding this comment.
All prior findings have been addressed:
(*Trace).decodeFromnow correctly accepts"trace":nullwith a nil-token early-return before the delimiter guard.- Empty-array round-trip regression is fixed —
arr := []Value{}ensures"value":[]survives decode→encode as[]notnull. chainValuehelper usesfmt.Sprintf("level-%d", i)— correct for all depths.- Test cases for
"trace":nulland"value":[]added to the table. - Unknown
TraceModevalues degrade to safeTraceModeCollapseddefault.
All CI checks pass across Go 1.21, 1.23, and 1.25. The implementation is correct and well-tested. Approving.
Summary
Two compounding sources of CPU cost in
Value.UnmarshalJSONmade decoding deeply merged opened-environment payloads scale poorly. This PR fixes both.Action 1 — bound the serialized
Trace.BasechainObject merges set
Trace.Baseon every produced value, and serialization recurses through it. A leaf at the end of a D-deep merge chain therefore carried D partial copies of its ancestors on the wire — opened payloads grew super-linearly in import depth for the same logical content.This PR introduces
eval.TraceModeand anEvalOptionsargument on the three eval entry points (EvalEnvironment,CheckEnvironment,RotateEnvironment):TraceModeCollapsed— the zero value, and therefore the default — keeps each value's immediate parent (one level ofTrace.Base) and drops the rest. Bounds payload size at O(1) per leaf regardless of merge depth.TraceModeFull— preserves the entire chain (the prior behavior).TraceModeNone— dropsTrace.Baseentirely.The mode is applied by a small post-evaluation pass that walks
Environment.PropertiesandExecutionContext.Properties, rewritingTrace.Basein place. The exportedesc.Tracestruct is unchanged —Base *Valuestays declared withomitempty, so consumers parse correctly whetherBaseis present or absent.Action 2 — single-pass
Value.UnmarshalJSONThe previous implementation
json.Unmarshal'd the input into aRawMessageand thenjson.Unmarshal'd the capturedvaluesubtree a second time. Each level paid a double scan, so decode cost was O(size × depth).(*Value).UnmarshalJSONnow uses onejson.Decoderfor the entire subtree:RawMessage.Trace.Base, calls a shareddecodeFrom(*json.Decoder)directly instead of routing throughdec.Decode(&Value{}). The latter would dispatch back throughjson.Unmarshaler, allocating a fresh decoder per child and reintroducing the per-level double scan.The function's external contract is unchanged: it still implements
json.Unmarshaler, accepts the same wire format, and remains permissive about unknown top-level fields.Performance
BenchmarkValueUnmarshalJSON_Depth(invalue_test.go) decodes a map of ten string leaves whoseTrace.Basechain has the given depth. Numbers below are on an Apple M4 Pro.Throughput is now ~constant at ~55 MB/s regardless of depth; before, it dropped from 38 MB/s at depth=1 to 8 MB/s at depth=50. Action 1 additionally reduces input size on read paths that decode cached opened-environment blobs, compounding the win.
Backward compatibility
esc.Traceandesc.Valuetypes are unchanged.EvalEnvironment,CheckEnvironment, andRotateEnvironmentgain a finalEvalOptionsargument. Callers should passeval.EvalOptions{}to adopt the new safe default, oreval.EvalOptions{TraceMode: eval.TraceModeFull}to keep prior full-chain serialization.TraceModeFullso recorded fixtures stay stable; the new behavior is covered by dedicated tests.Test plan
make formatmake lint— 0 issuesmake testmake test_cover— full suite with race detectionapplyTraceModeunder each mode (data dimension via map and array, kept-parent recursion, nil base, ExecutionContext)EvalEnvironment, including a 5-environment deep import chain that makes the collapse cap observableUnmarshalJSON(scalars, nested arrays/maps,Trace.Basechains, unknown fields, errors, deep nesting)