fix(table): return scanner memoization errors#1333
Conversation
tanmayrauth
left a comment
There was a problem hiding this comment.
Clean fix, turning these memoization panics into propagated errors is the right call, and reusing the existing double-checked-locking pattern keeps the concurrency behavior identical. Just two nits and one confirmation, all inline.
| if err != nil { | ||
| return fmt.Errorf("failed to build manifest evaluator for spec %d: %w", manifest.PartitionSpecID(), err) | ||
| } | ||
| if manifestEval != nil { |
There was a problem hiding this comment.
The if manifestEval != nil guard is dead now that the err != nil return runs first, buildManifestEvaluator only ever returns a nil func alongside a non-nil error, so past line 1548 this is always true. Harmless, but could be dropped to make the intent clearer.
| mEval := manifestEvals.Get(int(mf.PartitionSpecID())) | ||
| mEval, err := manifestEvals.Get(int(mf.PartitionSpecID())) | ||
| if err != nil { | ||
| return fmt.Errorf("building manifest evaluator for spec %d: %w", mf.PartitionSpecID(), err) |
There was a problem hiding this comment.
Tiny consistency nit: here the wrap is "building manifest evaluator for spec %d" (and "building partition evaluator..." on 393), but the equivalent wraps in scanner.go:626 and transaction.go:1547 say "failed to build ...". Worth aligning the prefix across the three so the messages read uniformly.
| } | ||
|
|
||
| k.mx.RUnlock() | ||
| k.mx.Lock() |
There was a problem hiding this comment.
Just confirming the intent: caching the error means a failing factory is never retried for that spec id (the new TestKeyDefaultMapWrapErrCachesError pins this). That's fine for these deterministic errors, unknown spec id, projection build failure but wanted to flag it explicitly since it's a behavior the helper now guarantees, not just an implementation detail.
Summary
replace the panic-wrapping memoized factory helper with an error-carrying cache
propagate cached factory errors through scanner planning, filtered overwrite classification, and conflict validation
add regression coverage for cached factory failures and an invalid-spec manifest path that used to panic
Why
The scanner code memoized several per-spec builders using
newKeyDefaultMapWrapErr, which converted any factory error into a panic. That meant ordinary failures like unknown partition spec ids or projection/evaluator build errors could crash the process instead of being returned to the caller.The shared helper is also reused in transaction-side manifest classification and conflict validation, so the same panic behavior leaked there too.
Testing
go test ./table -run 'Test(KeyDefaultMapWrapErrCachesError|FetchPartitionSpecFilteredManifests_InvalidSpecIDDoesNotPanic|BuildPartitionProjectionWithInvalidSpecID|BuildManifestEvaluatorWithInvalidSpecID|BuildPartitionEvaluatorWithInvalidSpecID|FetchPartitionSpecFilteredManifests_PropagatesEvalError)$' -count=1
go test ./table -count=1
go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.11.4 run --timeout=10m ./table/...
git diff --check