Skip to content

fix: validate schema field IDs during unmarshal#1315

Open
fallintoplace wants to merge 2 commits into
apache:mainfrom
fallintoplace:fix/schema-unmarshal-duplicate-field-ids
Open

fix: validate schema field IDs during unmarshal#1315
fallintoplace wants to merge 2 commits into
apache:mainfrom
fallintoplace:fix/schema-unmarshal-duplicate-field-ids

Conversation

@fallintoplace

Copy link
Copy Markdown
Contributor

Summary

  • validate duplicate field IDs while unmarshalling schemas from JSON
  • add coverage for top-level and nested duplicate IDs returning ErrInvalidSchema

Why

Schema constructors already reject duplicate positive field IDs, but Schema.UnmarshalJSON assigned decoded fields without running the same check. Malformed schema JSON could enter the system and later corrupt ID-based lookups, projections, name mapping, or writes.

Testing

  • go test . -run '^TestUnmarshalSchemaRejectsDuplicateFieldIDs$' -count=1\n- go test . -count=1\n- go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.11.4 run --timeout=10m .\n- git diff --check

@fallintoplace fallintoplace requested a review from zeroshade as a code owner June 24, 2026 21:42

@tanmayrauth tanmayrauth left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one nit on test coverage.

Comment thread schema_test.go
assert.True(t, tableSchemaSimple.Equals(&schema))
}

func TestUnmarshalSchemaRejectsDuplicateFieldIDs(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice — the top-level and nested-struct cases are the important ones. Since checkDuplicateFieldIDs also walks list element IDs and map key/value IDs (schema.go:118-130), would you consider adding a list or map dup-ID case too? Those branches are only exercised indirectly right now, and they're the easiest to silently break later.

@fallintoplace fallintoplace force-pushed the fix/schema-unmarshal-duplicate-field-ids branch from b5da3aa to 30d2edd Compare June 25, 2026 07:30

@zeroshade zeroshade left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this. The duplicate-ID detection is correct, and I like the recursion into struct/list/map plus the negative-placeholder carve-out. Requesting changes on one gap relative to the PR title (validate schema field IDs) - noted inline.

Comment thread schema.go
return err
}

if err := checkDuplicateFieldIDs(nil, aux.Fields); err != nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catches duplicate positive IDs, but missing list/map IDs still slip through. checkDuplicateFieldIDs only records f.ID > 0 (schema.go:101), and an omitted list element-id or map key-id/value-id decodes to 0 - so a schema that omits one of those still unmarshals cleanly (a lone 0 isn't a duplicate). If the goal is to validate field IDs on unmarshal, it'd be worth rejecting missing/zero element/key/value IDs too (e.g. decode them as *int and error when absent), while still allowing the negative placeholders used during construction. Is the broader validation in scope here, or is this intentionally limited to duplicates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants