Skip to content

feat: let WriteManifest set and return v3 first_row_id#1321

Open
the-onewho-knocks wants to merge 2 commits into
apache:mainfrom
the-onewho-knocks:feat/write-manifest-first-row-id
Open

feat: let WriteManifest set and return v3 first_row_id#1321
the-onewho-knocks wants to merge 2 commits into
apache:mainfrom
the-onewho-knocks:feat/write-manifest-first-row-id

Conversation

@the-onewho-knocks

Copy link
Copy Markdown

Summary

  • add WithManifestFileFirstRowID, a ManifestFileOption that sets first_row_id on a v3+ data manifest
  • forward variadic ManifestFileOption args through WriteManifest to ToManifestFile

Why

WriteManifest predates the v3 work in #735 and forwarded no options to ToManifestFile, so ManifestFile.FirstRowID() was always nil for manifests written through this public helper, even though ManifestWriter/ToManifestFile already supported setting it via opts. Library consumers had no way to set or observe first_row_id through WriteManifest.

WriteManifest's signature change is purely additive (opts is the last, variadic parameter), so existing call sites are unaffected.

Testing

  • go test ./... -run '^TestWriteManifestWithFirstRowIDOption$' -count=1
  • go test ./... -count=1
  • gofmt -l .
  • golangci-lint run

Fixes #1274

Adds WithManifestFileFirstRowID, a ManifestFileOption that sets
first_row_id on a v3+ data manifest. WriteManifest now accepts
variadic ManifestFileOption args and forwards them to
ToManifestFile, so callers can set and observe first_row_id through
the public WriteManifest helper without affecting existing callers.

Fixes apache#1274
Signed-off-by: the-onewho-knocks <borsehardik@gmail.com>

@zeroshade zeroshade left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The variadic opts plumbing through WriteManifest is additive and backward-compatible, and the v3 round-trip test (write with the option, read back, assert inheritance) is exactly right. A few small nits inline around v3-only semantics and coverage; none blocking.

Comment thread manifest.go Outdated
// WithManifestFileFirstRowID sets the first_row_id on a v3+ data manifest.
func WithManifestFileFirstRowID(firstRowID int64) ManifestFileOption {
return func(mf *manifestFile) {
mf.FirstRowIDValue = &firstRowID

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mf.version is already set before options are applied, so this could self-gate to v3+ where first_row_id actually exists:

if mf.version >= 3 {
    mf.FirstRowIDValue = &firstRowID
}

As written, calling this against a v1/v2 writer returns a ManifestFile whose FirstRowID() is non-nil but gets silently dropped on serialize. WithManifestFileContent has the same no-guard pattern so I'm fine either way, but the v3 exclusivity is a bit stronger here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the mf.version >= 3 guard in b6f3bbb, with a doc comment noting it's a no-op on v1/v2 agreed it's worth being explicit here even though WithManifestFileContent doesn't guard the same way.

Comment thread manifest_test.go
m.EqualValues(1000+liveCount, *read[2].DataFile().FirstRowID())
}

func (m *ManifestTestSuite) TestWriteManifestWithFirstRowIDOption() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider adding (a) a v1/v2 + WithManifestFileFirstRowID case to pin the intended behavior (no-op vs non-nil FirstRowID()), and (b) a delete-manifest case, since the spec restricts first_row_id to data manifests. The read side already gates inheritance on formatVersion >= 3 && content == data, so this just locks down the write side.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added both in b6f3bbb:

a v1 manifest + WithManifestFileFirstRowID case confirming it's a no-op (FirstRowID() is nil)
a v3 delete-manifest case confirming the field can be set on the manifest itself (the guard only checks version), but entries don't inherit first_row_id on read since that's separately gated on content == data in the reader

Comment thread manifest.go
schema *Schema,
snapshotID int64,
entries []ManifestEntry,
opts ...ManifestFileOption,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (pre-existing): WriteManifest is exported but has no doc comment. Since you're already touching the signature, a one-liner noting that opts set v3-specific descriptor fields (e.g. WithManifestFileFirstRowID) would be a welcome add.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a doc comment on WriteManifest in b6f3bbb, calling out that opts can include v3-specific options like WithManifestFileFirstRowID.

Fixes apache#1274

Signed-off-by: the-onewho-knocks <borsehardik@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: let WriteManifest set and return v3 first_row_id

2 participants