Skip to content

fix(table): preserve version hint URIs in orphan cleanup#1331

Open
fallintoplace wants to merge 2 commits into
apache:mainfrom
fallintoplace:fix/orphan-cleanup-version-hint-uri
Open

fix(table): preserve version hint URIs in orphan cleanup#1331
fallintoplace wants to merge 2 commits into
apache:mainfrom
fallintoplace:fix/orphan-cleanup-version-hint-uri

Conversation

@fallintoplace

Copy link
Copy Markdown
Contributor

Summary

build the orphan-cleanup version-hint path with a URI-aware join for URL locations
keep the existing filepath join fallback for plain local filesystem paths
add regression coverage for URI and local-path version-hint construction, plus the referenced-file set

Why

Table.getReferencedFiles adds metadata/version-hint.text to the referenced set so orphan cleanup will not delete it. That path was being built with filepath.Join(metadata.Location(), ...), which is fine for plain local paths but corrupts object-store URIs like s3://bucket/table into s3:/bucket/table/....

Once that malformed path hit orphan cleanup's normalizer, it no longer looked like a URL and could diverge from the real s3://.../version-hint.text candidate path. That made the real version-hint file eligible to be treated as an orphan.

Testing

go test ./table -run 'Test(VersionHintLocation|GetReferencedFiles_IncludesStatisticsFiles|.Orphan.)$' -count=1
go test ./table -count=1
go run github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.11.4 run --timeout=10m ./table/...
git diff --check

@fallintoplace fallintoplace requested a review from zeroshade as a code owner June 27, 2026 14:48

@tanmayrauth tanmayrauth left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean fix and the write→read round-trip holds for s3/gs/file URIs and local paths alike. Just one small test-coverage nit. Details inline.

Comment thread table/orphan_cleanup.go
}

func versionHintLocation(tableLocation string) string {
if strings.Contains(tableLocation, "://") || strings.HasPrefix(tableLocation, "file:") {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch adding || strings.HasPrefix(tableLocation, "file:") — since file:///x and file://host/x already contain ://, this branch is really only here to cover the single-slash file:/x opaque form. Just confirming that's the intent.

}
}

func TestVersionHintLocation(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three cases here all either contain :// (s3, file:///) or fall through to filepath.Join (local), so none of them actually exercise the file:-prefix-only branch above. A file:/tmp/tablefile:/tmp/table/metadata/version-hint.text case would lock in the one input that branch uniquely guards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants