feat(schema): add rename_column to UpdateSchemaAction by nazq · Pull Request #2563 · apache/iceberg-rust

nazq · 2026-06-01T19:50:03Z

Which issue does this PR close?

Closes #2562 (Add rename_column support to UpdateSchemaAction).

What changes are included in this PR?

Adds column rename to UpdateSchemaAction, completing the add/delete/rename triad for non-incompatible schema evolution in transaction::update_schema.

Relationship to #2451 — the bigger refactor

Since opening this PR, @TwinklerG's #2451 has proposed a broader refactor of UpdateSchemaAction into an operations-buffer model that subsumes rename plus update_column, move, type promotion, and allow_incompatible_changes — the full PyIceberg / Java UpdateSchema surface that #697 calls for. I think #2451 is the architecturally sound long-term direction, mirroring what @Fokko built in pyiceberg.

To make this PR a clean stepping stone rather than a divergent path, I've aligned the rename API to #2451's RenameColumn typed-builder shape. The user-facing call is now:

let tx = Transaction::new(&table);
let action = tx.update_schema()
    .add_column(AddColumn::optional("new_col", Type::Primitive(PrimitiveType::Int)))
    .rename(RenameColumn::builder().name("old_name").new_name("new_name").build())
    .rename(RenameColumn::builder().name("person.name").new_name("fullname").build())
    .delete_column("dead_col");
let tx = action.apply(tx).unwrap();
let table = tx.commit(&catalog).await.unwrap();

This is identical to how rename works under #2451, so callers who adopt this PR today won't need a source change when #2451 lands. Only the dispatch underneath the .rename() method changes (from imperative-vec to operations-buffer). Field IDs are preserved across rename; only the leaf name changes.

Implementation

A new renames: Vec<(String, String)> accumulator on UpdateSchemaAction, plus a new step in commit() between deletes and additions:

- After deletes, so a rename can re-use a name being deleted in the same action.
- Before additions, so an addition can re-use a name being renamed away.
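The ordering argument above can be illustrated with a minimal sketch. It is not the PR's code: it models one level of sibling names as a plain set, `apply_in_order` is a hypothetical helper, errors are plain strings rather than PreconditionFailed, and renames are applied one at a time, which is stricter than the "renamed away" rule described under Semantics.

```rust
use std::collections::BTreeSet;

/// Apply staged changes to a flat set of sibling names in the order
/// deletes -> renames -> additions. Hypothetical helper for illustration only.
fn apply_in_order(
    current: &[&str],
    deletes: &[&str],
    renames: &[(&str, &str)],
    additions: &[&str],
) -> Result<BTreeSet<String>, String> {
    let mut names: BTreeSet<String> = current.iter().map(|s| s.to_string()).collect();

    // 1. Deletes run first, so the deleted names are free for later steps.
    for d in deletes {
        if !names.remove(*d) {
            return Err(format!("cannot delete missing column: {}", d));
        }
    }

    // 2. Renames run next: the old name is released, the new name is claimed.
    for (old, new) in renames {
        if !names.remove(*old) {
            return Err(format!("cannot rename missing column: {}", old));
        }
        if !names.insert(new.to_string()) {
            return Err(format!("cannot rename {} to {}: name already exists", old, new));
        }
    }

    // 3. Additions run last, so they may claim names released by a rename.
    for a in additions {
        if !names.insert(a.to_string()) {
            return Err(format!("cannot add column {}: name already exists", a));
        }
    }

    Ok(names)
}
```

With siblings x, y, z, deleting y, renaming x to y and z to z_old, and then adding a new z all succeed in one pass under this model; renaming x to y without deleting y is rejected.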

The renames are resolved to a HashMap<i32, String> keyed by field ID, then threaded through rebuild_fields / rebuild_field, the same recursive walk that handles deletes and nested additions. Each rebuild call site that previously copied field.name.clone() now looks up the rename map first and falls back to the original name. Identifier-field handling falls out for free because iceberg-rust keys identifier_field_ids by ID, not name, so the existing with_identifier_field_ids(base_schema.identifier_field_ids()) step propagates the set unchanged.
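A rough sketch of that lookup-then-fall-back walk, using a simplified stand-in type rather than the crate's real field types. The `Field` struct here is an assumption made for illustration, and the function bodies are not taken from the PR; only the names rebuild_fields / rebuild_field and the HashMap<i32, String> shape come from the description above.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a schema field (assumption; not the crate's type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    id: i32,
    name: String,
    /// Child fields when this field is a struct; empty for leaf fields.
    children: Vec<Field>,
}

/// Rebuild a list of sibling fields, applying any rename recorded per field ID.
fn rebuild_fields(fields: &[Field], renames: &HashMap<i32, String>) -> Vec<Field> {
    fields.iter().map(|f| rebuild_field(f, renames)).collect()
}

/// Rebuild one field: consult the rename map first, fall back to the original
/// name, then recurse into children so nested renames are applied as well.
fn rebuild_field(field: &Field, renames: &HashMap<i32, String>) -> Field {
    Field {
        // The field ID is copied unchanged; a rename only touches the name.
        id: field.id,
        name: renames
            .get(&field.id)
            .cloned()
            .unwrap_or_else(|| field.name.clone()),
        children: rebuild_fields(&field.children, renames),
    }
}
```

Keying the map by field ID rather than by path means the walk never has to track where it is in the tree, which is presumably also why anything else keyed by ID is unaffected by a rename.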

RenameColumn itself is a TypedBuilder struct with name / new_name fields, both setter(into), re-exported from iceberg::transaction alongside AddColumn. Same shape as #2451's RenameColumn.

Semantics

Modeled on pyiceberg.table.update.schema.UpdateSchema.rename_column:

- name must exist in the current schema → otherwise PreconditionFailed.
- A field cannot be both renamed and deleted in the same action → PreconditionFailed (intent is ambiguous; matches PyIceberg).
- new_name cannot contain SCHEMA_NAME_DELIMITER → PreconditionFailed (new_name must be unqualified; this prevents a rename from silently looking like a move across structs).
- new_name cannot collide with a sibling that is not itself being deleted or renamed away → PreconditionFailed.
- Same field renamed twice → last rename wins (matches PyIceberg's "stack on prior update" behavior).
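The five rules above could be sketched roughly as follows. This is an illustrative model, not the PR's implementation: `resolve_renames`, `paths`, and `deleted` are hypothetical names, the schema is represented as a map from full column path to field ID, errors are plain strings instead of PreconditionFailed, and the delimiter is assumed to be "." as in the "person.name" example. Two different fields renamed to the same new name are not covered here.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for SCHEMA_NAME_DELIMITER (assumed to be ".").
const DELIMITER: &str = ".";

/// Resolve staged (name, new_name) pairs to a map of field ID -> new leaf name.
/// `paths` maps full column paths to field IDs; `deleted` holds IDs staged for
/// deletion in the same action. Hypothetical helper for illustration only.
fn resolve_renames(
    paths: &HashMap<String, i32>,
    deleted: &HashSet<i32>,
    renames: &[(String, String)],
) -> Result<HashMap<i32, String>, String> {
    let mut resolved: HashMap<i32, String> = HashMap::new();
    for (name, new_name) in renames {
        // Rule 1: the source column must exist.
        let id = *paths
            .get(name)
            .ok_or_else(|| format!("cannot rename missing column: {}", name))?;
        // Rule 2: a field cannot be both renamed and deleted.
        if deleted.contains(&id) {
            return Err(format!("cannot both rename and delete column: {}", name));
        }
        // Rule 3: the new name must be unqualified.
        if new_name.contains(DELIMITER) {
            return Err(format!("new name must be unqualified: {}", new_name));
        }
        // Rule 5: inserting by ID means a later rename overwrites an earlier one.
        resolved.insert(id, new_name.clone());
    }

    // Rule 4: the new name must not collide with a sibling that stays in place.
    let path_of: HashMap<i32, &str> = paths.iter().map(|(p, id)| (*id, p.as_str())).collect();
    for (id, new_name) in &resolved {
        let path = path_of[id];
        let target = match path.rsplit_once(DELIMITER) {
            Some((parent, _leaf)) => format!("{}{}{}", parent, DELIMITER, new_name),
            None => new_name.clone(),
        };
        if let Some(other) = paths.get(&target) {
            // A sibling is "renamed away" if its own rename gives it a different name.
            let renamed_away = resolved.get(other).map_or(false, |n| n != new_name);
            if other != id && !deleted.contains(other) && !renamed_away {
                return Err(format!("cannot rename {} to {}: name already exists", path, new_name));
            }
        }
    }
    Ok(resolved)
}
```

Because the collision check runs after all renames are collected, a rename to the field's own current name passes without error in this sketch, and a sibling that is deleted or renamed in the same action does not block the new name.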

Scope

Rename only. PyIceberg's UpdateSchema has other ops (update_column, make_column_required, move_*, set_identifier_fields) — those are #2451's scope, not this PR's. Treat this as the rename on-ramp; treat #2451 as the destination.

Are these changes tested?

Yes — 10 new unit tests in transaction::update_schema::tests, matching the style and depth of the existing test_*_add_column* / test_*_delete_column* tests:

| Test | What it covers |
| --- | --- |
| `test_rename_column` | Simple root-level rename, verifies field ID preserved |
| `test_rename_nested_column` | Nested rename (`person.name` → `person.fullname`), preserves sibling fields |
| `test_rename_missing_column_fails` | Missing source field → `PreconditionFailed` |
| `test_rename_and_delete_same_column_fails` | Rename + delete on same column → `PreconditionFailed` |
| `test_rename_to_existing_sibling_fails` | Collision with a non-deleted, non-renamed sibling → `PreconditionFailed` |
| `test_rename_path_with_dot_in_new_name_fails` | Dotted `new_name` rejected → `PreconditionFailed` |
| `test_rename_preserves_identifier_field` | Identifier-field ID survives rename |
| `test_rename_frees_name_for_addition` | Rename `z`→`z_old` then add new `z` in same action |
| `test_rename_same_column_twice_last_wins` | Repeated rename of same field, last one wins |
| `test_rename_self_is_noop` | Rename `z`→`z` is a no-op (no schema-version bump emitted) |

All 18 existing transaction::update_schema tests still pass. Full iceberg lib suite: 1306/1306. Clippy + rustfmt clean.

We've been carrying this on a fork branch for a downstream consumer; happy to migrate that consumer to #2451's API surface as soon as it lands.

TwinklerG · 2026-06-02T06:56:18Z

Implemented most of the schema evolution features in #2451.

Extends UpdateSchemaAction with a rename(RenameColumn) builder method, completing the add/delete/rename triad for non-incompatible schema evolution.

API shape:

action.rename(
    RenameColumn::builder().name("old").new_name("new").build(),
);

Mirrors the typed-builder shape used by apache#2451's broader schema-evolution refactor, so callers adopting this PR today can migrate to apache#2451 with no source change once it lands.

Renames preserve field IDs; only the leaf name changes. RenameColumn::name uses SCHEMA_NAME_DELIMITER for nested fields (e.g. "person.name"); RenameColumn::new_name must be unqualified.

Ordering in commit():

1. deletes validated
2. renames validated (runs after deletes so a rename can re-use a name being deleted, before additions so an addition can re-use a name being renamed away)
3. additions validated + ID-assigned
4. schema tree rebuilt with renames threaded through rebuild_fields / rebuild_field

Validation rules (match pyiceberg's UpdateSchema.rename_column):

- missing source field → PreconditionFailed
- source field also staged for deletion → PreconditionFailed
- new_name contains SCHEMA_NAME_DELIMITER → PreconditionFailed
- new_name collides with a non-deleted, non-renamed sibling → PreconditionFailed
- same field renamed twice → last rename wins

Identifier-field handling falls out for free: Rust keys identifier fields by ID, not name, so with_identifier_field_ids(base_schema.identifier_field_ids()) propagates the set unchanged across rename.

Tests: 10 new tests covering simple root rename, nested rename, missing field, delete-conflict, sibling-collision, dotted-name rejection, identifier-field preservation, rename-frees-old-name (combined with add), repeated-rename-last-wins, no-op self-rename. Full iceberg lib suite: 1306/1306 passing. Clippy + rustfmt clean.

nazq · 2026-06-02T15:33:05Z

Thanks for pointing at #2451, @TwinklerG — I went and read through it carefully.

After studying both PRs, I think your refactor is the more architecturally sound long-term direction. The operations-buffer-then-schema_update shape is what lets UpdateSchemaAction correctly model dependent operations (rename-then-update, add-then-make-required, etc.) — and it's the shape that mirrors what @Fokko built in PyIceberg's UpdateSchema and the Java SchemaUpdate, which is exactly what #697 asks for. The current add_column / delete_column style from #2120 can't grow into that without a breaking change, so paying the API churn cost once now is the right call.

To make that easier, I've updated this PR to use the RenameColumn typed-builder shape from your #2451, so the user-facing call is now identical between the two:

action.rename(RenameColumn::builder().name("old").new_name("new").build())

That way this PR functions as a clean stepping stone: callers adopting it today won't need a source change when #2451 lands — only the dispatch underneath .rename() changes (imperative-vec → operations-buffer). If maintainers want a lighter incremental win now, this PR is ready; either way, #2451 is the destination.

We've been carrying this on a fork for a downstream consumer; happy to migrate to #2451's full API as soon as it merges. If contributing test coverage for any of your #[ignore = "not yet implemented"] cases would help land it faster, let me know which ones — we have real-world data for case-insensitive resolution and decimal-precision widening that could plug in.

TwinklerG · 2026-06-03T02:10:17Z

Thanks @nazq for taking the time to review my PR. Your PR is a brilliant move to ensure a smooth transition. I really appreciate your validation of the architecture in #2451.
Actually, the test cases, including the #[ignore = "not yet implemented"] ones, are ported from iceberg-java's core/src/test/java/org/apache/iceberg/TestSchemaUpdate.java, and I haven't implemented them all yet. I'd be glad if you could contribute more cases.

nazq mentioned this pull request Jun 2, 2026: Table scan rejects current-schema column names after UpdateSchemaAction commit #2565 (Open)

nazq force-pushed the update-schema-rename branch 2 times, most recently from 411703e to c42b685 Compare June 2, 2026 15:24

nazq force-pushed the update-schema-rename branch from c42b685 to eca2ba4 Compare June 2, 2026 15:25

nazq mentioned this pull request Jun 2, 2026: feat(schema): refactor UpdateSchemaAction for schema evolution in Iceberg #2451 (Open)

nazq wants to merge 1 commit into apache:main from nazq:update-schema-rename
