Skip to content

feat: partition root table support with boundary-based leaf pairing#39

Open
leaocx wants to merge 1 commit into
cloudberry-contrib:mainfrom
leaocx:feat/partition-root-support
Open

feat: partition root table support with boundary-based leaf pairing#39
leaocx wants to merge 1 commit into
cloudberry-contrib:mainfrom
leaocx:feat/partition-root-support

Conversation

@leaocx

@leaocx leaocx commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Summary

Allow partition root tables in --include-table / --dest-table / --exclude-table and expand them into leaf-to-leaf pairs matched by partition boundary, with cross-version normalisation (GP6 ↔ GP7 ↔ CBDB) and partial-match support.

Motivation

  1. Partition root tables could not be specified as copy targets — cbcopy would Fatal on validation
  2. Large partition tables fell through to CopyOnMaster (coordinator single-pipe) because root reltuples=0, causing apparent hangs on multi-million row tables
  3. Leaf pairing by _prt_N name suffix silently misrouted data after add/drop partition operations
  4. Cross-version copy (GP6 ↔ GP7/CBDB) could not match leaves because boundary string formats differ
  5. Structural mismatch (different leaf counts) caused all-or-nothing fallback — either full leaf-to-leaf or full root→root, with no middle ground

Changes

File Change
option/option.go Add COPY_ON_SEGMENT constant; ForceOnSegment field on Table; fix validatePartTables / ValidateDestTables for partition roots
copy/copy.go Register --copy-on-segment flag
copy/copy_command.go Add forceOnSegment param to CreateCopyStrategy to bypass row-count threshold
copy/copy_manager.go Pass ForceOnSegment to strategy selection
copy/copy_metadata.go Propagate ForceOnSegment in fillTablePairChan
copy/copy_query.go Add Boundary field to PartLeafTable; GP7/CBDB uses pg_get_expr(relpartbound, oid); GP6 upgraded to pg_get_partition_rule_def(r.oid, true) (replaces parrangestart::text which returned internal node-tree)
copy/copy_query_wrapper.go Add pairPartitionLeaves with boundary normalisation, partial matching, and --allow-partition-reshape control; add leavesByRoot; update excludeTablePair
copy/boundary_normalize.go New. normalizeBoundary() converts GP6 rule-def and GP7/CBDB pg_get_expr to canonical form (e.g. RANGE:[2024-01-01,2024-02-01), LIST:(a,b), DEFAULT)
copy/boundary_normalize_test.go New. 17 unit tests for normalisation (GP6, GP7, cross-version equality)
copy/copy_query_wrapper_test.go New. 18 unit tests for leaf pairing (exact match, partial match, cross-version, edge cases)

Key design decisions

Boundary normalisation

  • GP6 pg_get_partition_rule_def: PARTITION p1 START ('2024-01-01'::date) END ('2024-02-01'::date) WITH (...)RANGE:[2024-01-01,2024-02-01)
  • GP7/CBDB pg_get_expr: FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')RANGE:[2024-01-01,2024-02-01)
  • Same normalised string = same boundary → enables cross-version leaf-to-leaf matching

Partial matching

  • Match as many leaves as possible by normalised boundary
  • Matched pairs → leaf-to-leaf copy (precise, parallel)
  • Unmatched src leaves → copy individually to dst root (partition routing handles placement)
  • Replaces old all-or-nothing approach

--allow-partition-reshape flag

  • Full match → no flag needed, leaf-to-leaf
  • Partial match → requires flag; without it, Fatal with actionable hint
  • Zero matches → requires flag for root→root fallback; without it, Fatal
  • Dest is plain table → always root→root (Warn only, no flag needed)

Logging

  • Full match: [INFO] all N leaves matched by boundary (leaf-to-leaf)
  • Partial match without flag: [CRITICAL] N of M src leaves could not be matched. Use --allow-partition-reshape...
  • Partial match with flag: [WARNING] N of M src leaves could not be matched; unmatched leaves will be copied individually to dst root
  • Zero match without flag: [CRITICAL] no boundary matches found. Use --allow-partition-reshape to allow root->root fallback.
  • Zero match with flag: [WARNING] falling back to root->root ON SEGMENT

Test results

Unit tests: 35/35 PASS

E2E cross-cluster tests: 45/45 PASS

9 direction combinations × 5 scenarios:

Combination P1 leaf-to-leaf P2 name-drift P3a no-flag→Fatal P3b reshape→partial P4 plain-dest
GP6 → GP6
GP7 → GP7
CBDB → CBDB
GP7 → CBDB
CBDB → GP7
GP6 → GP7
GP6 → CBDB
GP7 → GP6
CBDB → GP6

Environments:

  • GP6: HashData 3.13.33 (PG 9.4.26)
  • GP7: Greenplum 7.1.0 (PG 12.12)
  • CBDB: Apache Cloudberry 2.4.0 (PG 14.4)

Known scope exclusions

  • --include-table without --dest-table on CB-family (separate issue)
  • Multi-level (intermediate) partitions (separate issue)

@CLAassistant

CLAassistant commented Jun 18, 2026

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@talmacschen-arch talmacschen-arch left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this — the boundary-based leaf pairing is the right core idea, and --copy-on-segment / ForceOnSegment solves a real problem. I have a few concerns, the first two of which I think should be addressed before merge. Detailed below.

What the PR does well

  • Boundary-based pairing (never _prt_N suffix/ordinal) is exactly right. pg_get_expr(relpartbound, oid) for GP7/CBDB and the parrange*/parlistvalues concat for GP6 are name-independent, so data can't be misrouted after add/drop partition. 👍
  • The CopyOnMaster fallthrough fix (ForceOnSegment for root→root, where reltuples=0 would otherwise route a multi-million-row tree onto the single coordinator pipe) is a genuine, easy-to-miss bug. Good catch.

🔴 1. The entire test matrix is GP6→GP6; the CB-family path is unverified

The test plan is 50 E2E scenarios across two GP 6.20.3 clusters. But the hardest partition behavior in cbcopy lives on the CB family (Cloudberry / SynxDB / Apache Cloudberry) source path, which is a completely different code path from GP6:

  • The GP7/CBDB boundary SQL you added — coalesce((SELECT pg_get_expr(relpartbound, oid) ...), '') in copy_query.go — is never exercised by GP6 testing (GP6 takes the pg_partition_rule branch).
  • The CBDB branch of leaf pairing / split is unverified.

GP6 happens to "just work" for partition roots because its DDL is inline (a single CREATE TABLE ... PARTITION BY (...) brings all leaves along), which masks problems that only appear on declarative-partition CB sources. Please add a CB→CB (or SynxDB→SynxDB) E2E pass covering at least: boundary pairing success, leaf-count-mismatch fallback, and the large-table root→root path. Until then, "partition root support" is verified only for GP6.

🔴 2. --include-table <root> without --dest-table is still broken on CB family, and the title implies otherwise

All of the new pairing logic lives in processDestinationTables, which is only reached when --dest-table is supplied (GetUserTables: the len(GetDestTablesByDb) == 0 branch returns earlier). The --include-table <root> / no---dest-table path is untouched and still flows through collectTablesAndSchemas → DDL extraction → getUserTableRelationsWithIncludeFiltering.

On CB-family source that extraction matches OIDs by exact FQN and does not expand a root OID to its descendants, so it emits only CREATE TABLE root ... PARTITION BY (...) — an empty partition skeleton on the destination (no leaves, no ATTACH PARTITION). In --metadata-only this is silent; in a full copy the data stage then fails with relation "..._1_prt_xxx" does not exist. (This is the gap the author flagged at queries_relations.go:161-163 — the unported gpbackup GP7+ change e00d2a1f.)

This is pre-existing, not introduced here — but the PR title "partition root table support" reads as if --include-table <root> now works, when on CB family the most common usage (copy a partition table to another cluster, same name, no --dest-table) still yields an empty skeleton. Please either scope the title/description to "--dest-table data-copy path only" and note the remaining DDL gap as a separate issue, or address the OID expansion in this PR.

🟡 3. Fallback reasons are logged at Debug — a "looks-like-success" trap

Successful pairing logs at Info ("mapping partition root ... as N leaf->leaf pairs"), but every fallback reason (leaf count mismatch, empty boundary, no boundary match, duplicate boundary) logs at gplog.Debug, which is invisible by default. A large table can silently fall back to root→root and lose all per-leaf parallelism with no signal to the operator. Please promote the fallback to gplog.Warn, with the reason and the consequence, e.g.:

WARN: partition "src...sales" -> "dst...sales": leaf pairing failed
      (leaf count mismatch 5 vs 4); falling back to whole-tree root->root
      copy (forced ON SEGMENT). Per-leaf parallelism is lost.

Concrete instance of why this matters: GP6→CB --dest-table <root> will never pair, because the GP6 boundary string format (parrange* concat) and the CBDB format (pg_get_expr) differ, so it always falls back — silently, today.

🟡 4. No automated tests

Seven files changed, zero _test.go. pairPartitionLeaves is pure, table-driven logic that's straightforward to unit test: boundary match, leaf-count mismatch, empty boundary, duplicate boundary, no-match. Given #1 (CB path can't be exercised on the GP6 test rig), unit tests are the cheapest way to cover the CBDB branch's pairing logic deterministically.

🟢 5. Intermediate partitions (partType == 'i') are not rejected

validatePartTables now continues for Partition == 1 and otherwise only checks existence. A multi-level partition's intermediate level (neither a true root nor a leaf) isn't a root, so it falls to the "does not exist on source" Warn and is silently dropped rather than rejected with a clear error. Minor / rare, but worth an explicit fatal for clarity.


To be clear on severity: #1 and #2 are the ones I'd want resolved (or the scope explicitly narrowed) before merge; #3#5 are quality improvements. Happy to re-review once the CB-family path has a test pass.

@talmacschen-arch

Copy link
Copy Markdown
Collaborator

Following up specifically on the leaf-count-mismatch handling, which I think needs a stronger stance than the current behavior.

Today, in pairPartitionLeaves:

if len(srcLeaves) != len(destLeaves) {
    gplog.Debug("... leaf count mismatch (%v vs %v); falling back to root->root", ...)
    return nil, nil, false
}

→ the caller silently falls back to a whole-tree root->root copy (forced ON SEGMENT). My concern is not just that it's logged at Debug (raised in my earlier review) — it's that two genuinely different situations are collapsed into one silent fallback:

  1. Destination is a plain (non-partitioned) tablelen(destLeaves) == 0. Flattening the whole partition tree into a single table is an unambiguous, intended operation. root->root is correct here. A Warn + fallback is fine.

  2. Destination is itself a partitioned table, but its leaf count / boundaries differ from the source — this is a structural mismatch in a partition-to-partition migration. root->root relies on kernel tuple routing, which gives three outcomes:

    • dest ranges cover all source rows → correct (legitimate re-partition);
    • dest is missing a range and has no DEFAULT partition → COPY errors loudly (safe-ish);
    • dest is missing a range and has a DEFAULT partition → rows silently land in the default partition → silent data misplacement.

The third outcome is the dangerous one, and for a migration-fidelity tool it shouldn't be auto-resolved by a silent fallback. I'd argue case 2 should Fatal by default, printing both leaf counts (and the first non-matching boundary), and only degrade to root->root when the user explicitly opts in (e.g. a --allow-partition-reshape flag) for the legitimate re-partition-on-migrate use case.

Concretely:

  • len(destLeaves) == 0 (plain dest) → Warn + root->root flatten. (keep)
  • len(destLeaves) > 0 and leaf count / boundary differs → Fatal with a clear message, unless --allow-partition-reshape is set → then Warn + root->root.
  • dest root does not exist → Fatal (already correct).

This keeps the boundary-pairing fast path, keeps the legitimate flatten case working, but stops a structural divergence between a partitioned source and a partitioned destination from being silently absorbed (and possibly misrouted into a DEFAULT partition). It also makes the "I deliberately want to re-partition during migration" path an explicit, auditable choice rather than an invisible default.

Allow partition root tables in --include-table / --dest-table /
--exclude-table (previously caused CRITICAL error). Expand partition
roots into leaf-to-leaf pairs matched by partition boundary (not
_prt_N ordinal which drifts after add/drop partition).

Key changes:
- Add --copy-on-segment flag and ForceOnSegment to bypass row-count
  threshold for partition root tables whose reltuples=0
- Pair leaves by normalised range/list/default boundary across GP6,
  GP7, and CBDB (cross-version leaf-to-leaf matching supported)
- GP6 boundary query upgraded from parrangestart::text (internal
  node-tree) to pg_get_partition_rule_def(r.oid, true) for
  human-readable boundary values
- Boundary normalisation: GP6 rule-def and GP7/CBDB pg_get_expr both
  normalise to canonical form (e.g. RANGE:[2024-01-01,2024-02-01))
- Partial matching: maximise leaf-to-leaf pairs by boundary; unmatched
  src leaves copy individually to dst root via partition routing
- --allow-partition-reshape flag controls partial-match and zero-match
  behaviour (Fatal without flag, partial/root-fallback with flag)
- Promote all fallback reasons from Debug to Warn/Critical with
  actionable hints
- Fix validatePartTables / ValidateDestTables to allow partition roots
- Propagate ForceOnSegment through fillTablePairChan
- Add 35 unit tests covering boundary normalisation, cross-version
  matching, and all pairing edge cases
@leaocx leaocx force-pushed the feat/partition-root-support branch from a255d72 to 3a78ce5 Compare June 24, 2026 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants