Skip to content

feat: commentIdBase prop for collision-free collab comment IDs (#257)#956

Draft
jedrazb wants to merge 16 commits into
mainfrom
fix/257-comment-id-base-v2
Draft

feat: commentIdBase prop for collision-free collab comment IDs (#257)#956
jedrazb wants to merge 16 commits into
mainfrom
fix/257-comment-id-base-v2

Conversation

@jedrazb

@jedrazb jedrazb commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Supersedes #942

In a collaborative session every peer's createCommentIdAllocator() started at 1, so the first comment from each peer got the same id and replies threaded onto the wrong card. This adds optional commentIdBase/commentIdStride props (React + Vue) that partition the ID space: pass ydoc.clientID * 1_000_000 with commentIdStride={1_000_000} and each peer mints from a disjoint range. Default 0/Infinity preserves the single-editor 1, 2, 3, … sequence. Fixes #257.

Review fixes applied on top of the original work:

  • Scope the collaboration-example dev server to Playwright runs that include the chromium project, so project-scoped runs (notably the release publish-path parity smoke gate) don't boot an unrelated server.
  • Time-box the example's startup GitHub stars fetch (2s) so a restricted network can't hang the webServer startup window.
  • Document the two preconditions for cross-peer collision-freedom (distinct clientIDs; fewer than stride allocations per session).
  • Use Vue binding syntax in the Vue prop's JSDoc example.

Known follow-up (not addressed here): DocxEditor.tsx crossed 2000 lines so the max-lines cap was nudged to 2020. The real fix is the planned file split, tracked separately.


View with Codesmith Autofix with Codesmith
Need help on this PR? Tag /codesmith with what you need. Autofix is disabled.

samcorcos and others added 9 commits June 19, 2026 10:28
seedAbove now ignores IDs outside (base, base+stride] so a peer's synced
tracked-change revisionId can't pull this peer's allocator into another
partition; commentIdStride prop threads the stride through React/Vue and
the collab example. Floating add-comment button gets a data-testid so the
e2e test no longer matches on computed CSS, and getAttribute null is
guarded for a clearer failure.
- Scope the collaboration-example dev server to Playwright runs that
  include the chromium project, so project-scoped runs (notably the
  release publish-path parity smoke gate) don't boot an unrelated server.
- Time-box the example's startup GitHub stars fetch (2s) so a restricted
  network can't hang the webServer startup window.
- Document the two preconditions for cross-peer collision-freedom
  (distinct clientIDs; fewer than `stride` allocations per session).
- Use Vue binding syntax in the Vue prop's JSDoc example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 20, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docx-editor Ready Ready Preview, Comment Jun 23, 2026 6:51am

Request Review

@eigenpal-release-pal

eigenpal-release-pal Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

All contributors have signed the CLA ✍️ ✅

Posted by the CLA bot.

@greptile-apps

greptile-apps Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Adds optional commentIdBase / commentIdStride props (React + Vue) that partition the comment/revision ID space across collaborating peers, fixing the collision where every peer's allocator started at 1. The default values (0 / Infinity) preserve the existing single-editor 1, 2, 3… sequence with no behavioural change.

  • Core allocator (commentIdAllocator.ts): createCommentIdAllocator gains base and stride params; seedAbove now ignores IDs outside (base, base+stride] so synced peer revision marks cannot cross-contaminate the local counter.
  • React + Vue: new props threaded to the allocator at mount; parity contract and API snapshots updated; DocxEditorPagedArea gains data-testid="floating-add-comment" to support the E2E test.
  • E2E + Playwright config: hermetic two-peer BroadcastChannel test added; collab dev server conditionally started only when the chromium project is in scope; GitHub stars fetch time-boxed to 2 s to avoid hanging CI startup.

Confidence Score: 4/5

Safe to merge; the core logic is correct for all documented usage patterns and the backward-compatible defaults preserve existing single-editor behaviour.

The allocator correctly filters out-of-partition IDs in seedAbove, both React and Vue wire up the props symmetrically, unit tests cover the key collab edge cases, and the parity contract is updated. Two observations hold the score: next() is unbounded and will silently overflow ceiling if a session mints more than stride IDs (the ceiling variable exists but is unused in next()), and the E2E test's .last() card picker could non-deterministically grab a peer-synced card depending on sidebar sort order under slow Yjs sync.

packages/core/src/prosemirror/commentIdAllocator.ts — the next() overflow path; e2e/tests/collab-comment-id.spec.ts — the .last() card-ID picker.

Important Files Changed

Filename Overview
packages/core/src/prosemirror/commentIdAllocator.ts Core change: adds base/stride params for ID partitioning; seedAbove correctly clamps to partition, but next() is unbounded and silently overflows ceiling
packages/core/src/prosemirror/tests/commentIdAllocator.test.ts Good unit coverage: default sequence, partition isolation, in-partition seed, out-of-partition seed, and cross-peer contamination all tested
packages/react/src/components/DocxEditor.tsx Correctly threads commentIdBase/commentIdStride into useRef(createCommentIdAllocator(...)); read-once-at-mount behavior is documented and intentional
packages/vue/src/components/DocxEditor.vue Vue component initializes allocator with props at setup time; commentIdStride defaults to undefined (which correctly maps to Infinity via function default)
e2e/tests/collab-comment-id.spec.ts Hermetic two-peer BroadcastChannel test; the .last() card picker could race against Yjs sync delivering the peer's card out of order
playwright.config.ts Conditionally boots the collab dev server only when chromium project is in scope; requestedProjects parsing handles both --project=x and --project x forms correctly
examples/collaboration/src/useCollaboration.ts Derives commentIdBase = ydoc.clientID * 1_000_000 inside the roomName-scoped useMemo and exports both props; clean integration
examples/collaboration/vite.config.ts AbortSignal.timeout(2000) added to the GitHub stars fetch to prevent hanging the Playwright webServer startup window on restricted networks

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["DocxEditor mount\n(commentIdBase, commentIdStride)"] --> B["createCommentIdAllocator(base, stride)\nnextId = base + 1\nceiling = base + stride"]
    B --> C["seedCommentAllocator called\non document load"]
    C --> D{"For each comment/\nrevision mark ID"}
    D --> E["allocator.seedAbove(id)"]
    E --> F{"id >= nextId\nAND id <= ceiling?"}
    F -- Yes --> G["nextId = id + 1\n(in-partition seed)"]
    F -- No --> H["Ignored\n(out-of-partition\nor already passed)"]
    G --> I["allocator.next()\nreturns nextId++"]
    H --> I
    I --> J{"nextId > ceiling?\n(overflow)"}
    J -- "No (common case)" --> K["ID minted in own partition\n✓ No collision"]
    J -- "Yes (> stride IDs minted)" --> L["ID crosses into\nadjacent partition\n⚠ Silent overflow"]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A["DocxEditor mount\n(commentIdBase, commentIdStride)"] --> B["createCommentIdAllocator(base, stride)\nnextId = base + 1\nceiling = base + stride"]
    B --> C["seedCommentAllocator called\non document load"]
    C --> D{"For each comment/\nrevision mark ID"}
    D --> E["allocator.seedAbove(id)"]
    E --> F{"id >= nextId\nAND id <= ceiling?"}
    F -- Yes --> G["nextId = id + 1\n(in-partition seed)"]
    F -- No --> H["Ignored\n(out-of-partition\nor already passed)"]
    G --> I["allocator.next()\nreturns nextId++"]
    H --> I
    I --> J{"nextId > ceiling?\n(overflow)"}
    J -- "No (common case)" --> K["ID minted in own partition\n✓ No collision"]
    J -- "Yes (> stride IDs minted)" --> L["ID crosses into\nadjacent partition\n⚠ Silent overflow"]
Loading

Reviews (1): Last reviewed commit: "fix(comment-id): address review on colla..." | Re-trigger Greptile

Comment on lines +55 to 63
export function createCommentIdAllocator(base = 0, stride = Infinity): CommentIdAllocator {
let nextId = base + 1;
const ceiling = base + stride;
return {
next: () => nextId++,
seedAbove(maxId: number) {
if (maxId >= nextId) nextId = maxId + 1;
if (maxId >= nextId && maxId <= ceiling) nextId = maxId + 1;
},
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 next() is unbounded — silent partition overflow

ceiling is computed but only consulted in seedAbove; next() never checks it. Once a peer mints its 1,000,001st ID it silently emits values in the adjacent partition. From that point, the other peer's seedAbove will accept those values (they fall within (adjacentBase, adjacentBase + stride]) and advance the victim's counter, causing the exact collision this feature is meant to prevent — with no error or warning.

Adding a dev-mode guard directly in next() (e.g., if (process.env.NODE_ENV !== 'production' && nextId > ceiling) console.warn(...)) would make the overflow immediately visible during development without any production overhead.

Comment thread e2e/tests/collab-comment-id.spec.ts Outdated
Comment on lines +24 to +28
const card = page.locator(`${SIDEBAR} .docx-comment-card[data-comment-id]`).last();
await card.waitFor({ state: 'visible' });
const id = await card.getAttribute('data-comment-id');
if (id == null) throw new Error('comment card rendered without data-comment-id');
return Number(id);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 .last() card picker is racing against Yjs sync

By the time peer B presses Enter and the assertion resolves, peer A's comment card may have already synced into peer B's sidebar. If the sidebar renders synced cards after the locally-minted card (e.g., by comment ID ascending, where peer B's base is higher than peer A's), .last() returns the correct card. But if the sidebar sorts by document position or Y.Array insertion order, A's card might appear after B's, causing .last() to return A's ID — making idB === idA and tripping the distinctness assertion non-deterministically.

A more resilient approach is to snapshot visible IDs before adding the comment and then wait for a new ID to appear, rather than relying on positional order.

@jedrazb

jedrazb commented Jun 20, 2026

Copy link
Copy Markdown
Contributor Author

Heads up before this lands: the per-peer ID scheme produces comment/revision IDs that Word won't accept, so collaborative saves will come back corrupt.

The relevant attributes (w:comment/@w:id, w:ins/@w:id, w:del/@w:id, w:commentReference/@w:id, w:commentRangeStart/End) are all ST_DecimalNumber, which the schema defines as plain xsd:integer, no upper bound:

<xsd:simpleType name="ST_DecimalNumber">
  <xsd:restriction base="xsd:integer"/>
</xsd:simpleType>

The schema doesn't match what Word actually does, though. Word reads these IDs as signed 32-bit ints, so anything above 2,147,483,647 (0x7FFFFFFF) gets rejected: the file opens into "unreadable content" / Document Recovery and the comment or revision is dropped or renumbered. We already depend on this exact limit elsewhere. packages/core/src/utils/hexId.ts caps generated IDs at MAX_HEX_ID_EXCLUSIVE = 0x7fffffff, with a comment that it's there "to survive both Word ('Document Recovery') and strict OOXML validators." Same ceiling applies to the decimal w:id fields.

The scheme here is base = ydoc.clientID * 1_000_000. clientID is a random uint32 (up to ~4.3e9), so the base passes 0x7FFFFFFF once clientID is over ~2147, which is almost always: a random clientID lands at or under 2147 about 1 in 2 million times. So in practice every peer's first comment is already out of range (e.g. clientID 4e9 gives an id near 4e15). The serializer writes comment.id straight through (it only clamps negatives), so those values reach the XML unchanged.

The partitioning idea itself is fine for keeping IDs unique; the problem is just that clientID is too large to use as the multiplier.

There are really two separate requirements:

I'd keep the per-peer partition for the live side, and renumber the comment + revision IDs down to a dense 1..N at serialize time, only when something is actually out of range so normal single-editor saves stay untouched, with a hard 0x7FFFFFFF clamp as a backstop. One thing to watch: the editor uses a single ID space shared across comments and revisions, so the renumber has to draw from one sequence rather than two. This also covers oversized IDs coming from a malformed input file, not just collab.

Suggest holding the merge until the save side is handled. The partition on its own isn't safe for Word.

@jedrazb jedrazb marked this pull request as draft June 20, 2026 19:07
@samcorcos

Copy link
Copy Markdown
Contributor

Follow-up patch addressing the int32 w:id concern raised above, plus the two Greptile P2s. Applies cleanly on top of 44910a02.

What it does:

  • New packages/core/src/docx/serializer/decimalIdRemap.ts: pre-pass run from repackDocx / repackDocxFromRaw that walks the Document (body, comments, headers/footers, notes, tables, SDTs) and, only when any comment/revision ID exceeds 0x7FFFFFFF, remaps all of them to a dense 1..N from a single shared sequence. No-op otherwise, so single-editor saves are byte-identical. attemptSelectiveSave bails to full repack when a remap is needed (range markers in unchanged paragraphs).
  • New decimalIdRemap.test.ts with an output-level invariant: createDocx() a doc whose IDs are ~4×10¹⁵, unzip, assert every w:id in document.xml + comments.xml is ≤ 0x7FFFFFFF and that commentRangeStartw:comment IDs still match. This test fails on the current head of this branch (Received: 4000000001000001) and passes with the patch.
  • commentIdAllocator.ts: JSDoc now states the int32 reality and points at the remap; next() dev-warns once if it crosses ceiling.
  • collab-comment-id.spec.ts: addComment() snapshots existing IDs and polls for the new one instead of .last().
  • Changeset summary updated.

Verified locally: bun run typecheck, bun test packages/core/src/docx/ (376 pass), npx playwright test collab-comment-id --project=chromium (pass).

Patch (apply with git apply)
diff --git a/.changeset/comment-id-base-collab.md b/.changeset/comment-id-base-collab.md
index eb849e32..95f8fcc0 100644
--- a/.changeset/comment-id-base-collab.md
+++ b/.changeset/comment-id-base-collab.md
@@ -4,4 +4,4 @@
 '@eigenpal/docx-editor-vue': minor
 ---
 
-Add `commentIdBase`/`commentIdStride` props to partition comment/revision IDs per collaborating peer. Fixes #257.
+Add `commentIdBase`/`commentIdStride` props to partition comment/revision IDs per collaborating peer. On save, oversized IDs are renumbered to fit Word's signed-int32 `w:id` limit so collaborative documents open cleanly. Fixes #257.
diff --git a/e2e/tests/collab-comment-id.spec.ts b/e2e/tests/collab-comment-id.spec.ts
index 2797f666..578eeba6 100644
--- a/e2e/tests/collab-comment-id.spec.ts
+++ b/e2e/tests/collab-comment-id.spec.ts
@@ -15,17 +15,26 @@ const SIDEBAR = '.docx-unified-sidebar';
 const ADD_COMMENT_BTN = '[data-testid="floating-add-comment"]';
 
 async function addComment(page: Page, body: string): Promise<number> {
+  // Snapshot existing IDs first, then wait for a NEW one to appear — `.last()`
+  // would race against the peer's comment syncing into this sidebar.
+  const before = new Set(await visibleCommentIds(page));
   await page.locator('.layout-page-content').click({ clickCount: 3 });
   await page.locator(ADD_COMMENT_BTN).click();
   const ta = page.locator(`${SIDEBAR} textarea`).last();
   await ta.waitFor({ state: 'visible' });
   await ta.fill(body);
   await ta.press('Enter');
-  const card = page.locator(`${SIDEBAR} .docx-comment-card[data-comment-id]`).last();
-  await card.waitFor({ state: 'visible' });
-  const id = await card.getAttribute('data-comment-id');
-  if (id == null) throw new Error('comment card rendered without data-comment-id');
-  return Number(id);
+  let added: number | undefined;
+  await expect
+    .poll(
+      async () => {
+        added = (await visibleCommentIds(page)).find((id) => !before.has(id));
+        return added;
+      },
+      { timeout: 10_000 }
+    )
+    .toBeDefined();
+  return added!;
 }
 
 async function visibleCommentIds(page: Page): Promise<number[]> {
diff --git a/packages/core/src/docx/rezip.ts b/packages/core/src/docx/rezip.ts
index 42811c19..2f787315 100644
--- a/packages/core/src/docx/rezip.ts
+++ b/packages/core/src/docx/rezip.ts
@@ -39,6 +39,7 @@
 import JSZip from 'jszip';
 import type { Document } from '../types/document';
 import { serializeDocument } from './serializer/documentSerializer';
+import { remapOversizedRevisionIds } from './serializer/decimalIdRemap';
 import { type RawDocxContent } from './unzip';
 import { escapeXml } from './serializer/xmlUtils';
 
@@ -106,6 +107,10 @@ export async function repackDocx(doc: Document, options: RepackOptions = {}): Pr
   const { compressionLevel = 6, updateModifiedDate = true, modifiedBy } = options;
   const exportDocument = doc;
 
+  // Compact collab-partitioned comment/revision IDs down to Word's signed-int32
+  // range before serializing any part. No-op for normal single-editor saves.
+  remapOversizedRevisionIds(exportDocument);
+
   // Load the original ZIP
   const originalZip = await JSZip.loadAsync(doc.originalBuffer);
 
@@ -205,6 +210,8 @@ export async function repackDocxFromRaw(
   const { compressionLevel = 6, updateModifiedDate = true, modifiedBy } = options;
   const exportDocument = doc;
 
+  remapOversizedRevisionIds(exportDocument);
+
   // Create a new ZIP with all original files
   const newZip = new JSZip();
 
diff --git a/packages/core/src/docx/selectiveSave.ts b/packages/core/src/docx/selectiveSave.ts
index 949edc33..1acf6b03 100644
--- a/packages/core/src/docx/selectiveSave.ts
+++ b/packages/core/src/docx/selectiveSave.ts
@@ -10,6 +10,7 @@
 
 import type { Document, BlockContent } from '../types/document';
 import { serializeDocument } from './serializer/documentSerializer';
+import { remapOversizedRevisionIds } from './serializer/decimalIdRemap';
 import {
   serializeCommentsWithInfo,
   serializeCommentsExtended,
@@ -106,6 +107,10 @@ export async function attemptSelectiveSave(
   const content = doc.package.document.content;
   if (hasNewImagesOrHyperlinks(content)) return null;
 
+  // Collab-partitioned comment/revision IDs need a document-wide renumber that
+  // touches commentRange markers in unchanged paragraphs — full repack only.
+  if (remapOversizedRevisionIds(doc)) return null;
+
   const comments = doc.package.document.comments;
   const hasComments = comments && comments.length > 0;
   const headerFooterUpdates = collectHeaderFooterUpdates(doc);
diff --git a/packages/core/src/docx/serializer/decimalIdRemap.test.ts b/packages/core/src/docx/serializer/decimalIdRemap.test.ts
new file mode 100644
index 00000000..d7e4a723
--- /dev/null
+++ b/packages/core/src/docx/serializer/decimalIdRemap.test.ts
@@ -0,0 +1,158 @@
+import { describe, test, expect } from 'bun:test';
+import JSZip from 'jszip';
+import type { Document, Comment, Paragraph } from '../../types/document';
+import { remapOversizedRevisionIds, MAX_DECIMAL_ID } from './decimalIdRemap';
+import { serializeDocument } from './documentSerializer';
+import { serializeComments } from './commentSerializer';
+import { createDocx } from '../rezip';
+
+// Typical collab ID: ydoc.clientID (~4e9) * 1_000_000 — far above signed int32.
+const HUGE_A = 4_000_000_001_000_001;
+const HUGE_B = 1_234_567_001_000_001;
+
+function para(content: Paragraph['content']): Paragraph {
+  return { type: 'paragraph', content };
+}
+
+function run(text: string) {
+  return { type: 'run' as const, content: [{ type: 'text' as const, text }] };
+}
+
+function makeDoc(commentId: number, revisionId: number, replyId?: number): Document {
+  const comments: Comment[] = [
+    { id: commentId, author: 'A', content: [para([run('c')])] },
+    ...(replyId != null
+      ? [{ id: replyId, author: 'B', parentId: commentId, content: [para([run('r')])] }]
+      : []),
+  ];
+  return {
+    package: {
+      document: {
+        content: [
+          para([
+            { type: 'commentRangeStart', id: commentId },
+            run('hello '),
+            {
+              type: 'insertion',
+              info: { id: revisionId, author: 'A', date: '2025-01-01T00:00:00Z' },
+              content: [run('world')],
+            },
+            { type: 'commentRangeEnd', id: commentId },
+          ]),
+        ],
+        comments,
+      },
+    },
+  };
+}
+
+/** Every `w:id="N"` in `xml` as a number. */
+function emittedWIds(xml: string): number[] {
+  return [...xml.matchAll(/w:id="(\d+)"/g)].map((m) => Number(m[1]));
+}
+
+describe('remapOversizedRevisionIds', () => {
+  test('no-op when every ID already fits in signed int32', () => {
+    const doc = makeDoc(3, 7);
+    expect(remapOversizedRevisionIds(doc)).toBe(false);
+    expect(doc.package.document.comments![0].id).toBe(3);
+    const body = doc.package.document.content[0] as Paragraph;
+    expect((body.content[0] as { id: number }).id).toBe(3);
+  });
+
+  test('remaps to dense 1..N preserving relative order across comments + revisions', () => {
+    const doc = makeDoc(HUGE_A, HUGE_B, HUGE_A + 1);
+    expect(remapOversizedRevisionIds(doc)).toBe(true);
+
+    // Single shared sequence, sorted: HUGE_B → 1, HUGE_A → 2, HUGE_A+1 → 3.
+    const [comment, reply] = doc.package.document.comments!;
+    expect(comment.id).toBe(2);
+    expect(reply.id).toBe(3);
+    expect(reply.parentId).toBe(2);
+
+    const body = doc.package.document.content[0] as Paragraph;
+    expect(body.content[0]).toEqual({ type: 'commentRangeStart', id: 2 });
+    expect(body.content[3]).toEqual({ type: 'commentRangeEnd', id: 2 });
+    expect((body.content[2] as { info: { id: number } }).info.id).toBe(1);
+  });
+
+  test('idempotent — second call is a no-op', () => {
+    const doc = makeDoc(HUGE_A, HUGE_B);
+    expect(remapOversizedRevisionIds(doc)).toBe(true);
+    expect(remapOversizedRevisionIds(doc)).toBe(false);
+  });
+
+  test('descends into tables and pPr-level tracked changes', () => {
+    const doc: Document = {
+      package: {
+        document: {
+          content: [
+            {
+              type: 'table',
+              rows: [
+                {
+                  type: 'tableRow',
+                  structuralChange: {
+                    type: 'tableRowInsertion',
+                    info: { id: HUGE_A, author: 'A' },
+                  },
+                  cells: [
+                    {
+                      type: 'tableCell',
+                      content: [
+                        { ...para([run('x')]), pPrIns: { id: HUGE_B, author: 'A' } } as Paragraph,
+                      ],
+                    },
+                  ],
+                },
+              ],
+            },
+          ],
+        },
+      },
+    };
+    expect(remapOversizedRevisionIds(doc)).toBe(true);
+    for (const id of emittedWIds(serializeDocument(doc))) {
+      expect(id).toBeLessThanOrEqual(MAX_DECIMAL_ID);
+    }
+  });
+});
+
+describe('Word int32 w:id invariant (PR #956 review)', () => {
+  // This is the test that would have caught the original PR's bug without
+  // knowing the implementation: it asserts the SERIALIZED OUTPUT obeys Word's
+  // signed-int32 cap on every `w:id`, regardless of how big the in-memory IDs
+  // are. The schema says `ST_DecimalNumber` is unbounded `xsd:integer`; Word
+  // does not honour that, so neither can we.
+  test('createDocx never emits a w:id above 0x7FFFFFFF in document.xml or comments.xml', async () => {
+    const doc = makeDoc(HUGE_A, HUGE_B, HUGE_A + 1);
+    const buf = await createDocx(doc);
+    const zip = await JSZip.loadAsync(buf);
+
+    const documentXml = await zip.file('word/document.xml')!.async('text');
+    const commentsXml = await zip.file('word/comments.xml')!.async('text');
+
+    const all = [...emittedWIds(documentXml), ...emittedWIds(commentsXml)];
+    expect(all.length).toBeGreaterThan(0);
+    for (const id of all) {
+      expect(id).toBeLessThanOrEqual(MAX_DECIMAL_ID);
+    }
+
+    // Referential integrity survives the remap: the comment's range markers in
+    // document.xml still point at the comment's id in comments.xml.
+    const commentIdMatch = commentsXml.match(/<w:comment w:id="(\d+)"/);
+    expect(commentIdMatch).not.toBeNull();
+    expect(documentXml).toContain(`<w:commentRangeStart w:id="${commentIdMatch![1]}"`);
+  });
+
+  test('serializer-level invariant holds for the body alone', () => {
+    const doc = makeDoc(HUGE_A, HUGE_B);
+    remapOversizedRevisionIds(doc);
+    for (const id of emittedWIds(serializeDocument(doc))) {
+      expect(id).toBeLessThanOrEqual(MAX_DECIMAL_ID);
+    }
+    for (const id of emittedWIds(serializeComments(doc.package.document.comments!))) {
+      expect(id).toBeLessThanOrEqual(MAX_DECIMAL_ID);
+    }
+  });
+});
diff --git a/packages/core/src/docx/serializer/decimalIdRemap.ts b/packages/core/src/docx/serializer/decimalIdRemap.ts
new file mode 100644
index 00000000..ee0d1df9
--- /dev/null
+++ b/packages/core/src/docx/serializer/decimalIdRemap.ts
@@ -0,0 +1,176 @@
+/**
+ * Save-time renumbering of comment + tracked-change `w:id` values.
+ *
+ * The OOXML schema types these IDs as `ST_DecimalNumber` (unbounded
+ * `xsd:integer`), but Word reads them as **signed int32** — anything
+ * above `0x7FFFFFFF` triggers "unreadable content" / Document Recovery and
+ * the comment or revision is dropped. This is the decimal counterpart to
+ * the hex-id cap in `utils/hexId.ts`.
+ *
+ * Collaborative sessions partition the live ID space per peer (e.g.
+ * `ydoc.clientID * 1_000_000`, see `commentIdAllocator.ts`), which routinely
+ * lands IDs in the 10^12–10^15 range. That's fine in memory; on save we
+ * compact every comment + revision ID down to a dense `1..N` sequence so the
+ * file opens cleanly in Word. The remap is a no-op when every ID already fits,
+ * so single-editor saves stay byte-for-byte identical.
+ *
+ * Mutates the `Document` in place — same convention as `processNewImages`
+ * and the other rezip pre-passes.
+ */
+
+import type {
+  Document,
+  BlockContent,
+  Paragraph,
+  ParagraphContent,
+  Run,
+  Hyperlink,
+  SimpleField,
+  ComplexField,
+  InlineSdt,
+  Table,
+} from '../../types/document';
+
+/** Word's effective upper bound for `ST_DecimalNumber` `w:id` attributes. */
+export const MAX_DECIMAL_ID = 0x7fffffff;
+
+type IdVisitor = (id: number) => number;
+
+function visitRun(run: Run, fn: IdVisitor): void {
+  for (const pc of run.propertyChanges ?? []) pc.info.id = fn(pc.info.id);
+}
+
+function visitHyperlink(link: Hyperlink, fn: IdVisitor): void {
+  for (const child of link.children) {
+    if (child.type === 'run') visitRun(child, fn);
+  }
+}
+
+function visitField(field: SimpleField | ComplexField, fn: IdVisitor): void {
+  if (field.type === 'simpleField') {
+    for (const c of field.content) {
+      if (c.type === 'run') visitRun(c, fn);
+      else if (c.type === 'hyperlink') visitHyperlink(c, fn);
+    }
+  } else {
+    for (const r of field.fieldCode) visitRun(r, fn);
+    for (const r of field.fieldResult) visitRun(r, fn);
+  }
+}
+
+function visitInlineSdt(sdt: InlineSdt, fn: IdVisitor): void {
+  for (const item of sdt.content) {
+    if (item.type === 'run') visitRun(item, fn);
+    else if (item.type === 'hyperlink') visitHyperlink(item, fn);
+    else if (item.type === 'simpleField' || item.type === 'complexField') visitField(item, fn);
+    else if (item.type === 'inlineSdt') visitInlineSdt(item, fn);
+  }
+}
+
+function visitParagraphContent(item: ParagraphContent, fn: IdVisitor): void {
+  switch (item.type) {
+    case 'run':
+      return visitRun(item, fn);
+    case 'hyperlink':
+      return visitHyperlink(item, fn);
+    case 'simpleField':
+    case 'complexField':
+      return visitField(item, fn);
+    case 'inlineSdt':
+      return visitInlineSdt(item, fn);
+    case 'commentRangeStart':
+    case 'commentRangeEnd':
+      item.id = fn(item.id);
+      return;
+    case 'insertion':
+    case 'deletion':
+    case 'moveFrom':
+    case 'moveTo':
+      item.info.id = fn(item.info.id);
+      for (const c of item.content) {
+        if (c.type === 'run') visitRun(c, fn);
+        else if (c.type === 'hyperlink') visitHyperlink(c, fn);
+      }
+      return;
+    case 'moveFromRangeStart':
+    case 'moveFromRangeEnd':
+    case 'moveToRangeStart':
+    case 'moveToRangeEnd':
+      item.id = fn(item.id);
+      return;
+    default:
+      return;
+  }
+}
+
+function visitParagraph(p: Paragraph, fn: IdVisitor): void {
+  if (p.pPrIns) p.pPrIns.id = fn(p.pPrIns.id);
+  if (p.pPrDel) p.pPrDel.id = fn(p.pPrDel.id);
+  for (const pc of p.propertyChanges ?? []) pc.info.id = fn(pc.info.id);
+  for (const item of p.content) visitParagraphContent(item, fn);
+}
+
+function visitTable(t: Table, fn: IdVisitor): void {
+  for (const pc of t.propertyChanges ?? []) pc.info.id = fn(pc.info.id);
+  for (const row of t.rows) {
+    for (const pc of row.propertyChanges ?? []) pc.info.id = fn(pc.info.id);
+    if (row.structuralChange) row.structuralChange.info.id = fn(row.structuralChange.info.id);
+    for (const cell of row.cells) {
+      for (const pc of cell.propertyChanges ?? []) pc.info.id = fn(pc.info.id);
+      if (cell.structuralChange) cell.structuralChange.info.id = fn(cell.structuralChange.info.id);
+      visitBlocks(cell.content, fn);
+    }
+  }
+}
+
+function visitBlocks(blocks: BlockContent[], fn: IdVisitor): void {
+  for (const block of blocks) {
+    if (block.type === 'paragraph') visitParagraph(block, fn);
+    else if (block.type === 'table') visitTable(block, fn);
+    else if (block.type === 'blockSdt') visitBlocks(block.content, fn);
+  }
+}
+
+/** Visit every comment/revision `w:id` in the package via `fn` (write-back). */
+function visitDocumentIds(doc: Document, fn: IdVisitor): void {
+  const pkg = doc.package;
+  visitBlocks(pkg.document.content, fn);
+  for (const c of pkg.document.comments ?? []) {
+    c.id = fn(c.id);
+    if (c.parentId != null) c.parentId = fn(c.parentId);
+  }
+  for (const hf of pkg.headers?.values() ?? []) visitBlocks(hf.content, fn);
+  for (const hf of pkg.footers?.values() ?? []) visitBlocks(hf.content, fn);
+  for (const n of pkg.footnotes ?? []) visitBlocks(n.content, fn);
+  for (const n of pkg.endnotes ?? []) visitBlocks(n.content, fn);
+  for (const n of pkg.footnoteSeparators ?? []) visitBlocks(n.content, fn);
+  for (const n of pkg.endnoteSeparators ?? []) visitBlocks(n.content, fn);
+}
+
+/**
+ * If any comment or tracked-change ID in `doc` exceeds Word's signed-int32
+ * limit, remap **all** of them to a dense `1..N` sequence (single sequence
+ * shared across comments and revisions, since the editor uses one ID space for
+ * both). Returns `true` when a remap was applied.
+ *
+ * No-op (returns `false`) when every ID already fits, so non-collab saves are
+ * untouched.
+ */
+export function remapOversizedRevisionIds(doc: Document): boolean {
+  const ids = new Set<number>();
+  visitDocumentIds(doc, (id) => {
+    ids.add(id);
+    return id;
+  });
+
+  let max = 0;
+  for (const id of ids) if (id > max) max = id;
+  if (max <= MAX_DECIMAL_ID) return false;
+
+  const sorted = [...ids].sort((a, b) => a - b);
+  const remap = new Map<number, number>();
+  for (let i = 0; i < sorted.length; i++) remap.set(sorted[i], i + 1);
+
+  visitDocumentIds(doc, (id) => remap.get(id) ?? Math.min(id, MAX_DECIMAL_ID));
+  return true;
+}
diff --git a/packages/core/src/prosemirror/commentIdAllocator.ts b/packages/core/src/prosemirror/commentIdAllocator.ts
index 5458a507..a2f737b1 100644
--- a/packages/core/src/prosemirror/commentIdAllocator.ts
+++ b/packages/core/src/prosemirror/commentIdAllocator.ts
@@ -37,9 +37,11 @@ export interface CommentIdAllocator {
  * @param base - Offset for the first allocated ID (`base + 1`). Pair with
  * `stride` to partition the ID space across collaborating peers so concurrent
  * allocations never collide — e.g. `ydoc.clientID * 1_000_000` with
- * `stride = 1_000_000`. Yjs `clientID` is a uint32, so the product stays well
- * under `Number.MAX_SAFE_INTEGER` (OOXML `w:id` is `xsd:integer`, no upper
- * bound). Defaults to `0`, preserving the original `1, 2, 3, …` sequence.
+ * `stride = 1_000_000`. Defaults to `0`, preserving the original `1, 2, 3, …`
+ * sequence. These IDs are **in-memory only**: the OOXML schema types `w:id` as
+ * unbounded `xsd:integer`, but Word reads it as signed int32, so the serializer
+ * compacts to `1..N` on save when any ID exceeds `0x7FFFFFFF`
+ * (`docx/serializer/decimalIdRemap.ts`).
  * @param stride - Width of this allocator's partition. `seedAbove` ignores IDs
  * outside `(base, base + stride]` so a peer's synced revision marks can't pull
  * this allocator into their partition. Defaults to `Infinity` (single editor:
@@ -48,15 +50,30 @@ export interface CommentIdAllocator {
  * Collision-freedom across peers holds under two assumptions: (1) peers pass
  * distinct `base` values — with `clientID * stride` this means distinct Yjs
  * `clientID`s, which Yjs already relies on; and (2) a peer mints fewer than
- * `stride` IDs per session, since `next()` is unbounded and would climb past
+ * `stride` IDs per session. `next()` warns (dev only) if it climbs past
  * `base + stride` into the neighbouring partition. Both hold comfortably for
  * realistic sessions (`stride = 1_000_000`).
  */
 export function createCommentIdAllocator(base = 0, stride = Infinity): CommentIdAllocator {
   let nextId = base + 1;
   const ceiling = base + stride;
+  let warned = false;
   return {
-    next: () => nextId++,
+    next: () => {
+      const id = nextId++;
+      if (
+        !warned &&
+        id > ceiling &&
+        typeof process !== 'undefined' &&
+        process.env?.NODE_ENV !== 'production'
+      ) {
+        warned = true;
+        console.warn(
+          `[docx-editor] commentIdAllocator overflowed its partition (base=${base}, stride=${stride}).`
+        );
+      }
+      return id;
+    },
     seedAbove(maxId: number) {
       if (maxId >= nextId && maxId <= ceiling) nextId = maxId + 1;
     },

…e comments

Collab partitions comment/revision IDs per peer (clientID * 1e6), which
exceeds Word's signed-int32 w:id limit and makes saved files open into
Document Recovery with comments/revisions dropped. Add a save-time pre-pass
(decimalIdRemap) that compacts oversized comment + revision IDs to a dense
1..N sequence from one shared sequence; no-op when every ID already fits, so
single-editor saves stay byte-identical. Also bounds-warn the allocator in
dev and de-races the collab e2e.

The renumber runs on detached shallow clones of the comments array, so the
open session and collab peers keep their live IDs and stay in sync with the
unchanged PM comment marks; only this save's snapshot is renumbered.

Co-authored-by: Sam Corcos <2626231+samcorcos@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jacobjove and others added 6 commits June 23, 2026 08:38
* feat(core): footnote-PM-doc layout seam + data-footnote-id + note-ref marker on serialize

Adds getFootnotePmDoc seam (mirrors getHfPmDoc) so the painter can read a
live footnote ProseMirror doc; stamps data-footnote-id on .layout-footnote-content
for per-footnote position disambiguation; re-inserts the w:footnoteRef/w:endnoteRef
marker run when serializing a normal note from content (parser drops it). Groundwork
for editable footnotes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(react): lazy hidden footnote ProseMirror view + writeback

Mounts one off-screen EditorView for the actively-edited footnote (lazy single
slot keyed by activeFootnoteId), mirroring HiddenHeaderFooterPMs. Transactions
sync proseDocToBlocks back to Document.package.footnotes and clear verbatimXml;
useLayoutPipeline feeds the live doc through getFootnotePmDoc so the painter
reflects edits. Footnote-edit wiring lives in a new useFootnoteEditState hook
(keeps PagedEditor under the line cap). Click routing/overlay follow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(react): route clicks into footnotes for editing

Clicking a painted footnote enters footnote-edit mode for that data-footnote-id,
places the caret in its hidden view (scoped span-snap so colliding per-footnote
PM positions never cross-match), focuses it, and blurs+read-onlys the body;
clicking outside exits. activeSurface() now routes body|HF|footnote. Adds e2e
spec (run via playwright-runner) and a unit test for the hit-resolver.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(react): restore body focus when a click exits footnote-edit mode

Exiting footnote mode via a body click now hands focus + caret back to the
body in the same gesture. The restore is deferred to a parent-level effect
(exitFootnoteToBody) that runs after HiddenFootnotePM tears its view down, so
the teardown's destroy-blur can't steal the focus, and the body PM is already
editable by then. Verified by the footnote-editing e2e (click→type→exit→type).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(react): visible caret + selection overlay for footnote editing

Draws the caret and range-selection rects over the actively-edited footnote,
mirroring the header/footer overlay. New framework-free core helpers
(computeFootnoteCaretRectFromView/SelectionRectsFromView) scope the DOM walk to
the active .layout-footnote-content[data-footnote-id] so colliding per-footnote
PM positions never cross-match; DocxEditorPagedArea portals the overlay and
recomputes on painter:painted + resize. e2e asserts the caret renders within
the clicked footnote bounds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(vue): editable footnotes at parity with React

Mirrors the React footnote-editing feature in the Vue adapter: lazy single
hidden footnote EditorView + writeback (useFootnotePM), click routing into
footnotes (usePagesPointer), and caret/selection overlay (useFootnoteOverlay +
FootnoteOverlay.vue) reusing the framework-free core helpers. getFootnotePmDoc
feeds the live doc to the painter. No public API change; parity contract holds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore: changeset for editable footnotes

* chore: update API report for footnote-editing exports

* fix(react): exit footnote-edit mode when the edited footnote is gone

Reset footnoteEditId when the actively-edited footnote is no longer present in
the document (e.g. a new document loaded mid-edit) — otherwise the body PM
stays readOnly and the next writeback resolves the stale id against the new
document. Keyed on footnote existence, not document identity (history.state
gets a fresh object per body edit, so a bare [document] reset would exit
mid-edit). Mirrors the Vue adapter's destroyFootnotePM-on-load behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(footnotes): black caret + glyph-accurate click placement

The footnote caret rendered blue (HF chrome color) — make it black to match the
body caret in both adapters. Footnote clicks resolved position via a coarse
estimate that drifted (clicking between letters landed at the word end); React
now reuses the body's glyph-accurate findPositionInSpan (exported from core) on
the span under the cursor, scoped to the active footnote. Vue keeps the coarse
snap for now (precise caret exposes a separate Vue typing issue).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(footnotes): enable click-drag text selection

The footnote click handler placed the caret and returned without arming drag
state, so click-drag selected nothing. Arm dragAnchorRef + isDraggingRef from the
clicked position on mousedown (DOM-resolved, no view needed); the existing move
handler already extends the selection through the footnote surface. e2e covers it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Jacob Jove <jacobjove@Jacobs-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(core): keep a footnote reference on the same line as its word

The line-breaker created a wrap opportunity at every run boundary, so a
footnote/endnote reference run (a separate superscript with no space
before it, e.g. copyright.¹) could split onto the next line. Fold the
width of the unbreakable content that follows a run's last word into the
wrap decision so adjacent runs with no whitespace between them wrap as a
single cluster, matching Word.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(core): derive footnote-glue maxWidth from measureTextWidth

The hardcoded maxWidth assumed this file's 8px/char canvas stub, but the
guard only installs it when no global document exists; run after another
test that sets one, measureTextWidth returned different widths and the
control case failed. Derive the threshold from measureTextWidth so the
test is self-consistent under whichever stub is active.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Jacob Jove <jacobjove@Jacobs-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
)

* fix: superscript footnote/endnote refs only when the style says so (Word parity)

A bare anchor run (no FootnoteReference rStyle, e.g. Pandoc output) was
force-raised by the layout bridge, diverging from Word, which renders an
unstyled anchor at the baseline. Drop the implicit superscript; it now flows
solely from the resolved character-style chain or the run's own vertAlign.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(core): note basedOn limitation + endnote-mark intent in footnote-ref parity test

Addresses review feedback on PR #994 — clarify that getRunStyleOwnProperties
resolves only own (not basedOn-inherited) vertAlign, and that endnoteRef content
maps to the footnoteRef mark.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(core): correct basedOn note — styleParser pre-merges the chain

The prior comment (and a PR #994 review reply) wrongly claimed a basedOn-inherited
vertAlign renders at the baseline. styleParser.resolveStyleInheritance merges the
parent rPr into the child before getRunStyleOwnProperties reads it, so production
honors inherited superscript; the own-only behavior is the test mock's, not the
real resolver's.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Jacob Jove <jacobjove@Jacobs-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
)

* fix(core): grow a section's footer/header band per its own margins

extendMarginsForHeaderFooter decided whether to grow the header/footer band
once, from the body section's margins, then applied that decision to every
section. A section whose own margins are thin — e.g. a landscape table section
with a 0.5in bottom margin (≈ the footer distance) embedded in a 1in-margin
portrait body — never grew its footer band, so the footer overlapped the
footnote area and the page number rode up beside the last footnote instead of
sitting below it. Decide the overflow per margin set, using that set's own
top/bottom and header/footer distances.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore: changeset for header/footer band fix

---------

Co-authored-by: Jacob Jove <jacobjove@Jacobs-MacBook-Pro.local>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Merging main tipped two files over their max-lines caps:
- DocxEditor.vue (1252 > 1250): the collab comment-id props add inline
  withDefaults + allocator wiring; bumped to 1260.
- useDocxEditor.ts (1003 > 1000): editable footnotes (#995) on main pushed
  it over with no override; added a 1050 cap, matching the other host-file
  ceilings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Comment IDs can collide between collaborating peers (Comment.id is a module-level scalar)

3 participants