Improve DOCX render fidelity by aldrinjenson · Pull Request #993 · eigenpal/docx-editor

aldrinjenson · 2026-06-22T19:57:46Z

Summary

Improve cached diagram label placement by using the authored text transform and preserving text insets/color metadata.
Render section-specific headers and footers per page so active navigation labels follow the page section.
Keep paragraph-relative footer floats anchored to the footer band instead of shifting them into body content.
Add focused regressions for diagram label geometry, footer float anchoring, and section-aware header rendering.

Validation

bun test packages/core/src/layout-painter/tests/section-header-rendering.test.ts packages/core/src/docx/tests/diagram-drawing.test.ts packages/core/src/layout-painter/tests/header-footer-spacing.test.ts packages/core/src/layout-painter/tests/hf-float-textbox-no-advance.test.ts packages/core/src/docx/tests/tiff-preview.test.ts packages/core/src/layout-bridge/tests/toFlowBlocks-image-sizing.test.ts packages/core/src/layout-engine/tests/anchored-textbox-pagebreak.test.ts packages/core/src/layout-painter/tests/floating-image-layer-zorder.test.ts
bun run --filter '@eigenpal/docx-editor-core' typecheck
bun run --filter '@eigenpal/docx-editor-react' typecheck
bun run --filter '@eigenpal/docx-editor-vue' typecheck
Pre-commit hook: workspace typecheck, parity gates, API snapshot check, Prettier, ESLint
Local visual comparison against a Word-rendered baseline

vercel · 2026-06-22T19:57:52Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
docx-editor	Ready	Preview, Comment	Jun 23, 2026 2:49pm

eigenpal-release-pal · 2026-06-22T19:58:00Z

All contributors have signed the CLA ✍️ ✅

_{Posted by the CLA bot.}

greptile-apps · 2026-06-22T20:05:48Z

Greptile Summary

This PR improves DOCX render fidelity across three areas: multi-column layout, image parsing, and browser-safe image painting. It also adds regression test coverage for the fixed scenarios.

Multi-column layout: Paragraph and table fragments now use paginator.columnWidth instead of the pre-captured contentWidth, and table justification offsets are computed from result.x (returned post-addFragment) rather than the column x-coordinate captured before the paginator may have advanced columns, fixing incorrect fragment placement in multi-column sections.
Image parsing: parseInline/parseAnchor return null when no blip rId is found, skipping non-picture DrawingML shapes cleanly. The previously unused _media parameter in parseTextBoxContent is now forwarded to the paragraph parser, fixing media resolution for images nested inside text-box paragraphs.
Browser-safe painting: A new applyImageSourceForBrowser function centralises the browser-format check across all six image render sites (inline, block, floating, header/footer, watermark, and floating layer). TIFF, EMF, and WMF data-URLs are now hidden (visibility: hidden) instead of producing broken-image icons, while the model data is preserved for round-trip/export. Note: a changeset file (bun changeset) covering @eigenpal/docx-editor-core is required per CLAUDE.md for any code PR that touches production sources — none was added in this branch.

Confidence Score: 4/5

Safe to merge after adding a changeset file; the layout and painter changes are well-tested, and the refactoring is mechanically consistent across all six image render sites.

The three independent fix areas each have matching synthetic test coverage and the changes are narrowly scoped. The missing changeset file is the only blocker — without it the Release workflow will produce an incorrect CHANGELOG entry or fail changeset status in CI. The two inline comments are quality observations that do not affect correctness.

packages/core/src/layout-painter/renderImage.ts (new exported API surface) and packages/core/src/layout-engine/index.ts (column-layout geometry change) are the most impactful files and warrant a careful read. All other changed files are straightforward call-site updates or tests.

Important Files Changed

Filename	Overview
packages/core/src/layout-painter/renderImage.ts	New `applyImageSourceForBrowser` centralises the browser-render guard; TIFF/EMF/WMF data-URLs are hidden instead of producing broken-image icons. Minor semantic issue: undefined src incorrectly sets `data-unsupported-image`.
packages/core/src/layout-engine/index.ts	Multi-column fixes: paragraph/table fragments now use `paginator.columnWidth` instead of the captured content width, and table justification is computed from `result.x` (post-addFragment) for correctness. Minor: `fragment.width` is initialised then immediately overwritten.
packages/core/src/docx/imageParser.ts	`parseInline`/`parseAnchor` now return `null` when no blip rId is found, cleanly skipping non-picture DrawingML shapes; `parseDrawing` already returned `null
packages/core/src/docx/textBoxParser.ts	Previously unused `_media` parameter is now named `media` and forwarded to the paragraph parser, fixing image resolution for pictures nested inside text-box paragraphs.
packages/core/src/layout-painter/renderWatermark.ts	Replaces direct `img.src = wm.dataUrl` with `applyImageSourceForBrowser`, adding the browser-format guard to the watermark paint path that previously had no validation.
packages/core/src/layout-painter/renderPage/headerFooter.ts	Floating header/footer images now go through `applyImageSourceForBrowser` instead of direct src assignment; straightforward defensive fix.
packages/core/src/layout-painter/floatingImageLayer.ts	Same browser-format guard applied to the floating image layer renderer; one-line change.
packages/core/src/layout-painter/renderParagraph/runs.ts	`applyImageSourceForBrowser` replaces direct `img.src` assignment in both `renderInlineImageRun` and `renderBlockImage`; logic is otherwise unchanged.
packages/core/src/layout-engine/integration/section-breaks.test.ts	New integration tests assert that paragraph fragments in two-column sections use `columnWidth` and that table fragments receive the correct post-`addFragment` column x-coordinate.
docs/render-fidelity-qa.md	New QA checklist for render-fidelity investigations; documentation-only, no code risk.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant P as DOCX Parser
    participant IP as imageParser
    participant TB as textBoxParser
    participant LE as layoutEngine
    participant RI as renderImage
    participant DOM as DOM

    P->>IP: parseImage(drawingEl, rels, media)
    IP->>IP: parseAnchor / parseInline
    alt no blip rId (non-picture shape)
        IP-->>P: null (skip)
    else picture found
        IP->>IP: resolveImageData(rId, rels, media)
        IP-->>P: "Image { rId, src, ... }"
    end

    P->>TB: parseTextBoxContent(..., rels, media)
    Note over TB: media now forwarded to parseParagraph
    TB->>IP: parseImage (nested picture)
    IP-->>TB: "Image { src: resolved dataUrl }"

    P->>LE: layoutDocument(blocks, measures, options)
    LE->>LE: layoutParagraph / layoutTable
    Note over LE: width = paginator.columnWidth
    Note over LE: table x = result.x + justification
    LE-->>P: "Layout { pages, fragments }"

    LE->>RI: renderImageFragment / renderFloatingImagesLayer
    RI->>RI: applyImageSourceForBrowser(img, src)
    alt src is data:image/tiff or EMF/WMF
        RI->>DOM: removeAttribute(src), visibility:hidden
    else src is data:image/png, blob:, https:
        RI->>DOM: "img.src = src"
    end

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant P as DOCX Parser
    participant IP as imageParser
    participant TB as textBoxParser
    participant LE as layoutEngine
    participant RI as renderImage
    participant DOM as DOM

    P->>IP: parseImage(drawingEl, rels, media)
    IP->>IP: parseAnchor / parseInline
    alt no blip rId (non-picture shape)
        IP-->>P: null (skip)
    else picture found
        IP->>IP: resolveImageData(rId, rels, media)
        IP-->>P: "Image { rId, src, ... }"
    end

    P->>TB: parseTextBoxContent(..., rels, media)
    Note over TB: media now forwarded to parseParagraph
    TB->>IP: parseImage (nested picture)
    IP-->>TB: "Image { src: resolved dataUrl }"

    P->>LE: layoutDocument(blocks, measures, options)
    LE->>LE: layoutParagraph / layoutTable
    Note over LE: width = paginator.columnWidth
    Note over LE: table x = result.x + justification
    LE-->>P: "Layout { pages, fragments }"

    LE->>RI: renderImageFragment / renderFloatingImagesLayer
    RI->>RI: applyImageSourceForBrowser(img, src)
    alt src is data:image/tiff or EMF/WMF
        RI->>DOM: removeAttribute(src), visibility:hidden
    else src is data:image/png, blob:, https:
        RI->>DOM: "img.src = src"
    end

Comments Outside Diff (1)

packages/core/src/layout-engine/index.ts, line 381-397 (link)

fragment.width initialised then unconditionally overwritten

In the empty-paragraph path the fragment is created with width: paginator.columnWidth and then fragment.width = paginator.columnWidth is written again immediately after addFragment. The post-addFragment write is the authoritative one (in case ensureFits advanced to a page with different geometry), so the constructor value is always dead. Setting width: 0 (or any placeholder) at construction and relying solely on the post-addFragment assignment would remove the ambiguity and match how x and y are handled. The same pattern recurs in the non-empty paragraph loop at line 458/475.

_{Reviews (1): Last reviewed commit: "Improve DOCX render fidelity" | Re-trigger Greptile}

jedrazb · 2026-06-23T07:31:20Z

LGTM overall — render-fidelity logic is sound, typecheck clean, tests green, React/Vue parity holds.

One security finding to fix before merge:

HIGH — zero-click DoS in the new TIFF decoder (packages/core/src/docx/tiffPreview.ts:137). concatenateStrips allocates new Uint8Array(total) where total is summed straight from the file's StripByteCounts with no upper clamp, and the per-strip bounds check happens after the allocation. A ~138-byte crafted TIFF with StripByteCounts = 0x40000000 forces a 1 GiB allocation (near 2^32 → RangeError that propagates uncaught through buildMediaMap and aborts document open). Reproduced locally.

Fix: clamp total against bytes.length (strip data can't exceed the file) before allocating, add a width*height sanity cap before the BMP allocation (line 164), and wrap the decode in try/catch → return undefined. Per CLAUDE.md: never feed a file-supplied number straight into allocation.

jedrazb · 2026-06-23T07:31:33Z

+  stripByteCounts: number[]
+): Uint8Array {
+  const total = stripByteCounts.reduce((sum, count) => sum + Math.max(0, count), 0);
+  const out = new Uint8Array(total);


Security (HIGH): unbounded allocation from attacker-controlled bytes.

total is the raw sum of the file's StripByteCounts (each a u32, no upper clamp), and the per-strip bounds check at line 142 runs after this allocation. A ~138-byte crafted TIFF with StripByteCounts = 0x40000000 forces new Uint8Array(1 GiB); values near 2^32 throw RangeError that propagates uncaught through buildMediaMap (parser.ts:403/452, no try/catch) and aborts document open. Zero-click DoS on untrusted input. Reproduced locally.

Suggest clamping before allocating, e.g.:

const total = stripByteCounts.reduce((sum, count) => sum + Math.max(0, count), 0); if (total > bytes.length) return new Uint8Array(0); // strip data can't exceed the source file const out = new Uint8Array(total);

Also add a width*height sanity cap before the BMP allocation (line 164) and wrap the decode in try/catch → return undefined. Per CLAUDE.md: never feed a file-supplied number straight into allocation.

Thanks, fixed in 6e324d5a.

The TIFF preview path now validates/clamps strip byte counts before allocation, validates strip offsets/counts before copying, caps preview dimensions before BMP allocation, rejects invalid samples-per-pixel values, and wraps preview decoding so malformed TIFF input returns undefined instead of aborting document open. I also added regression tests for oversized strip byte counts, oversized dimensions, and malformed buffers.

Latest PR head is green for lint, test, build, CLA, and Vercel.

…elity # Conflicts: # packages/core/src/docx/parser.ts # packages/core/src/prosemirror/conversion/toProseDoc/textbox.ts

aldrinjenson · 2026-06-23T14:18:55Z

Addressed the remaining review feedback on this PR:

Hardened TIFF preview decoding in 6e324d5a with pre-allocation strip validation, preview dimension caps, invalid sample-count rejection, and safe malformed-input handling.
Added TIFF regression coverage for oversized strip byte counts, oversized dimensions, and malformed input.
Confirmed the earlier bot notes are now covered: changeset is present, missing image sources are no longer marked as unsupported images, and paragraph fragments use placeholder width before post-layout geometry assignment.
Regenerated the generated API reports in 094be4c6 after merging main, which fixed the build check drift.

Latest PR head is green for lint, test, build, CLA, and Vercel.

aldrinjenson added 2 commits June 22, 2026 15:33

Fix multi-column fragment positioning

9692dd6

Improve DOCX render fidelity

a3a55b8

vercel Bot deployed to Preview June 22, 2026 19:59 View deployment

greptile-apps Bot reviewed Jun 22, 2026

View reviewed changes

Comment thread packages/core/src/layout-painter/renderImage.ts

Improve DOCX header and footer fidelity

a20b620

vercel Bot deployed to Preview June 22, 2026 20:23 View deployment

Improve DOCX render fidelity

13f54c3

vercel Bot deployed to Preview June 22, 2026 21:37 View deployment

Improve section header and footer rendering

9ea6d64

vercel Bot deployed to Preview June 22, 2026 22:10 View deployment

Address render fidelity review feedback

07c78ec

vercel Bot deployed to Preview June 22, 2026 22:17 View deployment

Update API reports for render fidelity changes

823a6ae

vercel Bot deployed to Preview June 22, 2026 22:25 View deployment

Preserve header text box z-order

63ee630

vercel Bot deployed to Preview June 22, 2026 22:45 View deployment

Honor text box autofit in headers

166c415

vercel Bot deployed to Preview June 22, 2026 23:02 View deployment

Update API reports for text box autofit

ff61fc1

vercel Bot deployed to Preview June 22, 2026 23:10 View deployment

jedrazb reviewed Jun 23, 2026

View reviewed changes

Harden TIFF preview decoding

6e324d5

vercel Bot deployed to Preview June 23, 2026 13:56 View deployment

Merge remote-tracking branch 'origin/main' into codex/docx-render-fid…

16d21cd

…elity # Conflicts: # packages/core/src/docx/parser.ts # packages/core/src/prosemirror/conversion/toProseDoc/textbox.ts

vercel Bot deployed to Preview June 23, 2026 14:03 View deployment

Update generated API reports

094be4c

vercel Bot deployed to Preview June 23, 2026 14:12 View deployment

Match inactive header footer display

49f5e79

vercel Bot deployed to Preview June 23, 2026 14:33 View deployment

Balance bounded continuous column sections

04a38af

vercel Bot deployed to Preview June 23, 2026 14:49 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve DOCX render fidelity#993

Improve DOCX render fidelity#993
aldrinjenson wants to merge 15 commits into
eigenpal:mainfrom
aldrinjenson:codex/docx-render-fidelity

aldrinjenson commented Jun 22, 2026 •

edited

Loading

Uh oh!

vercel Bot commented Jun 22, 2026 •

edited

Loading

Uh oh!

eigenpal-release-pal Bot commented Jun 22, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 22, 2026 •

edited

Loading

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

Uh oh!

jedrazb commented Jun 23, 2026

Uh oh!

jedrazb Jun 23, 2026

Uh oh!

aldrinjenson Jun 23, 2026

Uh oh!

aldrinjenson commented Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

aldrinjenson commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Uh oh!

vercel Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eigenpal-release-pal Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

Uh oh!

jedrazb commented Jun 23, 2026

Uh oh!

jedrazb Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

aldrinjenson Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

aldrinjenson commented Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aldrinjenson commented Jun 22, 2026 •

edited

Loading

vercel Bot commented Jun 22, 2026 •

edited

Loading

eigenpal-release-pal Bot commented Jun 22, 2026 •

edited

Loading

greptile-apps Bot commented Jun 22, 2026 •

edited

Loading