Improve TableFormer model loading and add memory pattern optimization #18
Merged
… decoder
TableFormer's model paths (DOCLING_TABLEFORMER_ENCODER/_DECODER/_BBOX) default
to relative paths ("models/tableformer/*.onnx") when unset, unlike layout/OCR's
env vars which are consistently documented with absolute paths everywhere.
Every doc/script that sets up the manual local-model workflow (README.md,
pdf_setup.sh, pdf_conformance.sh, pdf_groundtruth.sh, performance.sh) only
exported the layout/OCR vars, never TableFormer's — so anyone running the CLI
or an embedding binding (e.g. fleischwolf-node) from any directory other than
the exact one the relative default resolves against silently got the
geometric table-reconstruction fallback instead of ML table-structure
recognition, with zero diagnostic signal.
- tableformer.rs: emit a one-time stderr note when the models can't be found,
naming the exact paths checked, instead of returning None with no trace.
- README.md + the four pdf_*.sh/performance.sh scripts: export
DOCLING_TABLEFORMER_ENCODER/_DECODER/_BBOX as absolute paths alongside
layout/OCR, so local dev setups aren't CWD-dependent (fleischwolf-node's
deps.js already did this correctly via installDependencies(); this brings
the manual-setup docs/scripts to parity with it).
- tableformer.rs: disable ONNX Runtime's memory-pattern optimizer specifically
on the decoder session. The decoder's KV-cache grows by one entry every
autoregressive step, so its input shapes differ on every `run()` call;
memory-pattern optimization assumes stable shapes to plan buffer reuse, and
strace showed the decoder's external-weights file (decoder.onnx.data) being
re-opened on what looks like every decode step. Disabling the optimizer is
`ort`'s own documented guidance for variable-input-size sessions. The effect
has not been independently verified against live models in this environment
(no models available here), so it is worth confirming with strace against a
real corpus.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DofkqhMuAJbL9arnnuVisL
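The one-time stderr note described in this commit can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in tableformer.rs: the default file names (`encoder.onnx`, `decoder.onnx`, `bbox.onnx` under `models/tableformer/`), the message wording, the helper names `model_path`, `resolved` and `locate_tableformer`, and the `WARNINGS_EMITTED` counter (present only so the once-only behavior is observable) are all assumptions.

```rust
use std::env;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

/// Demonstration-only counter: how many times the note was actually printed.
static WARNINGS_EMITTED: AtomicUsize = AtomicUsize::new(0);
static WARN_ONCE: Once = Once::new();

/// Read a path from `var`, falling back to a relative default when it is unset.
fn model_path(var: &str, default: &str) -> PathBuf {
    env::var_os(var)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(default))
}

/// Show a relative path the way the process resolves it: against the CWD.
fn resolved(path: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        env::current_dir()
            .map(|cwd| cwd.join(path))
            .unwrap_or_else(|_| path.to_path_buf())
    }
}

/// Print the note at most once per process, naming every path that was checked.
fn warn_missing_once(checked: &[PathBuf]) {
    WARN_ONCE.call_once(|| {
        WARNINGS_EMITTED.fetch_add(1, Ordering::Relaxed);
        eprintln!("note: TableFormer models not found; using geometric table reconstruction. Checked:");
        for p in checked {
            eprintln!("  {}", resolved(p).display());
        }
    });
}

/// Return the three model paths if all exist; otherwise warn once and return None.
fn locate_tableformer() -> Option<[PathBuf; 3]> {
    // Default file names below are assumed for illustration.
    let paths = [
        model_path("DOCLING_TABLEFORMER_ENCODER", "models/tableformer/encoder.onnx"),
        model_path("DOCLING_TABLEFORMER_DECODER", "models/tableformer/decoder.onnx"),
        model_path("DOCLING_TABLEFORMER_BBOX", "models/tableformer/bbox.onnx"),
    ];
    if paths.iter().all(|p| p.is_file()) {
        Some(paths)
    } else {
        warn_missing_once(&paths);
        None
    }
}

fn main() {
    // Repeated lookups print the note at most once.
    for _ in 0..3 {
        let _ = locate_tableformer();
    }
}
```

Printing the CWD-resolved form of each path is what makes the failure diagnosable: a relative default such as `models/tableformer/encoder.onnx` points somewhere different for every working directory, which is why the scripts now export absolute paths.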
…fig setup
installDependencies() could only auto-fetch pdfium and the OCR model; the
layout model (required for any PDF/image conversion) had no public download,
so every user hit a hard error on first use and had to either run the Python
export toolchain locally or find/host the files themselves — with no
convenient way to do either from a plain Node app.
Both models are permissively licensed (docling-project/docling-layout-heron,
Apache-2.0; docling-project/docling-models, CDLA-Permissive-2.0/Apache-2.0),
so fleischwolf can redistribute the ONNX export with attribution:
- .github/workflows/publish-models.yml: manual workflow that runs the existing
scripts/export_{layout,tableformer}.py and publishes the resulting .onnx
files as GitHub Release assets (tag models-v1); re-running re-uploads,
handling the optional TableFormer decoder.onnx.data external-weights
sidecar generically (checked per-file, not hardcoded).
- MODELS_NOTICE.md: attribution for the redistributed models.
- deps.js: installDependencies() now defaults `modelsUrl` to that release, so
it works with zero configuration; `{ modelsUrl }` / FLEISCHWOLF_MODELS_URL
still override for a custom export/host. Flattens tableformer/*.onnx to
tableformer-*.onnx for the fetch (GitHub release assets can't contain "/"),
and both error messages (installDependencies, assertMlReady) now print a
numbered, copy-pasteable troubleshooting guide instead of a single dense
paragraph, shown only when the fetch itself fails.
- scripts/setup_nodejs_dependencies.sh: a curl-pipeable wrapper — run from a
Node app's directory to pre-fetch everything (installs the `fleischwolf` npm
dep if missing, then drives installDependencies()) ahead of time, e.g. in a
container build step.
- READMEs (root + fleischwolf-node): document the zero-config flow and the new
script.
Verified deps.js end-to-end against the real (not-yet-published) release URL:
pdfium/OCR download correctly, the 404s on the unpublished model assets are
caught per-file (TableFormer skips gracefully, layout falls through to the
guided error) rather than crashing the whole install.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DofkqhMuAJbL9arnnuVisL
…ependencies.sh
installDependencies() duplicated what a plain bash script does better: fetch
pdfium + the ONNX models straight into ./models and ./.pdfium, which both the
Rust CLI and the Node bindings already default to relative to CWD. Removes the
JS download machinery entirely (deps.js keeps only path resolution + status
checks), deletes the now-pointless setup_nodejs_dependencies.sh wrapper, and
adds the CWD-relative .pdfium/lib fallback to pdfium_backend.rs so the Rust
side needs no env vars either.
publish-models.yml now also re-hosts pdfium + the OCR model (previously only
layout/TableFormer), so download_dependencies.sh talks to a single release.
TableFormer assets are published under their bare names (encoder.onnx,
decoder.onnx, bbox.onnx) instead of a tableformer- prefix, matching the
script's fixed target paths.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DofkqhMuAJbL9arnnuVisL
Summary
This PR improves the TableFormer ONNX model loading experience by adding better diagnostics for missing models and optimizing ONNX Runtime memory usage patterns. It also ensures environment variables are properly exported in setup scripts so TableFormer works correctly regardless of the process's working directory.
Key Changes
- Added `warn_missing_once()` function: Emits a single per-process warning to stderr when TableFormer models aren't found, explaining the fallback to geometric reconstruction and directing users to set environment variables. This replaces silent failures that were particularly problematic for embedding applications or processes not run from the repo root.
- Optimized ONNX Runtime memory patterns: Modified the `build()` closure to accept a `mem_pattern` parameter and disable memory pattern optimization for the decoder session. This avoids repeated re-validation of the memory plan on each autoregressive step, since the decoder's KV-cache grows by one entry per step, causing input shapes to differ on every `run()` call.
- Updated setup and test scripts: Added explicit exports of `DOCLING_TABLEFORMER_ENCODER`, `DOCLING_TABLEFORMER_DECODER`, and `DOCLING_TABLEFORMER_BBOX` environment variables in:
  - `scripts/pdf_setup.sh` (conditional export when models exist)
  - `scripts/pdf_conformance.sh` (with defaults)
  - `scripts/pdf_groundtruth.sh` (with defaults)
  - `scripts/performance.sh` (conditional export when models exist)
- Updated README.md: Added documentation explaining the optional TableFormer environment variables, the silent fallback behavior, and why explicit configuration is recommended when invoking from non-repo-root directories.
Implementation Details
- The decoder session disables memory pattern optimization (`mem_pattern=false`) while encoder and bbox use it (`mem_pattern=true`), reflecting the decoder's dynamic shape requirements.
- The warning uses `std::sync::Once` to ensure it fires exactly once per process, avoiding log spam while still making the issue visible.

https://claude.ai/code/session_01DofkqhMuAJbL9arnnuVisL
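To make the shape argument concrete, here is a toy model of a planner that reuses a buffer plan only when it has seen the exact input shape before. It is not ONNX Runtime's implementation and does not call `ort`; `PlanCache`, the two session functions, and the tensor shapes (`[1, 3, 448, 448]`, hidden size 512) are hypothetical and chosen only to show why a growing KV-cache defeats shape-keyed reuse.

```rust
use std::collections::HashSet;

/// Toy stand-in for a planner that reuses a buffer plan per distinct input shape.
/// NOT ONNX Runtime's implementation, only a model of the stable-shape assumption.
struct PlanCache {
    seen: HashSet<Vec<usize>>,
    hits: usize,
    misses: usize,
}

impl PlanCache {
    fn new() -> Self {
        PlanCache { seen: HashSet::new(), hits: 0, misses: 0 }
    }

    /// Record one `run()` call with the given input shape.
    fn run(&mut self, shape: &[usize]) {
        if self.seen.insert(shape.to_vec()) {
            self.misses += 1; // first time this shape is seen: a new plan is needed
        } else {
            self.hits += 1; // shape seen before: the existing plan can be reused
        }
    }
}

/// Encoder-like session: the same input shape on every call.
/// Returns (hits, misses).
fn fixed_shape_session(steps: usize) -> (usize, usize) {
    let mut cache = PlanCache::new();
    for _ in 0..steps {
        cache.run(&[1, 3, 448, 448]); // hypothetical image tensor shape
    }
    (cache.hits, cache.misses)
}

/// Decoder-like session: the KV-cache length grows by one entry per step.
/// Returns (hits, misses).
fn growing_kv_session(steps: usize) -> (usize, usize) {
    let mut cache = PlanCache::new();
    for step in 0..steps {
        cache.run(&[1, step + 1, 512]); // hypothetical [batch, cached_tokens, hidden]
    }
    (cache.hits, cache.misses)
}

fn main() {
    let (h, m) = fixed_shape_session(10);
    println!("fixed shape: {h} hits, {m} misses");
    let (h, m) = growing_kv_session(10);
    println!("growing KV-cache: {h} hits, {m} misses");
}
```

In this model the fixed-shape session pays for one plan and reuses it afterwards, while the decoder-like session never reuses anything, so the planning work is pure overhead there. In `ort` 2.x the corresponding toggle appears to be `SessionBuilder::with_memory_pattern`; that name should be checked against the `ort` version actually in use.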