fix(BEE): recover unmatched tokens in table last row when model under-predicts row count #3728

Open
AnkitAhlawat7742 wants to merge 2 commits into
docling-project:mainfrom
AnkitAhlawat7742:missing_table_data
Conversation

@AnkitAhlawat7742

Contributor

Fix #3402

Issue description
When the last row of a table sits at the very bottom edge of the table image crop, the TableFormer model fails to predict it as a separate row. The tokens are present in the input token list and lie well within the page bounds, but the model does not output a row for them.

Resolution

This PR adds a post-processing recovery step in TableStructureModel.predict_tables that detects word-level tokens not assigned to any predicted cell and appends them as an extra trailing row.
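
For readers who want a concrete picture of the recovery idea, here is a minimal, self-contained sketch. It is not the code in this PR: `Box`, `Token`, `Cell` and `recover_trailing_row` are hypothetical names, coordinates are assumed to use a top-left origin, and the rule that assigns a token to the nearest column of the last predicted row is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box with a top-left origin (assumption): l, t, r, b."""
    l: float
    t: float
    r: float
    b: float

    def contains_center_of(self, other: "Box") -> bool:
        cx = (other.l + other.r) / 2
        cy = (other.t + other.b) / 2
        return self.l <= cx <= self.r and self.t <= cy <= self.b


@dataclass
class Token:
    text: str
    box: Box


@dataclass
class Cell:
    row: int
    col: int
    box: Box
    text: str = ""


def recover_trailing_row(cells, tokens, table_box):
    """Append one extra row built from tokens that no predicted cell covers."""
    if not cells:
        return cells
    last_row = max(c.row for c in cells)
    last_row_bottom = max(c.box.b for c in cells if c.row == last_row)

    # Orphans: inside the table crop, below the last predicted row, in no cell.
    orphans = [
        t for t in tokens
        if table_box.contains_center_of(t.box)
        and (t.box.t + t.box.b) / 2 > last_row_bottom
        and not any(c.box.contains_center_of(t.box) for c in cells)
    ]
    if not orphans:
        return cells

    # Reuse the horizontal extents of the last predicted row as column guides.
    columns = {c.col: (c.box.l, c.box.r) for c in cells if c.row == last_row}
    new_cells = {}
    for tok in sorted(orphans, key=lambda t: t.box.l):
        cx = (tok.box.l + tok.box.r) / 2
        # Pick the column whose horizontal center is closest to the token.
        col = min(columns, key=lambda k: abs(sum(columns[k]) / 2 - cx))
        if col not in new_cells:
            b = tok.box
            new_cells[col] = Cell(last_row + 1, col, Box(b.l, b.t, b.r, b.b), tok.text)
        else:
            cell = new_cells[col]
            cell.text += " " + tok.text
            cell.box = Box(min(cell.box.l, tok.box.l), min(cell.box.t, tok.box.t),
                           max(cell.box.r, tok.box.r), max(cell.box.b, tok.box.b))
    return cells + [new_cells[k] for k in sorted(new_cells)]
```

In this sketch only tokens below the last predicted row are recovered, so tokens that went unmatched elsewhere in the table are left alone.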

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

DCO Check Passed

Thanks @AnkitAhlawat7742, all your commits are properly signed off. 🎉

@mergify

mergify Bot commented Jul 1, 2026

Contributor

Merge Protections

🟢 Merge protection satisfied — ready to merge.

🟢 Enforce conventional commit

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

I, Ankit.Ahlawat@ibm.com <Ankit.Ahlawat@ibm.com>, hereby add my Signed-off-by to this commit: 8d471bc

Signed-off-by: Ankit.Ahlawat@ibm.com <Ankit.Ahlawat@ibm.com>
@cau-git

cau-git commented Jul 1, 2026

Member

@AnkitAhlawat7742 Thanks for the proposal. Before we resort to recovery-style post-processing, could you please check instead if enlarging the table detection a bit towards the bottom would cover this as well? Then we can see if that has undesired side effects on previously working tables and decide.
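
A rough sketch of what such an experiment could look like, assuming the detected table box is an `(l, t, r, b)` tuple with a top-left origin. `expand_bottom` and its `pad` parameter are hypothetical and do not correspond to an existing docling option.

```python
def expand_bottom(bbox, page_height, pad=8.0):
    """Return bbox with its bottom edge moved down by `pad`, clamped to the page.

    bbox is (l, t, r, b) with a top-left origin (assumption); `pad` is a
    hypothetical tuning knob in the same units as the box coordinates.
    """
    l, t, r, b = bbox
    return (l, t, r, min(b + pad, page_height))
```

Running the existing table test set once with `pad=0` and once with a small positive value would show whether the extra margin recovers the missing row without changing previously correct tables.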

Development

Successfully merging this pull request may close these issues.

[BEE] Table model missing the last part of the table in the Japanese document