T035: Implement PaddleOCR-VL-1.6 local OCR adapter by likith1908 · Pull Request #35 · auropro-hyd/IRIS

likith1908 · 2026-06-11T09:40:33Z

Summary

Implements T035: the PaddleOCR-VL-1.6 local OCR adapter. Unlike the ADI and Datalab adapters which call remote HTTP services, this adapter runs inference entirely on-device using the paddleocr library and a pre-downloaded HuggingFace model snapshot. Also includes two correctness fixes in the Datalab adapter and a pytest module-collision fix that was only visible once all three adapter test suites existed side-by-side.

packages/iris-adapters/ocr-paddleocr/src/iris_ocr_paddleocr/client.py: PaddleOCREngine - PDF rasterisation via PyMuPDF at 150 DPI, per-page inference via PaddleOCRVL.predict(), result mapping from PaddleOCRVLResult.parsing_res_list to OCRPageResult; offline mode controlled by IRIS_PADDLEOCR_OFFLINE=1 + IRIS_PADDLEOCR_MODEL_PATH; raises OCRUnavailable at init if offline flag is set without model path
packages/iris-adapters/ocr-paddleocr/src/iris_ocr_paddleocr/__init__.py: re-exports PaddleOCREngine
packages/iris-adapters/ocr-paddleocr/pyproject.toml: dependencies (paddlepaddle>=3.1, paddleocr[doc-parser]>=3.6.0, pymupdf, huggingface_hub>=0.23, iris-engine); paddlepaddle is now an explicit direct dep so it survives any uv sync call without requiring --all-packages
packages/iris-adapters/ocr-paddleocr/tests/test_unit.py: 29 unit tests, all inference mocked via _pipeline injection seam; covers C-OCR-001 through C-OCR-010 and C-OCR-LOCAL-001
packages/iris-adapters/ocr-paddleocr/tests/test_live.py: C-OCR-LIVE-001, gated on IRIS_OCR_LIVE_PADDLEOCR=1 and pytest.mark.slow
packages/iris-adapters/ocr-datalab/src/iris_ocr_datalab/client.py: added max_poll_seconds=300.0 + deadline check in _poll() - was an unbounded loop (carried forward from code review)
packages/iris-adapters/ocr-datalab/tests/test_unit.py: added test_poll_deadline_exceeded_raises_unavailable covering the new deadline path
packages/iris-adapters/ocr-adi/tests/__init__.py: deleted - causes module name collision under --import-mode=importlib
packages/iris-adapters/ocr-datalab/tests/__init__.py: deleted - same reason
packages/iris-adapters/ocr-adi/tests/test_live.py: added pytest.mark.slow to pytestmark
packages/iris-adapters/ocr-datalab/tests/test_live.py: added pytest.mark.slow to pytestmark
pyproject.toml: iris_ocr_paddleocr added to coverage sources, mypy overrides, importlinter; ruff exclude added for PaddleOCR-VL-1.6-snapshot/
.gitignore: PaddleOCR-VL-1.6-snapshot/ added (1.8 GB model directory, local-only, never committed)
.env.example: PaddleOCR offline mode vars and model download instructions added
uv.lock: updated for new dependencies

Task reference

Task ID: T035
Workstream: 003-ocr-adapter-set

Acceptance criteria

T035

PR checklist:

Every acceptance criterion in the task's tasks.md entry is satisfied.
docs-ci workflow passes (markdown lint + tasks structural check).
No em-dashes introduced in new prose.
All internal links resolve.
No secrets, credentials, or personal names introduced.

Live test result

Run against the local PaddleOCR-VL-1.6-snapshot in offline mode with a PNG fixture.

adapter_id  : paddleocr
total_pages : 1
latency_ms  : ~15 000 (cold model load; subsequent calls are faster)
confidence  : 1.0
bboxes      : 1
markdown    : IRIS live test

All pipeline stages loaded from cache. Layout detection (PP-DocLayoutV3) used the ModelScope-cached copy at ~/.paddlex/official_models/PP-DocLayoutV3/. VL recognition used the local HF snapshot at PaddleOCR-VL-1.6-snapshot/.

Notes for the reviewer

PaddleOCR library path, not transformers

Two inference paths exist for PaddleOCR-VL-1.6. The transformers path (AutoModelForImageTextToText) returns raw generated text only with no bounding box structure, meaning bboxes would always be []. The paddleocr library path (PaddleOCRVL) returns structured output with real bboxes. C-OCR-004 requires valid bounding box coordinates, so the paddleocr library path was the only viable choice.

VLM result structure differs from classic OCR

Classic PaddleOCR returns rec_texts, rec_scores, and dt_polys arrays. PaddleOCR-VL-1.6 returns a PaddleOCRVLResult that is dict-like: result["parsing_res_list"] yields a list of PaddleOCRVLBlock objects, each with .label, .bbox ([x1, y1, x2, y2] integers), and .content (text string). The original implementation assumed the classic structure and was completely rewritten once the real API was read from the installed library source. Unit test mocks (_MockResult, _MockBlock) reflect the actual interface.
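
A minimal sketch of the mapping described above, using a stand-in class rather than the real paddleocr types. `Block`, `map_page`, and the rule of joining block text with blank lines are assumptions for illustration; only the `parsing_res_list` key and the `.label` / `.bbox` / `.content` attributes come from the notes above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Stand-in for PaddleOCRVLBlock (hypothetical; mirrors .label/.bbox/.content)."""

    label: str
    bbox: List[int]  # [x1, y1, x2, y2]
    content: str


def map_page(result) -> Tuple[str, List[List[int]]]:
    """Collect page text and rects from a dict-like result's parsing_res_list."""
    blocks = result["parsing_res_list"]
    # Joining with blank lines is an assumption, not necessarily the adapter's rule.
    markdown = "\n\n".join(b.content for b in blocks if b.content)
    bboxes = [list(b.bbox) for b in blocks]
    return markdown, bboxes


page = {"parsing_res_list": [Block("text", [10, 20, 110, 40], "IRIS live test")]}
markdown, bboxes = map_page(page)
```

A plain dict stands in for the dict-like result here, which is also roughly how a mock such as `_MockResult` could be shaped in unit tests.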

Offline mode and model path

vl_rec_model_dir is the correct parameter name for pointing the VL recognition sub-model at a local directory. The parameter name model_dir (a guess based on classic PaddleOCR) is rejected by the library at runtime with ValueError: Unknown argument: model_dir. vl_rec_model_dir is only passed when both offline=True and model_path is set; online mode calls PaddleOCRVL(pipeline_version="v1.6") with no model path argument.
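
The argument selection above can be sketched without importing paddleocr. `build_pipeline_kwargs` and the local `OCRUnavailable` class are hypothetical stand-ins; the `pipeline_version` and `vl_rec_model_dir` names are the ones quoted in this note.

```python
from typing import Dict, Optional


class OCRUnavailable(RuntimeError):
    """Stand-in for the adapter's OCRUnavailable error (hypothetical definition)."""


def build_pipeline_kwargs(offline: bool, model_path: Optional[str]) -> Dict[str, str]:
    """Return keyword arguments intended for PaddleOCRVL(...)."""
    if offline and not model_path:
        # Fail fast at construction instead of failing at first use.
        raise OCRUnavailable("offline mode requires a model path")
    kwargs = {"pipeline_version": "v1.6"}
    if offline and model_path:
        # Only point the VL recognition sub-model at a local directory in offline mode.
        kwargs["vl_rec_model_dir"] = model_path
    return kwargs


online_kwargs = build_pipeline_kwargs(offline=False, model_path=None)
offline_kwargs = build_pipeline_kwargs(offline=True, model_path="PaddleOCR-VL-1.6-snapshot")
```

The resulting dict would then be splatted into the constructor, for example `PaddleOCRVL(**offline_kwargs)`.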

BBox format is axis-aligned rect, not polygon

Classic PaddleOCR returns 4-point polygons ([[x0,y0],[x1,y1],[x2,y2],[x3,y3]]). PaddleOCR-VL-1.6 returns axis-aligned rectangles ([x1, y1, x2, y2]). The original _poly_to_bbox function was replaced with _rect_to_bbox. Negative coordinates are clamped to 0 and degenerate rects (x1==x2 or y1==y2) get a minimum size of 1 to satisfy C-OCR-004.
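
A sketch of the clamping rule, assuming the target bbox is expressed as `(x, y, width, height)`; the actual field layout of the contract's bbox type is not shown in this PR body, so `rect_to_bbox` below is illustrative rather than a copy of `_rect_to_bbox`.

```python
from typing import Sequence, Tuple


def rect_to_bbox(rect: Sequence[int]) -> Tuple[int, int, int, int]:
    """Convert [x1, y1, x2, y2] to (x, y, width, height) with clamping."""
    x1, y1, x2, y2 = rect
    # Negative coordinates are clamped to 0.
    x1, y1, x2, y2 = max(0, x1), max(0, y1), max(0, x2), max(0, y2)
    # Degenerate rects (zero width or height) get a minimum size of 1.
    width = max(1, x2 - x1)
    height = max(1, y2 - y1)
    return x1, y1, width, height


normal = rect_to_bbox([10, 20, 110, 40])
clamped = rect_to_bbox([-5, 10, 20, 10])
```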

Out of scope additions (beyond T035 acceptance)

| Addition | Reason |
| --- | --- |
| `DatalabOCREngine._poll()` deadline guard | `_poll()` was an unbounded `while True` loop. The `httpx` timeout covers individual requests but not the loop. A stuck Datalab job would hang the caller forever. ADI had `max_poll_seconds` from the start; Datalab was missing it. Found during cross-adapter code review while implementing T035. |
| `test_poll_deadline_exceeded_raises_unavailable` in Datalab tests | Covers the new deadline path; mirrors the existing ADI test of the same name. |
| Delete `ocr-adi/tests/__init__.py` and `ocr-datalab/tests/__init__.py` | With `--import-mode=importlib`, pytest walks up from the test file looking for `__init__.py`. All three adapter test dirs had `__init__.py` but their parent package dirs did not, so all three resolved to `tests.test_unit`. ADI's module was cached first; Datalab and PaddleOCR tests ran ADI code. Deleting `__init__.py` gives each file a unique hash-based module name. This reversed the fix added in T034 - the T034 fix was the wrong fix. |
| `pytest.mark.slow` on ADI and Datalab live tests | Live tests load a 1B model or make real HTTP calls. The marker makes intent explicit alongside the `skipif` guard and is consistent across all three adapters. |
| ruff `exclude` for `PaddleOCR-VL-1.6-snapshot/` | The HF snapshot directory contains third-party Python files (`configuration_paddleocr_vl.py`, `image_processing_paddleocr_vl.py`) that fail ruff checks. These are upstream files and should not be linted. |
| `.gitignore` entry for `PaddleOCR-VL-1.6-snapshot/` | 1.8 GB model directory. Must never be committed. |
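
For reference, the shape of the deadline guard from the first row can be sketched as below. This is not the Datalab client code: `poll_until_done`, the `check` callable, the `interval` parameter, and the local `OCRUnavailable` class are hypothetical; only `max_poll_seconds=300.0` and the idea of a deadline check inside the loop come from this PR.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


class OCRUnavailable(RuntimeError):
    """Stand-in for the adapter's OCRUnavailable error (hypothetical definition)."""


async def poll_until_done(
    check: Callable[[], Awaitable[Optional[str]]],
    max_poll_seconds: float = 300.0,
    interval: float = 1.0,
) -> str:
    """Call check() until it returns a result or the overall deadline passes."""
    deadline = time.monotonic() + max_poll_seconds
    while True:
        result = await check()
        if result is not None:
            return result
        # Per-request timeouts do not bound this loop, so check the deadline here.
        if time.monotonic() >= deadline:
            raise OCRUnavailable(f"job not finished after {max_poll_seconds} s")
        await asyncio.sleep(interval)


async def finished_job() -> Optional[str]:
    """Pretend the remote job completed on the first poll."""
    return "complete"


status = asyncio.run(poll_until_done(finished_job, max_poll_seconds=1.0, interval=0.01))
```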

Considerations

Confidence is always 1.0 for this adapter
The VLM backbone does not produce a token-level confidence score per detected region. There is no numeric quality signal available from the pipeline. The adapter returns confidence=1.0 as a placeholder. test_c005_vl_confidence_is_one documents this behaviour explicitly. If the contract is later tightened to require a real signal, this adapter would need either a different model or a post-hoc heuristic (e.g. character recognition score from a secondary pass).

PP-DocLayoutV3 sub-model is not covered by IRIS_PADDLEOCR_OFFLINE=1
IRIS_PADDLEOCR_OFFLINE=1 prevents HuggingFace network calls only. The PaddleOCR-VL pipeline internally uses a second model, PP-DocLayoutV3 (layout detection), which auto-downloads from ModelScope (Alibaba) on first use. For a fully airgapped deployment, PP-DocLayoutV3 must also be pre-staged locally and passed via layout_detection_model_dir. This is a follow-on item. The current env var set is sufficient for development and for environments where ModelScope is reachable.

GPU is required for production throughput
CPU inference is measured at approximately 7-8 minutes per page (tested on a 2-page, 97 KB PDF on the dev machine without a GPU). The production Docker image must install paddlepaddle-gpu from the PaddlePaddle custom index after the base install. The base make install installs the CPU wheel only, which is correct for CI and local development.

paddlepaddle-gpu is not on PyPI
PaddlePaddle distributes GPU wheels from their own index, not PyPI. A [gpu] optional extra was attempted but uv sync failed because the resolver cannot reach non-PyPI sources declared in extras. GPU installation must be handled as a post-step in the production Docker image and is documented in .env.example.

DPI choice for PDF rasterisation
150 DPI is the default. 72-96 DPI produces images too small for reliable VLM reading on dense documents. 300 DPI increases image size and inference time significantly. 150 DPI is configurable via the dpi constructor parameter if a specific deployment requires adjustment.
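
As a rough illustration of the trade-off, the arithmetic below assumes PDF page sizes are given in points at 72 points per inch and uses an A4 page of roughly 595 x 842 pt. `pixel_size` is a hypothetical helper, and the rasteriser's own rounding may differ by a pixel.

```python
from typing import Tuple


def pixel_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Approximate raster size in pixels for a page measured in PDF points."""
    scale = dpi / 72  # PDF user space is 72 points per inch
    return round(width_pt * scale), round(height_pt * scale)


a4_at_72 = pixel_size(595, 842, 72)
a4_at_150 = pixel_size(595, 842, 150)
a4_at_300 = pixel_size(595, 842, 300)

# Pixel count grows with the square of the DPI ratio.
pixels_150 = a4_at_150[0] * a4_at_150[1]
pixels_300 = a4_at_300[0] * a4_at_300[1]
```

Doubling the DPI from 150 to 300 roughly quadruples the number of pixels per page, which is consistent with the inference-time concern above.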

Rebase note

Rebased onto main (T034 merged as #34). Stack is clean.

Live threading verification

Confirmed on 8289d4d (HEAD, with explicit paddlepaddle dep): IRIS_OCR_LIVE_PADDLEOCR=1 uv run pytest -m slow packages/iris-adapters/ocr-paddleocr/tests/test_live.py - 1 passed in 40.61 s (cold model load). PaddlePaddle inference runs correctly from an asyncio.to_thread worker. The concern flagged by Anmol is resolved.

anmolg1997

Strong adapter — the PaddleOCRVL-vs-transformers investigation and the honest documentation of what the VLM does and doesn't provide are exactly the engineering judgment this task needed. Two changes before merge, one housekeeping item.

Change 1: inference blocks the event loop

extract() is async, but everything inside it is synchronous compute: PyMuPDF rasterisation, np.array(image), and — the big one — pipeline.predict(), which is seconds of CPU/GPU inference per page (your own live test logged ~15s cold). Unlike the ADI and Datalab adapters, where every await yields to the loop during network I/O, this adapter holds the event loop hostage for the full inference duration. In the API or worker process, one PaddleOCR extraction freezes every concurrent request — including /healthz.

best-practices.md section 1.1 sets the async rule for I/O; heavy compute in an async context is its sibling: offload to a thread. The fix is two lines:

async def extract(self, ctx, document_id, content, content_type) -> OCRResult:
    ...
    start = time.monotonic()
    images = await asyncio.to_thread(_to_images, content, content_type, self._dpi)
    ocr_pages = await asyncio.to_thread(
        lambda: [_run_page(self._pipeline, img, page_number=i + 1) for i, img in enumerate(images)]
    )
    ...

Existing tests keep passing (asyncio.run drives to_thread fine). One thing to verify when you make the change: PaddlePaddle inference from a non-main thread — the live test will confirm. If the library objects to it, fall back to documenting the constraint and we'll address it at the worker level with a process pool in a later workstream; but try the thread first, it almost always just works.
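
To make the effect concrete, here is a self-contained sketch with no Paddle involved. `blocking_inference` is a hypothetical stand-in for `pipeline.predict()`, and the heartbeat task plays the role of a concurrent request such as /healthz; the timings are arbitrary.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_inference(seconds: float) -> str:
    """Stand-in for pipeline.predict(): blocks the calling thread."""
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks: List[float], stop: asyncio.Event) -> None:
    """Record a tick every 10 ms for as long as the event loop is free to run."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main() -> Tuple[str, int]:
    ticks: List[float] = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    # The blocking call runs in a worker thread, so the heartbeat keeps ticking.
    result = await asyncio.to_thread(blocking_inference, 0.2)
    stop.set()
    await beat
    return result, len(ticks)


result, tick_count = asyncio.run(main())
```

Calling `blocking_inference(0.2)` directly inside `main()` instead would leave the heartbeat with no chance to run until the call returned, which is the freeze described above.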

Change 2: T035 not flipped in tasks.md

tasks/003-ocr-adapter-set/tasks.md still shows - [ ] **T035**. Flip it and — given the offline-mode caveat below — extend the acceptance note the same way you did for T034's bbox limitation:

Note: offline mode (vl_rec_model_dir) covers the VL recognition sub-model only; layout-detection (PP-DocLayoutV3) and doc-preprocessor sub-models are fetched from PaddlePaddle's CDN on first use unless their model dirs are also supplied or ~/.paddlex/official_models/ is pre-seeded. Full airgap requires pre-seeding; document in T039.

That caveat is currently only visible in a code comment in _load_pipeline(). It's the difference between "works offline" and "works airgapped", and an ops person reading tasks.md should see it. Please also make sure T039's adapter README covers the pre-seed step (~/.paddlex cache + HF snapshot + TRANSFORMERS_OFFLINE=1).

Housekeeping: the Datalab deadline fix is in the wrong PR

This branch carries the max_poll_seconds fix for the Datalab client that my #34 review requested — but #34's own branch doesn't have it. Stacked-PR mechanics: when #34 merges without it, the fix only lands when #35 merges, and #34's merge commit will contain the unbounded loop. Please cherry-pick max_poll_seconds + its test down into the T034-datalab-adapter branch (it'll then drop out of this PR's diff on rebase). Same for the tests/__init__.py deletions and the pytest.mark.slow additions to the ADI/Datalab live tests if those logically belong with #34's collision fix — your call on which PR owns them, but each fix should be in the PR whose feature it patches.

What I verified (all good)

Implementation

PaddleOCRVL library path over transformers — right call, and the reasoning (bbox structure vs raw text, C-OCR-004 compliance) is documented in the PR body and the module docstring. The note that the original implementation was rewritten after reading the installed library source is the kind of honesty that makes reviews fast.
Injection seam (_pipeline) bypasses model loading entirely — unit tests run without paddleocr installed since the import is lazy inside _load_pipeline(). 29 clause-named tests, all mocked via _MockResult/_MockBlock mirroring the real parsing_res_list interface.
Fail-fast at construction when offline=True without model_path — raises at init rather than hanging at first use. Tested.
PDF rasterisation at 150 DPI via PyMuPDF with malformed-PDF and zero-page guards; multi-frame TIFF → one page per frame; Pillow decode errors → OCRMalformedDocument.
_rect_to_bbox clamps to non-negative x/y and ≥1 width/height — C-OCR-004 safe.
Confidence fixed at 1.0 (VLM emits no per-block score) with 0.0 for empty results — documented compromise, consistent with how Datalab's missing-score case was handled, clause-tested.

Infra

.gitignore for the 1.8 GB snapshot, ruff exclude, coverage source extended to iris_ocr_paddleocr, model-download instructions in ocr-paddleocr/pyproject.toml comments and .env.example.
CI stays fast (0.2-0.3 min/job) — uv caching absorbs the paddle dependency weight.
Live test gated on both IRIS_OCR_LIVE_PADDLEOCR=1 and pytest.mark.slow, with the documented live result in the PR body (real bboxes from the local snapshot).

With the thread offload + tasks.md flip + fix relocation, this is ready. The three-adapter pattern is now established well enough that T036 (Tesseract) should be the fastest of the four.

anmolg1997

Approved on 94ec43d. Both code changes landed correctly. One verification gap to close manually before this adapter is wired into the worker — flagging it explicitly rather than letting it ride.

What's fixed

Change 1 — event-loop offload. extract() now wraps both _to_images and the per-page inference loop in asyncio.to_thread(...). Correct shape: the blocking PyMuPDF rasterisation and the seconds-long pipeline.predict() no longer hold the event loop. The pipeline = self._pipeline local before the lambda avoids capturing self in the closure — clean.

Change 2 — tasks.md. T035 flipped to [x] with the airgap caveat verbatim (VL sub-model is local; layout-detection + doc-preprocessor still hit PaddlePaddle's CDN unless ~/.paddlex is pre-seeded; full airgap documented in T039). An ops reader now sees the real offline boundary.

Fix relocation: done right. The Datalab max_poll_seconds fix is no longer in this diff; it's in #34 where it belongs. The pytest.mark.slow markers on the ADI/Datalab live tests and the tests/__init__.py deletions are the cross-adapter housekeeping that surfaced with the third suite. They are fine to ride here and will resolve on rebase once #34 merges.

Verified locally

PaddleOCR suite: 29 passed, 1 deselected (live, slow).
Full default suite: 273 passed, 15 deselected — no collision across all four adapter test modules under importlib mode.
make lint: 3 contracts kept, 0 broken.

The one thing to verify before relying on this (NOT blocking the merge)

The "Live test result" block in the PR body shows the ~15s cold-load run — but that run predates the to_thread commit. It exercised the synchronous path. The unit tests that cover the threaded path mock the pipeline, so PaddlePaddle inference from an asyncio.to_thread worker thread has not actually been exercised against the real model.

This is the native-ML-lib edge case I flagged in the last review: Paddle's predict() usually works fine off the main thread, but some builds carry main-thread assumptions (signal handlers, CUDA context affinity, paddle's own thread-local state). It can't be caught in CI — it needs the 1.8 GB snapshot on real hardware.

Action: re-run test_live.py on the current threaded HEAD and confirm it still returns the expected result. If it works (most likely), drop a one-line note in the PR body updating the live-test section. If Paddle objects to the worker thread, the fallback is a single-worker ProcessPoolExecutor or documenting the constraint for the worker-integration task — but try the thread first. Please confirm before T038/worker wiring depends on this path; I don't want an unverified-threading assumption baked into the worker later.

Approving because the code is correct, the pattern is right, and every gate that can run is green. The above is a manual confirmation, not a code change.

Solid work on the offload — the lambda-capture detail especially.

Core adapter (iris-ocr-paddleocr):
- PaddleOCREngine backed by PaddleOCRVL pipeline (v1.6), satisfying contract clauses C-OCR-001 through C-OCR-010 and C-OCR-LOCAL-001
- PDF rasterisation via PyMuPDF at 150 DPI; PNG/JPEG/TIFF accepted directly
- Offline mode: IRIS_PADDLEOCR_OFFLINE=1 + IRIS_PADDLEOCR_MODEL_PATH point the VL recognition sub-model at a local HuggingFace snapshot directory; startup raises OCRUnavailable immediately if offline flag is set without path
- Result mapping from PaddleOCRVLResult (dict-like, parsing_res_list) to OCRPageResult; VLM returns no per-block confidence so confidence=1.0
- BBox format is axis-aligned [x1,y1,x2,y2] rect, not polygon - converted via _rect_to_bbox with negative-coord clamp and min-size=1 enforcement
- 29 unit tests, 96% coverage; live test gated on IRIS_OCR_LIVE_PADDLEOCR=1

Fixes carried forward from code review (Datalab / ADI):
- DatalabOCREngine._poll() was an unbounded loop; added max_poll_seconds=300 and deadline check mirroring the ADI pattern (test added)
- ocr-adi/tests/__init__.py and ocr-datalab/tests/__init__.py deleted; with importlib mode these caused all three adapter test_unit modules to resolve to the same module name, running ADI tests for every adapter
- All three adapter live test files now carry pytest.mark.slow alongside their skipif guard so intent is explicit when running the full suite

Workspace changes:
- Root pyproject.toml: iris_ocr_paddleocr added to coverage sources, mypy overrides, importlinter; ruff exclude added for PaddleOCR-VL-1.6-snapshot
- .gitignore: PaddleOCR-VL-1.6-snapshot/ added (1.8 GB model, local-only)
- .env.example: PaddleOCR offline mode vars and model download instructions
- huggingface_hub>=0.23 added to ocr-paddleocr dependencies

extract() was holding the event loop during PyMuPDF rasterisation and pipeline.predict() (up to ~15s per doc). Wrapped both _to_images and the per-page inference loop in asyncio.to_thread so concurrent requests are not frozen during OCR. Also flipped T035 to [x] in tasks.md and added the offline/airgap caveat about PP-DocLayoutV3 and doc-preprocessor sub-models requiring ~/.paddlex pre-seeding for a true airgap.

The base branch was changed.

anmolg1997

Re-approving on 773ad0d — pure rebase onto main after #34 merged (the two substantive commits are byte-identical to what I approved; the third is an empty CI-trigger). All 10 checks green, paddleocr suite 29 passed, lint clean, 0 unresolved threads.

Reminder, still non-blocking: the live-test block in the body documents the pre-threading ~15s run. PaddlePaddle inference from an asyncio.to_thread worker thread is still unverified against the real model. Fine to merge now, but please re-run test_live.py on this HEAD and confirm before T038 wires spans around it or the worker calls into it. Bottom of the stack now — merge this, then #36 and #37 rebase behind it.

paddlepaddle was already resolved as a transitive dep via paddleocr/paddlex but was not declared explicitly. A partial-sync of the venv (e.g. during rebase operations) could leave it missing until a full uv sync --all-packages ran. Declaring it explicitly ensures it is always present after any uv sync.

likith1908 · 2026-06-16T08:34:01Z

Re-ran live test on current HEAD (8289d4d) after the explicit paddlepaddle dep commit:

(iris-workspace) user@Ubuntu:~/Projects/IRIS$ IRIS_OCR_LIVE_PADDLEOCR=1 uv run pytest -m slow packages/iris-adapters/ocr-paddleocr/tests/test_live.py
==================================================== test session starts ====================================================
platform linux -- Python 3.12.13, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/user/Projects/IRIS
configfile: pyproject.toml
plugins: anyio-4.13.0, cov-7.1.0
collected 1 item                                                                                                            

packages/iris-adapters/ocr-paddleocr/tests/test_live.py .                                                             [100%]

===================================================== warnings summary ======================================================
packages/iris-adapters/ocr-paddleocr/tests/test_live.py::test_live_png_round_trip
  /home/user/Projects/IRIS/.venv/lib/python3.12/site-packages/paddle/utils/cpp_extension/extension_utils.py:712: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
    warnings.warn(warning_message)

packages/iris-adapters/ocr-paddleocr/tests/test_live.py::test_live_png_round_trip
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

packages/iris-adapters/ocr-paddleocr/tests/test_live.py::test_live_png_round_trip
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

packages/iris-adapters/ocr-paddleocr/tests/test_live.py::test_live_png_round_trip
  /home/user/Projects/IRIS/.venv/lib/python3.12/site-packages/paddle/tensor/creation.py:1152: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach(), rather than paddle.to_tensor(sourceTensor).
    return tensor(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================== 1 passed, 4 warnings in 40.61s ===============================================
sys:1: DeprecationWarning: builtin type swigvarlink has no __module__ attribute

Cold model load, asyncio.to_thread worker, all pipeline stages from cache. Threading is stable. Anmol's non-blocking reminder is addressed.

Also noted: paddlepaddle was already in the lockfile as a transitive dep via paddleocr/paddlex but not declared explicitly - a partial uv sync (e.g. during rebase operations) could leave it absent until a full --all-packages run. Now pinned as a direct dep.

anmolg1997

Re-approving on 8289d4d. Only change since my last approval is pinning paddlepaddle>=3.1 as an explicit direct dependency in ocr-paddleocr/pyproject.toml (it was previously transitive through paddleocr). Correct call — the inference engine is a first-order dependency of this adapter and shouldn't rely on paddleocr's extras to pull it; pinning it makes the lockfile honest and protects against paddleocr ever dropping it. All checks green. Bottom of the stack — merge first.

anmolg1997 mentioned this pull request Jun 12, 2026

Implement ADI OCR adapter, py.typed marker, .env.example OCR vars (T033) #33

Merged

19 tasks

anmolg1997 requested changes Jun 12, 2026


likith1908 force-pushed the T034-datalab-adapter branch from 92d74f3 to 71aaac3 Compare June 12, 2026 06:22

likith1908 force-pushed the T035-paddleocr-adapter branch from a28044b to 8e242ab Compare June 12, 2026 06:43

likith1908 marked this pull request as ready for review June 15, 2026 04:49

likith1908 requested a review from anmolg1997 June 15, 2026 04:50

likith1908 mentioned this pull request Jun 15, 2026

T036: Implement Tesseract local OCR adapter #36

Merged

18 tasks

anmolg1997 mentioned this pull request Jun 16, 2026

Implement Datalab OCR adapter with httpx client and contract test suite (T034) #34

Merged

19 tasks

anmolg1997 previously approved these changes Jun 16, 2026


anmolg1997 mentioned this pull request Jun 16, 2026

T037: Parametrised OCR contract suite #37

Merged

17 tasks

likith1908 added 2 commits June 16, 2026 12:03

likith1908 force-pushed the T035-paddleocr-adapter branch from 94ec43d to e1c8b2a Compare June 16, 2026 06:35

likith1908 changed the base branch from T034-datalab-adapter to main June 16, 2026 06:35

ci: trigger CI after rebase onto main (T035)

773ad0d

anmolg1997 previously approved these changes Jun 16, 2026


likith1908 dismissed anmolg1997’s stale review via 8289d4d June 16, 2026 08:32

likith1908 requested a review from anmolg1997 June 16, 2026 09:06

anmolg1997 approved these changes Jun 17, 2026


anmolg1997 mentioned this pull request Jun 17, 2026

T038 + T039: OTEL span instrumentation and per-adapter READMEs #38

Merged

13 tasks

likith1908 merged commit 102dcdc into main Jun 17, 2026
10 checks passed

likith1908 merged 4 commits into main from T035-paddleocr-adapter
