Pandid v2 #301

Merged
BergsethCognite merged 4 commits into main from pandid_v2
May 29, 2026
@BergsethCognite

Bug fixes

  • Outer except no longer raises a literal string "msg"; preserves message and chains original exception.
  • push_result_to_annotations now always returns int (was sometimes a tuple); a malformed result item no longer aborts the whole batch.
  • File-link entities now stored under the canonical search_property key, so file-to-file matches actually work.
  • Cross-view entities normalised to the file view's search_property so the diagram-detect search_field finds them.
  • Per-batch error_count / annotated_count arithmetic corrected (was over-counting whole pages).
  • ANNOTATE_BATCH_SIZE no longer mutated as a module-level global; per-invocation local that can't leak across warm-container calls.
  • Cursor preserved on transient 400 in get_new_files; persistent 400 fails loudly instead of silently restarting from scratch.
  • read_state_store split into typed read_state_cursor / read_state_batch_num with safe coercion.
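
For illustration, the exception-chaining and state-coercion fixes could look roughly like the sketch below. `PipelineError`, `run_step`, and `coerce_batch_num` are hypothetical names chosen for this example; they are not the identifiers used in the function code.

```python
from typing import Any, Callable


class PipelineError(RuntimeError):
    """Hypothetical wrapper error; the PR does not name the exception type."""


def run_step(step: Callable[[], Any]) -> Any:
    """Run a step and re-raise failures with the message and cause preserved."""
    try:
        return step()
    except Exception as exc:
        # Build a real message instead of raising the literal string "msg",
        # and chain the original exception so the traceback keeps its cause.
        msg = f"Pipeline step failed: {exc}"
        raise PipelineError(msg) from exc


def coerce_batch_num(raw: Any, default: int = 0) -> int:
    """Safely coerce a stored state value to a non-negative int."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing or malformed state falls back to the default.
        return default
    return value if value >= 0 else default
```

Chaining with `raise ... from exc` keeps the original error reachable through `__cause__`, which is what makes the failure debuggable from logs.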

Hardening

  • Narrowed except Exception to CogniteAPIError where .code / .message is read.
  • Bounded exponential backoff added to run_diagram_detect, get_new_files, and the apply retry loop.
  • Narrowed broad except in _detect_annotation_to_edge_applies to parse-shape errors only.
  • daemon=True on the Mixpanel usage-tracker thread.
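
As a rough sketch of what "bounded exponential backoff" means here: the delay doubles after each failed attempt, is capped at a maximum, and the last failure is re-raised once the attempts run out. The helper name `call_with_backoff` and all of its parameters and defaults are assumptions for illustration, not the values used in this PR.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    *,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with capped exponential delay."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: fail loudly with the last error.
                raise
            # 1x, 2x, 4x, ... of base_delay, never more than max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")
```

In the real code `retry_on` would presumably be narrowed to the specific API error, in line with the first hardening item. Passing `sleep` as a parameter lets tests record delays instead of waiting.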

Cleanup / hygiene

  • Dead sys.path.append and unused sys / Path imports removed.
  • @dataclass decorator on Enum removed.
  • Optional hints fixed (str | None = None) for debug_file, filter_property, filter_values.
  • Orphan pares_direct_relation typo'd classmethod deleted.
  • cognite.client.utils._text.shorten (private API) replaced with a local _truncate helper.
  • RawUploadQueue now lazy-imported (test-friendly; no behavior change).
  • from __future__ import annotations added.
  • list[dict[str, any]] → list[dict[str, Any]] everywhere.
  • Entities de-duplicated on (space, external_id) to prevent duplicate edges.
  • Per-file delete_annotations_for_file replaced with a single batched delete_annotations_for_files per page.
  • O(n²) get_file_source_id replaced with a one-shot (space, external_id) → sourceId index.
  • create_table calls hoisted to a single setup step instead of running every batch.
  • Per-file INFO log demoted to DEBUG.
  • Missing ANNOTATE_BATCH_SIZE import added (latent NameError when debug=False).
  • Various typos fixed ("For for view", "don't contains", "status an logged status", "tag PID" docstrings).
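
The de-duplication and the one-shot index can be sketched as below. The function names and the dictionary keys (`space`, `external_id`, `sourceId`) are assumptions about the data shape for this example; the actual entity and file structures in the function may differ.

```python
from typing import Any, Dict, List, Optional, Tuple

Key = Tuple[Optional[str], Optional[str]]


def dedupe_entities(entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first entity seen for each (space, external_id) pair."""
    seen = set()
    unique = []
    for entity in entities:
        key = (entity.get("space"), entity.get("external_id"))
        if key in seen:
            continue  # a duplicate here would later produce a duplicate edge
        seen.add(key)
        unique.append(entity)
    return unique


def build_source_id_index(files: List[Dict[str, Any]]) -> Dict[Key, Optional[str]]:
    """Build the (space, external_id) -> sourceId lookup once per page."""
    return {(f.get("space"), f.get("external_id")): f.get("sourceId") for f in files}
```

Building the index once and doing dictionary lookups afterwards replaces a linear scan per file, which is where the O(n²) cost came from.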

Tests (new)

  • test_pipeline.py — 26 unit tests covering helpers, state coercion, get_all_entities, get_new_files retries, push_result_to_annotations.
  • test_config.py — 21 unit tests covering Optional defaults, required fields, threshold validation, Literal enforcement.
  • All 47 tests run locally with just pytest cognite-sdk pyyaml pydantic (no cognite-extractor-utils needed).
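
A retry test in this style can run without a live CDF project by stubbing the client. The sketch below is not taken from test_pipeline.py: `TransientError`, `fetch_with_retry`, and the `list_files` method are hypothetical stand-ins that only show the shape such a test might take.

```python
from unittest.mock import MagicMock


class TransientError(Exception):
    """Hypothetical stand-in for a retryable API error."""


def fetch_with_retry(client, max_attempts: int = 3):
    """Hypothetical function under test: retry a call a fixed number of times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return client.list_files()
        except TransientError as exc:
            last_error = exc
    raise last_error


def test_recovers_after_transient_failure():
    client = MagicMock()
    # First call raises, second call returns a page of files.
    client.list_files.side_effect = [TransientError("400"), ["file-a"]]
    assert fetch_with_retry(client) == ["file-a"]
    assert client.list_files.call_count == 2


def test_persistent_failure_is_raised():
    client = MagicMock()
    client.list_files.side_effect = TransientError("400")
    try:
        fetch_with_retry(client, max_attempts=2)
    except TransientError:
        pass
    else:
        raise AssertionError("expected TransientError")
    assert client.list_files.call_count == 2
```

Under pytest these functions are collected automatically by their `test_` prefix.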

Rename / module alignment (related, outside the function dir)

  • Function directory renamed to fn_dm_context_files_annotation; functions.Function.yaml externalId made static.
  • Extraction pipeline externalId aligned to static ep_ctx_files_pandid_annotation (both ExtractionPipeline.yaml and config.yaml).
  • Workflow task references updated to the new function/extraction-pipeline ids.
  • handler.py::run_locally uses the new extraction-pipeline external id.
  • README updated: stale ids replaced, dataset/RAW-DB confusion fixed, "Running functions locally" rewritten, new "Testing" section added.

@BergsethCognite BergsethCognite requested a review from a team as a code owner May 29, 2026 10:29
BergsethCognite and others added 2 commits May 29, 2026 12:31
…ith 'import' and 'import from''

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@BergsethCognite BergsethCognite merged commit 0ecf5bc into main May 29, 2026
6 checks passed
@BergsethCognite BergsethCognite deleted the pandid_v2 branch May 29, 2026 14:19
3 participants