feat(Odin): Integrate actions in checkin worker#535
Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces support for Extraction Pipelines 2.0 Actions by adding new DTOs (such as Action, ActionUpdate, and ActionStatus) and integrating action dispatching and queuing within the check-in worker. It also includes comprehensive unit tests for the new action dispatching mechanics and DTO validation. The review feedback highlights a critical issue where spawning a daemon thread for the action dispatcher without a try-except block can cause unhandled exceptions to silently terminate the thread, making troubleshooting difficult. A code suggestion is provided to safely wrap the dispatcher call and log any exceptions.
There was a problem hiding this comment.
Code Review
This pull request introduces support for "Actions" in the checkin worker. It adds new action-related DTOs (such as ActionType, ActionStatus, Action, and ActionUpdate), updates checkin requests and responses to include actions, and implements action dispatching and queuing within the CheckinWorker. Additionally, comprehensive unit tests have been added. The reviewer recommended wrapping the spawned action dispatcher thread's target in a try-except block to prevent silent failures and ensure exceptions are properly logged.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces action dispatching capabilities to the CheckinWorker class, allowing users to set an action dispatcher callback and queue action updates thread-safely to be included in outgoing check-in requests. It also spawns a daemon thread to execute the dispatcher when pending actions are received in a check-in response, and includes a comprehensive test suite to verify this functionality. The reviewer recommends wrapping the action dispatcher execution in a try-except block inside the spawned thread to prevent silent failures and ensure any unhandled exceptions are properly logged.
PR Short Summary
Type of change
What changed
cognite/extractorutils/unstable/core/checkin_worker.py
execution never blocks the checkin interval
across retries
tests/test_unstable/test_checkin_worker_actions.py (new file)
Why it changed
What to focus on during review
_requeue_checkin uses prepend, not append ((action_updates or []) + self._action_updates). This preserves ordering across retries: updates from a failed batch must precede updates that arrived after the failure. The test test_action_updates_requeue_order_preserved_across_retry validates this.
Daemon thread for dispatch — the ActionDispatcher thread is daemon so it doesn't block extractor shutdown. The trade-off is that in-flight action handling is dropped on process exit without a clean shutdown window.
Lock scope in try_write_checkin — the drain (_action_updates[:] + .clear()) is a single critical section. A queue_action_update racing the drain either lands in the current batch (wins the lock first) or the next one (loses it) — never lost.
extra="ignore" on Action and CheckinResponse — deliberate: Odin may add fields before the SDK is updated.
Test evidence
Risks and unknowns
of the current batch and will go out on the next checkin — this is correct but means outcome reporting has at most one checkin interval of latency.
concern at current action volumes but worth noting.
Rollout and rollback
Checklist