Add a built-in tool for Windows users that recognizes clipboard images on demand; add a thinking-mode display to make the process transparent #277
Open
Mcy0618 wants to merge 18 commits into
- Add a PLAN-mode awareness section to build_runtime_system_prompt()
- Rebuild the system prompt in refresh_runtime_client()
- Fix the LLM not knowing its own state in PLAN mode and repeatedly trying to call rejected tools
- Read images from system clipboard on Windows (Pillow/PowerShell), macOS (PIL/osascript), and Linux (PIL/xclip/wl-paste)
- Three output modes: base64, file, text (auto-describe via vision model)
- Settings fallback: when metadata not injected, load vision config from settings
- 23 unit tests covering all platforms and fallback tiers
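The per-platform fallback order listed in this commit could be modeled roughly as below. This is only a sketch of tier selection, not of clipboard access; `READER_TIERS` and `reader_tiers` are hypothetical names that do not come from the PR.

```python
from __future__ import annotations

import platform

# Hypothetical table: ordered fallback tiers per OS, as listed in the commit message.
READER_TIERS: dict[str, tuple[str, ...]] = {
    "Windows": ("Pillow", "PowerShell"),
    "Darwin": ("PIL", "osascript"),
    "Linux": ("PIL", "xclip", "wl-paste"),
}


def reader_tiers(system: str | None = None) -> tuple[str, ...]:
    """Return the ordered clipboard readers to try for the given OS name."""
    name = system if system is not None else platform.system()
    # Unknown platforms get no readers, so a caller can report "unsupported".
    return READER_TIERS.get(name, ())
```

Keeping the order in one table makes the fallback behavior easy to assert in unit tests without touching a real clipboard.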
- pyproject.toml: add Pillow>=10.0.0 to [project.optional-dependencies] dev so 'uv sync --extra dev' installs it and CI has the expected dependency
- tests: _fake_png_bytes() now catches ImportError and calls pytest.skip() instead of letting ImportError propagate as a hard test failure
- ensures test file can be imported in CI environments without Pillow
- remove unused import subprocess from test file (ruff F401)
- drop unused tool variable in test_input_model_is_pydantic (ruff F841)
- fix test_powershell_image_found: explicitly set mock stdout to 'OK' string, add mock for os.close to avoid OSError on fd=999 in Linux CI
- all 23 clipboard_screenshot tests pass locally
- MagicMock(stdout='OK') fails because .strip() returns a MagicMock on CI instead of a plain string, causing the 'OK' comparison to fail
- Replace with subprocess.CompletedProcess which has real str attributes
- move subprocess import into the test function to keep ruff F401 clean
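The general failure mode behind this commit can be reproduced in isolation. The sketch below uses a hypothetical helper, `clipboard_has_image`, and an unconfigured `MagicMock`; it does not claim to reproduce the exact CI failure, only why a result object with real `str` attributes is the safer stand-in.

```python
import subprocess
from unittest.mock import MagicMock, patch


def clipboard_has_image() -> bool:
    """Hypothetical helper: run a command and check for the 'OK' marker."""
    result = subprocess.run(
        ["powershell", "-Command", "..."], capture_output=True, text=True
    )
    return result.stdout.strip() == "OK"


# An unconfigured MagicMock returns another MagicMock from .strip(),
# so the comparison with "OK" is False.
with patch("subprocess.run", return_value=MagicMock()):
    mock_result = clipboard_has_image()

# CompletedProcess carries real str attributes, so .strip() behaves as in production.
fake = subprocess.CompletedProcess(
    args=["powershell"], returncode=0, stdout="OK\n", stderr=""
)
with patch("subprocess.run", return_value=fake):
    real_result = clipboard_has_image()
```

Because `subprocess.run` is patched in both cases, no PowerShell process is started and the sketch runs on any platform.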
…shell_image_found
…lict in runtime.py refresh_runtime_client()
Contributor
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a cross-platform clipboard screenshot tool and introduces optional “thinking” streaming support end-to-end (API client → engine events → backends/frontends), plus small prompt updates.
Changes:
- Added `clipboard_screenshot` tool with platform-specific clipboard readers and unit tests.
- Added `show_thinking` setting/CLI flag and new thinking delta event plumbing across engine, UI backends, and React terminal frontend.
- Added PLAN-mode section to the runtime system prompt.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_tools/test_clipboard_screenshot_tool.py | Adds unit tests covering clipboard screenshot tool behaviors and platform helpers. |
| src/openharness/ui/textual_app.py | Renders AssistantThinkingDelta events in the Textual UI. |
| src/openharness/ui/runtime.py | Threads show_thinking CLI override into settings merge. |
| src/openharness/ui/protocol.py | Extends transcript roles and backend event types to include thinking. |
| src/openharness/ui/output.py | Adds console rendering for AssistantThinkingDelta with separation from normal text output. |
| src/openharness/ui/backend_host.py | Emits thinking_delta backend events for the React frontend. |
| src/openharness/ui/app.py | Adds show_thinking parameters to entrypoints and streams thinking deltas in print/worker modes. |
| src/openharness/tools/clipboard_screenshot_tool.py | Implements the new clipboard screenshot tool (Windows/macOS/Linux + vision description mode). |
| src/openharness/tools/__init__.py | Registers ClipboardScreenshotTool in the default tool registry. |
| src/openharness/prompts/context.py | Adds explicit PLAN mode guidance to the system prompt builder. |
| src/openharness/engine/stream_events.py | Introduces AssistantThinkingDelta event type and adds it to StreamEvent. |
| src/openharness/engine/query_engine.py | Propagates settings.show_thinking into query runs. |
| src/openharness/engine/query.py | Adds show_thinking to QueryContext and maps API thinking events to engine events. |
| src/openharness/config/settings.py | Adds show_thinking setting and OPENHARNESS_SHOW_THINKING env override. |
| src/openharness/commands/registry.py | Adds /thinking slash command to toggle thinking display. |
| src/openharness/cli.py | Adds --show-thinking CLI flag and passes override into runtime entrypoints. |
| src/openharness/channels/adapter.py | Ignores thinking deltas in channel replies. |
| src/openharness/autopilot/service.py | Ignores thinking deltas when collecting assistant output. |
| src/openharness/api/openai_client.py | Adds thinking delta streaming and converts <think>...</think> blocks into thinking events when enabled. |
| src/openharness/api/copilot_client.py | Forwards show_thinking into the inner API request. |
| src/openharness/api/client.py | Adds show_thinking to ApiMessageRequest and introduces ApiThinkingDeltaEvent. |
| pyproject.toml | Adds Pillow to dev dependencies for test/support tooling. |
| frontend/terminal/src/types.ts | Extends TranscriptItem.role to include thinking. |
| frontend/terminal/src/hooks/useBackendSession.ts | Buffers thinking deltas and flushes them into transcript items before assistant output. |
| frontend/terminal/src/components/TranscriptPane.tsx | Adds label/color handling for thinking transcript entries. |
| TODO.md | Adds/updates implementation checklist items for plan mode and clipboard screenshot tool. |
| .catpaw/rules/python-launcher.md | Adds a repo rule document about using py on Windows. |
Comments suppressed due to low confidence (2)
src/openharness/ui/textual_app.py:1
- Thinking deltas and assistant text deltas both append into `self._assistant_buffer`, so once a real assistant response starts, it will include prior thinking text (and vice versa). Use a dedicated buffer for thinking (e.g., `self._thinking_buffer`) and/or reset the appropriate buffer when transitioning from thinking → assistant text.
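A minimal sketch of the dedicated-buffer idea suggested in this comment follows. It assumes thinking is flushed as its own entry when the first assistant text arrives. `DeltaBuffers` and its methods are hypothetical and are not the Textual app's actual structure.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaBuffers:
    """Hypothetical holder that keeps thinking and assistant text apart."""

    thinking: list[str] = field(default_factory=list)
    assistant: list[str] = field(default_factory=list)
    flushed_thinking: list[str] = field(default_factory=list)

    def on_thinking_delta(self, text: str) -> None:
        self.thinking.append(text)

    def on_assistant_delta(self, text: str) -> None:
        # The first assistant text ends the thinking phase: flush it as its own entry.
        if self.thinking:
            self.flushed_thinking.append("".join(self.thinking))
            self.thinking.clear()
        self.assistant.append(text)

    def assistant_text(self) -> str:
        return "".join(self.assistant)
```

With two buffers, the assistant transcript never contains thinking text, and the transition point is explicit rather than implied by a shared string.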
"""Default Textual terminal UI for OpenHarness."""
src/openharness/ui/app.py:1
`run_repl` now accepts `show_thinking` but (per the diff) it is not threaded into either the `backend_only` path (`run_backend_host(...)`) or the React TUI launcher path, so `--show-thinking` likely has no effect in normal interactive mode. Pass `show_thinking` through to the runtime build/host config (and add it to `run_backend_host`/`launch_react_tui` inputs as needed) so the setting is honored consistently.
"""Interactive session entry points."""
Comment on lines +84 to +86

    def is_read_only(self, arguments: ClipboardScreenshotToolInput) -> bool:
        del arguments
        return True

Comment on lines +237 to +247

    script = (
        f"Add-Type -AssemblyName System.Windows.Forms;"
        f"Add-Type -AssemblyName System.Drawing;"
        f"$img = [System.Windows.Forms.Clipboard]::GetImage();"
        f'if ($img -ne $null) {{'
        f' $img.Save("{tmp_path}", [System.Drawing.Imaging.ImageFormat]::Png);'
        f' Write-Output "OK"'
        f"}} else {{"
        f' Write-Output "NO_IMAGE"'
        f"}}"
    )

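One property of the snippet above is that `tmp_path` is interpolated into a double-quoted PowerShell string, where `$` and backticks are expanded. A possible hardening, offered only as a sketch, is to emit the path as a single-quoted literal, in which the only character needing escaping is the single quote (doubled). `ps_single_quote` and `build_save_script` are hypothetical helpers that are not in the PR, and the sketch does not cover every quoting edge case.

```python
def ps_single_quote(value: str) -> str:
    """Quote a value as a PowerShell single-quoted literal.

    Inside single quotes PowerShell performs no variable or backtick
    expansion; an embedded single quote is written by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def build_save_script(tmp_path: str) -> str:
    """Hypothetical variant of the script above using a single-quoted path."""
    return (
        "Add-Type -AssemblyName System.Windows.Forms;"
        "Add-Type -AssemblyName System.Drawing;"
        "$img = [System.Windows.Forms.Clipboard]::GetImage();"
        "if ($img -ne $null) {"
        f" $img.Save({ps_single_quote(tmp_path)}, [System.Drawing.Imaging.ImageFormat]::Png);"
        ' Write-Output "OK"'
        "} else {"
        ' Write-Output "NO_IMAGE"'
        "}"
    )
```

Only the line that embeds the path needs an f-string; the remaining fragments are plain literals, which also avoids the doubled braces.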
Comment on lines 275 to +279

    async def _render_event(event: StreamEvent) -> None:
        if isinstance(event, AssistantThinkingDelta):
            print(f"[DEBUG] Sending thinking_delta: {event.text[:50]}...", file=sys.stderr)
            await self._emit(BackendEvent(type="thinking_delta", message=event.text))
            return

    permission_mode=permission_mode,
    max_turns=max_turns,
    effort=effort,
    show_thinking=show_thinking or None,

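The expression `show_thinking or None` turns an unset flag (`False`) into `None`. The sketch below assumes that `None` means "no override" in the settings merge; `resolve_show_thinking` is a hypothetical function, not the project's merge code. One consequence worth noting is that, under this scheme, the CLI can switch thinking on but cannot force it off.

```python
from __future__ import annotations


def resolve_show_thinking(cli_flag: bool, settings_value: bool) -> bool:
    """Hypothetical merge: a CLI override of None means 'not specified'."""
    override: bool | None = cli_flag or None  # False collapses to None
    return settings_value if override is None else override
```

If an explicit "off" from the command line is ever needed, the flag would have to be tri-state (`True`, `False`, or absent) before this merge step.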
Comment on lines +260 to +268

    elif isinstance(event, AssistantThinkingDelta):
        collected_text += event.text
        if output_format == "text":
            sys.stderr.write(event.text)
            sys.stderr.flush()
        elif output_format == "stream-json":
            obj = {"type": "thinking_delta", "text": event.text}
            print(json.dumps(obj), flush=True)
            events_list.append(obj)

    finish_reason: str | None = None
    usage_data: dict[str, int] = {}
    # Buffer to strip inline <think>…</think> blocks across streaming chunks.
    # Buffer to strip inline blocks across streaming chunks.

Comment on lines +467 to +468

    # Matches complete blocks (DOTALL so newlines are included).
    _THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

Comment on lines 473 to +476

    def _strip_think_blocks(buf: str) -> tuple[str, str]:
        """Strip complete ``<think>…</think>`` blocks and return ``(visible_text, leftover)``.
        """Strip complete ``...`` blocks and return ``(visible_text, leftover)``.

        Complete pairs are removed via regex. An unclosed ``<think>`` is held in
        Complete pairs are removed via regex. An unclosed ```` is held in

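The behavior described in this docstring can be sketched as a standalone function. Complete pairs are removed with the regex, and an unclosed `<think>` is held back as leftover for the next chunk. Holding back a trailing partial tag such as `<thi` is an assumption added here, and `strip_think_blocks_sketch` is a hypothetical name, not the PR's implementation.

```python
import re

_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_OPEN = "<think>"


def strip_think_blocks_sketch(buf: str) -> tuple[str, str]:
    """Return (visible_text, leftover) for one accumulated buffer."""
    cleaned = _THINK_RE.sub("", buf)
    idx = cleaned.find(_OPEN)
    if idx != -1:
        # Unclosed block: emit the text before it, hold the rest for the next chunk.
        return cleaned[:idx], cleaned[idx:]
    # Assumption: also hold back a trailing partial tag that may complete later.
    for n in range(len(_OPEN) - 1, 0, -1):
        if cleaned.endswith(_OPEN[:n]):
            return cleaned[:-n], cleaned[-n:]
    return cleaned, ""
```

A caller would prepend the returned leftover to the next streamed chunk before calling the function again, so that tags split across chunk boundaries are still matched.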
macOS (PIL/osascript) and Linux (PIL/xclip/wl-paste)