feat: RuntimeTool trait with go/ts/rust execution backends (#1822) #1

Open
aboimpinto wants to merge 129 commits into feat/external-tool-abstraction-pr from feat/runtime-tool-pr
Conversation

@aboimpinto
Owner

Summary

Adds the RuntimeTool trait and three new code-execution backends: go_execution, ts_execution, and rust_execution.

Changes

  • RuntimeTool trait — unified pluggable trait for code-execution backends, extending ExternalTool with tool definition, temp-file execution, and JSON result formatting
  • go_execution — Go sandbox via `go run file.go`
  • ts_execution — TypeScript sandbox with multi-candidate resolution (tsx → tsx.cmd → ts-node → deno → npx tsx)
  • rust_execution — Rust sandbox via `rustc` compile-then-run with secure temp directories
  • Windows .cmd probe fix — npm global installs create .cmd wrappers that `Command::new()` does not find; added .cmd fallback candidates
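
The multi-candidate resolution listed above can be sketched roughly as follows. This is only an illustration, not the PR's code: the `Candidate` type, `ts_candidates`, `resolve`, and `probe_on_path` are hypothetical names, the `run` argument for deno and the `--version` probe are assumptions, and only the candidate order comes from the description.

```rust
use std::process::{Command, Stdio};

/// Hypothetical candidate: a program name plus fixed leading arguments.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub program: &'static str,
    pub args: &'static [&'static str],
}

/// Candidate order taken from the PR description:
/// tsx, tsx.cmd, ts-node, deno, npx tsx.
pub fn ts_candidates() -> Vec<Candidate> {
    vec![
        Candidate { program: "tsx", args: &[] },
        // npm global installs on Windows create a `.cmd` wrapper,
        // so the wrapper name is tried explicitly.
        Candidate { program: "tsx.cmd", args: &[] },
        Candidate { program: "ts-node", args: &[] },
        // Assumption: deno is invoked as `deno run <file>`.
        Candidate { program: "deno", args: &["run"] },
        Candidate { program: "npx", args: &["tsx"] },
    ]
}

/// Returns the first candidate for which `probe` reports success.
pub fn resolve<F>(candidates: &[Candidate], mut probe: F) -> Option<Candidate>
where
    F: FnMut(&Candidate) -> bool,
{
    candidates.iter().find(|c| probe(*c)).cloned()
}

/// Assumed probe: the candidate counts as available if spawning it with
/// `--version` succeeds. A spawn error (program not found) means "no".
pub fn probe_on_path(c: &Candidate) -> bool {
    Command::new(c.program)
        .args(c.args)
        .arg("--version")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|s| s.success())
        .unwrap_or(false)
}
```

Keeping the probe as a closure parameter lets the ordering logic be tested without any of the runtimes installed; `resolve(&ts_candidates(), probe_on_path)` would be the real call.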

Testing

All four runtimes produce identical output for the same cross-runtime test (github_repo_info):

| Runtime | File |
| --- | --- |
| C# | `runtime_TestBed/dotnet_TestBed/github_repo_info.cs` |
| Go | `runtime_TestBed/go_TestBed/github_repo_info.go` |
| Rust | `runtime_TestBed/rust_TestBed/github_repo_info.rs` |
| TypeScript | `runtime_TestBed/typescript_TestBed/github_repo_info.ts` |

Depends on

  • PR adding ExternalTool trait and dotnet_execution (base branch)

Closes Hmbown#1822

zlh124 and others added 30 commits May 18, 2026 23:06
…osts

On Linux, `arboard::Clipboard::new()` opens a blocking connect() to the
X11 Unix socket. When no X server is running (headless, WSL2 without
WSLg), the call hangs indefinitely. Because raw mode and the alternate
screen are already active at that point, Ctrl+C no longer generates
SIGINT and the event loop hasn't started yet — leaving the user with a
blank screen and no way to exit.

Move clipboard initialization from `ClipboardHandler::new()` (called
synchronously during App construction) to a lazy `ensure_clipboard()`
that runs on first read/write with a 500 ms timeout. If the X11
connection doesn't respond in time, the handler stays in fallback mode
and `write_text` falls through to the existing OSC 52 / pbcopy /
PowerShell paths.
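
A minimal sketch of the lazy, time-bounded initialization this commit describes, using only the standard library. `init_with_timeout` and `LazyHandle` are hypothetical names; `T` stands in for the clipboard handle and is assumed to be `Send`, which may not match the real handler.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `init` on a helper thread and waits at most `timeout` for the result.
/// Returns `None` on timeout; the helper thread is left detached, so a
/// connect() that never returns cannot block the caller.
pub fn init_with_timeout<T, F>(init: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone; ignore the send error.
        let _ = tx.send(init());
    });
    rx.recv_timeout(timeout).ok()
}

/// Hypothetical lazy wrapper: initialization is attempted once, on first use.
pub struct LazyHandle<T> {
    inner: Option<T>,
    attempted: bool,
}

impl<T: Send + 'static> LazyHandle<T> {
    pub fn new() -> Self {
        Self { inner: None, attempted: false }
    }

    /// Returns the handle if initialization succeeded in time. After a failed
    /// attempt it keeps returning `None`, so callers stay in fallback mode.
    pub fn ensure<F>(&mut self, init: F, timeout: Duration) -> Option<&mut T>
    where
        F: FnOnce() -> T + Send + 'static,
    {
        if !self.attempted {
            self.attempted = true;
            self.inner = init_with_timeout(init, timeout);
        }
        self.inner.as_mut()
    }
}
```

A caller in the spirit of `write_text` would try `ensure(..., Duration::from_millis(500))` and fall through to the OSC 52 / pbcopy / PowerShell paths when it returns `None`.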
…on Windows

Only DEEPSEEK_LOG_LEVEL should gate verbose CLI output. RUST_LOG controls
the tracing subscriber independently (file logging). On Windows stderr is
not redirected to the log file, so coupling the two causes tracing log
messages to leak into the TUI alt-screen, corrupting the display.

Closes Hmbown#1774
…mbown#1714)

The CLI --model handoff only exports DEEPSEEK_MODEL, which apply_env_overrides
funneled into the DeepSeek-only root default_text_model slot. default_model()
then treated it as a normalizable weak default and fell back to a
DeepSeek/provider default for an unrecognized custom model on a non-DeepSeek
provider, so e.g. --provider openai --model MiniMax-M2.7 silently sent a
DeepSeek model to the endpoint.

Route DEEPSEEK_MODEL into the active provider's model slot for non-DeepSeek
providers (mirroring the existing OPENAI_MODEL branch), and return an explicit
non-DeepSeek-alias model verbatim from default_model(). DeepSeek/DeepSeekCN
behavior is unchanged. Adds a regression test next to
openai_provider_accepts_custom_model_and_base_url.

Refs Hmbown#1714, Hmbown#1739, Hmbown#1736.
Address gemini-code-assist review on PR Hmbown#1740 (MEDIUM, clarity): the
non-DeepSeek else-branch match listed ApiProvider::Deepseek /
DeepseekCN arms that are logically unreachable (handled by the if
branch above). Collapse to a single
'Deepseek | DeepseekCN => unreachable!(...)' arm so the intent is
explicit for future maintainers. No behavior change.

Refs Hmbown#1714.
…rovider (Hmbown#1739)

should_replay_reasoning_content_for_provider() returned false whenever
provider_accepts_reasoning_content(provider) was false (true for
ApiProvider::Openai) without checking the model. This single gate feeds
both build_for_provider (include_reasoning) and
sanitize_thinking_mode_messages, so a DeepSeek reasoning model on the
generic openai provider (DeepSeek-compatible endpoint) had all
reasoning_content stripped -> the DeepSeek thinking-mode API 400s
('reasoning_content in the thinking mode must be passed back'). This is
the over-aggressive half of ac01b22 (fix Hmbown#1542).

Gate the early return on the model too:
!provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model).
Known DeepSeek reasoning models replay regardless of provider; genuine
non-DeepSeek models on openai still strip (effort=off still wins). Hmbown#1542
not regressed (provider_accepts_reasoning_content untouched).

Two pre-existing client.rs tests asserted the buggy case (deepseek-v4-pro
on Openai -> dropped); retargeted to gpt-4o to preserve their Hmbown#1542
intent without encoding the bug. New positive/negative coverage in
chat.rs.

Refs Hmbown#1739, Hmbown#1694, Hmbown#1542, Hmbown#1736.
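
The model-aware gate from this commit might look roughly like the sketch below. The two helper bodies are toy stand-ins (assumptions, not the project's logic), `should_replay_sketch` is a hypothetical name, and the real predicate has further checks such as effort=off that are omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiProvider {
    Deepseek,
    DeepseekCN,
    Openai,
}

/// Toy stand-in: per the commit message, the generic Openai provider
/// does not accept reasoning_content.
pub fn provider_accepts_reasoning_content(provider: ApiProvider) -> bool {
    !matches!(provider, ApiProvider::Openai)
}

/// Toy stand-in (assumption): treat any model whose name starts with
/// "deepseek" as a DeepSeek reasoning model.
pub fn requires_reasoning_content(model: &str) -> bool {
    model.to_ascii_lowercase().starts_with("deepseek")
}

/// Early return gated on BOTH provider and model: strip only when the
/// provider does not accept reasoning_content and the model does not
/// require it.
pub fn should_replay_sketch(provider: ApiProvider, model: &str) -> bool {
    if !provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model) {
        return false;
    }
    true
}
```

With this shape, `deepseek-v4-pro` on `Openai` replays its reasoning content while `gpt-4o` on `Openai` still has it stripped, which matches the retargeted tests described above.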
…Hmbown#1743)

Address gemini-code-assist review on PR Hmbown#1743:

- HIGH: should_replay_reasoning_content_for_provider was made model-aware
  in the previous commit, but handle_chat_completion_stream still computed
  is_reasoning_model = requires_reasoning_content(model) &&
  provider_accepts_reasoning_content(provider). On the openai provider +
  a DeepSeek model that was false during SSE parsing, so reasoning tokens
  were stored as content (not reasoning_content) and the next request
  still 400'd -- the fix was incomplete. Extract is_reasoning_model_for_stream()
  and route the stream call site through it; add an equivalence test
  locking it to the replay predicate so the two paths can't drift.
- MEDIUM: rename generic_openai_provider_drops_deepseek_reasoning_content
  -> generic_openai_provider_drops_reasoning_content_for_non_deepseek_models
  (now uses gpt-4o; old name was misleading).

Non-DeepSeek models on any provider are unaffected (Hmbown#1542 not regressed).

Refs Hmbown#1739, Hmbown#1694, Hmbown#1542.
…own#1727)

When a model streams a turn with only a reasoning block (empty content,
no tool_calls -- e.g. gpt-oss via ollama's harmony->OpenAI shim mapping
to reasoning_content), has_sendable_assistant_content is false: the
if-only persist branch was skipped, NO event was emitted, and the turn
fell through to break. The UI spinner hung with no reply and no error.

Add an else-if at the same persist site that, only on a clean end
(tool_uses empty, turn_error.is_none(), not cancelled), warns and emits
an Event::status notice telling the user the turn ended and they can
retry. No assistant message is persisted (prior behavior preserved); no
retry/re-prompt/placeholder policy is added (out of scope). Guard
ordering extracted into a pure should_emit_thinking_only_status() helper
with a unit test, matching the existing should_hold_turn_for_subagents
style. Maintainer audit Hmbown#1736 confirms Hmbown#1727 is not shipped in v0.8.39
and release-blocking.

Refs Hmbown#1727, Hmbown#1736.
…own#1742)

Address gemini-code-assist review on PR Hmbown#1742 (MEDIUM): the status was
emitted at the assistant-persist site, but the same turn can still
CONTINUE for pending steers or sub-agent completions -- the user would
see a spurious 'turn ended without output' notice immediately before the
turn resumed.

Capture thinking_only_no_sendable at the persist site (no emission
there) and decide at the end of the tool_uses.is_empty() path, just
before the terminal break -- reachable only when there were no pending
steers, no sub-agent completions, and we were not holding for running
children. Extend should_emit_thinking_only_status with steers_pending
and holding_for_subagents (false if either), recomputed live at the
decision point as defense-in-depth. Unit test updated with the two new
no-emit cases.

Refs Hmbown#1727.
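
As a rough illustration of the guard ordering, the pure helper could take a snapshot of the turn state and return a single boolean. The `TurnEndState` struct and its field names are hypothetical; the actual helper's signature is not shown in the commit message.

```rust
/// Hypothetical snapshot of the state at the end of the
/// `tool_uses.is_empty()` path, just before the terminal break.
#[derive(Debug, Clone, Copy)]
pub struct TurnEndState {
    pub thinking_only_no_sendable: bool,
    pub tool_uses_empty: bool,
    pub has_turn_error: bool,
    pub cancelled: bool,
    pub steers_pending: bool,
    pub holding_for_subagents: bool,
}

/// Emit the "turn ended without output" notice only on a clean end:
/// a thinking-only turn with no tools, no error, no cancellation, and
/// nothing that could still make the turn continue.
pub fn should_emit_thinking_only_status(s: TurnEndState) -> bool {
    s.thinking_only_no_sendable
        && s.tool_uses_empty
        && !s.has_turn_error
        && !s.cancelled
        && !s.steers_pending
        && !s.holding_for_subagents
}
```

Because the helper is pure, each no-emit case (pending steers, holding for sub-agents, and so on) can be covered by a one-line unit test.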
…#1691)

git commit -m "feat: complete sub-pages" failed on Windows with
'pathspec sub-pages" did not match' because the quoted -m message was
split on spaces. Root cause is not a tokenizer bug: CommandSpec::shell()
builds 'cmd /C "chcp 65001 >/dev/null & <command>"' and std::process::Command
applies MSVCRT escaping (" -> \"); cmd.exe does not use MSVCRT parsing,
so the quoting is destroyed and git receives feat:/complete/sub-pages"
as separate pathspecs. The Unix sh -c path was already correct.

Add a cfg-split push_shell_args() replacing cmd.args() at the 3 std
spawn sites in shell.rs. On Windows, for the cmd /C <payload> shape
only, pass /C and payload via CommandExt::raw_arg so the string reaches
cmd.exe verbatim (as a terminal does); other programs keep normal
escaping. Non-Windows is a faithful pass-through (byte-for-byte
unchanged). portable_pty path intentionally untouched (out of scope).

The Unix path is provably unchanged (tested); the Windows raw_arg
runtime correctness is only verifiable on a Windows runner -- flagged in
the PR for Windows CI verification per the Hmbown#1736 Windows policy.

Refs Hmbown#1691, Hmbown#1736.
…bown#1744)

Address gemini-code-assist review on PR Hmbown#1744 (two MEDIUM):

- cmd detection used program.eq_ignore_ascii_case("cmd"), which fails
  for a full path (C:\Windows\System32\cmd.exe) or a .exe suffix, so the
  raw_arg quoting fix would not apply. Use Path::file_stem() instead
  (fully-qualified std::path::Path -> no unused import off-Windows).
- Strengthen the Windows block of issue_1691_quoted_commit_message_round_trips
  to assert argv content equals spec.args, not just arg count, so the
  raw_arg payload (quotes preserved, no extra escaping) is actually
  verified. sandbox/mod.rs already asserts content -- left untouched.

Windows paths are cfg-gated (compile-checked on macOS, executed on
Windows CI). macOS build + clippy clean.

Refs Hmbown#1691.
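
A hedged sketch of the two pieces described in these commits: `file_stem()`-based cmd detection and a cfg-split argument push. `is_cmd_program` is a hypothetical name and the body of `push_shell_args` is a guess at the shape, not the project's code; `CommandExt::raw_arg` is the standard-library Windows extension.

```rust
use std::process::Command;

/// True when `program` names cmd, with or without an `.exe` suffix or a
/// directory prefix. Note: backslash-separated paths are only split into
/// components on Windows.
pub fn is_cmd_program(program: &str) -> bool {
    std::path::Path::new(program)
        .file_stem()
        .and_then(|stem| stem.to_str())
        .map(|stem| stem.eq_ignore_ascii_case("cmd"))
        .unwrap_or(false)
}

/// Windows: for the `cmd /C <payload>` shape only, append both arguments
/// verbatim so cmd.exe sees the payload exactly as a terminal would send it.
#[cfg(windows)]
pub fn push_shell_args(cmd: &mut Command, program: &str, args: &[String]) {
    use std::os::windows::process::CommandExt;
    if is_cmd_program(program) && args.len() == 2 && args[0].eq_ignore_ascii_case("/C") {
        cmd.raw_arg(&args[0]).raw_arg(&args[1]);
    } else {
        cmd.args(args);
    }
}

/// Non-Windows: plain pass-through, identical to calling `cmd.args(args)`.
#[cfg(not(windows))]
pub fn push_shell_args(cmd: &mut Command, _program: &str, args: &[String]) {
    cmd.args(args);
}
```

The stem comparison accepts `cmd`, `cmd.exe`, and `CMD.EXE`, whereas `program.eq_ignore_ascii_case("cmd")` accepts only the bare name.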
Plain Home and End now navigate within the current line instead of
jumping to the absolute start/end of the entire input.  Ctrl+A and
Ctrl+E remain as absolute start/end shortcuts.

- Add move_cursor_line_start() / move_cursor_line_end() to App
- Wire Home -> move_cursor_line_start(), End -> move_cursor_line_end()
- On single-line input the new methods behave identically to the
  absolute versions (no behaviour change)
- End on a newline character skips to the end of the next line
- 14 tests covering multiline, singleline, and edge cases
Per gemini-code-assist review on Hmbown#1749: End on a newline character
should stay at the end of the current line (idempotent), not skip
to the next line.  Removes the non-standard skip-past-newline logic
and updates the associated tests.
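
The line-local Home/End behavior, including the idempotent End from the review fix, can be sketched with two small functions. This assumes the cursor is a byte offset into the input string; the real `move_cursor_line_start()` / `move_cursor_line_end()` methods on `App` may track the cursor differently.

```rust
/// Byte index of the start of the line containing `cursor`.
/// `cursor` must be <= text.len() and lie on a char boundary.
pub fn line_start(text: &str, cursor: usize) -> usize {
    // The line starts just after the previous '\n', or at 0 if there is none.
    text[..cursor].rfind('\n').map(|i| i + 1).unwrap_or(0)
}

/// Byte index of the end of the line containing `cursor`: the position of
/// the next '\n' at or after the cursor, or the end of the text.
/// A cursor already sitting on a '\n' stays where it is (idempotent).
pub fn line_end(text: &str, cursor: usize) -> usize {
    text[cursor..]
        .find('\n')
        .map(|i| cursor + i)
        .unwrap_or(text.len())
}
```

On single-line input both functions reduce to the absolute start and end, which is consistent with the "no behaviour change" note in the commit.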
Hmbown and others added 9 commits May 24, 2026 04:07
- Thinking cells now qualify as detail targets alongside tools and sub-agents
- Space key on empty composer toggles the focused cell's collapsed state
- Status message confirms expand/collapse action
- Builds on existing collapsed_cells HashSet from mouse context menu
…wn#1961)

Before the turn breaks at the thinking-only checkpoint, drain
any sub-agent completions that arrived between the last hold
check and now. If a child finished while we were running the
final status check, surface its sentinel immediately rather
than delaying it to the next turn.
- npm/codewhale/README.md: remove DeepSeek-first language
- docs/INSTALL.md: scoop install codewhale (not deepseek-tui)
- Wire decision card overlay rendering in main render loop
- Decision cards now appear centered on transcript when active
…ntributors

Fixes the macOS test failure on PR Hmbown#1988 and the contributor-credit
gate from scripts/release/check-versions.sh.

cell_has_detail_target() was matching HistoryCell::Thinking, which
caused the activity footer to append " · ⌥+V raw" to thinking cells
that have no separate raw detail target. The detail-card flow only
exists for tool / sub-agent cells; thinking renders its own raw text
inline. Removing Thinking from the match arm restores the behavior the
existing
activity_footer_hint_surfaces_visible_thinking_without_raw_tool_hint
test asserts.

The CHANGELOG.md 0.8.43 section now credits the 30 contributors added
to README acknowledgements in this cycle, satisfying the README-vs-
CHANGELOG cross-check in check-versions.sh. crates/tui/CHANGELOG.md is
re-synced so the matching guard passes.

Verified locally on macOS:
- cargo fmt --all -- --check         : clean
- cargo clippy --workspace --all-targets --all-features --locked -D warnings : clean
- cargo test --workspace --all-features --locked : 41 suites, 0 failed
- ./scripts/release/check-versions.sh : Version state OK
- ./scripts/release/publish-crates.sh dry-run : all 14 crates OK
- cargo build --release --locked : clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aboimpinto force-pushed the feat/runtime-tool-pr branch 6 times, most recently from 1b28185 to c7003fa (May 24, 2026 16:08)
Adds RuntimeTool trait shared across all execution backends and
the dotnet_execution tool for running C# scripts via .NET SDK.
aboimpinto force-pushed the feat/runtime-tool-pr branch from fed6b95 to 7778736 (May 24, 2026 17:09)
aboimpinto force-pushed the feat/runtime-tool-pr branch from b8c2420 to 96554bd (May 24, 2026 17:27)
aboimpinto force-pushed the feat/runtime-tool-pr branch 2 times, most recently from 25ce4f5 to e90621d (May 24, 2026 19:53)
7 participants