feat: per-turn tool retrieval (trim tool-definition context flood)#858
feat: per-turn tool retrieval (trim tool-definition context flood)#858anandgupta42 wants to merge 4 commits into
Conversation
…rieval, critic gate) Three flag-gated, independent, default-off reliability features for the agent loop: - `provider/constrained.ts` — grammar/JSON-Schema constrained decoding so a local model (vLLM/LM Studio/llama.cpp) is forced to emit a parseable, schema-correct tool call. Pure (schema in → payload out). - `tool/retrieval.ts` — per-turn tool subset (always-on core + lexical top-k), never dropping a tool referenced mid-trajectory; trims the ~78-tool context flood. v1 lexical, dependency-free, deterministic. - `tool/critic.ts` — pre-execution gate for side-effecting tools via a pluggable `Verifier`; default allows everything (ungated), a real verifier is injected. Wired flag-gated into `session/llm.ts` (markers; default off → upstream path unchanged). 18 unit tests; typecheck clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds deterministic lexical-ranked tool selection to reduce context during inference and wires it conditionally into ChangesTool Retrieval and LLM Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The `ALTIMATE_CRITIC_GATE` flag was a no-op — `tool/critic.ts` was never imported into the execute path, so enabling the flag did nothing. Removing it from this PR so every shipped flag is actually wired: - `ALTIMATE_TOOL_RETRIEVAL` — wired in `session/llm.ts`, validated (-50% input tokens at equal resolve) - `ALTIMATE_CONSTRAINED_TOOLCALLS` — wired in `session/llm.ts` (local providers) The critic gate (pre-execution `Verifier` for side-effecting tools) moves to a follow-up that wires it into the `session/prompt.ts` execute wrapper with an integration test. Code preserved on `feat/critic-gate`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opencode/src/tool/retrieval.ts (1)
55-76: ⚡ Quick win
topkbudget is mostly consumed by CORE.
keep(CORE + callerkeep) counts againsttopk, and CORE alone is 10 entries. With the defaulttopk: 12, an enabled retrieval pass exposes only ~2 lexically-ranked non-core tools out of ~78 — which may starve the selection of task-relevant tools and undercut the feature's intent. Consider treatingtopkas a budget for retrieved tools beyond core/keep, or raising the default.♻️ One option: budget retrieved tools separately from core/keep
- for (const r of ranked) { - if (keep.size >= topk) break - keep.add(r.name) - } + const limit = keep.size + topk // topk additional tools beyond core + forced-keep + for (const r of ranked) { + if (keep.size >= limit) break + keep.add(r.name) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/tool/retrieval.ts` around lines 55 - 76, The current select function lets CORE and caller-provided opts.keep consume the topk budget, so CORE (10 items) leaves almost no room for lexically-ranked tools; adjust select so topk (and minToolsToRetrieve behavior) applies only to additional retrieved tools beyond CORE/keep: compute baseKeep from CORE and opts.keep, then compute retrievedBudget = (opts.topk ?? 12) - baseKeep.size (clamped to >=0) or, better, treat topk as the number of non-core tools directly (e.g., retrievedTopk = opts.topk ?? 12) and only count items added from ranked into that retrievedTopk; update the loop that iterates over ranked (and any minToRetrieve logic) to stop after retrievedBudget/retrievedTopk is filled while still always including all CORE/keep entries; reference function select, variables topk/minToRetrieve, CORE, keep, rest, ranked, and score to locate and modify the selection and stopping logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/opencode/src/tool/retrieval.ts`:
- Around line 17-20: Retrieval.CORE contains "ls" which doesn't match any
registered Tool id, so update the CORE array in retrieval.ts: either replace
"ls" with the actual registered tool id used when defining the listing tool
(e.g., the id passed to Tool.define for the filesystem/ls-like tool) or remove
"ls" from Retrieval.CORE; ensure the change aligns with the select() behavior
that only keeps CORE entries present in the candidate set so the tool id in
Retrieval.CORE must exactly match a Tool.define(...) id.
---
Nitpick comments:
In `@packages/opencode/src/tool/retrieval.ts`:
- Around line 55-76: The current select function lets CORE and caller-provided
opts.keep consume the topk budget, so CORE (10 items) leaves almost no room for
lexically-ranked tools; adjust select so topk (and minToolsToRetrieve behavior)
applies only to additional retrieved tools beyond CORE/keep: compute baseKeep
from CORE and opts.keep, then compute retrievedBudget = (opts.topk ?? 12) -
baseKeep.size (clamped to >=0) or, better, treat topk as the number of non-core
tools directly (e.g., retrievedTopk = opts.topk ?? 12) and only count items
added from ranked into that retrievedTopk; update the loop that iterates over
ranked (and any minToRetrieve logic) to stop after retrievedBudget/retrievedTopk
is filled while still always including all CORE/keep entries; reference function
select, variables topk/minToRetrieve, CORE, keep, rest, ranked, and score to
locate and modify the selection and stopping logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 09c66980-6898-48d1-8b44-a89331a61b52
📒 Files selected for processing (5)
packages/opencode/src/provider/constrained.tspackages/opencode/src/session/llm.tspackages/opencode/src/tool/retrieval.tspackages/opencode/test/provider/constrained.test.tspackages/opencode/test/tool/retrieval.test.ts
| export const CORE = [ | ||
| "bash", "read", "write", "edit", "glob", "grep", "ls", | ||
| "task", "todowrite", "skill", | ||
| ] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# List registered tool IDs to compare against CORE.
rg -nP "id:\s*['\"][a-z_]+['\"]" --type=ts -g '!**/*.test.ts' packages/opencode/src/tool || true
# Also look for how tool ids/names are declared
ast-grep --pattern 'Tool.define($$$)'
rg -nP '\b(todowrite|todo_write|todoread)\b' --type=ts -g '!**/*.test.ts'Repository: AltimateAI/altimate-code
Length of output: 50381
🏁 Script executed:
#!/bin/bash
set -euo pipefail
CORE=(bash read write edit glob grep ls task todowrite skill)
# Show where each CORE id is registered via Tool.define("<id>", ...)
for id in "${CORE[@]}"; do
echo "==== $id ===="
rg -n "Tool\.define\\(\"${id//\//\\/}\"" packages/opencode/src/tool -S --type=ts || true
done
# Show the CORE constant and retrieval selection logic for context
echo "==== CORE constant ===="
rg -n "export const CORE" packages/opencode/src/tool/retrieval.ts
sed -n '1,120p' packages/opencode/src/tool/retrieval.ts | nl -ba | sed -n '1,120p'
echo "==== select() logic (if present) ===="
rg -n "function select|const select|select\\(" packages/opencode/src/tool/retrieval.ts
sed -n '120,260p' packages/opencode/src/tool/retrieval.ts | nl -ba | sed -n '120,260p'Repository: AltimateAI/altimate-code
Length of output: 1149
🏁 Script executed:
#!/bin/bash
set -euo pipefail
CORE=(bash read write edit glob grep ls task todowrite skill)
# Show where each CORE id is registered via Tool.define("<id>", ...)
for id in "${CORE[@]}"; do
echo "==== $id ===="
rg -n "Tool\.define\\(\"${id//\//\\/}\"" packages/opencode/src/tool -S --type=ts || true
done
# Show the CORE constant and retrieval selection logic for context
echo "==== CORE constant ===="
rg -n "export const CORE" packages/opencode/src/tool/retrieval.ts
sed -n '1,120p' packages/opencode/src/tool/retrieval.ts | nl -ba | sed -n '1,120p'
echo "==== select() logic (if present) ===="
rg -n "function select|const select|select\\(" packages/opencode/src/tool/retrieval.ts
sed -n '120,260p' packages/opencode/src/tool/retrieval.ts | nl -ba | sed -n '120,260p'Repository: AltimateAI/altimate-code
Length of output: 1149
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the "ls" tool registration id
echo "==== registrations for ls ===="
rg -n "Tool\\.define\\(\"ls\"|Tool\\.define\\('ls'" packages/opencode/src/tool packages/opencode/src -S --type=ts || true
echo "==== any use of id: \"ls\" (Tool.define arg objects etc.) ===="
rg -n "(\"ls\")" packages/opencode/src/tool/retrieval.ts packages/opencode/src/tool -S --type=ts | head -n 200 || true
echo "==== retrieval.ts (CORE + select logic) ===="
cat -n packages/opencode/src/tool/retrieval.ts | sed -n '1,260p'Repository: AltimateAI/altimate-code
Length of output: 3714
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "==== Tool.define(<id>, ...) ids in packages/opencode/src ===="
rg -n "Tool\.define\(\s*(['\"])([a-z_]+)\1" packages/opencode/src -S --type=ts --type=tsx || true
echo "==== Any tool registry mentions of \"ls\" in packages/opencode/src ===="
rg -n "\"ls\"" packages/opencode/src -S --type=ts --type=tsx || trueRepository: AltimateAI/altimate-code
Length of output: 480
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "==== Tool.define IDs (capturing first arg) — filter for ls ===="
rg -n --pcre2 "Tool\.define\(\s*(['\"])([a-z_]+)\1" packages/opencode/src -S -g'*.ts' -g'*.tsx' \
| rg -n "\"ls\"|'ls'" || true
echo "==== Direct Tool.define(\"ls\" / 'ls') matches ===="
rg -n --pcre2 "Tool\.define\(\s*(['\"])ls\1" packages/opencode/src -S -g'*.ts' -g'*.tsx' || true
echo "==== Any tool-id string literal \"ls\" near tool code ===="
rg -n "\"ls\"" packages/opencode/src/tool -S -g'*.ts' -g'*.tsx' || trueRepository: AltimateAI/altimate-code
Length of output: 341
Fix Retrieval.CORE tool id mismatch for ls.
packages/opencode/src/tool/retrieval.ts includes "ls" in Retrieval.CORE, but there’s no Tool.define("ls", ...) (the only "ls" occurrence is inside CORE). Since select() force-keeps CORE entries only when they exist in the candidate set (if (all.has(n))), "ls" won’t be reliably “always-on” when retrieval is enabled—update "ls" to the actual registered tool id (or remove it).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/opencode/src/tool/retrieval.ts` around lines 17 - 20, Retrieval.CORE
contains "ls" which doesn't match any registered Tool id, so update the CORE
array in retrieval.ts: either replace "ls" with the actual registered tool id
used when defining the listing tool (e.g., the id passed to Tool.define for the
filesystem/ls-like tool) or remove "ls" from Retrieval.CORE; ensure the change
aligns with the select() behavior that only keeps CORE entries present in the
candidate set so the tool id in Retrieval.CORE must exactly match a
Tool.define(...) id.
There was a problem hiding this comment.
3 issues found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/opencode/src/tool/retrieval.ts">
<violation number="1" location="packages/opencode/src/tool/retrieval.ts:18">
P1: Core retrieval list uses `"ls"`, but the actual tool name is `"list"`; this can cause the list tool to be incorrectly filtered out.</violation>
<violation number="2" location="packages/opencode/src/tool/retrieval.ts:44">
P2: Scoring ignores 3-character keywords, which can miss relevant tools for common queries (e.g., sql/dbt).</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| export namespace Retrieval { | ||
| /** Always-available agent essentials — never retrieved out. */ | ||
| export const CORE = [ | ||
| "bash", "read", "write", "edit", "glob", "grep", "ls", |
There was a problem hiding this comment.
P1: Core retrieval list uses "ls", but the actual tool name is "list"; this can cause the list tool to be incorrectly filtered out.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/tool/retrieval.ts, line 18:
<comment>Core retrieval list uses `"ls"`, but the actual tool name is `"list"`; this can cause the list tool to be incorrectly filtered out.</comment>
<file context>
@@ -0,0 +1,77 @@
+export namespace Retrieval {
+ /** Always-available agent essentials — never retrieved out. */
+ export const CORE = [
+ "bash", "read", "write", "edit", "glob", "grep", "ls",
+ "task", "todowrite", "skill",
+ ]
</file context>
| const words = new Set(query.toLowerCase().match(/[a-z_]+/g) ?? []) | ||
| const hay = (t.name + " " + (t.description ?? "")).toLowerCase() | ||
| let s = 0 | ||
| for (const w of words) if (w.length > 3 && hay.includes(w)) s += 1 |
There was a problem hiding this comment.
P2: Scoring ignores 3-character keywords, which can miss relevant tools for common queries (e.g., sql/dbt).
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/opencode/src/tool/retrieval.ts, line 44:
<comment>Scoring ignores 3-character keywords, which can miss relevant tools for common queries (e.g., sql/dbt).</comment>
<file context>
@@ -0,0 +1,77 @@
+ const words = new Set(query.toLowerCase().match(/[a-z_]+/g) ?? [])
+ const hay = (t.name + " " + (t.description ?? "")).toLowerCase()
+ let s = 0
+ for (const w of words) if (w.length > 3 && hay.includes(w)) s += 1
+ // small boost for a direct name mention
+ if (words.has(t.name.toLowerCase())) s += 3
</file context>
Constrained tool-call decoding is local-providers-only and has no validation run behind it yet (the A/B that justifies this PR measured tool retrieval). Removing it so the validated retrieval lever can land clean; constrained moves to its own branch/PR pending a local vLLM guided-decoding run. - remove `provider/constrained.ts` + its test - remove the constrained wiring + import from `session/llm.ts` (retrieval stays) #858 is now retrieval-only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dev-punia-altimate
left a comment
There was a problem hiding this comment.
Multi-Persona Review — Verdict: block
The PR contains a critical syntax error that will cause runtime failure, and lacks proper error handling for destructive database operations. While the performance optimization is well-researched and aligned with best practices, the code as written will break production migrations.
15/15 agents completed · 236s · 9 findings (1 critical, 2 high, 1 medium)
Critical
- [code-reviewer] Incomplete string literal: 'workspace_id, endpoint_' is cut off mid-line, likely due to truncation in the diff. This will cause a SyntaxError when executed. →
alembic_tenant_clickhouse/migrations/versions/2026_06_01_0030_databricks_reorder_clickhouse_sort_keys.py:358- 💡 Complete the column name after 'endpoint_' — likely 'endpoint_id' or similar — to form a valid tuple element in the leading columns list.
High
- [tech-lead] Migration script directly executes raw SQL via SQLAlchemy text() without abstraction or validation layer, making it brittle and hard to test. →
alembic_tenant_clickhouse/migrations/versions/2026_06_01_0030_databricks_reorder_clickhouse_sort_keys.py:150- 💡 Extract table rebuild logic into a reusable utility in app/utils/clickhouse_migrator.py with input validation and logging, then call from migration. This improves testability and maintainability.
- [tech-lead] No explicit error handling around EXCHANGE TABLES or DROP operations; failure could leave orphaned _new tables or corrupt state. →
alembic_tenant_clickhouse/migrations/versions/2026_06_01_0030_databricks_reorder_clickhouse_sort_keys.py:280- 💡 Wrap table exchange and drop in try/except blocks with rollback logic and explicit logging of state before/after critical operations.
Medium
- [tech-lead] TABLE_PLANS dictionary contains 27 entries with inline string tuples; hard to validate and error-prone if column names change. →
alembic_tenant_clickhouse/migrations/versions/2026_06_01_0030_databricks_reorder_clickhouse_sort_keys.py:55- 💡 Define a dataclass or named tuple for TablePlan and validate column names against system.tables during migration startup to catch typos early.
Multi-Persona Review · vllm:qwen3-next-80b (waves) + vllm-fallback (synth) ·
) * fix: two tests flaky under parallel CI load (S27 + trace snapshot) Both pass locally but fail consistently in CI's heavy parallel run (9474 tests / 378 files) — the repo's "no flaky tests under resource contention" case. Neither is caused by any feature change; they fail identically on unrelated PRs (#854/#858/#863), blocking all of them. - `real-tool-simulation` S27: the progressive-suggestion dedup state is a module-global Set. The test's `beforeEach` reset used a dynamic `await import`, which under parallel CI can resolve to a different module instance than the tool's static import — so the real Set is never reset and accumulates `sql_analyze` from S25/S26 → S27 sees no suggestion. Fix: import `PostConnectSuggestions` statically (same instance the tools use); reset in S27 too. - `tracing-adversarial-snapshot` "shows 'running' status": waited a fixed 50ms for a debounced async snapshot write, too short under CI load → read a stale snapshot. Fix: poll the on-disk status until expected (timeout 4s) instead of a fixed sleep. Closes #879 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix: raise CI test timeout 30s→90s to kill resource-contention flakiness The "TypeScript" job runs all 9500+ tests in one parallel bun process. Under CPU contention a few slower tests (real fs/spawn/git-bootstrap) get starved and exceed the 30s per-test timeout NON-deterministically — different tests each run (observed: 32s and 51s timeouts). This blocks every PR with failures unrelated to the diff. 90s gives ~3x headroom over the worst observed, removing the flakiness without masking genuinely-hung tests. Part of #879. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dev-punia-altimate
left a comment
There was a problem hiding this comment.
🤖 Code Review — OpenCodeReview (Gemini) — 3 finding(s)
- 3 anchored to a line (posted inline when the comment stream is on)
- 0 without a line anchor
All findings (full text)
1. packages/opencode/src/tool/retrieval.ts (L40-L42)
[🟠 MEDIUM] The regular expression /[a-z_]+/g only matches alphabetic characters and underscores. If tool names or query terms contain numbers or hyphens (e.g., dbt-schema-verify, v2, read-file), they will be incorrectly split or ignored entirely. This will also cause the direct name mention boost (words.has(t.name.toLowerCase())) to silently fail for any tools with numbers or hyphens in their names, because the words set won't contain the full name.
Consider including numbers and hyphens in the regular expression.
Suggested change:
function score(query: string, t: Tool): number {
const words = new Set(query.toLowerCase().match(/[a-z0-9_-]+/g) ?? [])
const hay = (t.name + " " + (t.description ?? "")).toLowerCase()
2. packages/opencode/src/tool/retrieval.ts (L55-L57)
[🟠 MEDIUM] The default topk limit is set to 12. Since the CORE tools list already contains 10 items, this default only leaves room for a maximum of 2 dynamically retrieved tools per turn (and possibly zero if opts.keep tools fill the remaining slots). As topk is not explicitly passed by the caller in session/llm.ts, this default configuration might be too restrictive and severely limit the LLM's ability to utilize relevant tools.
Consider increasing the default topk value or modifying the logic to guarantee a certain number of dynamically retrieved tools independently of the core and kept tools.
Suggested change:
export function select(query: string, tools: Tool[], opts: Options = {}): Set<string> {
const topk = opts.topk ?? 20 // Adjust to a more reasonable default
const minToRetrieve = opts.minToolsToRetrieve ?? topk
3. packages/opencode/src/session/llm.ts (L190-L197)
[🔵 LOW] According to the code quality checklist:
- Nested Ternary Expressions: The nested ternary used for the
queryassignment is prohibited. Please refactor this to useif-elsestatements. - TypeScript Types: Avoid using the
anytype (c as any,p: any,t as any). If they are strictly necessary due to third-party SDK typing issues, please provide comments explaining the reason.
Suggested change:
const c = lastUser?.content
let query = ""
if (typeof c === "string") {
query = c
} else if (Array.isArray(c)) {
// Explicitly using any due to complex UserContent types from the ai SDK
query = c.map((p: any) => (typeof p === "string" ? p : (p?.text ?? ""))).join(" ")
}
// Explicitly using any as the Tool type might lack a strict description definition
const list = Object.entries(tools).map(([name, t]) => ({ name, description: (t as any)?.description }))
| function score(query: string, t: Tool): number { | ||
| const words = new Set(query.toLowerCase().match(/[a-z_]+/g) ?? []) | ||
| const hay = (t.name + " " + (t.description ?? "")).toLowerCase() |
There was a problem hiding this comment.
[🟠 MEDIUM] The regular expression /[a-z_]+/g only matches alphabetic characters and underscores. If tool names or query terms contain numbers or hyphens (e.g., dbt-schema-verify, v2, read-file), they will be incorrectly split or ignored entirely. This will also cause the direct name mention boost (words.has(t.name.toLowerCase())) to silently fail for any tools with numbers or hyphens in their names, because the words set won't contain the full name.
Consider including numbers and hyphens in the regular expression.
Suggested change:
| function score(query: string, t: Tool): number { | |
| const words = new Set(query.toLowerCase().match(/[a-z_]+/g) ?? []) | |
| const hay = (t.name + " " + (t.description ?? "")).toLowerCase() | |
| function score(query: string, t: Tool): number { | |
| const words = new Set(query.toLowerCase().match(/[a-z0-9_-]+/g) ?? []) | |
| const hay = (t.name + " " + (t.description ?? "")).toLowerCase() |
| export function select(query: string, tools: Tool[], opts: Options = {}): Set<string> { | ||
| const topk = opts.topk ?? 12 | ||
| const minToRetrieve = opts.minToolsToRetrieve ?? topk |
There was a problem hiding this comment.
[🟠 MEDIUM] The default topk limit is set to 12. Since the CORE tools list already contains 10 items, this default only leaves room for a maximum of 2 dynamically retrieved tools per turn (and possibly zero if opts.keep tools fill the remaining slots). As topk is not explicitly passed by the caller in session/llm.ts, this default configuration might be too restrictive and severely limit the LLM's ability to utilize relevant tools.
Consider increasing the default topk value or modifying the logic to guarantee a certain number of dynamically retrieved tools independently of the core and kept tools.
Suggested change:
| export function select(query: string, tools: Tool[], opts: Options = {}): Set<string> { | |
| const topk = opts.topk ?? 12 | |
| const minToRetrieve = opts.minToolsToRetrieve ?? topk | |
| export function select(query: string, tools: Tool[], opts: Options = {}): Set<string> { | |
| const topk = opts.topk ?? 20 // Adjust to a more reasonable default | |
| const minToRetrieve = opts.minToolsToRetrieve ?? topk |
| const c = lastUser?.content as any | ||
| const query = | ||
| typeof c === "string" | ||
| ? c | ||
| : Array.isArray(c) | ||
| ? c.map((p: any) => (typeof p === "string" ? p : (p?.text ?? ""))).join(" ") | ||
| : "" | ||
| const list = Object.entries(tools).map(([name, t]) => ({ name, description: (t as any)?.description })) |
There was a problem hiding this comment.
[🔵 LOW] According to the code quality checklist:
- Nested Ternary Expressions: The nested ternary used for the
queryassignment is prohibited. Please refactor this to useif-elsestatements. - TypeScript Types: Avoid using the
anytype (c as any,p: any,t as any). If they are strictly necessary due to third-party SDK typing issues, please provide comments explaining the reason.
Suggested change:
| const c = lastUser?.content as any | |
| const query = | |
| typeof c === "string" | |
| ? c | |
| : Array.isArray(c) | |
| ? c.map((p: any) => (typeof p === "string" ? p : (p?.text ?? ""))).join(" ") | |
| : "" | |
| const list = Object.entries(tools).map(([name, t]) => ({ name, description: (t as any)?.description })) | |
| const c = lastUser?.content | |
| let query = "" | |
| if (typeof c === "string") { | |
| query = c | |
| } else if (Array.isArray(c)) { | |
| // Explicitly using any due to complex UserContent types from the ai SDK | |
| query = c.map((p: any) => (typeof p === "string" ? p : (p?.text ?? ""))).join(" ") | |
| } | |
| // Explicitly using any as the Tool type might lack a strict description definition | |
| const list = Object.entries(tools).map(([name, t]) => ({ name, description: (t as any)?.description })) |
🤖 Code Review — OpenCodeReview (Gemini) — 3 finding(s)
All findings (full text)1.
|
❌ Tests — Failures DetectedTypeScript — 15 failure(s)
Next StepPlease address the failing cases above and re-run verification. |
What does this PR do?
Adds a flag-gated (
ALTIMATE_TOOL_RETRIEVAL, default OFF) per-turn tool subset: an always-on core set plus a lexically-ranked top-k of the remaining tools, never dropping a tool already referenced mid-trajectory. This trims the ~78-tool definition flood that hurts tool selection and inflates input tokens. v1 is lexical, dependency-free, and deterministic. Wired intosession/llm.ts(marker-wrapped); a no-op when the flag is unset.Validation (measured)
Tool retrieval — A/B, 8 ADE-bench dbt tasks, deepseek-v4-flash, value-graded (output checked for correctness, not just "builds"):
ALTIMATE_TOOL_RETRIEVAL=1→ identical resolve-rate (0 tasks differ), −50% input tokens, −49% cost.
Type of change
Issue for this PR
Closes #857
How did you verify your code works?
test/tool/retrieval.test.ts) — selection logic incl. always-on core + referenced-tool retention; all green.tsgotypecheck clean;altimate_changemarkers insession/llm.tsbalanced (7/7); default-off so the non-flagged path is unchanged.Checklist