Skip to content

EVA-10283: auto-refresh the tools schema cache (login hook + 1h TTL at startup)#11

Merged
conradchu merged 2 commits into
mainfrom
feature/eva-10283-auto-refresh-tools-cache
Jun 11, 2026
Merged

EVA-10283: auto-refresh the tools schema cache (login hook + 1h TTL at startup)#11
conradchu merged 2 commits into
mainfrom
feature/eva-10283-auto-refresh-tools-cache

Conversation

@conradchu

@conradchu conradchu commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Linear: EVA-10283

The dynamic-command registry previously only refreshed when the user manually ran tools refresh; with near-daily server-side MCP tool changes, the CLI silently drifted stale (the 24h TTL was never honored by the startup path, only by tools list/tools refresh). This PR makes the cache self-maintaining:

  • TTL 24h → 1h, and startup now honors it: a dynamic-tool invocation with a stale/missing cache does a 3s best-effort refetch before registration, silently falling back to the stale cache on failure.
  • auth login refreshes the tool list in the same invocation ("Cached N tools." on stderr); refresh failure warns but never fails the login.
  • Honest failure modes: --no-cache (bare or =true, pre- or post-subcommand) + fetch failure, or a missing cache with nothing to fall back to, exit with a transport-coded error (7) instead of silently lying or degrading to unknown command.
  • Static commands stay offline-fast: --help, version, auth *, tools *, completion, and cobra's hidden __complete/__completeNoDesc never trigger a startup fetch; leading persistent flags (--pretty, -q) are scanned past correctly.
  • Background fetches never turn interactive: the startup probe and post-login refresh use a fallback-free token source matching the principal future calls run as (env token when SEARCHLIGHT_TOKEN is set, else the just-saved OAuth tokens) — a rejected token fails silently/warns instead of launching the browser.
  • Docs updated: tools refresh is now an optional escape hatch.

Test plan

  • 19 new startup/login integration tests (httptest JSON-RPC fixtures, no real network): stale/missing/fresh cache × creds/no-creds × success/500/unreachable, static-command skip set, leading flags, --no-cache forms, rejected pre-minted token (asserts no interactive fallback), post-login principal selection, exit-code pinning (transport=7).
  • Coverage 74% aggregate (target ≥70%); go vet, golangci-lint, gosec clean; -race green.
  • Manual: auth login against production (browser flow) — checklist item for reviewer/author.

Design notes

  • Deliberately not implemented (adversarial-review tradeoff, deferred): probe backoff during server outages — with a stale cache and the server down, each dynamic invocation pays the 3s probe timeout. Persisted last-attempt state would fix it if it ever bites.
  • Explicitly out of scope (user decision during spec): fetch-on-miss for unknown subcommands; server-version-keyed cache invalidation.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Best-effort schema cache refresh after successful login
    • Startup probe refresh for dynamic tools with stale cache (bounded ~3s) when credentials exist
  • Improvements

    • Cache TTL reduced from 24h to 1h for fresher tool info
    • Clarified offline fallback and explicit handling of --no-cache and static commands
  • Documentation

    • Updated docs and README to reflect discovery, refresh, and fallback behavior
  • Tests

    • Added comprehensive tests for startup and post-login refresh scenarios

…-10283)

The dynamic-command registry previously only refreshed when the user
manually ran 'tools refresh' — with near-daily server-side tool changes
the CLI silently drifted stale. Now the cache refreshes at two targeted
moments: right after 'auth login' succeeds, and at startup when the
cache is over 1h old (TTL lowered from 24h) via a 3s best-effort probe
that falls back to the stale cache on failure. Silent fallback is
narrowed to be honest: --no-cache failures and missing-cache failures
propagate as transport errors instead of degrading to 'unknown command'.
Static commands (help, version, auth, tools, completion, cobra
__complete) never trigger a fetch, preserving offline/pre-auth behavior.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@linear

linear Bot commented Jun 11, 2026

Copy link
Copy Markdown

EVA-10283

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99b9b38a-9e5c-48af-aef8-ae596e54ca47

📥 Commits

Reviewing files that changed from the base of the PR and between 3aa7976 and a105df7.

📒 Files selected for processing (1)
  • cmd/startup_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/startup_test.go

📝 Walkthrough

Walkthrough

The PR reduces the tools schema cache TTL to 1 hour, adds a bounded startup freshness probe that conditionally refreshes tools/list for dynamic commands (with non-interactive token handling and --no-cache semantics), triggers a best-effort post-login cache refresh, enhances tests to cover these flows, and updates docs.

Changes

Schema Caching and Startup Freshness Probe

Layer / File(s) Summary
Startup Schema Freshness Probe and TTL Configuration
cmd/root.go
Adds ensureFreshSchema() that parses argv to skip static commands, respects --no-cache, checks cache freshness and stored credentials, and conditionally calls LoadOrFetch with a non-interactive token client; implements mcpClientWith(), schemaRefreshClient(), staticTokenSource, hasStoredCredentials(), and reduces schema TTL to 1 hour.
Post-Login Cache Refresh
cmd/auth.go, cmd/dynamic.go
Calls refreshSchemaAfterLogin() after successful auth login; helper performs a short-timeout LoadOrFetch using schemaRefreshClient() and logs cached tool count on success; updates comment in registerDynamicTools.
Test Infrastructure and Startup Behavior Coverage
cmd/integration_test.go, cmd/startup_test.go
Integration test now records Authorization headers per JSON-RPC call. cmd/startup_test.go adds tests covering stale/fresh/missing cache cases, fallback on fetch failure, credential absence, static-command skips, --no-cache parsing/semantics, revoked-token behavior, and post-login refresh success/failure scenarios.
Documentation Updates
CLAUDE.md, README.md, docs/agents.md
Docs updated to reflect 1-hour TTL, automatic post-auth login cache population, conditional startup refresh for dynamic tools (bounded ~3s), offline fallback behavior, and tools refresh as the manual escape hatch.

Sequence Diagram

sequenceDiagram
  participant CLI as CLI Startup
  participant Cache as Schema Cache
  participant Creds as Credentials
  participant Server as Schema Server
  participant Tools as Tool Registration

  CLI->>Cache: Check if cache exists and fresh (< 1h)
  Cache-->>CLI: Cache state (fresh/stale/missing)
  
  alt Cache fresh and --no-cache not set
    CLI->>Tools: Register tools from cache
  else Need refresh (stale/missing or --no-cache)
    CLI->>Creds: Check for stored credentials
    alt No stored credentials
      CLI->>Tools: Use existing cache or fail
    else Credentials available
      CLI->>Server: LoadOrFetch with timeout (3s)
      alt Fetch succeeds
        Server-->>Cache: Update schema cache
        Cache-->>Tools: Register updated tools
      else Fetch fails
        alt Stale cache exists and --no-cache not set
          Cache-->>Tools: Silently use stale cache
        else No fallback available
          Server-->>CLI: Return transport error
        end
      end
    end
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit peeks at cached schema light,
If it's old, a quick probe takes flight—
With creds in pocket and a timeout small,
Fetch once, fall back, or loudly stall.
Cached tools hum softly through the night.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.81% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly describes the main change: implementing auto-refresh of the tools schema cache with a login hook and reduced 1-hour TTL at startup.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/eva-10283-auto-refresh-tools-cache

Comment @coderabbitai help to get the list of available commands and usage tips.

@gitvelocity-reviewer

Copy link
Copy Markdown

📊 Code Quality Score: 52/100

65 × 0.8 (Large ESF) = 52

Category Score Factors
🔭 Scope 12/20 6 files touched; new startup behavior affecting all dynamic tool invocations; cross-cutting within single CLI binary; no new public API endpoints or DB migrations
🏗️ Architecture 13/20 New ensureFreshSchema startup phase; staticTokenSource abstraction for non-interactive token handling; hasStoredCredentials as testable var; clean separation of interactive vs non-interactive token sources; follows existing patterns
⚙️ Implementation 14/20 Pre-cobra argv parsing for --no-cache sniffing; nuanced error-swallowing logic (fallbackUsable decision); token source selection to prevent interactive fallback; handles --no-cache= form; cobra internal command detection (__complete prefix)
⚠️ Risk 10/20 Changes startup behavior for all dynamic tool invocations; TTL reduction from 24h to 1h increases network calls; well-designed fallback logic mitigates risk; no DB/auth system changes; error propagation is principled and tested
✅ Quality 13/15 545 lines of tests with ~2.8:1 test-to-production ratio; all branches covered; edge cases tested (--no-cache=false, leading flags, cobra internals); auth header verification in integration tests; CLAUDE.md/README/docs updated; 30s vs 3s timeout asymmetry undocumented
🔒 Perf / Security 3/5 staticTokenSource prevents interactive fallback during background refreshes (security); 3s bounded timeout prevents startup hangs; no benchmarks or explicit threat model

Was this score accurate? 👍 Yes · 👎 No

Scored by GitVelocity · How are scores calculated?

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cmd/startup_test.go (1)

125-129: ⚡ Quick win

Assert unexpected ensureFreshSchema() errors in success-path tests.

Line 125, Line 199, Line 228, Line 252, Line 277, Line 292, and Line 305 call ensureFreshSchema() without checking err. That can mask regressions with false-positive passes when the function fails before the side-effect assertion.

Suggested pattern
-    ensureFreshSchema()
+    if err := ensureFreshSchema(); err != nil {
+        t.Fatalf("ensureFreshSchema: %v", err)
+    }

Also applies to: 199-203, 228-232, 252-256, 277-281, 292-296, 305-309

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/startup_test.go` around lines 125 - 129, The tests call
ensureFreshSchema() but ignore its error return, which can hide failures; update
each call (e.g., in the tests around ensureFreshSchema() followed by f.calls()
assertions) to capture and assert the error like: err := ensureFreshSchema(); if
err != nil { t.Fatalf("ensureFreshSchema failed: %v", err) } before continuing
with the side-effect assertions (such as using f.calls()), and apply this change
to all occurrences referenced in the comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cmd/startup_test.go`:
- Around line 125-129: The tests call ensureFreshSchema() but ignore its error
return, which can hide failures; update each call (e.g., in the tests around
ensureFreshSchema() followed by f.calls() assertions) to capture and assert the
error like: err := ensureFreshSchema(); if err != nil {
t.Fatalf("ensureFreshSchema failed: %v", err) } before continuing with the
side-effect assertions (such as using f.calls()), and apply this change to all
occurrences referenced in the comment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e7d25a0d-56a0-4441-99ee-83b01c9781e2

📥 Commits

Reviewing files that changed from the base of the PR and between 172d942 and 3aa7976.

📒 Files selected for processing (8)
  • CLAUDE.md
  • README.md
  • cmd/auth.go
  • cmd/dynamic.go
  • cmd/integration_test.go
  • cmd/root.go
  • cmd/startup_test.go
  • docs/agents.md

CodeRabbit quick win: seven call sites discarded the error return,
which could mask regressions with false-positive passes when the
function fails before the side-effect assertion.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@conradchu

Copy link
Copy Markdown
Contributor Author

✅ Addressed CodeRabbit's nitpick: all seven success-path ensureFreshSchema() call sites in cmd/startup_test.go now assert the error is nil (t.Fatalf on unexpected error) in commit a105df7, so a failure before the side-effect assertion can no longer pass silently.

@conradchu conradchu merged commit 50cd9cb into main Jun 11, 2026
5 checks passed
@gitvelocity-reviewer

Copy link
Copy Markdown

📊 Code Quality Score: 51/100

64 × 0.8 (Large ESF) = 51.2 → 51

Category Score Factors
🔭 Scope 12/20 Single subsystem (CLI startup/auth) but cross-cutting across startup, auth, and dynamic tool registration; no new public APIs or DB migrations; additive behavioral change at two lifecycle points
🏗️ Architecture 13/20 New startup lifecycle hook in Execute(); staticTokenSource interface implementation; schemaRefreshClient() principal-selection factory; hasStoredCredentials as stubbable var; mcpClientWith() transport cloner; careful separation of startup probe vs post-login refresh vs manual refresh
⚙️ Implementation 15/20 Pre-cobra argv parsing with --no-cache sniffing in multiple forms; complex swallow-vs-propagate error logic based on fallbackUsable AND noCache state; staticTokenSource preventing interactive fallback; correct ordering (freshness before credentials); cobra internal __complete handling; context-bounded network calls
⚠️ Risk 8/20 Adds bounded (3s) network call at startup for stale-cache users; careful fallback preserves stale cache on failure; static commands explicitly excluded preserving offline behavior; TTL reduction 24h→1h is behavioral but well-justified; fully reversible
✅ Quality 13/15 559 lines of tests for ~200 lines production code; 18 test cases covering stale/missing/fresh cache, fetch failures, unreachable server, no credentials, 12 static command cases, --no-cache variants, rejected token non-interactive fallback, auth header verification; stubCredentials and setArgs helpers for hermetic testing; excellent inline documentation
🔒 Perf / Security 3/5 Explicit 3s timeout on startup probe; correct principal selection (tool availability differs per principal); staticTokenSource prevents credential escalation during silent probes; no benchmarks

Was this score accurate? 👍 Yes · 👎 No

Scored by GitVelocity · How are scores calculated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant