feat: add /cache stats — prefix hash/drift exposure and cache-hit summary by encyc · Pull Request #2336 · Hmbown/CodeWhale

encyc · 2026-05-29T01:43:01Z

Closes #2264 (low-risk 0.8.x slice).

What

New /cache stats subcommand that surfaces prefix-cache stability diagnostics in three sections:

Prefix Stability — percentage, check/change counts, WARNING when drifted
Prefix Fingerprint — full SHA-256 combined hash, short id, drift warning
Cache Hit Rate — aggregated hit/miss/input tokens from per-turn telemetry, low-hit advisory below 80%

Plumbing

Layer	Change
`PrefixCacheChange` event	+`pinned_combined_hash: String` — carries pinned prefix fingerprint to TUI
`turn_loop.rs`	Extracts `pinned_fingerprint().combined_sha256` before `check_and_update`
`App`	+`last_pinned_prefix_hash: Option<String>` — updated per-turn
`ui.rs`	Stores hash from event into App

Files

crates/tui/src/commands/debug.rs      | +212  (format_cache_stats, format_tokens, "stats" branch, 6 tests)
crates/tui/src/commands/mod.rs        |   +1  (usage string)
crates/tui/src/core/events.rs         |   +4  (pinned_combined_hash field)
crates/tui/src/core/engine/turn_loop.rs|   +6  (populate hash)
crates/tui/src/tui/app.rs             |   +5  (field + init)
crates/tui/src/tui/ui.rs              |   +4  (store hash from event)

6 files, +232 −1

Greptile Summary

This PR adds /cache stats, a new subcommand that surfaces prefix-cache stability, a pinned SHA-256 fingerprint, and a per-session cache-hit rate summary. It also includes a largely undocumented second feature: a "whale routes" single-column model picker for DeepSeek providers, with a new whale_routes.rs taxonomy and matching refactor of model_picker.rs.

/cache stats plumbing (debug.rs, events.rs, turn_loop.rs, app.rs, ui.rs): the new format_cache_stats function aggregates per-turn telemetry, displays prefix stability and fingerprint data, and fires a low-hit advisory below 80%. The six accompanying tests cover the main paths. The pinned_combined_hash field is populated after check_and_update in both branches (the PR description says "before", which is a minor doc discrepancy but not a functional issue).
Whale-routes model picker (whale_routes.rs, model_picker.rs): introduces six named model+effort combinations (Blue Whale → Porpoise) and replaces the two-column picker with a single-column list for DeepSeek providers. model_picker.rs contains a compile error in render_whale_routes — popup_height calls .min() on a usize (from whale_route_row_count()) against a u16 (from area.height), which will not compile. The sort_order field on WhaleRoute also silently doubles as an array index without an explicit invariant test.

Confidence Score: 2/5

Not safe to merge: render_whale_routes in model_picker.rs will not compile due to a type mismatch in the popup_height expression.

The /cache stats wiring is clean and the logic is correct, but the PR includes a second undocumented feature (whale-routes model picker) that contains a definite compile error: (row_count + 4).min(area.height.saturating_sub(4)) mixes usize with u16, which Rust will reject. This would prevent the entire TUI crate from building.

crates/tui/src/tui/model_picker.rs — the render_whale_routes method has the compile error. crates/tui/src/tui/whale_routes.rs should also add an assertion that sort_order == array index.

Important Files Changed

Filename	Overview
crates/tui/src/tui/model_picker.rs	Substantial refactor adding whale-route single-column layout for DeepSeek providers; contains a compile error — `usize` vs `u16` type mismatch in `popup_height` computation that prevents this code from building.
crates/tui/src/tui/whale_routes.rs	New file defining six canonical whale-route (model + effort) combinations with lookup helpers; well-tested, but `sort_order` doubles silently as the array index, a load-bearing invariant with no direct assertion.
crates/tui/src/commands/debug.rs	Adds `format_cache_stats` and `format_tokens` for the new `/cache stats` subcommand, with six unit tests; main logic is correct and well-guarded.
crates/tui/src/core/engine/turn_loop.rs	Extracts `pinned_fingerprint().combined_sha256` in both changed and stable branches and populates the new event field; straightforward and correct.
crates/tui/src/tui/app.rs	Adds `last_pinned_prefix_hash: Option<String>` field, initialises it to `None`, and clears it on session reset; clean.
crates/tui/src/tui/ui.rs	Stores `pinned_combined_hash` from the event into `App`, converting empty strings to `None`; correct.

Sequence Diagram

sequenceDiagram
    participant Engine as turn_loop.rs
    participant PM as PrefixStabilityManager
    participant Event as PrefixCacheChange event
    participant UI as ui.rs (event loop)
    participant App as App state
    participant Cmd as /cache stats

    Engine->>PM: check_and_update(system_text, tools)
    PM-->>Engine: Ok(_) or Err(change)
    Engine->>PM: pinned_fingerprint()
    PM-->>Engine: combined_sha256
    Engine->>Event: "emit PrefixCacheChange { pinned_combined_hash, stability_pct, changed }"
    Event->>UI: received in run_event_loop
    UI->>App: "last_pinned_prefix_hash = Some(hash)"
    UI->>App: "prefix_stability_pct = Some(pct)"
    UI->>App: "prefix_change_count += 1 (if changed)"

    Note over Cmd,App: User runs /cache stats
    Cmd->>App: read prefix_stability_pct, last_pinned_prefix_hash
    Cmd->>App: read session.turn_cache_history
    Cmd-->>Cmd: format_cache_stats(app) → String message

Comments Outside Diff (1)

crates/tui/src/tui/model_picker.rs, line 371 (link)

popup_height mixes usize (from whale_route_row_count()) with u16 (from area.height.saturating_sub(4)) in the same .min() call. Rust requires both sides of .min() to be the same type, so this will not compile. The as u16 cast only applies to the final result and does not resolve the intermediate type mismatch.

_{Reviews (3): Last reviewed commit: "fix: clear last_pinned_prefix_hash on mo..." | Re-trigger Greptile}

gemini-code-assist

Code Review

This pull request introduces a new /cache stats command to the TUI, displaying prefix-cache stability, fingerprints, and aggregated cache-hit summaries. The implementation updates the engine's turn loop to propagate the pinned prefix combined hash via events and stores it in the app state. The review feedback highlights three key issues: first, the token aggregation logic in format_cache_stats should correctly handle non-cache-aware turns and infer missing miss tokens to prevent skewed statistics; second, the pinned_hash is extracted before the cache is updated, making the reported hash one turn behind; and third, an empty pinned_combined_hash should clear the stored hash in the app state rather than leaving a stale value.

gemini-code-assist · 2026-05-29T01:44:18Z

+        let (hit, miss, input) = app.session.turn_cache_history.iter().fold(
+            (0u64, 0u64, 0u64),
+            |(hit, miss, input), rec| {
+                (
+                    hit + u64::from(rec.cache_hit_tokens.unwrap_or(0)),
+                    miss + u64::from(rec.cache_miss_tokens.unwrap_or(0)),
+                    input + u64::from(rec.input_tokens),
+                )
+            },
+        );


In format_cache_stats, the fold logic aggregates cache hit and miss tokens across all turns. However, if cache_hit_tokens is None (indicating the turn is not cache-aware), defaulting cache_hit_tokens and cache_miss_tokens to 0 will pollute the statistics. Furthermore, if cache_hit_tokens is Some but cache_miss_tokens is None, the miss tokens should be inferred as input_tokens.saturating_sub(hit) (just like in format_cache_history), otherwise they are under-reported as 0, leading to an incorrect 100% hit rate. We should only aggregate cache statistics for cache-aware turns and correctly infer missing cache_miss_tokens.

let (hit, miss, input) = app.session.turn_cache_history.iter().fold( (0u64, 0u64, 0u64), |(hit, miss, input), rec| { if let Some(h) = rec.cache_hit_tokens { let m = rec.cache_miss_tokens.unwrap_or_else(|| rec.input_tokens.saturating_sub(h)); ( hit + u64::from(h), miss + u64::from(m), input + u64::from(rec.input_tokens), ) } else { (hit, miss, input + u64::from(rec.input_tokens)) } }, );

gemini-code-assist · 2026-05-29T01:44:18Z

+                let pinned_hash = pm
+                    .pinned_fingerprint()
+                    .map(|fp| fp.combined_sha256.clone())
+                    .unwrap_or_default();
                match pm.check_and_update(&system_text, tools_ref) {


The pinned_hash is extracted from pm before pm.check_and_update is called. Since check_and_update is what actually updates the pinned fingerprint, extracting it beforehand means the sent pinned_combined_hash will always be one turn behind (and empty on the very first turn). Extracting the fingerprint after the update ensures the TUI receives the correct, up-to-date hash.

let change_res = pm.check_and_update(&system_text, tools_ref); let pinned_hash = pm .pinned_fingerprint() .map(|fp| fp.combined_sha256.clone()) .unwrap_or_default(); match change_res {

gemini-code-assist · 2026-05-29T01:44:18Z

+                        if !pinned_combined_hash.is_empty() {
+                            app.last_pinned_prefix_hash = Some(pinned_combined_hash);
+                        }


If pinned_combined_hash is empty (e.g., when switching to a model or provider that does not support prefix caching), we should clear app.last_pinned_prefix_hash by setting it to None instead of leaving the stale hash from the previous provider/model intact.

app.last_pinned_prefix_hash = (!pinned_combined_hash.is_empty()).then_some(pinned_combined_hash);

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b571e8f6f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T01:47:51Z

+                let pinned_hash = pm
+                    .pinned_fingerprint()
+                    .map(|fp| fp.combined_sha256.clone())
+                    .unwrap_or_default();
                match pm.check_and_update(&system_text, tools_ref) {


Use the post-update pinned hash

In sessions where the prefix manager is unpinned or detects drift, this captures pinned_hash before check_and_update, but that call is what pins the first fingerprint and re-pins after a change. As a result /cache stats still shows Pinned hash: unavailable after the first turn, and after a drift warning it reports the old baseline hash until another stable turn happens, even though the event field is documented as the current pinned prefix hash.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-29T01:47:51Z

+                    hit + u64::from(rec.cache_hit_tokens.unwrap_or(0)),
+                    miss + u64::from(rec.cache_miss_tokens.unwrap_or(0)),


Don't treat missing cache telemetry as zero misses

When a provider reports cache_hit_tokens but omits cache_miss_tokens, the existing /cache table infers misses as input_tokens - hit, but /cache stats adds zero here. In that common partial-telemetry case, a turn like 10k input / 8k hit / missing miss is summarized as 100% hit rate instead of 80%, hiding low cache effectiveness; records with no hit telemetry are also folded into a misleading 0.0% of 0 advisory instead of being excluded like the table does.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-29T01:47:51Z

+    /// Current pinned prefix combined hash (SHA-256, 64 hex chars).
+    /// Updated per-turn via PrefixCacheChange events; surfaced by
+    /// `/cache stats` for cache-hit debugging.
+    pub last_pinned_prefix_hash: Option<String>,


Clear the cached fingerprint with model-scoped telemetry

This new fingerprint is model/provider scoped, but clear_model_scoped_telemetry() (used by model and provider switches) only clears the token fields and turn_cache_history, leaving last_pinned_prefix_hash behind. After switching models or providers, /cache stats can therefore show the previous cache scope's pinned hash alongside an empty/new hit-rate history until another prefix event overwrites it, which makes the diagnostics point at the wrong request prefix.

Useful? React with 👍 / 👎.

greptile-apps · 2026-05-29T01:53:31Z

+        if avg_pct < 80.0 {
+            out.push_str("  NOTE: cache hit rate is low (< 80%). Check prefix stability above or consider /compact.\n");
+        }


Spurious low-hit-rate advisory when no cache-aware turns exist

When history is non-empty but every record has cache_hit_tokens == None (e.g. a model that doesn't report cache telemetry), the fold leaves turns == 0, total_cache == 0, and avg_pct == 0.0. Because 0.0 < 80.0, the advisory fires anyway, producing output like Turns recorded: 0 … NOTE: cache hit rate is low. The user sees a "fix your caching" prompt when the real situation is simply that the model doesn't expose cache stats. Guard the advisory on turns > 0.

Suggested change

if avg_pct < 80.0 {

out.push_str(" NOTE: cache hit rate is low (< 80%). Check prefix stability above or consider /compact.\n");

}

if turns > 0 && avg_pct < 80.0 {

out.push_str(" NOTE: cache hit rate is low (< 80%). Check prefix stability above or consider /compact.\n");

}

greptile-apps · 2026-05-29T01:53:32Z

+        let mut turns = 0u64;
+        let (hit, miss, input) = app.session.turn_cache_history.iter().fold(
+            (0u64, 0u64, 0u64),
+            |(hit, miss, input), rec| {
+                let Some(hit_tokens) = rec.cache_hit_tokens else {
+                    return (hit, miss, input);
+                };
+                let h = u64::from(hit_tokens);
+                let m = u64::from(
+                    rec.cache_miss_tokens
+                        .unwrap_or(rec.input_tokens.saturating_sub(hit_tokens)),
+                );
+                turns += 1;
+                (hit + h, miss + m, input + u64::from(rec.input_tokens))
+            },
+        );


turns mutated as a side-effect inside fold

Mutating an outer variable from inside the fold closure works in Rust but is surprising and hides the counter from fold's own accumulator. Folding into a 4-tuple keeps all state visible and eliminates the mutable capture.

Suggested change

let mut turns = 0u64;

let (hit, miss, input) = app.session.turn_cache_history.iter().fold(

(0u64, 0u64, 0u64),

|(hit, miss, input), rec| {

let Some(hit_tokens) = rec.cache_hit_tokens else {

return (hit, miss, input);

};

let h = u64::from(hit_tokens);

let m = u64::from(

rec.cache_miss_tokens

.unwrap_or(rec.input_tokens.saturating_sub(hit_tokens)),

);

turns += 1;

(hit + h, miss + m, input + u64::from(rec.input_tokens))

},

);

let (turns, hit, miss, input) = app.session.turn_cache_history.iter().fold(

(0u64, 0u64, 0u64, 0u64),

|(turns, hit, miss, input), rec| {

let Some(hit_tokens) = rec.cache_hit_tokens else {

return (turns, hit, miss, input);

};

let h = u64::from(hit_tokens);

let m = u64::from(

rec.cache_miss_tokens

.unwrap_or(rec.input_tokens.saturating_sub(hit_tokens)),

);

(turns + 1, hit + h, miss + m, input + u64::from(rec.input_tokens))

},

);

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…mary (Hmbown#2264) Surfaces the current pinned prefix fingerprint hash and stability diagnostics via a new `/cache stats` subcommand, per the low-risk 0.8.x slice agreed on in Hmbown#2264. Changes: - EngineEvent::PrefixCacheChange now carries pinned_combined_hash from PrefixStabilityManager so the TUI layer can surface it - App.last_pinned_prefix_hash stores the hash per-turn for /cache stats - New /cache stats command renders three sections: - Prefix Stability: pct, checks, changes, last-change description - Prefix Fingerprint: full SHA-256 hash, short id, drift warning - Cache Hit Rate: aggregated token totals + low-hit advisory - format_tokens() helper for compact K/M token display - 6 unit tests covering: no-data, stable, drift warning, hit-rate summary, low-hit advisory, token formatting Refs: Hmbown#2264

- Move pinned_hash extraction after check_and_update() so the reported hash reflects the current prefix state, not the previous turn - Skip non-cache-aware turns (cache_hit_tokens=None) in /cache stats aggregation; infer miss tokens when cache_miss_tokens is None - Clear last_pinned_prefix_hash to None when pinned_combined_hash is empty (e.g. switching to a non-caching model/provider)

clear_model_scoped_telemetry() now resets last_pinned_prefix_hash to None so /cache stats does not show the previous model/provider cache scope's fingerprint after a switch.

gemini-code-assist Bot reviewed May 29, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

greptile-apps Bot reviewed May 29, 2026

View reviewed changes

encyc added 4 commits May 29, 2026 17:20

chore: cargo fmt fixes for /cache stats PR

edae772

fix: clear last_pinned_prefix_hash on model/provider switch

6175285

clear_model_scoped_telemetry() now resets last_pinned_prefix_hash to None so /cache stats does not show the previous model/provider cache scope's fingerprint after a switch.

encyc force-pushed the feat/cache-stats-2264 branch from b8380ec to 6175285 Compare May 29, 2026 09:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add /cache stats — prefix hash/drift exposure and cache-hit summary#2336

feat: add /cache stats — prefix hash/drift exposure and cache-hit summary#2336
encyc wants to merge 4 commits into
Hmbown:mainfrom
encyc:feat/cache-stats-2264

encyc commented May 29, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

gemini-code-assist Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

greptile-apps Bot May 29, 2026

Uh oh!

greptile-apps Bot May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		hit + u64::from(rec.cache_hit_tokens.unwrap_or(0)),
		miss + u64::from(rec.cache_miss_tokens.unwrap_or(0)),

Conversation

encyc commented May 29, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Plumbing

Files

Greptile Summary

Confidence Score: 2/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

encyc commented May 29, 2026 •

edited by greptile-apps Bot

Loading