cache node evaluation#502
Conversation
|
Warning Review limit reached
More reviews will be available in 29 minutes and 50 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (15)
📝 WalkthroughWalkthroughThis PR introduces a comprehensive Nix daemon client library, evaluation result caching system, and refactors CLI-based operations (build and push) to use daemon APIs. It adds per-node evaluation output caching to avoid recomputation and threads cached results through execution planning and orchestration layers. ChangesNix Daemon Client Foundation
Evaluation Result Caching
Executor and Evaluation Integration
Plan Generation with Caching
Hive Cache Integration
CLI Apply and Cache Orchestration
Build Step Daemon Migration
Push and Remote Copy Operations
SSH and Remote Connectivity Refactoring
Log Message Handling Extraction
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/core/src/hive/plan.rs (1)
233-247:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDisable greedy evaluation for
Goal::Buildwhen cache is present.Line 246 always sets
greedy_evaluate: true, so build plans still runevaluate_taskeven whenEvaluate { cached_evaluation: Some(..) }can satisfy the step. This defeats the cache-hit fast path and causes unnecessary evaluation work.Proposed fix
Goal::Build => NodePlan { + let has_cached_evaluation = cached_evaluation.is_some(); context: Context { state: StepState::default(), modifiers: *modifiers, hive_location, should_quit, name, build_id_names: Arc::new(Mutex::new(HashMap::new())), }, steps: vec![ Step::Evaluate(Evaluate { cached_evaluation }), Step::Build(Build { target: None }), ], - greedy_evaluate: true, + greedy_evaluate: !has_cached_evaluation, ignore_failed_ping: false, },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/core/src/hive/plan.rs` around lines 233 - 247, The plan builder sets greedy_evaluate: true unconditionally for Goal::Build, forcing Evaluate steps to run even when Evaluate { cached_evaluation: Some(..) } can satisfy the step; update the Goal::Build branch in NodePlan construction to set greedy_evaluate to false whenever the initial Evaluate has cached_evaluation.is_some() (i.e., inspect the cached_evaluation used when creating Step::Evaluate in that branch) so the planner will respect the cache fast-path and skip running evaluate_task; modify the NodePlan creation logic that constructs Context, steps (Step::Evaluate(Evaluate { cached_evaluation })), and the greedy_evaluate field to conditionally assign greedy_evaluate = cached_evaluation.is_none() (or equivalent) for Goal::Build.
🧹 Nitpick comments (1)
crates/cli/src/apply.rs (1)
128-134: 💤 Low valueUnnecessary Arc clone when dereferencing.
*cache.clone()creates a temporary Arc just to immediately dereference it. Since therefpattern borrows in place,*cachesuffices.♻️ Suggested simplification
- let mut cached_evaluations = if let Some(ref cache) = *cache.clone() + let mut cached_evaluations = if let Some(ref cache) = *cache && let HiveLocation::Flake { ref prefetch, .. } = *location🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/cli/src/apply.rs` around lines 128 - 134, The match arm creating cached_evaluations unnecessarily calls cache.clone() then dereferences it; replace the deref of the temporary Arc with a direct deref of the existing Arc (use *cache rather than *cache.clone()) in the if-let pattern that checks Some(ref cache) and the HiveLocation::Flake branch so the code reads if let Some(ref cache) = *cache && let HiveLocation::Flake { ref prefetch, .. } = *location { ... } ensuring you only borrow the inner value instead of cloning the Arc.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@crates/core/src/hive/plan.rs`:
- Around line 233-247: The plan builder sets greedy_evaluate: true
unconditionally for Goal::Build, forcing Evaluate steps to run even when
Evaluate { cached_evaluation: Some(..) } can satisfy the step; update the
Goal::Build branch in NodePlan construction to set greedy_evaluate to false
whenever the initial Evaluate has cached_evaluation.is_some() (i.e., inspect the
cached_evaluation used when creating Step::Evaluate in that branch) so the
planner will respect the cache fast-path and skip running evaluate_task; modify
the NodePlan creation logic that constructs Context, steps
(Step::Evaluate(Evaluate { cached_evaluation })), and the greedy_evaluate field
to conditionally assign greedy_evaluate = cached_evaluation.is_none() (or
equivalent) for Goal::Build.
---
Nitpick comments:
In `@crates/cli/src/apply.rs`:
- Around line 128-134: The match arm creating cached_evaluations unnecessarily
calls cache.clone() then dereferences it; replace the deref of the temporary Arc
with a direct deref of the existing Arc (use *cache rather than *cache.clone())
in the if-let pattern that checks Some(ref cache) and the HiveLocation::Flake
branch so the code reads if let Some(ref cache) = *cache && let
HiveLocation::Flake { ref prefetch, .. } = *location { ... } ensuring you only
borrow the inner value instead of cloning the Arc.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c140ad1-f7c0-4c47-b152-bda763fd7687
📒 Files selected for processing (10)
.sqlx/query-7b75a4af74b33f56f2ceeea1e462d28066719fab4319880dfdd2de215c08ba6a.json.sqlx/query-7d06a5ac5518429aa99edd5d878b6332ad23bf70286f29665f5b3e556aa32c5c.jsoncrates/cli/src/apply.rscrates/cli/src/main.rscrates/core/src/cache/migrations/20260523003752_attribute_lookup.sqlcrates/core/src/cache/mod.rscrates/core/src/hive/executor.rscrates/core/src/hive/mod.rscrates/core/src/hive/plan.rscrates/core/src/hive/steps/evaluate.rs
aaa2a63 to
33547fa
Compare
33547fa to
24f1b43
Compare
24f1b43 to
e0e2123
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (1)
crates/core/src/hive/node.rs (1)
346-395: ⚡ Quick winConsider adding test coverage for
force_quiet: true.The tests verify behavior when
force_quiet: false, but there's no test case verifying that passingforce_quiet: truecorrectly appends-qto the SSH arguments.🧪 Suggested test case
#[test] fn test_ssh_args_force_quiet() { let target = Target::from_host("hello-world"); let subcommand_modifiers = SubCommandModifiers { ssh_verbosity: 2, // Even with verbosity set, force_quiet should win ..Default::default() }; let args = target.create_ssh_args(subcommand_modifiers, true).unwrap(); assert!(args.contains(&"-q".to_string())); assert!(!args.iter().any(|a| a.starts_with("-v"))); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/core/src/hive/node.rs` around lines 346 - 395, Add a unit test that exercises Target::create_ssh_args with force_quiet set to true to ensure "-q" is appended and any "-v"/verbosity flags are suppressed; construct a Target (e.g., Target::from_host("hello-world")), pass a SubCommandModifiers with ssh_verbosity > 0, call target.create_ssh_args(subcommand_modifiers, true).unwrap(), then assert the returned Vec<String> contains "-q" and contains no entries starting with "-v" (and also assert behavior is consistent with create_ssh_opts if desired).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/cli/src/apply.rs`:
- Around line 178-187: The tokio::spawn call in apply() currently
fire-and-forgets the cache persistence task (the block that calls
cache.store_evaluation(prefetch, &cache_name, evaluated_path)), which can let
apply() return before writes complete; instead, collect the JoinHandle returned
by tokio::spawn (for the task that calls cache.store_evaluation) into a
Vec<JoinHandle<()>> (or similar) inside apply(), and before returning from
apply() await/join all handles (e.g., via futures::future::join_all or awaiting
each handle) to ensure store_evaluation completes; update the code around
tokio::spawn, the async move closure, and apply()'s return path so the handles
are awaited and any JoinError is handled/logged appropriately.
In `@crates/core/src/cache/mod.rs`:
- Around line 253-255: The SQL query in crates::core::cache module that filters
on evaluation_cache.node_name using json_each($4) currently has a hard-coded
"limit 100" which can drop valid cache hits; remove the fixed "limit 100" (or
replace it with a configurable parameter passed into the query) so all matching
nodes from json_each($4) are returned; locate the SQL string that references
evaluation_cache.node_name and json_each($4) and either delete the "limit 100"
clause or wire a new limit parameter and use that in the query execution call so
large node selections are not truncated.
- Around line 497-505: The deletion in gc_evaluation_cache currently discards
errors by assigning the execute result to `_`; change this so failures are
handled: capture the result of the sqlx::query!(...).execute(&self.pool).await,
and either propagate the error from gc_evaluation_cache (return a Result and use
`?`) or log the error (e.g., via tracing::error/process logger) with context
including the path identifiers (flake_path_digest, flake_path_name,
output_path_digest, output_path_name) so DB cleanup failures are visible; update
the gc_evaluation_cache function signature to return Result if propagating, or
keep the signature and log the error when execute returns Err.
- Around line 428-433: The GC query currently fetches a single unordered LIMIT
50 slice (the EvaluationPaths query that assigns evaluation_paths), so shutdown
GC can miss stale rows in larger caches; change it to a deterministic, batched
sweep: add an explicit ORDER BY on a stable key (e.g., primary key or timestamp)
and iterate in a loop fetching batches (LIMIT N with a cursor/last_seen key or
OFFSET) until no more rows, performing deletion per-batch, so every row is
eventually considered; update the code around evaluation_paths / the query_as!
call and the surrounding shutdown GC routine to use the ordered, paged
fetch-and-delete loop instead of the single fetch_all.
In `@crates/nix_client/src/lib.rs`:
- Around line 754-760: The NixDaemonClient protocol error is reporting the wrong
wanted version and operation: in the block checking self.writer.version() <
ProtocolVersion::from_parts(1, 22) update the
NixDaemonClientError::NixDaemonProtocolVersion construction to use wanted:
ProtocolVersion::from_parts(1, 22) and change operation: "QueryMissing".into()
to operation: "QueryDerivationOutputMap".into() so the error matches the actual
version check and operation name.
- Around line 381-389: The handlers for STDERR_READ and STDERR_WRITE currently
call unimplemented!() which will panic if the daemon sends those messages;
replace the panics with a graceful error return or no-op handling: for
STDERR_READ call self.read_value().await? to obtain the payload length and then
return an Err with a clear error (or Ok(()) if you choose to ignore stderr), and
for STDERR_WRITE either read the incoming bytes (using existing read logic) and
discard them or return a descriptive error; update the match arm for STDERR_READ
and STDERR_WRITE in the same block so they propagate Result errors rather than
panicking, referencing STDERR_READ, STDERR_WRITE, and read_value().
---
Nitpick comments:
In `@crates/core/src/hive/node.rs`:
- Around line 346-395: Add a unit test that exercises Target::create_ssh_args
with force_quiet set to true to ensure "-q" is appended and any "-v"/verbosity
flags are suppressed; construct a Target (e.g.,
Target::from_host("hello-world")), pass a SubCommandModifiers with ssh_verbosity
> 0, call target.create_ssh_args(subcommand_modifiers, true).unwrap(), then
assert the returned Vec<String> contains "-q" and contains no entries starting
with "-v" (and also assert behavior is consistent with create_ssh_opts if
desired).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f2664e82-1eb9-43d3-a937-30ba15932425
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (32)
.sqlx/query-07792cf21fcc4ffda47c7315ef486d6a4423b7a842ed72b2abe70e2590f74264.json.sqlx/query-4ac5886b4ab863eb9763b807933fbfad43987f3d0f0277df279b8725c37f1c4a.json.sqlx/query-58f5c376a95b8092beb1daf4e5ba3031e7d4254538d9ea0ec3077109574be2c0.json.sqlx/query-6d7658c36b965ffe57cbaaa5550cd0a9c2b45ae2feeda76f7002b334bac28674.json.sqlx/query-7d06a5ac5518429aa99edd5d878b6332ad23bf70286f29665f5b3e556aa32c5c.json.sqlx/query-d2e92efc45d4c8b951656a48329273b352f5c02457bbbaea8aabb737622ccb8d.json.sqlx/query-d9569def6f629c7d6e25b70f56c8e8cddf0ea447deded6bd5987f2097bb92a91.jsonCargo.tomlcrates/cli/src/apply.rscrates/cli/src/main.rscrates/core/Cargo.tomlcrates/core/src/cache/migrations/20260523003752_attribute_lookup.sqlcrates/core/src/cache/mod.rscrates/core/src/commands/common.rscrates/core/src/commands/mod.rscrates/core/src/commands/noninteractive.rscrates/core/src/commands/pty/mod.rscrates/core/src/errors.rscrates/core/src/hive/executor.rscrates/core/src/hive/mod.rscrates/core/src/hive/node.rscrates/core/src/hive/plan.rscrates/core/src/hive/steps/activate.rscrates/core/src/hive/steps/build.rscrates/core/src/hive/steps/evaluate.rscrates/core/src/hive/steps/keys.rscrates/core/src/hive/steps/ping.rscrates/core/src/hive/steps/push.rscrates/core/src/lib.rscrates/nix_client/Cargo.tomlcrates/nix_client/src/lib.rscrates/nix_client/src/store_path.rs
✅ Files skipped from review due to trivial changes (4)
- .sqlx/query-d9569def6f629c7d6e25b70f56c8e8cddf0ea447deded6bd5987f2097bb92a91.json
- .sqlx/query-d2e92efc45d4c8b951656a48329273b352f5c02457bbbaea8aabb737622ccb8d.json
- .sqlx/query-4ac5886b4ab863eb9763b807933fbfad43987f3d0f0277df279b8725c37f1c4a.json
- .sqlx/query-6d7658c36b965ffe57cbaaa5550cd0a9c2b45ae2feeda76f7002b334bac28674.json
e0e2123 to
c856bf1
Compare
c856bf1 to
baac012
Compare
baac012 to
0201df4
Compare
0201df4 to
e68ad9d
Compare
e68ad9d to
73ab702
Compare
73ab702 to
ade3365
Compare
ade3365 to
25e0d68
Compare
25e0d68 to
ff6eeda
Compare
Summary by CodeRabbit
Release Notes
New Features
Performance Improvements
Refactoring