Skip to content

feature: log per-op left/right/result memory and cache high-water mark#514

Open
bimalgaudel wants to merge 4 commits into
masterfrom
gaudel/feature/detailed_logging_with_hwmark
Open

feature: log per-op left/right/result memory and cache high-water mark#514
bimalgaudel wants to merge 4 commits into
masterfrom
gaudel/feature/detailed_logging_with_hwmark

Conversation

@bimalgaudel
Copy link
Copy Markdown
Member

@bimalgaudel bimalgaudel commented May 11, 2026

Summary

Adds per-binary-op memory breakdown to the eval trace logger and a
cache+result high-water-mark estimate, intended for manual inspection
of eval traces while profiling memory peaks. Each cache event now
emits one self-contained line with both the per-entry and total cache
sizes; each binary-op event emits one EvalDetail line with
left=, right=, result=, and hw= fields. Log columns are
labeled key=value to keep lines self-describing, but the format is
not yet a stable contract — downstream parsers should expect churn.

Changes

  • CacheManager gains alive(key) and entry_size_in_bytes(key),
    used by the trace to count cached children at most once when
    computing the high-water mark.
  • The high-water-mark formula accounts for cached children whose
    canon_phase() != 1, where mult_by_phase materializes a fresh
    buffer alongside the cached pre-phase data.
  • CacheStat absorbs the per-entry memory field; the short-lived
    CacheDetail struct from an earlier iteration is removed.
  • log::bytes(...) is unified to one variadic that accepts both
    ResultPtr and CacheManager arguments and returns Bytes.
  • Also: tightens CacheManager's templated constructor with a
    requires(!same_as<…, CacheManager>) guard so it no longer shadows
    the implicit copy/move overloads. Unrelated to the logging feature,
    kept in this PR for ergonomics.

Copy link
Copy Markdown
Member

@evaleev evaleev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

Adds two new trace-mode log structures to the evaluator:

  • EvalDetailBinary — emits left/right/result bytes plus an estimated cache+result high-water-mark on every binary op
  • CacheDetail — emits a per-event Store/Access/Release record with the key and current data size

Also tightens CacheManager's templated constructor with a requires(!same_as<…, CacheManager>) guard so it can't shadow the implicit copy/move ctors.

Body is empty on the PR — would be worth a sentence on the motivation (profiling memory peaks in eval traces?).

Correctness

The high-water-mark estimate may under-count when canon_phase() != 1 (eval.hpp:386–389):

auto hwmark = log::bytes(cache) + log::bytes(result).value;
if (!cache.alive(node.left())) hwmark += log::bytes(left).value;
if (!cache.alive(node.right())) hwmark += log::bytes(right).value;

Rationale assumes that when a child is still alive() in the cache, left/right aliases the cached buffer and shouldn't be double-counted. That's true when phase == 1 (entry::access() returns the same shared pointer), but mult_by_phase allocates a fresh post-phase result while the cache still holds the pre-phase data — so both buffers exist live simultaneously and alive()==true would skip the new allocation. Minor and only affects the estimate, but worth a comment so future maintainers don't think the formula is exact.

The requires clause on the ctor is a real fix, but unrelated to the feature. It guards the templated ctor from out-competing the implicit copy/move when CacheManager is passed by lvalue. Worth either splitting into its own commit or calling out in the PR body, since it changes overload resolution behavior independently of logging.

Style / Project conventions

  • Redundant logs at every cache event. Both log::cache(node, cache) (existing CacheStat) and the new log::cache(log::CacheDetail{...}) are emitted side-by-side at eval.hpp:317–325 and eval.hpp:332–338. Consumers will see two lines per event. Consider folding the new fields into CacheStat or making CacheDetail a replacement rather than an addition.
  • Brace-init inconsistency at eval.hpp:393: .mem_hwmark = {hwmark} uses extra braces while the other Bytes members are constructed via log::bytes(...) (which already returns Bytes). Either .mem_hwmark = Bytes{hwmark} or refactor so hwmark is built up as Bytes from the start.
  • API asymmetry surfaced by this PR: log::bytes(CacheManager) returns size_t while log::bytes(ResultPtr...) returns Bytes, which forces .value accesses in the new code. Not new in this PR, but it's the first caller that mixes both — would be cleaner to harmonize the return type.
  • The new EvalDetailBinary / CacheDetail structs lack doc-comments while the existing EvalStat/CacheStat are similarly bare, so this matches local convention. A one-liner per struct would still help.

Tests

  • No new tests. The new CacheManager::alive(key) method is testable in isolation (tests/unit/test_cache_manager.cpp) — worth adding a small case that exercises pre-store / post-store / post-final-access transitions.
  • No coverage of the new log lines themselves; presumably they're consumed elsewhere — confirm there's at least one integration test path that exercises EvalTrace=On so a regression in the format would surface.

Performance / Risk

  • All new work is gated by if constexpr (trace(EvalTrace)), so zero impact when traces are off. Risk to existing flows is essentially nil.
  • When trace is on, each binary op adds one hash lookup (cache.alive) per child and one extra cache.size_in_bytes() traversal (already happens via log::bytes(cm)). Negligible.

Security

No external input, no allocations of attacker-controllable size — nothing to flag.

Suggested follow-ups (non-blocking)

  1. PR body: a sentence on intended consumer (manual inspection? a downstream parser?) so log-format stability expectations are clear.
  2. Split the requires-clause hardening into its own commit, or note it in the description.
  3. Either replace log::cache(node, cache) with the new detail variant, or document that both are emitted intentionally.
  4. Brief unit test for CacheManager::alive(key_type const&).

Overall: clean, low-risk, mostly mechanical addition. Ship-able once the redundant-log question is resolved and the ctor change is acknowledged.

@bimalgaudel bimalgaudel marked this pull request as draft May 12, 2026 16:41
Drop EvalDetailBinary; collapse to a single Eval line per op, with
optional left/right populated only for Sum/Product. Add an alloc field
that distinguishes fresh-allocation ops from SumInplace (alloc=0B), so
summing the column gives total bytes allocated and `grep alloc=0B`
isolates inplace ops. Fix hwmark undercounts in Permute and SumInplace
for cached inputs whose canon_phase != 1 or that the permute wrapper
turned into a fresh buffer — previously these silently biased hw down.
Replace the sequant.nu helper with sequant_trace.py — a stdlib-only
parser exposing trace_eval/trace_cache/trace_term plus a one-pass
parse() that returns NamedTuple rows (EvalRow/CacheRow/TermRow with
an extras dict for unrecognized fields). Rewrite evaluation.rst with
Python recipes and regenerated example outputs that reflect the new
trace columns (alloc, hw, left/right) and the per-mode field-population
rules. Add a note that the parser is iteration-blind: callers must
slice rows by line range or term before aggregating across repeated
contexts (the sample mpqc_out.txt is capped at one iteration for this
reason).
@bimalgaudel bimalgaudel marked this pull request as ready for review May 12, 2026 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants