feat(autocontext-memory): add recent focus window rollup with active tool guard#1514
feat(autocontext-memory): add recent focus window rollup with active tool guard#1514lzbjut wants to merge 3 commits into
Conversation
|
A few extra notes to make review easier: Review structureThis PR is intentionally split into 2 commits:
Behavior details
ScopeThe diff is limited to:
No Verification run locally
Both passed locally. |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
AgentScopeJavaBot
left a comment
There was a problem hiding this comment.
🤖 AI Review
This PR addresses the root cause of issue #1482 with a well-scoped, two-layer fix in AutoContextMemory. First, isCurrentRoundToolInFlight correctly gates Strategy 6/7 so that an in-flight tool_use / tool_result pair is never summarized away before the main model consumes the structured tool result, which prevents the ReAct loop from re-issuing the same tool call after compression. Second, a new Strategy 1 (rollupHistoryOutsideFocusWindow) introduces a recency-weighted msgCount-pressure path that preserves the current round and a configurable recent focus window (rounds + token floor). The strategy renumbering, event type, and metadata schema are consistent across AutoContextMemory, CompressionEvent, Prompts, PromptConfig, PromptProvider, and MsgUtils.isFinalAssistantResponse. Test coverage is thorough (active tool round guard, msgCount-only path, token-only path, generation merge, token floor, no-fallthrough). The main residual concerns are (1) a minor offload-context leak when the previous rollup summary is merged into the new one, and (2) a misleading inline comment on the MIN_HISTORY_ROLLUP_MESSAGES threshold. No blocking issues.
| // old summary and newly aged-out rounds, leave them untouched for safety. | ||
| startIndex--; | ||
| previousRollupGeneration = getRollupGeneration(rawMessages.get(startIndex)); | ||
| } |
There was a problem hiding this comment.
[major] Memory leak in offloadContext when an adjacent previous rollup summary is merged into the new rollup.
Here startIndex-- extends the rollup range to include the previous summary message, then the whole range is offloaded under a fresh UUID and the previous summary is dropped from working memory. The previous summary's offloaduuid entry in offloadContext is never cleared, so every generation merge leaks one offload entry for the lifetime of the memory instance (in long sessions this map grows unbounded).
Suggested fix - clear the orphaned UUID before merging:
if (startIndex > 0 && isRecentFocusWindowRollupSummary(rawMessages.get(startIndex - 1))) {
Msg previousSummary = rawMessages.get(startIndex - 1);
Object oldUuid = getCompressionMetadata(previousSummary).get(OFFLOAD_UUID_KEY);
if (oldUuid instanceof String s) {
clear(s);
}
startIndex--;
previousRollupGeneration = getRollupGeneration(rawMessages.get(startIndex));
}A similar concern applies if generateRecentFocusHistoryRollup throws (the LLM call uses .block() and only resumes InterruptedException): the offload(uuid, messagesToRollup) call at line 412 has already inserted the entry, so a model failure leaves an orphan. Calling offload after the summary is built, or wrapping the LLM call so the entry is removed on error, would close that loop too.
There was a problem hiding this comment.
Fixed in 5585c3a. The new rollup offload now happens only after summary generation succeeds, so model failures no longer leave a fresh orphaned UUID. For rolling merges, the previous summary offload UUID is carried through the rollup range and cleared after the working-memory replacement succeeds. Added tests for both the merge cleanup and the model-failure path.
|
|
||
| private static final Logger log = LoggerFactory.getLogger(AutoContextMemory.class); | ||
| // Two complete text rounds: user + assistant, user + assistant. | ||
| private static final int MIN_HISTORY_ROLLUP_MESSAGES = 4; |
There was a problem hiding this comment.
[minor] The comment // Two complete text rounds: user + assistant, user + assistant. does not match the actual guard. MIN_HISTORY_ROLLUP_MESSAGES = 4 is a raw message-count threshold: a single tool round (USER + ASSISTANT(tool_use) + TOOL(tool_result) + ASSISTANT(final) = 4 messages) also passes, and Strategy 1 would then roll up just one round. Either tighten the gate to actually require >=2 rounds (e.g. rollupRange.roundCount() >= 2) or update the comment to describe the message-count semantics so the intent is clear. Today the rollup payload + LLM cost can be triggered by a single 4-message round, which is probably not the intended floor.
There was a problem hiding this comment.
Fixed in 5585c3a by updating the comment to describe the actual raw-message threshold semantics. I left the behavior unchanged here to keep this follow-up focused on the review issue rather than changing the rollup trigger policy.
| } | ||
| } | ||
|
|
||
| return hasToolBlock && !hasFinalAssistant; |
There was a problem hiding this comment.
[praise] Nice surgical fix for the root cause of #1482: gating Strategy 6/7 on hasToolBlock && !hasFinalAssistant is the right boundary - it lets the main model finish consuming structured role=tool results before any summarization touches the current round, which is exactly what was missing in the original report (compression of in-flight tool pairs caused the ReAct loop to re-issue the same call until maxIters). Pairing it with the msgCount-pressure early-exit (if (msgCountReached && !tokenCounterReached) return false;) is also a nice hard guarantee - preserving the recent focus window even when no historical strategy fired.
|
|
||
| log.warn("All compression strategies exhausted but context still exceeds threshold"); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
[minor] compressIfNeeded is now ~160 lines with seven strategies plus two new gating branches (msgCount-only early return, in-flight tool guard). Readability would improve if each strategy block were extracted into a small private helper returning a boolean (tryHistoryRollup, tryToolInvocationCompress, ...), so this method becomes a flat sequence of if (tryXxx(...)) return true;. Not blocking, but the cyclomatic complexity will keep growing as new strategies are added.
Summary
Implements a recency-weighted AutoContextMemory compression path for #1482.
Verification
mvn -pl agentscope-extensions/agentscope-extensions-autocontext-memory -am -Dtest=AutoContextMemoryTest,MsgUtilsTest testmvn -pl agentscope-core,agentscope-extensions/agentscope-extensions-autocontext-memory -am testFixes #1482.