Skip to content

feat(autocontext-memory): add recent focus window rollup with active tool guard#1514

Open
lzbjut wants to merge 3 commits into
agentscope-ai:mainfrom
lzbjut:fix/autocontext-focus-window-rollup
Open

feat(autocontext-memory): add recent focus window rollup with active tool guard#1514
lzbjut wants to merge 3 commits into
agentscope-ai:mainfrom
lzbjut:fix/autocontext-focus-window-rollup

Conversation

@lzbjut
Copy link
Copy Markdown

@lzbjut lzbjut commented May 28, 2026

Summary

Implements a recency-weighted AutoContextMemory compression path for #1482.

  • Protects in-flight tool rounds from current-round compression before tool results are consumed by the model.
  • Adds Strategy 1 recent focus window history rollup for msgCount pressure.
  • Preserves the current round and recent focus window as a hard guarantee on the msgCount path.
  • Keeps token-pressure strategies unchanged except for the active tool round guard.

Verification

  • mvn -pl agentscope-extensions/agentscope-extensions-autocontext-memory -am -Dtest=AutoContextMemoryTest,MsgUtilsTest test
  • mvn -pl agentscope-core,agentscope-extensions/agentscope-extensions-autocontext-memory -am test

Fixes #1482.

@lzbjut lzbjut requested a review from a team May 28, 2026 07:10
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 28, 2026

CLA assistant check
All committers have signed the CLA.

@lzbjut
Copy link
Copy Markdown
Author

lzbjut commented May 28, 2026

A few extra notes to make review easier:

Review structure

This PR is intentionally split into 2 commits:

  1. fix(autocontext-memory): skip current-round compression when tool round is in-flight

  2. feat(autocontext-memory): add Strategy 1 recent focus window history rollup

    • Adds a msgCount-pressure path that rolls complete historical rounds outside the recent focus window into one rolling summary.
    • The token-pressure path is intentionally unchanged except for the active tool round guard above.
    • If Strategy 1 succeeds but message count is still above threshold, the code intentionally does not fall through to current-round compression on the msgCount path. Preserving the current/recent focus window is a hard guarantee.

Behavior details

  • New default config:
    • historyRollupKeepRecentRounds = 3
    • historyRollupKeepRecentTokens = 8000
  • New compression event:
    • RECENT_FOCUS_WINDOW_COMPACT
  • The rollup event metadata records:
    • roundCount
    • messageCountCompacted
    • messageCountBefore
    • messageCountAfter
    • rollupGeneration
    • tokenBefore
    • tokenAfter
    • offloaduuid
  • The applied log now includes both compacted messages and before/after working-memory size:
    • Strategy 1: APPLIED - Rolled up X messages across Y old rounds into focus window summary (working memory: A -> B)

Scope

The diff is limited to:

agentscope-extensions/agentscope-extensions-autocontext-memory/

No agentscope-core files are changed.

Verification run locally

  • mvn -pl agentscope-extensions/agentscope-extensions-autocontext-memory -am -Dtest=AutoContextMemoryTest,MsgUtilsTest test
  • mvn -pl agentscope-core,agentscope-extensions/agentscope-extensions-autocontext-memory -am test

Both passed locally.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 86.90476% with 33 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ope/core/memory/autocontext/AutoContextMemory.java 86.07% 13 Missing and 20 partials ⚠️

📢 Thoughts on this report? Let us know!

AgentScopeJavaBot

This comment was marked as outdated.

@AgentScopeJavaBot AgentScopeJavaBot added enhancement New feature or request area/ext/memory Memory/session extension implementations labels May 28, 2026
Copy link
Copy Markdown
Collaborator

@AgentScopeJavaBot AgentScopeJavaBot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Review

This PR addresses the root cause of issue #1482 with a well-scoped, two-layer fix in AutoContextMemory. First, isCurrentRoundToolInFlight correctly gates Strategy 6/7 so that an in-flight tool_use / tool_result pair is never summarized away before the main model consumes the structured tool result, which prevents the ReAct loop from re-issuing the same tool call after compression. Second, a new Strategy 1 (rollupHistoryOutsideFocusWindow) introduces a recency-weighted msgCount-pressure path that preserves the current round and a configurable recent focus window (rounds + token floor). The strategy renumbering, event type, and metadata schema are consistent across AutoContextMemory, CompressionEvent, Prompts, PromptConfig, PromptProvider, and MsgUtils.isFinalAssistantResponse. Test coverage is thorough (active tool round guard, msgCount-only path, token-only path, generation merge, token floor, no-fallthrough). The main residual concerns are (1) a minor offload-context leak when the previous rollup summary is merged into the new one, and (2) a misleading inline comment on the MIN_HISTORY_ROLLUP_MESSAGES threshold. No blocking issues.

// old summary and newly aged-out rounds, leave them untouched for safety.
startIndex--;
previousRollupGeneration = getRollupGeneration(rawMessages.get(startIndex));
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] Memory leak in offloadContext when an adjacent previous rollup summary is merged into the new rollup.

Here startIndex-- extends the rollup range to include the previous summary message, then the whole range is offloaded under a fresh UUID and the previous summary is dropped from working memory. The previous summary's offloaduuid entry in offloadContext is never cleared, so every generation merge leaks one offload entry for the lifetime of the memory instance (in long sessions this map grows unbounded).

Suggested fix - clear the orphaned UUID before merging:

if (startIndex > 0 && isRecentFocusWindowRollupSummary(rawMessages.get(startIndex - 1))) {
    Msg previousSummary = rawMessages.get(startIndex - 1);
    Object oldUuid = getCompressionMetadata(previousSummary).get(OFFLOAD_UUID_KEY);
    if (oldUuid instanceof String s) {
        clear(s);
    }
    startIndex--;
    previousRollupGeneration = getRollupGeneration(rawMessages.get(startIndex));
}

A similar concern applies if generateRecentFocusHistoryRollup throws (the LLM call uses .block() and only resumes InterruptedException): the offload(uuid, messagesToRollup) call at line 412 has already inserted the entry, so a model failure leaves an orphan. Calling offload after the summary is built, or wrapping the LLM call so the entry is removed on error, would close that loop too.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5585c3a. The new rollup offload now happens only after summary generation succeeds, so model failures no longer leave a fresh orphaned UUID. For rolling merges, the previous summary offload UUID is carried through the rollup range and cleared after the working-memory replacement succeeds. Added tests for both the merge cleanup and the model-failure path.


private static final Logger log = LoggerFactory.getLogger(AutoContextMemory.class);
// Two complete text rounds: user + assistant, user + assistant.
private static final int MIN_HISTORY_ROLLUP_MESSAGES = 4;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] The comment // Two complete text rounds: user + assistant, user + assistant. does not match the actual guard. MIN_HISTORY_ROLLUP_MESSAGES = 4 is a raw message-count threshold: a single tool round (USER + ASSISTANT(tool_use) + TOOL(tool_result) + ASSISTANT(final) = 4 messages) also passes, and Strategy 1 would then roll up just one round. Either tighten the gate to actually require >=2 rounds (e.g. rollupRange.roundCount() >= 2) or update the comment to describe the message-count semantics so the intent is clear. Today the rollup payload + LLM cost can be triggered by a single 4-message round, which is probably not the intended floor.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5585c3a by updating the comment to describe the actual raw-message threshold semantics. I left the behavior unchanged here to keep this follow-up focused on the review issue rather than changing the rollup trigger policy.

}
}

return hasToolBlock && !hasFinalAssistant;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[praise] Nice surgical fix for the root cause of #1482: gating Strategy 6/7 on hasToolBlock && !hasFinalAssistant is the right boundary - it lets the main model finish consuming structured role=tool results before any summarization touches the current round, which is exactly what was missing in the original report (compression of in-flight tool pairs caused the ReAct loop to re-issue the same call until maxIters). Pairing it with the msgCount-pressure early-exit (if (msgCountReached && !tokenCounterReached) return false;) is also a nice hard guarantee - preserving the recent focus window even when no historical strategy fired.


log.warn("All compression strategies exhausted but context still exceeds threshold");
return false;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] compressIfNeeded is now ~160 lines with seven strategies plus two new gating branches (msgCount-only early return, in-flight tool guard). Readability would improve if each strategy block were extracted into a small private helper returning a boolean (tryHistoryRollup, tryToolInvocationCompress, ...), so this method becomes a flat sequence of if (tryXxx(...)) return true;. Not blocking, but the cyclomatic complexity will keep growing as new strategies are added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/ext/memory Memory/session extension implementations enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AutoContextMemory导致工具重复调用,直到maxIters才结束,触发压缩后有概率出现这个问题

3 participants