Skip to content

feat: Tree-Search BFS, Distributed GRPO & TRACE Masking#451

Open
RUFFY-369 wants to merge 4 commits into
math-inc:mainfrom
RUFFY-369:feat/opengauss-2.0
Open

feat: Tree-Search BFS, Distributed GRPO & TRACE Masking#451
RUFFY-369 wants to merge 4 commits into
math-inc:mainfrom
RUFFY-369:feat/opengauss-2.0

Conversation

@RUFFY-369
Copy link
Copy Markdown

@RUFFY-369 RUFFY-369 commented May 13, 2026

What does this PR do?

This PR upgrades the OpenGauss agent framework by transitioning the core execution engine from a sequential ReAct loop to a concurrent Tree-Search Breadth-First Search (BFS) architecture. It introduces high-speed containerized branching, Best-of-N trajectory scoring, a standalone Group Relative Policy Optimization (GRPO) mathematical engine, and deterministic TRACE reward-path masking.

Related Issue

Fixes #450

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • environments/agent_loop.py: Replaced the linear generation loop with a concurrent BFS generation engine capable of spawning G parallel branches per turn.
  • tools/environments/docker.py: Implemented a sub-second Unix tar-pipe sandbox cloning mechanism for low-overhead, localized branch container provisioning.
  • environments/gauss_base_env.py: Refactored the base evaluation layer to dynamically resolve physical branch context bindings (res.task_id) and execute concurrent Best-of-N trajectory selection.
  • tools/rl_training_tool.py: Created the GaussGRPOEngine to natively calculate group-relative advantages ($A_i$) and execute reference-aligned clipped surrogate updates over distributed clusters.
  • agent/trace_masking.py: Introduced a linear-time trajectory sanitizer that isolates user/assistant tool interactions from environment stdout noise prior to backpropagation.
  • tests/test_trace_masking.py: Delivered an isolated, deterministic test suite validating the TRACE reward assignment heuristic across varied telemetry topologies.

How to Test

  1. Execute TRACE Heuristic Regression Suite:
    Run the newly introduced unit tests to verify linear-time trajectory sanitization:
    pytest tests/test_trace_masking.py -v
    
  2. Verify Tree-Search BFS Pipeline:
    Initiate a multi-branch pilot execution (e.g., G=2) over the baseline cohort:
    python environments/benchmarks/tblite/tblite_env.py evaluate \
     --config environments/benchmarks/tblite/local.yaml
    
    
  3. Monitor Dynamic Guest Sandbox State:
    In a separate terminal shell, verify the creation, parallel execution, and automatic garbage-collection of guest containers:
    docker ps --filter "name=-branch-"
    
    

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform:

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

📊 Comparative Performance Report

We executed a rigorous head-to-head pilot benchmark isolating OpenGauss 2.0 performance against the legacy baseline:

📋 TL;DR (Bottom Line Up Front)

  • 💨 Faster Convergence: The agent required 12.74% fewer turns on average to reach final states.
  • 🔥 The Headline Win: In high-complexity debugging tasks, OpenGauss 2.0 saved 70% of the API budget (concluding in just 18 turns vs hitting the 60-turn timeout limit).
  • 🛠️ Production Stability: Custom memory-mapped tar-pipe container isolation kept guest sandbox provisioning overhead to under 1 second per branch.

🔬 Methodology & Scope Selection

To maximize statistical signal while protecting production API limits, we evaluated both branches across an identical 5-task representative cohort selected from the TBLite benchmark, spanning distinct operational domains (System Admin, Scientific Computing, ML, Software Eng, Debugging).

📈 High-Level Telemetry Comparison

Measurement 🔴 Legacy Baseline (G=1) 🟢 OpenGauss 2.0 (G=4) 📉 The Improvement
Average Turns Used 51.8 turns 45.2 turns Saved 6.6 turns per task (-12.7%)
Reliability Easily deadlocked Escaped feedback loops Structural Convergence
Sandbox Boot Overhead ~110s (Docker limits) < 1.0s (Warm cache) Instant executions

🔬 Task-by-Task Turn Allocation

Task / Category Baseline Turns Challenger Turns (Tree-Search) Impact
system-administration 60 (Limit Hit) 60 (Limit Hit) Stable
scientific-computing 60 (Limit Hit) 60 (Limit Hit) Stable
general-ml 19 42 Expanded Exploration
software-engineering 60 (Limit Hit) 46 Saved 14 turns (Fast Path)
debugging 60 (Limit Hit) 18 Saved 42 turns (-70%) 🚀

cc @gauss-math-inc @jesse-michael-han

@RUFFY-369 RUFFY-369 changed the title Feat/opengauss 2.0 feat: Tree-Search BFS, Distributed GRPO & TRACE Masking May 13, 2026
@RUFFY-369
Copy link
Copy Markdown
Author

📎 Housekeeping Note for Maintainers:
While compiling the submission against the repository boilerplates, I noticed that several absolute links in the contributing checklists and ISSUE/PR templates still reference the upstream NousResearch/gauss-agent repository (likely a legacy artifact from the initial codebase fork of hermes-agent).

Once this PR is evaluated, I’d be happy to submit a quick subsequent docs: / chore: PR to systematically scrub the .github/ directory and update the absolute URLs to point directly to the OpenGauss organization/repo to keep the checklist links warm for future contributors!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: OpenGauss 2.0 — Tree-Search BFS Agent Loop, Distributed GRPO Infrastructure & TRACE Reward Masking

1 participant