fix: adapt code interpreter warm pool to agent-sandbox v0.4.6#387
fix: adapt code interpreter warm pool to agent-sandbox v0.4.6#387ranxi2001 wants to merge 2 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates Kubernetes/agent-sandbox dependencies and adjusts workload manager + e2e logic to support newer warm-pool controller ownership patterns and SandboxClaim adoption, while also refreshing build/codegen tooling.
Changes:
- Refactor warm-pool pod counting/ready checks to support both direct Pod ownership and SandboxWarmPool → Sandbox → Pod ownership.
- Update sandbox creation flow to handle SandboxClaim adoption by polling claim/sandbox readiness and storing claim identity correctly.
- Bump Kubernetes/tooling versions, regenerate CRDs, and update Docker build images.
Reviewed changes
Copilot reviewed 13 out of 18 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/e2e_test.go | Refactors warm-pool pod discovery to handle new/old ownership chains and reuses shared listing logic. |
| pkg/workloadmanager/sandbox_helper.go | Adds CreatedAt to placeholder sandbox store entries. |
| pkg/workloadmanager/k8s_client.go | Adds dynamic-client getters for Sandbox and SandboxClaim. |
| pkg/workloadmanager/handlers_test.go | Updates annotation constant usage and adds coverage for SandboxClaim-adoption behavior. |
| pkg/workloadmanager/handlers.go | Splits “wait for ready” paths for direct sandboxes vs claim-adopted sandboxes; fixes stored identity for claims. |
| pkg/workloadmanager/codeinterpreter_controller_test.go | Adds tests ensuring SandboxTemplate network policy management is set to unmanaged. |
| pkg/workloadmanager/codeinterpreter_controller.go | Forces SandboxTemplate NetworkPolicyManagement to unmanaged and updates existing templates accordingly. |
| manifests/charts/base/crds/runtime.agentcube.volcano.sh_agentruntimes.yaml | Regenerated CRD schema changes (likely from dependency/codegen bump). |
| hack/update-codegen.sh | Updates code-generator version and changes module discovery approach. |
| go.mod | Updates Go version and bumps k8s/controller-runtime/agent-sandbox deps. |
| go.sum | Updates module sums for the dependency/tooling bumps. |
| docker/Dockerfile.router | Updates Go builder image version. |
| docker/Dockerfile.picod | Updates Go builder image and hardens apt install layer. |
| docker/Dockerfile | Updates Go builder image version. |
Files not reviewed (4)
- client-go/clientset/versioned/fake/clientset_generated.go: Generated file
- client-go/informers/externalversions/factory.go: Generated file
- client-go/informers/externalversions/runtime/v1alpha1/agentruntime.go: Generated file
- client-go/informers/externalversions/runtime/v1alpha1/codeinterpreter.go: Generated file
Comments suppressed due to low confidence (1)
pkg/workloadmanager/sandbox_helper.go:60
- When
CreationTimestampis zero,createdAtis set fromtime.Now(), butexpiresAtuses a separatetime.Now()call. To keep timestamps consistent (and avoid tiny negative/odd deltas in tests/metrics), compute the default expiry fromcreatedAt(e.g.,createdAt.Add(DefaultSandboxTTL)).
createdAt := sandboxCR.GetCreationTimestamp().Time
if createdAt.IsZero() {
createdAt = time.Now()
}
var expiresAt time.Time
if sandboxCR.Spec.Lifecycle.ShutdownTime != nil {
expiresAt = sandboxCR.Spec.Lifecycle.ShutdownTime.Time
} else {
expiresAt = time.Now().Add(DefaultSandboxTTL)
}
| return nil | ||
| } | ||
|
|
||
| func (s *Server) waitForDirectSandboxReady(ctx context.Context, sandbox *sandboxv1alpha1.Sandbox, resultChan <-chan SandboxStatusUpdate) (*sandboxv1alpha1.Sandbox, error) { |
| select { | ||
| case result := <-resultChan: |
| // if warmpool is used, the pod name is stored in sandbox's annotation `agents.x-k8s.io/pod-name` | ||
| sandboxNameForPod := createdSandbox.Name | ||
| sandboxPodName := createdSandbox.Name | ||
| if podName, exists := createdSandbox.Annotations[sandboxv1alpha1.SandboxPodNameAnnotation]; exists { |
| warmPoolSandboxes := make(map[string]struct{}, len(sandboxList.Items)) | ||
| for _, sandbox := range sandboxList.Items { |
| for _, owner := range sandbox.OwnerReferences { | ||
| if owner.Kind == ownerKindSandboxWarmPool && owner.Name == codeInterpreterName { | ||
| warmPoolSandboxes[sandbox.Name] = struct{}{} |
| _, ownedByWarmPoolSandbox := warmPoolSandboxes[owner.Name] | ||
| if (owner.Kind == ownerKindSandboxWarmPool && owner.Name == codeInterpreterName) || | ||
| (owner.Kind == "Sandbox" && ownedByWarmPoolSandbox) { |
| func (f *recordingStore) UpdateSandbox(ctx context.Context, sandbox *types.SandboxInfo) error { | ||
| f.fakeStore.UpdateSandbox(ctx, sandbox) | ||
| if f.updateErr != nil { | ||
| return f.updateErr | ||
| } | ||
| copied := *sandbox | ||
| f.lastUpdated = &copied | ||
| return nil | ||
| } |
Signed-off-by: ranxi2001 <ranxi169@163.com>
Signed-off-by: ranxi2001 <ranxi169@163.com>
5316358 to
5867183
Compare
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #387 +/- ##
===========================================
+ Coverage 47.57% 58.83% +11.26%
===========================================
Files 30 34 +4
Lines 2819 3267 +448
===========================================
+ Hits 1341 1922 +581
+ Misses 1338 1147 -191
- Partials 140 198 +58
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
Hi @ranxi2001, thanks for your work. First, this PR looks more like a feature than a bug fix. Please confirm and update the PR label if needed.
Could you elaborate on this a bit more? Specifically, at which step would this become a problem? |
|
/remove-kind bug |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds compatibility with the current stable
sigs.k8s.io/agent-sandboxGo module release,v0.4.6, for AgentCube's CodeInterpreter warm pool integration.AgentCube's existing CodeInterpreter warm pool path was built against
agent-sandbox v0.1.1. Since then, agent-sandbox has changed both its public API surface and its warm-pool adoption behavior. Supporting the current stable release requires AgentCube to handle these compatibility changes:SandboxPodNameAnnotationmoved out of the internalcontrollerspackage and is now exposed from the public sandbox API package;SandboxWarmPool -> Sandbox -> Pod, andSandboxClaimreports the serving Sandbox through status;agent-sandbox v0.4.6defaultsSandboxTemplate.networkPolicyManagementto managed NetworkPolicy, which blocks AgentCube's current Router / WorkloadManager data path unless AgentCube opts out or provides matching allow rules.This PR:
sigs.k8s.io/agent-sandboxtov0.4.6and aligns the Kubernetes / controller-runtime dependency stack;sandboxv1alpha1.SandboxPodNameAnnotationconstant instead of importing an internal agent-sandbox controller package;SandboxClaim.Status.SandboxStatus.Name, fetches the adopted Sandbox, waits until it is Ready, and uses that Sandbox for Pod IP / entrypoint probing;SandboxClaimname in the AgentCube session store when the request kind isSandboxClaim, so delete / GC still operate on the claim resource;SandboxTemplateobjects tonetworkPolicyManagement: Unmanagedto preserve the existing AgentCube Router / WorkloadManager traffic path;SandboxWarmPool -> Sandbox -> Podownership shape;v0.35.4generator stack;hack/update-codegen.shso code generation usesk8s.io/code-generator v0.35.4without mutating project dependencies throughgo get -d;Which issue(s) this PR fixes:
Refs #386
Special notes for your reviewer:
agent-sandbox v0.4.6, which is the current Go module@latest. I did not includev0.5.0rc1in this PR because it is not listed as a canonical Go module release and, more importantly, it has already moved the APIs AgentCube uses fromv1alpha1tov1beta1(SandboxClaimSpec.TemplateRef-> requiredWarmPoolRef,SandboxSpec.Replicas->OperatingMode). I will track thev0.5.x/v1beta1migration as a separate follow-up because it changes warm-pool claim semantics rather than being a small patch on top of the current v0.4.6 adaptation.networkPolicyManagement: Unmanagedkeeps the behavior AgentCube had withagent-sandbox v0.1.1. If reviewers prefer to keep agent-sandbox managed NetworkPolicies, the alternative is to add explicit allow rules foragentcube-routerandworkloadmanager.hack/update-codegen.shwas also aligned to the new Kubernetes minor version because the oldcode-generator v0.34.1path downgradedagent-sandboxduring generation.Tests run:
go test ./pkg/workloadmanager -count=1make lintgo test -race ./pkg/workloadmanager -count=1go test -race -v -coverprofile=coverage.out -coverpkg=./pkg/... ./pkg/...go list ./... | grep -v '^github.com/volcano-sh/agentcube/test/e2e$' | xargs go test -count=1make gen-checkgo test ./test/e2e -run '^$' -count=1make build-allCodespell: successPython SDK Tests: successPython Lint: successCopyright Check: successCodegen Check: successAgentcube CI Workflow: successLint / golangci-lint: successTest Coverage: successAgentcube E2E Tests: successgo test ./test/e2e -run 'TestCodeInterpreter(BasicInvocation|FileOperations)$' -count=1passedgo test ./test/e2e -run 'TestCodeInterpreterWarmPool$' -count=1passedgo test ./test/e2e -run 'TestCodeInterpreterWarmPoolLoad$' -count=1passed with 100 / 100 successful requeststest/e2e/test_codeinterpreter.py: 3 tests OKtest/e2e/test_langchain_agentcube_sandbox.py: 4 tests OKtest/e2e/test_mcp_code_interpreter.py: 5 tests OKtest/e2e/test_mcp_code_interpreter_stdio.py: 1 test OK/v1endpoint returned the expected final answer42Does this PR introduce a user-facing change?: