SandboxAgent: default PSS runtime identity (runAsNonRoot/runAsUser) per-runtime image #1900

@francescomagalini

Problem

SandboxAgent (kagent.dev/v1alpha2) renders into an agents.x-k8s.io/v1alpha1/Sandbox whose spec.podTemplate.spec is built by buildPodTemplate in go/core/internal/controller/translator/agent/manifest_builder.go.

When the SandboxAgent CR omits spec.declarative.deployment.securityContext and .podSecurityContext (the common case), the rendered Sandbox has no runtime identity:

  • manifest_builder.go:462: SecurityContext: dep.PodSecurityContext (nil)
  • manifest_builder.go:484: securityContext: buildContainerSecurityContext(manifestCtx.deployment.SecurityContext, needCodeExecIsolation), which returns nil when base == nil and needCodeExecIsolation == false (manifest_builder.go:416-433)

The agent-sandbox controller copies Sandbox.Spec.PodTemplate.Spec verbatim into the Pod (kubernetes-sigs/agent-sandbox v0.4.6 sandbox_controller.go:760-783), so the Pod also has no runtime identity.

In environments that enforce the Pod Security Standards (PSS) restricted profile (very common on managed Kubernetes such as AKS, EKS, and GKE, and on clusters where Kyverno/OPA enforces PSS), Pods without runAsNonRoot: true and a numeric non-zero runAsUser are rejected at admission. As a result, SandboxAgents fail to start.

Reproducer

apiVersion: kagent.dev/v1alpha2
kind: SandboxAgent
metadata:
  name: sandbox-agent-test
  namespace: kagent
spec:
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: "smoke test"
  sandbox:
    network:
      allowedDomains: [api.github.com]

On a cluster enforcing PSS-restricted, the spawned Pod is denied:

Every container must effectively set runAsNonRoot: true and a numeric non-zero runAsUser at Pod or container level.
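
To make the rule in that message concrete, here is a minimal Go sketch of the check as I read it. It is a model of the message, not the code of any real admission policy: the identity struct is a trimmed stand-in for the runAsNonRoot/runAsUser fields of corev1.PodSecurityContext and corev1.SecurityContext, and hasNonRootIdentity and ptrTo are hypothetical names. It assumes the usual Kubernetes precedence, where a container-level field overrides the Pod-level one.

```go
package main

import "fmt"

// identity is a stand-in (assumption) for the two fields that matter here.
type identity struct {
	RunAsNonRoot *bool
	RunAsUser    *int64
}

// ptrTo is a small local helper, similar in spirit to ptr.To in k8s.io/utils/ptr.
func ptrTo[T any](v T) *T { return &v }

// hasNonRootIdentity reports whether a container effectively runs with
// runAsNonRoot: true and a non-zero runAsUser. Container-level fields,
// when set, take precedence over Pod-level fields.
func hasNonRootIdentity(pod, container *identity) bool {
	var nonRoot *bool
	var uid *int64
	if pod != nil {
		nonRoot, uid = pod.RunAsNonRoot, pod.RunAsUser
	}
	if container != nil {
		if container.RunAsNonRoot != nil {
			nonRoot = container.RunAsNonRoot
		}
		if container.RunAsUser != nil {
			uid = container.RunAsUser
		}
	}
	return nonRoot != nil && *nonRoot && uid != nil && *uid != 0
}

func main() {
	// What the Sandbox renders today when the CR omits both contexts.
	fmt.Println(hasNonRootIdentity(nil, nil)) // false

	// Pod-level identity only.
	pod := &identity{RunAsNonRoot: ptrTo(true), RunAsUser: ptrTo(int64(1001))}
	fmt.Println(hasNonRootIdentity(pod, nil)) // true
}
```

With both contexts nil, which is what the reproducer above renders, the check fails. Setting the identity at either level is enough to pass.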

Proposed fix

buildContainerSecurityContext (and pod-level defaulting in buildPodTemplate) should set PSS-restricted runtime identity when none is supplied. The UID is known per image:

| Image | Source | USER |
| --- | --- | --- |
| kagent-dev/kagent/app | python/Dockerfile | 1001 |
| kagent-dev/kagent/golang-adk (lean) | go/Dockerfile | 65532 |
| kagent-dev/kagent/golang-adk:<tag>-full (SRT) | go/Dockerfile.full | 1001 |
| kagent-dev/kagent/skills-init | docker/skills-init/Dockerfile | ${PYTHON_UID} |

Suggested behaviour:

  1. buildContainerSecurityContext(base, needCodeExecIsolation, runtime, needsSRT) returns a *corev1.SecurityContext with RunAsNonRoot: ptr.To(true) and RunAsUser: ptr.To(uidForRuntimeImage(runtime, needsSRT)) when base == nil.
  2. buildPodTemplate sets corev1.PodSpec.SecurityContext = &corev1.PodSecurityContext{RunAsNonRoot: ptr.To(true), RunAsUser: ptr.To(uidForRuntimeImage(...))} when dep.PodSecurityContext == nil.
  3. When the user supplies spec.declarative.deployment.{securityContext,podSecurityContext}, those values take precedence over the defaults, so existing CRs behave exactly as before.
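
The defaulting in steps 1 to 3 could look roughly like this. It is a sketch under assumptions, not a patch against manifest_builder.go: SecurityContext is a trimmed local stand-in for corev1.SecurityContext (the same shape would apply to corev1.PodSecurityContext), defaultRuntimeIdentity and ptrTo are hypothetical names, and the UID is passed in rather than looked up. The existing needCodeExecIsolation handling is deliberately left out, since this only covers the base == nil case.

```go
package main

import "fmt"

// SecurityContext is a stand-in (assumption) for the runtime-identity
// fields of corev1.SecurityContext.
type SecurityContext struct {
	RunAsNonRoot *bool
	RunAsUser    *int64
}

// ptrTo is a small local helper, similar in spirit to ptr.To in k8s.io/utils/ptr.
func ptrTo[T any](v T) *T { return &v }

// defaultRuntimeIdentity returns the user-supplied context untouched when
// one exists, and a PSS-restricted identity for the given UID otherwise.
func defaultRuntimeIdentity(base *SecurityContext, uid int64) *SecurityContext {
	if base != nil {
		// Step 3: an explicit context from the CR always wins.
		return base
	}
	// Steps 1 and 2: nothing supplied, so default to a non-root identity.
	return &SecurityContext{
		RunAsNonRoot: ptrTo(true),
		RunAsUser:    ptrTo(uid),
	}
}

func main() {
	// CR omits the context: defaults are applied.
	sc := defaultRuntimeIdentity(nil, 65532)
	fmt.Println(*sc.RunAsNonRoot, *sc.RunAsUser) // true 65532

	// CR supplies a context: it is returned as-is.
	user := &SecurityContext{RunAsUser: ptrTo(int64(2000))}
	kept := defaultRuntimeIdentity(user, 65532)
	fmt.Println(kept == user, *kept.RunAsUser) // true 2000
}
```

Returning base unchanged, rather than merging defaults into it, keeps the override rule simple. A partially filled user context is then the user's responsibility.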

This keeps SandboxAgent runnable out-of-the-box on PSS-restricted clusters without requiring every consumer to hand-roll a securityContext in their CR.

Workaround currently in use

Ameide is shipping a Kyverno ClusterPolicy that mutates the Sandbox CR at admission to add runAsNonRoot: true + runAsUser: 1001 (the default kagent-dev/kagent/app image's UID) when missing. Linkable on request — we'd be glad to retire it once the upstream defaulting lands.

Versions

  • kagent: v0.9.4
  • agent-sandbox: v0.4.6
  • Kubernetes: 1.33 (AKS)
