Merged
41 changes: 41 additions & 0 deletions docs/architecture.md
@@ -68,6 +68,7 @@ Each provider file auto-registers with its registry on import (side-effect regis
- **Phase 4.5: Authorization** (optional) -- invoke external policy engine (OPA, Cedar, Topaz) when an authorization block is configured. Decisions: `permit`, `deny`, `require-escalation`. Skipped entirely when no authorization block resolves for the task.
- **Phase 5: Execution** -- run the tool, capture stdout/stderr/exit code/duration, compute hashes.
- **Phase 6: Evidence Generation** -- build canonical evidence payload, attest execution, verify evidence if required.
- **Phase 6.5: Post-exec Verify** (optional) -- run `workflow.verify` / `task.verify` after a successful command. This is an operator-local postcondition check recorded separately from evidence; a verify failure either fails the task or is downgraded to a warning, according to `verify.on_failure`.
- **Phase 7: Audit** -- write structured append-only audit record with declared/resolved identity, authorization proof summary, delegation chain, trust level, authorization decision, and runtime instance attribution.
- **Phase 8: Cleanup** -- delete temporary files, destroy ephemeral materialization and derived handoff credentials.

@@ -85,6 +86,46 @@ The architecture composes with emerging standards rather than inventing new prot
- **SPIFFE/WIMSE** -- identity profiles use URI-formatted principals (`agent://`, `spiffe://`) for interoperability with workload identity infrastructure.
- **OAuth 2.0** -- auth modes map to standard grant types: `service` (Client Credentials), `delegated` (Authorization Code), `on-behalf-of` (JWT Authorization Grant, RFC 7523), `exchange` (Token Exchange, RFC 8693).

### Scheduler/Child Trust Boundary

The six-layer identity model enables a meaningful trust boundary between
the scheduler and its child tasks, but only when the child is configured
to be narrower than the parent.

The credential flow traces through four control surfaces:

1. **Operator provisions** -- credentials enter the system via env vars,
Vault, managed identity, or files. The operator controls
`SCHEDULER_PROVIDER_PATH` and the scheduler's execution environment.
2. **Scheduler resolves** -- at dispatch time, the scheduler calls the
identity provider to resolve a credential session. Trust evaluation
and authorization gates run before any credential is materialized.
3. **Provider narrows** -- when `child_credential_policy` is `downscope`,
the provider mints a per-task restricted key via the credential
issuer's API, scoped to exactly the permissions the child declared.
Scope hierarchy validation ensures the child cannot escalate.
The key is revoked in cleanup.
4. **Child receives scoped creds** -- the child task runs with only the
narrowed credentials. For shell tasks, these are injected as env vars.
For agent tasks, auth-profile forwarding directs the gateway to use
the appropriate profile.
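
The scope hierarchy validation in step 3 can be illustrated with a minimal sketch. The scope format here (colon-separated, where a parent scope covers any scope nested under it) and both function names are assumptions made for illustration; the actual provider may model scopes differently.

```javascript
// Hypothetical scope model: a parent scope "a:b" covers itself and any
// scope nested under it ("a:b:c"). This format is an assumption.
function covers(parentScope, childScope) {
  return childScope === parentScope || childScope.startsWith(parentScope + ':');
}

// Returns the child scopes that no parent scope covers.
// An empty result means the child is narrower than or equal to the parent.
function findEscalations(parentScopes, childScopes) {
  return childScopes.filter(
    (child) => !parentScopes.some((parent) => covers(parent, child))
  );
}

const parentScopes = ['repo:read', 'billing'];

// Both requested scopes fall under a parent scope, so nothing is reported.
console.log(findEscalations(parentScopes, ['repo:read', 'billing:invoices']));

// "repo:admin" is not covered by any parent scope and is reported.
console.log(findEscalations(parentScopes, ['repo:admin']));
```

A provider following this shape would refuse to mint a key whenever the returned list is non-empty.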

**Trust boundary definition:** the operator controls the scheduler env
and provider directory. Everything downstream may only narrow: a child
MUST NOT receive broader credentials than its parent. If the provider
directory or scheduler env is compromised, the trust model is broken --
these are root-of-trust assumptions, not runtime invariants.

**Credential strategies:** the current implementation supports both
precreated keys (operator creates restricted keys ahead of time, provider
resolves by scope name) and dynamic key minting (provider mints a
per-task restricted key via the credential issuer's API and revokes it on
cleanup). Both use the same manifest syntax; the provider's
`key_strategy` configuration determines which path runs.
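
As a rough sketch of how one `key_strategy` setting could select between the two paths: the strategy values (`precreated`, `dynamic`), the `precreated_keys` map, and the issuer methods below are all hypothetical names chosen for illustration, not the provider's real configuration or API.

```javascript
// In-memory stand-in for a credential issuer. The mint/revoke/isLive
// methods are hypothetical, not a real issuer API.
function createFakeIssuer() {
  const live = new Set();
  return {
    mint(taskId, scopes) {
      const key = `rk_${taskId}_${scopes.join('+')}`;
      live.add(key);
      return key;
    },
    revoke(key) { live.delete(key); },
    isLive(key) { return live.has(key); },
  };
}

// Returns the key for the child plus a cleanup callback.
// The strategy values "precreated" and "dynamic" are assumed names.
function resolveChildKey(config, issuer, taskId, scopeName) {
  switch (config.key_strategy) {
    case 'precreated': {
      // The operator created the key ahead of time; look it up by scope name.
      const key = config.precreated_keys?.[scopeName];
      if (!key) throw new Error(`no precreated key for scope "${scopeName}"`);
      return { key, cleanup: () => {} }; // nothing to revoke
    }
    case 'dynamic': {
      // Mint a per-task key and revoke it when cleanup runs.
      const key = issuer.mint(taskId, [scopeName]);
      return { key, cleanup: () => issuer.revoke(key) };
    }
    default:
      throw new Error(`unknown key_strategy: ${config.key_strategy}`);
  }
}

const issuer = createFakeIssuer();
const session = resolveChildKey({ key_strategy: 'dynamic' }, issuer, 'task-1', 'repo:read');
console.log(issuer.isLive(session.key)); // key is live while the task runs
session.cleanup();
console.log(issuer.isLive(session.key)); // key is revoked after cleanup
```

Returning the cleanup callback alongside the key keeps the caller identical for both strategies, which matches the idea that the manifest syntax does not change.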

For the full trust architecture with concrete guarantees and
non-guarantees, see `openclaw-scheduler/docs/trust-architecture.md`.

## Near-Term Roadmap

1. Stabilize manifest and target outputs.
49 changes: 49 additions & 0 deletions docs/execution-identity.md
@@ -1722,6 +1722,14 @@ Evidence verification occurs after execution (Phase 5) has already completed. A
- when `verify.required` is `false`, a verification failure is recorded as a warning but does not affect the exit code
- the evidence envelope (including the failed verification status) is always written to the audit record so that operators can investigate

### Phase 6.5: Post-execution Verify

- only enter this phase when the main command exited successfully and a workflow/task `verify` block resolves
- run the declared verify shell in the task's effective execution context
- treat `verify` as an operator-local postcondition separate from evidence attestation; the attested evidence payload reflects the main command result, not the later verify shell outcome
- when `verify.on_failure` is `error`, return a non-zero status after cleanup and audit
- when `verify.on_failure` is `warn`, record the verify failure as a warning without changing the exit code
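
The rules above can be sketched as a small decision function. This is a sketch under stated assumptions: `finalizeTask`, the result field names, and the specific non-zero exit code `1` are hypothetical, and `runVerify` stands in for whatever actually executes the verify shell.

```javascript
// Hypothetical helper that combines the main command result with the
// verify outcome. Field names and the exit code value 1 are assumptions.
function finalizeTask(mainExitCode, verifyConfig, runVerify) {
  const result = { exitCode: mainExitCode, verify: 'skipped', warnings: [] };

  // Phase 6.5 is entered only after a successful command with a verify block.
  if (mainExitCode !== 0 || !verifyConfig) return result;

  if (runVerify(verifyConfig.shell)) {
    result.verify = 'passed';
    return result;
  }

  result.verify = 'failed';
  if (verifyConfig.on_failure === 'warn') {
    // Record the failure without changing the exit code.
    result.warnings.push(`verify failed: ${verifyConfig.shell}`);
  } else {
    // on_failure "error": the task reports a non-zero status.
    result.exitCode = 1;
  }
  return result;
}

const alwaysFail = () => false; // stub in place of a real shell runner
console.log(finalizeTask(0, { shell: 'test -f out.json', on_failure: 'warn' }, alwaysFail));
console.log(finalizeTask(0, { shell: 'test -f out.json', on_failure: 'error' }, alwaysFail));
```

The evidence payload is deliberately absent from this function: per the text above, it reflects the main command result and is not altered by the verify outcome.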

### Phase 7: Audit

- write append-only audit record
@@ -2100,6 +2108,47 @@ Authorization evaluation results SHOULD return:

## Security Model

### Execution Principal Separation

The security model rests on a separation between the control plane
(agentcli) and the execution runtime (openclaw-scheduler or another
backend).

**agentcli's role:** declare identity, compile trust constraints, validate
delegation chains (cycle detection via DFS on scope hierarchy), and verify
authorization proofs at apply time. agentcli does not persist runtime
credentials and does not own the dispatch queue.

**Scheduler's role:** resolve credentials at dispatch time, enforce trust
gates, apply child credential policies, mint or resolve scoped credentials
via identity providers, materialize them into execution environments, and
record audit trails with full identity provenance.

**When the boundary is a real security boundary:** the scheduler/child
separation is a meaningful security boundary when the child is narrower
than the parent in identity, credentials, tools, state, or
network/filesystem scope. The `child_credential_policy` field (`none`,
`inherit`, `downscope`, `independent`) controls this. When the policy is
`downscope`, the provider mints a per-task restricted key scoped to
exactly the permissions the child declared -- the child cannot
access the parent's full credential set.

**When the boundary is operational:** if the child inherits the parent's
full credentials without narrowing, the boundary provides lifecycle
isolation, attribution, context isolation, and blast radius containment
for crashes -- but not credential-based access control. This is still
valuable, but operators should not treat it as a security guarantee unless
narrowing is actually configured.

The honest summary: if you cannot make the child meaningfully narrower in
identity, tools, state, or network/filesystem scope, then the sub-agent
boundary is mostly an execution/lifecycle boundary, not a strong security
boundary. The stronger design is scheduler as broker/orchestrator, child
as bounded actor, with explicit narrowing at each level.

For the runtime perspective on this architecture, see
`openclaw-scheduler/docs/trust-architecture.md`.

### Secret Handling

- manifests MUST NOT contain raw client secrets or bearer tokens
20 changes: 20 additions & 0 deletions docs/field-reference.md
@@ -42,6 +42,7 @@ constraints). When in doubt, the source code is authoritative.
| `authorization` | object | No | Authorization reference (v0.2). See [Authorization Reference Fields](#authorization-reference-fields). |
| `evidence` | object | No | Evidence reference (v0.2). See [Evidence Reference Fields](#evidence-reference-fields). |
| `child_credential_policy` | string | No | Child credential flow policy for triggered children. See [Child Credential Policy Fields](#child-credential-policy-fields). |
| `verify` | object | No | Post-success verification command. See [Task Verify Fields](#task-verify-fields). |

---

@@ -74,6 +75,7 @@ constraints). When in doubt, the source code is authoritative.
| `authorization` | object | No | Authorization reference (v0.2). See [Authorization Reference Fields](#authorization-reference-fields). |
| `evidence` | object | No | Evidence reference (v0.2). See [Evidence Reference Fields](#evidence-reference-fields). |
| `child_credential_policy` | string | No | Child credential flow policy for triggered children. See [Child Credential Policy Fields](#child-credential-policy-fields). |
| `verify` | object | No | Post-success verification command. See [Task Verify Fields](#task-verify-fields). |
| `on_failure` | object | No | Failure handler shorthand. See [On-Failure Fields](#on-failure-fields). |
| `delete_after_run` | boolean | No | Remove the compiled job after first successful execution. |

@@ -272,6 +274,22 @@ When `ref` is present, the referenced profile is loaded first, then inline field
|-------|------|----------|--------|-------------|
| `child_credential_policy` | string | No | `none`, `inherit`, `downscope`, `independent` | Controls how a child task receives or derives credentials relative to its parent. Workflow-level values act as defaults for tasks. |

`child_credential_policy: "downscope"` produces a capability warning, not an error, when the backend lacks `credential_handoff`: the scheduler can still persist the job, but child narrowing will not be enforced at dispatch. This is intentionally softer than `identity.presentation.handoff != "none"`, which is a hard compatibility requirement because the active runtime/backend must advertise explicit handoff semantics up front.
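
The difference between the two checks can be sketched as follows. This is a simplified illustration rather than the project's validator: the function name `checkCredentialHandoff` and the message wording are assumptions, while the `capability_warning` code and the `{ errors, warnings }` return shape follow the validator changed in this PR.

```javascript
// Simplified sketch of the hard-error versus soft-warning split for a
// single job. The function name and messages are hypothetical.
function checkCredentialHandoff(job, features) {
  const errors = [];
  const warnings = [];
  const handoff = job.identity?.presentation?.handoff ?? 'none';

  // Hard gate: an explicit handoff mode needs backend support to be persisted.
  if (handoff !== 'none' && !features.credential_handoff) {
    errors.push({
      feature: 'credential_handoff',
      message: `handoff "${handoff}" requires credential_handoff`,
    });
  }

  // Soft advisory: the job can be persisted, but narrowing is not enforced.
  if (job.child_credential_policy === 'downscope' && !features.credential_handoff) {
    warnings.push({
      code: 'capability_warning',
      feature: 'credential_handoff',
      message: 'child credential scoping will not be enforced',
    });
  }
  return { errors, warnings };
}

const backend = { credential_handoff: false };
console.log(checkCredentialHandoff({ child_credential_policy: 'downscope' }, backend));
console.log(checkCredentialHandoff({ identity: { presentation: { handoff: 'env' } } }, backend));
```

The handoff value `env` in the second call is a placeholder; any value other than `none` takes the hard-error path in this sketch.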

---

## Task Verify Fields

Runs a shell command after the main task succeeds. Workflow-level `verify` acts as the default for tasks. A task-level `verify` replaces the workflow block entirely (the two are not merged), and any optional fields it omits fall back to built-in defaults rather than to workflow-level values.

In the v0.2 execution pipeline, `verify` runs after evidence generation. Evidence and attestation therefore describe the main command result; the `verify` outcome is recorded separately and can still flip the final task status according to `on_failure`. If operators need end-to-end proof that includes the verification step, model that requirement in the evidence payload rather than assuming `verify` is part of the attested result.

| Field | Type | Required | Values | Description |
|-------|------|----------|--------|-------------|
| `shell` | string | Yes | -- | Shell command to run after a successful task execution. |
| `timeout_seconds` | integer | No | `>= 1` | Timeout for the verify command. Default: `30`. |
| `on_failure` | string | No | `error`, `warn` | Whether a verify failure should fail the task or be surfaced as a warning. Default: `error`. |
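
The replace-not-merge behavior is easy to miss, so here is a small sketch. The function mirrors the `resolveVerify` logic added in `src/compiler/shared.js` in condensed form; the workflow and task objects and their shell commands are made-up examples.

```javascript
// Condensed version of the resolveVerify logic from src/compiler/shared.js.
function resolveVerify(workflow, task) {
  const effective = task.verify || workflow.verify || null;
  if (!effective) return null;
  return {
    shell: effective.shell,
    timeout_seconds: effective.timeout_seconds ?? 30,
    on_failure: effective.on_failure ?? 'error',
  };
}

// Example inputs (hypothetical commands).
const workflow = {
  verify: { shell: 'test -f out.json', timeout_seconds: 120, on_failure: 'warn' },
};
const taskWithVerify = { verify: { shell: 'jq -e .ok out.json' } };
const taskWithoutVerify = {};

// The task block wins outright: timeout_seconds is 30 and on_failure is
// "error", not the workflow's 120 and "warn".
console.log(resolveVerify(workflow, taskWithVerify));

// With no task block, the workflow block is used as-is.
console.log(resolveVerify(workflow, taskWithoutVerify));
```

A task that wants the workflow's longer timeout therefore has to repeat `timeout_seconds` in its own `verify` block.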

### Identity Subject Fields

| Field | Type | Required | Values | Description |
@@ -332,6 +350,8 @@ When `ref` is present, the referenced profile is loaded first, then inline field
| `cleanup` | string | No | `always`, `on-success`, `on-failure`, `never` | When credential cleanup runs. |
| `default_redaction` | boolean | No | -- | Whether credential values are redacted by default in audit output. |

`identity.presentation.handoff` is stricter than `child_credential_policy`: any non-`none` handoff mode requires explicit `credential_handoff` support from the active runtime/backend during capability negotiation, because the handoff boundary itself must be modeled first-class.

### Identity Presentation Bindings

Each element in the `bindings` array is an object with these fields:
10 changes: 9 additions & 1 deletion src/apply.js
@@ -240,18 +240,25 @@ export async function applyManifestToScheduler(

let effectiveResult = resolveEffectiveFeatures('openclaw-scheduler', null);
let handoffVersion = '1';
let capabilityWarnings = [];
if (hasV02Features) {
const runtimeCaps = querySchedulerCapabilities(schedulerRunner);
effectiveResult = resolveEffectiveFeatures('openclaw-scheduler', runtimeCaps);
handoffVersion = effectiveResult.handoff_version || '1';

- const capabilityErrors = validateManifestCapabilities(compiled, effectiveResult);
+ const { errors: capabilityErrors, warnings } = validateManifestCapabilities(compiled, effectiveResult);
capabilityWarnings = warnings;
if (capabilityErrors.length > 0) {
throw Object.assign(
new Error(capabilityErrors.map(error => error.message).join('; ')),
{ code: 'unsupported_capability', capability_errors: capabilityErrors }
);
}
if (capabilityWarnings.length > 0) {
for (const warning of capabilityWarnings) {
process.stderr.write(`warning: ${warning.message}\n`);
}
}
}
const effectiveFeatures = effectiveResult.features;

@@ -455,6 +462,7 @@ export async function applyManifestToScheduler(
source: effectiveResult.source,
negotiated: effectiveResult.negotiated,
handoff_version: effectiveResult.handoff_version || null,
...(capabilityWarnings?.length > 0 ? { warnings: capabilityWarnings } : {}),
},
handoff: {
field_version: handoffVersion,
55 changes: 45 additions & 10 deletions src/capabilities.js
@@ -67,21 +67,27 @@ export function resolveEffectiveFeatures(targetName, runtimeCapabilities) {

/**
* Check if the manifest's requirements are satisfied by the effective features.
- * Returns an array of mismatch errors (empty if all satisfied).
+ * Returns { errors: [...], warnings: [...] } where errors are hard gates and
+ * warnings are soft advisories that do not block apply.
*/
export function validateManifestCapabilities(compiledOutput, effectiveFeatures) {
const errors = [];
const warnings = [];
const features = effectiveFeatures.features || effectiveFeatures;

- if (!compiledOutput || !compiledOutput.jobs) return errors;
+ if (!compiledOutput || !compiledOutput.jobs) return { errors, warnings };

- // Apply-time gating is intentionally limited to features that must exist to
- // persist or hand off the compiled durable job spec. Runtime identity
- // resolution, delegation validation, and child_credential_policy remain
- // execution-time concerns: persisted identity declarations may already be
- // sufficient for dispatch, delegation chains are only known after a concrete
- // session is resolved, and child_credential_policy is a runtime column that
- // all v23+ schedulers accept regardless of whether providers are loaded.
+ // Hard gates: features that must exist to persist or hand off the compiled
+ // durable job spec. These block apply when absent.
+ //
+ // Soft warnings: runtime_identity_resolution and credential_handoff (for
+ // child_credential_policy) are checked here as advisories. Persisted identity
+ // declarations may already be sufficient for dispatch and
+ // child_credential_policy is a runtime column that all v23+ schedulers accept
+ // regardless of whether providers are loaded, but a missing runtime feature
+ // means execution will fail -- so we surface the gap early. Delegation
+ // validation remains execution-time only (chains are only known after a
+ // concrete session is resolved).
for (const job of compiledOutput.jobs) {
// Check authorization hook requirement
if (job.authorization || job.authorization_ref) {
@@ -130,7 +136,36 @@ export function validateManifestCapabilities(compiledOutput, effectiveFeatures)
});
}
}

// Soft warning: identity with a real provider requires runtime resolution
const identityProvider = job.identity?.provider ?? null;
if (identityProvider && identityProvider !== 'none') {
if (!features.runtime_identity_resolution) {
warnings.push({
code: 'capability_warning',
feature: 'runtime_identity_resolution',
required_by: `job "${job.name || job.id}"`,
message: `Job "${job.name || job.id}" declares identity provider "${identityProvider}" but the runtime does not support runtime_identity_resolution; credentials will not resolve at execution time`,
});
}
}

// Soft warning: downscope policy requires credential_handoff at runtime.
// Note: identity.presentation.handoff (above) is a hard error because
// the scheduler cannot persist the handoff contract without the feature.
// child_credential_policy is a soft warning because the column is always
// accepted by v23+ schedulers -- enforcement happens at dispatch time.
if (job.child_credential_policy === 'downscope') {
if (!features.credential_handoff) {
warnings.push({
code: 'capability_warning',
feature: 'credential_handoff',
required_by: `job "${job.name || job.id}"`,
message: `Job "${job.name || job.id}" declares child_credential_policy="downscope" but the runtime does not support credential_handoff; child credential scoping will not be enforced`,
});
}
}
}

- return errors;
+ return { errors, warnings };
}
3 changes: 2 additions & 1 deletion src/cli.js
@@ -281,14 +281,15 @@ export async function runCli(
const caps = querySchedulerCapabilities(runner);
const effective = resolveEffectiveFeatures('openclaw-scheduler', caps);
const compiled = getTarget('openclaw-scheduler').compile(manifest);
- const compatibilityErrors = validateManifestCapabilities(compiled, effective);
+ const { errors: compatibilityErrors, warnings: compatibilityWarnings } = validateManifestCapabilities(compiled, effective);
return formatOutput({
ok: true,
capabilities: caps,
effective: effective,
compatibility: {
ok: compatibilityErrors.length === 0,
errors: compatibilityErrors,
warnings: compatibilityWarnings,
},
}, { mode: outputMode, pretty });
}
5 changes: 5 additions & 0 deletions src/compiler/openclaw-scheduler.js
@@ -327,6 +327,11 @@ export function compileManifestToScheduler(manifest, { includeExplain = false }

child_credential_policy: plan.child_credential_policy ?? null,

// verify fields
verify_shell: plan.verify?.shell ?? null,
verify_timeout_s: plan.verify?.timeout_seconds ?? null,
verify_on_failure: plan.verify?.on_failure ?? null,

delete_after_run: plan.delete_after_run ? 1 : 0
};
validateSchedulerStringLimits(targetErrors, taskPath, job);
14 changes: 14 additions & 0 deletions src/compiler/shared.js
@@ -507,11 +507,24 @@ function resolveBudgets(task) {
};
}

export function resolveVerify(workflow, task) {
const workflowVerify = workflow.verify || null;
const taskVerify = task.verify || null;
if (!taskVerify && !workflowVerify) return null;
const effective = taskVerify || workflowVerify;
return {
shell: effective.shell,
timeout_seconds: effective.timeout_seconds ?? 30,
on_failure: effective.on_failure ?? 'error',
};
}

export function normalizedTaskPlan(workflow, task, taskIdToCompiledId) {
const modelPolicy = resolveModelPolicy(workflow, task);
const identity = resolveIdentity(workflow, task);
const contract = resolveContract(workflow, task);
const childCredentialPolicy = resolveChildCredentialPolicy(workflow, task);
const verify = resolveVerify(workflow, task);
const intent = resolveIntent(task);
const output = resolveOutput(task);
const budgets = resolveBudgets(task);
@@ -562,6 +575,7 @@ export function normalizedTaskPlan(workflow, task, taskIdToCompiledId) {
authorization: resolveAuthorization(workflow, task),
evidence: resolveEvidence(workflow, task),
child_credential_policy: childCredentialPolicy,
verify,
delete_after_run: task.delete_after_run ?? null,
parent_compiled_id: task.trigger ? taskIdToCompiledId.get(task.trigger.parent) : null,
};