[codex] Add CLI model option pass-through#84

Open: rawwerks wants to merge 7 commits into main from codex/cli-model-options


@rawwerks rawwerks commented May 15, 2026

Summary

  • Adds forwarded CLI controls for prose run --model ... --reasoning-effort ....
  • Passes model controls through shared harness run options without adding them to the canonical OpenProse prompt.
  • Forwards Codex controls to ThreadOptions.model and ThreadOptions.modelReasoningEffort.
  • Adds Codex environment fallbacks: PROSE_CODEX_MODEL and PROSE_CODEX_REASONING_EFFORT.
  • Forwards Claude controls to model and effort through the Claude Agent SDK.
  • Updates the real harness smoke workflow to discover current provider model IDs and effort support before exercising the flags.
  • Documents usage and expands CLI/harness/smoke-discovery test coverage.
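As a rough illustration of the Codex environment fallbacks listed above, here is a minimal sketch of how a flag value and an environment variable might be combined. The `ModelControls` shape and the `resolveCodexControls` helper are hypothetical names, not taken from this PR, and the precedence (explicit flag first, then environment) is an assumption. The environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical shape for forwarded model controls; the PR's real types may differ.
interface ModelControls {
  model?: string;
  reasoningEffort?: string;
}

type Env = Record<string, string | undefined>;

// Assumption: an explicit CLI flag wins over the environment fallback.
function resolveCodexControls(flags: ModelControls, env: Env): ModelControls {
  return {
    model: flags.model ?? env["PROSE_CODEX_MODEL"],
    reasoningEffort: flags.reasoningEffort ?? env["PROSE_CODEX_REASONING_EFFORT"],
  };
}

// Example: the flag supplies the model, the environment supplies the effort.
const resolved = resolveCodexControls(
  { model: "gpt-5.4" },
  { PROSE_CODEX_MODEL: "gpt-5.4-mini", PROSE_CODEX_REASONING_EFFORT: "low" },
);
console.log(resolved);
```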

Use Case / Run Evidence

This came from a current OpenProse/Codex session where prose run could select --harness but could not select the underlying Codex model or reasoning effort, even though @openai/codex-sdk exposes those thread options.

The user-visible friction was simple: model selection should be a thin pass-through to the selected harness, not a missing feature or a Contract Markdown semantic.

Local run receipt: runs/20260515-102114-8cd0b6.

Design Boundary

This belongs in tools/cli/ because it is shell/harness forwarding behavior. It does not change Contract Markdown, Forme, Prose VM semantics, or stdlib contracts.

The implementation keeps OpenProse harness- and model-agnostic by default: the CLI forwards controls only to harness adapters that know how to apply them. Codex receives model / modelReasoningEffort; Claude receives model / effort.
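The per-harness mapping described above could look roughly like the following sketch. The function name `toHarnessOptions` and the `ForwardedControls` type are hypothetical. The harness identifiers `codex-sdk` and `claude-sdk` are the ones named in the CI results, and the target option names come from the description; everything else is an assumption, including the choice to forward nothing to an unknown harness.

```typescript
// Hypothetical input shape for the forwarded CLI controls.
interface ForwardedControls {
  model?: string;
  reasoningEffort?: string;
}

// Maps the shared controls onto the option names each harness expects.
// Unknown harnesses get an empty object, so nothing is forwarded to them.
function toHarnessOptions(
  harness: string,
  controls: ForwardedControls,
): Record<string, string> {
  const out: Record<string, string> = {};
  if (harness === "codex-sdk") {
    if (controls.model !== undefined) out["model"] = controls.model;
    if (controls.reasoningEffort !== undefined) {
      out["modelReasoningEffort"] = controls.reasoningEffort;
    }
  } else if (harness === "claude-sdk") {
    if (controls.model !== undefined) out["model"] = controls.model;
    if (controls.reasoningEffort !== undefined) {
      out["effort"] = controls.reasoningEffort;
    }
  }
  return out;
}

console.log(toHarnessOptions("codex-sdk", { model: "gpt-5.4", reasoningEffort: "high" }));
```

Keeping the mapping in one adapter-side function is one way to keep the shared run options free of provider-specific names.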

CI Model Discovery

The real harness smoke workflow no longer hardcodes exact provider model names.

  • Codex smoke discovers the current Codex catalog with codex debug models and selects an API-supported model that advertises reasoning-effort support.
  • Claude smoke discovers models through Anthropic's Models API and selects a model whose capabilities match the current Claude Code effort path. The current native path still emits enabled thinking when effort is set, so CI intentionally selects models compatible with that behavior.
  • Repo variables may pin or filter selection when needed: PROSE_SMOKE_CODEX_MODEL, PROSE_SMOKE_CODEX_MODEL_PATTERN, PROSE_SMOKE_CLAUDE_MODEL, and PROSE_SMOKE_CLAUDE_MODEL_PATTERN.
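The pin-or-filter selection described above might be sketched as follows. This is not the logic in smoke-harness.mjs. The `CatalogEntry` shape and the `selectSmokeModel` helper are hypothetical, and the sketch assumes that a pinned model bypasses discovery, that the pattern is a regular expression, and that the first matching entry wins.

```typescript
// Hypothetical, simplified view of one discovered model.
interface CatalogEntry {
  id: string;
  supportsEffort: boolean;
}

// Assumptions: a pin is used as-is; a pattern is treated as a regular expression;
// only models that advertise effort support are eligible; first match wins.
function selectSmokeModel(
  catalog: CatalogEntry[],
  pin?: string,
  pattern?: string,
): string | undefined {
  if (pin) return pin;
  const re = pattern ? new RegExp(pattern) : undefined;
  const match = catalog.find(
    (m) => m.supportsEffort && (re === undefined || re.test(m.id)),
  );
  return match?.id;
}

// Placeholder catalog; the IDs are illustrative, not real provider model names.
const catalog: CatalogEntry[] = [
  { id: "model-a", supportsEffort: false },
  { id: "model-b", supportsEffort: true },
  { id: "model-c", supportsEffort: true },
];
console.log(selectSmokeModel(catalog, undefined, "c$"));
```

In this sketch the pin and pattern arguments would be fed from repo variables such as PROSE_SMOKE_CODEX_MODEL and PROSE_SMOKE_CODEX_MODEL_PATTERN.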

Examples

prose run std/evals/inspector --model gpt-5.4 --reasoning-effort high
PROSE_CODEX_MODEL=gpt-5.4-mini PROSE_CODEX_REASONING_EFFORT=low prose run std/evals/inspector

The canonical prompt remains a Prose command, without model-control flags being treated as caller inputs:

prose run flow.prose.md
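One way to picture this separation is a small argument splitter that pulls the model-control flags out of argv before the canonical prompt is built. The helper `splitModelControls` is hypothetical and is not the PR's parser. It only handles the space-separated `--flag value` form, which is an assumption about how the flags are written.

```typescript
// Flags treated as model controls rather than caller inputs.
const MODEL_CONTROL_FLAGS = new Set(["--model", "--reasoning-effort"]);

// Splits argv into forwarded controls and the remaining arguments.
// Only the "--flag value" form is handled here; "--flag=value" is not.
function splitModelControls(argv: string[]): {
  controls: Record<string, string>;
  rest: string[];
} {
  const controls: Record<string, string> = {};
  const rest: string[] = [];
  const queue = [...argv];
  while (queue.length > 0) {
    const arg = queue.shift() as string;
    if (MODEL_CONTROL_FLAGS.has(arg)) {
      const value = queue.shift();
      if (value !== undefined) controls[arg.slice(2)] = value;
    } else {
      rest.push(arg);
    }
  }
  return { controls, rest };
}

const split = splitModelControls([
  "run", "flow.prose.md", "--model", "gpt-5.4", "--reasoning-effort", "high",
]);
// The canonical prompt is rebuilt from the remaining arguments only.
console.log(["prose", ...split.rest].join(" "));
console.log(split.controls);
```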

Testing

  • cd tools/cli && npm test: 254 tests passed
  • cd tools/cli && npm test -- tests/scripts/smoke-harness.test.mjs tests/harnesses/harnesses.test.ts: 17 tests passed
  • cd tools/cli && npm run typecheck: passed
  • cd tools/cli && npm run build: passed
  • node tools/cli/scripts/smoke-harness.mjs --help: passed
  • ruby -e 'require "yaml"; YAML.load_file(".github/workflows/cli-real-harness-smoke.yml")': passed
  • git diff --check: passed
  • GitHub Actions on d6b8fc0: CLI Real Harness Smoke passed for both codex-sdk and claude-sdk
  • GitHub Actions on d6b8fc0: CLI Release Check, OpenProse Smoke, and CodeQL passed
  • safe-push origin codex/cli-model-options: reviewer gate approved and branch pushed

Audit Notes

A read-only subagent audit found two smoke-discovery issues: Claude discovery was checking the wrong thinking capability for the live path, and it allowed xhigh even though the Claude harness rejects it. This PR now has targeted smoke-discovery tests for both cases.
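The xhigh case can be illustrated with a small filter that discovery might apply before picking an effort value for the Claude smoke. This is a sketch only. The helper names and the idea of a per-model list of advertised effort levels are assumptions, and the only fact taken from the audit is that the Claude harness rejects xhigh.

```typescript
// From the audit: the Claude harness rejects "xhigh".
// Any other effort level is passed through unchanged in this sketch.
const CLAUDE_REJECTED_EFFORTS = new Set(["xhigh"]);

function claudeAcceptsEffort(effort: string): boolean {
  return !CLAUDE_REJECTED_EFFORTS.has(effort);
}

// Hypothetical helper: keep only the advertised levels the harness would accept.
function usableClaudeEfforts(advertised: string[]): string[] {
  return advertised.filter(claudeAcceptsEffort);
}

console.log(usableClaudeEfforts(["low", "high", "xhigh"]));
```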

Residual Risk / Follow-ups

  • Local real-provider smoke could not run because this machine has no OPENAI_API_KEY or ANTHROPIC_API_KEY; live provider validation was completed by GitHub Actions with repository secrets.
  • Future work can decide whether model controls should also affect local deterministic commands that later launch activations indirectly, such as serve.

@rawwerks rawwerks marked this pull request as ready for review May 15, 2026 14:54