[codex] Add VM benchmark baselines by proboscis · Pull Request #444 · proboscis/doeff

proboscis · 2026-06-11T10:07:50Z

Summary

Adds benchmark baselines for vm-perf-bench-v2. The change is purely additive: no VM runtime code is modified.

Added Criterion dispatch benches under packages/doeff-vm-core/benches/ covering perform/resume at depth 1, reperform at depth 8, Resume vs Transfer, fiber create/return, and raw step throughput.
Expanded benchmarks/benchmark_runner.py into Python end-to-end benchmark cases with JSON output, smoke mode, and baseline comparison.
Committed one baseline JSON from this machine at benchmarks/results/20260611-CA-20035844.json.
Added make bench-smoke and a Python 3.12 CI smoke step with no performance gating.
Documented run and comparison commands in README.md.
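
For readers who have not opened benchmarks/benchmark_runner.py, the general shape of an end-to-end case with JSON output can be sketched as below. This is a minimal illustration, not the runner's actual code: `CaseResult`, `time_case`, `loop_case`, and the JSON field names are all hypothetical, and "smoke mode" is assumed here to mean simply running fewer repetitions.

```python
import json
import statistics
import time
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class CaseResult:
    # Hypothetical record layout; the real runner's JSON schema may differ.
    name: str
    unit: str
    units_per_run: int
    runs: int
    mean_ms_per_run: float
    mean_us_per_unit: float
    throughput_per_s: float


def time_case(name: str, fn: Callable[[], object], *, unit: str,
              units_per_run: int, runs: int) -> CaseResult:
    """Call `fn` `runs` times and derive per-run and per-unit statistics."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    mean_s = statistics.fmean(samples)
    return CaseResult(
        name=name,
        unit=unit,
        units_per_run=units_per_run,
        runs=runs,
        mean_ms_per_run=mean_s * 1e3,
        mean_us_per_unit=mean_s * 1e6 / units_per_run,
        throughput_per_s=units_per_run / mean_s if mean_s > 0 else float("inf"),
    )


def loop_case() -> int:
    # Stand-in workload: 100 cheap iterations per run (not a doeff program).
    total = 0
    for i in range(100):
        total += i
    return total


smoke = True  # assumption: smoke mode only reduces the run count
result = time_case("loop_case", loop_case, unit="iteration",
                   units_per_run=100, runs=2 if smoke else 20)
print(json.dumps(asdict(result), indent=2))
```

Keeping the unit and units-per-run in the record lets the per-unit and throughput columns be recomputed later from the stored mean.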

Reference: vm-perf-bench-v2.

Python Baseline

Machine: CA-20035844, macOS arm64, Python 3.14.3, 20 runs.

Benchmark	Mean ms/run	Unit	Mean us/unit	Throughput/s
run_trivial	0.001	run	1.385	721794.7
run_state_get_put_loop	0.561	iteration	5.614	178129.4
run_reader_ask_loop	0.310	iteration	3.104	322206.2
spawn_gather_100	16.796	task	167.960	5953.8
spawn_gather_1000	145.457	task	145.457	6874.9
await_sleep_0_round_trip	7.819	await	78.195	12788.6
python_callable_boundary	2.595	call	2.595	385317.5
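
The three numeric columns are related by simple arithmetic, which the sketch below reproduces for the spawn_gather_100 row. The units-per-run value of 100 is not stated in the table; it is inferred from the benchmark name and from the ratio of the ms/run and us/unit columns. `derive_columns` is a hypothetical helper written for this illustration.

```python
def derive_columns(mean_ms_per_run: float, units_per_run: int) -> tuple[float, float]:
    """Derive (mean us/unit, throughput in units/s) from the mean time per run."""
    mean_s_per_run = mean_ms_per_run / 1e3
    mean_us_per_unit = mean_ms_per_run * 1e3 / units_per_run
    throughput_per_s = units_per_run / mean_s_per_run
    return mean_us_per_unit, throughput_per_s


# spawn_gather_100: 16.796 ms/run, assumed 100 tasks per run.
us_per_task, tasks_per_s = derive_columns(16.796, 100)
print(f"{us_per_task:.3f} us/task, {tasks_per_s:.1f} tasks/s")
```

Other rows can differ in the last digit when recomputed this way, presumably because the table's throughput was derived from unrounded means.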

Verification

make bench-smoke passed.
uv run python benchmarks/benchmark_runner.py --runs 20 passed and wrote benchmarks/results/20260611-CA-20035844.json.
PYO3_PYTHON=/Users/s22625/.orch/worktrees/vm-perf-bench-v2/6f8ba6_codex_20260611-185233/.venv/bin/python cargo bench --features python_bridge --bench dispatch passed.
uv run python benchmarks/benchmark_runner.py --smoke --no-output --compare benchmarks/results/20260611-CA-20035844.json passed.
uv run ruff check benchmarks/benchmark_runner.py tests/test_benchmark_runner.py passed.
uv run ruff format --check benchmarks/benchmark_runner.py tests/test_benchmark_runner.py passed.
uv run pytest passed: 838 passed, 84 skipped.
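
Since the CI smoke step applies no performance gating, a baseline comparison only needs to report ratios and never fail the run. A minimal sketch of that idea follows. It assumes results are reduced to a plain name-to-mean-ms mapping; `compare_to_baseline` and `format_report` are hypothetical names, not functions from benchmark_runner.py, and the "current" numbers are made up for illustration.

```python
def compare_to_baseline(current: dict[str, float],
                        baseline: dict[str, float]) -> dict[str, float]:
    """Return current/baseline ratios for cases present in both mappings."""
    ratios = {}
    for name, current_ms in current.items():
        baseline_ms = baseline.get(name)
        if baseline_ms is None or baseline_ms <= 0:
            continue  # new or unusable case: nothing to compare against
        ratios[name] = current_ms / baseline_ms
    return ratios


def format_report(ratios: dict[str, float]) -> str:
    """Render one line per case; informational only, no pass/fail decision."""
    return "\n".join(
        f"{name}: {ratio:.2f}x baseline" for name, ratio in sorted(ratios.items())
    )


# Baseline means (ms/run) taken from the table in this PR.
baseline = {"run_state_get_put_loop": 0.561, "run_reader_ask_loop": 0.310}
# Hypothetical current means, invented for this example.
current = {"run_state_get_put_loop": 0.600, "run_reader_ask_loop": 0.310,
           "brand_new_case": 1.0}

print(format_report(compare_to_baseline(current, baseline)))
```

A ratio above 1.00 means the current run was slower than the baseline. Cases missing from the baseline are skipped rather than treated as errors.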

Note: cargo fmt --check across packages/doeff-vm-core still reports pre-existing formatting diffs in src/vm/step.rs; this PR only rustfmt-formatted the new bench file (rustfmt --check packages/doeff-vm-core/benches/dispatch.rs passed).

Commit c395253: Add VM benchmark baselines

proboscis marked this pull request as ready for review on June 11, 2026 at 10:34.

proboscis merged commit 80bea23 into main on Jun 11, 2026. 1 check passed.

proboscis merged 1 commit into main from issue/vm-perf-bench-v2/run-20260611-185233