[codex] Add VM benchmark baselines #444

Merged
proboscis merged 1 commit into main from issue/vm-perf-bench-v2/run-20260611-185233 on Jun 11, 2026
@proboscis

Owner

Summary

Adds benchmark baselines for vm-perf-bench-v2. The change is purely additive and does not touch VM runtime code.

  • Added Criterion dispatch benches under packages/doeff-vm-core/benches/ for perform/resume at depth 1 and depth 8 reperform, Resume vs Transfer, fiber create/return, and raw step throughput.
  • Expanded benchmarks/benchmark_runner.py into Python end-to-end benchmark cases with JSON output, smoke mode, and baseline comparison.
  • Committed one baseline JSON from this machine at benchmarks/results/20260611-CA-20035844.json.
  • Added make bench-smoke and a Python 3.12 CI smoke step with no performance gating.
  • Documented run and comparison commands in README.md.
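
As a rough illustration of what a non-gating baseline comparison can look like, the sketch below reports the ratio of current to baseline mean time per benchmark and never fails the run. It is not the code in benchmarks/benchmark_runner.py: the function name `compare_to_baseline` and the flat `{benchmark_name: mean_ms}` mapping are assumptions made for this example, and the real JSON layout may differ.

```python
def compare_to_baseline(current_ms, baseline_ms):
    """Return (name, ratio) rows, where ratio = current mean / baseline mean.

    Both arguments are hypothetical flat dicts mapping benchmark name to
    mean milliseconds per run. A ratio above 1.0 means the current run is
    slower than the baseline. Benchmarks missing from the baseline (or with
    a zero baseline) get a ratio of None instead of raising, so the
    comparison is informational only and never gates anything.
    """
    rows = []
    for name in sorted(current_ms):
        base = baseline_ms.get(name)
        if not base:
            rows.append((name, None))
            continue
        rows.append((name, current_ms[name] / base))
    return rows


if __name__ == "__main__":
    # Illustrative numbers only; not measurements from this PR.
    baseline = {"bench_a": 1.5, "bench_b": 4.0}
    current = {"bench_a": 3.0, "bench_b": 4.0, "bench_new": 0.7}
    for name, ratio in compare_to_baseline(current, baseline):
        label = "n/a" if ratio is None else f"{ratio:.2f}x"
        print(f"{name}: {label}")
```

Returning `None` for unknown benchmarks keeps a smoke run usable right after a new case is added, before a fresh baseline is committed.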

Reference: vm-perf-bench-v2.

Python Baseline

Machine: CA-20035844, macOS arm64, Python 3.14.3, 20 runs.

| Benchmark | Mean ms/run | Unit | Mean us/unit | Throughput (units/s) |
| --- | --- | --- | --- | --- |
| run_trivial | 0.001 | run | 1.385 | 721794.7 |
| run_state_get_put_loop | 0.561 | iteration | 5.614 | 178129.4 |
| run_reader_ask_loop | 0.310 | iteration | 3.104 | 322206.2 |
| spawn_gather_100 | 16.796 | task | 167.960 | 5953.8 |
| spawn_gather_1000 | 145.457 | task | 145.457 | 6874.9 |
| await_sleep_0_round_trip | 7.819 | await | 78.195 | 12788.6 |
| python_callable_boundary | 2.595 | call | 2.595 | 385317.5 |
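
The per-unit and throughput columns above appear to follow from the mean time per run and the number of units in each run. The sketch below shows that arithmetic; the helper name `derived_columns` is hypothetical, and the units-per-run counts (100 and 1000 tasks) are inferred from the benchmark names rather than stated in the PR.

```python
def derived_columns(mean_ms_per_run, units_per_run):
    """Derive (mean us/unit, units/s) from a mean run time in milliseconds.

    units_per_run is an assumption inferred from the benchmark name,
    e.g. 1000 tasks for spawn_gather_1000.
    """
    mean_us_per_unit = mean_ms_per_run * 1000.0 / units_per_run
    throughput_per_s = 1_000_000.0 / mean_us_per_unit
    return mean_us_per_unit, throughput_per_s


if __name__ == "__main__":
    # Mean ms/run values are taken from the table; unit counts are inferred.
    for name, mean_ms, units in [
        ("spawn_gather_100", 16.796, 100),
        ("spawn_gather_1000", 145.457, 1000),
    ]:
        us_per_unit, per_s = derived_columns(mean_ms, units)
        print(f"{name}: {us_per_unit:.3f} us/task, {per_s:.1f} tasks/s")
```

Rounded to the table's precision, these two rows reproduce the reported 167.960 us and 5953.8 tasks/s, and 145.457 us and 6874.9 tasks/s. Rows whose mean is shown rounded to three decimals (such as run_trivial) will not reproduce exactly from the displayed value.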

Verification

  • make bench-smoke passed.
  • uv run python benchmarks/benchmark_runner.py --runs 20 passed and wrote benchmarks/results/20260611-CA-20035844.json.
  • PYO3_PYTHON=/Users/s22625/.orch/worktrees/vm-perf-bench-v2/6f8ba6_codex_20260611-185233/.venv/bin/python cargo bench --features python_bridge --bench dispatch passed.
  • uv run python benchmarks/benchmark_runner.py --smoke --no-output --compare benchmarks/results/20260611-CA-20035844.json passed.
  • uv run ruff check benchmarks/benchmark_runner.py tests/test_benchmark_runner.py passed.
  • uv run ruff format --check benchmarks/benchmark_runner.py tests/test_benchmark_runner.py passed.
  • uv run pytest passed: 838 passed, 84 skipped.

Note: cargo fmt --check across packages/doeff-vm-core still reports pre-existing formatting diffs in src/vm/step.rs; this PR only rustfmt-formatted the new bench file (rustfmt --check packages/doeff-vm-core/benches/dispatch.rs passed).

@proboscis proboscis marked this pull request as ready for review June 11, 2026 10:34
@proboscis proboscis merged commit 80bea23 into main Jun 11, 2026
1 check passed