Add performance benchmarks and review report by rjrodger · Pull Request #10 · jsonicjs/yaml

rjrodger · 2026-04-30T21:35:39Z

Introduces a reproducible benchmark and profiling harness for both the TS and Go parsers, plus a markdown report (bench/REPORT.md) documenting the hotspots found by review and the measurements that confirm them.

Layout:
bench/fixtures/generate.mjs synthetic fixture generator (11 shapes)
bench/ts/bench.mjs per-fixture sweep + optional cpuprofile
bench/ts/scaling.mjs per-byte scaling sweep (N = 100..4000)
go/bench_test.go one Benchmark per fixture
go/scaling_test.go scaling sweep under go test -run
bench/README.md how to run / reproduce
bench/REPORT.md findings, profiles, recommendations

Key findings measured (see REPORT.md for detail):

O(N^2) flow-context rescan in the plain-scalar handler (src/yaml.ts:922, go/yaml.go:1342): cache missing in Go, only partial in TS.
JSON-based deep copy for alias resolution in both languages.
regexp.MustCompile inside the Go lexer hot path.
Character-by-character string concatenation in block scalar folding and continuation-line scanning.
700k+ allocations per 52KB parse in Go, driven by unsized maps and per-token boxing.

Generated artifacts (fixtures, .cpuprof, .cpuprofile, ts-*.json, *.log) are gitignored — regenerate with:
node bench/fixtures/generate.mjs
node bench/ts/bench.mjs
(cd go && go test -bench=. -benchmem -run=^$ .)

Introduces a reproducible benchmark and profiling harness for both the TS and Go parsers, plus a markdown report (bench/REPORT.md) documenting the hotspots found by review and the measurements that confirm them. Layout: bench/fixtures/generate.mjs synthetic fixture generator (11 shapes) bench/ts/bench.mjs per-fixture sweep + optional cpuprofile bench/ts/scaling.mjs per-byte scaling sweep (N = 100..4000) go/bench_test.go one Benchmark per fixture go/scaling_test.go scaling sweep under `go test -run` bench/README.md how to run / reproduce bench/REPORT.md findings, profiles, recommendations Key findings measured (see REPORT.md for detail): - O(N^2) flow-context rescan in the plain-scalar handler (src/yaml.ts:922, go/yaml.go:1342): cache missing in Go, only partial in TS. - JSON-based deep copy for alias resolution in both languages. - regexp.MustCompile inside the Go lexer hot path. - Character-by-character string concatenation in block scalar folding and continuation-line scanning. - 700k+ allocations per 52KB parse in Go, driven by unsized maps and per-token boxing. Generated artifacts (fixtures, .cpuprof, .cpuprofile, ts-*.json, *.log) are gitignored — regenerate with: node bench/fixtures/generate.mjs node bench/ts/bench.mjs (cd go && go test -bench=. -benchmem -run=^\$ .)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbd10e1031

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-30T21:38:40Z

+  for (let i = 0; i < iters; i++) {
+    const parser = makeParser()
+    const t0 = performance.now()
+    try { parser(src) } catch { /* fall through; still report wall time */ }


Fail scaling runs on parse errors instead of timing them

The scaling harness currently swallows parser exceptions and still records elapsed time, so a broken parse at any size/shape is reported as a valid performance datapoint. In practice this can silently corrupt the benchmark output (including slope calculations) by mixing failure paths with successful parses, which undermines the report’s conclusions. Please surface the error (or at least mark that row as failed and exclude it from metrics) rather than continuing as if the parse succeeded.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-30T21:38:40Z

+// TestScaling is a "benchmark" that runs under go test so we can see the
+// growth curve for each input shape. Not a pass/fail test.
+func TestScaling(t *testing.T) {


Keep scaling benchmark out of default go test execution

Declaring the scaling sweep as TestScaling makes it run on every go test ./... invocation, even though this function is intentionally non-assertive and exists only for timing output. Since the repo workflow runs go test -v ./... in CI, this introduces repeated performance benchmarking work into normal correctness runs, increasing job time and flakiness risk without improving test signal. This should be moved to a benchmark (Benchmark...) or gated behind an explicit opt-in flag/env var.

Useful? React with 👍 / 👎.

claude and others added 2 commits April 21, 2026 15:53

Merge branch 'main' into claude/optimize-yaml-parsing-IItaC

6a689e7

rjrodger merged commit f0f67db into main Apr 30, 2026
3 of 6 checks passed

chatgpt-codex-connector Bot reviewed Apr 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add performance benchmarks and review report#10

Add performance benchmarks and review report#10
rjrodger merged 2 commits into
mainfrom
claude/optimize-yaml-parsing-IItaC

rjrodger commented Apr 30, 2026

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rjrodger commented Apr 30, 2026

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants