Skip to content

perf: use ASCII bitsets for strip chars#789

Draft
He-Pin wants to merge 2 commits intodatabricks:masterfrom
He-Pin:perf/strip-ascii-bitset
Draft

perf: use ASCII bitsets for strip chars#789
He-Pin wants to merge 2 commits intodatabricks:masterfrom
He-Pin:perf/strip-ascii-bitset

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented Apr 26, 2026

Motivation:

std.stripChars, std.lstripChars, and std.rstripChars already avoid boxed Set[Int] for single/BMP delimiter sets on current master. This draft keeps that Unicode-aware path and narrows the ASCII-only multi-character delimiter case by replacing the per-call java.util.BitSet allocation with primitive masks.

Key Design Decision:

  • ASCII-only delimiter sets use primitive masks before the BMP fallback.
  • JVM and Scala.js use four Int masks; Scala Native uses two Long masks.
  • Non-ASCII BMP delimiter sets keep the upstream BitSet fast path.
  • Surrogate-containing delimiter sets still fall back to codepoint iteration.
  • chars is forced before str, preserving existing error evaluation order.

Modification:

  • Add platform-specific ASCII strip mask helpers.
  • Route stripChars, lstripChars, and rstripChars through the ASCII mask path when chars contains only ASCII delimiters.
  • Keep existing Unicode/BMP/codepoint behavior unchanged.
  • Extend strip tests for ASCII, Unicode fallback, mask boundaries, and evaluation order.

Benchmark Results:

Rebased onto upstream/master at 0ae7b78a93c4e643f9bcfb6dd1d99d9fe7a522a9.

JMH command:

./mill --no-server -j 1 bench.runRegressions \
  bench/resources/go_suite/lstripChars.jsonnet \
  bench/resources/go_suite/rstripChars.jsonnet \
  bench/resources/go_suite/stripChars.jsonnet \
  bench/resources/cpp_suite/bench.09.jsonnet
Case master ms/op PR ms/op Result
lstripChars.jsonnet 0.113 / 0.115 0.115 / 0.113 mixed/no stable delta
rstripChars.jsonnet 0.116 / 0.116 0.121 / 0.115 mixed/no stable delta
stripChars.jsonnet 0.112 / 0.119 0.115 / 0.117 mixed/no stable delta
bench.09.jsonnet 0.041 / 0.042 0.042 / 0.042 neutral

Scala Native hyperfine, original go_suite strip files, 50 runs, compared against source-built jrsonnet 0.5.0-pre98:

Case master PR jrsonnet Result
lstripChars.jsonnet 6.7 +/- 2.4 ms 2.7 +/- 1.3 ms 1.1 +/- 1.0 ms too short/noisy, PR faster in this run
rstripChars.jsonnet 5.7 +/- 1.2 ms 3.2 +/- 0.6 ms 4.8 +/- 2.1 ms too short/noisy, PR faster in this run
stripChars.jsonnet 6.0 +/- 2.3 ms 3.9 +/- 1.2 ms 3.4 +/- 1.7 ms too short/noisy, PR faster in this run

Scaled temporary Native strip stress cases, 30 runs, --shell=none, outputs checked equal across master/PR/jrsonnet:

Case master PR jrsonnet Result
lstrip-scaled.jsonnet 8.0 +/- 3.0 ms 7.1 +/- 0.5 ms 143.9 +/- 3.6 ms small PR improvement
rstrip-scaled.jsonnet 7.9 +/- 0.6 ms 7.8 +/- 0.5 ms 143.2 +/- 7.0 ms neutral/slightly positive
strip-scaled.jsonnet 7.6 +/- 0.7 ms 7.3 +/- 0.9 ms 142.6 +/- 4.3 ms small PR improvement

Analysis:

Current master already has the main BMP BitSet strip fast path, so this PR no longer closes a visible go_suite strip gap on JVM JMH. The remaining value is narrower: a GC-friendly ASCII delimiter cleanup that can be slightly positive in scaled Native strip stress cases, while original go_suite cases are too small/noisy to justify moving this out of draft.

References:

Result:

Keep as draft. This is safe after rebase and tests, but it should not be marked ready unless a future benchmark shows a stable user-visible improvement or we explicitly decide the GC/allocation cleanup is worth merging on its own.

@He-Pin He-Pin marked this pull request as draft April 26, 2026 17:53
@He-Pin He-Pin force-pushed the perf/strip-ascii-bitset branch from 9004941 to 7f64a1d Compare May 7, 2026 19:49
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 8, 2026

Closing as low priority after rebase: current master already has the main strip-char fast path, and this PR shows no stable standout delta on the go_suite strip docs cases or Native hyperfine.

@He-Pin He-Pin closed this May 8, 2026
@He-Pin He-Pin reopened this May 8, 2026
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 8, 2026

Reopened. This is not negative; keep it as draft as a small GC-friendly strip delimiter cleanup. Current docs strip benchmarks do not show standout gains, so it should not be moved to ready yet.

@He-Pin He-Pin force-pushed the perf/strip-ascii-bitset branch 2 times, most recently from f569b54 to 7f64a1d Compare May 10, 2026 04:38
Port the useful part of He-Pin/sjsonnet jit commit dd90b11 onto current master. The new path avoids Set[Int] allocation for ASCII-only stripChars, lstripChars, and rstripChars calls, preserves the existing codepoint-aware fallback for non-ASCII strip sets, and uses platform-specific masks selected from local measurements.

Upstream-source: dd90b11
@He-Pin He-Pin force-pushed the perf/strip-ascii-bitset branch from 7f64a1d to 4addea1 Compare May 10, 2026 09:38
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 10, 2026

Rebased to latest upstream/master (0ae7b78a). Full __.reformat + __.test passed locally. Fresh JMH is mixed/no stable positive signal; Native scaled strip stress cases are slightly positive, so I kept this as draft rather than moving it ready. PR body has the updated benchmark table and current decision.

Motivation:
The ASCII bitset optimization made stripChars, lstripChars, and rstripChars validate chars before str, contradicting the builtin call path and changing observable error ordering.

Modification:
Restore str-before-chars extraction for all three strip helpers while keeping the optimized strip implementation. Update the regression test to assert the existing first-argument error behavior.

Result:
The optimization keeps existing builtin semantics for type and error ordering.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant