
perf: add ASCII-safe substr fast path #834

Merged: stephenamar-db merged 1 commit into databricks:master from He-Pin:split/pr776-ascii-substr on May 11, 2026

@He-Pin (Contributor) commented May 10, 2026

Motivation:
Long printable ASCII strings can use UTF-16 length and substring indexes directly because every UTF-16 code unit is also one Jsonnet codepoint. std.substr currently still pays codepoint offset checks for these proven-safe long literals.
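To make the distinction concrete, here is a minimal sketch of the gap between UTF-16 indexing and codepoint indexing. The object and method names (`CodepointDemo`, `codepointLength`, `codepointSubstr`) are hypothetical and are not taken from sjsonnet, and the clamping of out-of-range arguments is an assumption made for illustration only.

```scala
object CodepointDemo {
  // Number of Unicode code points in s (what a codepoint-based length counts).
  def codepointLength(s: String): Int = s.codePointCount(0, s.length)

  // Codepoint-aware substring: offsets are translated to UTF-16 indexes first.
  // Out-of-range arguments are clamped here (an illustrative assumption).
  def codepointSubstr(s: String, from: Int, len: Int): String = {
    val total = codepointLength(s)
    val start = math.min(math.max(from, 0), total)
    val end = math.min(start.toLong + math.max(len, 0), total.toLong).toInt
    val startIdx = s.offsetByCodePoints(0, start)
    val endIdx = s.offsetByCodePoints(startIdx, end - start)
    s.substring(startIdx, endIdx)
  }
}
```

For `"a\uD83D\uDE00b"` (an `a`, one emoji encoded as a surrogate pair, and a `b`), `String.length` is 4 while `codepointLength` is 3, so the two indexing schemes disagree. For a pure ASCII string such as `"hello"`, `codepointSubstr("hello", 1, 3)` and `"hello".substring(1, 4)` both give `"ell"`, which is why the translation step can be skipped once the string is known to be ASCII.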

Key Design Decision:
Reuse the existing _asciiSafe runtime marker, but only set it for long (>1024 chars) string literals that are printable ASCII and JSON-render safe. This keeps parser overhead off short-string-heavy workloads and avoids broad runtime ASCII scans. JVM uses byte-level SWAR for the long-string safety scan; Scala Native and Scala.js keep a scalar single-pass scan because the byte-buffer SWAR variant regressed Native due to UTF-8 allocation cost.
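The two scan strategies can be sketched as follows. This is not the PR's `CharSWAR` code: the object name, the method names, and the exact definition of "JSON-render safe" (bytes 0x20 to 0x7E, excluding `"` and `\`) are assumptions made for illustration. The SWAR variant encodes the string to UTF-8 first, which is the allocation the description refers to.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object AsciiJsonSafeSketch {
  private final val Hi = 0x8080808080808080L

  // Assumed safety rule: printable ASCII except '"' (0x22) and '\' (0x5c).
  private def byteSafe(b: Int): Boolean =
    b >= 0x20 && b <= 0x7e && b != 0x22 && b != 0x5c

  // Scalar single-pass scan over UTF-16 code units; no allocation.
  def isAsciiJsonSafeScalar(s: String): Boolean = {
    var i = 0
    while (i < s.length) {
      if (!byteSafe(s.charAt(i))) return false
      i += 1
    }
    true
  }

  // Checks 8 bytes at once. After the high-bit test every byte is <= 0x7f,
  // so the per-byte additions below cannot carry into the neighbouring byte.
  private def wordSafe(w: Long): Boolean = {
    if ((w & Hi) != 0L) return false                       // some byte >= 0x80
    val ge20 = w + 0x6060606060606060L                     // high bit set iff byte >= 0x20
    val notDel = ~(w + 0x0101010101010101L)                // high bit set iff byte != 0x7f
    val notQuote = (w ^ 0x2222222222222222L) + 0x7f7f7f7f7f7f7f7fL
    val notBackslash = (w ^ 0x5c5c5c5c5c5c5c5cL) + 0x7f7f7f7f7f7f7f7fL
    (ge20 & notDel & notQuote & notBackslash & Hi) == Hi
  }

  // Byte-level SWAR scan: allocates the UTF-8 bytes, then tests 8 at a time.
  def isAsciiJsonSafeSwar(s: String): Boolean = {
    val bytes = s.getBytes(StandardCharsets.UTF_8)
    val buf = ByteBuffer.wrap(bytes)
    var i = 0
    while (i + 8 <= bytes.length) {
      if (!wordSafe(buf.getLong(i))) return false
      i += 8
    }
    while (i < bytes.length) {                             // scalar tail
      if (!byteSafe(bytes(i) & 0xff)) return false
      i += 1
    }
    true
  }
}
```

Both functions are meant to return the same answer for every input; they differ only in cost. Non-ASCII characters encode to UTF-8 bytes with the high bit set, so the first test in `wordSafe` rejects them.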

Modification:

  • Add CharSWAR.isAsciiJsonSafe on JVM, Native, and JS.
  • Use JVM byte-SWAR to check long strings for ASCII + JSON-safe bytes.
  • Mark long ASCII JSON-safe literals in Parser.makeStr.
  • Propagate _asciiSafe through Val.Str.concat when both operands are safe.
  • Let std.length and std.substr use direct String.length / substring only for proven-safe strings.
  • Add Unicode/ASCII regression coverage for long ASCII length, substr bounds, and concat propagation.
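The marking, propagation, and fast-path steps listed above can be modelled in a few lines. This is a simplified sketch, not the sjsonnet implementation: `Str`, `makeStr`, `length`, and `substr` here are hypothetical stand-ins for `Val.Str`, `Parser.makeStr`, `std.length`, and `std.substr`, the safety predicate is an assumed definition, and argument clamping is chosen only to keep the example short.

```scala
object AsciiSafeMarkerSketch {
  val LongLiteralThreshold = 1024 // only literals longer than this are scanned

  // A string value carrying a "proven ASCII-safe" marker.
  final case class Str(value: String, asciiSafe: Boolean) {
    // The result is marked safe only when both operands were proven safe.
    def concat(other: Str): Str =
      Str(value + other.value, asciiSafe && other.asciiSafe)
  }

  // Assumed rule: printable ASCII except '"' and '\'.
  def isAsciiJsonSafe(s: String): Boolean =
    s.forall(c => c >= 0x20 && c <= 0x7e && c != '"' && c != '\\')

  // Parser-side marking: short literals skip the scan and stay unmarked.
  def makeStr(literal: String): Str =
    Str(literal, literal.length > LongLiteralThreshold && isAsciiJsonSafe(literal))

  def length(s: Str): Int =
    if (s.asciiSafe) s.value.length                  // fast path: UTF-16 length
    else s.value.codePointCount(0, s.value.length)   // slow path: count code points

  def substr(s: Str, from: Int, len: Int): String = {
    val v = s.value
    val total = length(s)
    val start = math.min(math.max(from, 0), total)
    val end = math.min(start.toLong + math.max(len, 0), total.toLong).toInt
    if (s.asciiSafe) v.substring(start, end)         // fast path: direct indexes
    else {
      val a = v.offsetByCodePoints(0, start)         // slow path: translate offsets
      v.substring(a, v.offsetByCodePoints(a, end - start))
    }
  }
}
```

An unmarked string is not necessarily non-ASCII; it simply has not been proven safe, so it takes the codepoint-aware path. That keeps the marker conservative: concatenating a marked literal with any unmarked value drops the marker rather than rescanning.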

Benchmark Results:
JMH, lower is better (./mill --no-server -j 1 bench.runRegressions ...):

| benchmark | master ms/op | this PR ms/op | result |
| --- | --- | --- | --- |
| go_suite/substr.jsonnet | 0.056 | 0.045 | ~20% faster |
| jdk17_suite/split_resolve.jsonnet | 0.147 | 0.146 | neutral |
| cpp_suite/realistic2.jsonnet | 44.529 | 45.024 | neutral/no stable regression signal |
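For reference, the "~20% faster" figure for go_suite/substr.jsonnet follows directly from the two ms/op values reported in this table:

```latex
\frac{0.056 - 0.045}{0.056} = \frac{0.011}{0.056} \approx 0.196 \approx 20\%
```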

Scala Native hyperfine, lower is better (--shell=none --warmup 5 --runs 20):

| command | mean | note |
| --- | --- | --- |
| master native go_suite/substr.jsonnet | 7.7 ± 1.8 ms | baseline |
| this PR native go_suite/substr.jsonnet | 6.5 ± 1.0 ms | neutral/slightly faster vs master |
| jrsonnet go_suite/substr.jsonnet | 4.0 ± 0.6 ms | still ahead |

Analysis:
The broad #776 ASCII/string fast-path ideas were narrowed aggressively. Runtime ASCII scans in std.length/std.substr caused guard regressions, and all-platform UTF-8 byte-SWAR regressed Scala Native. This version uses byte-SWAR only where it measured well (JVM), keeps Native/JS allocation-free, and only takes fast paths when the marker is already proven, preserving Jsonnet codepoint semantics for all non-ASCII strings.

References:

Result:
Draft PR for review. JVM substr benchmark is positive; Native is neutral/slightly positive with no-shell hyperfine; full ./mill --no-server -j 1 __.reformat && ./mill --no-server -j 1 __.test passed locally.

@He-Pin He-Pin marked this pull request as ready for review May 10, 2026 11:50
@He-Pin He-Pin marked this pull request as draft May 10, 2026 11:50
@He-Pin He-Pin marked this pull request as ready for review May 10, 2026 11:51
@He-Pin He-Pin force-pushed the split/pr776-ascii-substr branch from 340215d to a64324e Compare May 10, 2026 12:02
@He-Pin He-Pin marked this pull request as draft May 10, 2026 12:02
@He-Pin He-Pin force-pushed the split/pr776-ascii-substr branch from a64324e to c6cea80 Compare May 10, 2026 12:21
@He-Pin He-Pin marked this pull request as ready for review May 10, 2026 12:36
@stephenamar-db (Collaborator) commented

rebase

Motivation:
std.substr on long ASCII strings repeatedly pays codepoint-offset scans even when parser-time analysis can prove the literal is printable ASCII and JSON-render safe.

Modification:
Mark long ASCII JSON-safe literals with the existing _asciiSafe flag using a single platform CharSWAR scan, propagate the flag through string concatenation, and let std.length/std.substr use direct UTF-16 length/substring only for proven-safe values. Add UnicodeHandlingTests coverage for long ASCII length/substr boundaries and concat propagation.

Result:
Focused JVM JMH improves go_suite/substr from 0.056 ms/op to 0.046-0.047 ms/op with split_resolve unchanged and realistic2 in the same noise range. Scala Native hyperfine is neutral against master on the same case.

References:
Extracted from ideas in databricks#776, especially commit a190a80 (ASCII fast paths and asciiSafe propagation), narrowed to avoid the broader join/parseInt changes.
@He-Pin He-Pin force-pushed the split/pr776-ascii-substr branch from c6cea80 to 35937b9 Compare May 11, 2026 04:01
@He-Pin (Contributor, Author) commented May 11, 2026

@stephenamar-db rebased

@stephenamar-db stephenamar-db merged commit 50acf20 into databricks:master May 11, 2026
5 checks passed
2 participants