Skip to content

perf: speed up constant string joins#825

Merged
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/constant-array-join
May 11, 2026
Merged

perf: speed up constant string joins#825
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/constant-array-join

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented May 7, 2026

Motivation:

std.join over repeated constant string arrays (for example std.join('', std.repeat(['e'], n))) currently walks every element and repeatedly appends the same separator/value pair. This PR keeps the optimization focused on that repeated/constant-array case while preserving Jsonnet laziness and null-skipping behavior.

Key Design Decision

Expose a best-effort constantEval from array views and use it only when std.join can prove every element is the same already-known string or null. If the element is lazy, non-string, or otherwise unsafe to inspect eagerly, the code falls back to the existing element-by-element join path.

Modification:

  • Add constant/repeated array views for repeated values.
  • Fast-path string std.join for constant/repeated string arrays with exact result sizing.
  • Preserve existing pre-sized array-copy behavior for array separators.
  • Add directional golden coverage for repeated strings, Unicode strings, null skipping, array joins, and zero-length lazy-error behavior.

Benchmark Results:

JMH on JDK 21, ./mill -j 1 bench.runRegressions ..., lower ms/op is better; ops/ms = 1 / ms_per_op, higher is better.

Case master ms/op PR ms/op master ops/ms PR ops/ms Delta
cpp_suite/bench.09.jsonnet 0.041 0.037 24.39 27.03 +10.8% ops/ms
cpp_suite/large_string_join.jsonnet 0.547 0.252 1.83 3.97 +117.1% ops/ms
sjsonnet_suite/array_copy_views.jsonnet guard rerun 7.770 7.677 0.129 0.130 +1.2% ops/ms

Scala Native hyperfine, lower is better, comparing locally built Scala Native binaries and source-built jrsonnet 0.5.0-pre98.

Case master PR jrsonnet Result
scaled repeated join (std.repeat(['abcd'], 50000)) 8.1 ± 1.2 ms 6.9 ± 1.0 ms 4.8 ± 1.2 ms PR is ~14.8% faster than master; still ~1.44x jrsonnet
cpp_suite/large_string_join.jsonnet 7.0 ± 1.2 ms 5.5 ± 1.0 ms 5.2 ± 0.9 ms PR is ~21.4% faster than master; roughly at jrsonnet parity
cpp_suite/bench.09.jsonnet tiny CLI run 6.5 ± 1.7 ms 8.3 ± 1.3 ms 5.7 ± 0.8 ms startup/noise dominated; superseded by scaled repeated-join run above

Analysis:

The semantic guard is intentionally conservative: only already-constant string/null elements take the new direct materialization path. Lazy errors are not forced early, as covered by std.join('-', std.repeat([error 'unused'], 0)). Non-constant arrays and array-separator joins retain the existing master behavior.

References:

Result:

  • ./mill -j 1 __.reformat && ./mill -j 1 __.test passed locally.
  • PR comments were reviewed; there are no review threads. The maintainer request was to rebase this PR one-at-a-time, which this update addresses.

@He-Pin He-Pin marked this pull request as draft May 7, 2026 19:20
@He-Pin He-Pin marked this pull request as ready for review May 7, 2026 19:53
@He-Pin He-Pin force-pushed the perf/constant-array-join branch from 7dd77cb to 9e7c7f1 Compare May 7, 2026 19:54
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 8, 2026

Closing as superseded by #828, which carries the TOML and repeated join work with current docs-aligned benchmark data.

@He-Pin He-Pin closed this May 8, 2026
@He-Pin He-Pin reopened this May 8, 2026
@He-Pin He-Pin marked this pull request as draft May 8, 2026 05:21
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 8, 2026

Reopened and moved back to draft rather than closing. It is not negative, but it is currently superseded by #828, so it should not be ready for review unless we decide to keep it as a separate smaller join-only PR.

@He-Pin He-Pin marked this pull request as ready for review May 8, 2026 05:22
@He-Pin He-Pin marked this pull request as draft May 8, 2026 05:23
@He-Pin He-Pin marked this pull request as ready for review May 8, 2026 05:23
@He-Pin He-Pin marked this pull request as draft May 8, 2026 05:23
@He-Pin He-Pin marked this pull request as ready for review May 8, 2026 05:24
@stephenamar-db
Copy link
Copy Markdown
Collaborator

rebase

@He-Pin He-Pin force-pushed the perf/constant-array-join branch from 9e7c7f1 to ca2d708 Compare May 9, 2026 19:17
@stephenamar-db
Copy link
Copy Markdown
Collaborator

Let's do it one at a time here - please rebase this one and I'll merge

@He-Pin He-Pin force-pushed the perf/constant-array-join branch 2 times, most recently from 8e8401c to ca2d708 Compare May 10, 2026 04:37
Motivation:

std.join over std.repeat([string], n) currently walks every repeated element and repeatedly appends the same string/separator pattern.

Modification:

Teach repeated/constant arrays to expose a constant Eval and let std.join materialize repeated string results directly while preserving null skipping and lazy error behavior. Keep the existing array-join copy paths for non-string separators.

Result:

Adds directional golden coverage for repeated string, unicode string, null, array, and zero-length lazy-error join cases.

References:

PR: databricks#825
@He-Pin He-Pin force-pushed the perf/constant-array-join branch from ca2d708 to ce421ff Compare May 10, 2026 07:17
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 10, 2026

Rebased onto latest master and refreshed the local validation/benchmark data in the PR description. I also added directional golden coverage for repeated constant joins.

@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented May 10, 2026

@stephenamar-db I have rebased ,sorry for many conflicts

@stephenamar-db stephenamar-db merged commit 8602c69 into databricks:master May 11, 2026
5 checks passed
@He-Pin He-Pin deleted the perf/constant-array-join branch May 11, 2026 03:29
stephenamar-db pushed a commit that referenced this pull request May 11, 2026
Motivation:

`std.manifestTomlEx` spends avoidable time and allocation while
rendering TOML: regex matching for bare keys, intermediate escaped
strings, per-value renderer allocation, `Seq` path copies, `partition`,
and repeated sorting during table traversal.

Key Design Decision

Keep this PR TOML-only. The earlier repeated-join work is now handled
separately in #825, so this branch was intentionally reduced to a single
focused TOML rendering commit after rebasing onto latest master.

Modification:

- Write TOML strings and keys directly to the existing `StringWriter`.
- Replace bare-key regex matching with a small ASCII scanner.
- Avoid string concatenations and unnecessary `StringWriter.flush()`
calls while rendering arrays.
- Reuse a `TomlRenderer` per table level and keep table paths in a
mutable buffer.
- Use sorted visible keys plus section flags instead of `partition` plus
extra sorting.
- Add directional golden coverage for TOML key escaping, non-section
ordering, nested tables, and table arrays.

Benchmark Results:

JMH on JDK 21, `./mill -j 1 bench.runRegressions
bench/resources/go_suite/manifestTomlEx.jsonnet`, lower `ms/op` is
better; `ops/ms = 1 / ms_per_op`, higher is better.

| Case | master ms/op | PR ms/op | master ops/ms | PR ops/ms | Delta |
| --- | ---: | ---: | ---: | ---: | ---: |
| `go_suite/manifestTomlEx.jsonnet` | 0.067 | 0.053 | 14.93 | 18.87 |
+26.4% ops/ms |

Scala Native hyperfine, lower is better, comparing locally built Scala
Native binaries and source-built `jrsonnet 0.5.0-pre98`.

| Case | master | PR | jrsonnet | Result |
| --- | ---: | ---: | ---: | --- |
| `go_suite/manifestTomlEx.jsonnet` | 5.3 ± 1.3 ms | 5.3 ± 1.3 ms | 4.0
± 1.1 ms | startup/noise dominated; PR and master are neutral |
| scaled TOML manifestation loop | 29.0 ± 1.6 ms | 12.1 ± 1.0 ms | 10.0
± 0.9 ms | PR is ~58.3% faster than master; still ~1.21x jrsonnet |

Analysis:

The output remains compatible with current sjsonnet behavior: local
master and this PR produced identical output on the benchmark inputs and
the new directional TOML golden test. The scaled Native comparison
includes jrsonnet as a performance reference, but jrsonnet's TOML
formatting differs from sjsonnet on these inputs, so
sjsonnet-vs-sjsonnet output equality is the semantic gate.

References:

- PR: #828
- Rebased head: `3efa453a5b6f48d6dfd0e1cd034922d6c8e2ff03`
- Base: `0ae7b78a93c4e643f9bcfb6dd1d99d9fe7a522a9`
- Split dependency: #825 now owns the repeated constant join
optimization.

Result:

- `./mill --no-server -j 1 __.reformat` passed locally.
- `./mill --no-server -j 1 __.test` passed locally.
- PR comments were reviewed; there are no review threads. The maintainer
rebase request is addressed by this force-pushed TOML-only branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants