perf: buffer json file output by He-Pin · Pull Request #870 · databricks/sjsonnet

He-Pin · 2026-05-23T16:19:20Z

Motivation:

The Native stdout buffering follow-up (#869) showed that downstream buffering can materially reduce write overhead for large outputs. JSON output written with -o still sends ByteRenderer chunks directly to the file output stream, relying only on ByteBuilder's internal flush threshold.

Key Design Decision:

Keep the change local to the JSON output-file fast path. Rather than changing ByteBuilder thresholds globally, wrap the file output stream in a BufferedOutputStream with the same 256 KiB output buffer size used for the Native stdout buffering follow-up. YAML, expect-string, stdout, and renderer semantics stay unchanged.

Modification:

Add OutputBufferSize = 256 * 1024 in SjsonnetMainBase.
Wrap JSON output-file ByteRenderer targets in BufferedOutputStream(out, OutputBufferSize).
Flush the buffered stream at the same completion boundary before closing the underlying file output stream.
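
For readers who want the shape of the change without opening the diff, the three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual sjsonnet code: the object name `BufferedFileOutputSketch` and the helper `withBufferedOutput` are hypothetical, and only `OutputBufferSize` and the use of `BufferedOutputStream` come from the description.

```scala
import java.io.{BufferedOutputStream, OutputStream}

object BufferedFileOutputSketch {
  // Same value the PR adds to SjsonnetMainBase: 256 KiB.
  val OutputBufferSize: Int = 256 * 1024

  /** Hypothetical helper (not part of sjsonnet): hands `render` a buffered
    * view of `out`, flushes the buffer once rendering completes, and always
    * closes the underlying stream afterwards.
    */
  def withBufferedOutput(out: OutputStream)(render: OutputStream => Unit): Unit = {
    val buffered = new BufferedOutputStream(out, OutputBufferSize)
    try {
      render(buffered)  // the renderer writes its chunks into the buffer
      buffered.flush()  // push any remaining bytes down before closing
    } finally out.close()
  }
}
```

With a `java.io.FileOutputStream` as `out`, many small renderer chunks are coalesced into writes of up to 256 KiB. The bytes that reach the file are the same, only the number of underlying write calls changes.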

Benchmark Results:

Workload: jrsonnet/tests/realworld/entry-kube-prometheus.jsonnet -J vendor -o /tmp/fileout-*.json

Candidate was benchmarked on the Scala Native 0.5.12 stacked exploration branch after the Native stdout buffering commit.

Order	Clean	Candidate	Result
Forward mean	217.372 ms	205.062 ms	-5.7%
Forward median	196.625 ms	183.491 ms	-6.7%
Reverse mean	210.517 ms	177.174 ms	-15.8%
Reverse median	193.394 ms	175.878 ms	-9.1%
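
The Result column is the relative change of the candidate against the clean build. A small sketch of that arithmetic, assuming the usual (candidate - clean) / clean definition; the names `BenchDelta`, `relativeChangePct`, and `round1` are illustrative only:

```scala
object BenchDelta {
  /** Relative change of candidate vs clean, in percent (negative means faster). */
  def relativeChangePct(cleanMs: Double, candidateMs: Double): Double =
    (candidateMs - cleanMs) / cleanMs * 100.0

  /** Rounds to one decimal place, matching the precision used in the table. */
  def round1(x: Double): Double = math.round(x * 10.0) / 10.0
}
```

For example, `BenchDelta.round1(BenchDelta.relativeChangePct(217.372, 205.062))` reproduces the forward-mean entry of -5.7.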

Output equality matched by cmp.

Validation:

./mill --no-server --ticker false --color false __.reformat
./mill --no-server --ticker false --color false -j 1 __.test — 444 passed, 0 failed
./mill --no-server --ticker false --color false bench.runRegressions

Analysis:

This preserves the existing rendering pipeline and only changes the buffering layer for file output. It avoids global ByteBuilder threshold changes, keeps stdout behavior separate, and does not affect YAML or expect-string paths.

References:

Native stdout buffering PR: perf: buffer native stdout writes #869
Scala Native 0.5.12 migration PR: chore: upgrade Scala Native to 0.5.12 #867
Related performance stack context: perf(stdlib): skip getBytes(UTF_8) for ASCII-safe base64 inputs #863, perf: bulk-write safe runs in BaseRenderer.escape #864, perf: hand-rolled YAML quote scanner removes fastparse allocations in Native #865, perf: skip UTF-8 encode for clean-ASCII long strings in renderer #866, fix: sort ASCII-safe strings by runtime kind #868

Result:

Large JSON file output writes are buffered more effectively while preserving byte-identical output and the existing flush/close contract.

Motivation: The Native stdout buffering win showed that downstream buffering can materially reduce large-output write overhead. JSON outputFile rendering still wrote ByteRenderer chunks directly to the file output stream.

Modification: Wrap the JSON outputFile ByteRenderer target in a BufferedOutputStream with a 256 KiB buffer. Keep YAML, expect-string, stdout, and renderer semantics unchanged.

Result: Native kube-prometheus file-output A/B on the 0.5.12 stack improved in both orders: forward median 196.625 ms clean vs 183.491 ms candidate; reverse median 193.394 ms clean vs 175.878 ms candidate. Output matched by cmp; reformat, full tests, and bench regressions passed.

He-Pin marked this pull request as ready for review May 23, 2026 16:56

He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/json-file-output-buffering
