perf: buffer json file output #870

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/json-file-output-buffering

@He-Pin He-Pin commented May 23, 2026

Motivation:

The Native stdout buffering follow-up showed that downstream buffering can materially reduce large-output write overhead. JSON -o output still sent ByteRenderer chunks directly to the file output stream, relying only on ByteBuilder's internal flush threshold.

Key Design Decision:

Keep the change local to the JSON output-file fast path. Rather than changing ByteBuilder thresholds globally, wrap the file output stream in a BufferedOutputStream with the same 256 KiB output buffer size used for the Native stdout buffering follow-up. YAML, expect-string, stdout, and renderer semantics stay unchanged.
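To illustrate why a downstream buffer helps, here is a minimal sketch that counts how many write calls reach the underlying sink with and without a `BufferedOutputStream`. `CountingSink`, `BufferingDemo`, and the 8 KiB chunk size are hypothetical stand-ins chosen for illustration; they are not part of sjsonnet, and the real `ByteBuilder` flush threshold may differ.

```scala
import java.io.{BufferedOutputStream, OutputStream}

// Hypothetical sink: records how many write calls and bytes reach it.
final class CountingSink extends OutputStream {
  var calls = 0
  var bytes = 0L
  override def write(b: Int): Unit = { calls += 1; bytes += 1 }
  override def write(b: Array[Byte], off: Int, len: Int): Unit = {
    calls += 1
    bytes += len
  }
}

object BufferingDemo {
  // Buffer size taken from the PR description.
  val OutputBufferSize: Int = 256 * 1024

  // Writes `chunks` chunks of `chunkSize` bytes and returns the
  // (write calls, total bytes) observed by the underlying sink.
  def run(chunks: Int, chunkSize: Int, buffered: Boolean): (Int, Long) = {
    val sink = new CountingSink
    val out: OutputStream =
      if (buffered) new BufferedOutputStream(sink, OutputBufferSize) else sink
    val chunk = new Array[Byte](chunkSize)
    var i = 0
    while (i < chunks) {
      out.write(chunk, 0, chunk.length)
      i += 1
    }
    out.flush() // drain whatever is still held in the buffer
    (sink.calls, sink.bytes)
  }
}
```

Comparing `BufferingDemo.run(64, 8 * 1024, buffered = false)` with the same call using `buffered = true` should show the same byte count arriving in far fewer underlying writes, which is the effect this change relies on for file output.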

Modification:

  • Add OutputBufferSize = 256 * 1024 in SjsonnetMainBase.
  • Wrap JSON output-file ByteRenderer targets in BufferedOutputStream(out, OutputBufferSize).
  • Flush the buffered stream at the same completion boundary before closing the underlying file output stream.
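The three steps above can be sketched roughly as follows. This is not the actual diff: `BufferedFileOutput`, `writeBuffered`, and the `render` callback are hypothetical names standing in for the JSON output-file path and the `ByteRenderer` pass, and only the `OutputBufferSize` value comes from the description.

```scala
import java.io.{BufferedOutputStream, OutputStream}
import java.nio.file.{Files, Path}

object BufferedFileOutput {
  // 256 KiB, as stated in the PR description.
  val OutputBufferSize: Int = 256 * 1024

  // Hypothetical helper: `render` stands in for the renderer writing its chunks.
  def writeBuffered(path: Path)(render: OutputStream => Unit): Unit = {
    val out = Files.newOutputStream(path)
    try {
      val buffered = new BufferedOutputStream(out, OutputBufferSize)
      render(buffered)
      // Flush at the completion boundary so no bytes remain in the buffer
      // when the underlying file stream is closed.
      buffered.flush()
    } finally {
      out.close()
    }
  }
}
```

A caller might use it as `BufferedFileOutput.writeBuffered(path)(os => os.write(bytes))`. Flushing explicitly before `out.close()` keeps the ordering described above: the wrapper is drained first, then the file stream is closed as before.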

Benchmark Results:

Workload: jrsonnet/tests/realworld/entry-kube-prometheus.jsonnet -J vendor -o /tmp/fileout-*.json

Candidate was benchmarked on the Scala Native 0.5.12 stacked exploration branch after the Native stdout buffering commit.

| Order | Clean | Candidate | Result |
| --- | --- | --- | --- |
| Forward mean | 217.372 ms | 205.062 ms | -5.7% |
| Forward median | 196.625 ms | 183.491 ms | -6.7% |
| Reverse mean | 210.517 ms | 177.174 ms | -15.8% |
| Reverse median | 193.394 ms | 175.878 ms | -9.1% |
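For readers checking the Result column, the percentages appear to be the relative change of the candidate time against the clean time, rounded to one decimal. A small sketch of that arithmetic, with `RelativeChange` and its methods being hypothetical names used only for illustration:

```scala
object RelativeChange {
  // Relative change of `candidate` versus `clean`, in percent.
  // Negative values mean the candidate was faster.
  def percentChange(clean: Double, candidate: Double): Double =
    (candidate - clean) / clean * 100.0

  // Rounds to one decimal place, matching the table's precision.
  def rounded(clean: Double, candidate: Double): Double =
    math.round(percentChange(clean, candidate) * 10.0) / 10.0
}
```

For example, `RelativeChange.rounded(217.372, 205.062)` gives `-5.7` and `RelativeChange.rounded(210.517, 177.174)` gives `-15.8`, in line with the forward and reverse mean rows.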

Output equality matched by cmp.

Validation:

  • ./mill --no-server --ticker false --color false __.reformat
  • ./mill --no-server --ticker false --color false -j 1 __.test — 444 passed, 0 failed
  • ./mill --no-server --ticker false --color false bench.runRegressions

Analysis:

This preserves the existing rendering pipeline and only changes the buffering layer for file output. It avoids global ByteBuilder threshold changes, keeps stdout behavior separate, and does not affect YAML or expect-string paths.

References:

Result:

Large JSON file output writes are buffered more effectively while preserving byte-identical output and the existing flush/close contract.

Motivation:
The Native stdout buffering win showed downstream buffering can materially reduce large-output write overhead. JSON outputFile rendering still wrote ByteRenderer chunks directly to the file output stream.

Modification:
Wrap the JSON outputFile ByteRenderer target in a BufferedOutputStream with a 256 KiB buffer. Keep YAML, expect-string, stdout, and renderer semantics unchanged.

Result:
Native kube-prometheus file-output A/B on the 0.5.12 stack improved in both orders: forward median 196.625 ms clean vs 183.491 ms candidate; reverse median 193.394 ms clean vs 175.878 ms candidate. Output matched by cmp; reformat, full tests, and bench regressions passed.
@He-Pin He-Pin marked this pull request as ready for review May 23, 2026 16:56