docs: add native C++ export engine architecture plan by kaili-yang · Pull Request #678 · siddharthvaddem/openscreen

kaili-yang · 2026-06-02T01:14:21Z

Summary

This is a draft blueprint for export optimization. Most of the optimization methods in it are industry-standard, conservative choices that suit quick, low-risk iterations.
Adds docs/export-optimize-native-cpp-plan.md, a design document for a native C++ export engine intended to replace the current WebCodecs-based pipeline as the primary export path.
Improvements are welcome.

What's in the doc

Prior art — documents the three CapCut (https://www.capcut.com) export optimisation strategies this plan draws on: full-stack hardware acceleration, background pre-render cache, and on-demand trim-aware decode.
Why WebCodecs has a hard ceiling — explains the structural constraints (single-threaded serial loop, no GPU zero-copy, opaque HW encoder selection, real-time audio bottleneck) that cannot be addressed by incremental JS fixes.
Target architecture — a standalone openscreen-export-helper C++ binary following the same child-process pattern as the existing openscreen-screencapturekit-helper and openscreen-wgc-capture-helper. The helper owns the full decode → GPU composite → HW encode → mux pipeline while the renderer stays untouched.
Hardware acceleration stack — per-platform priority table covering VideoToolbox (macOS), NVENC / AMF / Quick Sync (Windows), VAAPI (Linux), and a libx264 software fallback.
Phased delivery roadmap — six milestones from skeleton encode (no effects) through full feature parity (cursor, webcam PiP, audio, GIF), each milestone independently shippable with a fallback to the existing WebCodecs path.

P1-C: Prefer hardware acceleration on all platforms including Windows. Previously Windows tried software first to avoid known driver bugs, but modern hardware encoders (NVENC, VideoToolbox, VAAPI) are 5–10× faster. Software remains the fallback if hardware configure/encode throws.

P1-D: Thread encoder latency mode through VideoExporterConfig. Medium and good quality presets now use latencyMode "realtime" which skips encoder lookahead for ~3–5× faster encode throughput. Source quality keeps "quality" mode for maximum compression efficiency.

P1-B: Reuse the Pixi TextureSource resource across frames instead of destroying and recreating the GPU texture on every frame. Updates the backing resource and calls source.update() to re-upload, eliminating per-frame GPU alloc/free churn (~5–15 % overhead on long exports).

This reverts commit 959d3e9.

coderabbitai · 2026-06-02T01:14:35Z

Walkthrough

New draft doc outlines a standalone native C++ export helper process spawned from Electron, replacing the WebCodecs browser pipeline. Specifies hardware-acceleration stack (VideoToolbox/NVENC/AMF/Quick Sync/VAAPI), GPU zero-copy decode/composite/encode stages, JSON IPC contract, phased feature delivery, and performance targets (under ~15s for typical 1080p with acceleration).

Changes

Native C++ Export Helper Design Plan

| Layer / File(s) | Summary |
|---|---|
| Rationale & Architecture `docs/export-optimize-native-cpp-plan.md` (lines 1–77) | Problem statement (WebCodecs performance ceiling and browser-process limitations), prior art context, and the proposed architecture: separate helper process for isolation, parallelism, and direct OS API access. Specifies platform-specific hardware acceleration backends (VideoToolbox, NVENC, AMF, Quick Sync, VAAPI with FFmpeg/software fallbacks) and emphasizes GPU zero-copy as the cross-stage optimization requirement. |
| Technical Pipeline & IPC Contract `docs/export-optimize-native-cpp-plan.md` (lines 80–126) | Functional pipeline breakdown: decode (GOP-level skipping within trim regions), composite (multi-pass GPU shader assembly and baked shadow reuse), encode (overlapped with composite), offline audio (time-stretch and re-encode), and MP4 muxing with `faststart`. JSON-based CLI input and newline-delimited progress events on stdout; SIGTERM-based cancellation and TypeScript fallback to WebCodecs for unsupported configs. |
| Build, Deployment & Feature Roadmap `docs/export-optimize-native-cpp-plan.md` (lines 129–157) | CMake build conventions, per-platform/arch packaging, FFmpeg static-link subset, macOS code-signing/notarization. Phased delivery milestones from skeleton (basic decode→pass-through→encode) through effects, cursor overlay, webcam PiP+audio, and GIF output. Performance targets: sub-15s for typical 2-minute 1080p with hardware acceleration, with software fallback parity to current WebCodecs path. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

A helper process rises, GPU-bound and fleet,
Leaving WebCodecs behind on the browser's back street.
Decode, composite, encode—each stage flows,
From trim region to faststart MP4.
Near real-time exports, where CapCut goes. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'docs: add native C++ export engine architecture plan' clearly and specifically describes the main change: adding a design document for a new native C++ export engine. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description effectively communicates the purpose, motivation, and scope of the change, but deviates from the template structure by omitting several standard sections. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/export-optimize-native-cpp-plan.md`:
- Around line 121-122: Update the docs to make stdin the canonical input channel
instead of a large JSON command-line argument: replace the line describing Input
as “a single JSON object passed as a command-line argument” with language that
the exporter reads the full JSON job description from stdin (and that argv
should only contain a tiny bootstrap token/flags), and keep Output described as
newline-delimited JSON progress events on stdout (`ready`, `progress`, `done`,
`error`); explicitly call out that large payloads (trim maps, cursor/effects)
must be provided via stdin to avoid argv length limits.
- Around line 46-49: The fenced diagram block showing the process boundary (the
three-line block that starts with "Electron Renderer  ──IPC──►  Electron Main 
──spawn──►  openscreen-export-helper") should include a language tag (e.g., use
```text) to satisfy markdownlint MD040; update that fenced code block to begin
with a language identifier like "text" so the diagram is properly tagged.
- Around line 123-126: Define and implement an explicit cancellation/error
contract for the native helper invoked by NativeExporter: ensure that on SIGTERM
or any encoding/muxing error the helper stops the pipeline, cleans up temporary
artifacts (partial files, temp dirs), and exits non‑zero; only on successful
completion write output to a temp path and publish the final file via an atomic
rename/move into the target path. Update NativeExporter (the TypeScript wrapper
around VideoExporterConfig → helper JSON) to pass a temp-output path, listen for
helper exit codes/events, delete temp artifacts on non‑zero/error/SIGTERM, and
only surface success callbacks (and call onProgress finalization) after the
atomic rename; keep VideoExporter (WebCodecs) as the documented fallback for
systems where the helper binary is unavailable.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 48382e63-d821-4b99-84b9-82c4ce9d2e50

📥 Commits

Reviewing files that changed from the base of the PR and between d2dd44a and fb57d5d.

📒 Files selected for processing (1)

docs/export-optimize-native-cpp-plan.md

coderabbitai · 2026-06-02T01:17:15Z

+```
+Electron Renderer  ──IPC──►  Electron Main  ──spawn──►  openscreen-export-helper
+  (React / UI)                (Node.js)                   (C++ encode engine)
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Nit: add a language tag to fenced diagram block.

markdownlint MD040 is valid here; use something like ```text for the process-boundary diagram. nit, but cleaner CI/docs hygiene.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 46-46: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/export-optimize-native-cpp-plan.md` around lines 46 - 49, The fenced diagram block showing the process boundary (the three-line block that starts with "Electron Renderer ──IPC──► Electron Main ──spawn──► openscreen-export-helper") should include a language tag (e.g., use ```text) to satisfy markdownlint MD040; update that fenced code block to begin with a language identifier like "text" so the diagram is properly tagged.

coderabbitai · 2026-06-02T01:17:15Z

+- **Input**: a single JSON object passed as a command-line argument describing the full export job (paths, effects, quality, trim, cursor data, etc.)
+- **Output**: newline-delimited JSON progress events on stdout (`ready`, `progress`, `done`, `error`)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Input transport is inconsistent and lowkey risky at scale.

Doc says “stdin/stdout JSON contract” but then defines input as a JSON command-line arg. For large jobs (trim maps, cursor/effects payloads), argv length limits (especially on Windows) can fail export startup. recommend making stdin the canonical input channel and keeping argv to a tiny bootstrap token only.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/export-optimize-native-cpp-plan.md` around lines 121 - 122, Update the docs to make stdin the canonical input channel instead of a large JSON command-line argument: replace the line describing Input as “a single JSON object passed as a command-line argument” with language that the exporter reads the full JSON job description from stdin (and that argv should only contain a tiny bootstrap token/flags), and keep Output described as newline-delimited JSON progress events on stdout (`ready`, `progress`, `done`, `error`); explicitly call out that large payloads (trim maps, cursor/effects) must be provided via stdin to avoid argv length limits.
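To make the suggested transport concrete, here is a minimal sketch of what the `NativeExporter` side might look like if the job goes over stdin and events come back as newline-delimited JSON. Only the four event names (`ready`, `progress`, `done`, `error`) come from the plan. The `ExportJob` fields, the payload fields on each event, and the `--job-stdin` flag are hypothetical placeholders, not part of the proposed contract.

```typescript
// Hypothetical job shape; the plan only says it covers paths, effects,
// quality, trim, cursor data, etc.
interface ExportJob {
  inputPath: string;
  outputPath: string;
  trims: { start: number; end: number }[];
}

// Event names are from the plan; the payload fields are assumptions.
type HelperEvent =
  | { type: "ready" }
  | { type: "progress"; fraction: number }
  | { type: "done"; outputPath: string }
  | { type: "error"; message: string };

const EVENT_TYPES = new Set(["ready", "progress", "done", "error"]);

interface LaunchPlan {
  argv: string[];       // stays tiny regardless of job size
  stdinPayload: string; // the full job, written to the helper's stdin
}

// Keep argv to a bootstrap flag and move the whole job description to stdin.
function buildLaunchPlan(job: ExportJob): LaunchPlan {
  return { argv: ["--job-stdin"], stdinPayload: JSON.stringify(job) + "\n" };
}

// Incremental parser: stdout chunks can split a JSON line anywhere, so the
// trailing partial line is buffered until its newline arrives.
class NdjsonEventParser {
  private buffer = "";

  push(chunk: string): HelperEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? "";
    const events: HelperEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      const parsed: unknown = JSON.parse(trimmed);
      const type = (parsed as { type?: unknown } | null)?.type;
      if (typeof type !== "string" || !EVENT_TYPES.has(type)) {
        throw new Error(`unknown helper event: ${trimmed}`);
      }
      events.push(parsed as HelperEvent);
    }
    return events;
  }
}
```

In Electron main, the plan would presumably be handed to Node's `child_process.spawn(helperPath, plan.argv)`, followed by `child.stdin.end(plan.stdinPayload)`, with each `child.stdout` data chunk fed to `parser.push(chunk.toString())`. Because the payload never touches argv, its size is not bounded by the OS command-line limit.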

coderabbitai · 2026-06-02T01:17:15Z

+- **Cancellation**: SIGTERM
+
+The JS side (`NativeExporter`) wraps this in a thin TypeScript class that translates the existing `VideoExporterConfig` into the helper's JSON format and maps progress events back to the `onProgress` callback. The `VideoExporter` (WebCodecs) remains as a fallback for systems where the helper binary is unavailable.
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cancellation contract needs explicit cleanup + atomic output semantics.

SIGTERM-only is kinda under-specified for long-running encode/mux. Please define required behavior on cancel/error: stop pipeline, delete temp artifacts, and only publish output via atomic rename on success. otherwise users can end up with corrupt/partial files that look “done-ish”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/export-optimize-native-cpp-plan.md` around lines 123 - 126, Define and implement an explicit cancellation/error contract for the native helper invoked by NativeExporter: ensure that on SIGTERM or any encoding/muxing error the helper stops the pipeline, cleans up temporary artifacts (partial files, temp dirs), and exits non‑zero; only on successful completion write output to a temp path and publish the final file via an atomic rename/move into the target path. Update NativeExporter (the TypeScript wrapper around VideoExporterConfig → helper JSON) to pass a temp-output path, listen for helper exit codes/events, delete temp artifacts on non‑zero/error/SIGTERM, and only surface success callbacks (and call onProgress finalization) after the atomic rename; keep VideoExporter (WebCodecs) as the documented fallback for systems where the helper binary is unavailable.
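A rough sketch of the publish-or-discard rule described above, as it might sit in the TypeScript wrapper. The `FileOps` interface, `finalizeExport`, and the `sawDone` flag are hypothetical names introduced for illustration; the plan itself only specifies SIGTERM for cancellation. The file operations are injected so the decision logic can be exercised without touching the disk.

```typescript
// Minimal file operations the wrapper needs; injected so they can be faked.
interface FileOps {
  rename(from: string, to: string): void;
  remove(path: string): void;
}

// What the wrapper observed when the helper process exited.
interface HelperResult {
  exitCode: number | null; // null when the process was killed by a signal
  sawDone: boolean;        // true if a `done` event arrived on stdout
}

type Outcome = "published" | "discarded";

// Publish only on a clean exit that also reported `done`; in every other
// case (cancel, crash, encode/mux error) delete the partial temp file.
function finalizeExport(
  result: HelperResult,
  tempPath: string,
  finalPath: string,
  files: FileOps,
): Outcome {
  const succeeded = result.exitCode === 0 && result.sawDone;
  if (succeeded) {
    files.rename(tempPath, finalPath);
    return "published";
  }
  files.remove(tempPath);
  return "discarded";
}
```

In Node, `FileOps` could be backed by `fs.renameSync` and `fs.rmSync(path, { force: true })`. A rename is only atomic within a single filesystem, so the temp path would need to live next to the target (for example the target name plus a `.part` suffix, which is an assumption here) rather than in a system temp directory.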

davideme · 2026-06-02T15:33:46Z

+- **Background pre-render cache.** While the user scrubs the timeline and previews effects, CapCut silently renders affected segments into a low-bitrate segment cache. When export is triggered, cached segments are assembled directly without re-rendering — reducing the export to a mux-and-encode pass over pre-computed frames.
+- **On-demand decode.** Only frames that survive trim boundaries are decoded. The demuxer seeks to the nearest keyframe before each active segment and skips the rest at the packet level, so a 10-minute source with a 30-second active region decodes approximately 30 seconds of video, not 10 minutes.
+
+The architecture proposed here applies the first and third strategies directly. The second (segment pre-render cache) is a longer-term addition that can layer on top once the core C++ pipeline is in place.
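The on-demand decode arithmetic in the hunk above can be sketched as follows. This is an illustration under stated assumptions, not the helper's actual demuxer logic: `planDecode` and `decodedSeconds` are hypothetical names, keyframe timestamps are assumed to be known up front and sorted ascending, and times are in seconds.

```typescript
interface Segment {
  start: number; // seconds
  end: number;   // seconds
}

interface DecodeWindow {
  seekTo: number;      // keyframe the demuxer seeks to
  decodeUntil: number; // decoding can stop here
}

// For each kept segment, seek to the latest keyframe at or before its start
// and decode only until the segment ends. Assumes `keyframes` is sorted.
function planDecode(keyframes: number[], kept: Segment[]): DecodeWindow[] {
  return kept.map((seg) => {
    let seekTo = keyframes[0] ?? 0;
    for (const k of keyframes) {
      if (k <= seg.start) seekTo = k;
      else break;
    }
    return { seekTo, decodeUntil: seg.end };
  });
}

// Total seconds of video that actually pass through the decoder.
function decodedSeconds(plan: DecodeWindow[]): number {
  return plan.reduce((sum, w) => sum + (w.decodeUntil - w.seekTo), 0);
}
```

With a 10-minute source that has a keyframe every 2 seconds (an assumed GOP length) and one kept region from 301 s to 331 s, the plan seeks to the keyframe at 300 s and decodes 31 seconds, which matches the "approximately 30 seconds, not 10 minutes" claim. The overshoot is at most one keyframe interval per segment.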


Why C++? I'm not against it, I see no argument in favor or against it.

davideme · 2026-06-02T15:36:22Z

+The helper selects the best available backend at runtime, in priority order:
+
+| Platform | Decode | Composite | Encode |
+|---|---|---|---|
+| macOS (Apple Silicon) | VideoToolbox | Metal compute | VideoToolbox (H.264 / HEVC) |
+| macOS (Intel) | VideoToolbox | Metal compute | VideoToolbox |
+| Windows (NVIDIA) | NVDEC | D3D11 compute | NVENC |
+| Windows (AMD) | AMF decoder | D3D11 compute | AMF encoder |
+| Windows (Intel) | Quick Sync | D3D11 compute | Quick Sync |
+| Linux | VAAPI / NVDEC | OpenGL compute | VAAPI / NVENC |
+| All (fallback) | FFmpeg software | CPU | libx264 / libx265 |
+
+The critical optimisation at each stage is **GPU zero-copy**: the decoded frame lives on a GPU surface, the compositor reads and writes GPU textures, and the encoder consumes the GPU surface directly — no pixel data crosses the CPU bus until the final muxed file is written to disk.


Why not use an abstraction such as the libavcodec library from the FFmpeg project, which already maintains all of these backend layers?

The C++ helper does use FFmpeg (libavcodec / libavformat / libswscale). FFmpeg is the abstraction layer over every hardware backend listed in this document: VideoToolbox, NVENC, AMF, Quick Sync, VAAPI. The question is not whether to use FFmpeg, but where to run it.

There are two realistic ways to run FFmpeg-backed code from an Electron app:

Option A — WebAssembly (ffmpeg.wasm). Compile FFmpeg to WASM and run it inside the renderer or main process. This is a real project and works for simple transcodes. The problem is that WASM runs inside the browser's sandbox, which means it cannot open a VideoToolbox session, cannot acquire an NVENC encoder context, cannot use D3D11VA or VAAPI, and cannot share GPU surfaces with the compositor. Every frame would be a CPU copy. WASM is also single-threaded by default; SharedArrayBuffer threads are available but cannot call native OS APIs. For a pipeline whose entire performance argument is GPU zero-copy and hardware encode, WASM erases the benefit entirely.

Option B — Native process or addon. Compile FFmpeg into a native binary (or N-API addon) that runs outside the browser sandbox with full OS API access. This is what this document proposes, and it is exactly how CapCut, DaVinci Resolve, and every other professional desktop video tool works.

So the stack is: C++ process → libavcodec (FFmpeg) → platform HW API (VideoToolbox / NVENC / VAAPI). FFmpeg is not an alternative to this plan; it is a core dependency of it. The C++ layer exists specifically to host FFmpeg outside the sandbox where it can actually reach the hardware.
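For illustration, the runtime priority selection from the table above, with libx264 as the last resort, might look like the sketch below. It is written in TypeScript only to show the logic; in the plan this decision lives inside the C++ helper. The function names, the `probe` callback, and the Linux ordering (VAAPI before NVENC) are assumptions, and the backend names are plain labels rather than FFmpeg encoder identifiers.

```typescript
type Platform = "macos" | "windows" | "linux";
type GpuVendor = "nvidia" | "amd" | "intel" | "apple" | "unknown";

const SOFTWARE_FALLBACK = "libx264";

// Encoder candidates in priority order, mirroring the table's Encode column.
function encoderCandidates(platform: Platform, vendor: GpuVendor): string[] {
  const hardware: string[] = [];
  if (platform === "macos") hardware.push("VideoToolbox");
  if (platform === "windows") {
    if (vendor === "nvidia") hardware.push("NVENC");
    if (vendor === "amd") hardware.push("AMF");
    if (vendor === "intel") hardware.push("Quick Sync");
  }
  if (platform === "linux") {
    hardware.push("VAAPI");
    if (vendor === "nvidia") hardware.push("NVENC");
  }
  return [...hardware, SOFTWARE_FALLBACK];
}

// Try each candidate in order. A probe that returns false or throws (for
// example a driver bug during configure) moves on to the next candidate.
function selectEncoder(
  candidates: string[],
  probe: (name: string) => boolean,
): string {
  for (const name of candidates) {
    try {
      if (probe(name)) return name;
    } catch {
      // Treat a throwing probe as "backend unavailable".
    }
  }
  throw new Error("no usable encoder, including the software fallback");
}
```

On a Windows machine with an NVIDIA GPU whose NVENC session fails to open, `selectEncoder(encoderCandidates("windows", "nvidia"), probe)` would fall through to `libx264`, which is the same hardware-first, software-fallback behavior described in the reverted P1-C change.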

kaili-yang added 4 commits June 1, 2026 12:11

Revert "perf: export phase 1 — hw encoder, latency mode, texture reuse"

6505187

This reverts commit 959d3e9.

docs: add CapCut prior art section to native C++ export plan

b7c437b

docs: rename export plan to export-optimize-native-cpp-plan

fb57d5d

kaili-yang requested a review from siddharthvaddem as a code owner June 2, 2026 01:14

coderabbitai Bot reviewed Jun 2, 2026

davideme reviewed Jun 2, 2026

kaili-yang requested a review from davideme June 3, 2026 22:43

ThairaHub mentioned this pull request Jun 11, 2026

Merge upstream final 54 commits + orphaned-PR harvest joaothaira/openscreen#19

Merged

kaili-yang wants to merge 4 commits into siddharthvaddem:main from kaili-yang:perf/export-phase1
