Fix video export stall when trim regions cause long decoder gaps#4

Merged
EtienneLescot merged 6 commits into
getopenscreen:mainfrom
m8i-51:fix/export-stall-trim-region
Jun 20, 2026
Conversation

@m8i-51 m8i-51 commented Jun 19, 2026

Description

Fixes a false-positive encoder stall that causes video export to abort when a project has trim regions covering a large portion of the source recording.

Motivation

StreamingVideoDecoder reads the source file sequentially — it cannot seek. Frames inside trimmed regions must still be decoded to maintain P/B-frame state, but are discarded. For a recording with a large trim in the middle, this discard phase can take tens of seconds of wall time.
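A minimal sketch of the discard behavior described above, assuming hypothetical `TrimRegion` and `DecodedFrame` shapes (neither is taken from the project). Every frame still passes through the loop in decode order, but frames whose timestamp falls inside a trim region are closed instead of being handed to the encoder:

```typescript
// Hypothetical shapes for illustration; the project's real types may differ.
interface TrimRegion { startMs: number; endMs: number }
interface DecodedFrame { timestampMs: number; close(): void }

/** True when a timestamp falls inside any trim region (start inclusive, end exclusive). */
function isTrimmed(timestampMs: number, trims: TrimRegion[]): boolean {
  return trims.some((t) => timestampMs >= t.startMs && timestampMs < t.endMs);
}

/**
 * Walks frames in decode order. Trimmed frames are closed and dropped,
 * so the encoder receives nothing while a trim region is being traversed.
 */
function forwardKeptFrames(
  frames: DecodedFrame[],
  trims: TrimRegion[],
  encode: (frame: DecodedFrame) => void,
): { kept: number; discarded: number } {
  let kept = 0;
  let discarded = 0;
  for (const frame of frames) {
    if (isTrimmed(frame.timestampMs, trims)) {
      frame.close(); // release the frame; nothing reaches the encoder
      discarded++;
    } else {
      encode(frame);
      kept++;
    }
  }
  return { kept, discarded };
}
```

The longer the trimmed span, the longer the encoder sits idle between two kept frames, which is the gap the stall detector misread.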

During the discard phase the encoder queue is empty and lastEncoderOutputAt goes stale. When real frames arrive after the trim and fill the queue, the timeout has already elapsed — a false positive that aborts a healthy export.

Fix: reset the timer at the start of each queue-full wait instead of measuring from the last encoder output.

```diff
+ const stallWaitStartAt = Date.now();
  while (encoder.encodeQueueSize >= maxEncodeQueue) {
-   if (Date.now() - this.lastEncoderOutputAt > ENCODER_STALL_TIMEOUT_MS) {
+   if (Date.now() - stallWaitStartAt > ENCODER_STALL_TIMEOUT_MS) {
```
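To make the difference between the two checks concrete, here is a small sketch that evaluates both against the same timeline. The two helper functions and the timeline values are illustrative assumptions; only `ENCODER_STALL_TIMEOUT_MS` and the 15 s window come from this PR:

```typescript
const ENCODER_STALL_TIMEOUT_MS = 15000; // the 15 s window mentioned in this PR

/** Old check: elapsed time since the encoder last produced any output. */
function stalledSinceLastOutput(nowMs: number, lastEncoderOutputAtMs: number): boolean {
  return nowMs - lastEncoderOutputAtMs > ENCODER_STALL_TIMEOUT_MS;
}

/** New check: elapsed time since this particular queue-full wait began. */
function stalledSinceWaitStart(nowMs: number, stallWaitStartAtMs: number): boolean {
  return nowMs - stallWaitStartAtMs > ENCODER_STALL_TIMEOUT_MS;
}

// Illustrative timeline (milliseconds): the encoder last emitted output at t=0,
// the decoder then spent 50 s discarding trimmed frames, and the queue filled at t=50 s.
const lastEncoderOutputAt = 0;
const stallWaitStartAt = 50000;
const firstCheckAt = stallWaitStartAt + 100;

// Old logic trips immediately even though the encoder has only been busy for 100 ms.
const oldVerdict = stalledSinceLastOutput(firstCheckAt, lastEncoderOutputAt);
// New logic gives the encoder a fresh window for this wait.
const newVerdict = stalledSinceWaitStart(firstCheckAt, stallWaitStartAt);
// A genuine stall is still caught once the wait itself exceeds the timeout.
const genuineStall = stalledSinceWaitStart(stallWaitStartAt + 15001, stallWaitStartAt);

console.log({ oldVerdict, newVerdict, genuineStall });
```

Under these assumed numbers the old check reports a stall 100 ms into the wait, while the new check only reports one after the queue has stayed full for more than 15 s.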

This was originally opened as siddharthvaddem#682, but that repo was archived before it could be reviewed/merged. Re-opening against this fork since the bug is still present in videoExporter.ts here.

Type of Change

  • New Feature
  • Bug Fix
  • Refactor / Code Cleanup
  • Documentation Update
  • Other (please specify)

Testing

The stall detection path exercises VideoEncoder directly, which is a browser WebCodecs API not available in the Node test environment. I verified the fix manually with a project that previously reproduced the issue consistently.

  1. Create a project with a trim region that removes a large span of the source recording (e.g. trimming out several minutes from the middle)
  2. Export to MP4
  3. Before: export aborts with "The hardware video encoder stopped responding"
  4. After: export completes normally

Verified against current main (post v1.5.0):

  • tsc --noEmit — 0 errors
  • vitest --run — 225/225 passing

Checklist

  • I have performed a self-review of my code.
  • I have added any necessary screenshots or videos.
  • I have linked related issue(s) and updated the changelog if applicable.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved encoder queue monitoring and timeout detection for more reliable video exports.
    • Enhanced error messaging to better distinguish between hardware and software encoder-related issues.
  • Tests

    • Expanded test coverage for encoder queue management, including cancellation and timeout scenarios.

When a recording has a large trim region (e.g. 400s–828s removed),
the decoder must sequentially decode and discard all frames in that
region to maintain P/B-frame state. On a 3320x2160 source this can
take 40–50 seconds of wall time.

During that decode pass the encoder queue drains to empty and
lastEncoderOutputAt stops updating. When the next segment's frames
arrive and fill the encode queue, the stall detector would compare
Date.now() against the stale lastEncoderOutputAt (~50 s ago) and
incorrectly throw a stall error, aborting the export.

Fix: measure stall timeout from when the queue-full while-loop is
entered (stallWaitStartAt), not from the last global encoder output.
This gives the encoder a fresh 15 s window to produce output each
time the queue fills up, regardless of how long the decoder spent
on trimmed frames.

Also remove VideoFrame leak-tracker debug code added during diagnosis,
and switch latencyMode to "realtime" with a smaller maxEncodeQueue
to reduce encoder internal buffering depth.
@m8i-51 m8i-51 requested a review from EtienneLescot as a code owner June 19, 2026 02:17
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 04f68ec and c81b38a.

📒 Files selected for processing (1)
  • src/lib/exporter/videoExporter.test.ts

Walkthrough

Introduces an exported waitForEncoderQueueSpace async helper in videoExporter.ts that polls encoder queue size, enforces a stall timeout with encoder-preference-specific error messages, and supports injectable now/sleep. Removes the lastEncoderOutputAt instance field and replaces the inline stall loop with calls to the new helper. Adds a comprehensive test suite covering all queue-drain, timeout, and cancellation scenarios.

Changes

Encoder Stall Detection Refactor

| Layer / File(s) | Summary |
|---|---|
| waitForEncoderQueueSpace helper and VideoExporter integration<br>src/lib/exporter/videoExporter.ts | Adds the exported waitForEncoderQueueSpace async helper with injectable clock/sleep, preference-specific timeout errors, and cancellation support. Removes lastEncoderOutputAt field and replaces the inline stall loop in the export frame path with a call to the helper (closing exportFrame before rethrow). Cleans up initializeEncoder and cleanup to remove the deleted field. |
| Test suite for waitForEncoderQueueSpace<br>src/lib/exporter/videoExporter.test.ts | Adds vi and waitForEncoderQueueSpace to imports, then adds a full describe suite with a fake clock/sleep covering immediate resolution, queue-drain wait, hardware vs. software timeout error messages, a pre-call elapsed time regression, and cancellation. |
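The helper summarized above could look roughly like the following. This is a sketch under stated assumptions, not the merged implementation: the function name `waitForQueueSpaceSketch`, the options shape, the `"prefer-software"` value, the software error text, and the choice to return quietly on cancellation are all hypothetical. Only the `"prefer-hardware"` label and the hardware error text appear elsewhere in this PR.

```typescript
// "prefer-hardware" appears in the PR's console log; "prefer-software" is assumed.
type EncoderPreference = "prefer-hardware" | "prefer-software";

// Hypothetical options shape; the real helper's signature is not shown in this PR.
interface QueueWaitOptions {
  getQueueSize: () => number;           // e.g. () => encoder.encodeQueueSize
  maxQueueSize: number;
  timeoutMs: number;
  preference: EncoderPreference;
  isCancelled?: () => boolean;
  pollIntervalMs?: number;
  now?: () => number;                   // injectable clock for tests
  sleep?: (ms: number) => Promise<void>; // injectable sleep for tests
}

function stallMessage(preference: EncoderPreference): string {
  return preference === "prefer-hardware"
    ? "The hardware video encoder stopped responding"
    : "The video encoder stopped responding"; // placeholder wording, assumed
}

async function waitForQueueSpaceSketch(opts: QueueWaitOptions): Promise<void> {
  const now = opts.now ?? (() => Date.now());
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const pollIntervalMs = opts.pollIntervalMs ?? 5;

  // The timer starts when this wait begins, so earlier idle time cannot count against it.
  const waitStartedAt = now();
  while (opts.getQueueSize() >= opts.maxQueueSize) {
    if (opts.isCancelled?.()) return; // assumption: cancellation ends the wait without throwing
    if (now() - waitStartedAt > opts.timeoutMs) {
      throw new Error(stallMessage(opts.preference));
    }
    await sleep(pollIntervalMs);
  }
}
```

Because the clock and the sleep are parameters, a test can advance a fake clock inside `sleep` and exercise the timeout boundary without WebCodecs or real delays.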

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 A queue once watched by a field kept in time,
Now hops to a helper—much cleaner, sublime!
Fake clocks tick softly, the tests all agree,
Stalls throw their errors with true specificity.
The rabbit refactors and hops away free~ 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately captures the main change: fixing a false-positive video export stall caused by trim regions, which directly aligns with the primary bug fix described in the changeset. |
| Description check | ✅ Passed | The PR description covers the issue, motivation, fix explanation, type of change, testing approach, and includes verification steps; however, it lacks explicit checkbox selections for release impact and desktop platform impact required by the template. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@EtienneLescot

EtienneLescot commented Jun 19, 2026

Collaborator

@m8i-51 I can see the logic of the bug, but I was unable to reproduce it.
I tried several scenarios, including a 30-minute video with a 10+ minute trim.
Could you please share your OS and hardware specs? And maybe a video of the issue occurring?

From the code path, the fix looks reasonable to me: the stall timeout should measure time spent waiting for encoder queue space, not time spent decoding through a trimmed region. A synthetic regression test for this case would also be helpful since it seems hardware/source dependent.

@m8i-51

m8i-51 commented Jun 19, 2026

Author

@EtienneLescot Specs: macOS 26.5.1, MacBook Air (M2, 8-core), 24GB RAM.

Tried reproducing on this Mac too (mid-video trim and an early trim, both ~1-2 min) — couldn't trigger it either. Apple Silicon's hardware decoder is fast enough here that the discard phase never got close to the 15s timeout, so this looks hardware/decoder-speed dependent rather than something everyone will see. Matches what you found.

I can add a regression test for the timer logic itself (mocking the queue-full wait) if that's useful — let me know.

@EtienneLescot

Collaborator

It could be useful yes.

m8i-51 added 3 commits June 20, 2026 07:14
Extracts the queue-full wait loop into waitForEncoderQueueSpace() so
the timing logic can be unit tested without real WebCodecs. Covers
the original false-positive: a long gap before the call (e.g. decoder
discarding frames in a trim region) must not count against the
15s timeout, since the timer starts at call time, not from the
encoder's last output.

The "long gap before this call" case was mathematically identical to
the queue-drain test once now()/sleep() are injected — shifting the
fake clock's epoch doesn't change now() - stallWaitStartAt. Replaced
with a comment on why the bug can't recur: the function takes no
external "last output" timestamp to go stale in the first place.
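The epoch-shift argument in the commit message above can be checked with a few lines of arithmetic. The function below is an illustrative assumption (it is not part of the test suite): it simulates a fake clock that starts at a given epoch and counts how many polls pass before `now() - stallWaitStartAt` exceeds the timeout.

```typescript
/**
 * Counts polls until the elapsed wait exceeds the timeout, using a fake clock
 * that starts at `epochMs` and advances by `pollMs` on every poll.
 */
function pollsUntilTimeout(epochMs: number, timeoutMs: number, pollMs: number): number {
  let clock = epochMs;            // fake now()
  const stallWaitStartAt = clock; // captured at call time, as in the fix
  let polls = 0;
  while (!(clock - stallWaitStartAt > timeoutMs)) {
    clock += pollMs;              // fake sleep(pollMs)
    polls++;
  }
  return polls;
}

// Same timeout and poll interval, with the clock starting at 0 versus after a 50 s gap.
const pollsFromZero = pollsUntilTimeout(0, 15000, 100);
const pollsAfterGap = pollsUntilTimeout(50000, 15000, 100);
console.log(pollsFromZero === pollsAfterGap);
```

Since the start timestamp is read from the same clock, the epoch cancels out of the subtraction, which is why a separate "long gap before the call" test adds no coverage once the clock is injected.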
@m8i-51

m8i-51 commented Jun 19, 2026

Author

@EtienneLescot Update: I take back my earlier "couldn't reproduce" comment — I reproduced it.

It didn't show up with a synthetic test video, but it did with a real, heavily-edited project: a 14:45 screen recording at 3320x2160 with 10 trim regions, including one ~7 minute trim. The error is silently swallowed by the existing hardware→software encoder retry fallback (getEncoderPreferences()), so no toast ever appears — the export just looks like it resets progress partway through. Console (DevTools) shows the real error:

```
[VideoExporter] prefer-hardware export attempt failed: Error: The hardware video encoder stopped responding. Retrying with a safer encoder.
    at videoExporter.ts:272:23
    at async StreamingVideoDecoder.decodeAll (streamingDecoder.ts:377:9)
    at async VideoExporter.exportWithEncoderPreference (videoExporter.ts:228:7)
```

GIF of the progress bar silently resetting (79% → 1%) when this happens, for reference:

[GIF: stall repro]

So: high resolution + a large trim is what it takes — small/synthetic sources decode fast enough on Apple Silicon to never hit the 15s window.
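For readers wondering why the progress bar resets instead of an error appearing, here is a rough sketch of a hardware-then-software retry loop. It assumes a hypothetical `exportWithFallbackSketch` function and `attempt` callback; the project's actual `getEncoderPreferences()` and `exportWithEncoderPreference` are not reproduced here.

```typescript
// "prefer-hardware" appears in the console log above; "prefer-software" is assumed.
type EncoderPreference = "prefer-hardware" | "prefer-software";

/**
 * Tries each encoder preference in order. A failed attempt is only logged,
 * and the next attempt starts over, so progress visibly jumps back to the start.
 */
async function exportWithFallbackSketch(
  preferences: EncoderPreference[],
  attempt: (pref: EncoderPreference, onProgress: (pct: number) => void) => Promise<string>,
  onProgress: (pct: number) => void,
  log: (msg: string) => void = (msg) => console.warn(msg),
): Promise<string> {
  let lastError: unknown = new Error("no encoder preferences were provided");
  for (const pref of preferences) {
    try {
      return await attempt(pref, onProgress);
    } catch (err) {
      lastError = err;
      // The user sees no toast at this point, only this console line.
      log(`[VideoExporter] ${pref} export attempt failed: ${String(err)}`);
    }
  }
  throw lastError; // surfaces only if every preference failed
}
```

With a loop of this shape, a false-positive stall on the hardware attempt costs the whole first pass: the export may still succeed on the retry, but only after redoing the work.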

Also pushed the regression test you asked for. Extracted the queue-wait loop into waitForEncoderQueueSpace() so the timing logic is unit-testable without real WebCodecs (9 new test cases covering the timeout boundary, both error messages, and cancellation).

@EtienneLescot EtienneLescot merged commit 8ced98d into getopenscreen:main Jun 20, 2026
7 checks passed