docs: clarify transparent WebM shows as yuv420p under ffprobe (alpha is a VP9 side-channel)#1823
Conversation
… a VP9 side-channel)
|
Preview deployment for your docs. Learn more about Mintlify Previews.
💡 Tip: Enable Workflows to automatically generate PRs for you. |
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at 309c409a5ba9ba83beedc655571f589477f165c5. Docs-only, one Note in docs/guides/rendering.mdx — directly addresses recurring "webm baked opaque white" bug reports. Approach is right, but the framing has one technical inaccuracy and one grep-usability gap worth tightening before it lands.
🟡 Framing: "ffmpeg's native vp9 decoder silently drops the alpha side-channel" is the effect, not the cause
Two other places in this repo have already answered "why does ffprobe say yuv420p on a transparent WebM?" — and they use a different (more accurate) framing:
-
packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts:libvpx-vp9 stores the alpha plane as a Matroska
BlockAdditionalsidecar, NOT in the main stream'spix_fmt—ffprobealways reportspix_fmt=yuv420pfor VP9-with-alpha. -
Same file, further down:
ffmpeg's default VP9 decoder strips the BlockAdditional alpha track when decoding to non-rgba pixel formats; forcing the libvpx-vp9 decoder +
-pix_fmt rgbais how the alpha plane comes back.
The mechanism is: the alpha lives in a Matroska BlockAdditional sidecar; ffprobe reads the primary stream's pix_fmt and never inspects the sidecar, so it always reports yuv420p regardless of decoder choice. The user's PR body says the native decoder drops the sidecar. That's true for decode-to-frame, but ffprobe doesn't decode — it inspects. So "the native decoder drops it" doesn't explain why ffprobe (not ffmpeg -i) reports yuv420p. Users who read this and think "OK, but I'm running ffprobe, not ffmpeg — is ffprobe using the native decoder?" will be confused.
Suggested rewording for the first sentence:
ffprobereportspix_fmt=yuv420pbecause VP9 stores its alpha plane in a MatroskaBlockAdditionalsidecar, not in the primary stream's pixel format.ffprobeinspects the primary stream and never surfaces the sidecar, so a genuinely-transparent WebM always looks likeyuv420pto it.
Then the guidance to force libvpx-vp9 -pix_fmt rgba is the natural next step — you're not "coaxing ffprobe to use the right decoder", you're bypassing ffprobe and asking ffmpeg to decode-and-report.
🟡 ALPHA_MODE=1 vs alpha_mode=1 — the case bug is documented but the doc uses only the newer casing
The engine has an explicit note about this (packages/engine/src/services/videoFrameExtractor.ts:~254):
tag detection has at least three known failure modes: case-sensitivity across ffmpeg versions (
alpha_modevsALPHA_MODE), missing tags from older muxers, and mp4-as-webm rewraps that drop the sidecar.
The doc's Note only shows ALPHA_MODE=1. Users running an older ffmpeg (still shipping in Debian stable / long-lived Docker images) will grep the header for ALPHA_MODE, see nothing, and re-file the "no alpha" bug. Two options:
- Show both forms explicitly:
ALPHA_MODE=1(newer builds) /alpha_mode=1(older builds). - Suggest
ffprobe -show_streams out.webm | grep -i alpha_mode— the-isidesteps the casing question entirely.
I'd lean toward option 2 — it's one flag, works on every ffmpeg version, and it's copy-paste-safe for people who don't know their ffmpeg build.
🟡 The verify-transparency snippet is copy-paste ready but doesn't cite the tag it depends on
Both ffprobe -c:v libvpx-vp9 -show_entries stream=pix_fmt and the ffmpeg -c:v libvpx-vp9 -i ... -pix_fmt rgba snippets are the right commands, but they buy nothing if ALPHA_MODE=1 isn't set — the sidecar wouldn't be there to decode. If a user's file is somehow missing the metadata (mp4-as-webm rewrap, third-party mux), forcing the decoder returns yuv420p again, and the doc's flow doesn't tell them what that outcome means.
Consider a two-step verify:
# Step 1: does the file claim to carry alpha?
ffprobe -v error -show_streams out.webm | grep -i alpha_mode
# Expect: ALPHA_MODE=1 (or alpha_mode=1 on older ffmpeg)
# Step 2: does per-pixel alpha decode?
ffmpeg -v error -c:v libvpx-vp9 -i out.webm -frames:v 1 -f rawvideo -pix_fmt rgba - | xxd | head -1
# Expect: fully-transparent corner reads 00 00 00 00Step 1 catches the "metadata missing" failure mode; step 2 catches the "metadata present but plane empty" failure mode. Each has its own remediation.
🟢 The rest of the framing is accurate
- "MOV (ProRes 4444) and png-sequence report alpha directly, so this caveat is WebM-only" — matches
packages/engine/src/services/chunkEncoder.ts(ProRes writesyuva444p10ledirectly, no sidecar) and the pipeline inpackages/cli/src/background-removal/pipeline.ts. - The
-c:v libvpx-vp9flag is load-bearing (confirmed by the same-repo comment quoted above). - "A transparent corner reads
00 00 00 00" — correct forrgbaoutput. - Landing this in the Transparent Video "Verifying transparency" section is the right home; anyone hitting the confusion is already reading that section.
🟢 Cross-refs land cleanly
docs/guides/remove-background.mdx, docs/packages/cli.mdx, and docs/packages/engine.mdx all reference the alpha_mode=1 / yuva420p combo for the encoder. This Note is the missing companion on the inspector side. No divergence risk.
Nits
- Consider linking from this Note to the encoder guidance in
docs/packages/engine.mdx("when encoding for transparency, useformat: 'webm'withgetEncoderPreset()") — closes the loop for a user who wonders "am I even using the right encoder settings?" - The Note is inside a
<Note>component — check that Mintlify renders the fenced code block correctly nested inside it (some MDX component libraries strip the language tag). Not worth blocking on, but sanity-preview if you haven't.
What I didn't verify
- Whether Mintlify's
<Note>component wraps the code block correctly at render time — worth a preview click before merge. - Whether the
ffprobe -c:v libvpx-vp9 -show_entries stream=pix_fmt -of csv=p=0command as written is currently portable across ffprobe 4.x / 5.x / 6.x — the-c:vflag before input is documented on ffmpeg but I didn't spot-test on multiple versions. Worth a sanity-run on Ubuntu 20.04's ffprobe (4.2) if the target audience includes long-lived CI images.
Non-blocking overall — the doc is genuinely helpful and closes a recurring source of user pain. Reframe the mechanism ("BlockAdditional sidecar, ffprobe inspects primary stream") and note the tag casing and this ships clean.
— Review by Rames D Jusso
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
R2 @ 8ba036f3d4421de6f81ddc0056c997f49e14e356 — verifying against my R1 at 309c409a.
R1 findings
🟡 (a) Mechanism framing — "native decoder drops the alpha side-channel" is the effect, not the cause → ✅ RESOLVED
Doc Note at docs/guides/rendering.mdx:392-401 now reads:
ffprobereports a transparent WebM asyuv420p, notyuva420p. This is expected and does not mean the alpha is missing. VP9 stores its alpha plane in a MatroskaBlockAdditionalsidecar, not in the primary stream's pixel format, soffprobereports the primary stream aspix_fmt=yuv420peven when the file genuinely carries alpha.
That's the correct mechanism, essentially verbatim to my suggested rewording — the old "native decoder drops the side-channel" framing is gone. The doc now aligns with the two same-repo authoritative sources (packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts and packages/engine/src/services/videoFrameExtractor.ts). No confusion between decode-to-frame and inspect-primary-stream — the user reads this and immediately understands why ffprobe reports what it reports.
🟡 (b) ALPHA_MODE=1 vs alpha_mode=1 casing across ffmpeg versions → ✅ RESOLVED
Doc now uses grep -i alpha_mode:
ffprobe -v error -show_streams out.webm | grep -i alpha_mode # ALPHA_MODE=1 or alpha_mode=1
Case-insensitive grep + inline comment showing both casings. That's option 2 verbatim from my R1 — one flag, works across every ffmpeg version, copy-paste-safe. Users on Debian stable / Ubuntu 20.04 ffmpeg 4.x hit alpha_mode, users on newer builds hit ALPHA_MODE; both surface in the same grep.
Two-step verify (my optional R1 suggestion) → ✅ INCLUDED
Both commands are now in the Note (alpha_mode header check + libvpx-vp9 -pix_fmt rgba per-pixel check). Not labeled "Step 1 / Step 2" as I suggested, but functionally they ARE the two-step verify — first catch missing-metadata failure mode, then catch metadata-present-but-plane-empty. Adequate.
PR body updated → ✅ RESOLVED
Body now explains the sidecar mechanism ("VP9 stores the alpha plane in a Matroska BlockAdditional sidecar, not in the primary stream's pixel format. ffprobe inspects that primary stream, so a genuinely transparent VP9 WebM still reports yuv420p") and covers both casings. Body + doc are consistent.
Nits from R1
- Anchor on
webm-concat-copy.test.tsas authoritative source — Not explicitly done. The doc doesn't cite the source, but the framing is now self-consistent and correct, so this is moot. - Link to encoder-side guidance in
docs/packages/engine.mdx— Not addressed. Nit-level; the current Note is self-contained enough that a user hitting the confusion gets an actionable answer without leaving the page. - Mintlify
<Note>code block rendering sanity check — Can't verify from the diff; requires a preview click. TheMintlify Deploymentcheck isSUCCESS, which suggests it validates at least; a preview click before merge is still worth 10 seconds.
Adjacent-file check
Cross-refs in the wider docs tree (docs/guides/remove-background.mdx, docs/packages/cli.mdx, docs/packages/engine.mdx) reference alpha_mode=1 / yuva420p on the ENCODER side. This Note is the missing inspector-side companion; no divergence introduced. The packages/producer/tests/distributed/_smoke/webm-concat-copy.test.ts and packages/engine/src/services/videoFrameExtractor.ts comments in the code remain the technical authoritative sources; the doc Note is now consistent with them (previously it wasn't).
CI status
BLOCKED (needs 1 approval), MERGEABLE. Docs-only PR; docs checks all green (Validate docs, Mintlify Deployment SUCCESS). All CI-required checks green.
Peer reviews since R1
None. Only my R1 review on the record.
R2 verdict
LGTM — mechanism reframed correctly (Matroska BlockAdditional sidecar / primary-stream inspection), casing guidance uses grep -i (portable across ffmpeg versions), PR body updated to match. Doc now aligns with the same-repo authoritative sources. Ships clean; docs-only, low blast radius.
— Review by Rames D Jusso
Problem
Multiple users rendering transparent overlays with
--format webmconcluded their output had no transparency becauseffprobereportedpix_fmt=yuv420pinstead ofyuva420p. One even filed it as a bug ("webm bakes an opaque white background") when the render was actually correct.The confusing bit is WebM/VP9-specific: VP9 stores the alpha plane in a Matroska
BlockAdditionalsidecar, not in the primary stream's pixel format.ffprobeinspects that primary stream, so a genuinely transparent VP9 WebM still reportsyuv420p.Fix
Add a Note to the Transparent Video "Verifying transparency" section of the rendering guide explaining:
yuv420pfromffprobeis expected for a transparent WebM and does not mean alpha is missing.BlockAdditionalsidecar, not the primary streampix_fmt.ffprobe -show_streams | grep -i alpha_mode, covering bothALPHA_MODE=1and olderalpha_mode=1casing.-c:v libvpx-vp9) and inspect an RGBA frame; a transparent corner reads00 00 00 00.Docs-only change.