Speed up MKV subtitle open: cache parsed clusters + VINT lookup table (#11397)

Merged: niksedk merged 2 commits into main from fix-matroska-redundant-cluster-scan-11374 on Jun 4, 2026

niksedk commented Jun 4, 2026

Summary

Two MatroskaFile changes that reduce the time to open a Blu-ray-rip MKV with PGS subtitles. The first fixes #11374 directly; the second is a small follow-on perf win in the same hot path.

1. Cache parsed cluster data per track in MatroskaFile.GetSubtitle

GetSubtitle is called three times on the same instance during the PGS/BluRay open flow — once in PickMatroskaTrackViewModel.TrackChanged (PickMatroskaTrackViewModel.cs:227), once in BluRaySupParser.ParseBluRaySupFromMatroska (BluRaySupParser.cs:467) for the cue preview thumbnails, and once in BluRayHelper.LoadBluRaySubFromMatroska (BluRayHelper.cs:21) to feed the OCR window. Each call walked every cluster in the file (ReadSegmentCluster), so for multi-GB Blu-ray-rip MKVs the file was effectively read in full three times.

The caching infrastructure was already present (_subtitleRip and _subtitleRipTrackNumber are existing fields), but GetSubtitle unconditionally called Clear() on the list on every call. The fix returns the cached list when the requested track matches the last one. A different track still invalidates the cache (the existing trackNumber mismatch path is unchanged), and the Count > 0 guard ensures that an empty result from a malformed or empty track is never treated as a cache hit.
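The cache-hit rule described above can be sketched roughly as follows. This is an illustration in Python, not the C# code from the PR: the class `MatroskaFileSketch`, the `cluster_scans` counter, and the in-memory `clusters_by_track` dictionary are all hypothetical stand-ins, and only the field names `_subtitle_rip` and `_subtitle_rip_track_number` mirror fields named in the description.

```python
from typing import Callable, Dict, List, Optional


class MatroskaFileSketch:
    """Hypothetical stand-in for MatroskaFile; only the caching rule is modelled."""

    def __init__(self, clusters_by_track: Dict[int, List[str]]) -> None:
        self._clusters_by_track = clusters_by_track  # pretend file contents
        self._subtitle_rip: List[str] = []
        self._subtitle_rip_track_number: Optional[int] = None
        self.cluster_scans = 0  # counts full-file walks, for illustration only

    def _read_segment_cluster(self, track_number: int) -> None:
        # Stand-in for the expensive walk over every cluster in the file.
        self.cluster_scans += 1
        self._subtitle_rip.extend(self._clusters_by_track.get(track_number, []))

    def get_subtitle(
        self,
        track_number: int,
        progress_callback: Optional[Callable[[int], None]] = None,
    ) -> List[str]:
        # Cache hit: same track as the previous call and a non-empty result.
        if (
            track_number == self._subtitle_rip_track_number
            and len(self._subtitle_rip) > 0
        ):
            return self._subtitle_rip

        # Cache miss: a different track, or a previous result that was empty.
        self._subtitle_rip_track_number = track_number
        self._subtitle_rip = []
        self._read_segment_cluster(track_number)
        if progress_callback is not None:
            progress_callback(100)  # progress is only reported on a real scan
        return self._subtitle_rip


mkv = MatroskaFileSketch({3: ["cue-1", "cue-2"], 4: ["cue-a"], 5: []})
for _ in range(3):          # the three calls of the open flow
    mkv.get_subtitle(3)
scans_after_open_flow = mkv.cluster_scans  # one walk instead of three
```

In this sketch an empty result is simply rescanned on the next call, which matches the stated intent of the Count > 0 guard; whether the real code behaves identically for every edge case is not something the description confirms.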

progressCallback is null at every BluRay call site, so skipping it on cache hits is a no-op for the existing flows.

On Windows, the OS file cache hides most of the cost of the second/third pass. macOS users with Blu-ray-rip MKVs that exceed the unified buffer cache feel it as a multi-minute open; that should now be one cluster-walk instead of three.

Fixes #11374

2. Replace VINT length mask/shift loop with a 256-entry lookup table

ReadVariableLengthUInt is called twice per element read on every cluster walk. The length-detection portion (counting leading zeros of the first byte) was an 8-iteration mask/shift loop. A 256-entry lookup table replaces it.

Measured with BenchmarkDotNet 0.14.0 on Apple M4, .NET 10, 4096 first-bytes per op:

| Method | Mean | Ratio | Allocated |
|---|---|---|---|
| LoopAndMask | 5.192 us | 1.00 | - |
| LookupTable | 1.032 us | 0.20 | - |

~5× faster in isolation. Whole-scan wall-clock impact is small — I/O dominates — but the change is purely local, has no allocations, and is on the hottest path in the file.
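To make the two length-detection strategies concrete, here is a minimal Python sketch. The function and constant names are hypothetical and the PR's actual C# code may be shaped differently; the sketch only shows that a 256-entry table indexed by the first byte gives the same answer as the mask/shift loop.

```python
def vint_length_loop(first_byte: int) -> int:
    """Length of an EBML variable-length integer, found with a mask/shift loop.

    The length is the position of the first set bit counted from the top:
    0x80..0xFF -> 1 byte, 0x40..0x7F -> 2 bytes, ..., 0x01 -> 8 bytes.
    Returns 0 for 0x00, which is not a valid first byte.
    """
    mask = 0x80
    for length in range(1, 9):
        if first_byte & mask:
            return length
        mask >>= 1
    return 0


# Built once up front: one entry for each possible first byte.
VINT_LENGTH_TABLE = bytes(vint_length_loop(b) for b in range(256))


def vint_length_lookup(first_byte: int) -> int:
    """Same result as vint_length_loop, using a single table index."""
    return VINT_LENGTH_TABLE[first_byte]
```

For example, `vint_length_lookup(0x1A)` returns 4, which agrees with the four-byte EBML header ID that starts with 0x1A. The speedup quoted above comes from the author's C# benchmark; this Python version illustrates the technique only and says nothing about timing.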

Test plan

  • Open a .mkv with a single PGS subtitle track and proceed through to the OCR window — confirm no behavior change.
  • Open a .mkv with multiple subtitle tracks; click between tracks in the picker; export a track; click OK to OCR. Confirm previews/exports match the selected track (cache invalidates on track switch).
  • Open a .mkv with a text-based track (SRT/SSA in MKV) — confirm the picker preview and final import are unchanged.
  • Wall-clock measurement on a multi-GB Blu-ray-rip MKV: open → pick PGS track → OK should be roughly 3× faster on macOS.
  • LibSETests: 367 tests pass.

🤖 Generated with Claude Code

niksedk and others added 2 commits June 4, 2026 19:59
GetSubtitle is called three times in the PGS/BluRay open flow on the
same MatroskaFile instance — once to populate the track-picker preview
(TrackChanged), once via ParseBluRaySupFromMatroska for the cue
thumbnails, and once via LoadBluRaySubFromMatroska to feed the OCR
window. Each call walked every cluster in the file, so for multi-GB
Blu-ray rips the file was read in full three times.

Return the previously parsed list when the requested track matches the
last one, so the cluster scan happens at most once per track per
MatroskaFile instance. Different tracks still invalidate the cache, and
the Count>0 guard avoids caching empty results.

Fixes #11374

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ReadVariableLengthUInt is called twice per element read on every cluster
walk. The length-detection portion (counting leading zeros of the first
byte) was an 8-iteration mask/shift loop. A 256-entry lookup table is
~5x faster in isolation (measured: 5.19us -> 1.03us per 4096 bytes on
M4, BenchmarkDotNet 0.14.0). Whole-scan wall-clock impact is small — I/O
dominates — but the change is purely local and has no allocations.

Existing fallthrough for invalid VINT first byte (0x00) is preserved by
collapsing the -1 (EOF) and 0 (invalid) checks into a single guard, then
checking the table result for 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
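
One possible reading of the guard described in the second commit message, sketched in Python. Everything here is an assumption made for illustration: the helper `read_byte` imitates a ReadByte-style API that returns -1 at end of stream, `strip_marker` is a hypothetical parameter, and returning 0 for EOF, invalid, or truncated input is a choice of this sketch rather than the documented behavior of the real method.

```python
import io
from typing import BinaryIO

# 256-entry length table: 0x80..0xFF -> 1, 0x40..0x7F -> 2, ..., 0x01 -> 8, 0x00 -> 0.
VINT_LENGTH_TABLE = bytes(0 if b == 0 else 9 - b.bit_length() for b in range(256))


def read_byte(stream: BinaryIO) -> int:
    """Return the next byte as an int, or -1 at end of stream."""
    data = stream.read(1)
    return data[0] if data else -1


def read_variable_length_uint(stream: BinaryIO, strip_marker: bool = True) -> int:
    first = read_byte(stream)
    if first <= 0:
        # Single guard covering both -1 (EOF) and 0x00 (invalid first byte).
        return 0

    length = VINT_LENGTH_TABLE[first]
    # Element sizes drop the length-marker bit; element IDs keep it.
    value = first & (0xFF >> length) if strip_marker else first
    for _ in range(length - 1):
        nxt = read_byte(stream)
        if nxt < 0:
            return 0  # truncated input (assumption: treated like EOF)
        value = (value << 8) | nxt
    return value


size = read_variable_length_uint(io.BytesIO(b"\x40\x02"))  # two-byte VINT
ebml_id = read_variable_length_uint(
    io.BytesIO(b"\x1A\x45\xDF\xA3"), strip_marker=False
)  # four-byte ID, marker bits kept
```

Here `size` is 2 and `ebml_id` is 0x1A45DFA3. Because the guard runs before the table is consulted, the table's zero entry for 0x00 is never reached in this sketch.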
niksedk changed the title from "Cache parsed cluster data per track in MatroskaFile" to "Speed up MKV subtitle open: cache parsed clusters + VINT lookup table" on Jun 4, 2026
niksedk merged commit d0e9da5 into main on Jun 4, 2026
1 of 3 checks passed
niksedk deleted the fix-matroska-redundant-cluster-scan-11374 branch on June 4, 2026 18:08

Development

Successfully merging this pull request may close these issues.

(5.0rc2) Open MKV stream is inefficient