Spotify has scripts/rematch_spotify_duration_mismatches.py (commit 0317b98) which walks all songs whose Spotify links have duration mismatches above a threshold and enqueues ('spotify', 'rematch_duration_mismatches') jobs onto the durable research worker queue. YouTube has no equivalent — to bulk-rematch, an admin has to invoke match_youtube_videos.py per song or per recording.
Ask
A scripts/rematch_youtube_videos.py (or similar name) that:
- Walks recordings where the existing YouTube link looks suspicious. The natural signals from the matcher's two-mode design:
  - match_confidence < 0.7
  - match_method = 'youtube_conservative'
  - --threshold-confidence <float> to override (default 0.7)
- Enqueues one ('youtube', 'match_recording') job per recording onto the research queue with payload={'rematch': True}. Existing handler picks them up.
- Honors match_method='manual' (skip: admin already verified).
- Same flag shape as the Spotify version: --dry-run (count + sample, no enqueue), --limit N, --debug.
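A minimal sketch of the flag shape and candidate filter described in the list above. Only the flag names, the 0.7 default, and the match_confidence / match_method fields come from this issue; the function names, the dict-shaped link rows, the recording_id key, and the choice to treat the two signals as an OR are assumptions for illustration. The real candidate discovery would presumably be SQL, as in the Spotify version.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Flags named in this issue; help text is illustrative."""
    parser = argparse.ArgumentParser(
        description="Enqueue YouTube rematch jobs for suspicious links."
    )
    parser.add_argument("--threshold-confidence", type=float, default=0.7,
                        help="links below this confidence are candidates")
    parser.add_argument("--dry-run", action="store_true",
                        help="count + sample, no enqueue")
    parser.add_argument("--limit", type=int, default=None,
                        help="stop after N candidates")
    parser.add_argument("--debug", action="store_true")
    return parser


def is_suspicious(link: dict, threshold: float) -> bool:
    """Hypothetical in-memory version of the candidate predicate."""
    method = link.get("match_method")
    if method == "manual":
        # Admin already verified this link, so never rematch it.
        return False
    if method == "youtube_conservative":
        return True
    confidence = link.get("match_confidence")
    return confidence is not None and confidence < threshold


def select_candidates(links, threshold, limit=None):
    """Filter link rows (assumed to be dicts), then apply --limit."""
    picked = [link for link in links if is_suspicious(link, threshold)]
    return picked if limit is None else picked[:limit]
```

With this shape, `build_parser().parse_args(["--dry-run", "--limit", "50"])` yields `dry_run=True`, `limit=50`, and the default `threshold_confidence=0.7`.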
Why it's worth a separate file from the admin review
The CLI is operational tooling — runnable from a Render shell, no Spotify-style "admin page workflow" coupling. Building it first would let an admin trigger broad rematches even before the review UI lands. The two issues can ship in either order.
Implementation notes
- core/song_research.py already has the per-recording YouTube enqueue helper inline at line 67. Extract it to a shared module function (e.g. core/youtube_rematch.py) so the new CLI calls the same thing.
- Mirror the structure of core/spotify_rematch_mismatches.py. The "find candidates" SQL is what differs; the enqueue + sweep boilerplate is identical.
- Tests: the 9 cases in tests/test_spotify_rematch_mismatches.py are a good template, covering candidate discovery, dedup, threshold pass-through, and error handling.
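One possible shape for the shared enqueue + sweep function, sketched under assumptions: the `enqueue` callable and its (source, job_type, recording_id, payload) signature are hypothetical stand-ins for the real research-queue API, and the summary dict is illustrative. Only the ('youtube', 'match_recording') job key and the {'rematch': True} payload come from this issue.

```python
from typing import Callable, Iterable

# Job key and payload as described in this issue.
JOB_SOURCE = "youtube"
JOB_TYPE = "match_recording"


def enqueue_rematches(
    recording_ids: Iterable[int],
    enqueue: Callable[[str, str, int, dict], None],
    dry_run: bool = False,
) -> dict:
    """Enqueue one rematch job per distinct recording.

    `enqueue` is a hypothetical callable standing in for the real
    research-queue helper; its signature is an assumption.
    """
    seen = set()
    summary = {"candidates": 0, "enqueued": 0, "errors": 0}
    for recording_id in recording_ids:
        if recording_id in seen:
            # Dedup: a recording may surface through more than one link.
            continue
        seen.add(recording_id)
        summary["candidates"] += 1
        if dry_run:
            # Dry run counts candidates but never touches the queue.
            continue
        try:
            enqueue(JOB_SOURCE, JOB_TYPE, recording_id, {"rematch": True})
            summary["enqueued"] += 1
        except Exception:
            # One failed enqueue should not abort the whole sweep.
            summary["errors"] += 1
    return summary
```

Passing the queue call in as a parameter keeps the function testable with a plain list-appending fake, which lines up with the dedup and error-handling cases in the Spotify test template.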