Set up a recurring schedule to run scripts/rematch_spotify_duration_mismatches.py on production so duration-mismatched Spotify links get re-evaluated as new releases land and the matcher's scoring evolves.
Background
scripts/rematch_spotify_duration_mismatches.py (added in 0317b98) walks songs whose Spotify links have a > 60s duration mismatch against the canonical recording duration, and enqueues per-song ('spotify', 'rematch_duration_mismatches') jobs onto the durable research queue. The worker drains them via SpotifyMatcher.
It's a one-shot today — kicked off via Render shell. Without a recurring trigger, new mismatches accumulate silently between manual runs.
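The sweep described above can be sketched roughly as follows. This is an illustration only, not the script's actual code: LinkedSong, find_mismatches, and build_jobs are hypothetical names, and the real script reads from the database and enqueues onto the research queue rather than working on an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedSong:
    # Hypothetical stand-in for a song row joined to its Spotify link.
    song_id: int
    canonical_duration_s: int
    spotify_duration_s: int


def find_mismatches(songs, threshold_seconds=60):
    """Return songs whose two durations differ by more than the threshold."""
    return [
        s for s in songs
        if abs(s.spotify_duration_s - s.canonical_duration_s) > threshold_seconds
    ]


def build_jobs(mismatched):
    """Build one (source, job_type, song_id) tuple per mismatched song."""
    return [("spotify", "rematch_duration_mismatches", s.song_id) for s in mismatched]


songs = [
    LinkedSong(1, 200, 205),  # 5s apart
    LinkedSong(2, 200, 245),  # 45s apart
    LinkedSong(3, 200, 330),  # 130s apart
]
jobs_60 = build_jobs(find_mismatches(songs, 60))  # only song 3 qualifies
jobs_30 = build_jobs(find_mismatches(songs, 30))  # songs 2 and 3 qualify
```

Running the same selection at 60s and 30s, as in the last two lines, is one way to estimate how much extra queue volume a tighter threshold would produce.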
Options
1. Render Cron Job (simpler — recommended starting point)
Add via Render dashboard or render.yaml:
services:
  - type: cron
    name: spotify-rematch-monthly
    runtime: python
    schedule: "0 4 1 * *"  # 04:00 UTC on the 1st of each month
    buildCommand: pip install -r backend/requirements.txt
    startCommand: cd backend && python scripts/rematch_spotify_duration_mismatches.py
    envVars:
      - fromGroup: <existing prod env-group>
- Pro: native to Render, runs with the same DB creds as the worker, no extra auth.
- Con: failures only visible in Render logs unless a notification is wired up.
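To make the monthly cadence concrete, here is a small sketch that computes when a "0 4 1 * *" schedule would next fire. The helper name next_monthly_run is hypothetical, and it assumes a timezone-aware UTC datetime as input; it only models this one expression and is not a general cron parser.

```python
from datetime import datetime, timezone


def next_monthly_run(now):
    """Return the next 04:00 UTC on the 1st of a month, strictly after `now`."""
    candidate = now.replace(day=1, hour=4, minute=0, second=0, microsecond=0)
    if candidate > now:
        # Still before 04:00 on the 1st of the current month.
        return candidate
    # Otherwise roll over to the 1st of the following month.
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return candidate.replace(year=year, month=month)


mid_december = datetime(2024, 12, 15, 9, 30, tzinfo=timezone.utc)
upcoming = next_monthly_run(mid_december)
```

A mismatch introduced just after a run waits up to a full month before it is re-evaluated, which is the trade-off to weigh against a weekly schedule.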
2. Claude scheduled agent (more capable, more setup)
The agent would run on a cron schedule, call an admin HTTP endpoint that triggers the sweep, and then post a summary somewhere visible. This is useful for the "summarise what got cleaned up" part, but it requires adding the admin endpoint first.
Decisions to make
- Cadence: monthly is the default in option 1 above, but weekly might be appropriate if mismatch volume is high.
- Threshold: production runs should default to --threshold-seconds 60 until volume is low enough to tighten to 30s.
- Notification: do nothing (just rely on /admin/research/?source=spotify&job_type=rematch_duration_mismatches to spot-check), wire a Slack/email notification on cron failure, or build the summary-posting agent (option 2).
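For the "notification on cron failure" choice, one possible shape is a thin wrapper that runs the sweep as a subprocess and calls a notifier when it exits non-zero. This is a sketch under assumptions: run_sweep and the notify callback are hypothetical, and the actual delivery (Slack, email) is left to whatever callable is passed in. The demo uses a stand-in command that fails on purpose rather than the real script.

```python
import subprocess
import sys


def run_sweep(command, notify):
    """Run `command`; call notify(message) if it exits non-zero."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Include only the end of stderr so the message stays short.
        tail = result.stderr.strip()[-500:]
        notify(f"rematch sweep failed (exit {result.returncode}): {tail}")
    return result.returncode


# Demo: a stand-in command that writes to stderr and exits with code 3.
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('db unreachable'); sys.exit(3)",
]
messages = []
exit_code = run_sweep(failing, messages.append)
```

In the cron job itself, the command would instead be the sweep script with its --threshold-seconds flag, and the wrapper would exit with the returned code so that Render still records the run as failed.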
Out of scope
Auto-unlinking stubborn mismatches the matcher can't fix. That's a separate trust call — the existing /admin/duration-mismatches review page covers human-driven cleanup today.