[Perf] GetUsers triangle rewrite + partial album index for GetTracks #800
Merged
raymondjacobson merged 1 commit into main on May 9, 2026
Two independent fixes for the API's two most-called queries (GetUsers ~287M calls, GetTracks ~268M calls in pg_stat_statements).

== GetUsers: rewrite current_user_followee_follow_count

Signed-in GetUsers was about 700x slower than unsigned (2-3s vs 4ms for 20 users). Drilling in: the entire delta was one personalization subquery ("of the people I follow, how many also follow this user") at 2,246ms for 20 users. Every other personalization subquery (does_current_user_follow, does_current_user_subscribe, does_follow_current_user, artist_coin_badge) was sub-millisecond.

The old shape let Postgres pick a Merge Join that walked the full follower list of the target (492k-1.9M rows for popular users like @audius) just to intersect it with my ~1,752 followees.

The rewrite drives the loop from "my followees" (always small) and probes whether each one follows the target. The LIMIT 1 OFFSET 0 inside the EXISTS is the same optimization fence used by the feed query (#798): it pins the planner to nested-loop semantics so the plan never flips back to a merge join.

Verified on the prod read replica (full GetUsers, 20 popular target users, three warm runs each):

  myId=0 (unsigned)          : 4ms  -> 2ms       (unchanged, sanity)
  myId=20 (1752 follows)     : 2-3s -> 127-155ms (15-20x)
  myId=755516 (1816 follows) : 2.5s -> 142-157ms (15-18x)

End-to-end via local server, /v1/full/users/handle/audius (1.95M followers) with ?user_id=Wem1e: 60-85ms warm. Response shape unchanged; current_user_followee_follow_count returns the same count as before.

== GetTracks album_backlink: partial album index

The album_backlink subquery does ~200 random playlists_pkey lookups per popular track to filter for `is_album AND is_delete=false AND is_current=true`. ~99.98% of those lookups end up rejected (because most playlists aren't albums). For 50 popular tracks that's 10k heap probes returning 1-2 actual matches.

A partial index covering only published-album playlists lets non-album lookups skip the heap entirely: the index scan finds no entry and moves on without ever fetching the page.

Index size: ~55k rows x ~12 bytes ≈ 700 KB. Built CONCURRENTLY (no ACCESS EXCLUSIVE lock).

Expected: the GetTracks album_backlink portion drops from ~38ms (50 popular tracks, warm) to ~10-15ms. That portion is most of GetTracks's "always-on" cost.
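To make the shape of the partial index concrete, here is a minimal sketch. It runs against an in-memory SQLite database through Python's standard library, not against Postgres, so it only illustrates the DDL shape and that query results are unchanged; it says nothing about Postgres plans or timings. The index name, the simplified `playlists` schema, and the sample data are assumptions made for illustration and are not taken from the migration in this PR. In Postgres the statement would additionally use `CREATE INDEX CONCURRENTLY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified schema: only the columns the filter mentions.
conn.execute("""
    CREATE TABLE playlists (
        playlist_id INTEGER PRIMARY KEY,
        is_album    INTEGER NOT NULL,
        is_delete   INTEGER NOT NULL,
        is_current  INTEGER NOT NULL
    )
""")

# 10,000 playlists, of which only ids 5000 and 10000 are published albums.
rows = [(i, 1 if i % 5000 == 0 else 0, 0, 1) for i in range(1, 10001)]
conn.executemany("INSERT INTO playlists VALUES (?, ?, ?, ?)", rows)

# Stand-in for the album_backlink probe: look up a few playlist ids and
# keep only the ones that are published albums.
ALBUM_LOOKUP = """
    SELECT playlist_id
    FROM playlists
    WHERE playlist_id IN (4999, 5000, 5001, 10000)
      AND is_album = 1 AND is_delete = 0 AND is_current = 1
    ORDER BY playlist_id
"""
before = conn.execute(ALBUM_LOOKUP).fetchall()

# Partial index: only rows matching the WHERE clause get an index entry.
# The index name is hypothetical.
conn.execute("""
    CREATE INDEX idx_playlists_published_albums
    ON playlists (playlist_id)
    WHERE is_album = 1 AND is_delete = 0 AND is_current = 1
""")
after = conn.execute(ALBUM_LOOKUP).fetchall()

# Number of rows that qualify for the partial index versus the whole table.
album_rows = conn.execute(
    "SELECT count(*) FROM playlists "
    "WHERE is_album = 1 AND is_delete = 0 AND is_current = 1"
).fetchone()[0]
total_rows = conn.execute("SELECT count(*) FROM playlists").fetchone()[0]

print(before, after, album_rows, total_rows)
```

The index is additive: the lookup returns the same rows before and after it exists, and only the small album subset of the table qualifies for an index entry.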
Two independent fixes for the API's two hottest queries (`GetUsers` ~287M calls and `GetTracks` ~268M calls in `pg_stat_statements`).

## 1. GetUsers: rewrite `current_user_followee_follow_count`

Signed-in `GetUsers` was about 700× slower than unsigned (2-3 s vs 4 ms for 20 users). Profiling each personalization subquery in isolation:

| Subquery | Time (20 users) |
| --- | --- |
| `does_current_user_follow` | < 1 ms |
| `does_follow_current_user` | < 1 ms |
| `does_current_user_subscribe` | < 1 ms |
| `artist_coin_badge` | < 1 ms |
| `current_user_followee_follow_count` | 2,246 ms |

The old shape let Postgres pick a Merge Join that walked the full follower list of the target user (492k-1.9M rows for popular users like @audius) just to intersect it with my ~1,752 followees.

The rewrite drives the loop from "my followees" (always small, at most a few thousand) and probes whether each one follows the target. The `LIMIT 1 OFFSET 0` inside the `EXISTS` is the same optimization fence used by #798 (feed): it pins the planner to nested-loop semantics so the plan never flips back to a merge join.

### Verified on prod read replica

Full `GetUsers`, 20 popular target users, three warm runs each:

| Caller | Before | After | Speedup |
| --- | --- | --- | --- |
| `myId=0` (unsigned) | 4 ms | 2 ms | unchanged (sanity) |
| `myId=20` (1,752 follows) | 2-3 s | 127-155 ms | 15-20× |
| `myId=755516` (1,816 follows) | 2.5 s | 142-157 ms | 15-18× |

End-to-end through local server: `/v1/full/users/handle/audius?user_id=Wem1e` (target has 1.95M followers) → 60-85 ms warm. Response shape unchanged; `current_user_followee_follow_count` returns the same count as before.

## 2. GetTracks: partial index for `album_backlink`

The `album_backlink` subquery does ~200 random `playlists_pkey` lookups per popular track to filter for `is_album = true AND is_delete = false AND is_current = true`. ~99.98% of those lookups get rejected by the filter: for 50 popular tracks that's 10,115 heap probes returning 1-2 actual matches.

The partial index covers only published-album playlists, so non-album lookups skip the heap entirely: the index scan finds no entry and moves on without fetching the page.

- Built `CONCURRENTLY`, so no `ACCESS EXCLUSIVE` lock. This follows the pattern from #196 (the migration whose comment explains the prior 0195 outage).
- The `album_backlink` portion drops from ~38 ms (50 popular tracks, warm) to ~10-15 ms. That portion is most of GetTracks's "always-on" cost.

## Risk

- … `count(*)` over the same intersection, just with the join driven from the small side. Existing user tests (`TestV1UsersRelated`, `TestUsersFeed`, etc.) pass; the full `./api/...` suite is green.
- … `playlists_pkey` only when its `WHERE` clause is satisfied (i.e. the lookup is for an album row).

## Test plan

- `go test -count=1 ./api/...` (full suite, all green)
- `/v1/full/users/handle/audius?user_id=Wem1e` returns identical response shape, ~70 ms warm

🤖 Generated with Claude Code
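The Risk note says the rewrite computes the same `count(*)` over the same intersection, only driven from the small side. The sketch below illustrates that equivalence on toy data. It uses an in-memory SQLite database through Python's standard library rather than Postgres, so it demonstrates result equality only, not the planner behavior or timings reported above. The `follows` table layout and both query texts are assumptions written for illustration; the actual SQL in this PR may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified follow graph: one row per (follower, followee) edge.
conn.execute("""
    CREATE TABLE follows (
        follower_user_id INTEGER NOT NULL,
        followee_user_id INTEGER NOT NULL,
        PRIMARY KEY (follower_user_id, followee_user_id)
    )
""")

ME, TARGET = 1, 2
edges = [(ME, u) for u in (10, 11, 12, 13)]           # people I follow
edges += [(u, TARGET) for u in (11, 13, 20, 21, 22)]  # people who follow the target
conn.executemany("INSERT INTO follows VALUES (?, ?)", edges)

# Hypothetical "old shape": a plain join, which leaves the planner free to
# start from the target's (potentially huge) follower list.
OLD_SHAPE = """
    SELECT count(*)
    FROM follows AS theirs
    JOIN follows AS mine
      ON mine.followee_user_id = theirs.follower_user_id
    WHERE theirs.followee_user_id = :target
      AND mine.follower_user_id = :me
"""

# Hypothetical "new shape": iterate over my followees and probe, per followee,
# whether an edge to the target exists. LIMIT 1 OFFSET 0 mirrors the fence
# described in the PR text.
NEW_SHAPE = """
    SELECT count(*)
    FROM follows AS mine
    WHERE mine.follower_user_id = :me
      AND EXISTS (
          SELECT 1
          FROM follows AS theirs
          WHERE theirs.follower_user_id = mine.followee_user_id
            AND theirs.followee_user_id = :target
          LIMIT 1 OFFSET 0
      )
"""

params = {"me": ME, "target": TARGET}
old_count = conn.execute(OLD_SHAPE, params).fetchone()[0]
new_count = conn.execute(NEW_SHAPE, params).fetchone()[0]

# Users 11 and 13 are followed by me and also follow the target.
print(old_count, new_count)
```

Because `(follower_user_id, followee_user_id)` is unique in this toy schema, each of my followees contributes at most one matching edge, which is why the join and the `EXISTS` probe agree on the count.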