[Perf] GetUsers triangle rewrite + partial album index for GetTracks #800

Merged
raymondjacobson merged 1 commit into main from ray/perf-getusers-triangle-album-index
May 9, 2026

Two independent fixes for the API's two hottest queries (GetUsers ~287M calls and GetTracks ~268M calls in pg_stat_statements).

1. GetUsers — rewrite current_user_followee_follow_count

Signed-in GetUsers was 700× slower than unsigned (2-3 s vs 4 ms for 20 users). Profiling each personalization subquery in isolation:

| Subquery (myId=20, 20 users) | Mean |
| --- | --- |
| does_current_user_follow | 0.3 ms |
| does_follow_current_user | 2.2 ms |
| does_current_user_subscribe | 2.2 ms |
| artist_coin_badge | 0.8 ms |
| current_user_followee_follow_count | 2,246 ms ← entire delta |

The old shape let Postgres pick a Merge Join that walked the full follower list of the target user — 492 k - 1.9 M rows for popular users like @audius — just to intersect with my ~1,752 followees.

The rewrite drives the loop from "my followees" (always small — at most a few thousand) and probes whether each follows the target. The LIMIT 1 OFFSET 0 inside the EXISTS is the same optimization fence used by #798 (feed): it pins the planner to nested-loop semantics so the plan never flips back to merge join.
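The shape described above might look roughly like the following. This is a minimal sketch, not the query from the PR: the `follows` table, its column names, and the `:my_id` / `:target_id` placeholders are assumptions made for illustration.

```sql
-- Hypothetical schema: follows(follower_user_id, followee_user_id, is_current, is_delete).
-- :my_id and :target_id are placeholder parameters, not names from the PR.
select count(*) as current_user_followee_follow_count
from follows mine
where mine.follower_user_id = :my_id          -- outer side: my followees (small)
  and mine.is_current = true
  and mine.is_delete = false
  and exists (
      select 1
      from follows theirs
      where theirs.follower_user_id = mine.followee_user_id  -- does this followee...
        and theirs.followee_user_id = :target_id             -- ...follow the target?
        and theirs.is_current = true
        and theirs.is_delete = false
      limit 1 offset 0   -- intended as the fence: keeps the EXISTS from being flattened into a join
  );
```

With the subquery left unflattened, each of my followees costs one index probe, so the work scales with the size of my followee list rather than with the target's follower count.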

Verified on prod read replica

Full GetUsers, 20 popular target users, three warm runs each:

| Scenario | Before | After | Δ |
| --- | --- | --- | --- |
| myId=0 (unsigned) | 4 ms | 2 ms | sanity / unchanged |
| myId=20 (1752 follows) | 2-3 s | 127-155 ms | ~15-20× |
| myId=755516 (1816 follows) | 2.5 s | 142-157 ms | ~15-18× |

End-to-end through local server: /v1/full/users/handle/audius?user_id=Wem1e (target has 1.95 M followers) → 60-85 ms warm. Response shape unchanged; current_user_followee_follow_count returns the same count as before.

2. GetTracks — partial index for album_backlink

The album_backlink subquery does ~200 random playlists_pkey lookups per popular track to filter for is_album = true AND is_delete = false AND is_current = true. ~99.98 % of those lookups get rejected by the filter — for 50 popular tracks that's 10,115 heap probes returning 1-2 actual matches.

The partial index covers only published-album playlists, so non-album lookups skip the heap entirely: the index scan finds no matching entry and moves on without fetching the heap page.

create index concurrently if not exists idx_playlists_albums_published
    on playlists (playlist_id)
    where is_album = true and is_delete = false and is_current = true;
  • Size: ~55,671 album rows × ~12 bytes ≈ 700 KB.
  • Built CONCURRENTLY, so the build does not block writes to playlists (no ACCESS EXCLUSIVE lock). Follows the pattern from #196 ("placeholder popular search param."), the migration whose comment explains the prior 0195 outage.
  • Expected: GetTracks album_backlink portion drops from ~38 ms (50 popular tracks, warm) to ~10-15 ms — most of GetTracks's "always-on" cost.
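One way to sanity-check the build and the size estimate after the migration runs is sketched below. It is not part of the PR; it only uses the standard `pg_index` and `pg_stat_user_indexes` catalogs. The validity check matters because a failed `CREATE INDEX CONCURRENTLY` leaves an invalid index behind, and a later `if not exists` run would skip it by name.

```sql
-- Is the index valid, and is its on-disk size in line with the estimate?
select i.indexrelid::regclass                          as index_name,
       i.indisvalid                                    as is_valid,
       pg_size_pretty(pg_relation_size(i.indexrelid))  as index_size
from pg_index i
where i.indexrelid = 'idx_playlists_albums_published'::regclass;

-- Has the index been scanned since statistics were last reset?
select indexrelname, idx_scan
from pg_stat_user_indexes
where indexrelname = 'idx_playlists_albums_published';
```

If `is_valid` is false, the index would need to be dropped and rebuilt before the planner can use it.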

Risk

  • GetUsers rewrite is semantically identical. Same count(*) over the same intersection, just with the join driven from the small side. Existing user tests (TestV1UsersRelated, TestUsersFeed, etc.) pass; full ./api/... suite is green.
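  • Partial index is additive. It adds a small write cost on playlists, but existing read plans should not regress: the planner can consider it only for queries whose WHERE clause implies the index predicate (i.e. lookups that already filter for published albums), and otherwise keeps using playlists_pkey.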
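To illustrate when the partial index is eligible, here is a sketch with two lookups against `playlists`. The query shapes and the `:playlist_ids` placeholder are assumptions for illustration; the actual album_backlink subquery is not shown in this PR description.

```sql
-- Eligible: the WHERE clause repeats the index predicate, so the planner
-- may choose idx_playlists_albums_published instead of playlists_pkey.
explain
select p.playlist_id
from playlists p
where p.playlist_id = any(:playlist_ids)   -- hypothetical placeholder (an integer array)
  and p.is_album = true
  and p.is_delete = false
  and p.is_current = true;

-- Not eligible: nothing here implies the index predicate, so the partial
-- index cannot answer this query and the plan is unaffected by it.
explain
select p.playlist_id
from playlists p
where p.playlist_id = any(:playlist_ids);
```

Comparing the two plans on a replica is a quick way to confirm that only album-filtered lookups change.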

Test plan

  • go test -count=1 ./api/... (full suite, all green)
  • EXPLAIN ANALYZE on prod read replica across three myId regimes (unsigned, mid follows, heavy follows)
  • Local server smoke test: /v1/full/users/handle/audius?user_id=Wem1e returns identical response shape, ~70 ms warm

🤖 Generated with Claude Code

Two independent fixes for the API's two most-called queries
(GetUsers ~287M calls, GetTracks ~268M calls in pg_stat_statements).

== GetUsers: rewrite current_user_followee_follow_count

Signed-in GetUsers was 700x slower than unsigned (2-3s vs 4ms for
20 users). Drilling in: the entire delta was one personalization
subquery ("of the people I follow, how many also follow this user")
at 2,246ms for 20 users. Every other personalization subquery
(does_current_user_follow, does_current_user_subscribe,
does_follow_current_user, artist_coin_badge) took 2.2ms or less.

The old shape let Postgres pick a Merge Join that walked the full
follower list of the target — 492k-1.9M rows for popular users
like @audius — just to intersect with my ~1,752 followees.

Rewrite drives the loop from "my followees" (always small) and
probes whether each follows the target. The LIMIT 1 OFFSET 0 inside
the EXISTS is the same optimization fence used by the feed query
(#798): it pins the planner to nested-loop semantics so the plan
never flips back to merge join.

Verified on the prod read replica (full GetUsers, 20 popular target
users, three warm runs each):

  myId=0  (unsigned)               : 4ms     -> 2ms       (unchanged, sanity)
  myId=20 (1752 follows)           : 2-3s    -> 127-155ms (15-20x)
  myId=755516 (1816 follows)       : 2.5s    -> 142-157ms (15-18x)

End-to-end via local server, /v1/full/users/handle/audius (1.95M
followers) with ?user_id=Wem1e: 60-85ms warm. Response shape
unchanged; current_user_followee_follow_count returns the same
count as before.

== GetTracks album_backlink: partial album index

The album_backlink subquery does ~200 random playlists_pkey lookups
per popular track to filter for `is_album AND is_delete=false AND
is_current=true`. ~99.98% of those lookups end up rejected (because
most playlists aren't albums). For 50 popular tracks that's 10k
heap probes returning 1-2 actual matches.

A partial index covering only published-album playlists lets
non-album lookups skip the heap entirely: the index scan finds no
entry and moves on without ever fetching the heap page.

  Index size: ~55k rows x ~12 bytes ≈ 700 KB.
  Built CONCURRENTLY (no ACCESS EXCLUSIVE lock).

Expected GetTracks album_backlink portion drops from ~38ms (50
popular tracks, warm) to ~10-15ms — most of GetTracks's "always-on"
cost.
@raymondjacobson raymondjacobson merged commit da04209 into main May 9, 2026
4 checks passed
@raymondjacobson raymondjacobson deleted the ray/perf-getusers-triangle-album-index branch May 9, 2026 00:34