| 1 | +# Vector Search Implementation – LLM Build Prompts |
| 2 | + |
| 3 | +> This document supplies a **blueprint** and a **chain of reusable prompts** that a code-generation LLM (e.g. Cursor-GPT) can follow to implement semantic vector search in KillrVideo **incrementally and test-driven**. |
| 4 | +> Each prompt is self-contained but builds on the artifacts produced by the previous ones. After completing **each** prompt the LLM must: |
| 5 | +> 1. **Run** `ruff check --fix .`, `black .`, `pytest -q` (if any test fails, fix and re-run). |
| 6 | +> 2. Commit only when the *entire* suite is green. |
| 7 | + |
| 8 | +--- |
| 9 | + |
| 10 | +## 0. Glossary & Conventions |
| 11 | +* **NV-Embed** = NVIDIA embedding model (4096-dim) with **512-token** input cap. |
| 12 | +* **Data API** = Astra DB's JSON-over-HTTP API; its `$vectorize` feature generates embeddings server-side and is used here for vector search. |
| 13 | +* **FastAPI app** lives in `app/`. |
| 14 | +* **Test helpers** in `tests/` use `pytest` + `pytest-asyncio`. |
| 15 | +* Use **feature flag** `settings.VECTOR_SEARCH_ENABLED` (default **False**). |
| 16 | +* Use `HTTP 400` for token-limit violations in search/query paths. |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## 1. High-Level Blueprint (single narrative) |
| 21 | +1. **Schema Migration** – enlarge `content_features` column, attach NVIDIA provider, recreate SAI index, & seed backfill. |
| 22 | +2. **Ingestion Changes** – on video submit/update, assemble *title + description + tags*, clip to 512 tokens, store as *string* (server vectorises). |
| 23 | +3. **Semantic Search** – new helper in `video_service`, integrate into existing `/search/videos` endpoint, add `mode` param. |
| 24 | +4. **Pagination & Validation** – respect `page`/`pageSize`, enforce query length ≤ 512 tokens, fall back to keyword search when semantic search is disabled. |
| 25 | +5. **OpenAPI & Docs** – update schemas + docs. |
| 26 | +6. **Front-end Search UI** – add search box & results list, fallback messaging. |
| 27 | +7. **Feature Roll-out** – env flag, smoke tests, monitoring hooks. |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +## 2. Iterative Roadmap → Chunks |
| 32 | +| Milestone | Chunk | Output | |
| 33 | +|-----------|-------|--------| |
| 34 | +| M1 Schema | C1.1 DDL script (JSON & CQL) | `migrations/2025_08_vector.cql` & `migrations/2025_08_vector.json` (run in CI) | |
| 35 | +| | C1.2 Py backfill job | `scripts/backfill_vectors.py` | |
| 36 | +| M2 Ingest | C2.1 `clip_to_512_tokens` util + tests | `app/utils/text.py` | |
| 37 | +| | C2.2 Submit flow rewrite | `video_service.py` patched | |
| 38 | +| M3 Search | C3.1 Service helper | `search_videos_by_semantic()` + unit tests | |
| 39 | +| | C3.2 Router wiring | `mode` param on `/api/v1/search/videos` + tests | |
| 40 | +| M4 Docs | C4.1 OpenAPI regen | updated yaml | |
| 41 | +| M5 Front | C5.1 Home search UI | React component & e2e tests | |
| 42 | +| M6 Roll-out | C6.1 Feature flag infra | settings + toggles | |
| 43 | + |
| 44 | +Chunks are intentionally modest (≈1–3 files each, <150 LoC). |
| 45 | + |
| 46 | +--- |
| 47 | + |
| 48 | +## 3. Right-Sized Steps (final cut) |
| 49 | +1. **Step 1 – Create DB migration & index recreation** |
| 50 | +2. **Step 2 – Backfill existing videos with `$vectorize` bulk update** |
| 51 | +3. **Step 3 – Add `clip_to_512_tokens()` util + tests** |
| 52 | +4. **Step 4 – Modify `submit_new_video()` to store string & guard token count** |
| 53 | +5. **Step 5 – Implement `search_videos_by_semantic()` helper + unit tests** |
| 54 | +6. **Step 6 – Extend search router with `mode` param, integrate helper** |
| 55 | +7. **Step 7 – Update OpenAPI YAML & regenerate client** |
| 56 | +8. **Step 8 – Add feature flag env var + toggling logic** |
| 57 | +9. **Step 9 – Front-end search bar + API wiring (mocked until backend green)** |
| 58 | +10. **Step 10 – Smoke & load tests, rollout script** |
| 59 | + |
| 60 | +Each step below is accompanied by an LLM prompt. |
| 61 | + |
| 62 | +--- |
| 63 | + |
| 64 | +## 4. Prompts (feed sequentially) |
| 65 | + |
| 66 | +### Prompt 1 – DB Migration |
| 67 | +```text |
| 68 | +You are working inside the KillrVideo FastAPI repo. |
| 69 | +Goal: **Enlarge the `videos.content_features` column to `vector<float,4096>` and attach the NVIDIA service**. Also drop & recreate the SAI cosine index. |
| 70 | +Tasks: |
| 71 | +1. Add *migrations/2025_08_vector.cql* containing the necessary `ALTER TABLE`, `DROP INDEX`, `CREATE INDEX` CQL. |
| 72 | +2. Add *migrations/2025_08_vector.json* Data API payload (see docs/vector_search.md §3). |
| 73 | +3. Register the CQL script in *scripts/migrate.py* so CI picks it up. |
| 74 | +4. Unit test: mock Cassandra session; assert index metadata after migration. |
| 75 | +After coding run **ruff, black, pytest**. Ensure all tests pass. |
| 76 | +``` |
| 77 | + |
| 78 | +--- |
| 79 | + |
| 80 | +### Prompt 2 – Vector Backfill Job |
| 81 | +```text |
| 82 | +Goal: **Populate the new 4096-dim vectors for existing rows**. |
| 83 | +1. Create *scripts/backfill_vectors.py*. |
| 84 | + • Page through `videos` (page size 100) and keep rows whose `content_features` is null; filter client-side, since CQL `WHERE` has no `IS NULL` predicate. |
| 85 | + • Build text = title + description + tags. |
| 86 | + • POST Data API `updateMany` with `$vectorize`. |
| 87 | +2. Provide CLI entry-point `python -m scripts.backfill_vectors --dry-run`. |
| 88 | +3. Add unit tests with `responses` to stub Data API. |
| 89 | +Run lints/tests until green. |
| 90 | +``` |
| 91 | + |
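The batching and `--dry-run` behaviour described in Prompt 2 could be sketched roughly as follows. This is a minimal illustration, not the project's script: the row keys (`videoid`, `title`, `description`, `tags`), the `send` callable, and the shape of each planned update are assumptions, and the actual Data API request is deliberately left to whatever `send` does.

```python
import argparse
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Sequence

PAGE_SIZE = 100  # page size named in Prompt 2

Row = Dict[str, Any]


def build_text(row: Row) -> str:
    """Join title, description and tags into one embedding string."""
    parts = [
        row.get("title") or "",
        row.get("description") or "",
        " ".join(row.get("tags") or []),
    ]
    return " ".join(p.strip() for p in parts if p.strip())


def pending_batches(rows: Iterable[Row], size: int = PAGE_SIZE) -> Iterator[List[Row]]:
    """Yield batches of rows that still lack a vector, `size` rows at a time."""
    batch: List[Row] = []
    for row in rows:
        if row.get("content_features") is not None:
            continue  # already vectorised: nothing to backfill
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def backfill(rows: Iterable[Row], send: Callable[[List[Row]], None], dry_run: bool) -> int:
    """Plan one update per pending row; call `send` per batch unless dry-running."""
    planned = 0
    for batch in pending_batches(rows):
        # Hypothetical update shape; the real request body depends on the Data API call used.
        updates = [{"videoid": r["videoid"], "text": build_text(r)} for r in batch]
        planned += len(updates)
        if not dry_run:
            send(updates)
    return planned


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Backfill video vectors (sketch).")
    parser.add_argument("--dry-run", action="store_true", help="plan updates without sending them")
    return parser.parse_args(argv)
```

Injecting `send` keeps the planning logic testable without stubbing HTTP, while the unit tests requested in the prompt can still stub the Data API at the transport layer.
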
| 92 | +--- |
| 93 | + |
| 94 | +### Prompt 3 – Token-Clipping Utility |
| 95 | +```text |
| 96 | +Goal: Guard the 512-token NVIDIA input limit. |
| 97 | +1. Create *app/utils/text.py* with `clip_to_512_tokens(text: str) -> str` using a whitespace splitter. |
| 98 | +2. Edge cases: consecutive whitespace, Unicode punctuation. |
| 99 | +3. Tests: >512 tokens → clipped to exactly 512 tokens; ≤512 tokens → returned unchanged. |
| 100 | +Run ruff/black/pytest. |
| 101 | +``` |
| 102 | + |
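A minimal sketch of the utility Prompt 3 asks for, assuming that whitespace-delimited words are an acceptable stand-in for model tokens (the embedding model's own tokenizer may count differently, so this is an approximation rather than an exact guard). The `MAX_TOKENS` constant is an illustrative name.

```python
MAX_TOKENS = 512  # input cap stated in the glossary


def clip_to_512_tokens(text: str) -> str:
    """Return `text` limited to its first 512 whitespace-delimited tokens."""
    # str.split() with no argument splits on runs of any Unicode whitespace
    # and discards empty strings, so consecutive spaces do not inflate the count.
    tokens = text.split()
    if len(tokens) <= MAX_TOKENS:
        return text  # within the cap: return the input unchanged
    # Over the cap: keep the first 512 tokens, re-joined with single spaces.
    return " ".join(tokens[:MAX_TOKENS])
```

Punctuation stays attached to its neighbouring word under this splitter, so Unicode punctuation never changes the token count.
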
| 103 | +--- |
| 104 | + |
| 105 | +### Prompt 4 – Ingestion Pipeline Update |
| 106 | +```text |
| 107 | +Goal: Use auto-vectorize on insert. |
| 108 | +1. In *app/services/video_service.py* → function `submit_new_video`: |
| 109 | + • Build `embedding_text` from name/description/tags. |
| 110 | + • Call `clip_to_512_tokens`. |
| 111 | + • Assign string to `content_features` field. |
| 112 | +2. Remove legacy 16-float stub path. |
| 113 | +3. Add unit tests with `monkeypatch` to verify Data API payload contains **string**, not list. |
| 114 | +Run lints/tests. |
| 115 | +``` |
| 116 | + |
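The ingestion change in Prompt 4 amounts to building one string and storing it where the float list used to go. The sketch below illustrates that idea under assumptions: the document keys (`videoid`, `name`) and both function names are hypothetical, and clipping is inlined with a whitespace split so the block stands alone.

```python
from typing import Any, Dict, Iterable, Optional

MAX_TOKENS = 512  # input cap stated in the glossary


def build_embedding_text(name: str, description: Optional[str], tags: Iterable[str]) -> str:
    """Concatenate name, description and tags, clipped to 512 whitespace tokens."""
    parts = [name.strip()]
    if description and description.strip():
        parts.append(description.strip())
    tag_text = " ".join(t.strip() for t in tags if t.strip())
    if tag_text:
        parts.append(tag_text)
    tokens = " ".join(parts).split()
    return " ".join(tokens[:MAX_TOKENS])


def build_video_document(
    video_id: str, name: str, description: Optional[str], tags: Iterable[str]
) -> Dict[str, Any]:
    """Assemble the insert payload; `content_features` is a string, not a float list."""
    return {
        "videoid": video_id,  # hypothetical key name
        "name": name,
        # The server is expected to vectorise this string on insert.
        "content_features": build_embedding_text(name, description, tags),
    }
```

A unit test along the lines of step 3 would then assert `isinstance(doc["content_features"], str)` on the payload captured from the stubbed Data API call.
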
| 117 | +--- |
| 118 | + |
| 119 | +### Prompt 5 – Semantic Search Helper |
| 120 | +```text |
| 121 | +Goal: Backend ANN search wrapper. |
| 122 | +1. Add `search_videos_by_semantic(query: str, page:int, page_size:int)` to *video_service.py*. |
| 123 | + • Validate len(query_tokens) ≤512 else raise `InvalidQueryError` (400). |
| 124 | + • Call Data API `find` with `sort:{"$vectorize": query}`. |
| 125 | +2. Return list[VideoSummary] preserving existing pagination schema. |
| 126 | +3. Tests: stub API, assert ordering & error path. |
| 127 | +Run lints/tests. |
| 128 | +``` |
| 129 | + |
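One way the helper in Prompt 5 might look is sketched below. Several things here are assumptions: the extra `run_find` parameter (injected so the example runs without a database), the whitespace-based token count, and the pagination strategy of requesting the top `page * page_size` matches and slicing the last page client-side. Only the `sort: {"$vectorize": query}` clause comes from the prompt itself.

```python
from typing import Any, Callable, Dict, List

MAX_QUERY_TOKENS = 512  # input cap stated in the glossary


class InvalidQueryError(ValueError):
    """Raised for invalid queries; the router is expected to map it to HTTP 400."""


def search_videos_by_semantic(
    query: str,
    page: int,
    page_size: int,
    run_find: Callable[[Dict[str, Any]], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Validate the query, issue a vector-sorted `find`, and return one page."""
    if len(query.split()) > MAX_QUERY_TOKENS:
        raise InvalidQueryError(f"query exceeds {MAX_QUERY_TOKENS} tokens")
    if page < 1 or page_size < 1:
        raise InvalidQueryError("page and page_size must be positive")

    payload = {
        "find": {
            "sort": {"$vectorize": query},
            # Assumed strategy: fetch every row up to the end of the requested page.
            "options": {"limit": page * page_size},
        }
    }
    rows = run_find(payload)  # rows are expected in similarity order
    start = (page - 1) * page_size
    return rows[start : start + page_size]
```

Fetching `page * page_size` rows grows more expensive for deep pages, so a real implementation may prefer to cap the reachable page depth.
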
| 130 | +--- |
| 131 | + |
| 132 | +### Prompt 6 – API Router Wiring |
| 133 | +```text |
| 134 | +Goal: Expose semantic mode. |
| 135 | +1. In *routers/search.py* add optional `mode: Literal['semantic','keyword']='semantic'`. |
| 136 | +2. If `mode=='semantic' and settings.VECTOR_SEARCH_ENABLED` → call the semantic helper; otherwise fall back to keyword search. |
| 137 | +3. Update OpenAPI annotations. |
| 138 | +4. Tests: both branches, 400 on long query. |
| 139 | +Run lints/tests. |
| 140 | +``` |
| 141 | + |
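The branching rule in Prompt 6 can be isolated from the web framework, which makes both branches easy to test. The sketch below is framework-agnostic on purpose; `resolve_search_mode`, `dispatch_search`, and the two injected search callables are hypothetical names, not part of the existing router.

```python
from typing import Any, Callable, Dict, List, Literal

SearchMode = Literal["semantic", "keyword"]
SearchFn = Callable[[str], List[Dict[str, Any]]]


def resolve_search_mode(requested: SearchMode, vector_search_enabled: bool) -> SearchMode:
    """Semantic search runs only when it is both requested and enabled."""
    if requested == "semantic" and vector_search_enabled:
        return "semantic"
    return "keyword"


def dispatch_search(
    query: str,
    mode: SearchMode,
    vector_search_enabled: bool,
    semantic_search: SearchFn,
    keyword_search: SearchFn,
) -> List[Dict[str, Any]]:
    """Route the query to the semantic helper or to the keyword fallback."""
    if resolve_search_mode(mode, vector_search_enabled) == "semantic":
        return semantic_search(query)
    return keyword_search(query)
```

With this split, the route handler only needs to read `mode` and the feature flag and pass them through, so the default of `'semantic'` stays harmless while the flag is off.
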
| 142 | +--- |
| 143 | + |
| 144 | +### Prompt 7 – OpenAPI & Client Regen |
| 145 | +```text |
| 146 | +Goal: Align docs with new behaviour. |
| 147 | +1. Update *docs/killrvideo_openapi.yaml* paths `/search/videos` (`mode` param, 400 response). |
| 148 | +2. Run generator (`scripts/gen_client.py`) to refresh `client/` stubs. |
| 149 | +3. Ensure CI passes. |
| 150 | +``` |
| 151 | + |
| 152 | +--- |
| 153 | + |
| 154 | +### Prompt 8 – Feature Flag Infrastructure |
| 155 | +```text |
| 156 | +Goal: Toggle vector search safely. |
| 157 | +1. Add `VECTOR_SEARCH_ENABLED: bool = False` to *app/core/config.py* (env-driven). |
| 158 | +2. Docs update in README & `.env.example`. |
| 159 | +3. Unit test: flag off ⇒ helper not called. |
| 160 | +Run lints/tests. |
| 161 | +``` |
| 162 | + |
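For the env-driven flag in Prompt 8, the sketch below shows one possible parsing rule using only the standard library. The project's *app/core/config.py* may well use a settings class instead; the function name and the accepted truthy spellings here are assumptions.

```python
import os
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed truthy spellings


def read_vector_search_flag(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read VECTOR_SEARCH_ENABLED from the environment; default is False."""
    source = os.environ if env is None else env
    raw = source.get("VECTOR_SEARCH_ENABLED", "")
    # Anything outside the truthy set, including an unset variable, disables the flag.
    return raw.strip().lower() in _TRUE_VALUES
```

Accepting an explicit mapping lets tests exercise the parsing without mutating the process environment.
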
| 163 | +--- |
| 164 | + |
| 165 | +### Prompt 9 – Front-end Search UI |
| 166 | +```text |
| 167 | +Goal: New search bar (React / Next.js). |
| 168 | +1. Create `components/SemanticSearchBar.tsx`. |
| 169 | +2. Call backend `/api/v1/search/videos?q=...`. |
| 170 | +3. Display results using existing `VideoCard`. |
| 171 | +4. Cypress e2e: search term returns expected mock. |
| 172 | +Run `npm run lint && npm run test` until green. |
| 173 | +``` |
| 174 | + |
| 175 | +--- |
| 176 | + |
| 177 | +### Prompt 10 – Smoke & Load Tests + Roll-out Script |
| 178 | +```text |
| 179 | +Goal: Confidence for production switch. |
| 180 | +1. Add *tests/e2e/test_semantic_search.py* hitting a staging DB. |
| 181 | +2. Add Locust file *load/semantic_search.py* (RPS 20). |
| 182 | +3. Create *scripts/enable_vector_flag.py* that flips env + triggers migration. |
| 183 | +4. Update GitHub Actions workflow to run load test nightly. |
| 184 | +Run lints/tests. |
| 185 | +``` |
| 186 | + |
| 187 | +--- |
| 188 | + |
| 189 | +**End of prompts.** |
| 190 | + |
| 191 | +Once Prompt 10 passes all checks, the vector search feature is fully integrated, tested and ready for production rollout. |