
Ensure keywords exist for all sources#91

Merged
brikin01 merged 4 commits into main from stesol-520-fill-source-keywords
Jun 24, 2026

@brikin01

Collaborator

STESOL-520

Add keywords for Ampere sources and add a test to ensure that future sources always have keywords.

Copilot AI review requested due to automatic review settings June 16, 2026 22:37

Copilot AI left a comment

Contributor

Pull request overview

Adds missing Keywords metadata for Ampere document sources in the embedding-generation source list, and introduces a validation test to prevent future CSV rows from being added without keywords (supporting lexical/BM25 retrieval quality).

Changes:

  • Populate Keywords for Ampere sources in vector-db-sources.csv (and remove an invalid chrome-extension:// URL form).
  • Add a pytest that asserts all rows with a non-empty URL also have non-empty Keywords, and that required columns exist.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| embedding-generation/vector-db-sources.csv | Adds keyword values for Ampere sources so they participate in lexical retrieval and meet the new validation requirement. |
| embedding-generation/tests/test_vector_db_sources.py | Adds a CSV validation test enforcing presence of Keywords (and required columns) for all source rows. |
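
For readers who want a concrete picture of the validation described above, here is a minimal sketch of such a check. It is not the test merged in this PR: the helper name `rows_missing_keywords`, the sample data, and the assumption that the CSV header uses exactly the column names `URL` and `Keywords` are all illustrative.

```python
import csv
import io

# Assumed column names; the real CSV may define additional required columns.
REQUIRED_COLUMNS = {"URL", "Keywords"}


def rows_missing_keywords(csv_text):
    """Return (row_number, url) for every row that has a URL but no keywords."""
    reader = csv.DictReader(io.StringIO(csv_text))

    # Fail early if the header lacks a required column.
    missing_columns = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing_columns:
        raise ValueError(f"missing required columns: {sorted(missing_columns)}")

    offenders = []
    # The header is row 1, so data rows are counted from 2.
    for row_number, row in enumerate(reader, start=2):
        url = (row.get("URL") or "").strip()
        keywords = (row.get("Keywords") or "").strip()
        # Rows without a URL are skipped; only real sources need keywords.
        if url and not keywords:
            offenders.append((row_number, url))
    return offenders


def test_all_sources_have_keywords():
    # Hypothetical sample data standing in for vector-db-sources.csv.
    sample = (
        "URL,Keywords\n"
        "https://example.com/a,alpha; beta\n"
        "https://example.com/b,\n"
    )
    assert rows_missing_keywords(sample) == [(3, "https://example.com/b")]
```

In a real test the CSV text would presumably be read from `embedding-generation/vector-db-sources.csv`, and the assertion would expect an empty list. Returning the offending rows, rather than a bare boolean, lets the failure message point at the exact sources that need keywords.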

Comment thread embedding-generation/tests/test_vector_db_sources.py
@brikin01 brikin01 requested a review from NeethuESim June 23, 2026 15:27
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

@NeethuESim NeethuESim left a comment

Collaborator

Looks good to me!

@brikin01 brikin01 merged commit 3704826 into main Jun 24, 2026
4 of 5 checks passed
3 participants