Gate automated review on risk/complexity scores and test coverage#6249
Open
msujaws wants to merge 2 commits into
Open
Gate automated review on risk/complexity scores and test coverage#6249msujaws wants to merge 2 commits into
msujaws wants to merge 2 commits into
Conversation
This was referenced Jun 29, 2026
b342f0c to
48cbf57
Compare
Adds a cheap pre-screen so the (expensive) review agent only runs on revisions
that score low on risk and complexity, letting teams "ease in" by raising
thresholds as confidence grows.
Library (bugbug/tools/code_review/):
- diff_analysis.py: classify the patch's in-diff test signal (absent / sparse /
adequate / tests_only / no_code_change) and collect changed non-test paths.
- test_coverage.py: probe mozilla-central via the async searchfox client for
existing tests referencing the changed files (covered / partial / uncovered);
degrades gracefully if searchfox is unavailable.
- scoring.py: RiskComplexityScorer, a single structured Haiku call returning
0-10 risk/complexity plus factors, fed the precomputed <test_signals> block.
- agent.py: inject the test-signals block into the review prompt and surface
token usage + tool-call counts in the response details.
- llms.py: DEFAULT_SCORING_MODEL and a shared usage_from_messages() helper.
Service (reviewhelper-api):
- reviewer_groups.py + reviewer_groups.yaml: per-group thresholds (with
defaults) loaded from a checked-in YAML file; matching_groups() resolves a
revision's reviewer groups via the new PhabricatorPatch plumbing.
- review_processor.py: _score_and_gate() scores before reviewing and raises
ReviewSkipped("above_threshold") when either score meets its threshold.
- ReviewStatus.SKIPPED (terminal, no retry) + skipped_reason and
risk/complexity/test-signal columns on review_requests, recorded even when
skipped; alembic migration + status index.
- internal.py records skips as SKIPPED (HTTP 204); request.py messaging.
- pricing.py: per-model token pricing for later cost reporting.
Consume PhabricatorPatch.historical_reviewer_project_phids so a group that was assigned via a rotation (added, then swapped for an individual and removed) still matches. Add an `aliases` field to ReviewerGroup so a rotation project (e.g. home-newtab-reviewers-rotation) counts as the same logical group as home-newtab-reviewers, and resolve every slug (own + aliases) when matching. Falls back to the current reviewer snapshot for patches without history.
48cbf57 to
b403990
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Depends on #6248.
Adds a cheap pre-screen so the expensive review agent only runs on revisions that score low on risk and complexity, letting teams ease in by raising thresholds as confidence grows. Also adds reviewer-group resolution (with aliases) so gating can be configured per group.
Library (bugbug/tools/code_review/):
Service (reviewhelper-api):
aliasesfield lets a rotation project (e.g. home-newtab-reviewers-rotation) count as the same logical group, matched against historical reviewer groups.