Gate automated review on risk/complexity scores and test coverage by msujaws · Pull Request #6249 · mozilla/bugbug

msujaws · 2026-06-29T14:59:49Z

Depends on #6248.

Adds a cheap pre-screen so the expensive review agent only runs on revisions that score low on risk and complexity, letting teams ease in by raising thresholds as confidence grows. Also adds reviewer-group resolution (with aliases) so gating can be configured per group.
Library (bugbug/tools/code_review/):

diff_analysis.py — classify in-diff test signal (absent / sparse / adequate / tests_only / no_code_change) and collect changed non-test paths.
test_coverage.py — probe mozilla-central via the async searchfox client for existing tests referencing changed files; degrades gracefully if unavailable.
scoring.py — RiskComplexityScorer: one structured Haiku call returning 0-10 risk/complexity plus factors, fed the precomputed <test_signals> block.
agent.py — inject the test-signals block into the prompt; surface token usage and tool-call counts in the response.
llms.py — DEFAULT_SCORING_MODEL and a shared usage_from_messages() helper.

Service (reviewhelper-api):

reviewer_groups.py + reviewer_groups.yaml — per-group thresholds (with defaults) from a checked-in YAML; matching_groups() resolves a revision's groups via the new PhabricatorPatch plumbing. An aliases field lets a rotation project (e.g. home-newtab-reviewers-rotation) count as the same logical group, matched against historical reviewer groups.
review_processor.py — _score_and_gate() scores before reviewing and raises ReviewSkipped("above_threshold") when either score meets its threshold.
- ReviewStatus.SKIPPED (terminal, no retry) + skipped_reason and risk/complexity/test-signal columns on review_requests, recorded even when skipped; alembic migration + status index.
- internal.py records skips as SKIPPED (HTTP 204); pricing.py adds per-model token pricing for later cost reporting.

Adds a cheap pre-screen so the (expensive) review agent only runs on revisions that score low on risk and complexity, letting teams "ease in" by raising thresholds as confidence grows. Library (bugbug/tools/code_review/): - diff_analysis.py: classify the patch's in-diff test signal (absent / sparse / adequate / tests_only / no_code_change) and collect changed non-test paths. - test_coverage.py: probe mozilla-central via the async searchfox client for existing tests referencing the changed files (covered / partial / uncovered); degrades gracefully if searchfox is unavailable. - scoring.py: RiskComplexityScorer, a single structured Haiku call returning 0-10 risk/complexity plus factors, fed the precomputed <test_signals> block. - agent.py: inject the test-signals block into the review prompt and surface token usage + tool-call counts in the response details. - llms.py: DEFAULT_SCORING_MODEL and a shared usage_from_messages() helper. Service (reviewhelper-api): - reviewer_groups.py + reviewer_groups.yaml: per-group thresholds (with defaults) loaded from a checked-in YAML file; matching_groups() resolves a revision's reviewer groups via the new PhabricatorPatch plumbing. - review_processor.py: _score_and_gate() scores before reviewing and raises ReviewSkipped("above_threshold") when either score meets its threshold. - ReviewStatus.SKIPPED (terminal, no retry) + skipped_reason and risk/complexity/test-signal columns on review_requests, recorded even when skipped; alembic migration + status index. - internal.py records skips as SKIPPED (HTTP 204); request.py messaging. - pricing.py: per-model token pricing for later cost reporting.

Consume PhabricatorPatch.historical_reviewer_project_phids so a group that was assigned via a rotation (added, then swapped for an individual and removed) still matches. Add an `aliases` field to ReviewerGroup so a rotation project (e.g. home-newtab-reviewers-rotation) counts as the same logical group as home-newtab-reviewers, and resolve every slug (own + aliases) when matching. Falls back to the current reviewer snapshot for patches without history.

This was referenced Jun 29, 2026

Limit automated review to enabled groups and member authors #6250

Open

Match review-context rules on reviewer groups #6251

Open

msujaws force-pushed the feature-1-risk-gating branch from b342f0c to 48cbf57 Compare June 29, 2026 18:30

msujaws added 2 commits June 30, 2026 11:45

msujaws force-pushed the feature-1-risk-gating branch from 48cbf57 to b403990 Compare June 30, 2026 15:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Gate automated review on risk/complexity scores and test coverage#6249

Gate automated review on risk/complexity scores and test coverage#6249
msujaws wants to merge 2 commits into
mozilla:masterfrom
msujaws:feature-1-risk-gating

msujaws commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

msujaws commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant