Add monitor tuning meta-guide for run-volume and false-positive avoidance #53
Open
samgutentag wants to merge 1 commit into
…e-failure-flap avoidance, branch coverage
Summary
- `flaky-tests/detection/tuning-monitors.mdx`
- `docs.json` nav entry under the Flaky test detection group, slotted after the three monitor-type pages

Why
Sourced from customer feedback mining (cluster `monitor-tuning-thresholds`, verdict `partial` + first-class IA candidate, 15 pairs across 7 customers). The individual monitor pages already document each monitor type, but customers consistently ask the same set of system-level tuning questions: when to use failure-count vs failure-rate, how to avoid single-failure flips, why a monitor scoped to `main` misses queue-branch failures, what "inactive" means in the UI, and what to check before turning on auto-quarantine.

Items flagged for review
- `flaky-tests/detection/` rather than `flaky-tests/management/` because the page is about tuning detection behavior, not managing already-detected tests. The cluster suggestion mentioned either location; this felt cleaner since every link inside the page points at detection pages. Confirm or move.
- `pass-on-retry-monitor.mdx` and matches the cluster Gusto thread.
- `failure-rate-monitor.mdx`. GitLab Merge Trains intentionally omitted since the cluster didn't surface a question about them; `failure-rate-monitor.mdx` notes they run on the target branch directly.
- `>=2 failures in 1h` failure-count threshold on queue branches as a proxy. This came directly from the Gusto thread reply ("Higher-threshold failure count monitor that marks broken is the right pattern... No good way to distinguish flakes-detected-in-MQ from actual-bad-PRs-in-MQ today."). Confirm the proxy guidance is still accurate.
- `../agents/autofix-flaky-tests`. That page exists but its content is more about the auto-investigation/PR flow than the auto-quarantine toggle. If there's a better target page for the auto-quarantine setting itself, swap it.

Customer signal
- `monitor-tuning-thresholds` (verdict: partial, 15 pairs / 7 customers, first-class IA candidate)
- `findings/clusters/monitor-tuning-thresholds.json`):