Add monitor tuning meta-guide for run-volume and false-positive avoidance by samgutentag · Pull Request #53 · trunk-io/docs2

samgutentag · 2026-05-20T22:59:23Z

Summary

New page: flaky-tests/detection/tuning-monitors.mdx
Adds a docs.json nav entry under the Flaky test detection group, slotted after the three monitor-type pages
Ties together run-volume → monitor-type recommendations, single-failure-flap avoidance, branch coverage, recovery vs activation, monitor states (active / inactive / disabled), and a pre-auto-quarantine checklist

Why

Sourced from customer feedback mining (cluster monitor-tuning-thresholds, verdict partial + first-class IA candidate, 15 pairs across 7 customers). The individual monitor pages already document each monitor type, but customers consistently ask the same set of system-level tuning questions that no single page answers: when to use failure-count vs failure-rate, how to avoid single-failure flips, why a monitor scoped to main misses queue-branch failures, what "inactive" means in the UI, and what to check before turning on auto-quarantine.
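To make the failure-count vs failure-rate question concrete, here is a minimal sketch of how the same run history looks to a rate-based check and to a count-based check. The function names and the 20% and 2-failure thresholds are hypothetical, chosen only for illustration; they are not Trunk defaults and this is not how the monitors are implemented.

```python
def failure_rate(results: list[bool]) -> float:
    """Fraction of failed runs; `results` holds True for a pass, False for a failure."""
    if not results:
        return 0.0
    return results.count(False) / len(results)


def rate_triggers(results: list[bool], threshold: float = 0.20) -> bool:
    # Hypothetical rate-based check: fires when the failure fraction reaches the threshold.
    return failure_rate(results) >= threshold


def count_triggers(results: list[bool], min_failures: int = 2) -> bool:
    # Hypothetical count-based check: fires only after a minimum number of failures.
    return results.count(False) >= min_failures


low_volume = [True, True, False]          # 3 runs, 1 failure
high_volume = [True] * 95 + [False] * 5   # 100 runs, 5 failures

print(rate_triggers(low_volume), count_triggers(low_volume))
print(rate_triggers(high_volume), count_triggers(high_volume))
```

With only three runs, one failure is already a 33% failure rate, so the rate-based check flips on a single failure while the count-based check does not. At 100 runs the situation reverses: five failures stay under a 20% rate but clear a 2-failure count.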

Items flagged for review

Page location. Slotted under flaky-tests/detection/ rather than flaky-tests/management/ because the page is about tuning detection behavior, not managing already-detected tests. The cluster suggestion mentioned either location; this felt cleaner since every link inside the page points at detection pages. Confirm or move.
Auto-quarantine recommended window: "1-3 days." Lifted from the cluster Q&A (Caseware thread). Confirm this still matches current eng guidance.
Pass-on-Retry default recovery = 7 days, range 1-15. Pulled from pass-on-retry-monitor.mdx and matches the cluster Gusto thread.
Branch patterns table (Trunk Merge Queue / GitHub Merge Queue / Graphite Merge Queue) mirrors the table in failure-rate-monitor.mdx. GitLab Merge Trains are intentionally omitted: the cluster didn't surface a question about them, and failure-rate-monitor.mdx notes they run on the target branch directly.
The "gap" section explicitly calls out that there's no way to distinguish "flakes detected in MQ" from "bad PR in MQ" at the monitor level, and proposes a >=2 failures in 1h failure-count threshold on queue branches as a proxy. This came directly from the Gusto thread reply ("Higher-threshold failure count monitor that marks broken is the right pattern... No good way to distinguish flakes-detected-in-MQ from actual-bad-PRs-in-MQ today."). Confirm the proxy guidance is still accurate.
"Inactive" state definition. The cluster note said "Copy will be improved" in the UI. The doc currently defines the state as "previously triggered, no longer triggered, still enabled." Confirm this matches the latest UI state and whether the copy change has shipped.
Pre-auto-quarantine cross-link points at ../agents/autofix-flaky-tests. That page exists but its content is more about the auto-investigation/PR flow than the auto-quarantine toggle. If there's a better target page for the auto-quarantine setting itself, swap it.
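For reviewers checking the queue-branch proxy flagged above, the ">=2 failures in 1h" rule can be sketched as a sliding-window count. This is an illustration under stated assumptions, not the monitor's implementation: the function names, the timestamp-list input, and the half-open window boundary are all hypothetical.

```python
from datetime import datetime, timedelta


def failures_in_window(failure_times, now, window=timedelta(hours=1)):
    """Count failures whose timestamp falls inside (now - window, now]."""
    cutoff = now - window
    return sum(1 for t in failure_times if cutoff < t <= now)


def queue_monitor_triggers(failure_times, now, min_failures=2,
                           window=timedelta(hours=1)):
    # Hypothetical proxy: require repeated failures in a short window on
    # queue branches, so a single failure does not flip the test.
    return failures_in_window(failure_times, now, window) >= min_failures


now = datetime(2026, 5, 20, 12, 0)
single = [now - timedelta(minutes=10)]
repeated = [now - timedelta(minutes=50), now - timedelta(minutes=5)]
stale = [now - timedelta(hours=3), now - timedelta(minutes=5)]

print(queue_monitor_triggers(single, now))
print(queue_monitor_triggers(repeated, now))
print(queue_monitor_triggers(stale, now))
```

A threshold like this only filters out one-off failures. As the Gusto thread reply notes, it still cannot tell a flaky test from a bad PR that fails the same test twice within the window.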

Customer signal

Cluster: monitor-tuning-thresholds (verdict: partial, 15 pairs / 7 customers, first-class IA candidate)
Channels: trunk-gusto, trunk-retool, trunk-descript, trunk-chainlink, trunk-healthie, trunk-caseware
Source threads (full list in findings/clusters/monitor-tuning-thresholds.json):

mintlify · 2026-05-20T23:03:10Z

Preview deployment for your docs. Learn more about Mintlify Previews.

Project	Status	Preview	Updated (UTC)
trunk	🟢 Ready	View Preview	May 20, 2026, 11:05 PM

💡 Tip: Enable Workflows to automatically generate PRs for you.

add monitor tuning meta-guide: matching monitors to run volume, single-failure-flap avoidance, branch coverage

46dc9a5

samgutentag added the "needs review" label (PR sourced from customer-feedback-mining; needs human scrutiny for accuracy before merge) May 20, 2026

mintlify Bot deployed to staging May 20, 2026 23:05 View deployment

samgutentag wants to merge 1 commit into main from sam-gutentag/monitor-tuning
