85 changes: 85 additions & 0 deletions merge-queue/optimizations/batching.mdx
@@ -42,7 +42,7 @@

* **High-risk changes** — Infrastructure updates, database migrations, or changes that could affect other PRs in unpredictable ways
* **Debugging batch failures** — Isolate a suspected problematic PR to confirm it tests correctly on its own
* **Critical hotfixes** — Make sure a time-sensitive fix isn't delayed or affected by other PRs in a batch

* **Flaky PR isolation** — Test a PR with known flaky behavior separately to avoid impacting other PRs

#### How to exclude a PR from batching
@@ -322,7 +322,7 @@
* **Misconception:** "If a batch fails, all PRs in the batch fail" 
* **Reality:** Trunk automatically splits the batch and retests to identify only the failing PR(s). Passing PRs still merge.
* **Misconception:** "Batching always makes the queue faster" 
* **Reality:** Batching is most effective with stable tests and high PR volume. For low-traffic repos or flaky tests, the overhead may outweigh benefits.

## Related features

@@ -336,6 +336,91 @@

**Anti-flake protection** - Essential companion to batching. Reduces false batch failures caused by flaky tests, making batching more reliable and efficient.

## How batching interacts with parallel queues and PFD

Batching, [parallel queues](./parallel-queues), [pending failure depth](./pending-failure-depth), and [optimistic merging](./optimistic-merging) each shape the queue independently — but they also interact in ways that are easy to miss when reading each page in isolation. This section covers the cross-feature behaviors customers most often ask about.

### Batching is another form of lane

It's tempting to think of batching as something that happens *within* a parallel-queue lane. It's clearer to think of a batch as **itself another form of lane**: a temporary testing-lane shape that groups PRs together for a single test run.

Parallel queues split the queue into lanes based on impacted targets. Batching groups PRs into a single test unit. Both produce a "thing the queue tests as one." The two are orthogonal — you can run batching inside a parallel-queue setup, and the graph view shows the resulting lane shapes.

<Info>
Use the merge queue graph view to see how batches and parallel-queue lanes overlap for your queue's current state.
</Info>

### PR batch eligibility

Trunk groups PRs into a batch based on:

* **FIFO order** — PRs are considered for batching in the order they were submitted to the queue.
* **Target overlap** — In parallel mode, PRs only batch together if they share enough impacted-target overlap to belong to the same lane. PRs with disjoint target sets stay in separate lanes and don't batch with each other.
* **Maximum wait time and target batch size** — A batch forms when either the target batch size fills, or the maximum wait time elapses with at least one PR present.
* **`--no-batch` flag** — PRs submitted with `--no-batch` (or `noBatch: true` via API) test in isolation, regardless of the batching configuration.

If a PR ahead in the queue declares it impacts ALL targets, that doesn't force every downstream PR into the same batch — see the next section.
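The eligibility rules above can be sketched as a small model. This is an illustration only, not Trunk's implementation: `QueuedPR`, `form_batch`, and their parameters are hypothetical names, the input is assumed to be a single lane's PRs already in FIFO order (so target overlap has been resolved beforehand), and stopping collection at an opted-out PR is an assumption made to keep FIFO order simple.

```python
from dataclasses import dataclass


@dataclass
class QueuedPR:
    name: str
    submitted_at: float      # seconds since an arbitrary epoch
    no_batch: bool = False   # models the `--no-batch` / `noBatch: true` opt-out


def form_batch(waiting, now, target_size, max_wait_s):
    """Return the names of the next group to test, or [] to keep waiting.

    `waiting` is one lane's PRs in FIFO order (hypothetical model).
    """
    if not waiting:
        return []
    head = waiting[0]
    if head.no_batch:
        return [head.name]  # opted out: tests in isolation
    candidates = []
    for pr in waiting:
        # Assumption: an opted-out PR ends the run of batchable PRs.
        if pr.no_batch or len(candidates) == target_size:
            break
        candidates.append(pr)
    timed_out = now - head.submitted_at >= max_wait_s
    if len(candidates) == target_size or timed_out:
        return [pr.name for pr in candidates]
    return []  # neither condition met yet


prs = [
    QueuedPR("A", 0.0),
    QueuedPR("B", 5.0),
    QueuedPR("C", 6.0, no_batch=True),
    QueuedPR("D", 7.0),
]
full_batch = form_batch(prs, now=10.0, target_size=2, max_wait_s=60.0)
```

In this model a batch forms as soon as the target size fills, or once the oldest waiting PR has waited the maximum time, whichever comes first.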

### The ALL keyword does not serialize downstream PRs

A PR that impacts ALL targets blocks downstream PRs that share *any* target with it. But "blocks" doesn't mean "serializes" — downstream PRs without shared targets can continue testing in parallel behind the ALL-impacting PR, up to your testing concurrency.

Concretely: if PR-A impacts ALL targets and PR-B impacts only `docs`, PR-B can still test in parallel behind PR-A. If PR-A fails, PR-B's test result is unaffected — they were never sharing a lane.

<Info>
ALL is a correctness signal for impact, not a serialization directive. Concurrency, target overlap, and FIFO order still govern what tests in parallel.
</Info>

### Transitive dependents are usually captured by impacted targets

A common worry: "PR-1 changes target X, PR-2 introduces a feature that depends on X's old behavior. Will they merge in parallel and break main?"

If the two PRs share zero impacted targets, the queue treats them as parallel-safe. In practice this is rare: most impact-detection tools (Bazel, Nx, and similar) include transitive dependents when computing impacted targets. PR-2 would typically list target X as well, putting them in the same lane.

If your impact-detection setup misses transitive dependents, you'll see false-parallel merges. That's a signal to widen your impact graph, not to disable parallel mode.
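To make "include transitive dependents" concrete, here is a minimal sketch of the idea using a toy dependency graph. The graph, the target names, and the `impacted_targets` helper are all hypothetical; real tools compute this from their own build graph and may differ in detail.

```python
from collections import deque


def impacted_targets(changed, deps):
    """Return `changed` plus every target that transitively depends on it.

    `deps` maps each target to the targets it depends on directly.
    """
    # Invert the graph: for each target, which targets depend on it directly?
    dependents = {}
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            dependents.setdefault(dep, set()).add(target)

    # Breadth-first walk over the inverted graph.
    impacted = set(changed)
    frontier = deque(changed)
    while frontier:
        current = frontier.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                frontier.append(dependent)
    return impacted


# Hypothetical graph: `feature` depends on `x`, and `app` depends on `feature`.
deps = {"x": [], "feature": ["x"], "app": ["feature"], "docs": []}
pr1 = impacted_targets({"x"}, deps)        # PR-1 edits target x
pr2 = impacted_targets({"feature"}, deps)  # PR-2 edits a dependent of x
shares_lane = bool(pr1 & pr2)              # any overlap puts them in one lane
```

In this toy graph the two sets overlap because the walk pulls `feature` and `app` into PR-1's set. Reporting only the directly edited targets would leave the sets disjoint, which is the false-parallel case described above.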

### Bisection splits in half, not one-by-one

When a batch fails and the queue needs to find the culprit, it **splits the batch in half** rather than peeling off individual PRs.

A batch of 5 doesn't bisect into 5 isolated tests. It bisects into 2 sub-batches, retests, and recurses on whichever sub-batch failed. Your **Bisection Testing Concurrency** is the concurrency limit for those sub-batch test runs — not the number of individual PRs being retested at once.

This matters when you size bisection concurrency: setting it equal to your batch size is more than you need. With a single failing PR, halving reaches the culprit in roughly log2(batch size) rounds of retests, rather than one retest per PR.
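The halving process can be sketched as follows. This is a simplified model under stated assumptions, not Trunk's implementation: `bisect_rounds` is a hypothetical helper, `passes` stands in for a CI run on a group of PRs, and both halves of a failed group are assumed to be retested in the same round.

```python
def bisect_rounds(batch, passes):
    """Find failing PRs by repeated halving.

    Returns (culprits, rounds), where `rounds` lists the sub-batches
    retested in each round after the full batch has already failed.
    """
    if len(batch) == 1:
        return list(batch), []  # nothing to split
    failing = [list(batch)]
    culprits = []
    rounds = []
    while failing:
        # Split every failing group in half and retest both halves.
        this_round = []
        for group in failing:
            mid = len(group) // 2
            this_round += [group[:mid], group[mid:]]
        rounds.append(this_round)
        # Keep only the halves that failed; a failing single PR is a culprit.
        failing = []
        for group in this_round:
            if passes(group):
                continue
            if len(group) == 1:
                culprits.append(group[0])
            else:
                failing.append(group)
    return culprits, rounds


batch = ["PR-1", "PR-2", "PR-3", "PR-4", "PR-5"]
culprits, rounds = bisect_rounds(batch, passes=lambda group: "PR-4" not in group)
peak_concurrency = max(len(r) for r in rounds)
```

For this batch of 5 with one failing PR, the model needs three rounds with two sub-batches in flight per round. If several PRs in a batch fail, more sub-batches fail per round and the peak grows accordingly.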

### PFD's downstream-PR delay

Pending failure depth waits for **predecessor** groups to finish (always) and **successor** groups to finish (up to the configured depth). One consequence customers hit:

> **Example.** PFD is set to 1. PR-A fails. PR-B is testing behind it. A new PR-C arrives that shares impacted targets with PR-A. The queue will wait for PR-C to finish testing before it transitions PR-A out of Pending Failure — even though PR-C wasn't in the queue when PR-A failed.

This is by design. PFD's correctness signal depends on observing how successor PRs that include the failed group's changes behave. A newly-arrived overlapping PR is a valid successor test — its result is informative about whether PR-A's failure was a flake or a real failure. Waiting for it produces a stronger signal.

The trade-off: high PFD values combined with frequent new submissions can extend the time before a known-failed PR is kicked from the queue.

<Info>
If you need an upper bound on how many successor *conclusions* (not just queue positions) PFD waits on, that's an active area of design — share your use case with support.
</Info>

### Disabling optimistic merge causes batch-removal restarts

[Optimistic merging](./optimistic-merging) lets downstream batches keep testing against a projected future state of main while an upstream batch is still being resolved. If you disable optimistic merging, the queue can no longer reuse those downstream test results when the upstream batch changes shape.

So if a batch ahead of yours is removed (for example, because it failed bisection and one PR was kicked), your batch may need to **re-test from scratch** — even if it had already passed. Customers who disabled optimistic merging have reported PRs going from "ready to merge" back to "testing" specifically because of this restart behavior.

<Info>
If you're seeing unexplained re-tests of already-passing batches, check whether optimistic merging is disabled.
</Info>

### MQ-only failures aren't yet flakiness signals

A test that fails in the merge queue but passes on main isn't currently fed into [Trunk Flaky Tests](../../flaky-tests/overview) as a flakiness signal. Similarly, bisection test runs aren't surfaced as flakiness data today.

In practice, the queue's anti-flake protection (optimistic merging + PFD) catches a lot of these transient failures without needing a separate flakiness signal. But if you're trying to understand why an MQ-only flake doesn't show up in your flaky-test dashboard, that's why.

### Event-side view

For the webhook lifecycle of these batch events — including the `pending_failure` event that fires when a batch enters the hold state — see the [webhooks reference](../webhooks).

## Batching + Optimistic Merging and Pending Failure Depth

Enabling batching along with Pending Failure Depth and Optimistic Merging can help you realize the major cost savings of batching while still reaping the [anti-flake](./anti-flake-protection) protection of optimistic merging and pending failure depth.
4 changes: 4 additions & 0 deletions merge-queue/optimizations/parallel-queues/index.mdx
@@ -3,7 +3,7 @@
description: "Create dynamic parallel queues to reduce queue time"
---

Normally, a merge queue enqueues all submitted pull requests into a single line. In this mode, every pull request is [predictively tested](/merge-queue/optimizations/predictive-testing) against the pull requests ahead of it. While this guarantees the correctness of the protected branch at all times, wait times for items in the queue can grow under a high submission load.

A regular merge queue operates like a grocery store with a single checkout lane. When a lot of folks try to check out at the same time, the line grows (sometimes intolerably). With a dynamic parallel queue, Trunk Merge Queue creates additional checkout lanes in real time while still guaranteeing that the protected branch doesn't break.

@@ -18,7 +18,7 @@
* PR C with impacted target list `[ frontend, backend]`
* PR D with impacted target list `[ docs]`

Without parallelization, the PRs **A**, **B**, **C**, and **D** would all be tested in a single predictive path **A** \<- **B** \<- **C** \<- **D**. Using the impacted target information we can instead build three dynamically provisioned queues and the predictive testing can yield higher throughput - which means your pull request spends less time in the queue stuck testing with unrelated code changes.

<Frame caption="Three Dynamic Parallel Queues">
<img src="/assets/file.excalidraw_(6).svg" alt="Three Dynamic Parallel Queues" class="gitbook-drawing"/>
@@ -28,13 +28,13 @@

To run in parallel mode, each pull request needs to be inspected for its impacted targets. This is a fancy way of saying that each pull request needs to report what parts of the codebase are changing.

In the example above, the pull requests **A**, **B**, and **D** can be tested in isolation since they affect distinct targets - `backend`, `frontend` and `docs`. The **C** pull request affects both `frontend` and `backend` and would be tested predictively with the changes in both **A** and **B**.
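The grouping in this example can be reproduced with a short sketch. It is a simplified model, not the queue's actual algorithm: `predecessors_by_overlap` is a hypothetical helper, the target lists for **A** and **B** are inferred from the sentence above, and only direct overlap with earlier PRs is considered.

```python
def predecessors_by_overlap(queue):
    """Map each PR to the earlier PRs it shares an impacted target with.

    `queue` is a list of (name, impacted_targets) pairs in submission order.
    """
    result = {}
    for index, (name, targets) in enumerate(queue):
        result[name] = [
            earlier
            for earlier, earlier_targets in queue[:index]
            if set(targets) & set(earlier_targets)
        ]
    return result


queue = [
    ("A", ["backend"]),
    ("B", ["frontend"]),
    ("C", ["frontend", "backend"]),
    ("D", ["docs"]),
]
tested_with = predecessors_by_overlap(queue)
```

Here `tested_with` has empty lists for A, B, and D, while C maps to both A and B. That matches the three independent starting points and the one PR that is tested predictively on top of two of them.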

To understand the interactions or dependent changes between pull requests, Trunk Merge Queue provides an API for posting the list of **impacted targets** that result from code changes in every PR. When Trunk Merge Queue is running in parallel mode, pull requests will not be processed until their list of impacted targets is uploaded.

#### **What are Impacted Targets?**

Impacted targets are metadata that describe the logical changes of a pull request. An impacted target is a string that can be as expressive as a Bazel target or the name of a file folder. Calculating impacted targets with a purpose-built build system will provide absolute correctness for the merge queue, but more lightweight glob or folder-based approaches can also work with fewer guarantees around correctness.

#### **Posting impacted targets from your pull requests**

@@ -85,3 +85,7 @@
<Info>
See [Filter Metrics by Impacted Targets](../../administration/metrics#filter-metrics-by-impacted-targets) for detailed guidance on using this feature.
</Info>

### Related

* [How batching interacts with parallel queues and PFD](../batching#how-batching-interacts-with-parallel-queues-and-pfd) — including PR batch eligibility, the ALL keyword, and transitive dependents.
1 change: 1 addition & 0 deletions merge-queue/optimizations/pending-failure-depth.mdx
@@ -87,4 +87,5 @@
* [Anti-flake protection](./anti-flake-protection) - Understand the combined mechanism of optimistic merging + Pending Failure Depth
* [Optimistic merging](./optimistic-merging) - The companion feature that enables automated flake clearing
* [Batching](./batching) - How Pending Failure Depth interacts with batch groups and bisection
* [How batching interacts with parallel queues and PFD](./batching#how-batching-interacts-with-parallel-queues-and-pfd) - Cross-feature behaviors, including PFD's downstream-PR delay

* [Predictive testing](./predictive-testing) - The foundation that makes successor test runs include predecessor changes