fix(bes): bound queued events during outages and honor BES timeout by thesayyn · Pull Request #1069 · aspect-build/aspect-cli

thesayyn · 2026-05-07T21:55:31Z

Addresses maintainer feedback on #1065. This branch carries the original commit from #1065 plus a follow-up commit with the fixes below.

Changes

Bound the gRPC sink's forwarder channel by retry_max_buffer_size.
The Tokio mpsc that bridges the synchronous broadcaster into the async state machine was unbounded, so events accumulated freely while drive_stream was sleeping between reconnects or replaying the retry buffer — defeating the very knob meant to cap memory during BES outages. The forwarder now uses try_send against a bounded channel and signals overflow via an atomic flag; drive_stream observes the closed receiver, checks the flag, and exits with BufferFull so the configured error_strategy takes effect.
Honor the configured timeout. RetryConfig.timeout was being stored but never read, so the bes_timeout knob was a no-op. The sink's work is now wrapped in tokio::time::timeout so finite deadlines actually bound lifecycle calls, stream retries, and the final upload — matching Bazel's --bes_timeout semantics.
Drop the TODO about bessie authentication on the workflows BES sink: within the deployment, transport security is plain TLS.
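
For readers unfamiliar with the pattern in the first change, here is a minimal sketch of a bounded forwarder that signals overflow through an atomic flag. It is not the code from this PR: it uses `std::sync::mpsc::sync_channel` as a stand-in for the bounded Tokio mpsc, and the names `Forwarder`, `DriveOutcome`, `bounded_forwarder`, and `drive` are hypothetical. It also assumes a capacity of at least 1, since a `sync_channel` of capacity 0 is a rendezvous channel.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;

/// Hypothetical outcome type; the PR only names a `BufferFull` exit.
#[derive(Debug, PartialEq)]
pub enum DriveOutcome {
    Finished(usize),
    BufferFull,
}

/// Synchronous side of the bridge. Holds the sender until overflow or finish.
pub struct Forwarder {
    tx: Option<SyncSender<String>>,
    overflowed: Arc<AtomicBool>,
}

impl Forwarder {
    /// Never blocks. Returns false once the event could not be queued.
    pub fn forward(&mut self, event: String) -> bool {
        let Some(tx) = self.tx.as_ref() else {
            return false; // already closed
        };
        match tx.try_send(event) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                // Record why the channel is closing, then drop the sender so
                // the receiver sees a disconnect after draining queued events.
                self.overflowed.store(true, Ordering::SeqCst);
                self.tx = None;
                false
            }
            Err(TrySendError::Disconnected(_)) => {
                self.tx = None;
                false
            }
        }
    }

    /// Normal shutdown: close the channel without setting the overflow flag.
    pub fn finish(&mut self) {
        self.tx = None;
    }
}

/// Builds a forwarder, its receiver, and the shared overflow flag.
pub fn bounded_forwarder(capacity: usize) -> (Forwarder, Receiver<String>, Arc<AtomicBool>) {
    let (tx, rx) = sync_channel(capacity);
    let overflowed = Arc::new(AtomicBool::new(false));
    let forwarder = Forwarder { tx: Some(tx), overflowed: Arc::clone(&overflowed) };
    (forwarder, rx, overflowed)
}

/// Consumer side: drain until the channel closes, then check the flag to
/// distinguish a clean finish from an overflow.
pub fn drive(rx: Receiver<String>, overflowed: &AtomicBool) -> DriveOutcome {
    let mut delivered = 0;
    for _event in rx.iter() {
        delivered += 1; // a real sink would upload the event here
    }
    if overflowed.load(Ordering::SeqCst) {
        DriveOutcome::BufferFull
    } else {
        DriveOutcome::Finished(delivered)
    }
}
```

In this sketch the producer never blocks and memory is capped at `capacity` queued events. The consumer decides what a closed channel means by reading the flag, which is the point at which a configured error strategy would be applied.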

Test plan

cargo test -p axl-runtime --lib (155 passed)
cargo test -p axl-runtime --lib grpc (9 passed, including the three end-to-end error_strategy scenarios)
CI green

Generated by Claude Code

Address maintainer feedback on PR #1065:

1. Bound the gRPC sink's forwarder channel by retry_max_buffer_size. Previously the Tokio mpsc was unbounded, so events accumulated freely while drive_stream was sleeping between reconnects or replaying the retry buffer, defeating the very knob meant to cap memory. The forwarder now uses try_send and signals overflow via an atomic flag; drive_stream sees the closed receiver, checks the flag, and exits with BufferFull so error_strategy applies.
2. Wire the configured timeout. RetryConfig.timeout was previously stored but never read, so 'bes_timeout' was a no-op. Wrap the sink's work in tokio::time::timeout so finite deadlines actually bound lifecycle calls, stream retries, and the final upload, matching Bazel's --bes_timeout semantics.
3. Drop the TODO about bessie authentication on the workflows BES sink: within the deployment, transport security is plain TLS.
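
The timeout wiring in item 2 can be illustrated with a small standard-library sketch. It is an assumption-laden stand-in, not the PR's code: the real change wraps async work in `tokio::time::timeout`, whereas this version runs the work on a thread and waits with `recv_timeout`. The names `run_with_deadline` and `DeadlineError` are hypothetical, and treating `None` as "no deadline" is this sketch's own convention for the non-finite case.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum DeadlineError {
    TimedOut,
    WorkerDied,
}

/// Runs `work` on a helper thread and waits for its result.
/// `Some(limit)` bounds the wait; `None` waits indefinitely.
pub fn run_with_deadline<T, F>(timeout: Option<Duration>, work: F) -> Result<T, DeadlineError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = channel();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline elapsed; ignore that.
        let _ = tx.send(work());
    });
    match timeout {
        None => rx.recv().map_err(|_| DeadlineError::WorkerDied),
        Some(limit) => rx.recv_timeout(limit).map_err(|e| match e {
            RecvTimeoutError::Timeout => DeadlineError::TimedOut,
            RecvTimeoutError::Disconnected => DeadlineError::WorkerDied,
        }),
    }
}
```

One difference worth noting: when the deadline elapses here, the helper thread keeps running in the background, while dropping a timed-out future in Tokio cancels the work at its next await point.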

CLAassistant · 2026-05-07T21:55:39Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

thesayyn and others added 2 commits May 7, 2026 21:32

feat: support failure knobs for sinks

e2ae698

thesayyn closed this May 7, 2026

thesayyn wants to merge 2 commits into main from claude/fix-maintainer-concerns-ci-Ed321
