fix(util): Fix flaky TestWorkerPool_TestMetric race condition #7497
Merged
SungJin1212 (Member) approved these changes on May 10, 2026
LGTM, can you rebase?
The test was racy because the first Submit() could hit the select default case (fallback path) if the worker goroutine had not yet started receiving on the channel. This caused the fallback counter to be 2 instead of the expected 1.

Fix by adding a started channel that the first job closes once it begins executing, ensuring the worker is actually busy before the second job is submitted.

Signed-off-by: Ben Ye <benye@amazon.com>
40e9e6a to 2aa711f
What this PR does
Fixes the flaky `TestWorkerPool_TestMetric` test in `pkg/util`.

Root Cause
The test was racy because the first `Submit()` could hit the `select` `default` case (the fallback path) if the worker goroutine hadn't started receiving on the channel yet. The channel is unbuffered and `Submit` uses a non-blocking `select`, so if the single worker goroutine wasn't ready to receive when the first job was submitted, both jobs would go through the fallback path, incrementing the counter to 2 instead of the expected 1.
Fix
Add a `started` channel that the first job closes once it begins executing. The test waits on this channel before submitting the second job, ensuring the worker is actually busy.

Verification
Ran the test 50 times with `-count=50` and with `-race`; all runs pass consistently.