feat: aborted-transaction retry policy and client integration#18
Open
mlwelles wants to merge 3 commits into
Open
feat: aborted-transaction retry policy and client integration#18mlwelles wants to merge 3 commits into
mlwelles wants to merge 3 commits into
Conversation
21760f3 to
67c0b31
Compare
There was a problem hiding this comment.
2 issues found across 2 files
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
Add RetryPolicy / DefaultRetryPolicy and a runner that re-executes a function on aborted Dgraph transactions with exponential backoff (retry.go), exposed on the client via a WithRetry method.
67c0b31 to
b477c28
Compare
…Retries Addresses review feedback on the retry policy: - delay() now adds jitter before the final MaxDelay clamp, so the documented invariant "no single delay exceeds MaxDelay" holds. Previously the cap was applied before jitter, letting the delay reach MaxDelay*(1+Jitter). The exponential is also capped before the shift to avoid overflow at large attempt counts. - WithRetry clamps a negative MaxRetries to zero so fn always runs at least once, as documented, instead of skipping the loop and hitting the unreachable panic. Tests: - TestRetryPolicyDelayJitterNeverExceedsMaxCap asserts the cap invariant across attempts with large jitter. - TestWithRetryNegativeMaxRetries asserts fn runs once with MaxRetries=-1. - TestWithRetryContextCancellation now returns dgo.ErrAborted from fn so it actually enters the backoff sleep and exercises the ctx.Done() path it claims to cover (previously it returned ctx.Err() and bailed immediately). Docs: add runnable examples for WithRetry and a custom RetryPolicy.
There was a problem hiding this comment.
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="retry_example_test.go">
<violation number="1" location="retry_example_test.go:20">
P2: Example function name does not match exported method: ExampleClient_withRetry should be ExampleClient_WithRetry to properly associate with Client.WithRetry in godoc.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| // predicate — is retried with exponential backoff instead of surfacing to the | ||
| // caller. Non-abort errors return immediately; the context bounds the total | ||
| // wait. | ||
| func ExampleClient_withRetry() { |
There was a problem hiding this comment.
P2: Example function name does not match exported method: ExampleClient_withRetry should be ExampleClient_WithRetry to properly associate with Client.WithRetry in godoc.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At retry_example_test.go, line 20:
<comment>Example function name does not match exported method: ExampleClient_withRetry should be ExampleClient_WithRetry to properly associate with Client.WithRetry in godoc.</comment>
<file context>
@@ -0,0 +1,53 @@
+// predicate — is retried with exponential backoff instead of surfacing to the
+// caller. Non-abort errors return immediately; the context bounds the total
+// wait.
+func ExampleClient_withRetry() {
+ client, _ := modusgraph.NewClient("dgraph://localhost:9080")
+ defer client.Close()
</file context>
Suggested change
| func ExampleClient_withRetry() { | |
| func ExampleClient_WithRetry() { |
Drop the int->uint shift conversion (Go allows a signed shift count, and attempt is bounded to [0,63)), clearing gosec G115. Annotate the jitter RNG with nolint:gosec — backoff jitter is not security-sensitive, so math/rand/v2 is appropriate. No behavior change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this adds
An opt-in retry layer for aborted Dgraph transactions. When concurrent
writers conflict on an indexed predicate, Dgraph aborts one of the transactions
with
dgo.ErrAborted. Without a retry, that surfaces to the caller as a failedwrite even though retrying would succeed.
RetryPolicy/DefaultRetryPolicy— exponential backoff with jitter,modeled after dgraph4j's policy.
Client.WithRetry(ctx, policy, fn)— runsfn, retrying on aborts per thepolicy; returns non-abort errors immediately; honors context cancellation
during backoff.
Self-contained:
retry.goplus theWithRetrymethod onClient. Builds andtests green. The bulk-loader PR (#19) stacks on this.
The problem it solves
Concurrent inserts on the same indexed predicate race at commit time; Dgraph
lets one win and aborts the rest. The aborting transaction is safe to retry
against a fresh snapshot.
WithRetryencapsulates that loop — backoff schedule,jitter to avoid a thundering herd, and a retry ceiling — so callers wrap their
mutation once instead of hand-rolling retry logic at every write site.
Review round — changes since first push
Addresses the automated review (cubic):
delay()applied theMaxDelaycapbefore adding jitter, so the result could reach
MaxDelay * (1 + Jitter)—violating the documented "no single delay exceeds MaxDelay." Jitter is now
added before a final clamp, and the exponential is capped before the shift so
a large attempt count cannot overflow.
for attempt := range MaxRetries + 1ran zero times when
MaxRetries < 0, falling through to the unreachablepanic.WithRetrynow clamps a negativeMaxRetriesto zero, sofnalways runs at least once as documented.
Tests
TestRetryPolicyDelayJitterNeverExceedsMaxCap— asserts the cap invariantacross attempts under large jitter.
TestWithRetryNegativeMaxRetries— assertsfnruns once withMaxRetries=-1.TestWithRetryContextCancellation— rewritten to returndgo.ErrAbortedfromfn, so it actually enters the backoff sleep and exercises thectx.Done()path it claims to cover (it previously returned
ctx.Err()and returnedbefore any backoff).
Documentation
Runnable, compile-checked examples for
WithRetryand a customRetryPolicy.