Skip to content

Fix eviction overwriting failed migration condition with success#309

Merged
notandy merged 1 commit into
mainfrom
fix/eviction-sticky-migration-failure
May 20, 2026
Merged

Fix eviction overwriting failed migration condition with success#309
notandy merged 1 commit into
mainfrom
fix/eviction-sticky-migration-failure

Conversation

@notandy
Copy link
Copy Markdown
Contributor

@notandy notandy commented May 20, 2026

When an eviction processes a list of VMs and one is in ERROR state, the
controller correctly skips it via moveToBack and sets a Failed migration
condition. However, the next successful migration of another VM
unconditionally overwrote that condition with Succeeded, hiding the fact
that an errored VM was still outstanding.

Preserve the Failed condition while there are still outstanding VMs.
The condition resolves naturally: either the errored VM is fixed and
migrates (overwriting with Succeeded), or the eviction terminates with
Evicting=Succeeded once the list empties.

Adds a unit test covering a mixed list of healthy and errored VMs that
exercises the skip-and-retry pattern across multiple reconciliations.

Summary by CodeRabbit

  • Bug Fixes

    • Improved migration status tracking during multi-VM evictions. The system now correctly preserves failure states when some instances fail, rather than prematurely marking eviction as successful. Failed VMs remain properly tracked in outstanding instances without duplicate migration attempts.
  • Tests

    • Added test coverage for mixed VM eviction scenarios with both successful and failed instances.

Review Change Stack

When an eviction processes a list of VMs and one is in ERROR state, the
controller correctly skips it via moveToBack and sets a Failed migration
condition. However, the next successful migration of another VM
unconditionally overwrote that condition with Succeeded, hiding the fact
that an errored VM was still outstanding.

Preserve the Failed condition while there are still outstanding VMs.
The condition resolves naturally: either the errored VM is fixed and
migrates (overwriting with Succeeded), or the eviction terminates with
Evicting=Succeeded once the list empties.

Adds a unit test covering a mixed list of healthy and errored VMs that
exercises the skip-and-retry pattern across multiple reconciliations.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

The eviction controller's migration condition handling now preserves "sticky" failure states. When an instance migration completes but other instances remain outstanding with a prior Failed condition, the controller maintains the Failed status instead of prematurely marking the eviction as Succeeded. A comprehensive test validates this behavior across mixed healthy and errored VM scenarios.

Changes

Sticky migration failure condition

Layer / File(s) Summary
Sticky migration failure condition and test
internal/controller/eviction_controller.go, internal/controller/eviction_controller_test.go
The evictNext method replaces unconditional Succeeded setting with conditional logic that preserves an existing Failed migration condition when other instances remain outstanding. A new "Mixed VM Eviction" test validates that healthy VMs are live-migrated and removed while errored VMs persist in OutstandingInstances with their failure condition intact across multiple reconciliation loops.

Sequence Diagram(s)

(No sequence diagrams generated; the change is a single-layer conditional logic update with localized control flow rather than multi-component interactions requiring visualization.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • toanju
  • fwiesel

Poem

🐰 When migrations leap and instances roam,
Some stumble and fail, far from home,
But sticky conditions hold tight to the past,
Until all are resolved—success comes at last! 🚀

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix eviction overwriting failed migration condition with success' accurately summarizes the main change: preventing the controller from prematurely overwriting a failed migration condition when other outstanding instances remain.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/eviction-sticky-migration-failure

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

Merging this branch will increase overall coverage

Impacted Packages Coverage Δ 🤖
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller 70.17% (+3.67%) 👍

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/eviction_controller.go 61.59% (+22.21%) 164 (+4) 101 (+38) 63 (-34) 🌟

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/cobaltcore-dev/openstack-hypervisor-operator/internal/controller/eviction_controller_test.go

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/eviction_controller_test.go`:
- Around line 592-597: The test currently only checks the error message when err
!= nil which lets a nil error slip by; change the assertions around
evictionReconciler.Reconcile(reconcileRequest) so you first assert that err is
not nil (e.g., Expect(err).ToNot(BeNil())) and then unconditionally assert the
error string contains the VM UUID (use
Expect(err.Error()).To(ContainSubstring("error-1"))), keeping references to
evictionReconciler.Reconcile, reconcileRequest and err to locate the code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b1e9da2a-6e5d-43d1-826b-f75af88c14db

📥 Commits

Reviewing files that changed from the base of the PR and between c252897 and 15aeaa0.

📒 Files selected for processing (2)
  • internal/controller/eviction_controller.go
  • internal/controller/eviction_controller_test.go

Comment thread internal/controller/eviction_controller_test.go
@notandy notandy merged commit 65bc78f into main May 20, 2026
7 checks passed
@notandy notandy deleted the fix/eviction-sticky-migration-failure branch May 20, 2026 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants