Implement Better VM revert using VDI revert (26.1-lcm)#7116
Open
psafont wants to merge 7 commits into
Open
Conversation
97b3938 to
7f81a53
Compare
last-genius
requested changes
Jun 5, 2026
7f81a53 to
9752afa
Compare
Contributor
|
All of the |
last-genius
approved these changes
Jun 5, 2026
4b4bf79 to
6f791df
Compare
gthvn1
reviewed
Jun 11, 2026
gthvn1
approved these changes
Jun 11, 2026
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
8e9d14e to
95b8cf0
Compare
This is for both the live VDI (revert_from) and the snapshot (revert_to) These correspond to the VM operations `reverting` and `revert`, respectively. Only snapshots backed by an SR that allows clones are allowed to be reverted to. This is because SRs supporting revert is not enough to use vdi revert without clone: having an incorrect snapshot_of field will also cause the use of clone. With this we can ensure that if a bad edge case is met the system has a working fallback. Reverting to "live" snapshot VDIs is allowed, this happens when reverting to a checkpoint. Currently reverting requires the cloning feature because it's not possible to discriminate between backends that support vdi revert and ones that done. This will be possible once all of them do Test that VDIs... * can be reverted to a snapshot. * can be reverted to an attached snapshot. * cannot be reverted to a leaf VDI. * cannot be reverted to an attached leaf. Co-authored-by: John Else <john.else@citrix.com> Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
This operation allows storage backends to implement VDI revert natively, as well as advertise it. The operation is unused by xapi for the time being. Co-authored-by: David Scott <dave.scott@eu.citrix.com> Co-authored-by: John Else <john.else@citrix.com> Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
The API function is needed to be able to forward it across the pool and to block other operations on the VDI while it's running. It's unused for the time being. Co-authored-by: John Else <john.else@citrix.com> Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
The code now tries to call VDI.revert on all the disk VDIs, and if that fails, it falls back to the original revert method that destroys VM disks, clones snapshot disks and then fixes up metadata. This means that the successfully reverted disks must be excluded from the original method. This is done by introducing sets of VDIs and VBDs and using set logic on them. The CD disks and the suspend VDI are treated like before: destroy + clone. The code is more convoluted that I would have liked because of existing clone infrastructure mixes VBDs with VDIs, so they need to be converted back and forth to be able to run the previous method correctly, especially for updating the snapshot_of field of the cloned snapshot disks. Quicktests encodes the difference in behaviour from the original method to the native one: now VDI UUIDs are not changed when reverting. Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
8a09224 to
97823b4
Compare
Member
Author
|
Waiting to merge this until the prep PR has been tested by Xenserver and a build with these patches is tested on XCP-ng's CI this weekend |
last-genius
approved these changes
Jun 12, 2026
Member
Author
|
I've ran the quicktests with backends that have support vfi revert and ones that don't |
In case where there's an error while reverting on the Storage Backend, the state of the SR might become unstable, and a recovery using the clone method for reverting might introduce bigger issues. Instead stop the revert immediately. The cases where ignoring the errors is acceptable is when either the SR does not advertise support for VDI_REVERT, and when the SR does not implement revert for a particular VDI. for example, if the SR implements the revert for only some of the formats it implements. Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
f40f110 to
8b526e7
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR contains the changes needed to implement the proposal shown in https://github.com/xapi-project/xen-api/blob/master/doc/content/design/snapshot-revert.md
Please note that the storage backends need to implement VDI_REVERT in order to benefit from this change.
The fallback logic to the previous method complicates the implementation quite a bit, although it's made a bit more manageable using sets.
The complexity comes from the current infrastructure to clone and delete, which mixes VBDs and VDIs, and that two VMs are related: the 'live' or 'destination' VM and the snapshot VM.
Note that only disks are affected by the change, and both CDs and suspend VDIs are treated the same way as before.
There's one change in user-visible behaviour: now the UUID of the VDIs do not change when reverting if VDI.revert is used.
This PR is stacked on top of #7103