Fix rollback disk snapshots on instance snapshot failure#12949
Fix rollback disk snapshots on instance snapshot failure#12949sureshanaparti wants to merge 1 commit intoapache:4.22from
Conversation
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 4.22 #12949 +/- ##
=========================================
Coverage 17.60% 17.60%
+ Complexity 15677 15676 -1
=========================================
Files 5918 5918
Lines 531681 531686 +5
Branches 65005 65006 +1
=========================================
+ Hits 93623 93624 +1
- Misses 427498 427502 +4
Partials 10560 10560
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17339 |
There was a problem hiding this comment.
Pull request overview
Fixes cleanup/rollback of per-volume disk snapshots when an instance (VM) snapshot operation fails, preventing orphaned snapshots and incorrect snapshot resource counts (issue #12927).
Changes:
- Track created disk snapshots for rollback using a
Map<volumeId, SnapshotInfo>to ensure snapshots are rolled back even if creation fails mid-way. - Harden rollback logic with null checks and improve rollback logging.
- Update the KVM VM snapshot strategy unit test to match the new
createDiskSnapshotmethod signature.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| engine/storage/snapshot/src/main/java/org/apache/cloudstack/storage/vmsnapshot/StorageVMSnapshotStrategy.java | Track snapshots for rollback via volume-id map; ensure rollback attempts occur on failures; add null-safety/logging improvements. |
| engine/storage/snapshot/src/test/java/org/apache/cloudstack/storage/vmsnapshot/VMSnapshotStrategyKVMTest.java | Adjust unit test to pass the new rollback map parameter to createDiskSnapshot. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| snapshotInfo = snapshotStrategy.takeSnapshot(snapshotInfo); | ||
| if (snapshotInfo == null) { | ||
| throw new CloudRuntimeException("Failed to create snapshot"); | ||
| } else { |
There was a problem hiding this comment.
volumeToSnapshotInfoMapForRollback.put(vol.getId(), snapshotInfo) is performed twice (before and after takeSnapshot). If the intent is to ensure rollback coverage when takeSnapshot throws, keep the pre-takeSnapshot put, but drop the else block and only overwrite the map entry if takeSnapshot returns a different SnapshotInfo instance. This removes redundant writes and makes the intent clearer.
| } else { | |
| } | |
| if (snapshotInfo != volumeToSnapshotInfoMapForRollback.get(vol.getId())) { |
| snapshot = snapshotDao.persist(snapshot); | ||
| vol.addPayload(setPayload(vol, snapshot, quiescevm)); | ||
| SnapshotInfo snapshotInfo = snapshotDataFactory.getSnapshot(snapshot.getId(), vol.getDataStore()); | ||
| snapshotInfo.addPayload(vol.getpayload()); | ||
| volumeToSnapshotInfoMapForRollback.put(vol.getId(), snapshotInfo); | ||
| SnapshotStrategy snapshotStrategy = storageStrategyFactory.getSnapshotStrategy(snapshotInfo, SnapshotOperation.TAKE); | ||
| if (snapshotStrategy == null) { | ||
| throw new CloudRuntimeException("Could not find strategy for snapshot uuid:" + snapshotInfo.getUuid()); | ||
| } | ||
| snapshotInfo = snapshotStrategy.takeSnapshot(snapshotInfo); | ||
| if (snapshotInfo == null) { | ||
| throw new CloudRuntimeException("Failed to create snapshot"); | ||
| } else { |
There was a problem hiding this comment.
The new rollback behavior relies on adding the snapshot to the rollback map before snapshotStrategy.takeSnapshot(...) runs so that failures/exceptions still get cleaned up. There’s no unit test covering the failure path (e.g., takeSnapshot throws/returns null) and verifying that rollback deletes the persisted snapshot record. Please add a test for this scenario to prevent regressions of #12927.
DaanHoogland
left a comment
There was a problem hiding this comment.
I may be missing some context, hence my questions?!?
| vol.addPayload(setPayload(vol, snapshot, quiescevm)); | ||
| SnapshotInfo snapshotInfo = snapshotDataFactory.getSnapshot(snapshot.getId(), vol.getDataStore()); | ||
| snapshotInfo.addPayload(vol.getpayload()); | ||
| volumeToSnapshotInfoMapForRollback.put(vol.getId(), snapshotInfo); |
There was a problem hiding this comment.
so it might be added twice right? so why the else at line 466?
| } | ||
| if (!result) { | ||
| for (SnapshotInfo snapshotInfo : forRollback) { | ||
| for (SnapshotInfo snapshotInfo : volumeToSnapshotInfoMapForRollback.values()) { |
There was a problem hiding this comment.
this seems the only usage, why not a HashSet?
Description
This PR fixes rollback disk snapshots on instance snapshot failure.
Fixes #12927
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?