Skip to content

Conversation

@shwstppr
Copy link
Contributor

@shwstppr shwstppr commented Mar 6, 2023

Description

Storage GC deletes volumes that are in Destroy state. This PR adds a check in this action that is if two volume records (one in Destroy state and one in Ready state) point to exactly the same disk file then skip doing GC on such Destroy state volume record.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

@shwstppr shwstppr force-pushed the improve-storage-gc-vol-delete branch from 627eedc to c85e114 Compare March 6, 2023 12:13
@shwstppr
Copy link
Contributor Author

shwstppr commented Mar 6, 2023

@blueorangutan package

@blueorangutan
Copy link

@shwstppr a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 5683

@DaanHoogland
Copy link
Contributor

@blueorangutan test matrix

@blueorangutan
Copy link

@DaanHoogland a Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@DaanHoogland DaanHoogland added this to the 4.18.0.0 milestone Mar 6, 2023
@DaanHoogland DaanHoogland added the Severity:Critical Critical bug label Mar 6, 2023
@blueorangutan
Copy link

Trillian test result (tid-6262)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34389 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6262-xenserver-71.zip
Smoke tests completed. 99 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_06_deploy_vm_with_extraconfig_throws_exception_xenserver Error 1.17 test_deploy_vm_extra_config_data.py
test_07_deploy_vm_with_extraconfig_xenserver Error 1.15 test_deploy_vm_extra_config_data.py
test_02_deploy_vm_on_specific_cluster Error 1.12 test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod Error 1.15 test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster Error 1.17 test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod Error 1.12 test_vm_deployment_planner.py

@blueorangutan
Copy link

Trillian test result (tid-6263)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 38020 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6263-vmware-67u3.zip
Smoke tests completed. 99 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_04_deploy_vm_with_extraconfig_throws_exception_vmware Error 1.21 test_deploy_vm_extra_config_data.py
test_05_deploy_vm_with_extraconfig_vmware Error 1.20 test_deploy_vm_extra_config_data.py
test_02_deploy_vm_on_specific_cluster Error 1.19 test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod Error 1.23 test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster Error 1.22 test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod Error 1.20 test_vm_deployment_planner.py

@blueorangutan
Copy link

Trillian test result (tid-6264)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 40558 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6264-kvm-centos7.zip
Smoke tests completed. 99 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_02_deploy_vm_with_extraconfig_kvm Error 0.13 test_deploy_vm_extra_config_data.py
test_03_update_vm_with_extraconfig_kvm Error 1.16 test_deploy_vm_extra_config_data.py
test_02_deploy_vm_on_specific_cluster Error 1.12 test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod Error 1.16 test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster Error 1.16 test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod Error 1.15 test_vm_deployment_planner.py

Copy link
Member

@rohityadavcloud rohityadavcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test this, LGTM - this would need testing (one concern that can be tested if, this doesn't introduce any regression in the GC logic such that valid volumes that must be GC'd and deleted are no longer deleted)

@shwstppr shwstppr changed the base branch from 4.17 to main March 9, 2023 12:07
@shwstppr shwstppr changed the base branch from main to 4.17 March 9, 2023 12:07
@shwstppr
Copy link
Contributor Author

shwstppr commented Mar 9, 2023

Changing base branch creates conflicts @rohityadavcloud @DaanHoogland

shwstppr added 2 commits May 23, 2023 13:30
…icate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr shwstppr force-pushed the improve-storage-gc-vol-delete branch from c85e114 to 8a0564d Compare May 23, 2023 08:49
@shwstppr shwstppr changed the base branch from 4.17 to 4.18 May 23, 2023 08:49
@shwstppr
Copy link
Contributor Author

@blueorangutan package

@shwstppr shwstppr marked this pull request as ready for review May 23, 2023 08:56
@blueorangutan
Copy link

@shwstppr a [SF] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6121

@rohityadavcloud
Copy link
Member

@blueorangutan test

@blueorangutan
Copy link

@rohityadavcloud a [SF] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian Build Failed (tid-6589)

@DaanHoogland
Copy link
Contributor

@blueorangutan LLtest

@blueorangutan
Copy link

@DaanHoogland a [LL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

[LL]Trillian test result (tid-6544)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47715 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6544-kvm-centos7.zip
Smoke tests completed. 105 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_04_rvpc_privategw_static_routes Failure 966.51 test_privategw_acl.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers Failure 356.44 test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics Failure 303.64 test_vpc_redundant.py
test_01_redundant_vpc_site2site_vpn Failure 516.61 test_vpc_vpn.py
test_01_vpc_site2site_vpn_multiple_options Error 1184.62 test_vpc_vpn.py
test_01_vpc_site2site_vpn Error 449.88 test_vpc_vpn.py

@DaanHoogland
Copy link
Contributor

@blueorangutan test

@blueorangutan
Copy link

@DaanHoogland a [SF] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

[SF] Trillian Build Failed (tid-6594)

@DaanHoogland
Copy link
Contributor

@blueorangutan LLtest

@blueorangutan
Copy link

@DaanHoogland a [LL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

[LL]Trillian test result (tid-6549)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47826 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6549-kvm-centos7.zip
Smoke tests completed. 106 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_02_list_cpvm_vm Failure 0.03 test_ssvm.py
test_04_cpvm_internals Failure 0.04 test_ssvm.py
test_01_redundant_vpc_site2site_vpn Failure 547.30 test_vpc_vpn.py
test_01_vpc_site2site_vpn_multiple_options Error 1181.42 test_vpc_vpn.py
test_01_vpc_site2site_vpn Error 465.71 test_vpc_vpn.py

@DaanHoogland
Copy link
Contributor

@blueorangutan test matrix

@blueorangutan
Copy link

@DaanHoogland a [SF] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@shwstppr
Copy link
Contributor Author

shwstppr commented Jun 1, 2023

@blueorangutan package

@blueorangutan
Copy link

@shwstppr a [LL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result [LL]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 6067

@DaanHoogland
Copy link
Contributor

@blueorangutan test matrix

@blueorangutan
Copy link

@DaanHoogland a [SF] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@blueorangutan
Copy link

[SF] Trillian test result (tid-6664)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 40805 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6664-xenserver-71.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan
Copy link

[SF] Trillian test result (tid-6665)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 43525 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6665-vmware-67u3.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@blueorangutan
Copy link

[SF] Trillian test result (tid-6666)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43842 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr7313-t6666-kvm-centos7.zip
Smoke tests completed. 108 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

@DaanHoogland
Copy link
Contributor

This seems ready to merge, I am just missing some validation and given the description it seems hard to reproduce/test.

@shwstppr
Copy link
Contributor Author

shwstppr commented Jun 5, 2023

This is just a defensive check and I don't think it would be simple to reproduce. At most, we can test that there is no regression in the GC logic. I've also tried to add multiple unit tests to prevent any unnecessary change in the behaviour

@DaanHoogland
Copy link
Contributor

This is just a defensive check and I don't think it would be simple to reproduce. At most, we can test that there is no regression in the GC logic. I've also tried to add multiple unit tests to prevent any unnecessary change in the behaviour

Thanks, @shwstppr I think we did enough for regression testing, looking at the code change (scope), merging

@DaanHoogland DaanHoogland merged commit 2d6a069 into apache:4.18 Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants