Move waiting for tasks to separate phase to unblock process #1262
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main    #1262      +/-   ##
==========================================
- Coverage   15.57%   15.50%    -0.08%
==========================================
  Files          112      112
  Lines        23255    23300      +45
==========================================
- Hits          3623     3612      -11
- Misses       19345    19403      +58
+ Partials       287      285       -2
Add a wait phase for snapshot tasks
Issue:
The main problem in MTV-1753 and MTV-1775 is that we either do not wait for the VMware task to finish or, if we do wait, we halt the whole controller process. This causes performance issues or even migration failures, so we need a mechanism to wait for the tasks without halting the whole process.
Fix:
My first attempt was PR kubev2v#1262, which used the task manager. On the surface this was an easy approach that did not require any additional changes to the CR. The problem was that some of the tasks were not reported to the taskManager; these tasks had the prefix haTask. After some investigation, I found out that these tasks run directly on the ESXi host and are not sent to vSphere, so we can't use the taskManager. This PR adds the task IDs to the status CR so additional wait phases can monitor the tasks. The main controller gets the ESXi client and creates a property collector to request the specific task from the host (a rough sketch of this lookup follows below).
Ref:
- https://issues.redhat.com/browse/MTV-1753
- https://issues.redhat.com/browse/MTV-1775
- kubev2v#1262
- kubev2v#1265
Signed-off-by: Martin Necas <mnecas@redhat.com>
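For illustration only, here is a minimal govmomi sketch of that property-collector lookup, not the actual Forklift code. It assumes hostClient is a *vim25.Client already connected directly to the ESXi host and taskID is one of the task IDs recorded in the status CR; the helper names are made up for the example.

```go
// A sketch of reading one task's state from an ESXi host with a property
// collector. hostClient and taskID are assumptions: hostClient is a
// *vim25.Client connected directly to the host, and taskID is the task ID
// recorded in the status CR (host-local tasks carry the "haTask" prefix).
package tasks

import (
	"context"
	"fmt"

	"github.com/vmware/govmomi/property"
	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/mo"
	"github.com/vmware/govmomi/vim25/types"
)

// getTaskInfo fetches the "info" property of a single task from the host.
func getTaskInfo(ctx context.Context, hostClient *vim25.Client, taskID string) (*types.TaskInfo, error) {
	// Reference the Task managed object by the stored ID.
	taskRef := types.ManagedObjectReference{
		Type:  "Task",
		Value: taskID,
	}

	var t mo.Task
	pc := property.DefaultCollector(hostClient)
	if err := pc.RetrieveOne(ctx, taskRef, []string{"info"}, &t); err != nil {
		return nil, err
	}
	return &t.Info, nil
}

// taskDone tells the wait phase whether it can move on; when the task failed,
// the task's error message is propagated so the user can see it.
func taskDone(info *types.TaskInfo) (bool, error) {
	switch info.State {
	case types.TaskInfoStateSuccess:
		return true, nil
	case types.TaskInfoStateError:
		return true, fmt.Errorf("task %s failed: %s", info.Key, info.Error.LocalizedMessage)
	default:
		// Queued or running: re-queue and check again on the next reconcile
		// instead of blocking the controller.
		return false, nil
	}
}
```

Because the check returns immediately, the controller can fold it into its normal reconcile loop and keep other VM migrations moving while the host works on the snapshot.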
Issue:
When Forklift creates the snapshot, we need to wait for the task to finish. Right now we use task.WaitForResult, which causes the whole process to wait for the snapshot creation and blocks other VM migrations. We have the same problem with snapshot removal: if the ESXi host is busy when we start the snapshot removal, it can take longer than the reconcile cycle (3s) and we can fail because of it.
Fix:
Instead of using task.WaitForResult, Forklift will query the latest tasks per VM (10 tasks by default). This querying is done in a separate phase from the creation/deletion, so we have a WaitFor phase for each of the object manipulations. We find the specific task for the creation/deletion and check its status. This has the advantage that we get not only the status of the task but also its result, so we can propagate it to the user in case the creation/deletion fails (a sketch of this querying follows after the references).
Ref:
- https://issues.redhat.com/browse/MTV-1753
- https://issues.redhat.com/browse/MTV-1775
- Proper fix of: #1253
Signed-off-by: Martin Necas <mnecas@redhat.com>
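As a rough sketch of the querying described in the Fix above (again illustrative, not the exact Forklift code): a task history collector filtered to one VM returns the latest page of tasks, and the creation/deletion task is found by its description ID. The vmRef variable and the "VirtualMachine.createSnapshot" description ID are assumptions for the example; the returned TaskInfo's State and Error fields are then checked the same way as in the earlier sketch.

```go
// A sketch of querying the latest tasks for one VM through a task history
// collector instead of blocking in task.WaitForResult. c is assumed to be a
// connected *vim25.Client and vmRef the VM's managed object reference.
package tasks

import (
	"context"

	"github.com/vmware/govmomi/task"
	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/types"
)

// findVMTask returns the newest task on the VM whose description ID matches
// descriptionID (for example "VirtualMachine.createSnapshot"), or nil when no
// such task is in the latest page yet.
func findVMTask(ctx context.Context, c *vim25.Client, vmRef types.ManagedObjectReference, descriptionID string) (*types.TaskInfo, error) {
	m := task.NewManager(c)

	// Only collect tasks whose entity is this VM.
	filter := types.TaskFilterSpec{
		Entity: &types.TaskFilterSpecByEntity{
			Entity:    vmRef,
			Recursion: types.TaskFilterSpecRecursionOptionSelf,
		},
	}

	collector, err := m.CreateCollectorForTasks(ctx, filter)
	if err != nil {
		return nil, err
	}
	defer collector.Destroy(ctx)

	// The latest page holds the most recent tasks (10 by default).
	tasks, err := collector.LatestPage(ctx)
	if err != nil {
		return nil, err
	}

	for i := range tasks {
		if tasks[i].DescriptionId == descriptionID {
			return &tasks[i], nil
		}
	}
	return nil, nil
}
```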