Update KFTO multi-node test names according to recent updates in orig… #2164
...i/tests/Tests/0600__distributed_workloads/0602__training/test-run-training-stack-tests.robot
What about the other 2 test scenarios?
@ChughShilpa Actually, the remaining MultiNode/MultiGPU tests require 2 cluster nodes with a minimum of 2 GPUs each (a GPU instance like g4dn.12xlarge - A100 GPUs), and I'm not sure whether those will be available during QG tests.
We can add the tests to ODS CI; we just can't run them as part of QG, only as part of our own jobs.
g4dn.12xlarge instance is used in qe-jenkins, and we also have |
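One way to keep the multi-node multi-GPU tests in ODS CI while leaving them out of QG runs is to tag them and exclude that tag when launching the QG job. The following is only a sketch: the tag name `RequiresMultiGpuNodes` is hypothetical and does not come from this PR, and the four-space separators and the two-argument form of the `Run Training Operator KFTO Test` keyword are assumptions, since this page does not preserve the original indentation. The fragment also relies on the keyword and the `${CUDA_TRAINING_IMAGE}` variable being defined by the surrounding suite.

```robotframework
*** Test Cases ***
Run Training operator KFTO_MNIST multi-node multi-gpu test with NVIDIA CUDA image
    [Documentation]    Requires 2 cluster-nodes with 2 GPUs each.
    # Hypothetical tag (an assumption, not part of the PR) marking the hardware requirement.
    [Tags]    RequiresMultiGpuNodes
    Run Training Operator KFTO Test    TestPyTorchJobMnistMultiNodeMultiGpuWithCuda    ${CUDA_TRAINING_IMAGE}
```

A QG job could then skip these tests with Robot Framework's `--exclude RequiresMultiGpuNodes` option, while dedicated jobs on suitable clusters select them with `--include RequiresMultiGpuNodes`.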
Run Training Operator KFTO Test TestPyTorchJobMnistMultiNodeWithROCm ${ROCM_TRAINING_IMAGE}

Run Training operator KFTO_MNIST multi-node multi-gpu test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node multi-gpu test for Training operator using PyTorch job with NVIDIA CUDA image - It requires 2 cluster-nodes with 2 GPUs each
Check warning (Code scanning / Robocop): Line is too long
Run Training Operator KFTO Test TestPyTorchJobMnistMultiNodeMultiGpuWithCuda ${CUDA_TRAINING_IMAGE}

Run Training operator KFTO_MNIST multi-node multi-gpu test with AMD ROCm image
[Documentation] Run Go KFTO_MNIST multi-node multi-gpu test for Training operator using PyTorch job with AMD ROCm image - It requires 2 cluster-nodes with 2 GPUs each
Check warning (Code scanning / Robocop): Line is too long
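The Robocop "Line is too long" warnings on the `[Documentation]` lines could be addressed by splitting the setting across continuation rows with `...`. This is a minimal sketch, not the change made in this PR. It assumes the usual four-space separators (the indentation is not preserved in this page) and that `Run Training Operator KFTO Test` takes the Go test name and the image as two arguments, with the keyword and `${CUDA_TRAINING_IMAGE}` defined by the surrounding suite.

```robotframework
*** Test Cases ***
Run Training operator KFTO_MNIST multi-node multi-gpu test with NVIDIA CUDA image
    # Each "..." row continues the [Documentation] setting from the previous row.
    [Documentation]    Run Go KFTO_MNIST multi-node multi-gpu test for Training operator
    ...    using PyTorch job with NVIDIA CUDA image.
    ...    It requires 2 cluster-nodes with 2 GPUs each.
    Run Training Operator KFTO Test    TestPyTorchJobMnistMultiNodeMultiGpuWithCuda    ${CUDA_TRAINING_IMAGE}
```

Robot Framework joins documentation continuation rows with newlines, so the text itself is unchanged while each source line becomes shorter.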
Force-pushed from d8d75d4 to ffd8213
Run Training operator KFTO_MNIST multi-node CPU test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node CPU test for Training operator using PyTorch job with NVIDIA CUDA image
Run Training operator KFTO_MNIST multi-node single-CPU test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node single-CPU test for Training operator using PyTorch job with NVIDIA CUDA image - It requires 2 cluster-nodes with at least 1 CPUs each
Check warning (Code scanning / Robocop): Line is too long
Run Training operator KFTO_MNIST multi-node test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node test for Training operator using PyTorch job with NVIDIA CUDA image
Run Training operator KFTO_MNIST multi-node multi-CPU test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node multi-CPU test for Training operator using PyTorch job with NVIDIA CUDA image - It requires 2 cluster-nodes with 2 CPUs each
Check warning (Code scanning / Robocop): Line is too long
Run Training Operator KFTO Test TestPyTorchJobMnistMultiNodeMultiCpu ${CUDA_TRAINING_IMAGE}

Run Training operator KFTO_MNIST multi-node single-GPU test with NVIDIA CUDA image
[Documentation] Run Go KFTO_MNIST multi-node single-GPU test for Training operator using PyTorch job with NVIDIA CUDA image - It requires 2 cluster-nodes with 1 GPU each
Check warning (Code scanning / Robocop): Line is too long
Run Training operator KFTO_MNIST multi-node test with AMD ROCm image
[Documentation] Run Go KFTO_MNIST multi-node test for Training operator using PyTorch job with AMD ROCm image
Run Training operator KFTO_MNIST multi-node single-GPU test with AMD ROCm image
[Documentation] Run Go KFTO_MNIST multi-node single-GPU test for Training operator using PyTorch job with AMD ROCm image - It requires 2 cluster-nodes with 1 GPU each
Check warning (Code scanning / Robocop): Line is too long
Force-pushed from ffd8213 to 2a5986d
Force-pushed from 2a5986d to c098ea0
Quality Gate passed
/lgtm
/approve
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: abhijeet-dhumal, ChughShilpa, jiripetrlik, sutaakar
Needs approval from an approver in each of these files:
Merged commit fd945bf into red-hat-data-services:master
Update KFTO multi-node test names according to recent updates in original test names
Related to: opendatahub-io/distributed-workloads#299