Migration of PyTorch REST protocol test on Triton for KServe (UI -> API) #2172
@@ -33,11 +33,14 @@
${TEST_NS}=    tritonmodel
${DOWNLOAD_IN_PVC}=    ${FALSE}
${MODELS_BUCKET}=    ${S3.BUCKET_1}
${LLM_RESOURCES_DIRPATH}=    tests/Resources/Files/llm
${INFERENCESERVICE_FILEPATH}=    ${LLM_RESOURCES_DIRPATH}/serving_runtimes/base/isvc.yaml
${INFERENCESERVICE_FILEPATH_NEW}=    ${LLM_RESOURCES_DIRPATH}/serving_runtimes/isvc
${INFERENCESERVICE_FILLED_FILEPATH}=    ${INFERENCESERVICE_FILEPATH_NEW}/isvc_filled.yaml
${KSERVE_RUNTIME_REST_NAME}=    triton-kserve-runtime
${PYTORCH_MODEL_NAME}=    resnet50
${INFERENCE_REST_INPUT_PYTORCH}=    @tests/Resources/Files/triton/kserve-triton-resnet-rest-input.json
${EXPECTED_INFERENCE_REST_OUTPUT_FILE_PYTORCH}=    tests/Resources/Files/triton/kserve-triton-resnet-rest-output.json
${PATTERN}=    https:\/\/([^\/:]+)
${PROTOBUFF_FILE}=    tests/Resources/Files/triton/grpc_predict_v2.proto
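As an aside, the ${PATTERN} variable above captures the hostname portion of an https URL. A quick illustration using the String library (the URL below is hypothetical, not part of this PR):

*** Settings ***
Library    String

*** Test Cases ***
Sketch Hostname Extraction
    # Collect capture group 1, i.e. everything between "https://" and the next "/" or ":"
    ${hosts}=    Get Regexp Matches    https://my-model.apps.example.com/v2/models
    ...    https:\/\/([^\/:]+)    1
    Should Be Equal    ${hosts}[0]    my-model.apps.example.com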
@@ -73,7 +76,7 @@
    ...    remote_port=${service_port}    process_alias=triton-process
    END
    Verify Model Inference With Retries    model_name=${PYTHON_MODEL_NAME}    inference_input=${INFERENCE_REST_INPUT_PYTHON}
    ...    expected_inference_output=${EXPECTED_INFERENCE_REST_OUTPUT_PYTHON}    project_title=${test_namespace}
    ...    deployment_mode=Cli    kserve_mode=${KSERVE_MODE}    service_port=${service_port}
    ...    end_point=/v2/models/${model_name}/infer    retries=3
    [Teardown]    Run Keywords

@@ -81,12 +84,53 @@
    ...    isvc_names=${models_names}    wait_prj_deletion=${FALSE}    kserve_mode=${KSERVE_MODE}
    ...    AND
    ...    Run Keyword If    "${KSERVE_MODE}"=="RawDeployment"    Terminate Process    triton-process    kill=true
Test Pytorch Model Rest Inference Via API (Triton on Kserve)    # robocop: off=too-long-test-case

[Code scanning / Robocop] Warning: Test case 'Test Pytorch Model Rest Inference Via API (Triton on Kserve)' has too many keywords inside (12/10)
    [Documentation]    Test the deployment of pytorch model in Kserve using Triton
    [Tags]    Tier2    RHOAIENG-16909
    Setup Test Variables    model_name=${PYTORCH_MODEL_NAME}    use_pvc=${FALSE}    use_gpu=${FALSE}
    ...    kserve_mode=${KSERVE_MODE}    model_path=triton/model_repository/
    Set Project And Runtime    runtime=${KSERVE_RUNTIME_REST_NAME}    protocol=${PROTOCOL}    namespace=${test_namespace}

[Code scanning / Robocop] Warning: Line is too long (123/120)

[Review comment] @tarukumar @rpancham @Raghul-M @rnetser guys, I was thinking: this keyword was designed to use the runtime YAML from the ODS-CI repo (IIRC, at the time, the runtimes we were using were not in RHOAI yet). Today, if the runtime is already part of the OOTB runtimes, it may make more sense to fetch the definition from the cluster itself (i.e., openshift …). I know the main goal is to move out of ODS-CI for your scrum, but given that these tests are still under maintenance, maybe it's worth considering this improvement. Up to you.
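A minimal sketch of what the reviewer's suggestion could look like, assuming the OOTB runtime is published as an OpenShift Template in the RHOAI applications namespace (the keyword name, resource kind, and namespace below are assumptions, not part of this PR):

*** Settings ***
Library    Process

*** Keywords ***
Fetch OOTB Runtime Definition
    [Documentation]    Illustrative only: read the runtime definition from the
    ...    cluster instead of applying the YAML bundled in the ODS-CI repo.
    [Arguments]    ${runtime_name}    ${namespace}=redhat-ods-applications
    # Assumption: OOTB runtimes ship as Templates in ${namespace}; adjust kind/namespace as needed
    ${result}=    Run Process    oc    get    template    ${runtime_name}
    ...    -n    ${namespace}    -o    yaml
    Should Be Equal As Integers    ${result.rc}    0
    ...    msg=Runtime ${runtime_name} not found on the cluster
    RETURN    ${result.stdout}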
    ...    download_in_pvc=${DOWNLOAD_IN_PVC}    model_name=${PYTORCH_MODEL_NAME}
    ...    storage_size=100Mi    memory_request=100Mi
    ${requests}=    Create Dictionary    memory=1Gi

[Code scanning / Robocop] Note: Create Dictionary can be replaced with VAR
    Compile Inference Service YAML    isvc_name=${PYTORCH_MODEL_NAME}
    ...    sa_name=models-bucket-sa
    ...    model_storage_uri=${storage_uri}
    ...    model_format=python    serving_runtime=${KSERVE_RUNTIME_REST_NAME}
    ...    version="1"
    ...    limits_dict=${limits}    requests_dict=${requests}    kserve_mode=${KSERVE_MODE}
    Deploy Model Via CLI    isvc_filepath=${INFERENCESERVICE_FILLED_FILEPATH}
    ...    namespace=${test_namespace}
    # The filled YAML file is no longer needed once it has been applied
    Remove File    ${INFERENCESERVICE_FILLED_FILEPATH}
    Wait For Pods To Be Ready    label_selector=serving.kserve.io/inferenceservice=${PYTORCH_MODEL_NAME}
    ...    namespace=${test_namespace}
    ${pod_name}=    Get Pod Name    namespace=${test_namespace}
    ...    label_selector=serving.kserve.io/inferenceservice=${PYTORCH_MODEL_NAME}
    ${service_port}=    Extract Service Port    service_name=${PYTORCH_MODEL_NAME}-predictor    protocol=TCP
    ...    namespace=${test_namespace}
    IF    "${KSERVE_MODE}"=="RawDeployment"
        Start Port-forwarding    namespace=${test_namespace}    pod_name=${pod_name}    local_port=${service_port}
        ...    remote_port=${service_port}    process_alias=triton-process
    END
    ${EXPECTED_INFERENCE_REST_OUTPUT_PYTORCH}=    Load Json File
    ...    file_path=${EXPECTED_INFERENCE_REST_OUTPUT_FILE_PYTORCH}    as_string=${TRUE}
    Verify Model Inference With Retries    model_name=${PYTORCH_MODEL_NAME}    inference_input=${INFERENCE_REST_INPUT_PYTORCH}

[Code scanning / Robocop] Warning: Line is too long (125/120)
    ...    expected_inference_output=${EXPECTED_INFERENCE_REST_OUTPUT_PYTORCH}    project_title=${test_namespace}
    ...    deployment_mode=Cli    kserve_mode=${KSERVE_MODE}    service_port=${service_port}
    ...    end_point=/v2/models/${model_name}/infer    retries=3
    [Teardown]    Run Keywords
    ...    Clean Up Test Project    test_ns=${test_namespace}
    ...    isvc_names=${models_names}    wait_prj_deletion=${FALSE}    kserve_mode=${KSERVE_MODE}
    ...    AND
    ...    Run Keyword If    "${KSERVE_MODE}"=="RawDeployment"    Terminate Process    triton-process    kill=true
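For reference, the end_point used above follows the KServe v2 REST inference protocol (POST /v2/models/<model_name>/infer). A standalone sketch of the same call with RequestsLibrary, assuming the predictor has already been port-forwarded (the local URL and port are placeholders, not part of this PR):

*** Settings ***
Library    OperatingSystem
Library    RequestsLibrary

*** Test Cases ***
Sketch V2 Rest Infer Call
    # Assumption: the predictor pod is port-forwarded to localhost:8080
    ${payload}=    Get File    tests/Resources/Files/triton/kserve-triton-resnet-rest-input.json
    ${response}=    POST    http://localhost:8080/v2/models/resnet50/infer
    ...    data=${payload}    expected_status=200
    Log    ${response.json()}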
[Code scanning / Robocop] Warning: Trailing whitespace at the end of line
Test Python Model Grpc Inference Via API (Triton on Kserve)    # robocop: off=too-long-test-case
    [Documentation]    Test the deployment of python model in Kserve using Triton
    [Tags]    Tier2    RHOAIENG-16912
    Setup Test Variables    model_name=${PYTHON_MODEL_NAME}    use_pvc=${FALSE}    use_gpu=${FALSE}
    ...    kserve_mode=${KSERVE_MODE}    model_path=triton/model_repository/
    Set Project And Runtime    runtime=${KSERVE_RUNTIME_REST_NAME}    protocol=${PROTOCOL_GRPC}    namespace=${test_namespace}
    ...    download_in_pvc=${DOWNLOAD_IN_PVC}    model_name=${PYTHON_MODEL_NAME}
    ...    storage_size=100Mi    memory_request=100Mi
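The gRPC test above relies on the ${PROTOBUFF_FILE} defined in the variables section (the KServe v2 predict protocol definition). A hedged sketch of a readiness probe against that service using grpcurl from a Robot test (the grpcurl binary, plaintext flag, and host:port are assumptions, not part of this PR):

*** Settings ***
Library    Process

*** Test Cases ***
Sketch V2 Grpc Server Ready Call
    # Assumption: grpcurl is installed and the gRPC predictor listens on localhost:8033
    ${result}=    Run Process    grpcurl    -plaintext
    ...    -proto    tests/Resources/Files/triton/grpc_predict_v2.proto
    ...    localhost:8033    inference.GRPCInferenceService/ServerReady
    Should Be Equal As Integers    ${result.rc}    0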
@@ -166,6 +210,7 @@
    ...    AND
    ...    Run Keyword If    "${KSERVE_MODE}"=="RawDeployment"    Terminate Process    triton-process    kill=true
*** Keywords ***
Suite Setup
    [Documentation]    Suite setup keyword

[Code scanning / Robocop] Warning: Line is too long