
--triton-launch-mode=remote #874

Open
riyajatar37003 opened this issue May 15, 2024 · 4 comments

Comments

@riyajatar37003

Hi,

Can you share an example/command for this mode?

When launching, I am doing it this way:

```
tritonserver --model-control-mode explicit --exit-on-error=false --model-repository=/tmp/models
```

and in the other container I am running this:

```
model-analyzer profile \
    --profile-models reranker --triton-launch-mode=remote \
    --output-model-repository-path ./output \
    --export-path profile_results --triton-http-endpoint
```

but Triton Server itself is not launching.

@tgerdesnv
Collaborator

--triton-launch-mode=remote tells Model Analyzer not to launch tritonserver. The expectation is that there is already a server up and running (usually on a different machine).

@riyajatar37003
Author

What is this then?

> This mode is beneficial when you want to use an already running Triton Inference Server. You may provide the URLs for the Triton instance's HTTP or GRPC endpoint, depending on your chosen client protocol, using the --triton-grpc-endpoint and --triton-http-endpoint flags. You should also make sure that the same GPUs are available to the Inference Server and Model Analyzer, and that they are on the same machine. Triton Server in this mode needs to be launched with the --model-control-mode explicit flag to support loading/unloading of the models.
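Putting that documentation together with the commands above, a minimal two-container setup might look like the following. This is a sketch, not a verified recipe: the endpoint `localhost:8000` is a placeholder assumption (Triton's default HTTP port), and the model repository path is taken from the commands in this issue.

```shell
# Container 1: start Triton Server with explicit model control so that
# Model Analyzer can load/unload model variants remotely.
tritonserver \
    --model-control-mode explicit \
    --exit-on-error=false \
    --model-repository=/tmp/models

# Container 2: profile against the already-running server.
# Note that --triton-http-endpoint must be given a host:port value;
# localhost:8000 below is an assumed placeholder.
model-analyzer profile \
    --profile-models reranker \
    --triton-launch-mode=remote \
    --triton-http-endpoint localhost:8000 \
    --output-model-repository-path ./output \
    --export-path profile_results
```

Per the quoted documentation, both processes must see the same GPUs and run on the same machine.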

@riyajatar37003
Author

Now I am getting this error:

```
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA A100-SXM4-40GB with UUID GPU-d9a0447f-f8fa-9d2f-79fc-ecf2567dacc2
[Model Analyzer] WARNING: Overriding the output model repo path "./rerenker_output1"
[Model Analyzer] Starting a local Triton Server
[Model Analyzer] Loaded checkpoint from file /model_repositories/checkpoints/0.ckpt
[Model Analyzer] GPU devices match checkpoint - skipping server metric acquisition
[Model Analyzer]
[Model Analyzer] Starting quick mode search to find optimal configs
[Model Analyzer]
[Model Analyzer] Creating model config: reranker_config_default
[Model Analyzer]
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_default
[Model Analyzer]
[Model Analyzer] Profiling reranker_config_default: client batch size=1, concurrency=24
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_default: client batch size=1, concurrency=8
[Model Analyzer]
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] Saved checkpoint to model_repositories/checkpoints/1.ckpt
[Model Analyzer] Creating model config: reranker_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Profiling reranker_config_0: client batch size=1, concurrency=2
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_0: client batch size=1, concurrency=2
[Model Analyzer]
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
Traceback (most recent call last):
  File "/opt/app_venv/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 124, in profile
    self._profile_models()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 233, in _profile_models
    self._model_manager.run_models(models=models)
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 145, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 239, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
```

@nv-braf
Contributor

nv-braf commented May 16, 2024

MA is not receiving a measurement from Perf Analyzer within the timeout window (600s). After two attempts without measurements, MA exits and directs you to examine the error logs to determine what has gone wrong. There can be a variety of reasons why this is occurring. Please examine the PA error log for additional details.
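One way to narrow this down (a suggestion, not something stated in this issue) is to run Perf Analyzer directly against the remote server and see whether it can produce a measurement at all. The model name and endpoint below are placeholders taken from the commands earlier in the thread:

```shell
# Hypothetical standalone perf_analyzer run against the remote server.
# If this also hangs or errors, the problem is between Perf Analyzer
# and the server (model, endpoint, or inputs), not in Model Analyzer.
perf_analyzer \
    -m reranker \
    -u localhost:8000 \
    -i http \
    --concurrency-range 1
```

If this succeeds, compare its output with the perf_analyzer command line that Model Analyzer logs in verbose mode to see where the two diverge.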
