--triton-launch-mode=remote #874
What is this mode, then?

This mode is beneficial when you want to use an already-running Triton Inference Server. You may provide the URL for the Triton instance's HTTP or gRPC endpoint, depending on your chosen client protocol, using the --triton-http-endpoint or --triton-grpc-endpoint flags. You should also make sure the same GPUs are available to both the Inference Server and Model Analyzer, and that they are on the same machine. In this mode, Triton Server needs to be launched with the --model-control-mode explicit flag to support loading/unloading of models.
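Putting the description above into concrete commands, a remote-mode setup would look roughly like this. This is a sketch: the model repository path, model name, and endpoint are placeholders, not values taken from this thread.

```shell
# Terminal 1: start Triton yourself. --model-control-mode=explicit is
# required so Model Analyzer can load/unload model configs on the server.
tritonserver \
    --model-repository=/path/to/models \
    --model-control-mode=explicit

# Terminal 2: point Model Analyzer at the already-running server.
# With --triton-launch-mode=remote, Model Analyzer does NOT start Triton;
# it only connects to the endpoint you give it.
model-analyzer profile \
    --profile-models reranker \
    --triton-launch-mode=remote \
    --triton-http-endpoint localhost:8000 \
    --output-model-repository-path ./output
```

Note that both processes must see the same GPUs and run on the same machine, per the description above.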
Now I am getting this error:

[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA A100-SXM4-40GB with UUID GPU-d9a0447f-f8fa-9d2f-79fc-ecf2567dacc2
[Model Analyzer] WARNING: Overriding the output model repo path "./rerenker_output1"
[Model Analyzer] Starting a local Triton Server
[Model Analyzer] Loaded checkpoint from file /model_repositories/checkpoints/0.ckpt
[Model Analyzer] GPU devices match checkpoint - skipping server metric acquisition
[Model Analyzer] Starting quick mode search to find optimal configs
[Model Analyzer] Creating model config: reranker_config_default
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_default
[Model Analyzer] Profiling reranker_config_default: client batch size=1, concurrency=24
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_default: client batch size=1, concurrency=8
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] Saved checkpoint to model_repositories/checkpoints/1.ckpt
[Model Analyzer] Creating model config: reranker_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer] Profiling reranker_config_0: client batch size=1, concurrency=2
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_0: client batch size=1, concurrency=2
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] No changes made to analyzer data, no checkpoint saved.

Traceback (most recent call last):
  File "/opt/app_venv/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 124, in profile
    self._profile_models()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 233, in _profile_models
    self._model_manager.run_models(models=models)
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 145, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 239, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
MA is not receiving a measurement from Perf Analyzer within the timeout window (600s). After two attempts without measurements, MA exits and directs you to examine the error logs to determine what has gone wrong. There can be a variety of reasons why this occurs. Please examine the PA error log for additional details.
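If the model simply needs longer than the default 600-second window to produce a measurement (e.g. a slow first-load or long warm-up), the timeout can be raised. A minimal sketch, assuming Model Analyzer's --perf-analyzer-timeout option; verify the flag name against your installed version:

```shell
# Sketch: give Perf Analyzer more time before Model Analyzer kills it.
# --perf-analyzer-timeout is the measurement timeout in seconds
# (default 600); 1200 here is an illustrative value, not a recommendation.
model-analyzer profile \
    --profile-models reranker \
    --perf-analyzer-timeout 1200 \
    --output-model-repository-path ./output
```

If the timeout is not the root cause, the PA/Triton error logs mentioned above remain the place to look.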
Hi,
Can you share an example command for this mode?
When launching, I am starting Triton this way:
"tritonserver --model-control-mode explicit --exit-on-error=false --model-repository=/tmp/models"
and in the other container I am running this:
"model-analyzer profile
--profile-models reranker
--triton-launch-mode=remote
--output-model-repository-path ./output
--export-path profile_results
--triton-http-endpoint "
but the Triton Server itself is not launching.