
Llama2 70B Benchmark #248

Open
ninano1208 opened this issue Feb 20, 2025 · 3 comments

Comments

@ninano1208

I am unable to run the benchmark due to the following two issues and need help resolving them.
I am using ROCm 6.1, and ROCm itself is installed and working correctly.

/app/mlc/bin/python3 main.py --scenario Offline --dataset-path /root/MLC/repos/local/cache/download-file_a50938f4/open_orca/open_orca_gpt4_tokenized_llama.sampled_24576.pkl.gz --device rocm --total-sample-count 10 --user-conf '/root/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/5083167cafc74fa18c4e30e3eed9e796.conf' --output-log-dir /root/MLC/repos/local/cache/get-mlperf-inference-results-dir_98334fda/test_results/5cfda71d8e53-reference-rocm-pytorch-v2.6.0.dev20241122-default_config/llama2-70b-99/offline/performance/run_1 --dtype float32 --model-path /root/MLC/repos/local/cache/get-ml-model-llama2_adcd8091 2>&1 | tee '/root/MLC/repos/local/cache/get-mlperf-inference-results-dir_98334fda/test_results/5cfda71d8e53-reference-rocm-pytorch-v2.6.0.dev20241122-default_config/llama2-70b-99/offline/performance/run_1/console.out'; echo ${PIPESTATUS[0]} > exitstatus
usage: main.py [-h] [--scenario {Offline,Server}] [--model-path MODEL_PATH]
[--dataset-path DATASET_PATH] [--accuracy] [--dtype DTYPE]
[--device {cpu,cuda:0}] [--audit-conf AUDIT_CONF]
[--user-conf USER_CONF]
[--total-sample-count TOTAL_SAMPLE_COUNT]
[--batch-size BATCH_SIZE] [--output-log-dir OUTPUT_LOG_DIR]
[--enable-log-trace] [--num-workers NUM_WORKERS] [--vllm]
[--api-model-name API_MODEL_NAME] [--api-server API_SERVER]
[--lg-model-name {llama2-70b,llama2-70b-interactive}]
main.py: error: argument --device: invalid choice: 'rocm' (choose from cpu, cuda:0)
Traceback (most recent call last):
File "/app/mlc/bin/mlcr", line 8, in
sys.exit(mlcr())
^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1715, in mlcr
main()
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1797, in main
res = method(run_args)
^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1509, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1772, in _run
r = customize_code.preprocess(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/script/run-mlperf-inference-app/customize.py", line 284, in preprocess
r = mlc.access(ii)
^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1509, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1842, in _run
r = self._call_run_deps(prehook_deps, self.local_env_keys, local_env_keys_from_meta, env, state, const, const_state, add_deps_recursive,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3532, in _call_run_deps
r = script._run_deps(deps, local_env_keys, env, state, const, const_state, add_deps_recursive, recursion_spaces,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3702, in _run_deps
r = self.action_object.access(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1509, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1858, in _run
r = prepare_and_run_script_with_postprocessing(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 5488, in prepare_and_run_script_with_postprocessing
r = script_automation._call_run_deps(posthook_deps, local_env_keys, local_env_keys_from_meta, env, state, const, const_state,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3532, in _call_run_deps
r = script._run_deps(deps, local_env_keys, env, state, const, const_state, add_deps_recursive, recursion_spaces,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3702, in _run_deps
r = self.action_object.access(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1509, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1885, in _run
r = self._run_deps(post_deps, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive, recursion_spaces,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3702, in _run_deps
r = self.action_object.access(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/mlc/lib/python3.12/site-packages/mlc/main.py", line 1519, in call_script_module_function
raise ScriptExecutionError(f"Script {function_name} execution failed. Error : {error}")
mlc.main.ScriptExecutionError: Script run execution failed. Error : MLC script failed (name = benchmark-program, return code = 512)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please file an issue at https://github.com/mlcommons/mlperf-automations/issues along with the full MLC command being run and the relevant
or full console log.
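The failure above comes from argparse's `choices` validation: `main.py` only allows `cpu` and `cuda:0` for `--device`, so `rocm` is rejected before the benchmark even starts. A minimal, hypothetical sketch of that behavior (mirroring the usage text, not the actual `main.py` source):

```python
import argparse

# Hypothetical reproduction of the failure: "rocm" is rejected because
# it is not in the --device choices list shown in the usage message.
parser = argparse.ArgumentParser()
parser.add_argument("--device", choices=["cpu", "cuda:0"], default="cpu")

try:
    parser.parse_args(["--device", "rocm"])
except SystemExit:
    # argparse prints "invalid choice: 'rocm'" to stderr and exits
    print("rejected: rocm")

# One way a harness could accept ROCm is to extend the choices list:
parser = argparse.ArgumentParser()
parser.add_argument("--device", choices=["cpu", "cuda:0", "rocm"], default="cpu")
print(parser.parse_args(["--device", "rocm"]).device)  # rocm
```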

@anandhu-eng
Contributor

Hi @ninano1208 , I suppose you have followed this documentation. Could you replace --device=rocm with --device=cuda in your run command? It should pick up the GPU for execution even though the ROCm library is installed.
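The suggestion works because ROCm builds of PyTorch expose the HIP backend through the `torch.cuda` namespace, so `"cuda:0"` can address an AMD GPU. A minimal sketch of that selection logic (`pick_device` is a hypothetical helper; `torch` is imported only if present):

```python
# ROCm builds of PyTorch expose the HIP backend through the torch.cuda
# namespace, so "cuda:0" can address an AMD GPU as well as an NVIDIA one.
try:
    import torch
except ImportError:
    torch = None  # allow this sketch to run without PyTorch installed

def pick_device() -> str:
    """Return "cuda:0" when a CUDA or ROCm/HIP GPU is visible, else "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda:0"
    return "cpu"

print(pick_device())
```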

@anandhu-eng
Contributor

Apologies for the previous reply; that would install the CUDA dependency on the system instead of ROCm. A PR containing the fix has been raised here. There will be no modification to the run command.

@anandhu-eng
Contributor

@ninano1208 could you please try again? The changes should now be reflected.
