trtexec: ONNX to TensorRT conversion fails with --fp16; it reports a segmentation fault #4111
Comments
Add --verbose.
@lix19937 Hi, I added --verbose. This is the complete log file.
Can you upload the ONNX file here?
Yes, I can, but the ONNX file is too large (about 1.5 GB). Can you use Baidu Cloud?
@demuxin Can you upload it to Google Drive?
Hi @lix19937, this is the Google Drive link for the ONNX model: https://drive.google.com/file/d/1YyBqO0GbskV-_3Wc2ljHh5s84bH9ldV_/view?usp=sharing
@demuxin
But when the build continues, it raises the following error:

My TensorRT version is v8601 (8.6.0.1); what about yours?
Thank you for your prompt reply. I've confirmed that I'm building the engine on an RTX 3090 with plenty of memory left, so it shouldn't be an out-of-memory issue; it might also be a TensorRT bug. I tried again with TensorRT 10.3 and it builds the engine successfully, but the model outputs are completely different from fp32 mode. Do you have any better suggestions for troubleshooting this problem?
Can you upload the log?
I run inference successfully using the C++ TensorRT API, without any error message. Which log do you want me to upload?
The log of trtexec.
This is the complete trtexec log file on TensorRT 10.3.
Your model has a LayerNorm layer after self-attention, which overflows in fp16, so you should run LayerNorm in fp32.
Thank you for working so hard. I'm using Netron to visualize the ONNX model and no LayerNorm layer is found; do you know what's going on? And do you know how to set the LayerNorm layers individually to fp32 using the TensorRT C++ API and trtexec?
You can ref the fused pattern. TRT fuses the LayerNorm subgraph nodes (ReduceMean, Sub, Pow, Sqrt, Div, Mul, Add) into a single normalization layer at build time, which is why Netron shows no explicit LayerNorm node in the ONNX graph.
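For the C++ API side of the question, here is a minimal sketch (assuming `network` and `config` come from the usual ONNX-parser build workflow; the "LayerNorm" substring match is an illustrative assumption, not a guaranteed layer name) that keeps normalization layers in fp32 while the rest of the network builds in fp16:

```cpp
// Sketch only: pin LayerNorm / normalization layers to fp32 in an fp16
// build. The "LayerNorm" name match is an assumption -- check the actual
// layer names in your verbose build log.
#include <NvInfer.h>
#include <string>

void pinLayerNormToFp32(nvinfer1::INetworkDefinition* network,
                        nvinfer1::IBuilderConfig* config)
{
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
    // Required so TensorRT honors the per-layer precisions set below.
    config->setFlag(nvinfer1::BuilderFlag::kOBEY_PRECISION_CONSTRAINTS);

    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        std::string const name = layer->getName();
        // TRT 8.6+ parses an opset-17 LayerNormalization op into a
        // NORMALIZATION layer; the name check covers other cases.
        if (layer->getType() == nvinfer1::LayerType::kNORMALIZATION
            || name.find("LayerNorm") != std::string::npos)
        {
            layer->setPrecision(nvinfer1::DataType::kFLOAT);
            for (int j = 0; j < layer->getNbOutputs(); ++j)
            {
                layer->setOutputType(j, nvinfer1::DataType::kFLOAT);
            }
        }
    }
}
```

Note that if the ONNX was exported with opset < 17, the network definition contains the decomposed primitive ops rather than a single normalization layer, so you would pin the constituent ReduceMean/Pow/Sqrt/Div layers instead, or re-export with opset 17. On the trtexec side, the matching knobs are --precisionConstraints=obey together with --layerPrecisions (and optionally --layerOutputTypes).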
Hi @lix19937, I used this command to build the TensorRT engine:

But there is still the following warning:

This is the build log file. It seems the setting did not take effect; how should I set it up? Also, do you know how to set the LayerNorm layers individually to fp32 using the TensorRT C++ API?
I updated torch to 1.13 and exported the ONNX with opset 17; this solves the issue (opset 17 exports LayerNorm as a single LayerNormalization op, which TensorRT can parse directly).
I had successfully converted an fp32 engine from the ONNX model. I tried the suggestion above to add --verbose for the fp16 conversion. Within the log there is no overflow message, and the result still differs from the fp32 engine. Any further steps to investigate?
Try the following to see the diff:

```
polygraphy run $spec_onnx --onnxrt --trt
polygraphy run $spec_onnx --onnxrt --trt --fp16
```
I ran both commands (fp32 and fp16), but I still can't see which layer produces different results.
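One way to narrow this down with polygraphy is to compare every intermediate tensor, e.g. `polygraphy run $spec_onnx --onnxrt --trt --fp16 --onnx-outputs mark all --trt-outputs mark all`, which reports where the outputs first diverge. With the C++ API, a comparable (if cruder) trick is to mark each layer's output as a network output in both the fp32 and fp16 builds and diff the dumps; a minimal sketch, assuming `network` is the parsed INetworkDefinition:

```cpp
// Sketch only: expose each layer's first output as a network output so
// the fp32 and fp16 engines can be compared tensor-by-tensor.
#include <NvInfer.h>

void markAllLayerOutputs(nvinfer1::INetworkDefinition* network)
{
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ITensor* out = network->getLayer(i)->getOutput(0);
        // Skip tensors that are already network outputs.
        if (out != nullptr && !out->isNetworkOutput())
        {
            network->markOutput(*out);
        }
    }
}
```

One caveat: marking extra outputs disables some fusions, so the fp16 numerics can shift slightly relative to the unmodified engine.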
Description
I'm using the C++ TensorRT API to build an engine from the model, but it reports a segmentation fault.
Then I used the trtexec command and the same error was reported:
But I can successfully build the engine when the --fp16 flag is not specified.
What can be done to troubleshoot this problem?
Environment
TensorRT Version: 9.3 / 8.6
NVIDIA GPU: RTX 3090
NVIDIA Driver Version: 535.183.01
CUDA Version: 11.8
Operating System: Ubuntu 22