
frigate.sh install issue #8

Open
Roagert opened this issue Aug 9, 2024 · 32 comments

Comments

@Roagert

Roagert commented Aug 9, 2024

Please verify that you have read and understood the guidelines.

yes

A clear and concise description of the issue.

With any settings, running the frigate.sh script fails.

What settings are you currently utilizing?

Advanced Settings

Which Linux distribution are you employing?

Debian 11

If relevant, including screenshots or a code block can be helpful in clarifying the issue.

As described in tteck#2711.

What I have noticed is the following:

The default installation doesn't work:
[screenshot: im1_creating_default]

Seems like Debian 11 is failing the installation. I just tried Ubuntu Focal; that seems to continue further, even past the Nvidia driver installation.

But further issues:

Shared drive: (bookworm)
[screenshot: im1_creating_adv_bookworm_reboot]
Shared drive disabled: (bookworm)
Seems to continue but fails on driver installation
[screenshot: im1_creating_adv_bookworm_no_shared_drive_drivers_fail]

Bullseye failed:
[screenshot: im1_creating_adv_bullseye_fail]

Ubuntu also fails:
But it got significantly further:
[screenshot]

 ✓ Set Up Hardware Acceleration
✓ Stop spinner to prevent segmentation fault
/Collecting py3nvml (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 17))
Cloning https://github.com/fbcotter/py3nvml to /tmp/pip-wheel-cnhsa4ht/py3nvml_73a47781bfe14d25a7d33dfcb3c22244
Running command git clone --filter=blob:none --quiet https://github.com/fbcotter/py3nvml /tmp/pip-wheel-cnhsa4ht/py3nvml_73a47781bfe14d25a7d33dfcb3c22244
Resolved https://github.com/fbcotter/py3nvml to commit 0545dfd39858c1552aaabc52ea3fbcb7f5714853
Preparing metadata (setup.py) ... done
Collecting click==8.1.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 1))
Downloading click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting Flask==3.0.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 2))
Downloading flask-3.0.3-py3-none-any.whl.metadata (3.2 kB)
Collecting Flask_Limiter==3.7.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 3))
Downloading Flask_Limiter-3.7.0-py3-none-any.whl.metadata (6.1 kB)
Collecting imutils==0.5.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 4))
Downloading imutils-0.5.4.tar.gz (17 kB)
Preparing metadata (setup.py) ... done
Collecting joserfc==0.11.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 5))
Downloading joserfc-0.11.1-py3-none-any.whl.metadata (2.5 kB)
Collecting markupsafe==2.1.* (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 6))
Downloading MarkupSafe-2.1.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting mypy==1.6.1 (from -r /opt/frigate/docker/main/requirements-wheels.txt (line 7))
Downloading mypy-1.6.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.7 kB)
ERROR: Ignored the following versions that require a different python version: 1.25.0 Requires-Python >=3.9; 1.25.0rc1 Requires-Python >=3.9; 1.25.1 Requires-Python >=3.9; 1.25.2 Requires-Python >=3.9; 1.26.0 Requires-Python <3.13,>=3.9; 1.26.0b1 Requires-Python <3.13,>=3.9; 1.26.0rc1 Requires-Python <3.13,>=3.9; 1.26.1 Requires-Python <3.13,>=3.9; 1.26.2 Requires-Python >=3.9; 1.26.3 Requires-Python >=3.9; 1.26.4 Requires-Python >=3.9; 2.0.0 Requires-Python >=3.9; 2.0.0b1 Requires-Python >=3.9; 2.0.0rc1 Requires-Python >=3.9; 2.0.0rc2 Requires-Python >=3.9; 2.0.1 Requires-Python >=3.9
ERROR: Could not find a version that satisfies the requirement numpy==1.26.* (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6, 1.22.0, 1.22.1, 1.22.2, 1.22.3, 1.22.4, 1.23.0rc1, 1.23.0rc2, 1.23.0rc3, 1.23.0, 1.23.1, 1.23.2, 1.23.3, 1.23.4, 1.23.5, 1.24.0rc1, 1.24.0rc2, 1.24.0, 1.24.1, 1.24.2, 1.24.3, 1.24.4)
ERROR: No matching distribution found for numpy==1.26.*

[ERROR] in line 60: exit code 0: while executing command $STD pip3 wheel --wheel-dir=/wheels -r /opt/frigate/docker/main/requirements-wheels.txt

Please provide detailed steps to reproduce the issue.

bash -c "$(wget -qLO - https://github.com/remz1337/Proxmox/raw/remz/ct/frigate.sh)"

Set default or advanced settings; see the description above for the issue.

@chrislawso

chrislawso commented Aug 12, 2024

Hi, I hit nearly the same results, though the error I receive is slightly different. The host PVE version is 8.2.4. I am running this command on the host:
bash -c "$(wget -qLO - https://github.com/remz1337/Proxmox/raw/remz/ct/frigate.sh)"

Install settings are Advanced Settings, Debian 11, and I select Nvidia GPU passthrough. I have Nvidia drivers installed on the host. The errors I get are:

...
 -Processing triggers for rsyslog (8.2102.0-2+deb11u1) ...
 \Processing triggers for man-db (2.9.4-2) ...
 \Processing triggers for mailcap (3.69) ...
 ✓ Updated Container OS
 ✓ Set up Container
 ✓ Installed APT proxy client
 ✓ Installed sudo
 -safe_mount: 1425 No such file or directory - Failed to mount "/dev/nvidia-modeset" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidia-modeset"
run_buffer: 571 Script exited with status 17
lxc_setup: 3948 Failed to run autodev hooks
do_start: 1273 Failed to setup container "10008"
sync_wait: 34 An error occurred in another process (expected sequence number 4)
__lxc_start: 2114 Failed to spawn container "10008"
startup for container '10008' failed

[ERROR] in line 20: exit code 0: while executing command pct reboot $CTID

/dev/fd/63: line 200: pop_var_context: head of shell_variables not a function context

After this point the LXC that was created is not running. If I try to start the LXC manually it fails, with PVE showing this error:

safe_mount: 1425 No such file or directory - Failed to mount "/dev/nvidia-modeset" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/nvidia-modeset"
run_buffer: 571 Script exited with status 17
lxc_setup: 3948 Failed to run autodev hooks
do_start: 1273 Failed to setup container "10008"
sync_wait: 34 An error occurred in another process (expected sequence number 4)
__lxc_start: 2114 Failed to spawn container "10008"
TASK ERROR: startup for container '10008' failed

@chrislawso

chrislawso commented Sep 1, 2024

Today I tried your bash installer again with an advanced install, no GPU passthrough. It shows the following error:

✓ LXC Container 101 was successfully created.
 -/dev/dri/renderD128 is not a device
 \
[ERROR] in line 730: exit code 0: while executing command pct start "$CTID"

I reviewed your shell files again and I do not see where this line exists. I don't see a frigate file with a line 730.

@remz1337
Owner

I was able to investigate this and made some changes. Could you try again running the script with default parameters?

@chrislawso

chrislawso commented Sep 13, 2024

I ran the script now with default parameters and also tried again with advanced settings, and it always throws the same error as identified previously. I am running it on Proxmox VE 8.2.4, kernel 6.8.4-3, with Intel Xeon E5 v2 CPUs and a Dell motherboard.

✓ LXC Container 101 was successfully created.
 -/dev/dri/renderD128 is not a device
 \
[ERROR] in line 730: exit code 0: while executing command pct start "$CTID"

@remz1337
Owner

@chrislawso it looks like your hardware config is different from mine. Can you run this command on your host and paste the output here, please?
ls -al /dev/dri

Here's what I get on my server:

root@proxmox:~# ls -al /dev/dri
total 0
drwxr-xr-x  3 root root        100 Aug 26 15:15 .
drwxr-xr-x 22 root root       5700 Sep 12 20:56 ..
drwxr-xr-x  2 root root         80 Aug 26 15:15 by-path
crw-rw----  1 root video  226,   0 Aug 26 15:15 card0
crw-rw----  1 root render 226, 128 Aug 26 15:15 renderD128

@chrislawso

chrislawso commented Sep 14, 2024

Hi,

Thank you for responding and thanks for all your help in making your script possible.

Here is a Dell Xeon server machine, PVE 8.2.4, kernel 6.8.4-3, with a dedicated Nvidia GPU:

root@pvealpha:~# ls -al /dev/dri
total 0
drwxr-xr-x  3 root root     120 Aug 13 00:09 .
drwxr-xr-x 20 root root    6060 Aug 29 15:28 ..
drwxr-xr-x  2 root root     100 Aug 13 00:09 by-path
crw-rw----  1 root video 226, 0 Aug 13 00:00 card0
crw-rw----  1 root video 226, 1 Aug 13 00:09 card1
----------  1 root root       0 Aug 13 00:01 renderD128

I also tried your script on a different Intel i7 machine with PVE 8.2.4 and Linux 6.8.8-1-pve (2024-06-10T11:42Z), which throws this error after running your script:

✓ Started LXC Container
Customizing LXC creation
 / Setting up Container   
[ERROR] in line 12: exit code 0: while executing command pct exec $CTID -- /bin/bash -c "apt install -qqy curl &>/dev/null"

/dev/fd/63: line 200: pop_var_context: head of shell_variables not a function context

Here is the i7 desktop machine, PVE 8.2.4, kernel 6.8.8-1, with a dedicated Nvidia GTX GPU:

root@pve:~# ls -al /dev/dri
total 0
drwxr-xr-x  3 root root        120 Aug 12 00:10 .
drwxr-xr-x 19 root root       4540 Aug 12 00:16 ..
drwxr-xr-x  2 root root        100 Aug 12 00:10 by-path
crw-rw----  1 root video  226,   0 Aug 12 00:00 card0
crw-rw----  1 root video  226,   1 Aug 12 00:10 card1
crw-rw----  1 root render 226, 128 Aug 12 00:10 renderD128

Location of the Nvidia GPU devices which I add for device passthrough to the LXC:

root@pve:~# ls /dev/nvidia* -l
crw-rw-rw- 1 root root 195,   0 Sep 14 00:12 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 14 00:12 /dev/nvidiactl
crw-rw-rw- 1 root root 234,   0 Sep 14 00:12 /dev/nvidia-uvm
crw-rw-rw- 1 root root 234,   1 Sep 14 00:12 /dev/nvidia-uvm-tools

/dev/nvidia-caps:
total 0
cr-------- 1 root root 237, 1 Sep 14 00:12 nvidia-cap1
cr--r--r-- 1 root root 237, 2 Sep 14 00:12 nvidia-cap2

@remz1337
Owner

I'm trying to troubleshoot the curl issue, but I'm unable to reproduce it on my machine. The only thing I see different on your Xeon server is the permissions on renderD128. They should be crw-rw---- but you have ----------
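
If those permissions really are the problem, a minimal sketch of restoring the expected mode on the host would be something like this (group name render is an assumption based on my output above; this resets on reboot, so a udev rule would be needed to make it permanent):

chown root:render /dev/dri/renderD128
chmod 660 /dev/dri/renderD128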

@chrislawso

chrislawso commented Sep 14, 2024

Most of my machines show crw-rw----, same as you have. On a different Dell Xeon server I performed a clean new Proxmox install with an Nvidia GPU and it shows the same permissions as you have. Also, as you can see in my previous messages, I have also tested a desktop Intel i7 with an Nvidia GTX and it has the same permissions as you have. These machines with the same permissions as yours also stop at the error messages which I posted above in previous messages.

root@pve:~# ls -al /dev/dri
total 0
drwxr-xr-x  3 root root        120 Sep 14 08:42 .
drwxr-xr-x 20 root root       4400 Sep 14 09:59 ..
drwxr-xr-x  2 root root        100 Sep 14 08:42 by-path
crw-rw----  1 root video  226,   0 Sep 14 08:42 card0
crw-rw----  1 root video  226,   1 Sep 14 08:42 card1
crw-rw----  1 root render 226, 128 Sep 14 08:42 renderD128

See my newest post below. I am giving you direct access to one of my test Proxmox machines. If you have a moment you can test and reproduce the issues.

@chrislawso

I'm trying to troubleshoot the curl issue, but I'm unable to reproduce it on my machine. The only thing I see different on your Xeon server is the permissions on renderD128. They should be crw-rw---- but you have ----------

I understand that you are unable to reproduce the issues on your machine. If you are willing, I am happy to provide you remote access to any of my machines.

I am posting access information for a test machine that is open and available to you now. The access info is available at this link for a one-time view, and the page will be deleted after it is viewed once, so please copy the access credentials. Can you please respond when you have copied this information and confirm you are able to connect to the Proxmox host? I will leave the server powered on for however long you require.

https://pastebin.com/n8NFnfFK
pw to link
y6KmkHVpFX

Please respond when you have copied and connected so I can remove these posts. Thank you

@remz1337
Owner

I've copied it, thanks. So they all have the same error while trying to install curl?

@chrislawso

chrislawso commented Sep 15, 2024

Hi, on the server you have access to I tested your script now with the defaults and it resulted in the following message:

 /Processing triggers for rsyslog (8.2102.0-2+deb11u1) ...
 -Processing triggers for man-db (2.9.4-2) ...
 -Processing triggers for mailcap (3.69) ...
 ✓ Updated Container OS
 ✓ Set up Container
 ✓ Installed APT proxy client
 ✓ Installed sudo
 ✓ Rebooted LXC
 \ Installing Nvidia Drivers   
[ERROR] in line 149: exit code 0: while executing command pct exec $CTID -- /bin/bash -c "wget -q $DOWNLOAD_URL"

bash: line 200: pop_var_context: head of shell_variables not a function context

It looks like you probably made more updates to your script, because it is the first time I see the script getting to, and failing at, the "Installing Nvidia Drivers" stage.

On a different computer, a desktop Intel i7 with an Nvidia GTX, I tested your script today and the installation fails at:

Setting up gnupg-utils (2.2.27-2+deb11u2) ...
 -Setting up gpg-agent (2.2.27-2+deb11u2) ...
 |Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-browser.socket → /usr/lib/systemd/user/gpg-agent-browser.socket.
 /Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-extra.socket → /usr/lib/systemd/user/gpg-agent-extra.socket.
 -Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent-ssh.socket → /usr/lib/systemd/user/gpg-agent-ssh.socket.
 \Created symlink /etc/systemd/user/sockets.target.wants/gpg-agent.socket → /usr/lib/systemd/user/gpg-agent.socket.
Setting up gpg-wks-client (2.2.27-2+deb11u2) ...
 |Setting up gpg-wks-server (2.2.27-2+deb11u2) ...
Setting up gnupg (2.2.27-2+deb11u2) ...
 /Processing triggers for man-db (2.9.4-2) ...
 \Processing triggers for libc-bin (2.36-9+deb12u8) ...
 /Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
 \OK
 \
[ERROR] in line 125: exit code 0: while executing command wget -q https://developer.download.nvidia.com/compute/cuda/repos/${os}/x86_64/cuda-keyring_1.1-1_all.deb

On this same desktop Intel i7 with the Nvidia GTX I ran your script again today and the installation fails differently:

Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
 |/usr/local/src/tensorrt_demos/yolo/yolo_to_onnx.py:486: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
  param_name, param_data, param_data_shape = self._load_one_param_type(
 -
Generating yolov7-320.trt. This may take a few minutes.

 -ERROR: failed to build the TensorRT engine!
 \[09/15/2024-16:44:29] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x502)
Loading the ONNX file...
Adding yolo_layer plugins.
Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
Available tensorrt models:
ls: cannot access '*.trt': No such file or directory

[ERROR] in line 193: exit code 0: while executing command $STD bash /opt/frigate/docker/tensorrt/detector/rootfs/etc/s6-overlay/s6-rc.d/trt-model-prepare/run

Thank you for everything you do

@remz1337
Owner

Thank you for the logs. I'm running the script again on my side to see if there are any issues, and I'll continue investigating this week.

@remz1337
Owner

Ok, let's try to fix one server at a time. I connected through the remote access, but I can't create a new LXC and I don't see an existing Frigate LXC to debug.

Let's start with the first server. What are the exact specifications (CPU/GPU/PVE+kernel version)? Which version of the Nvidia driver is installed on the host? (You can check by running nvidia-smi.)

@chrislawso

chrislawso commented Sep 18, 2024

Updated; see the post below.

@chrislawso

chrislawso commented Sep 18, 2024

Please log in with the root user and use the password I provided for you earlier to your other account. You will have root CLI access and you can run your bash commands on the test machine now.

The machine you have access to has the following details:
Xeon(R) CPU E5-2680 v4, Virtual Environment 8.2.4, Linux 6.8.12-1-pve
NVIDIA-SMI 550.90.12 Driver Version: 550.90.12 CUDA Version: 12.4

I gave your user all admin privileges and CLI admin privileges, but when I tested installing with the bash script it gave this message:
✗ Please run this script as root.
So please use the root user with your password.

@remz1337
Owner

Are you using a caching server of some sort? There seems to be a delay of a few minutes between the time I push fixes to my repo and the time your server can fetch the latest patch. Weird.

Also, I saw you already have a working frigate LXC (ID=100). From the logs everything seems to be working fine. Did you manage to successfully install it?

Anyway, I'm trying my frigate script again now. Will keep you posted.

@chrislawso

chrislawso commented Sep 19, 2024

Caching server? Nothing should be caching HTTP traffic. I do have a local DNS server that might be caching DNS resolution, but that should not affect changes on GitHub repos etc. I do not know what may be causing that delay, except maybe it is occurring with the repo provider.

The Frigate LXC 100 was a plain tteck install completed over the weekend, as I want to test the new Frigate release since I was hitting bugs with the previous release. That container is not from your remz1337 repo. If you like you can remove it or do whatever you need to.

The other LXCs 10111 and 10112 have the GPU device successfully passed through, and you can see the device paths that are required in order to pass the GPU into an LXC (see the config sketch after this list).
/dev/nvidia0
/dev/nvidiactl
/dev/nvidia-uvm-tools
/dev/nvidia-uvm
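
For reference, a rough sketch of what that passthrough looks like in the container config at /etc/pve/lxc/<CTID>.conf (the device major numbers 195/234 match the ls output above; the exact lines in my configs may differ slightly):

  lxc.cgroup2.devices.allow: c 195:* rwm
  lxc.cgroup2.devices.allow: c 234:* rwm
  lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
  lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
  lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
  lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file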

@remz1337
Owner

remz1337 commented Sep 19, 2024

Ok, I figured out what the issue is. Your driver version (550.90.12) is not available through the download link.

See here for the list of available drivers. The closest to yours would be 550.90.07:
https://download.nvidia.com/XFree86/Linux-x86_64/

Please reinstall the driver on the host with any version that is available through that link, then retry my frigate script.
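
For example, a rough sketch of reinstalling a matching version on the host (550.90.07 is used here only as an illustration; substitute whichever version actually exists under that link):

wget https://download.nvidia.com/XFree86/Linux-x86_64/550.90.07/NVIDIA-Linux-x86_64-550.90.07.run
bash NVIDIA-Linux-x86_64-550.90.07.run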

@chrislawso

Doing that now.

@chrislawso

chrislawso commented Sep 19, 2024

Before, when I was searching for Nvidia drivers, I found this page https://www.nvidia.com/Download/index.aspx and it provided me this link https://us.download.nvidia.com/tesla/550.90.12/NVIDIA-Linux-x86_64-550.90.12.run

I did not anticipate that different Nvidia web pages would offer different driver versions and that your script would not be able to find a matching Nvidia driver version. I will now use your link https://download.nvidia.com/XFree86/Linux-x86_64/ and get drivers from there.

On the host I installed the driver you suggested is closest (550.90.07) and I am now running your install script. I will continue to update you here today with how the installs conclude. Thank you for looking into this.

@remz1337
Owner

Classic Nvidia. It is recommended to have the exact same driver version across the host and the LXC. BTW, I also have a script to install the Nvidia driver on the host. It's currently hardcoded to install version 550.67:

https://raw.githubusercontent.com/remz1337/Proxmox/remz/misc/nvidia-drivers-host.sh

@chrislawso

chrislawso commented Sep 19, 2024

Unfortunately the script hit another issue. I am pasting it below.

When searching the internet for these related errors:

 -ERROR: failed to build the TensorRT engine!
 -[09/18/2024-19:36:27] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x601)

I found some forum posts with similar errors and I believe the issue is that your script is installing tensorrt-10.4.0, which may have dropped or deprecated support for the older Pascal Nvidia GPU in the system.

  ✓ Installed Nvidia Dependencies
 -Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
Requirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (1.26.4)
 \Collecting tensorrt
 |  Downloading https://pypi.nvidia.com/tensorrt/tensorrt-10.4.0.tar.gz (16 kB)
 \  Preparing metadata (setup.py) ... done
 \Collecting cuda-python
 -  Downloading cuda_python-12.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
 /Collecting cython
  Downloading Cython-3.0.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.2 kB)
 -Collecting nvidia-cuda-runtime-cu12
  Downloading https://pypi.nvidia.com/nvidia-cuda-runtime-cu12/nvidia_cuda_runtime_cu12-12.6.68-py3-none-manylinux2014_x86_64.whl (897 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.7/897.7 kB 4.7 MB/s eta 0:00:00
 |Collecting nvidia-cuda-runtime-cu11
  Downloading https://pypi.nvidia.com/nvidia-cuda-runtime-cu11/nvidia_cuda_runtime_cu11-11.8.89-py3-none-manylinux2014_x86_64.whl (875 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 875.6/875.6 kB 5.4 MB/s eta 0:00:00
 -Collecting nvidia-cublas-cu11
  Downloading https://pypi.nvidia.com/nvidia-cublas-cu11/nvidia_cublas_cu11-11.11.3.6-py3-none-manylinux2014_x86_64.whl (417.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 417.9/417.9 MB 13.4 MB/s eta 0:00:00
 /Collecting nvidia-cudnn-cu11
  Downloading nvidia_cudnn_cu11-9.4.0.58-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
 |Collecting onnx
  Downloading onnx-1.16.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
 \Requirement already satisfied: protobuf in /usr/local/lib/python3.9/dist-packages (5.28.2)
 |Collecting tensorrt-cu12==10.4.0 (from tensorrt)
 /  Downloading https://pypi.nvidia.com/tensorrt-cu12/tensorrt-cu12-10.4.0.tar.gz (18 kB)
 donereparing metadata (setup.py) ... -
 /Collecting tensorrt-cu12_bindings==10.4.0 (from tensorrt-cu12==10.4.0->tensorrt)
  Downloading https://pypi.nvidia.com/tensorrt-cu12-bindings/tensorrt_cu12_bindings-10.4.0-cp39-none-manylinux_2_17_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 4.2 MB/s eta 0:00:00
 \Collecting tensorrt-cu12_libs==10.4.0 (from tensorrt-cu12==10.4.0->tensorrt)
  Downloading https://pypi.nvidia.com/tensorrt-cu12-libs/tensorrt_cu12_libs-10.4.0-py2.py3-none-manylinux_2_17_x86_64.whl (2083.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 GB ? eta 0:00:00
 |Downloading cuda_python-12.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (24.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.2/24.2 MB 12.1 MB/s eta 0:00:00
Downloading Cython-3.0.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 8.7 MB/s eta 0:00:00
 /Downloading nvidia_cudnn_cu11-9.4.0.58-py3-none-manylinux2014_x86_64.whl (568.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 568.8/568.8 MB 5.9 MB/s eta 0:00:00
Downloading onnx-1.16.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.9/15.9 MB 15.9 MB/s eta 0:00:00
 -Building wheels for collected packages: tensorrt, tensorrt-cu12
 doneuilding wheel for tensorrt (setup.py) ... -
  Created wheel for tensorrt: filename=tensorrt-10.4.0-py2.py3-none-any.whl size=16347 sha256=c67da4c774be93b2c59f2a08ef5ded888d3e38bcedc403146d3119cbf62cc9d1
  Stored in directory: /root/.cache/pip/wheels/89/2e/c6/552a66fde839fe217340b71bfba843686a22900f4aa9367d76
 doneuilding wheel for tensorrt-cu12 (setup.py) ... -
  Created wheel for tensorrt-cu12: filename=tensorrt_cu12-10.4.0-py2.py3-none-any.whl size=17576 sha256=d09992e0f14da416e98ce075e74ee98b12ad3cb5d9dc1e8f68e33e2cdc11d638
  Stored in directory: /root/.cache/pip/wheels/2b/47/1b/94050cf8d1831f07ebc4b9df0ccd668b3cc0468441e9fef61c
Successfully built tensorrt tensorrt-cu12
 /Installing collected packages: tensorrt-cu12_bindings, cuda-python, onnx, nvidia-cuda-runtime-cu12, nvidia-cuda-runtime-cu11, nvidia-cublas-cu11, cython, tensorrt-cu12_libs, nvidia-cudnn-cu11, tensorrt-cu12, tensorrt
 -Successfully installed cuda-python-12.6.0 cython-3.0.11 nvidia-cublas-cu11-11.11.3.6 nvidia-cuda-runtime-cu11-11.8.89 nvidia-cuda-runtime-cu12-12.6.68 nvidia-cudnn-cu11-9.4.0.58 onnx-1.16.2 tensorrt-10.4.0 tensorrt-cu12-10.4.0 tensorrt-cu12_bindings-10.4.0 tensorrt-cu12_libs-10.4.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
 -Selecting previously unselected package nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6.
(Reading database ... 58640 files and directories currently installed.)
Preparing to unpack nv-tensorrt-local-repo-amd64.deb ...
 -Unpacking nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6 (1.0-1) ...
 |Setting up nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6 (1.0-1) ...
 \
The public nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6 GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6/nv-tensorrt-local-8C0C9E14-keyring.gpg /usr/share/keyrings/

Get:1 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  InRelease [1,572 B]
Get:1 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  InRelease [1,572 B]
Get:2 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  Packages [5,361 B]              
Hit:3 https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64  InRelease                                                            
Hit:4 http://security.debian.org bullseye-security InRelease                                                                                 
Hit:5 http://deb.debian.org/debian bullseye InRelease                                                                   
Hit:6 http://deb.debian.org/debian bullseye-updates InRelease                                       
Hit:7 https://packages.cloud.google.com/apt coral-edgetpu-stable InRelease
Hit:8 https://deb.nodesource.com/node_20.x nodistro InRelease
Reading package lists... Done        
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
 |The following additional packages will be installed:
  libnvinfer-dev libnvinfer-dispatch-dev libnvinfer-dispatch10 libnvinfer-headers-dev libnvinfer-headers-plugin-dev libnvinfer-lean-dev
  libnvinfer-lean10 libnvinfer-plugin-dev libnvinfer-plugin10 libnvinfer-vc-plugin-dev libnvinfer-vc-plugin10 libnvinfer10 libnvonnxparsers-dev
  libnvonnxparsers10
 /The following NEW packages will be installed:
  libnvinfer-dev libnvinfer-dispatch-dev libnvinfer-dispatch10 libnvinfer-headers-dev libnvinfer-headers-plugin-dev libnvinfer-lean-dev
  libnvinfer-lean10 libnvinfer-plugin-dev libnvinfer-plugin10 libnvinfer-vc-plugin-dev libnvinfer-vc-plugin10 libnvinfer10 libnvonnxparsers-dev
  libnvonnxparsers10 tensorrt-dev
0 upgraded, 15 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/2,574 MB of archives.
After this operation, 6,850 MB of additional disk space will be used.
Get:1 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-headers-dev 10.4.0.26-1+cuda12.6 [109 kB]
Get:2 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer10 10.4.0.26-1+cuda12.6 [1,256 MB]                                     
Get:3 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-dev 10.4.0.26-1+cuda12.6 [1,262 MB]                                   
Get:4 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-dispatch10 10.4.0.26-1+cuda12.6 [211 kB]
Get:5 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-dispatch-dev 10.4.0.26-1+cuda12.6 [120 kB]
Get:6 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-headers-plugin-dev 10.4.0.26-1+cuda12.6 [6,060 B]
Get:7 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-lean10 10.4.0.26-1+cuda12.6 [8,646 kB]
Get:8 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-lean-dev 10.4.0.26-1+cuda12.6 [22.0 MB]
Get:9 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-plugin10 10.4.0.26-1+cuda12.6 [10.2 MB]
Get:10 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-plugin-dev 10.4.0.26-1+cuda12.6 [10.5 MB]
Get:11 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-vc-plugin10 10.4.0.26-1+cuda12.6 [266 kB]
Get:12 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvinfer-vc-plugin-dev 10.4.0.26-1+cuda12.6 [122 kB]
Get:13 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvonnxparsers10 10.4.0.26-1+cuda12.6 [1,308 kB]
Get:14 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  libnvonnxparsers-dev 10.4.0.26-1+cuda12.6 [2,146 kB]
Get:15 file:/var/nv-tensorrt-local-repo-ubuntu2204-10.4.0-cuda-12.6  tensorrt-dev 10.4.0.26-1+cuda12.6 [2,930 B]
 -Selecting previously unselected package libnvinfer-headers-dev.
(Reading database ... 58676 files and directories currently installed.)
Preparing to unpack .../00-libnvinfer-headers-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
 /Unpacking libnvinfer-headers-dev (10.4.0.26-1+cuda12.6) ...
 \Selecting previously unselected package libnvinfer10.
Preparing to unpack .../01-libnvinfer10_10.4.0.26-1+cuda12.6_amd64.deb ...
 \Unpacking libnvinfer10 (10.4.0.26-1+cuda12.6) ...
 \Selecting previously unselected package libnvinfer-dev.
Preparing to unpack .../02-libnvinfer-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
 \Unpacking libnvinfer-dev (10.4.0.26-1+cuda12.6) ...
 /Selecting previously unselected package libnvinfer-dispatch10.
Preparing to unpack .../03-libnvinfer-dispatch10_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-dispatch10 (10.4.0.26-1+cuda12.6) ...
 \Selecting previously unselected package libnvinfer-dispatch-dev.
Preparing to unpack .../04-libnvinfer-dispatch-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
 |Unpacking libnvinfer-dispatch-dev (10.4.0.26-1+cuda12.6) ...
 -Selecting previously unselected package libnvinfer-headers-plugin-dev.
Preparing to unpack .../05-libnvinfer-headers-plugin-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
 \Unpacking libnvinfer-headers-plugin-dev (10.4.0.26-1+cuda12.6) ...
 /Selecting previously unselected package libnvinfer-lean10.
Preparing to unpack .../06-libnvinfer-lean10_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-lean10 (10.4.0.26-1+cuda12.6) ...
 /Selecting previously unselected package libnvinfer-lean-dev.
Preparing to unpack .../07-libnvinfer-lean-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-lean-dev (10.4.0.26-1+cuda12.6) ...
 -Selecting previously unselected package libnvinfer-plugin10.
Preparing to unpack .../08-libnvinfer-plugin10_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-plugin10 (10.4.0.26-1+cuda12.6) ...
 |Selecting previously unselected package libnvinfer-plugin-dev.
Preparing to unpack .../09-libnvinfer-plugin-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-plugin-dev (10.4.0.26-1+cuda12.6) ...
 -Selecting previously unselected package libnvinfer-vc-plugin10.
Preparing to unpack .../10-libnvinfer-vc-plugin10_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-vc-plugin10 (10.4.0.26-1+cuda12.6) ...
 /Selecting previously unselected package libnvinfer-vc-plugin-dev.
Preparing to unpack .../11-libnvinfer-vc-plugin-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvinfer-vc-plugin-dev (10.4.0.26-1+cuda12.6) ...
 \Selecting previously unselected package libnvonnxparsers10.
Preparing to unpack .../12-libnvonnxparsers10_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvonnxparsers10 (10.4.0.26-1+cuda12.6) ...
 -Selecting previously unselected package libnvonnxparsers-dev.
Preparing to unpack .../13-libnvonnxparsers-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
Unpacking libnvonnxparsers-dev (10.4.0.26-1+cuda12.6) ...
 |Selecting previously unselected package tensorrt-dev.
Preparing to unpack .../14-tensorrt-dev_10.4.0.26-1+cuda12.6_amd64.deb ...
 /Unpacking tensorrt-dev (10.4.0.26-1+cuda12.6) ...
 /Setting up libnvinfer-headers-dev (10.4.0.26-1+cuda12.6) ...
 -Setting up libnvinfer10 (10.4.0.26-1+cuda12.6) ...
Setting up libnvinfer-plugin10 (10.4.0.26-1+cuda12.6) ...
 \Setting up libnvinfer-vc-plugin10 (10.4.0.26-1+cuda12.6) ...
Setting up libnvonnxparsers10 (10.4.0.26-1+cuda12.6) ...
 |Setting up libnvinfer-dispatch10 (10.4.0.26-1+cuda12.6) ...
Setting up libnvinfer-dispatch-dev (10.4.0.26-1+cuda12.6) ...
 /Setting up libnvinfer-dev (10.4.0.26-1+cuda12.6) ...
Setting up libnvinfer-lean10 (10.4.0.26-1+cuda12.6) ...
 -Setting up libnvonnxparsers-dev (10.4.0.26-1+cuda12.6) ...
Setting up libnvinfer-headers-plugin-dev (10.4.0.26-1+cuda12.6) ...
 \Setting up libnvinfer-lean-dev (10.4.0.26-1+cuda12.6) ...
Setting up libnvinfer-plugin-dev (10.4.0.26-1+cuda12.6) ...
 |Setting up libnvinfer-vc-plugin-dev (10.4.0.26-1+cuda12.6) ...
Setting up tensorrt-dev (10.4.0.26-1+cuda12.6) ...
 /Processing triggers for libc-bin (2.36-9+deb12u8) ...
 ✓ Installed TensorRT
 \g++ is already the newest version (4:10.2.1-1).tience)   
g++ set to manually installed.
 /The following NEW packages will be installed:
  python-is-python3
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 2,800 B of archives.
After this operation, 13.3 kB of additional disk space will be used.
 \Selecting previously unselected package python-is-python3.
(Reading database ... 58765 files and directories currently installed.)
Preparing to unpack .../python-is-python3_3.9.2-1_all.deb ...
 -Unpacking python-is-python3 (3.9.2-1) ...
 \Setting up python-is-python3 (3.9.2-1) ...
 |Processing triggers for man-db (2.9.4-2) ...
+ SCRIPT_DIR=/usr/local/src/tensorrt_demos
+ git clone --depth 1 https://github.com/NateMeyer/tensorrt_demos.git -b conditional_download
Cloning into 'tensorrt_demos'...
 \remote: Enumerating objects: 118, done.
remote: Counting objects: 100% (118/118), done.
remote: Compressing objects: 100% (111/111), done.
remote: Total 118 (delta 13), reused 59 (delta 5), pack-reused 0 (from 0)
Receiving objects: 100% (118/118), 192.06 MiB | 9.19 MiB/s, done.
Resolving deltas: 100% (13/13), done.
Updating files: 100% (109/109), done.
+ bash /opt/frigate/fix_tensorrt.sh
+ '[' '!' -e /usr/local/cuda ']'
+ cd ./tensorrt_demos/plugins
++ nproc
 |+ make all -j4 computes=
computes: 
NVCCFLAGS: 
nvcc -ccbin g++ -I"/usr/local/cuda/include" -I"/usr/local/lib/python3.9/dist-packages/tensorrt_libs/include" -I"/usr/local/include" -I"plugin"  -Xcompiler -fPIC -c -o yolo_layer.o yolo_layer.cu
 /yolo_layer.h(89): warning #997-D: function "nvinfer1::IPluginV2Ext::configurePlugin(const nvinfer1::Dims *, int32_t, const nvinfer1::Dims *, int32_t, const nvinfer1::DataType *, const nvinfer1::DataType *, const bool *, const bool *, nvinfer1::PluginFormat, int32_t)" is hidden by "nvinfer1::YoloLayerPlugin::configurePlugin" -- virtual function override intended?
              void configurePlugin(const PluginTensorDesc* in, int nbInput, const PluginTensorDesc* out, int nbOutput) noexcept override {}
                   ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

 -yolo_layer.h(89): warning #997-D: function "nvinfer1::IPluginV2Ext::configurePlugin(const nvinfer1::Dims *, int32_t, const nvinfer1::Dims *, int32_t, const nvinfer1::DataType *, const nvinfer1::DataType *, const __nv_bool *, const __nv_bool *, nvinfer1::PluginFormat, int32_t)" is hidden by "nvinfer1::YoloLayerPlugin::configurePlugin" -- virtual function override intended?
              void configurePlugin(const PluginTensorDesc* in, int nbInput, const PluginTensorDesc* out, int nbOutput) noexcept override {}
                   ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

 /yolo_layer.h:52:48: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
   52 |             IPluginV2IOExt* clone() const NOEXCEPT override;
      |                                                ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.h:52:48: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
   52 |             IPluginV2IOExt* clone() const NOEXCEPT override;
      |                                                ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.h:127:100: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
  127 |             IPluginV2IOExt* createPlugin(const char* name, const PluginFieldCollection* fc) NOEXCEPT override;
      |                                                                                                    ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.h:129:117: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
  129 |             IPluginV2IOExt* deserializePlugin(const char* name, const void* serialData, size_t serialLength) NOEXCEPT override;
      |                                                                                                                     ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.cu:76:48: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
   76 |     IPluginV2IOExt* YoloLayerPlugin::clone() const NOEXCEPT
      |                                                ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.cu:293:100: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
  293 |     IPluginV2IOExt* YoloPluginCreator::createPlugin(const char* name, const PluginFieldCollection* fc) NOEXCEPT
      |                                                                                                    ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
yolo_layer.cu:363:117: warning: ‘IPluginV2IOExt’ is deprecated [-Wdeprecated-declarations]
  363 |     IPluginV2IOExt* YoloPluginCreator::deserializePlugin(const char* name, const void* serialData, size_t serialLength) NOEXCEPT
      |                                                                                                                     ^~~~~~~~
/usr/include/x86_64-linux-gnu/NvInferRuntimePlugin.h:712:22: note: declared here
  712 | class TRT_DEPRECATED IPluginV2IOExt : public IPluginV2Ext
      |                      ^~~~~~~~~~~~~~
 \g++ -shared -o libyolo_layer.so yolo_layer.o -L"/usr/local/cuda/lib64" -L"/usr/local/lib/python3.9/dist-packages/tensorrt_libs/lib" -L"/usr/local/lib" -Wl,--start-group -lnvinfer -lnvinfer_plugin -lcudnn -lcublas -lnvToolsExt -lcudart -lrt -ldl -lpthread -Wl,--end-group
 \+ cp libyolo_layer.so /usr/local/lib/libyolo_layer.so
Generating the following TRT Models: yolov4-tiny-288,yolov4-tiny-416,yolov7-tiny-416,yolov7-320
Downloading yolo weights
 /
Creating yolov4-tiny-288.cfg and yolov4-tiny-288.weights
Creating yolov4-tiny-416.cfg and yolov4-tiny-416.weights
Creating yolov7-tiny-416.cfg and yolov7-tiny-416.weights
Creating yolov7-320.cfg and yolov7-320.weights

Done.
 \/usr/local/src/tensorrt_demos/yolo/yolo_to_onnx.py:486: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
  param_name, param_data, param_data_shape = self._load_one_param_type(
 -
Generating yolov4-tiny-288.trt. This may take a few minutes.

 /ERROR: failed to build the TensorRT engine!
 -[09/18/2024-19:36:06] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x601)
Loading the ONNX file...
Adding yolo_layer plugins.
Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
 |/usr/local/src/tensorrt_demos/yolo/yolo_to_onnx.py:486: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
  param_name, param_data, param_data_shape = self._load_one_param_type(
 /
Generating yolov4-tiny-416.trt. This may take a few minutes.

 \ERROR: failed to build the TensorRT engine!
 |[09/18/2024-19:36:10] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x601)
Loading the ONNX file...
Adding yolo_layer plugins.
Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
 //usr/local/src/tensorrt_demos/yolo/yolo_to_onnx.py:486: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
  param_name, param_data, param_data_shape = self._load_one_param_type(
 \
Generating yolov7-tiny-416.trt. This may take a few minutes.

 |ERROR: failed to build the TensorRT engine!
 |[09/18/2024-19:36:14] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x601)
Loading the ONNX file...
Adding yolo_layer plugins.
Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
 -/usr/local/src/tensorrt_demos/yolo/yolo_to_onnx.py:486: DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead.
  param_name, param_data, param_data_shape = self._load_one_param_type(
 -
Generating yolov7-320.trt. This may take a few minutes.

 -ERROR: failed to build the TensorRT engine!
 -[09/18/2024-19:36:27] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Unsupported SM: 0x601)
Loading the ONNX file...
Adding yolo_layer plugins.
Adding a concatenated output as "detections".
Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)
Available tensorrt models:
ls: cannot access '*.trt': No such file or directory

[ERROR] in line 192: exit code 0: while executing command $STD bash /opt/frigate/docker/tensorrt/detector/rootfs/etc/s6-overlay/s6-rc.d/trt-model-prepare/run

The above results are from the machine you have access to, which has a newer motherboard.

I have a different, older server on which I tried running your script with default parameters and with advanced parameters, and it keeps failing early with:

Enable Verbose Mode: yes
Creating a Frigate LXC using the above advanced settings
 ✓ Using local for Template Storage.
 ✓ Using rz6by4tb for Container Storage.
 ✓ Updated LXC Template List
 ✓ LXC Container 10297 was successfully created.
 -/dev/dri/renderD128 is not a device
 |
[ERROR] in line 730: exit code 0: while executing command pct start "$CTID"

How can I change your script to install older versions of tensorrt in order to test it with older Nvidia GPUs, i.e. Pascal cards? On the system with Pascal cards I installed Frigate via docker compose with the image frigate:stable-tensorrt, which is able to run on older GPUs, and inside that Frigate docker container an older version is installed automatically, i.e.

root@773ff3f51abd:/opt/frigate# pip show tensorrt
Name: tensorrt
Version: 8.5.3.1
Summary: A high performance deep learning inference library
Home-page: https://developer.nvidia.com/tensorrt
Author: NVIDIA Corporation
Author-email: 
License: Proprietary
Location: /usr/local/lib/python3.9/dist-packages
Requires: nvidia-cublas-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11
Required-by: 
root@773ff3f51abd:/opt/frigate# pip list | grep -i tensorrt
tensorrt                 8.5.3.1

Strangely, I found that older versions for download have a different URL path structure than what you use at line 151: trt_url="https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/${TRT_VER}/local_repo/nv-tensorrt-local-repo-ubuntu2204-${TRT_VER}-cuda-${trt_cuda}.0-1_amd64.deb"

The old version files are protected behind secure login access, for example at the URL
https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.5.3/local_repos/nv-tensorrt-local-repo-ubuntu2204-8.5.3-cuda-11.8_1.0-1_amd64.deb

In your frigate-install.sh at line 139 (https://github.com/remz1337/Proxmox/blob/remz/install/frigate-install.sh) I see your "Installing TensorRT" step; you created it with 'Use latest TensorRT version (instead of fixed v8)'. I am trying to make sense of the script, but I have zero experience or knowledge of making something like this work or modifying it for older versions, considering that "NVIDIA Pascal (SM 6.x) devices are deprecated in TensorRT 8.6" according to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html. Here are the release notes for TensorRT, which identify which versions stop supporting which cards: https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html#rel-10-4-0

I see that on line 122 (TARGET_CUDA_VER=$(echo $NVD_VER_CUDA | sed 's|\.|-|g')) and line 144 (TRT_VER=$(pip freeze | grep tensorrt== | sed "s|tensorrt==||g")) your script gets the versions.

Unfortunately I do not really understand most of the scripting and am trying to test changes. I cloned your repo and modified several lines to hardcode older version numbers; I set up a local webserver hosting the older file nv-tensorrt-local-repo-ubuntu2204-8.5.3-cuda-11.8_1.0-1_amd64.deb and am testing. In addition to the screen output during install and the files at /var/log/, are there other places with information to review for why the install is failing? I am unable to identify where or what is failing during install.
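
For reference, this is roughly the kind of pinning I am attempting in my fork (variable names are taken from frigate-install.sh as quoted above; the 8.5.3/11.8 values and the local URL are my own guesses for Pascal support, not something the upstream script provides):

  TRT_VER="8.5.3"
  trt_cuda="11.8_1"
  # Older repo .debs require an NVIDIA developer login, so serve a pre-downloaded copy from a local webserver instead:
  trt_url="http://<local-webserver>/nv-tensorrt-local-repo-ubuntu2204-${TRT_VER}-cuda-${trt_cuda}.0-1_amd64.deb"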

@chrislawso

chrislawso commented Sep 19, 2024

Good news! Your script finally completed successfully after I installed a newer GPU similar to the generation you are using in your system, a T1000. This leads me to think your script requires a newer generation Nvidia GPU. When I use older Pascal generation GPUs the script errors out, probably due to the incompatible TensorRT version or something else.

 ✓ Built Nginx
 /+ tempio_version=2021.09.0
+ [[ amd64 == \a\m\d\6\4 ]]
+ arch=amd64
+ mkdir -p /usr/local/tempio/bin
+ wget -q -O /usr/local/tempio/bin/tempio https://github.com/home-assistant/tempio/releases/download/2021.09.0/tempio_amd64
 |+ chmod 755 /usr/local/tempio/bin/tempio
 ✓ Installed Tempio
 ✓ Configured Services
 ✓ Customized Container
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
 \0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
 ✓ Cleaned
Don't forget to edit the Frigate config file (/config/config.yml) and reboot. Example configuration at https://docs.frigate.video/configuration/
 ✓ Set Container to Normal Resources
 ✓ Completed Successfully!

Frigate should be reachable by going to the following URL.

Also, I made a backup of this successful LXC installation and copied it to a different Proxmox host with an older Pascal GPU, and have been unsuccessfully trying to manually remove the installed newer TensorRT 10.4 and install older versions. Unfortunately I can't find the correct commands to remove all the Python, CUDA, cuDNN, and TensorRT files and install older versions, due to hitting error messages and required dependencies.

I will continue testing because I wanted to use old Pascal GPUs for Frigate and not new GPUs.

@chrislawso

chrislawso commented Sep 21, 2024

I forked your repo to https://github.com/chrislawso/Proxmoxremz/tree/remz and I am trying to make it install an older TensorRT for old Pascal Nvidia GPUs, and I am failing.

I made some changes to the fork where I found lines related to the variables setting the versions for TensorRT, CUDA, etc. Unfortunately I do not have your bash skill and I am failing to modify your files.

For example, at the forked file https://github.com/chrislawso/Proxmoxremz/blob/remz/install/frigate-install.sh
at lines 155-156

  #original trt_url="https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/${TRT_VER}/local_repo/nv-tensorrt-local-repo-ubuntu2204-${TRT_VER}-cuda-${trt_cuda}.0-1_amd64.deb"
  trt_url="http://172.30.132.221:53842/downloadFile?id=q3szL8xuhiFa8qp"

I have a local webserver serving nv-tensorrt-local-repo-ubuntu2204-8.5.3-cuda-11.8_1.0-1_amd64.deb

I modified several other lines in your scripts that I believed needed to be changed, for example lines 152-153:

  #trt_cuda=${trt_cuda}_1
  trt_cuda="11.8_1"

and I modified many more lines.

When I run my fork of your script it appears to complete successfully, but it really fails at an early stage without showing errors. I do not know how to correctly make the changes, and I am unable to find the error where it fails using the /var/log files or the CLI verbose output.

Are you able to help me with the script? How can I change it to install old TensorRT, old CUDA, older versions, etc.? I will keep the remote test machine on for you; if you ever have a moment I would genuinely appreciate it if you could help look at it.

Thank you

@remz1337
Owner

remz1337 commented Sep 22, 2024

Ok, let's try an older driver that comes with CUDA 11. The TensorRT docs say the minimum supported version is 450, so maybe try
https://download.nvidia.com/XFree86/Linux-x86_64/455.38/

(Please report the output of nvidia-smi on the host; I want to double check the CUDA version.)

What is the exact make/model of that old Pascal GPU you're trying to pass through?

Edit:
Assuming the CUDA version that comes with driver v455.38 is 11, please test my script again. I've just pushed an update to allow older driver versions (with a warning message).

@chrislawso

chrislawso commented Sep 22, 2024

Ok I will now install the driver version you linked.

On the machine you have remote access to, the GPU is an Nvidia P40; on other machines I also want to use a P100 and a P4.

At this moment here is the CLI output you requested:

root@pve:~# nvidia-smi
Sun Sep 22 13:09:33 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla P40                      Off |   00000000:82:00.0 Off |                  Off |
| N/A   38C    P0             49W /  250W |       0MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

@chrislawso

chrislawso commented Sep 22, 2024

I uninstalled the newer driver with nvidia-uninstall

When I install the older driver you identified https://download.nvidia.com/XFree86/Linux-x86_64/455.38/NVIDIA-Linux-x86_64-455.38.run

I am getting Nvidia driver installation failures. I will continue working to troubleshoot this driver install.

ERROR: Failed to run `/usr/sbin/dkms build -m nvidia -v 455.38 -k 6.8.12-1-pve`: Sign command: /lib/modules/6.8.12-1-pve/build/scripts/sign-file
         Signing key: /var/lib/dkms/mok.key
         Public certificate (MOK): /var/lib/dkms/mok.pub
         Certificate or key are missing, generating self signed certificate for MOK...
                                                                                                                                                                                                                                    
         Building module:                                                   
         Cleaning build area...                                                                                                                                                                                                     
         'make' -j32 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=6.8.12-1-pve IGNORE_CC_MISMATCH='' modules.......(bad exit status: 2)
         Error! Bad return status for module build on kernel: 6.8.12-1-pve (x86_64)
         Consult /var/lib/dkms/nvidia/455.38/build/make.log for more information.

  ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more information.
                                                                                                                                                                                                                                    
        
 ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at          
         www.nvidia.com.

/var/log/nvidia-installer.log

            from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/linux_nvswitch.h:27,
                 from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/procfs_nvswitch.c:24:
/tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv.h:25:10: fatal error: stdarg.h: No such file or directory
   25 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/procfs_nvswitch.o] Error 1
In file included from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv-linux.h:16,
                 from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nvlink_linux.c:30:
/tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv.h:25:10: fatal error: stdarg.h: No such file or directory
   25 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nvlink_linux.o] Error 1
In file included from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv-linux.h:16,
                 from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv-pci.h:15,
                 from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nv-pci.c:13:
/tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/nv.h:25:10: fatal error: stdarg.h: No such file or directory
   25 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nv-pci.o] Error 1
In file included from /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nv-i2c.c:16:
/tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/common/inc/os-interface.h:27:10: fatal error: stdarg.h: No such file or directory
   27 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/nvidia/nv-i2c.o] Error 1
make[3]: Target '/tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel/' not remade because of errors.
make[2]: *** [/usr/src/linux-headers-6.8.12-1-pve/Makefile:1925: /tmp/selfgz3188/NVIDIA-Linux-x86_64-455.38/kernel] Error 2
make[2]: Target 'modules' not remade because of errors.
make[1]: *** [Makefile:240: __sub-make] Error 2
make[1]: Target 'modules' not remade because of errors.
make[1]: Leaving directory '/usr/src/linux-headers-6.8.12-1-pve'
make: *** [Makefile:81: modules] Error 2
ERROR: The nvidia kernel module was not created.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

/var/lib/dkms/nvidia/455.38/build/make.log

compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-dma.o] Error 1
In file included from /var/lib/dkms/nvidia/455.38/build/nvidia/nv-mmap.c:14:
/var/lib/dkms/nvidia/455.38/build/common/inc/os-interface.h:27:10: fatal error: stdarg.h: No such file or directory
   27 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
In file included from /var/lib/dkms/nvidia/455.38/build/nvidia/nv-p2p.c:14:
/var/lib/dkms/nvidia/455.38/build/common/inc/os-interface.h:27:10: fatal error: stdarg.h: No such file or directory
   27 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-mmap.o] Error 1
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-p2p.o] Error 1
In file included from /var/lib/dkms/nvidia/455.38/build/nvidia/nv-pat.c:14:
/var/lib/dkms/nvidia/455.38/build/common/inc/os-interface.h:27:10: fatal error: stdarg.h: No such file or directory
   27 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-pat.o] Error 1
In file included from /var/lib/dkms/nvidia/455.38/build/common/inc/nv-linux.h:16,
                 from /var/lib/dkms/nvidia/455.38/build/common/inc/nv-pci.h:15,
                 from /var/lib/dkms/nvidia/455.38/build/nvidia/nv-pci.c:13:
/var/lib/dkms/nvidia/455.38/build/common/inc/nv.h:25:10: fatal error: stdarg.h: No such file or directory
   25 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-pci.o] Error 1
In file included from /var/lib/dkms/nvidia/455.38/build/nvidia/nv-i2c.c:16:
/var/lib/dkms/nvidia/455.38/build/common/inc/os-interface.h:27:10: fatal error: stdarg.h: No such file or directory
   27 | #include <stdarg.h>
      |          ^~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.build:243: /var/lib/dkms/nvidia/455.38/build/nvidia/nv-i2c.o] Error 1
make[2]: *** [/usr/src/linux-headers-6.8.12-1-pve/Makefile:1925: /var/lib/dkms/nvidia/455.38/build] Error 2
make[1]: *** [Makefile:240: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.8.12-1-pve'
make: *** [Makefile:81: modules] Error 2

@chrislawso

chrislawso commented Sep 22, 2024

Installing the old drivers is failing and I am not making progress. A quick search suggests that old nvidia drivers need to be patched before they will build against new kernels (a rough sketch of such a patch is below).
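
The `stdarg.h: No such file or directory` errors in the logs above are the usual symptom of this: kernels since 5.15 no longer pull the compiler's `stdarg.h` into module builds and provide `<linux/stdarg.h>` instead, so old driver sources that include `<stdarg.h>` fail to compile. A possible workaround, purely as a sketch (it may well not be enough on a 6.8 kernel), is to extract the installer and patch the includes before building:

```bash
# Unpack the .run installer without installing anything
./NVIDIA-Linux-x86_64-455.38.run --extract-only
cd NVIDIA-Linux-x86_64-455.38

# Redirect the old include to the kernel-provided header (kernel >= 5.15)
grep -rl '#include <stdarg.h>' kernel/ | \
  xargs sed -i 's|#include <stdarg.h>|#include <linux/stdarg.h>|'

# Re-run the installer from the patched tree
./nvidia-installer --dkms
```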

More importantly, I do not believe installing old drivers is the resolution: I have Proxmox machines with Pascal GPUs running new NVIDIA drivers, and the normal Frigate docker compose installation installs and functions inside a Proxmox LXC using the TensorRT image
image: ghcr.io/blakeblackshear/frigate:stable-tensorrt

Somehow during initialization the Frigate devs are able to recognize and install the correct dependencies for old Pascal GPUs inside the docker container, even when the Proxmox host and the LXC are running new NVIDIA drivers.
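
For context, the compose setup for that image is roughly equivalent to this docker run sketch (paths, port, and shm size are placeholders; `--gpus` requires the nvidia-container-toolkit on the host/LXC):

```bash
docker run -d \
  --name frigate \
  --restart unless-stopped \
  --gpus all \
  --shm-size=256m \
  -v "$(pwd)/config:/config" \
  -v "$(pwd)/storage:/media/frigate" \
  -p 5000:5000 \
  ghcr.io/blakeblackshear/frigate:stable-tensorrt
```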

For example, here is the CLI output from inside a working frigate:stable-tensorrt docker image, with the following versions:

root@a16d682a78e5:/opt/frigate# pip show tensorrt
Name: tensorrt
Version: 8.5.3.1
Summary: A high performance deep learning inference library
Home-page: https://developer.nvidia.com/tensorrt
Author: NVIDIA Corporation
Author-email: 
License: Proprietary
Location: /usr/local/lib/python3.9/dist-packages
Requires: nvidia-cublas-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11
Required-by: 
root@a16d682a78e5:/opt/frigate# pip list | grep -i tensorrt
tensorrt                 8.5.3.1
root@a16d682a78e5:/opt/frigate# nvidia-smi
Sun Sep 22 18:29:14 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla P4                       Off |   00000000:42:00.0 Off |                    0 |
| N/A   75C    P0             43W /   75W |    3010MiB /   7680MiB |     18%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
root@a16d682a78e5:/opt/frigate# 

@remz1337
Owner

Although I understand your point, I had a really hard time reverse engineering this part (TensorRT) from the official docker image, which led to a complicated install procedure requiring matching CUDA versions across different libraries. Hence my recommendation to find a driver that ships with CUDA 11, hoping that it would be supported by your GPU.

After some digging, it looks like driver versions between 450 and 525 should come with CUDA 11.
See https://docs.nvidia.com/deploy/cuda-compatibility/

So let's try a newer driver, but <525
Maybe give this one a shot:
https://download.nvidia.com/XFree86/Linux-x86_64/520.56.06/
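
Something along these lines should fetch and install it (sketch only; the header package assumes a Proxmox host):

```bash
# Kernel headers are needed for the module build on Proxmox
apt install -y pve-headers-$(uname -r)

# Download and run the 520.56.06 installer from the directory above
wget https://download.nvidia.com/XFree86/Linux-x86_64/520.56.06/NVIDIA-Linux-x86_64-520.56.06.run
chmod +x NVIDIA-Linux-x86_64-520.56.06.run
./NVIDIA-Linux-x86_64-520.56.06.run --dkms
```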

I want to confirm this hypothesis first before delving into complex script changes. Thanks

@chrislawso

chrislawso commented Sep 23, 2024

Absolutely understood, and much respect; what you have already engineered here is very impressive.

Now testing install https://download.nvidia.com/XFree86/Linux-x86_64/520.56.06/NVIDIA-Linux-x86_64-520.56.06.run

Update: that driver version also fails to install on the Proxmox host (Virtual Environment 8.2.4, Linux 6.8.12-2-pve, 2024-09-05T10:03Z):

  ERROR: An error occurred while performing the step: "Building kernel modules". See /var/log/nvidia-installer.log for details.
                                                                                                                                                                                                                                    
 ERROR: An error occurred while performing the step: "Checking to see whether the nvidia kernel module was successfully built". See /var/log/nvidia-installer.log for details.
                                                                                                                                                                                                                                    
   ERROR: The nvidia kernel module was not created.

  ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at          
         www.nvidia.com.
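
To pull the actual compiler errors out of that log (and confirm whether it is the same stdarg.h failure as before), something like this should do:

```bash
# Show only the error lines the installer recorded
grep -nE 'fatal error|Error [0-9]' /var/log/nvidia-installer.log | tail -n 40
```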

@chrislawso

chrislawso commented Sep 23, 2024

Updated the post above; the 520.56.06 NVIDIA driver install also fails.

Also, unless you have a need to delve into complex script changes, please disregard all the above issues with old GPUs and old drivers. I have already found that your script installs easily with newer Turing-generation NVIDIA GPUs, so I will simply use newer cards to test your working environment against the Frigate docker release. Your script works within certain hardware requirements, and that is already great.

I don't know your motivation for creating and engineering this TensorRT script. In your working environment, is your system running with no log errors and no ffmpeg issues?

I wanted to try your script because I noticed that the Docker abstraction layer (Frigate in docker with the CPU detector) carries a performance penalty compared to tteck's non-docker CPU detector, and I wanted to test and benchmark how your TensorRT setup performs and scales versus docker TensorRT on the same hardware.

Frigate 0.14.x in my environment has several other problems I am working to resolve, e.g. significant ffmpeg errors, and the devs are already working on Frigate 0.15, which brings many changes, e.g. support for alternate, configurable ffmpeg versions ( blakeblackshear/frigate#13722 ).
Using older NVIDIA GPUs is unimportant because your script does work on recent NVIDIA GPUs, and I need to focus on testing different ffmpeg versions and dependencies and finding a solution for all the ffmpeg errors generated in Frigate.

Please be careful not to burn out working on this complex script, and keep the upcoming Frigate releases in mind. If you ever want to do more testing on more machines, please message me and I will be happy to help or provide you with more remote access.

Thank you.

@remz1337
Owner

I appreciate that. If you still want to continue investigating, the next step would be to find a way to install older drivers (like 520). From the logs you posted, it looks like an issue between DKMS and the newer Linux kernel. It could probably be worked around by downgrading the kernel of your machine (not ideal); otherwise there may be some other workaround, but it would require some research.
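
For reference, pinning an older kernel on Proxmox would look roughly like this; the package name and version below are assumptions and depend on the PVE release:

```bash
# See which kernels are installed / bootable
proxmox-boot-tool kernel list

# Install an older kernel series (name differs between PVE 7.x
# "pve-kernel-*" and PVE 8.x "proxmox-kernel-*"; version is a placeholder)
apt install proxmox-kernel-6.5

# Pin it so the host keeps booting that kernel, then reboot
proxmox-boot-tool kernel pin 6.5.13-6-pve
proxmox-boot-tool refresh
reboot
```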

Indeed, it is working on my server with TensorRT (using a T600, I'm seeing inference speeds around 6 ms with a single camera).

Unless you want to spend some time trying to figure out how to install older drivers, I might close this issue and add a disclaimer in the README that older GPUs are not supported.
