Hi there. Ever since the build issue caused by the change in kernel 6.12, as described in #86, I have not been able to get vendor-reset to work on my RX Vega 56. I changed the affected line as described in #86 and got the module to build with DKMS, but it doesn't reset the GPU properly.
Things I have attempted:
- Uninstalling vendor-reset from DKMS and reinstalling it
- Removing it from modprobe, rebooting, and loading it again
- Verifying that it shows up in sudo dmesg | grep reset
- Verifying the reset_method is device_specific
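The last two checks can be collected into a small shell sketch for anyone who wants to repeat them. The helper names (module_loaded, has_device_specific) and the overridable directory argument are hypothetical, made up for illustration; the sketch assumes only that loaded modules appear under /sys/module and that reset_method holds a space-separated list of methods.

```shell
#!/bin/sh
# Sketch only: helper names and the optional directory argument are hypothetical.

# Succeeds if a directory for the module exists under /sys/module.
# Loaded modules appear there with dashes replaced by underscores.
module_loaded() {
    mod_name=$(printf '%s' "$1" | tr '-' '_')
    mod_dir=${2:-/sys/module}
    [ -d "$mod_dir/$mod_name" ]
}

# Succeeds if the given reset_method file lists device_specific.
# Returns 2 if the file cannot be read, 1 if the method is absent.
has_device_specific() {
    [ -r "$1" ] || return 2
    methods=$(cat "$1")
    case " $methods " in
        *" device_specific "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On my system this would be called as module_loaded vendor-reset and has_device_specific /sys/bus/pci/devices/0000:09:00.0/reset_method. Both only confirm that the hook is registered, not that the reset itself works.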
Here are some of the relevant outputs.
sudo dmesg | grep reset
[ 7.520032] vendor_reset: loading out-of-tree module taints kernel.
[ 7.520041] vendor_reset: module verification failed: signature and/or required key missing - tainting kernel
[ 7.613785] vendor_reset_hook: installed
[ 75.619428] amdgpu 0000:09:00.0: amdgpu: Starting gfx ring reset
[ 75.845873] amdgpu 0000:09:00.0: amdgpu: Ring gfx reset failure
[ 75.845877] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[ 76.650627] amdgpu 0000:09:00.0: amdgpu: BACO reset
[ 77.150060] amdgpu 0000:09:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 77.150262] [drm] VRAM is lost due to GPU reset!
[ 77.586359] amdgpu 0000:09:00.0: amdgpu: GPU reset(2) succeeded!
cat "/sys/bus/pci/devices/0000:09:00.0/reset_method"
device_specific
sudo dmesg | grep vfio-pci
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=f14fca79-ebec-4909-a9ec-9bbcf1c6a9f8 rw loglevel=3 quiet iommu=pt amd_iommu=on vfio-pci.ids=1002:687f,1002:aaf8,1022:145f,1022:1457 kvm.ignore_msrs=1 video=efifb:off
[ 0.084960] Kernel command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=f14fca79-ebec-4909-a9ec-9bbcf1c6a9f8 rw loglevel=3 quiet iommu=pt amd_iommu=on vfio-pci.ids=1002:687f,1002:aaf8,1022:145f,1022:1457 kvm.ignore_msrs=1 video=efifb:off
[ 62.087380] vfio-pci 0000:09:00.0: vgaarb: deactivate vga console
[ 62.087388] vfio-pci 0000:09:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 62.980643] vfio-pci 0000:09:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 250.643445] vfio-pci 0000:09:00.0: vgaarb: deactivate vga console
[ 250.643460] vfio-pci 0000:09:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 251.005470] vfio-pci 0000:09:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
I am running this GPU in a single GPU passthrough setup, and vendor-reset worked almost flawlessly before the 6.12 update broke it. Now I am unable to boot into any of my VMs. Hopefully somebody can point me in the right direction, as I'm thoroughly lost at the moment.