Description
NVIDIA Open GPU Kernel Modules Version
Does this happen with the proprietary driver (of the same version) as well?
I cannot test this
Operating System and Version
Description: Fedora release 36 (Thirty Six)
Kernel Release
Linux fedora 5.17.9-300.fc36.x86_64 #1 SMP PREEMPT Wed May 18 15:08:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Hardware: GPU
It's an RTX 2060 from GIGABYTE. I am not going to install the proprietary tool that is suggested.
Describe the bug
Opening the GPU device node returns a file descriptor of -1 or 3.
To Reproduce
Try to open /dev/dri/cardx or use the drmOpen() function.
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
[ 4.751792] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751793] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751793] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751793] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751794] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751794] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751795] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751795] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751796] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751796] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751797] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751797] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751798] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 5: 0x80000000 MHz
[ 4.751798] ACPI: [Firmware Bug]: No valid BIOS _PSS frequency found for processor 5
[ 4.751799] ACPI: [Firmware Bug]: BIOS needs update for CPU frequency support
[ 4.751827] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751827] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751828] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751828] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751829] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751829] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751830] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751830] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751830] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751831] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751831] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751832] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751832] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751833] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751833] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751834] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 6: 0x80000000 MHz
[ 4.751834] ACPI: [Firmware Bug]: No valid BIOS _PSS frequency found for processor 6
[ 4.751835] ACPI: [Firmware Bug]: BIOS needs update for CPU frequency support
[ 4.751863] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751863] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751864] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751864] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751865] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751865] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751866] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751866] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751866] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751867] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751867] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751868] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751868] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751869] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751869] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751870] ACPI: [Firmware Bug]: Invalid BIOS _PSS frequency found for processor 7: 0x80000000 MHz
[ 4.751870] ACPI: [Firmware Bug]: No valid BIOS _PSS frequency found for processor 7
[ 4.751871] ACPI: [Firmware Bug]: BIOS needs update for CPU frequency support
[ 5.385291] nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[ 5.385294] ucsi_ccg 0-0008: i2c_transfer failed -110
[ 5.385295] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[ 5.385298] ucsi_ccg: probe of 0-0008 failed with error -110
[ 5.398888] kauditd_printk_skb: 136 callbacks suppressed
[ 5.398889] audit: type=1130 audit(1653589722.262:145): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-udev-settle comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.444839] audit: type=1130 audit(1653589722.308:146): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-fsck@dev-disk-by\x2duuid-cd5cf0c9\x2db7ce\x2d41da\x2dbcf1\x2dae0ccb7c629a comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.459774] audit: type=1130 audit(1653589722.323:147): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-fsck@dev-disk-by\x2duuid-5B81\x2d8B7D comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.463346] EXT4-fs (sda2): mounted filesystem with ordered data mode. Quota mode: none.
[ 5.487808] audit: type=1130 audit(1653589722.351:148): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dracut-shutdown comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.511918] audit: type=1130 audit(1653589722.375:149): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=plymouth-read-write comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.519858] audit: type=1130 audit(1653589722.383:150): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=import-state comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.567885] audit: type=1130 audit(1653589722.431:151): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-setup comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 5.570480] audit: type=1334 audit(1653589722.433:152): prog-id=60 op=LOAD
[ 5.570670] audit: type=1334 audit(1653589722.434:153): prog-id=61 op=LOAD
[ 5.570726] audit: type=1334 audit(1653589722.434:154): prog-id=62 op=LOAD
[ 5.602867] RPC: Registered named UNIX socket transport module.
[ 5.602869] RPC: Registered udp transport module.
[ 5.602870] RPC: Registered tcp transport module.
[ 5.602870] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 5.771489] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 5.771491] Bluetooth: BNEP filters: protocol multicast
[ 5.771494] Bluetooth: BNEP socket layer initialized
[ 5.905379] NET: Registered PF_QIPCRTR protocol family
[ 6.526274] iwlwifi 0000:00:14.3: Conflict between TLV & NVM regarding enabling LAR (TLV = enabled NVM =disabled)
[ 6.712661] iwlwifi 0000:00:14.3: Conflict between TLV & NVM regarding enabling LAR (TLV = enabled NVM =disabled)
[ 9.192208] e1000e 0000:00:1f.6 eno2: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 9.192258] IPv6: ADDRCONF(NETDEV_CHANGE): eno2: link becomes ready
[ 9.768045] thermal cooling_device11: Setting cooling device state is deprecated
[ 11.529700] rfkill: input handler disabled
[ 11.930058] Bluetooth: RFCOMM TTY layer initialized
[ 11.930063] Bluetooth: RFCOMM socket layer initialized
[ 11.930096] Bluetooth: RFCOMM ver 1.11
[ 15.899439] logitech-hidpp-device 0003:046D:1025.0007: HID++ 1.0 device connected.
[ 28.879886] rfkill: input handler enabled
[ 249.288102] nvidia-modeset: Unloading
[ 249.304665] NVOC: __nvoc_objDelete: Child class OBJIOVASPACE not freed from parent class OBJVMM.Allocator 00000000d4fbfba6 released with memory allocations
[ 249.304686] [NvPort] *************************************************
[ 249.304686] NvPort memory tracking information for allocator 00000000d4fbfba6:
[ 249.304687] ACTIVE: 1 allocations, 644 bytes allocated (616 useful, 28 meta)
[ 249.304688] TOTAL: 150 allocations, 512133 bytes allocated (507933 useful, 4200 meta)
[ 249.304689] PEAK: 148 allocations, 511980 bytes allocated (507836 useful, 4144 meta)
[ 249.304689] [NvPort] *************************************************
[ 249.304702] nvidia-nvlink: Unregistered Nvlink Core, major device number 234
[ 249.326369] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[ 249.326373] NVRM getCpuCounts: RmInitCpuCounts: physical 0x8 logical 0x8
[ 249.326722] NVRM rmapiControlCacheInit: using cache mode 1
[ 249.327021] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=io+mem
[ 249.327024] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 249.327038] NVRM halmgrGetHalForGpu_IMPL: Matching PMC_BOOT_42 = 0x164a1000 to HAL_IMPL_TU104
[ 249.327070] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 249.375107] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 515.43.04 Release Build (yusufkhan@) Tue May 24 06:08:38 PM EDT 2022
[ 1317.099154] intel_powerclamp: Start idle injection to reduce power
[ 1318.444129] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1318.445139] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1318.446128] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1318.447127] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1318.448126] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1319.405115] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1319.406135] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1319.407115] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1320.797081] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1320.798091] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #202!!!
[ 1326.121176] intel_powerclamp: Stop forced idle injection
[ 1343.137619] intel_powerclamp: Start idle injection to reduce power
[ 1355.185450] intel_powerclamp: Stop forced idle injection
[ 1372.202022] intel_powerclamp: Start idle injection to reduce power
[ 1384.226791] intel_powerclamp: Stop forced idle injection
[ 1401.243395] intel_powerclamp: Start idle injection to reduce power
[ 1411.275389] intel_powerclamp: Stop forced idle injection
[ 1637.846061] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 515.43.04 Release Build (yusufkhan@) Tue May 24 06:08:29 PM EDT 2022
[ 1637.846069] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 1637.846071] NVRM rmapiAllocWithSecInfo: client:0x0 parent:0x0 object:0x0 class:0x0
[ 1637.846074] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 1637.846083] NVRM rmapiAllocWithSecInfo: allocation complete
[ 1637.847149] nvidia_drm: unknown parameter 'NVreg_RmMsg' ignored
[ 1637.847478] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 1637.847686] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 1637.847688] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 1637.847759] nvidia 0000:01:00.0: Direct firmware load for nvidia/515.43.04/gsp.bin failed with error -2
[ 1637.847771] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x61:0x0:1610)
[ 1637.847787] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 1637.847824] [drm:nv_drm_load [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[ 1637.847891] [drm:nv_drm_probe_devices [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[ 2690.068695] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 2690.068700] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 2690.068702] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 2690.068710] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 2690.068711] NVRM _threadNodeInitTime: Bad threadStateDatabase.timeout.flags: 0x0!
[ 2690.068712] NVRM rm_get_firmware_version: rm_get_firmware_version: Failed to query gpu build versions, status=0x40
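The program I attempted to run on this was:
#include <xf86drm.h>
#include <nvidia.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    /* Open the DRM device node; open() returns -1 on failure. */
    int fd = open("/dev/dri/card0", O_RDWR);
    printf("%i\n", fd);
    if (fd < 0) {
        perror("open /dev/dri/card0");
        return 1;
    }

    /* Zero-initialize so no field is passed with an indeterminate value. */
    struct nvidia_gem_alloc_nvkms_memory_params params = {0};
    params.memory_size = 1;
    /* Call kept as in the original program; it comes from the nvidia-next branch. */
    nvidia_gem_alloc_nvkms_memory(fd, params);

    close(fd);
    return params.handle;
}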
It uses my nvidia-next libdrm branch: https://gitlab.freedesktop.org/YusufKhan-gamedev/drm/-/tree/nvidia-next
Please note that the dmesg output didn't change immediately after I ran that program.