00:54 noelle: Hi, I'm having trouble booting a fresh Ubuntu 25.10 install (Linux 6.17.0-8-generic) on an RTX 3060 (https://www.evga.com/products/Specs/GPU.aspx?pn=fc6cb106-6765-4840-b472-9dba159cdc30).
00:54 noelle: I dumped the boot logs from journalctl and noticed repeated errors from nouveau (https://paste.sr.ht/~noelle/e3938d1664cfc785b12adb2014b751b0c97f3dcb#L1952):
00:54 noelle: > nouveau 0000:01:00.0: gr: TRAP ch 1 [02ffc8f000 gnome-shell[1897]]
00:54 noelle: > nouveau 0000:01:00.0: gr: DISPATCH 80000001 [INJECTED_BUNDLE_ERROR]
00:55 noelle: > nouveau 0000:01:00.0: timeout
00:55 noelle: I also have the output of `modinfo nouveau`: https://paste.sr.ht/~noelle/e3938d1664cfc785b12adb2014b751b0c97f3dcb#modinfo-nouveau.txt
00:55 noelle: Any assistance in troubleshooting this error and how to fix it would be appreciated.
01:01 redsheep[d]: It's been a minute since I've done much troubleshooting with nouveau but I wonder if this line might be an issue
01:01 redsheep[d]: ```nouveau 0000:01:00.0: [drm] Registered 4 planes with drm panic```
01:01 redsheep[d]: I think you can try disabling the hardware cursor in gnome? Seems potentially related
01:02 redsheep[d]: Oh wait nvm, that's the new panic feature, ignore me 🙂
01:06 redsheep[d]: I assume you've already confirmed the card works with a different setup, but if not a reseat is always a good first step
01:11 karolherbst: noelle: I wonder if booting with "nouveau.config=NvGspRm=1" helps
02:10 noelle: karolherbst: That got me to the login screen! There were some artifacts though, where menus like the control center had white backgrounds behind their rounded edges, as if the compositor was replacing transparent pixels with white.
02:54 noelle: karolherbst: Here's what the desktop looks like: Notice the icons in the dock being white rectangles, the white background on the control center, and the screenshot which covered most of the screen in white: https://man.sr.ht/~noelle/tycho-ops/assets/2025-12-29-artifacting.jpeg
03:36 kode54: karolherbst: should I have CC'd you on my boot issue? I guess my hardware is broken :[
03:38 karolherbst: noelle: it tells me that the image contains errors and can't be displayed
03:39 karolherbst: noelle: anyway, could be a legitimate bug in mesa, but you're probably also running an older mesa version? If it's not 25.2 or newer, try something newer and report it to ubuntu _if_ a newer version fixes it, so they can figure out what to backport or whatever
03:39 karolherbst: oh actually, we only support 25.3+
03:40 kode54: karolherbst: that jpeg url above returns "Invalid repository ID"
03:40 karolherbst: noelle: depending on your mesa version NOUVEAU_USE_ZINK=1 or NOUVEAU_USE_ZINK=0 could fix it. We default to zink on newer GPUs on newer mesa versions
10:26 noelle: karolherbst: I’ve fixed the image link above, will try your suggestions today
11:22 noelle: karolherbst: setting NOUVEAU_USE_ZINK=1 in /etc/environment didn't change anything, and NOUVEAU_USE_ZINK=0 resulted in different and much worse graphical artifacts.
11:22 noelle: I guess I'll try to build and install the latest version of Mesa next
11:26 karolherbst: noelle: yeah.. what mesa version are you running on?
11:27 noelle: karolherbst: Mesa 25.2.3-1ubuntu1
11:27 karolherbst: ahh.. mhh
11:27 karolherbst: it's kinda weird to see those issues... I wonder if something else is going wrong..
11:29 noelle: FTR, the card never had any issues on Windows, so I doubt it's bad hardware
11:39 karolherbst: yeah but some GPUs are more different than others, it's a bit weird
13:12 noelle: karolherbst: I built and installed Mesa 26.0.0-devel (git-ebf1454410 according to glxinfo) and all the artifacts that were visible on 25.2.3 are still present after rebooting. Is there any other troubleshooting information I can provide to help figure this out?
13:29 karolherbst: noelle: I assume zink and non zink are broken?
13:45 noelle: karolherbst: Yes, changing NOUVEAU_USE_ZINK in /etc/environment has the same effect on Mesa 26.0.0 as it did on Mesa 25.2.3
13:49 noelle: I rolled back the Mesa install after testing it the first time, and now installing and testing it again, glxinfo isn't reporting the new version as I would expect, hmm
14:00 noelle: Will report back after some more testing
16:15 noelle: karolherbst: I repeated my tests, verifying that I was running the expected version of Mesa each time and my original findings held true: https://man.sr.ht/~noelle/tycho-ops/known-issues.md#rendering-errors-in-gnome
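Since the tests above depended on confirming which Mesa build was actually loaded, here is a minimal sketch of pulling the version out of `glxinfo` output. The helper name `mesa_version` is hypothetical, and the exact shape of the surrounding output line is an assumption; only the `Mesa 25.2.3-1ubuntu1` and `Mesa 26.0.0-devel (git-ebf1454410)` strings come from the conversation.

```python
import re


def mesa_version(glxinfo_output: str):
    """Return the first Mesa version found as (major, minor, patch), or None.

    Assumes the version appears as "Mesa X.Y.Z" somewhere in the text.
    """
    match = re.search(r"\bMesa (\d+)\.(\d+)\.(\d+)", glxinfo_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Illustrative input: the line prefix is assumed, the version string is from the log.
sample = "OpenGL version string: 4.6 (Compatibility Profile) Mesa 25.2.3-1ubuntu1"
print(mesa_version(sample))
print(mesa_version("Mesa 26.0.0-devel (git-ebf1454410)"))
```

The resulting tuple can be compared directly against a minimum such as `(25, 3, 0)`. In practice the input would be the captured output of `glxinfo -B`, taken after each reinstall and reboot.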
16:33 karolherbst: noelle: were there any errors reported in dmesg?
16:55 mhenning[d]: noelle: Oh, that's interesting. I run gnome on my 3060 (on arch linux) all the time and I haven't seen that kind of artifact before.
17:18 tdaven[d]: Wayland vs X11 maybe?
17:25 noelle: karolherbst: dmesg is FULL of DMAR "non-zero reserved fields in PTE" errors. Here's a sample: https://paste.sr.ht/~noelle/c62771323b83cd386b1243b2c368e1fd66e1fec1
17:25 karolherbst: okay.. looks like something with memory management is busted then
17:26 karolherbst: noelle: is it on an AMD system with the iommu in a rather restricted mode or something?
17:26 karolherbst: I think DMAR was AMD...
17:27 karolherbst: oh no.. that's intel
17:27 noelle: karolherbst: The CPU is an Intel i7-4790K, Kernel docs say that's Intel https://www.kernel.org/doc/html/latest/arch/x86/iommu.html#basic-stuff
17:28 karolherbst: noelle: booting with "intel_iommu=off" or "iommu=off" might help then
17:28 karolherbst: would be good to confirm it's an iommu related issue
18:09 noelle: karolherbst: Booting with iommu=off hung right after my bootloader with some new crashes, but intel_iommu=off booted fine, and the rendering errors and DMAR errors seem to be gone!
18:09 karolherbst: okay, so something wrong with iommu support there
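One way to confirm the before/after difference is to count the DMAR faults in a saved `dmesg` dump from each boot. This is a rough sketch: the function name is hypothetical, the sample lines are made up for illustration (not copied from the paste), and only the quoted phrase "non-zero reserved fields in PTE" comes from the conversation.

```python
def count_dmar_pte_faults(log_text: str) -> int:
    """Count log lines that mention DMAR together with the reserved-fields PTE fault."""
    needle = "non-zero reserved fields in PTE"
    return sum(
        1
        for line in log_text.splitlines()
        if "DMAR" in line and needle in line
    )


# Made-up sample lines, only meant to exercise the function.
sample_log = "\n".join(
    [
        "DMAR: example fault A: non-zero reserved fields in PTE",
        "nouveau 0000:01:00.0: timeout",
        "DMAR: example fault B: non-zero reserved fields in PTE",
    ]
)
print(count_dmar_pte_faults(sample_log))
```

Running it on one dump taken with the IOMMU enabled and one taken with `intel_iommu=off` should show whether the faults really went away.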
18:10 noelle: karolherbst: Would that be an issue with the kernel code for the CPU?
18:11 karolherbst: well nouveau needs to register ranges the GPU wants to access ahead of time, so the issue should be somewhere in nouveau
18:11 karolherbst: like the issue is that the GPU accesses memory it wasn't allowed to access
18:12 karolherbst: given it's causing rendering issues, I'm pretty sure it's about memory it should touch, but nouveau doesn't properly register it
18:14 noelle: so in theory, nouveau could be fixed to not require these command line arguments in the future?
18:15 karolherbst: yes
18:16 karolherbst: and afaik it does work with iommu support on certain hardware, at least I know I've fixed an issue on AMD with that and I know it was working on my intel desktop at some point in time
18:17 noelle: Alright, good to know. Thank you so much for helping me troubleshoot this, I really appreciate your time!
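For anyone repeating this workaround, it helps to check that the boot parameters suggested above actually reached the kernel. The sketch below parses a kernel command line such as the contents of `/proc/cmdline`. The helper name `kernel_params` is hypothetical, and the sample command line is an assumption apart from the two parameters quoted in the conversation.

```python
import shlex


def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        # Split on the first "=" only, so "nouveau.config=NvGspRm=1"
        # keeps "NvGspRm=1" as its value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Assumed example; on a real system read /proc/cmdline instead.
sample = "ro quiet splash intel_iommu=off nouveau.config=NvGspRm=1"
params = kernel_params(sample)
print(params.get("intel_iommu"))
print(params.get("nouveau.config"))
```

On a live system, `kernel_params(open("/proc/cmdline").read())` gives the same dictionary for the running kernel.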
18:37 noelle: karolherbst: I'm switching focus to my other issue: when waking from sleep, my monitor never receives any signal. I tested it and inspected the logs and found some more nouveau smoke: nouveau 0000:01:00.0: gsp: init failed, -110. Full error log: https://man.sr.ht/~noelle/tycho-ops/known-issues.md#wake-from-sleep
18:49 noelle: I also attached an error to that page from when I unplug and replug a monitor back in
19:18 redsheep[d]: I kind of think sleep is still not quite right on GSP firmware as a whole, nouveau or not. I started daily driving nvidia prop a few months ago and sleep has usually resulted in a hung machine.
19:20 noelle: redsheep[d]: damn, that's unfortunate to hear
19:21 mhenning[d]: I think _lyude was fixing some sleep issues recently? Not sure if your issue is related
19:22 _lyude[d]: Yeah I've been digging into some issues around my desktop not suspending correctly. redsheep[d] have you tried forcing nouveau onto the r535 firmware? I got slightly better results with that, though I think I still was seeing some issues on my machine (but there's been enough issues I've been going through it was difficult to tell if it was a different issue)
19:24 redsheep[d]: Yeah when I get home I'll do some nouveau tinkering again and see if I can narrow it down. I was talking about with ogk/openrm though, so even 580 GSP or 580 ogk had issues
19:24 redsheep[d]: I just upgraded to 590, I'll play with that too
19:24 _lyude[d]: redsheep[d]: oh that's interesting. Is it an issue that happens 100% of the time?
19:24 redsheep[d]: I don't really know yet tbh
19:25 mhenning[d]: Oh, yeah I was referring to noelle's "gsp: init failed" message
19:25 redsheep[d]: I sleep very infrequently, only a few times per update
19:25 chikuwad[d]: nvprop suspends fine for me on ampere fwiw
19:25 chikuwad[d]: 590 atm
19:25 chikuwad[d]: _sometimes_ it doesn't bring the monitor back up, but that's solved by just bringing up the OSD
19:26 redsheep[d]: If 590 sleep is consistent for me upgrading GSP in nouveau could maybe be an eventual route if the 570 firmware never gets to the point of reliable sleep
19:29 redsheep[d]: Noelle: It might be a useful data point to see if the nvidia drivers are able to sleep for you
19:42 _lyude[d]: If you actually find it makes a difference redsheep[d] let me know. Honestly I'm kind of curious how big of a difference API wise 590 even is
19:43 _lyude[d]: redsheep[d]: what GPU do you have?
22:20 redsheep[d]: _lyude[d]: 4090
22:57 noelle: redsheep[d]: I switched to nvidia-driver-580-open in the "Additional Drivers" section of the "Software & Updates" app. glxinfo says I'm running NVIDIA 580.95.05. I was able to sleep and wake no problem, and unplug and replug my monitor without issue. Things seem to be "just working"!
23:12 redsheep[d]: That's good I suppose. I might have just been on some broken version. Last time it broke, I'll check later.