05:19 asdqueerfromeu[d]: samantas5855[d]: i think it still doesn't
05:49 redsheep[d]: Yeah it doesn't, cyberpunk still has a non-rt rendering path
08:56 magic_rb[d]: okay so someone has to explain to me how this is possible. I woke up my laptop, session crashed, no biggie, logged in, gpu outputs missing. So i thought "huh, i'll have to reboot, but what if i just rmmod nouveau and then modprobe it again" and to my surprise, it worked! how is it possible that nouveau reloaded while xorg was running and now my external monitors are detected
08:57 karolherbst[d]: it kinda depends, but at some point people made it so that, as long as the display isn't used, there is no reference on the module
08:57 karolherbst[d]: and the userspace process just registers event to get notified on hotplug events instead
08:58 karolherbst[d]: and once a new connector gets detected and is on the secondary GPU, then the reference will be taken
09:00 magic_rb[d]: 👀
09:00 magic_rb[d]: okay thats really cool
09:00 magic_rb[d]: so this is the same mechanism that allows egpus to hotplug then
09:00 magic_rb[d]: i did not expect this to work at all
09:38 OftenTimeConsuming: It's nice with Linux-libre 6.11.0, videos in mpv just corrupt, rather than hanging the video in mpv or crashing nouveau; https://0x0.st/XYow.webp
09:41 karolherbst[d]: yeah... video acceleration was always kinda scuffed, though I'm kinda confused why mpv is corrupting things on linux-libre, because it should be all software, no?
09:43 OftenTimeConsuming: I am using mpv with vo=gpu-next, so it seems to be using the GPU to display the video directly, but without GPU hardware acceleration.
09:44 OftenTimeConsuming: "x11egl"
09:48 karolherbst[d]: I see...
09:48 karolherbst[d]: could be some legit rendering bug somewhere
09:48 karolherbst[d]: what GPU is that with?
09:49 OftenTimeConsuming: gtx 780 Ti
09:49 karolherbst[d]: kinda interesting how some tiles are correct
09:50 karolherbst[d]: kinda like if something is up with strides
09:50 karolherbst[d]: maybe some import/export stuff going wrong
09:50 karolherbst[d]: what video is that with?
09:51 karolherbst[d]: I always had a hard time reproducing those issues, because it usually impacted videos nobody was allowed or willing to share
09:52 OftenTimeConsuming: ytdl://youtube.com/watch?v=FlWjf8asI4Y followed by ytdl://youtube.com/watch?v=eveyRQtD-Sk run at >2x speed, unfortunately it doesn't reproduce.
09:53 OftenTimeConsuming: It happened near the end of the second video. I also moved the video to a different screen halfway through (1200p to 1440p), plus paused it and did some things in Tor Browser in the background.
09:54 OftenTimeConsuming: Tor Browser seems to use the GPU in ways that trip nouveau up and was hanging at random previously, but seemingly not anymore.
09:56 OftenTimeConsuming: RAM corruption can be ruled out at least, as I do have ECC.
09:58 OftenTimeConsuming: Lots of zathura instances open was previously good at causing hangs in TB, but not anymore it seems.
09:59 karolherbst[d]: OftenTimeConsuming: yeah... that issue got fixed at some point
09:59 OftenTimeConsuming: The tor browser issue?
09:59 karolherbst[d]: yeah
09:59 karolherbst[d]: firefox did some multithreading stuff
09:59 OftenTimeConsuming: Great, it's nice not getting random hangs.
10:00 OftenTimeConsuming: firefox really loves tearing up the 32 threads.
10:00 karolherbst[d]: so the corruption is more of a randomly happening bug?
10:01 OftenTimeConsuming: I've seen it once so far. Previously, videos at random would hang and not play anymore, but now it seems the same bug is there, but just causes corruption instead.
10:02 OftenTimeConsuming: *would hang and result in a stuck video and very desynchronised audio.
10:02 karolherbst[d]: yeah.... those are always a pain to track down. If they happen like once after a few months of use, guess how likely it would be to figure that bug out
10:03 OftenTimeConsuming: They happen once every few days.
10:04 OftenTimeConsuming: I'm sure if I had debug logging going constantly, the relevant logs would be useful for debugging, but I'm not sure how to do that.
10:05 karolherbst[d]: I see
10:06 karolherbst[d]: I think there is a memory corruption lurking somewhere in the area of command submission and you might be hitting the same bug
10:06 karolherbst[d]: I really want to fix this issue, but I also want to get a few other things done before XDC
10:13 OftenTimeConsuming: I'm not sure if I'm imagining things, but it seems minetest and speed-dreams-2 performance has improved by a few fps - nice.
10:25 OftenTimeConsuming: Good to see that emilia-pinball and gltron have excellent performance (aside from the particle effect on death)
10:31 samantas5855[d]: redsheep[d]: the benchmark says raytracing enabled
10:35 redsheep[d]: samantas5855[d]: That screen is getting cut off, look at the rest of the column that text is under. Raytracing has not been implemented. Even if the game did think it's on, it isn't working
10:36 redsheep[d]: If it somehow thought it was actually enabled that could maybe explain some rendering issues, but I expect it is off and that's all something else
10:41 redsheep[d]: But yeah it's been discussed a number of times, there isn't enough documentation for the RT instructions yet for anything to be implemented until either docs come out or somebody does a whole lot of reverse engineering. I think there's a branch somewhere with partial progress but iirc there's still a lot of stuff with the rt hardware that is pretty murky.
10:55 OftenTimeConsuming: Oh nice, a segfault in xonotic
10:57 OftenTimeConsuming: libgallium rather; https://termbin.com/fgk6
10:59 karolherbst[d]: if you can script the bug that would be helpful
11:00 karolherbst[d]: scripting as in, how to invoke xonotic so it runs into it on its own without having to mess with the ui
11:01 OftenTimeConsuming: I would love it if these bugs were reproducible, but I figured hopefully such instruction dump would contain something.
11:01 karolherbst[d]: not really
11:01 karolherbst[d]: it's a crash on the GPU rather
11:02 karolherbst[d]: so it's more a bug in regards to command submission
11:02 karolherbst[d]: and the command stream being invalid
11:02 karolherbst[d]: even if it only shows up once in 100 times it's still good to have a script to trigger it
11:02 karolherbst[d]: the worst part is having to do manual steps
11:02 karolherbst[d]: but if it's all scripted, a dev could just run it in a loop until it triggers
11:03 OftenTimeConsuming: I see. It showed up in about 20 matches of level 25, but there weren't any specific inputs that triggered it.
11:04 karolherbst[d]: yeah.. I mean it's also fine if a benchmark mode would trigger it
11:04 karolherbst[d]: the bad part in reproducing those issues is always if it involves having to do anything actively
11:04 OftenTimeConsuming: That would be nice, but of course not.
11:08 karolherbst[d]: right, but I mean you could try to figure this out if you have some time for that
11:08 karolherbst[d]: I certainly won't play the game for hours just to trigger some maybe bug I might not even be able to trigger myself
11:10 karolherbst[d]: OftenTimeConsuming: you could run with `NOUVEAU_LIBDRM_DEBUG=2` and then it should print the submitted commands causing those issues, might be enough to figure out what's wrong
11:10 OftenTimeConsuming: There we are, finally a debug command that will show something useful.
11:13 karolherbst[d]: `NOUVEAU_LIBDRM_DEBUG=1` should print all the submitted buffers, but that's gonna be a huge file after a couple of hours, but will give you an idea what that would look like
11:13 karolherbst[d]: but yeah, just set it to 2 and see what's dumped by the application
11:14 karolherbst[d]: however, that option might also mask the issue for weird reasons
11:18 OftenTimeConsuming: I'll run things with LIBDRM_DEBUG from now on.
11:26 OftenTimeConsuming: With gzip compression, NOUVEAU_LIBDRM_DEBUG=1 filesize is not bad, the problem with that is it really kills performance.
11:29 karolherbst[d]: yeah...
11:30 OftenTimeConsuming: Which will probably mask any bugs also, as I've only really had crashes when things are going fast, not slow.
11:30 karolherbst[d]: 2 will also hurt perf, but in a different way
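[editor's note] The "script it and run it in a loop until it triggers" idea from this exchange could be sketched roughly as below. This is only an illustration under assumptions: `NOUVEAU_LIBDRM_DEBUG` is the variable named in the log, but `run_until_failure`, the log file naming, and the choice to capture both stdout and stderr (it is assumed the dump lands on one of them) are hypothetical. Output is gzip-compressed as mentioned above, and logs of clean runs are deleted so only the failing run's dump is kept.

```python
import gzip
import os
import subprocess


def run_until_failure(cmd, max_runs=100, debug_level="2", log_prefix="nouveau-run"):
    """Run cmd repeatedly and stop at the first non-zero exit (crashes included).

    Returns (run_number, log_path) for the failing run, or None if all runs
    exited cleanly.
    """
    # Variable name taken from the chat; everything else here is illustrative.
    env = dict(os.environ, NOUVEAU_LIBDRM_DEBUG=debug_level)
    for run in range(1, max_runs + 1):
        log_path = f"{log_prefix}-{run:04d}.log.gz"
        with gzip.open(log_path, "wb") as log:
            proc = subprocess.Popen(
                cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
            )
            # Stream the output in chunks so a huge dump never sits in memory.
            for chunk in iter(lambda: proc.stdout.read(65536), b""):
                log.write(chunk)
            returncode = proc.wait()
        if returncode != 0:  # a process killed by a signal has a negative code
            return run, log_path
        os.remove(log_path)  # clean run: its log is not interesting
    return None


# Hypothetical invocation; the actual game arguments are not given in the log.
# print(run_until_failure(["xonotic-glx"], max_runs=50))
```

As noted in the chat, the debug option costs performance and may itself mask the bug, so a clean loop does not prove the bug is gone.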
13:44 magic_rb[d]: Interesting, throwing the gpu off the bus, then rescanning doesnt work. Nouveau cannot init it, error -22
13:44 magic_rb[d]: I was hoping i could avoid a reboot with this trick
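[editor's note] A minimal sketch of "throwing the GPU off the bus, then rescanning", assuming it was done through the standard PCI sysfs files (`remove` on the device, `rescan` on the bus). The log does not say how magic_rb actually did it; the helper name and the PCI address are hypothetical placeholders. It needs root on a real system, and as reported here nouveau may still fail to init the GPU afterwards.

```python
from pathlib import Path


def remove_and_rescan(pci_address, sysfs_root="/sys"):
    """Detach one PCI device, then ask the kernel to rescan the PCI buses."""
    pci = Path(sysfs_root) / "bus" / "pci"
    # Writing "1" to <device>/remove unbinds the driver and drops the device.
    (pci / "devices" / pci_address / "remove").write_text("1")
    # Writing "1" to bus/pci/rescan re-enumerates, so the device is probed again.
    (pci / "rescan").write_text("1")


# Hypothetical address; the real one can be found with `lspci -D`.
# remove_and_rescan("0000:01:00.0")
```

The `sysfs_root` parameter exists only so the function can be exercised against a scratch directory instead of the live `/sys`.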
13:46 karolherbst[d]: huh...
13:47 karolherbst[d]: there is a trick
13:47 magic_rb[d]: Also, i can get my laptop into a state where a reboot will not fix things. A full power off and power on cycle is needed. I feel like nouveau isnt resetting the gpu quite properly yet
13:47 karolherbst[d]: `config=NvForcePost=1`
13:47 karolherbst[d]: that _might_ make it work
13:47 karolherbst[d]: yeah...
13:47 magic_rb[d]: *might* 😂
13:47 karolherbst[d]: sounds like something with POSTing the gpu goes wrong
13:47 karolherbst[d]: it's all fuzzy
13:48 karolherbst[d]: and we rely on some GPU registers to indicate whether nouveau needs to post or not
13:48 karolherbst[d]: the system firmware can put the GPU into a proper state if it starts from a clean state, but sometimes enough state persists and it's kinda in a broken state
13:48 karolherbst[d]: those issues are super rare though
13:48 karolherbst[d]: and I have no idea how much that changes with modern GPUs
13:49 magic_rb[d]: Huh, i cant type a upper case f on my keyboard
13:49 magic_rb[d]: Lol
13:50 OftenTimeConsuming: Here you go: F
13:51 magic_rb[d]: No x11 session :)
13:51 OftenTimeConsuming: You're not in the church on emacs in fbcon?
13:51 OftenTimeConsuming: *of
13:51 magic_rb[d]: `sudo modprobe nouveau NvForcePost=1` or `sudo modprobe nouveau config=NvForcePost=1`
13:52 karolherbst[d]: the latter
13:52 magic_rb[d]: Nope
13:53 karolherbst[d]: yeah well.. unloading and loading is supported best effort anyway 🙃 (where everything else in nouveau is also best effort supported)
13:53 magic_rb[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1287411492618108959/JPEG_20240922_155329_2154789056355131164.jpg?ex=66f172e2&is=66f02162&hm=c4db88bae05a33ec1ef56b1867f67e65f1d9895c814562ed11e155f115e64d94&
13:53 magic_rb[d]: Sorry for the "screenshot"
13:54 magic_rb[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1287411611140886598/JPEG_20240922_155359_6709217700188982111.jpg?ex=66f172ff&is=66f0217f&hm=4a12db777f1c2e51e90472bd5faf6c4c100be00b9c489989ed4f64f44418477f&
13:54 karolherbst[d]: mhhh
13:54 karolherbst[d]: that's a GSP based one, isn't it?
13:54 magic_rb[d]: No big deal, can just reboot, was just curious if this would work
13:55 magic_rb[d]: karolherbst[d]: Yeah, 3060 mobile
13:55 karolherbst[d]: mhhhh
13:55 karolherbst[d]: it could be a gsp bug
13:55 magic_rb[d]: Most likely
13:55 karolherbst[d]: on very new GPUs the initialization of the GPU is kinda funky
13:55 karolherbst[d]: a lot of the stuff the GPU can do itself these days
13:55 karolherbst[d]: and other things are done by GSP
13:55 magic_rb[d]: Ill try reboot, but that also had issues, i seem to be noticing that if i first boot into proprietary and then reboot into nouveau, it works better
13:56 magic_rb[d]: 🙃
13:56 magic_rb[d]: When are we bumping gsp next?
13:57 tiredchiku[d]: after rtx 5000 cards release
13:57 tiredchiku[d]: hopefully
13:57 magic_rb[d]: Hopefully itll fix some things...
13:59 karolherbst[d]: yeah.. it probably will
13:59 magic_rb[d]: I wonder if i can script it on nixos in such a way where my laptop would first boot into nvidia proprietary, load up X, reboot into nouveau. All automagically
13:59 magic_rb[d]: That would be the most cursed boot process ever lmao
14:00 karolherbst[d]: but why....
14:00 magic_rb[d]: Aand it died
14:00 magic_rb[d]: Ill get dmesg at least
14:01 tiredchiku[d]: overcomplicating it I see
14:01 tiredchiku[d]: just do modprobe.blacklist=nouveau at boot
14:01 tiredchiku[d]: then manually modprobe it
14:01 tiredchiku[d]: ex
14:02 karolherbst[d]: that's my setup, just I won't have to manually load nvidia
14:02 OftenTimeConsuming: My setup is free from nvidia's proprietary software.
14:03 magic_rb[d]: https://termbin.com/8wwo
14:09 magic_rb[d]: Can i fish out any other logs? Idk if there is smth else possibly useful
14:59 OftenTimeConsuming: Is nouveau able to init a GPU that is just present in the system (i.e. the Option (EEP)ROM hasn't been executed)? I've heard something about nouveau being able to read out the necessary fields from the tables if BusyBox/Linux is added as a coreboot payload and a copy of the VBIOS is added to the cbfs, but I don't see why the driver couldn't just read directly from the GPU.
15:00 karolherbst[d]: yeah.. nouveau has to do that for e.g. secondary GPUs in laptops anyway
15:00 karolherbst[d]: but for primary ones the BIOS/UEFI generally are doing it in order to display things
15:00 karolherbst[d]: and sadly the initialization isn't reentrant
15:01 OftenTimeConsuming: Superb, time to give it a go. I don't really care about not seeing the grub screen, as I don't see it usually anyway.
15:01 karolherbst[d]: as long as the driver gets loaded at initramfs time you are usually good to go
15:03 OftenTimeConsuming: I thought that could be an issue. I'm not sure if dracut includes it - better. Having to type a password blind once isn't terrible. Thanks for that.
15:05 karolherbst[d]: yeah, that's specifically the reason on why the GPU driver should exist in initramfs 🙂
15:05 karolherbst[d]: it also helps with laptops being closed at boot and the external display driven by the secondary gpu
15:06 OftenTimeConsuming: I would prefer to not have to use an initramfs, too bad LUKS2 password support for grub isn't there yet.
15:06 OftenTimeConsuming: Even with that, Grub would need nvidia GPU init too.
15:26 OftenTimeConsuming: It works great, thanks.