12:13captainchris: hi everybody
12:13captainchris: I installed xf86-video-nouveau on archlinux but qutebrowser and my 2D games made in SFML don't work
12:14captainchris: I think it's an OpenGL problem
12:32TMM: Hi all! I've been trying to figure out what cards supported by nouveau do not require any proprietary firmwares for 2d/3d operation (including reclocking). It appears that all of the video decoders do? My google-fu seems to be failing me. Some sites seem to suggest that the last generation where that was possible was Kepler, whereas others seem to suggest Maxwell? Is there a list of GPUs on the nouveau website somewhere which could detail this
12:32TMM: information?
13:41cosurgi: imirkin: I just got a kernel NULL pointer dereference
13:42cosurgi: 2020-11-15T14:34:14.853498+01:00 absurd kernel: [9849093.422566] Call Trace:
13:42cosurgi: 2020-11-15T14:34:14.882010+01:00 absurd kernel: [9849093.422605] nvkm_vmm_iter.constprop.12+0x2ce/0x820 [nouveau]
13:42cosurgi: Not sure how it happened ;(
13:44cosurgi: fortunately `slay nauka` (the user of the suspected Xorg) solved the problem. Other Xorg servers seem intact.
13:44cosurgi: However yesterday something happened on that server: something started leaking xclient handles. And I had to close some xterms to be able to open any kind of window.
14:09cosurgi: imirkin: full kernel trace, if it means anything: https://paste.ubuntu.com/p/2yGjV2vVzB/
14:11cosurgi: I suspect that some app was misbehaving, like libreoffice
14:12ignapk: so it seems I managed to successfully reclock my fermi after all: to avoid the cpu soft lockup I had to first start some process in the background using the gpu i.e. DRI_PRIME=1 glxgears &, which changed the pstate from the initial https://paste.rs/up1 to https://paste.rs/6YH and then I could echo 03 > pstate without any problems, which resulted in such pstate https://paste.rs/IMD
14:12ignapk: ig the branch works \o/
14:17cosurgi: ignapk: congrats!
14:17cosurgi: for the moment I don't use 3D accel at all. I prefer 100% stability :) hahah
14:18cosurgi: current uptime 114 days. Huge improvement :-)
14:18cosurgi: opens 200 windows in each of 7 running xservers, and expects them to last forever,
14:19cosurgi: Just like tons of papers scattered on my desk. Huge mess. The papers on the desk are stable though.
14:19cosurgi: Not like those windows.
14:27cosurgi: ... in the xserver :)
14:40RSpliet: ignapk: note that your memory clock is still 324MHz. I presume that's either because NvMemExec=0 or because it's disabled a bit harder in code
14:40RSpliet: That being said: good stuff!
14:40RSpliet: That's a great start
15:32ignapk: Yeah I passed `nouveau.config=NvMemExec=0` together with `nouveau.debug=pmu=debug`, will look for the script now
15:48orbea: im putting together a second box, but no new gpu yet so I stuck in my old nvidia card using nouveau. Any ideas about hdmi sound being full of underruns and broken stuttering sound only with SDL2 programs? Mplayer and firefox sound fine, I even tested two different nestopia ports, one with SDL2 and the other not, only the former had issues.
15:50orbea: maybe I can just expect it to vanish when I get another amd...
15:52RSpliet: orbea: I've experienced infrequent underruns on my emu10k1 since kernel 5.8. I wonder whether there isn't a more fundamental issue.
15:52RSpliet: That being said, HDA audio through NVIDIA GPUs is handled by snd-hda-intel, not by nouveau. There's a 95% chance the problem lies in there
15:52orbea: this is alts 5.4 kernel still
15:53orbea: ah, good to know
15:53RSpliet: nouveau just takes care of routing sound to the right output port (one-off config) and infoframes I think
15:54orbea: i still get some underruns, but its not constant
15:54orbea: (with speakers)
16:03orbea: still, im confused why its only SDL2
16:57imirkin: RSpliet: vbios, which i would consider to be proprietary firmware, is always needed.
16:58imirkin: ignapk: oh yeah. back then, runpm would screw with reclocking. it's fixed in modern versions
16:58imirkin: ignapk: does it actually go faster?
16:59imirkin: cosurgi: try to get skeggsb to look at it :)
18:07ignapk: imirkin: It does go slower when I downclock, upclocking seems to be failing - glmark2 hangs, a lot of stack traces and kernel logs spammed with `nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 100c80 [ IBUS ]`
18:08imirkin: ignapk: oh, but you're not reclocking memory?
18:08ignapk: no, only engine for now
18:08imirkin: yeah ... that's not ENTIRELY unexpected then
18:09ignapk: huh, so maybe I should try with memory now
18:09imirkin: it's not like these things are able to clock at arbitrarily different values
18:09imirkin: if the engine clock is high, it might expect memory to respond within a certain number of clock cycles, etc
18:09imirkin: (or rather, it might have cycle-based expectations of memory ... so you have to clock them together)
18:10ignapk: Ok will try that, also RSpliet mentioned something about comparing the script I got from nouveau.debug=pmu=debug with some blob trace, are there any docs about this somewhere?
18:11imirkin: no
18:11imirkin: this is pretty advanced stuff
18:11imirkin: RSpliet knows about it all since he implemented reclocking for the gt21x series
18:11imirkin: i barely know anything about it
18:12imirkin: basically the idea is that the pmu receives a sequence of "instructions" which it then executes
18:12imirkin: this is the "script" which is being referenced
18:12imirkin: those instructions vary based on various vbios parameters as well as current board state
18:13imirkin: i believe demmio has support for decoding the blob's variants of these scripts
18:13imirkin: so you could look at the scripts that the blob is sending to the hardware
18:13imirkin: and compare them to the scripts nouveau is sending to the hardware
18:13imirkin: note that our two script implementations are slightly different, but the general idea is the same
18:14ignapk: ok that does sound advanced :)
18:15imirkin: iirc RSpliet wrote up some docs about it actually
18:15imirkin: https://envytools.readthedocs.io/en/latest/nvrm/pmu/seq.html
18:15ignapk: yay, thanks!
18:15imirkin: but that's just the blob's script "language" - not what actually needs to be done in various cases
18:16ignapk: on another note just re-enabled memory re-clocking and will try to upclock again *fingers crossed*
18:17imirkin: the basic idea is that you can't do this stuff from the cpu side, since you're putting the board in a compromised state for a period of time, and you'd lose the ability to control it from the cpu
18:17imirkin: so you put these scripts together, send them to the board, and pray
18:17imirkin: (like when you turn off memory to change its clocks, etc)
18:21ignapk: yeah same thing
18:23ignapk: meaning enabling memory re-clocking didn't change the stack traces and kernel logs
18:24imirkin: what about 'pstate' output?
18:24imirkin: the last line should indicate what it thinks the current situation is
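
A minimal sketch of reading that last line programmatically. It assumes the pstate text has one `<label>: core N MHz ... memory N MHz` entry per line with the current state last. The sample text and the `current_state` helper are hypothetical and only illustrate the layout; they are not output from this session.

```python
import re

# Hypothetical sample text. On a real system the text is assumed to come
# from the nouveau pstate file in debugfs.
SAMPLE = """\
03: core 50 MHz shader 101 MHz memory 135 MHz
07: core 405 MHz shader 810 MHz memory 324 MHz
0f: core 672 MHz shader 1344 MHz memory 900 MHz
AC: core 405 MHz shader 810 MHz memory 324 MHz
"""

def current_state(pstate_text):
    """Return (label, {domain: MHz}) parsed from the last non-empty line."""
    lines = [ln.strip() for ln in pstate_text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty pstate text")
    # Split "AC: core 405 MHz ..." into the label and the clock list.
    label, _, rest = lines[-1].partition(":")
    # Assumes single values such as "core 405 MHz", not ranges.
    pairs = re.findall(r"(\w+)\s+(\d+)\s*MHz", rest)
    return label.strip(), {name: int(mhz) for name, mhz in pairs}

label, clocks = current_state(SAMPLE)
print(label, clocks)
```

Comparing the `memory` value before and after a reclock attempt is one way to tell whether the memory clock actually changed.
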
18:26ignapk: interestingly the engine freq was a tiny little bit smaller than what i should be and the memory freq didn't change so I'm testing again
18:26ignapk: *it
18:26imirkin: yeah, the exact frequencies are off sometimes
18:26imirkin: the crystals can only generate certain frequencies
18:28ignapk: ok confirmed lack of nouveau.config=NvMemExec=0 with cat /proc/cmdline
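
The same check can be sketched in Python. The helper names `nouveau_params` and `mem_reclock_disabled` are made up for illustration, and the sketch assumes that several `nouveau.config` options would be comma-separated.

```python
def nouveau_params(cmdline):
    """Collect nouveau.* options from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("nouveau."):
            # "nouveau.config=NvMemExec=0" -> key "config", value "NvMemExec=0"
            key, _, value = token[len("nouveau."):].partition("=")
            params[key] = value
    return params

def mem_reclock_disabled(cmdline):
    """True if NvMemExec=0 appears among the nouveau.config options."""
    config = nouveau_params(cmdline).get("config", "")
    # Assumption: multiple config options are separated by commas.
    return "NvMemExec=0" in config.split(",")

# The two option strings quoted earlier in this log.
before = "quiet nouveau.config=NvMemExec=0 nouveau.debug=pmu=debug"
after = "quiet nouveau.debug=pmu=debug"
print(mem_reclock_disabled(before), mem_reclock_disabled(after))
```

On a live system the input would be the contents of /proc/cmdline instead of the literal strings.
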
18:30ignapk: https://paste.rs/TrQ
18:31imirkin: yeah, so the clock change didn't happen
18:31imirkin: oops
18:31ignapk: nouveau.debug=pmu=debug only affects printing to logs right
18:31imirkin: in theory, yeah
18:31imirkin: one could write really silly code which makes it affect execution
18:31imirkin: but i don't think we would be nasty enough to do that
18:32RSpliet: There's also a good chance it's just hard-disabled with an if-statement somewhere
18:32imirkin: mmmm
18:32imirkin: shouldn't be
18:32imirkin: RSpliet: https://github.com/skeggsb/nouveau/commit/4b2da6ce7cf39233d706c7b8f259b37f1fb26ea7
18:33RSpliet: That's for the PLLs, not the "kick the seq script" bit
18:34imirkin: ignapk: which chip do you have btw?
18:34ignapk: gt540m
18:35imirkin: what's the actual chip?
18:35imirkin: that's the marketing name...
18:35ignapk: oh sorry
18:36imirkin: lspci -nn -d 10de:
18:36RSpliet: think I remember nvc1
18:36ignapk: Yes Nvc1 GF108
18:36imirkin: ah ok
18:36imirkin: so ... the board that i had trouble with
18:36imirkin: was a Quadro 400, which is also GF108
18:36ignapk: huh
18:40imirkin: but skeggsb had the same Quadro 400, so ...
18:40imirkin: iirc the memory did reclock though
18:40imirkin: but the display went nuts
18:42ignapk: I'll get the logs and run traces through gdb maybe that will clear things a little bit
18:43ignapk: btw the script I extracted when downclocking is https://paste.rs/M6m, but I don't know if there's a point for me to compare it with anything if downclocking works as it should
18:48imirkin: execution took 4s??
18:48imirkin: seeing the script for up-clocking is probably more interesting.
18:49ignapk: right, hopefully it was successfully printed
18:49ignapk: is 4s unexpected?
18:49imirkin: yeah, it should be like ... 10ms
18:49ignapk: hmm
18:49imirkin: so ... GF108 memory is a bit special
18:49imirkin: out of the fermi family
18:50imirkin: unfortunately i don't know too much about it
18:50imirkin: but i just know it's different
18:50imirkin: there are multiple partitions or something
18:52ignapk: yes it's there, pasting
18:59ignapk: https://paste.rs/n3f
18:59imirkin: that's a much more reasonable execution time
19:06ignapk: https://paste.rs/n3m - more complete kernel log
19:07RSpliet: it should fit well within a vblank so at most 0.5ms. Usually between 60-120μs, could take longer on GDDR5
19:07ignapk: the very first call trace happens every time btw
19:07ignapk: i.e. on vanilla 4.16 too
19:08imirkin: er right. i was thinking vblank period, but for some reason i thought it was 16ms. but that's for a full frame (@60hz), not just the vblank period. oops
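
The difference between the full frame period and the vblank period can be worked out from the mode timings. This sketch assumes the standard 1080p60 timing (148.5 MHz pixel clock, 2200 total pixels per line, 1125 total lines, 1080 active lines), which is not necessarily the mode used in this session. Modes with reduced blanking have a shorter vblank.

```python
def frame_and_vblank_ms(pixel_clock_hz, htotal, vtotal, vactive):
    """Return (frame period, vblank period) in milliseconds."""
    line_s = htotal / pixel_clock_hz              # time to scan one line
    frame_ms = vtotal * line_s * 1000             # all lines, incl. blanking
    vblank_ms = (vtotal - vactive) * line_s * 1000  # blanking lines only
    return frame_ms, vblank_ms

# Assumed example mode: standard 1080p at 60 Hz.
frame_ms, vblank_ms = frame_and_vblank_ms(148_500_000, 2200, 1125, 1080)
print(f"frame {frame_ms:.2f} ms, vblank {vblank_ms:.2f} ms")
```

For this assumed mode the frame takes about 16.67 ms while the vblank is well under a millisecond, which is why a script that must finish inside the vblank has to run far faster than 16 ms.
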
20:48ignapk: 2 shortened call traces with line numbers: https://paste.rs/zEE, any ideas what I could do/try next
23:59orbea: RSpliet: in case you are interested I think I found the hdmi sound issue, I noticed this in dmesg: "snd_hda_intel 0000:01:00.1: IRQ timing workaround is activated for card #1. Suggest a bigger bdl_pos_adj." and then found this https://bbs.archlinux.org/viewtopic.php?pid=1593613#p1593613