17:50 aron: hi
17:51 aron: I have a GT710 (GK208B) with 2 monitors connected
17:51 aron: I see short white lines on both monitors at random locations
17:52 karolherbst: aron: like randomly when it's in use?
17:52 karolherbst: and what kind of monitors are those?
17:52 aron: they appear randomly, but go away if I move a window there
17:52 karolherbst: huh...
17:52 aron: one is a DSUB other is HDMI monitor
17:53 karolherbst: any errors in dmesg?
17:53 aron: I don't remember anything
17:53 aron: (I had to boot my old system, it was super frustrating)
17:54 aron: I can check in a second
17:54 karolherbst: okay, cool
17:58 aron: "dmesg | grep nouv" yields only some information about the card
17:58 karolherbst: mhhh
17:58 karolherbst: aron: what desktop are you using and what's your mesa version?
17:59 aron: xfce4, mesa-21.2.6
17:59 karolherbst: hard to say what's going on, because normally that stuff should work. I'd blame the cables, but given one is HDMI and it's happening on both...
17:59 karolherbst: mhhh
17:59 karolherbst: aron: compositing enabled in xfwm or disabled?
18:00 karolherbst: or is that on by default these days?
18:00 aron: it's on by default, but I instantly disable it
18:00 aron: I don't really like it
18:00 karolherbst: well... without compositing random glitches are not preventable sadly
18:01 karolherbst: could be something else though
18:01 aron: when I enable it, the short white lines move to a different location :)
18:01 karolherbst: ahh.. annoying
18:01 karolherbst: do you know if your X server uses the nouveau or modesetting driver?
18:01 aron: it's alpine linux, based on musl libc, not glibc
18:02 karolherbst: I hope that glibc vs musl isn't causing this, but who knows
18:03 aron: hmm
18:03 aron: (EE) Failed to load module "nouveau" (module does not exist, 0)
18:04 karolherbst: guess it uses modesetting then
18:04 aron: (II) modeset(0): 720x400@70Hz
18:04 aron: etc
18:04 karolherbst: which isn't bad, but given you are on mesa 21.2 you might also run on an older kernel where we fixed some issues around modesetting.
18:04 karolherbst: you could install the nouveau X driver and see if that helps
18:05 aron: kernel is 5.15.55
18:05 karolherbst: mhh, looks new enough
18:06 aron: installing xf86-video-nouveau
18:08 aron: reboot done
18:08 aron: pesky white lines GONE!
18:08 aron: thank you very much!
18:08 karolherbst: np
18:08 karolherbst: still wondering why it happens though..
18:08 karolherbst: probably some bug in glamor or mesa
18:12 aron: an unfortunate development: nouveau tends to crash
18:12 karolherbst: uhh... that's bad
18:12 aron: I'm trying to get some log
18:14 aron: trying dmesg first, let's see
18:17 aron: yes, I got a few lines of error
18:17 aron: [ 211.747682] nouveau 0000:01:00.0: fifo: FB_FLUSH_TIMEOUT
18:17 aron: [ 211.747809] nouveau 0000:01:00.0: fifo: CHSW_ERROR 00000001
18:17 aron: quite a few FB_FLUSH_TIMEOUT
18:18 aron: [ 211.937342] nouveau 0000:01:00.0: fifo: fault 00 [READ] at 0000000003021000 engine 1b [CE2] client 18 [HUB/GR_CE] reason 0c [UNSUPPORTED_KIND] on channel 2 [003fbdc000 Xorg[2893]]
18:18 aron: [ 211.937354] nouveau 0000:01:00.0: fifo: channel 2: killed
18:18 aron: [ 212.159337] nouveau 0000:01:00.0: DRM: GPU lockup - switching to software fbcon
18:18 aron: and then rest in pieces
18:25 aron: https://paste-bin.xyz/77373
18:25 aron: full log
18:25 aron: (not too long)
18:26 aron: software fbcon does not work, tho
18:26 aron: I need to reset
18:31 karolherbst: mhhh...
18:32 karolherbst: might be you are unlucky and nouveau isn't supporting your GPU very well :/
18:32 aron: first time I was able to login at least :)
18:32 aron: actually, I also think that
18:32 aron: according to this:
18:33 karolherbst: I am mildly aware that there are those FB_FLUSH_TIMEOUT issues around, but we also don't have a good idea on why that's even happening
18:33 aron: https://nouveau.freedesktop.org/CodeNames.html
18:33 aron: 01:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT
18:33 aron: 710] (rev a1) (prog-if 00 [VGA controller])
18:33 aron: it's not listed
18:33 aron: only 710M
18:33 karolherbst: we suspect that our own firmware is busted, but that's a really painful part of the driver
18:34 karolherbst: yeah.. though that's more of a "best effort" list. I usually recommend the wikipedia page: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
18:34 aron: oh, IC
18:35 aron: the funny thing, I have 2 cards, but the second card is for vfio
18:35 aron: this little 710 is just for X
18:37 karolherbst: might also be that bandwidth is an issue
18:37 karolherbst: could check if booting with "nouveau.config=NvClkMode0xf" fixes it
18:37 karolherbst: ehh
18:37 karolherbst: "nouveau.config=NvClkMode=0xf"
18:37 aron: where should I put this?
18:37 karolherbst: on the kernel command line
18:38 karolherbst: or as a module option via modprobe.conf
18:38 karolherbst: but updating the kernel command line with "grubby" is usually straightforward: grubby --update-kernel=ALL --args="nouveau.config=NvClkMode=0xf"
18:39 karolherbst: _but_ not sure if your distro even wires all of that up
18:39 aron: I just modify the grub.cfg
18:39 karolherbst: or that
18:39 aron: if it works, I do it properly
18:42 aron: yay, login is now successful
18:42 karolherbst: interesting...
18:42 karolherbst: I'd bet that this also fixes those glitches with the modesetting driver
18:43 aron: what does it do, btw?
18:43 karolherbst: increases GPU clocks
18:43 aron: okay, it does not hurt much
18:43 karolherbst: you can check "/sys/kernel/debug/dri/0/pstate" for available perf states
18:44 karolherbst: might be that you have a third one in the middle which is "good enough" but doesn't increase power consumption/heat generation too much
18:44 aron: 07: core 405 MHz memory 810 MHz
18:44 aron: 0f: core 653-954 MHz memory 5010 MHz AC DC *
18:44 aron: AC: core 953 MHz memory 5009 MHz
18:44 karolherbst: ahh yeah.. only have two then
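The pstate listing pasted above can be read programmatically. The following is a minimal sketch, assuming the file keeps the line layout shown in the chat (an id, then "name N MHz" or "name N-M MHz" pairs, then optional flags). The function name parse_pstate and the "starred" field are hypothetical; the chat does not say what the trailing "*" means, so it is only recorded, not interpreted.

```python
import re

# One line looks like: "0f: core 653-954 MHz memory 5010 MHz AC DC *"
# The id is either a two-digit hex value or the literal "AC"/"DC".
LINE_RE = re.compile(r"^(?P<id>[0-9a-fA-F]{2}|AC|DC):\s*(?P<rest>.*)$")
# A clock entry is "<name> <low>[-<high>] MHz".
CLOCK_RE = re.compile(r"(\w+)\s+(\d+)(?:-(\d+))?\s+MHz")


def parse_pstate(text):
    """Parse pstate text into a list of dicts (layout assumed from the chat)."""
    states = []
    for line in text.splitlines():
        line = line.strip()
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that does not look like a pstate line
        clocks = {}
        for name, low, high in CLOCK_RE.findall(match.group("rest")):
            # A single value is stored as a (value, value) range.
            clocks[name] = (int(low), int(high) if high else int(low))
        states.append({
            "id": match.group("id"),
            "clocks": clocks,
            "starred": line.endswith("*"),  # meaning of "*" not stated in the chat
        })
    return states
```

On a real system the input would come from reading /sys/kernel/debug/dri/0/pstate, which typically requires root and a mounted debugfs.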
18:45 karolherbst: given that's a 710 anyway, you probably increase power consumption from 5 to 9 W or so :D
18:45 karolherbst: anyway... long term we want that to happen automatically, but given that the clocking code is broken for quite a lot of cards...
18:45 aron: nice
18:45 karolherbst: is it a passively cooled GPU?
18:46 aron: yup, it is
18:46 karolherbst: might want to make that a "nouveau.config=NvClkMode0xf,NvPmEnableGating=1" instead so you reduce power consumption a little
18:46 aron: it was a "design choice"
18:47 karolherbst: NvPmEnableGating is a feature we are not sure is 100% reliable, but it does reduce power consumption
18:47 karolherbst: so the GPU might stay a bit cooler
18:47 aron: (first I tried open nvidia kernel driver module, but this card was not supported, what a shame)
18:47 aron: (reboot)
18:48 karolherbst: only talking about 10-20% though, but for passively cooled cards it might matter
18:49 karolherbst: anyway.. I'd monitor the GPU temperature a bit with the high clocks just in case
18:52 aron: it's worse now :)
18:52 karolherbst: :(
18:52 aron: maybe that missing "=" sign again?
18:52 karolherbst: guess there is a bug in the ohhh
18:53 karolherbst: yeah...
18:53 karolherbst: copy&paste fail
18:53 aron: I didn't notice that either :P
18:53 aron: let me fix that quick
18:55 aron: yep :)
18:55 aron: it did the trick
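The missing "=" that tripped things up twice in this conversation is easy to catch mechanically. This is a rough sketch, assuming every entry in nouveau.config is meant to be a comma-separated KEY=VALUE pair, as in the examples in this chat. The helper name malformed_nouveau_options is hypothetical, and the check only looks at the shape of each entry; it does not know which option names nouveau actually accepts.

```python
PREFIX = "nouveau.config="


def malformed_nouveau_options(cmdline):
    """Return nouveau.config entries that are not of the form KEY=VALUE."""
    bad = []
    for token in cmdline.split():
        if not token.startswith(PREFIX):
            continue  # some other kernel parameter
        for entry in token[len(PREFIX):].split(","):
            key, sep, value = entry.partition("=")
            # Flag entries with no "=", an empty key, or an empty value.
            if not (key and sep and value):
                bad.append(entry)
    return bad
```

Feeding it the contents of /proc/cmdline after a reboot would show whether the option string that actually reached the kernel has the expected shape.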
18:55 aron: oookay, now I can dump the open nvidia drivers
18:56 karolherbst: have fun
18:56 aron: haha, I have 14 cooling_device
18:56 aron: will be fun to find the correct one
18:57 aron: bunch of CPUs
19:00 aron: it's here: /sys/class/graphics/fb0/device/hwmon/hwmon1
19:00 karolherbst: nvidia exposes a hwmon file now?
19:00 karolherbst: crazy
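For keeping an eye on the temperature as suggested earlier, a hwmon directory like the one above can be polled directly. A minimal sketch, assuming the directory contains a temp1_input file holding millidegrees Celsius (the usual hwmon sysfs convention; the chat only gives the directory, so the file name is an assumption). The function name read_temp_celsius is hypothetical.

```python
from pathlib import Path


def read_temp_celsius(hwmon_dir, sensor="temp1_input"):
    """Read a hwmon temperature file and convert millidegrees C to degrees C."""
    raw = Path(hwmon_dir, sensor).read_text().strip()
    return int(raw) / 1000.0
```

With the path from the chat this would be called as read_temp_celsius("/sys/class/graphics/fb0/device/hwmon/hwmon1"); the hwmon index can differ between boots and machines.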
19:02 aron: thanks for the help again
19:03 aron: I think I can make some progress with the system now
19:03 karolherbst: np
19:20 mynacol: karolherbst: Doesn't NvClkMode require a non-hex value? I have NvClkMode=15 for this reason. See https://nouveau.freedesktop.org/KernelModuleParameters.html
19:20 mynacol: And thanks for the tip with NvPmEnableGating, trying that now as well
19:32 karolherbst: mynacol: I think it can handle both
19:33 karolherbst: well.. obviously it does handle both
19:33 mynacol: Then you might adapt the documentation? ;)