14:18ktp: My GPU keeps hanging randomly, where do I start debugging?
14:19ktp: Should I boot with some kernel parameter to get logs after restart?
14:20ktp: It started happening suddenly, on 6.12 and 6.14. I have overdrive enabled but I haven't experienced problems before, I was playing games and running stress tests without issues
14:24ktp: I'd appreciate hints, because I don't really know how to monitor/log this for further analysis
14:25ktp: Screen freezes and I can't do anything
14:37Remco: Step one is looking at logs
15:02ktp: Remco: where are logs in such case?
15:03Remco: dmesg is a safe bet to check
15:03Remco: But then that also gets logged in journald
15:06ktp: what if one does not have systemd?
15:07Remco: Then look through /var/log and see what is there
15:10ktp: I have only current dmesg in /var/log/dmesg.log, I don't have previous dmesg messages
15:14dwfreed: recommend setting up netconsole to another device on the local network, then
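For context on the netconsole suggestion: netconsole sends kernel messages as plain UDP packets, so the receiving machine only needs something listening on the target port. A minimal sketch of such a receiver follows, assuming the sender is configured to target port 6666 (the netconsole default remote port). The function names and the `netconsole.log` file name are hypothetical, chosen for illustration.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is assumed to match the sender's target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, log_path, max_packets=None):
    """Append each received packet to log_path, prefixed with the sender IP.

    Runs forever when max_packets is None; otherwise stops after that many
    packets. Returns the number of packets written.
    """
    count = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while max_packets is None or count < max_packets:
            data, (sender, _port) = sock.recvfrom(65535)
            text = data.decode("utf-8", errors="replace").rstrip("\n")
            log.write(f"{sender} {text}\n")
            # Flush after every packet so the log survives if this side dies too.
            log.flush()
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical output file name; stop with Ctrl+C.
    receive_lines(open_listener(), "netconsole.log")
```

Because the messages leave the hanging machine as they are printed, whatever the kernel manages to log before the freeze ends up in the file on the other device.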
15:25ktp: OK I think I have enabled some basic logging
15:25ktp: I will figure out how to connect to that machine and will read logs
15:25ktp: so if it's driver, it's dmesg
15:25ktp: I mean, kernel driver
15:26ktp: what if it's a problem with something like mesa? will it still pop up in dmesg?
15:31ktp: OK I need to close my connection but thanks for pointing out where to start looking
15:31ktp: bye
16:51ktp: I'm back with dmesg logs: https://bpa.st/DX2A
16:53ktp: It happened again, and concurrently I'm analyzing the Xwayland coredump that was left behind, using gdb and debuginfo
16:53ktp: if you see tainted flags, it's because of ZFS
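For context on the taint flags: the kernel exposes its taint state as a bitmask in `/proc/sys/kernel/tainted`, and each set bit maps to a letter shown in oops and warning output. The sketch below decodes a few of those bits. The table is deliberately partial and the function name is hypothetical. An out-of-tree, non-GPL module such as ZFS would typically be expected to show up as `P` and `O`, though that is an assumption about this particular setup.

```python
# Partial table of kernel taint bits (bit number -> letter, meaning).
# Only a few common flags are listed; the kernel defines more.
TAINT_FLAGS = {
    0: ("P", "proprietary (non-GPL) module was loaded"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return (letter, meaning) pairs for the known bits set in value."""
    return [
        flag
        for bit, flag in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    with open("/proc/sys/kernel/tainted", encoding="ascii") as f:
        mask = int(f.read().strip())
    for letter, meaning in decode_taint(mask):
        print(f"{letter}: {meaning}")
```

A mask of 0 means the kernel is not tainted. Bits missing from the table are silently ignored by this sketch.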
16:54ktp: I will be happy to file a bug report if someone would help me understand how to debug this further, or if that dmesg info is enough
17:01Remco: See if you can find it on https://gitlab.freedesktop.org/drm/amd/-/issues and if not, you can file an issue there
17:09zamundaaa[m]: ktp: Xwayland crashing on GPU reset is expected
17:09zamundaaa[m]: And not likely to be related
17:10ktp: zamundaaa[m]: I thought Xwayland was causing that
17:10ktp: ie. GPU reset
17:12ktp: I'm looking at that dmesg output, is that actually useful to anyone?
17:14ktp: because I was thinking about describing my findings from Xwayland coredump file, as a context
17:15ktp: but if you say it's the other way around, ie. GPU reset causing Xwayland to crash & leaving coredump behind
17:15ktp: then I'm not sure if only that dmesg is enough to report a bug?
17:25johnny0: ktp: https://gitlab.freedesktop.org/drm/amd/-/issues/4233
17:28johnny0: follow the link to 4238 for more info
17:28ktp: Oh my, I see this causes real mayhem out there
17:28ktp: johnny0: Yes, I also had similar screen freezes with 6.12, as my distro provides longterm and mainline kernels so I tested both
17:29ktp: but no dmesg from 6.12 one, unfortunately
17:31johnny0: yeah, the commit made it into 6.12 so that makes sense
17:32ktp: thank you very much for posting this, it's very uncomfortable not knowing what is causing such freezes, especially when you rely only on one PC
17:40johnny0: i hear ya. you should be able to roll back to 6.12.28 until the fix comes in (which is normally pretty fast)
17:42ktp: not that I'm complaining, I'm very happy with how AMD GPUs are supported these days (I'm old enough to remember fglrx days)
18:06ktp: johnny0: is 6.13.12 affected with that buggy commit?
18:12johnny0: no, 6.13.y no longer gets updates afaik
18:13ktp: Yes, it doesn't, I was just asking because cfb2d418 was introduced like a month ago? and the last 6.13 was released like Apr 20
18:14ktp: my distro doesn't provide 6.12.28, so I will probably go with 6.13 for now