07:34 vedm: is modprobe -r amdgpu; modprobe amdgpu supposed to not crash, i.e. should I report if I find it crashing?
07:46 MrCooper: yes in principle, though it's always been fragile
07:48 vedm: MrCooper: I will gladly do it
08:00 Venemo: vedm: unfortunately, it is very prone to crashing, especially when your GPU is in a problematic state (e.g. after a GPU hang and such)
08:11 vedm: Venemo: on a Kaveri, even modprobe -r crashes; on Kabini it works, but the subsequent modprobe crashes the machine and requires a hard reboot; on Raven2 (R1505G) both commands pass, but dmesg is red with errors and I would expect the GPU to be non-functional afterwards
08:12 vedm: I am talking clean boots, no prior GPU errors in dmesg
08:12 Venemo: yeah, then it's even worse than I had thought
08:21 vedm: Venemo: is there a name for this load/unload feature that I should use in the reports? GPU reinitialization perhaps?
08:21 Venemo: I don't know
08:22 Venemo: technically afaiu what you mean is unloading the amdgpu module
08:22 vedm: yes, and then loading it again, which forces drm to reinitialize the GPU
08:24 vedm: MrCooper: Venemo: thx for the help, I have all the info I need
09:56 Venemo: vedm: do you mind some questions?
09:57 Venemo: vedm: do you actually still use Kabini and/or Kaveri or just trying it for curiosity? and second, does amdgpu give you any benefit on these systems compared to the default radeon driver?
10:01 vedm: Venemo: absolutely not. Kaveri is my mother's desktop machine, it's fine for her mail/office work, and Kabini is the (FreeBSD) NAS, which happens to use the amdgpu DRM driver via LinuxKPI. The reason I need it on a headless machine is that the temperature and power consumption go down significantly after loading the module
10:01 vedm: so, why amdgpu on both instead of radeon
10:02 vedm: on Kaveri, radeon driver is unstable and one of the rings hangs after a while, I should probably report that
10:02 vedm: it's a regression
10:02 Venemo: vedm: are you saying it consumes less power on amdgpu? that is surprising
10:02 vedm: on Kabini, FreeBSD for some reason loads amdgpu by default even though radeon supports the device
10:03 vedm: Venemo: no, by default, with no driver loaded, it consumes more than when a driver is loaded
10:03 vedm: so headless machine benefits from having (either) driver loaded
10:03 Venemo: right, that makes sense. you get no power management without drivers
10:04 Venemo: vedm: for the regression, I think the best way to make progress on that is to bisect. without that, I don't think anyone has the energy to investigate
10:04 vedm: honestly, I am not sure how well it works with regard to 3D/video/compute, I tried running GROMACS OpenCL somewhat recently for fun and managed to crash the driver in some of the tests
10:05 Venemo: the OpenCL situation is definitely a mess on old GPUs, but it depends on which drivers you tried
10:05 vedm: Venemo: radeon on Kaveri? I think I could do that, compile times are reasonable, and it's also possible to compile on another machine
10:06 vedm: Venemo: rusticl, clover crashed the driver on FreeBSD due to some interface problem
10:06 vedm: but these are all radeonsi, so it should be OK, I guess
10:06 Venemo: I am not sure if anyone here is testing freebsd
10:06 Venemo: I definitely never did
10:07 vedm: it's running DRM 6.1, 6.6 if you are brave and compile it yourself
10:07 vedm: so it is a bit behind what is being worked on, but eventually new stuff will trickle down anyway
10:07 Venemo: I'm sorry but I don't have the personal bandwidth for that.
10:08 vedm: of course, nobody expects it to be supported as well, I always make sure to reproduce on Linux with latest kernel before reporting it
10:08 vedm: and bisecting an amdgpu DRM issue on FreeBSD is basically impossible
10:09 vedm: what I can tell for sure is that GROMACS ran with Clover on radeonsi ~8 years ago without crashing either Mesa or the driver
10:10 vedm: so I guess I should give it a shot on Linux at some point and report if it actually crashes the driver there as well
10:11 Venemo: thanks, appreciated
10:11 vedm: Venemo: anyhow, thanks for asking, my (old) AMD machines are pets in a way, I like finding bugs and bisecting regressions, but the APUs are useful long beyond their official support
10:12 Venemo: main issue I got with those old APUs, is that I don't have the HW, and it's not easy to find. and the CPU part of them is pretty weak, so probably also a pain to work with.
10:14 vedm: Venemo: agreed, and if they are SoCs it's even worse, e.g. for bisecting https://gitlab.freedesktop.org/drm/amd/-/issues/3448 I had to compile the kernel on a different machine, otherwise I would still be narrowing the commit range
10:14 Venemo: exactly
10:16 Venemo: these days I am trying to fix up some things for SI and CI (that is GCN 1-2) dGPUs in amdgpu
10:17 Venemo: the APUs are problematic for the above reasons, especially Kaveri
11:39 johnny0: vedm: i tested clover on hawaii + amd-staging-drm-next a few weeks ago and it crashed the driver. Using the old amdgpu-pro orca lib still works though. I'd expect the same situation on kaveri
11:51 Venemo: johnny0: clover is dead, try rusticl
11:52 johnny0: aye, i thought I *was* trying rusticl which is the only reason I was testing it
12:04 Venemo: did rusticl work then?
12:09 johnny0: sorry, i bailed after that
12:17 Venemo: I see
12:21 vedm: johnny0: nice to hear that it's not APU-specific
12:23 vedm: johnny0: oh, and thanks for figuring out that regression with small UMA, I still have to test with a newer kernel, it will be helpful to increase the available memory on that machine
13:19 johnny0: vedm: right on, hope it works out for you
14:38 DottorLeo: hi!
14:48 Venemo: hi
19:40 vedm: agd5f: I can't edit xorg-wiki; would adding mention of GCN generations in this way to the RadeonFeature page https://paste.centos.org/view/78c988a0 be acceptable?
19:49 Venemo: vedm: we already got this: https://github.com/torvalds/linux/blob/master/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv