IRC Logs of #radeon on irc.freenode.net for 2025-02-28

09:58 tzimmermann: amdgpu keeps imported GEM objects in the GTT domain. see https://elixir.bootlin.com/linux/v6.13.4/source/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c#L840 and below
09:59 tzimmermann: the test at that url only works for imported objects. but what about exported GEM objects? shouldn't they be in GTT as well?
11:13 soreau: mareko: what kind of wayfire bug do you think would cause 12707? it's been working fine for years on many different drivers (including rpi and mobile), and there have been no changes to the wayfire GL code.. but somehow you think this could be a compositor problem when the issue appears (only on radeonsi) when the GL name reuse is in play?
11:14 soreau: The complication with assuming wayfire is at fault, is that there is nothing much else like it to try and repro the bug - I think it happens after using geometry shaders (which is what the cube does with gles3)
11:30 mareko: soreau: so far it seems to be a mesa bug
11:30 mareko: I commented there
11:31 soreau: oh..
11:32 soreau: mareko: but wouldn't 12710 be the dup since 12707 was reported before? :P
12:21 soreau: I built wayfire without gles3 so cube doesn't use geometry shaders but this didn't seem to help
12:56 soreau: mareko: reading through !32715.diff, it seems that it enables name reuse for VAO, DisplayList, Programs and ATIShaders from gl_shared_state shared* additionally
12:58 soreau: would it be useful to know if setting these to false unconditionally helps? or finding out which or what combination causes the problems
13:08 soreau: or to check if it happens with the commit before and force enable it
15:02 MrCooper: agd5f: doesn't that defeat the point of hotplug events, if the compositor has to keep polling just in case anyway?
15:06 agd5f: MrCooper, you'd only do it when you get input and all of the monitors are in the DPMS off state
15:07 agd5f: but even then, that wouldn't fix the problem of not turning on the monitor until after you've tried to wake the displays and the display was physically turned off
15:07 MrCooper: tzimmermann: I submitted patches fixing that a year ago, message ID of first patch is 20240222172821.16901-1-michel@daenzer.net; any help trying to convince Christian appreciated :)
15:09 tzimmermann: MrCooper, how do I find this from the message id? or do you have a full link?
15:10 MrCooper: agd5f: indeed; does turning on the monitor really not generate an HPD event though?
15:10 agd5f: MrCooper, seems some monitors don't
15:10 MrCooper: tzimmermann: not sure, patchwork seems down
15:10 agd5f: tzimmermann, https://www.spinics.net/lists/amd-gfx/msg103923.html
15:10 MrCooper: cue sad trombone
15:11 tzimmermann: thanks, i'll read through it
15:11 MrCooper: thanks Alex
15:15 agd5f: MrCooper, I was thinking of the case where you have say 2 GPUs and only one has displays attached. The second one without displays gets powered down. Then you plug a display into the second GPU, but since it's powered down, you don't get HPD interrupts. So there is no automatic way to get notified of the hotplug on the second GPU.
15:36 Venemo: agd5f: is DC now the default code path on GFX6-7?
15:38 MrCooper: agd5f: right, we have customers hitting that
15:40 MrCooper: I guess there might be no way around some mechanism to trigger the compositor explicitly probing in that case
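The condition agd5f gives at 15:06 (only re-probe on input while every monitor is in the DPMS off state) could be sketched roughly as below. This is an illustration, not how any particular compositor does it: it assumes the per-connector `status` and `dpms` attributes under /sys/class/drm are readable, and `all_displays_off` is a hypothetical helper name. A real compositor would more likely query connectors through the KMS API.

```python
from pathlib import Path


def read_attr(connector_dir, name):
    """Return the stripped content of a sysfs attribute, or '' if it is missing."""
    path = connector_dir / name
    return path.read_text().strip() if path.is_file() else ""


def all_displays_off(sysfs_root="/sys/class/drm"):
    """Return True if no connected connector currently reports DPMS 'On'.

    Connector directories are assumed to be named like card0-DP-1.
    With no connected connectors at all this also returns True.
    """
    connectors = sorted(Path(sysfs_root).glob("card*-*"))
    connected = [c for c in connectors if read_attr(c, "status") == "connected"]
    return all(read_attr(c, "dpms") != "On" for c in connected)
```

A compositor following this idea would call the check from its input handler and run an explicit connector probe only when it returns True, so it would not poll continuously.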
15:43 Venemo: are we sure the issue is that the monitor doesn't send the event? and it's not a case of the event getting lost somehow?
16:28 agd5f: Venemo, no. DC doesn't support analog encoders and gfx6-7 still have DACs.
16:30 hakzsam: is there a way to disable amdgpu zerovram allocations?
16:44 Venemo: agd5f: if I'm having display issues, is it worth trying amdgpu.dc=1 ?
16:45 agd5f: Venemo, sure if you aren't using an analog display
16:46 Venemo: this card only has 2 DP
16:46 agd5f: ok
16:46 Venemo: agd5f: also, on a side note, Hawaii seems to be really really buggy with the old radeon kernel driver, it's basically useless
16:53 Venemo: agd5f: looks like the system cannot boot on this GPU (Oland) with amdgpu.dc=1
16:55 Venemo: basically not even plymouth shows up. it just boots into a black empty screen
16:58 agd5f: I guess it broke at some point. Worked years ago when I last tried it.
16:59 Venemo: :D
16:59 Venemo: how would I go about diagnosing what's wrong with it?
16:59 agd5f: anything in dmesg?
17:00 Venemo: well, how do I get the dmesg if the system didn't boot?
17:02 agd5f: boot to a non-GUI runlevel and with modprobe.blacklist=amdgpu on the kernel command line in grub, then connect to the system over ssh and then modprobe amdgpu over ssh
17:02 Venemo: wow
17:02 agd5f: or maybe just ssh in if the system is up, but just no display
17:02 Venemo: I'm honestly not sure which is the case.
17:02 Venemo: I was thinking maybe I can attach a USB serial port and have the kernel output the log to that
17:03 agd5f: could do that too
17:03 Venemo: to be honest I'm not sure if I can do this right now, because I have a few more urgent things to do. but I'd like to come back to this topic eventually
17:05 Venemo: I'm curious, what is the typical way you guys develop the kernel driver?
17:05 Venemo: I assume you develop on a separate machine than the one that has the gpu you are developing for
17:05 agd5f: right
17:06 agd5f: I do most stuff on it over ssh
17:06 Venemo: sounds like there are a lot of reboots involved too
17:06 agd5f: yup
17:07 Venemo: or can you compile amdgpu as a loadable module and then just load/unload it? does that sound stupid?
17:11 agd5f: you can do that too. depends on what you are debugging/developing
17:25 Venemo: in theory, could one use a VM with GPU passthrough for development?
17:26 Venemo: for example, if I boot my system with my monitors connected to the iGPU, can I then pass the dGPU to a VM and deal with it that way?
17:26 Venemo: or is that not feasible?
18:07 chithead: Venemo: that is possible, with some limitations. your system needs an iommu and it needs to be enabled. also there are gpu reset problems
18:18 Venemo: chithead: interesting. what are the gpu reset problems in this scenario?
18:24 colo: Venemo: netconsole could also help you debug this, I guess
18:25 Venemo: colo: what is netconsole?
18:25 colo: btw, a KDE Plasma dev shared this suggestion about re-triggering DP HPEs: `sudo udevadm trigger --action=change /dev/dri/card1` (I still have to try it :))
18:25 colo: Venemo: it relays your kernel debug ringbuffer over UDP
18:25 colo: -> https://wiki.archlinux.org/title/General_troubleshooting#netconsole
18:25 Venemo: ah. I am unsure if network is up or not when (whatever the issue is) happens
18:26 Venemo: but it's good to know I have good options
18:26 Venemo: I honestly don't have the energy to look into it right now, so I just plugged back my normal GPU and moved on
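For the netconsole suggestion, a minimal sketch of both ends might look like this. The parameter layout follows the kernel's documented `netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]` form. The helper names and any IP addresses passed in are placeholders chosen for illustration, not values from this discussion.

```python
import socket


def netconsole_param(target_ip, target_port=6666, source_port=6665,
                     source_ip="", device="", target_mac=""):
    """Build a netconsole= string for the kernel command line or modprobe.

    Empty optional fields are left blank so the kernel uses its defaults.
    """
    return (f"netconsole={source_port}@{source_ip}/{device},"
            f"{target_port}@{target_ip}/{target_mac}")


def listen(port=6666, host="0.0.0.0"):
    """Print kernel log datagrams as they arrive; run this on the receiving machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _addr = sock.recvfrom(65535)
            print(data.decode(errors="replace"), end="")
```

As Venemo notes, this only helps if the network interface comes up before the failure occurs.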
18:51 chithead: reset bug happens if you e.g. reboot the vm, or shutdown and start again without a host reboot in between. it affects a number of radeon cards
18:52 chithead: for older cards there is https://github.com/gnif/vendor-reset
21:24 Venemo: chithead: thanks, that is very interesting info. though that suggests that this kind of setup may not be a good choice for the kind of development I was asking about