IRC Logs of #radeon on irc.freenode.net for 2025-08-20

17:22 Venemo: karolherbst: some nice results, congrats! https://www.phoronix.com/review/rocm-rusticl-strix-halo
17:23 karolherbst: Yeah.. I kinda expected worse tbh :D
17:43 HdkR: oooo
17:44 HdkR: Some room for improvement but pretty dang good.
18:20 karolherbst: the kernel launch overhead is probably because of usermode command submission, I think ROCm uses it, but mesa doesn't?
18:29 agd5f: correct
19:00 Venemo: karolherbst, agd5f I assume by usermode command submission you mean user queues. shouldn't those actually have less overhead?
19:00 karolherbst: yeah, that's why ROCm has a lower kernel launch overhead, no?
19:01 karolherbst: I think there was somebody writing an MR for it, but not sure that was ever targeting mesa
19:02 agd5f: yeah, rusticl could target either ROCr or the KFD IOCTL interface if it wanted to use the same user queues as ROCm.
19:05 karolherbst: I suppose those interfaces aren't really suitable for 3D, so it's only viable on a compute-only screen. I think the issue in mesa was that it's not that simple (tm): radeonsi shares the pipe_screen across APIs within the same process, so I'm not sure it's easily doable without reworking quite a bit
19:06 karolherbst: I have other places where it's causing significant overhead in this area, so maybe I work on those first and see how close I'm getting
19:47 airlied: agd5f: could just expose user compute queues via amdgpu :-)
19:58 agd5f: airlied, sure, but they'd need mesa changes too.
19:59 airlied: indeed, but they would be more compatible changes
22:59 karolherbst: anyway.. I looked into why the latency is so different, and most of the reason is that timestamp queries aren't great in mesa and cause a lot of pointless overhead :')
23:04 HdkR: I do appreciate some pointless overhead.