IRC Logs of #dri-devel on irc.freenode.net for 2025-02-24

07:49 Ristovski: I can "bypass" glvnd (I have mesa built without it) while using nvidia proprietary by something silly like LD_PRELOAD, is there a similarly-naughty incantation that would work for EGL on Wayland as well?
07:54 mahkoh: On AMD and Intel cards, does it make sense to increase VRAM usage to remove dependencies between render passes, i.e. render passes writing to the same buffer, so that they can be rendered to in parallel? Or do those cards always process render passes in series anyway?
07:54 mahkoh: I'm using a single vulkan graphics queue.
07:57 sima: if the vk driver doesn't have multiple queues, then neither does the hw
07:57 sima: and I think for render that holds for everything (unless it's some multi-chip monster)
07:58 mahkoh: My thinking was: if two render passes are only able to use 50% of the GPU capacity each, could the GPU run them in parallel. I guess AMD and Intel can't do this?
08:01 mahkoh: More concretely: I'm starting to use intermediate 16-bit blend buffers in my compositor. The question is: If I have two 1080p monitors, is it sufficient to use a single blend buffer or should I allocate 1 blend buffer per monitor to potentially decrease latency.
08:02 mahkoh: I assume the rendering operations will always use 100% of the GPU anyway. But there might be situations I haven't considered.
08:17 MrCooper: dj-death: sounds like either GitLab accidentally associated a non-MR pipeline with the MR, or there's something wrong in the CI configuration
08:18 MrCooper: Ristovski: why wouldn't that work for EGL on Wayland?
08:21 Ristovski: MrCooper: Not sure, `eglinfo -B` tries to load nouveau which obviously fails at `eglInitialize` since I am using the proprietary ones
08:23 MrCooper: eglinfo prints information about all EGL platforms, nvidia probably doesn't support all of them
08:23 Ristovski: I am aware, yet there is only mesa/llvmpipe in the output, which is peculiar
08:53 MrCooper: maybe try running it in strace to see where it's picking up stuff from
15:31 dj-death: have people noticed that the new shader cache code uses flock?
15:31 dj-death: and through that mechanism, any random application can essentially take a lock on your compositor
15:31 dj-death: running into that multiple times locally
15:31 dj-death: that's pretty terrible
15:32 zmike: yes
15:32 zmike: it's very bad
15:33 MrCooper: not the only issue with the new cache, colour me surprised it's still the default
15:34 dj-death: like I can't open nautilus right now
15:34 dj-death: because of flock
15:34 zmike: yup
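[editor's sketch] The flock problem described above can be reproduced in miniature. This is a minimal Python sketch, not Mesa's cache code: the lock file name `cache.lock` and the helpers `try_lock` and `demo` are hypothetical, and it assumes Linux, where `flock` locks belong to the open file description, so a second `open` of the same file contends with the first even inside one process. Passing `LOCK_NB` makes the second locker fail immediately instead of hanging, which is one way a client could avoid being stalled by whoever holds the lock.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Try to take an exclusive flock without blocking; return True on success."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Someone else holds the lock; a blocking flock() would hang here.
        return False


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cache.lock")  # hypothetical lock file name
        # Two separate opens stand in for two unrelated processes.
        holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        waiter = os.open(path, os.O_RDWR)
        try:
            first = try_lock(holder)    # "random application" takes the lock
            second = try_lock(waiter)   # "compositor" cannot get it
            fcntl.flock(holder, fcntl.LOCK_UN)
            third = try_lock(waiter)    # succeeds once the holder lets go
            return first, second, third
        finally:
            os.close(holder)
            os.close(waiter)


if __name__ == "__main__":
    print(demo())
```

Without `LOCK_NB`, the second call would block for as long as the first holder keeps the lock, which matches the stall described in the log.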
15:37 MrCooper: digetx: ^ remember https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31229#note_2613970 ? :)
15:40 Company: make the client/server thing an xdg portal so it works for flatpaks!
15:47 MrCooper: there's no client/server thing
15:49 Company: yet!
15:50 Company: I just wanted to point out the problem that sharing the cache is happening less these days
15:51 MrCooper: not sure what you mean by that
16:02 pac85: mahkoh the vulkan spec says that commands always begin execution in the order they are issued, but can end out of order. This means that if you have two passes one after the other in a cb the first draw of the second pass must begin execution after the last draw of the first. Assuming there are no barriers that prevent it you can have some overlap between the two passes (so some draws from the previous pass might still be doing work when
16:02 pac85: the second pass starts) but in practice this is limited to a handful of draws for desktop GPUs. AMD technically does have more than one GFX queue but I don't think that's exposed. If it were you would have to split your passes into two separate submissions to take advantage of it. So to answer your question, removing dependencies between passes can have some benefit but how much depends on the workload. There is some mobile hw out there
16:02 pac85: with separate queues for vertex and fragment in which removing dependencies between passes allows for quite a bit of parallelism (those GPUs can't overlap vertex and fragment work unless they are from different passes, unlike most desktop HW and some mobile GPUs which can do it at a draw granularity).
16:26 mahkoh: "first draw of the second pass must begin execution after the last draw of the first" - if the two render passes share no resources, then I suppose this would be indistinguishable from draws running in any order, which the GPU could exploit to make use of available parallelism
16:27 mahkoh: but for now I'm going with a single buffer shared by all monitors
16:42 alyssa: glehmann: supporting everything in divergence is on my todo list
16:42 alyssa: or rather, describing divergence properties in nir_intrinsics.py
16:43 alyssa: (at which point we can build-time assert that everything is described)
16:43 alyssa: just haven't figured out exactly how that should look
17:02 pac85: mahkoh well yeah execution beginning in order doesn't have many semantic implications for most commands but it does match how most implementations on desktop work. (tiling GPUs somewhat do reorder but not the way you suggest).
17:14 digetx: MrCooper: roughly remember, will try getting to it sooner
17:23 alyssa: gfxstrand: is this silly? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33654#note_2796146
17:28 alyssa: [I can relatively easily write the coccinelle to port most of the tree to that, but i can't decide if it's sensible]
17:31 dcbaker: Okay, after being sick for 3 weeks and then having inclement weather... I am finally almost caught up on everything. I'm planning to cut the last 24.3.x release on Wednesday, and I'll be sending out an email to the list with all of the patches that do not apply to get dev feedback on what they want to get backported before the release happens
17:32 alyssa: dcbaker: sorry to hear about the illness :(
17:33 dcbaker: Thanks. It's kinda my fault for not going to the doctor sooner. Listen kids, don't be a stubborn old man like me. lol
17:34 alyssa: :(
17:58 glehmann: alyssa: I think your nir_progress idea is quite neat
17:59 alyssa: glehmann: thanks :)
17:59 alyssa: my main reservation is that it won't be practical to convert 100% of the tree (just because coccinelle isn't perfect and so on)
17:59 alyssa: and IDK if we end up in a very confusing spot if we have half-way coverage
18:00 alyssa: although maybe it'd be ok? and we'll see how close cocci can get
18:00 alyssa: there are only ~382 nir_metadata_preserve calls in tree so if cocci can get most of them, that's a good spot
18:02 glehmann: I think we should fully replace the old helper if we decide to adopt the new one, even if it will lead to weird code like if (progress) nir_progress(true, ) else nir_progress(true, ) in a few places until someone decides to clean it up
18:02 gfxstrand: alyssa: I don't hate it. :)
18:03 alyssa: glehmann: yeah that's fair
18:04 alyssa: gfxstrand: alright. I will see what I can cook up then :}
19:12 robclark: alyssa: ../src/asahi/meson.build:16:15: ERROR: Unknown variable "dep_iokit".
19:20 alyssa: robclark: uhoh
19:21 alyssa: going to need more context for that one..
19:23 robclark: alyssa: I suspect it is because I don't have hk enabled?
19:23 alyssa: robclark: that should be fine. what commit is this, OS, etc
19:23 alyssa: (It builds fine e.g. in CI so clearly something is different with your env)
19:23 robclark: ToT from this morning, fedora
19:24 alyssa: what drivers do you have enabled?
19:24 robclark: https://paste.centos.org/view/77699f0f
19:24 alyssa: ohhhhh
19:24 alyssa: i see what's happening
19:25 alyssa: yes ok that's... entertaining
19:25 robclark: enabling asahi "fixes" it (if there was any doubt)
19:26 alyssa: robclark: https://rosenzweig.io/0001-asahi-fix-Mesa-build-when-asahi-is-not-built-but-too.patch
19:26 alyssa: "tools=asahi but neither gl/vk driver" was the untested build combo you hit
19:27 robclark: oh... blah... -Dtools=all strikes again
19:27 robclark: but r-b for that patch
19:27 robclark: maybe we should make -Dtools=all more clever
19:28 alyssa: konstantin: ^ can you cherry-pick that patch into your MR?
19:28 alyssa: thx
19:32 konstantin: alyssa: https://rosenzweig.io/0001-asahi-fix-Mesa-build-when-asahi-is-not-built-but-too.patch ?
19:33 alyssa: yes
19:33 alyssa: thanks
19:33 alyssa: we don't need two "fix asahi's meson.build files" MRs in the merge queue at once ;)
19:34 konstantin: Warming a lot of air at the hw farms...
19:37 alyssa: real
19:37 alyssa: i still need to setup one of those..
19:47 alyssa: glehmann: gfxstrand: ok class what do we think of https://rosenzweig.io/0001-Via-Coccinelle-patch.patch
19:47 alyssa: IDK what to make of all these `if(progress){preserve()}` constructions
19:47 gfxstrand: I like the diffstat
19:48 alyssa: what happens if there's not progress for those?
19:48 alyssa: are all of those passes broken across the board?
19:48 gfxstrand: Yeah, I think so
19:48 gfxstrand: It should always be if (progress) preserve() else preserve(none)
19:48 alyssa: yeah..
19:49 alyssa: I don't know that I want to do a functional change in an automated treewide patch of this size..
19:49 gfxstrand: fair
19:49 gfxstrand: so do two patches
19:49 alyssa: hmm, although nir_metadata_preserve(all) should be harmless
19:50 alyssa: like if there are other preserve calls elsewhere
19:50 alyssa: because it ANDs
19:50 alyssa: maybe
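[editor's sketch] The "because it ANDs" reasoning can be illustrated with a toy model. This is Python, not NIR: the `Metadata` flags, the `Impl` class, and the `preserve` and `progress_helper` functions are hypothetical stand-ins, and the helper's shape is an assumption based only on what the log says about the nir_progress idea. The model assumes a preserve call intersects the currently valid metadata with the preserved set, so preserving everything changes nothing.

```python
from enum import IntFlag


class Metadata(IntFlag):
    """Hypothetical metadata bits, loosely modeled on the discussion."""
    NONE = 0
    BLOCK_INDEX = 1
    DOMINANCE = 2
    LIVE_DEFS = 4
    ALL = 7


class Impl:
    """Toy stand-in for a function implementation tracking valid metadata."""
    def __init__(self, valid=Metadata.ALL):
        self.valid = valid


def preserve(impl, preserved):
    # Assumed semantics: only metadata that was valid AND is preserved survives.
    impl.valid &= preserved


def progress_helper(progress, impl, preserved):
    # Hypothetical helper: with progress, keep only `preserved`;
    # without progress, preserve everything (a no-op under AND semantics).
    preserve(impl, preserved if progress else Metadata.ALL)
    return progress


if __name__ == "__main__":
    impl = Impl(Metadata.BLOCK_INDEX | Metadata.DOMINANCE)
    progress_helper(False, impl, Metadata.BLOCK_INDEX)
    print(impl.valid)  # unchanged when there was no progress
    progress_helper(True, impl, Metadata.BLOCK_INDEX)
    print(impl.valid)  # narrowed to the preserved set
```

Under this model an extra preserve-all call after an earlier, narrower preserve cannot revalidate anything, which is why it would be harmless.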
19:54 * alyssa tries it
19:58 alyssa: ok this looks better
19:58 alyssa: it *really* wants a clang-format though..
20:14 alyssa: gfxstrand: mind if I re-clang format compiler/nir/?
20:44 alyssa: ok, MR up
20:44 alyssa: I hogged all the labels again
20:58 Kayden: - /* clang-format off */
20:58 Kayden: + /* clang-format off */
20:58 Kayden: is particularly amusing
20:59 robclark: lol
20:59 Kayden: no, clang-format, NO!
20:59 Kayden: hehe
20:59 kisak: what are you doing step-clang-format?
21:02 mattst88: kisak: lol
21:04 alyssa: Kayden: I noticed that yeah
21:09 Kayden: (to be clear that isn't a complaint, I'm just amused)
21:12 uriah: pepp: quick question, does amdgpu native context require kernel 6.14 to work correctly or should 6.13 be fine?
21:16 uriah: i ask because digetx's intel merge request mentions 6.14, just wondering if it is the same for amdgpu
21:43 robclark: uriah: IIRC there was some kvm bug fix that was needed? But digetx would know better the situation on x86
21:44 uriah: thanks robclark, i will try digging it up
21:44 uriah: is it possibly a cause for stuttering, or would that likely be user (me) error?
21:49 uriah: robclark: might these 4 commits be the relevant ones? https://gitlab.freedesktop.org/digetx/linux/-/commits/native-context-iris
21:49 uriah: i'm not 100% sure that virtio-wl is required for amdgpu but perhaps the others
21:50 robclark: virtio-wl is for wayland proxy to host compositor, so not needed if you are using a guest compositor
21:50 uriah: ok
21:51 robclark: I think the others should not be mandatory... in any case, they are all guest kernel patches
21:53 uriah: i see
21:53 uriah: good to know
22:11 robclark: ugg... why does piglit still use cmake.. and what does `Could NOT find PythonNumpy (missing: PythonNumpy_STATUS)` mean (I do ofc have python3-numpy installed)
22:12 dcbaker: robclark: I have some patches around somewhere...
22:12 airlied: usually means it's picked up a different python
22:12 dcbaker: not complete
22:13 robclark: hmm, does look like I have py 3.12 and 3.13 (but nothing more ancient)
22:13 robclark: hmm, cmake picked 3.12.9
22:14 robclark: ok, removing 3.12.9 did the trick.. ugg
22:15 airlied: -ETOOMANYSNAKES
22:15 robclark: indeed
22:27 mattst88: there's some argument you can pass to cmake to tell it which python to use, but I can't find it in my shell history...
22:29 dcbaker: -DPython_EXECUTABLE IIRC
22:44 mattst88: yeah, that sounds like it