IRC Logs of #dri-devel on irc.freenode.net for 2024-11-06

01:39 Lynne: wrote a piece of code that runs much faster when using BDA than regular array accesses+inout everywhere
01:39 Lynne: who said they're useless and the overhead isn't worth it?
01:42 Lynne: it does take uncomfortably long to be compiled to spirv, apparently the compiler chokes on it
08:10 jfalempe: tzimmermann: it looks like there is an issue with drm_client_setup(), the kernel test robot complains on my drm_log series: https://lists.freedesktop.org/archives/dri-devel/2024-November/476830.html
08:18 jfalempe: tzimmermann: ok it's my patch, I think drm_log should also select DRM_CLIENT_LIB
08:30 jfalempe: tzimmermann: is it fine if I change DRM_CLIENT_SELECTION to select DRM_CLIENT_LIB unconditionally?
10:03 dviola: can someone help me close this issue please: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4369 -- it was my bug but now appears as "ghost user"
10:04 dviola: I can't reproduce it anymore
10:04 Ermine: have you shut your fdo gitlab account down?
10:05 dviola: Ermine: I deleted the old account by mistake and created another one
10:06 Ermine: oic
10:07 dviola: I asked if my new account could be assigned to the old one that is now a ghost user but I never got a reply on that
10:18 dviola: s/assigned/merged/
14:59 alyssa: robclark: is amul signed or unsigned?
15:00 alyssa: it's defined as undefined for overflow but not specifying which overflow so i'm a bit confused
15:10 robclark: alyssa: hmm, looks like it lowers to imul or imul24.. although pre-caffeine I don't remember why you want signed address math. I guess so negative array indices work?
15:20 alyssa: robclark: yeah, that's what i'm struggling to wrap my head around
15:20 alyssa: (are there cases where negative array indices don't hit UB? I'm genuinely unsure)
15:22 robclark: I think things like `foo = &bar[5]; blah = &foo[-1];` are allowed.. imul24 is defined to sign extend the result to 32b
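
A minimal C sketch of the case being discussed: indexing backwards from an interior pointer is well defined as long as the access stays inside the original array. The names `read_relative` and `demo_negative_index` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read the element `idx` slots away from `interior`.
 * `idx` may be negative, provided the result still points into the same
 * array object that `interior` points into. */
int read_relative(const int *interior, ptrdiff_t idx)
{
    return interior[idx];
}

/* Builds an 8-element array, takes a pointer to element 5, then steps
 * back by one. The access lands on bar[4], which is in bounds. */
int demo_negative_index(void)
{
    int bar[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    const int *foo = &bar[5];      /* interior pointer */
    return read_relative(foo, -1); /* same element as bar[4] */
}
```

Calling `demo_negative_index()` reads `bar[4]`. The same index applied to `&bar[0]` would leave the object and be undefined, which is why the offset arithmetic has to be treated as signed.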
15:30 alyssa: hmm
15:31 alyssa: it's kinda funny
15:31 alyssa: the hw addressing mode is clearly designed for C
15:31 alyssa: and glsl is.. not C
15:39 robclark: hopefully the hw works similar to the various ldg/stg on adreno.. because some day I'll have to figure out how to fold 32b offset calc into ldg/stg, similar to what you are trying to do (but haven't really had time to look more closely at your MR yet)
15:42 alyssa: *nod*
15:42 alyssa: the annoying part for me is that the shift is mandatory
15:42 alyssa: I can't do 64-bit + bytes offset addressing
15:43 alyssa: only 64-bit + word offset
15:43 alyssa: and since nir_lower_io gives me bytes...
15:45 robclark: we have both variations, the ".a" version has shift.. https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/freedreno/isa/ir3-cat6.xml?ref_type=heads#L98
15:45 alyssa: ah..
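
A rough C model of the addressing mismatch described above, under stated assumptions: the hardware mode is modelled as "64-bit base plus a sign-extended 32-bit element offset shifted left by a fixed amount", and the lowered offset arrives in bytes. The function names and the exact sign-extension behaviour are assumptions for illustration, not taken from any driver or ISA document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of the shifted addressing mode:
 * address = base + (sign_extend(elem_off) << shift). */
uint64_t addr_shifted(uint64_t base, int32_t elem_off, unsigned shift)
{
    assert(shift < 31);
    /* Multiply rather than shift: left-shifting a negative value is
     * undefined behaviour in C. */
    int64_t byte_off = (int64_t)elem_off * ((int64_t)1 << shift);
    return base + (uint64_t)byte_off;
}

/* Try to turn a byte offset into an element offset for the mode above.
 * Fails when the byte offset is not a multiple of the element size,
 * in which case the shifted mode cannot express it directly. */
bool bytes_to_elems(int32_t byte_off, unsigned shift, int32_t *elem_off)
{
    assert(shift < 31);
    int32_t size = (int32_t)1 << shift;
    if (byte_off % size != 0)
        return false;
    *elem_off = byte_off / size; /* exact, since the remainder is zero */
    return true;
}
```

With `shift = 2`, a byte offset of -8 becomes an element offset of -2, while a byte offset of 6 cannot be folded and would need a separate add.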
17:26 Lynne: why is there no bitfieldExtract for uint64_t in glsl? forgotten about?
17:57 glehmann: vulkan only supports it for 32bit at the moment: https://github.com/KhronosGroup/Vulkan-Docs/issues/2434
18:07 DemiMarie: alyssa: I am not surprised because Metal is very much C/C++-like.
19:03 alyssa: robclark: yeah i'm still trying to figure out the signed semantics
19:03 alyssa: if max buffer size = N, what bounds are on the signed result of amul
19:03 alyssa: 0 <= amul < N?
19:03 alyssa: or -N < amul < N?
19:04 alyssa: if the latter, and N=2GiB, we haven't actually restricted things ...
19:04 alyssa: (and if N=4GiB like it is for a bunch of desktop drivers ... uh)
19:07 alyssa: I *feel* like there's a solution just out of grasp but.. I can't quite figure it out
19:07 alyssa: and it boils down to this signedness issue
19:11 DemiMarie: glehmann: Which API considers infinite loops to be undefined behavior?
19:12 alyssa: although.. turnip limits SSBOs to 128MB
19:13 alyssa: which makes this a lot easier
19:13 robclark: alyssa: so it looks like amul and nir_lower_amul pre-date the introduction of umul24.. which I guess is why amul is signed.. fwiw
19:13 alyssa: and it gets spicier..
19:13 alyssa: looking at nir_lower_io, the way you'd get a negative amul is from a negative array index
19:14 alyssa: but like. nir_lower_io seems to try very hard to handle that case correctly which is... ???
19:15 alyssa: can't tell if that's supposed to work, or someone was just programming too defensively
19:16 alyssa: konstantin: just found 08577bbb703 ("nir/nir_lower_io: Optimize 32-bit inbounds access")
19:16 alyssa: but it looks like the rabbit hole goes deeper?
19:17 alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16729#note_1417101
19:17 alyssa: "ptr_as_array... can have a negative index as long as it stays within the object"
19:17 alyssa: ..well, that answers that
19:19 robclark: right.. I think it has to be signed.. and I guess if you want to have a >2G object, then lolz?
19:23 alyssa: Yeah
19:24 alyssa: if the max buffer size=2G, then 32-bit amul should imply no_signed_wrap.. I think?
19:25 alyssa: and absent signed wrap on the amul, I think the rest of the operations work regardless of overflow. maybe.
19:25 alyssa: the worst grade I got in all my years of school was in number theory
19:26 alyssa: the point is that amul(2, x) can be any 32-bit value, but |x| < uint_max/2 or something like that
19:26 alyssa: sint_max/2 i mean
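
The bound being reasoned about can be checked with a small C sketch. It assumes amul behaves like a signed 32-bit multiply of a stride by an index, and that an in-bounds offset `off` satisfies `-N < off < N` for a maximum buffer size `N`. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if stride * index is representable as int32, i.e. a signed
 * 32-bit multiply would not wrap. Computed in 64 bits to avoid
 * overflow in the check itself. */
bool amul_fits_i32(int32_t stride, int32_t index)
{
    int64_t wide = (int64_t)stride * (int64_t)index;
    return wide >= INT32_MIN && wide <= INT32_MAX;
}

/* True if -max_size < byte_off < max_size, the assumed in-bounds
 * range for an offset relative to some pointer into the object. */
bool offset_within_object(int64_t byte_off, uint64_t max_size)
{
    /* Take the magnitude in unsigned arithmetic so INT64_MIN is safe. */
    uint64_t mag = byte_off < 0 ? 0 - (uint64_t)byte_off
                                : (uint64_t)byte_off;
    return mag < max_size;
}
```

With `N = 2 GiB`, every offset accepted by `offset_within_object` lies in `[-(2^31 - 1), 2^31 - 1]`, so the 32-bit product cannot wrap. For a stride of 2 that corresponds to `|x| <= INT32_MAX / 2`. With `N = 4 GiB` an in-bounds offset such as 3000000000 no longer fits in a signed 32-bit value, which is the problematic case raised earlier.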