IRC Logs of #dri-devel on irc.freenode.net for 2025-09-10

08:45 djgl: Hi, I'd like to make a merge request for a Mesa fix but can't fork the repo on the Freedesktop Gitlab. Who do I have to contact to raise the limit for personal repos above 0?
08:49 djgl: never mind, found the answer
11:07 tzimmermann: airlied, sima, hi. please merge last week's PR from drm-misc-next https://lore.kernel.org/dri-devel/20250904090932.GA193997@linux.fritz.box/
11:14 mlankhorst: Should drm_mode_createblob_ioctl() allocate memory with __GFP_ACCOUNT ?
11:14 mlankhorst: Seems a bit of an oversight
11:36 daniels: mlankhorst: yeah I wonder which idiot did that
11:37 daniels: mlankhorst: so if I'm reading this right, that causes the memory to be accounted to the memcg of the current task?
11:38 daniels: if so, feel free to send the trivial addition with my Rb and also a Fixes all the way back to ... hmm, must've been 2015 when we were doing i915 atomic bringup? yikes.
12:02 mlankhorst: Let's just blame robclark, I mean r-b is basically rob already.
12:40 vsyrjala: i think createblob predates __GFP_ACCOUNT
12:44 glehmann: jenatali: do you see what I'm doing wrong for dozen here? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37258
12:53 jenatali: glehmann: not offhand, I can take a look when I get to the office in a few hours
13:01 mlankhorst: vsyrjala: then *digs* alloc_kmem_pages() should have been used
13:08 zmike: nowrep: !
13:43 zmike: what does e.g., PIPE_FORMAT_R32_USCALED correspond to in GL terms? I can't find anything related to this in the specs
13:48 Hazematman: zmike: isn't it for the vertex formats? like glVertexAttribPointer with GL_UNSIGNED_INT and 1 component
13:49 zmike: wouldn't that be R32_UINT ?
13:52 Hazematman: not an expert on this but if you pass `normalized` in I think it uses USCALED instead of UINT
13:53 zmike: then it's R32_UNORM
13:53 glehmann: USCALED is UINT, but convert to float
13:53 zmike: oh
13:53 zmike: huh
13:53 zmike: who did this to me
13:55 glehmann: e.g. R8_USCALED stores 0.0f, 1.0f, 2.0f,..., 255.0f
13:56 glehmann: 32bit USCALED is weird because you can't actually represent integers that large exactly as a float
13:56 glehmann: I guess that's why it's not a thing in Vulkan
13:58 Company: you have that problem with 16bit scaled vs float anyway
14:00 zmike: right...
14:03 Company: to me it looks like it was added so you can use int and float textures with the same shader
14:27 glehmann: Company: what do you mean? pretty sure all 16bit integers are exactly representable as a 32bit float
14:28 Company: glehmann: 16bit float
14:28 glehmann: ah well but image/texture reads in Vk always return a 32bit float
14:29 Company: true
14:30 Company: that does indeed make sense
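[Editor's sketch] The UINT / USCALED / UNORM distinction discussed above can be illustrated in a few lines of Python. This is only a rough model, not Mesa code: the helper names (`fetch_uint`, `fetch_uscaled`, `fetch_unorm`, `to_f32`, `to_f16`) are hypothetical, and rounding through `struct` is assumed to be a fair stand-in for what a shader would see as a 32-bit or 16-bit float.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x):
    """Round a Python float to the nearest 16-bit float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fetch_uint(raw):
    # UINT: the shader sees the integer unchanged.
    return raw

def fetch_uscaled(raw):
    # USCALED: same bits as UINT, but converted to float (255 -> 255.0).
    return to_f32(float(raw))

def fetch_unorm(raw, bits):
    # UNORM: the integer is mapped onto the range [0.0, 1.0].
    return to_f32(raw / ((1 << bits) - 1))

# R8: the three interpretations of the same byte.
r8 = 255
r8_views = (fetch_uint(r8), fetch_uscaled(r8), fetch_unorm(r8, 8))

# Every 16-bit integer survives conversion to a 32-bit float ...
all_u16_exact_in_f32 = all(fetch_uscaled(i) == i for i in range(1 << 16))

# ... but 32-bit integers above 2**24 do not,
big = (1 << 24) + 1
big_as_f32 = fetch_uscaled(big)

# and a 16-bit float already loses integers above 2048.
small_as_f16 = to_f16(2049.0)
```

Here `big_as_f32` differs from `big`, which is the 32-bit USCALED problem glehmann mentions, and `small_as_f16` differs from 2049, which is Company's point about 16-bit floats.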
15:34 agd5f: anyone else having problems with the anubis stuff on kernel.org over the last few days? seems to retry like 50-60 times before it decides it's happy
15:44 dwfreed: agd5f: loading git.kernel.org worked fine here, I got anubis, and it solved in about a second or so
15:50 jenatali: Ugh this new PC hadn't built the CTS yet
16:02 jenatali: glehmann: Pushed you a fixup. Your new wave size bits are set all the time where previously an incoming wave size was only valid to specify for compute. Emitting the metadata nodes was scoped to only happen for compute shaders, but filling out the psv0 wave sizes wasn't scoped that way
16:03 jenatali: I also learned that apparently new versions of the dxil validator print out really nice messages on mismatches so we should probably bump that in our container
16:04 glehmann: thanks, I didn't think about that at all
16:31 K900: Hey folks, does anyone know if there's an ETA for a new Vulkan SDK release?
16:31 K900: We (NixOS) are trying to upgrade to LLVM 21 and libclc does not build with the latest stable tag of spirv-headers
16:31 K900: (and possibly other things)
16:54 agd5f: dwfreed, weird. still takes 50-60 rounds here
17:03 dwfreed: agd5f: to be fair, I do not visit git.kernel.org frequently; perhaps anubis has decided you look like an AI because you go there a lot?
17:59 sima: airlied, I was out all day so didn't get around to processing the PR request from tzimmermann, I guess I'll look at it tomorrow if it's still pending ...
18:00 sima: daniels, mlankhorst there's probably a few more we should __GFP_ACCOUNT even if blob is the obvious one
18:00 sima: essentially all the userspace-created objects really
20:19 robclark: karolherbst: so looking at `./images/clCopyImage/test_cl_copy_images small_images debug_trace 1Darray` fails (specifically just 1Darray fails!).. if I hack the test to read back the src image instead, it fails with identical results (i.e. read back from src and dst img are identical).. so somehow we aren't managing to round trip 1Darray image upload+download without scrambling things somewhere.. but not sure which side of that is wrong
20:20 karolherbst: mhhhhhh
20:20 karolherbst: this is interesting, because I've also seen those fail on intel xe GPUs
20:20 karolherbst: and also only 1Darray failing
20:21 karolherbst: I wouldn't be entirely surprised if something is wrong there with the dimensions, but I couldn't figure out what exactly. It's even weirder that it works on other drivers
20:21 robclark: zink passes, but ends up using row_pitch/slice_pitch=1 (or rather, tightly packed rather than hw layout.. it seems to be doing a staging blit)
20:22 karolherbst: mhhh
20:23 robclark: but for 1x2 1d img array, the hw layout is row_pitch=64, slice_pitch=4096.. so I guess somewhere that isn't handled properly.. but I'm a bit lost in all the different copies rusticl has to do ;-)
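[Editor's sketch] To make the pitch numbers above concrete, here is a small Python model of copying a 1x2 1D image array out of a padded layout into a tightly packed buffer. It is a sketch under assumptions, not rusticl or driver code: the helper `repack_1d_array` is hypothetical, and a 4-byte texel is assumed since the log does not state the format. Only `row_pitch=64` and `slice_pitch=4096` come from the discussion.

```python
ROW_PITCH = 64      # from the log: hw row pitch
SLICE_PITCH = 4096  # from the log: hw slice (layer) pitch
TEXEL_BYTES = 4     # assumption: e.g. one 4-byte texel per row
WIDTH = 1           # "1x2": one texel wide ...
LAYERS = 2          # ... with two array layers

def repack_1d_array(src, row_bytes, layers, layer_stride):
    """Gather `layers` rows of `row_bytes` bytes, spaced `layer_stride`
    bytes apart in `src`, into one tightly packed bytes object."""
    out = bytearray()
    for layer in range(layers):
        start = layer * layer_stride
        out += src[start:start + row_bytes]
    return bytes(out)

# Build a fake padded image: layer 0 holds "AAAA", layer 1 holds "BBBB".
hw = bytearray(SLICE_PITCH * LAYERS)
hw[0:4] = b"AAAA"
hw[SLICE_PITCH:SLICE_PITCH + 4] = b"BBBB"

row_bytes = WIDTH * TEXEL_BYTES

# Stepping by the slice pitch finds both layers.
good = repack_1d_array(hw, row_bytes, LAYERS, SLICE_PITCH)

# Stepping by the row pitch (treating layers as rows of a 2D image)
# lands in padding for layer 1.
bad = repack_1d_array(hw, row_bytes, LAYERS, ROW_PITCH)
```

When the two pitches happen to be equal, or when a driver stages through a tightly packed buffer, both variants give the same result, which would be consistent with the bug only showing up on some drivers.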
20:23 karolherbst: well at least it's less of a disaster than it used to be
20:23 karolherbst: image to image copies should just call into resource_copy_region
20:24 karolherbst: _but_
20:24 karolherbst: wouldn't surprise me if mapping the images has a bug somewhere?
20:25 karolherbst: robclark: is it also broken at Scanline 1 for you?
20:25 robclark: I'm a bit suspecting that it _isn't_ the img->img copy
20:25 robclark: but either the READ or WRITE clEnqueueMapImage()
20:25 karolherbst: let's see what this test is doing...
20:25 robclark: I guess it would be useful if there is some other test that could narrow down which
20:26 karolherbst: okay.. create + map + write + unmap for each image, then a copy and then a map + read
20:26 robclark: right create_image() is where the "upload" happens
20:27 karolherbst: mhhh
20:27 karolherbst: could modify the test to use CL_MEM_COPY_HOST_PTR instead...
20:28 karolherbst: let me try that...
20:31 karolherbst: uhhh...
20:35 karolherbst: robclark: https://gist.github.com/karolherbst/f6711ce1b8480249886c1db1fb55f249
20:35 karolherbst: maybe a bit easier to debug with that. Fails the same way tho
20:35 karolherbst: at least here on modern intel
20:36 robclark: I think this case is CL_MEM_OBJECT_IMAGE1D_ARRAY
20:37 karolherbst: yeah
20:38 robclark: but still fails the same way with:
20:38 robclark: https://www.irccloud.com/pastebin/2W1iEXSk/
20:38 robclark: idk if that tells us the failure is in the read path?
20:39 robclark: I guess I can hack some more hexdump into my transfer_map..
20:39 karolherbst: well..
20:39 karolherbst: it now uses `texture_subdata` to upload the image data
20:41 karolherbst: but yeah.. if mapping it for writing is broken, why would mapping for reading not be
20:42 karolherbst: but debugging a map for reading is way easier
20:43 karolherbst: uhhh...
20:43 karolherbst: I think I found it...
20:44 karolherbst: well maybe not
20:44 karolherbst: yeah soo one thing standing out is why the pitches are different
20:47 karolherbst: PASSED
20:47 karolherbst: mhhh
20:47 karolherbst: robclark: does this fix it for you? https://gist.github.com/karolherbst/418753c2c3bfccf6be26518808752582
20:48 karolherbst: on iris xe I get 324 and 1316
20:49 karolherbst: but the region there isn't adjusted for the image 1D array stuff...
20:52 robclark: hmm, let's see
20:54 robclark: karolherbst: heh, yeah it does seem to fix it
20:55 karolherbst: mhh annoying, so I guess there aren't many drivers which return different pitches for 1Darray images
20:56 karolherbst: 1Darrays are weird, because the layer is selected in the 2nd coord in CL, but 3rd coord in gallium
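[Editor's sketch] The coordinate mismatch described above can be written down as a tiny translation step. This is an illustration in Python, not the actual rusticl code: the function name `cl_1d_array_to_gallium_box` is hypothetical, and the plain dict is only a stand-in for gallium's `pipe_box` (fields `x`, `y`, `z`, `width`, `height`, `depth`). It assumes the OpenCL convention that, for a 1D image array, `origin[1]` is the array index and `region[1]` is the number of layers.

```python
def cl_1d_array_to_gallium_box(origin, region):
    """Translate a CL (origin, region) pair for a 1D image array into a
    gallium-style box, moving the layer from the 2nd to the 3rd coordinate."""
    x, layer, _unused_origin = origin
    width, layers, _unused_region = region
    return {
        "x": x,
        "y": 0,           # a 1D image has a single row
        "z": layer,       # gallium selects the array layer via z
        "width": width,
        "height": 1,
        "depth": layers,  # number of layers covered by the box
    }

# Example: 5 texels starting at x=3, covering layers 1 and 2.
box = cl_1d_array_to_gallium_box((3, 1, 0), (5, 2, 1))
```

Any pitch returned for such a box then has to be read the same way: the distance between layers is a slice pitch on the gallium side, even though CL addresses the layer with its second coordinate.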
20:57 robclark: ahh
20:58 robclark: maybe some drivers are not directly exposing the hw img, but do a blit to/from tightly packed?
20:58 karolherbst: possibly
20:58 karolherbst: it's just weird that with pre Xe iris GPUs it works
20:58 karolherbst: but maybe it's different enough
21:02 robclark: idk enough about layout on intel things to explain that, but maybe some luck
21:04 karolherbst: just have to figure out where to swap the coords so it doesn't end up unmaintainable ...
21:06 karolherbst: .....
21:07 karolherbst: really...
21:07 karolherbst: robclark: Image::write does a "src_slice_pitch = src_row_pitch;" for CL_MEM_OBJECT_IMAGE1D_ARRAY and now I'm not sure why Image::read doesn't do it.. well.. that makes things easy at least
21:08 karolherbst: ehh..
21:08 karolherbst: need to do the reverse
21:09 karolherbst: I need to CI this
21:10 robclark: I suppose it would be nice if we could have a _bit_ of cl cts in mesa ci.. too bad cl cts is its own bespoke framework
21:11 karolherbst: yeah....
21:11 karolherbst: I have a plan (tm)
21:11 karolherbst: I think deqp-runner can take a list of tests that it would run, but not sure what the details are there
21:11 karolherbst: so I'd just need to generate a list of things to run
21:14 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37284
21:16 karolherbst: I should make sure it compiles first...
21:21 robclark: ok, I can give that a spin here in ~2min
21:29 robclark: karolherbst: seems to work (cherry-picked onto some other fixes I have)
21:29 robclark: I'll fire off a more complete cts run in a bit
21:30 airlied: okay first coopmat2 tensor layout subtest passed