02:33alyssa: zmike: does Zink use ESO for GL linked shaders ever?
02:33alyssa: if so -- will probably want https://rosenzweig.io/0001-zink-honour-nir-info.next_stage.patch
02:47vignesh: mairacanal: In drm-ci we have d2af13d9, and we're currently testing some patches to send upstream with 376454b5. We can update to 814cd988 (latest)
03:50zmike: alyssa: no
03:51zmike: also you spelled honor wroung
04:44kode54: She's Canadian, so it's likely she'd be using the UK English spelling
04:44kode54: another one of those funny words that differs by locale
04:50ccr: coloUr, honoUr, humoUr. :P
04:59kode54: but I guess you probably knew that already, and were just being pedantic or silly
04:59kode54: but I don't know, that calls for the operation of your mind
09:01dolphin: airlied, sima: DIM is confused big time, it's trying to re-pick all the previous fixes, wonder if it was because the drm-intel-next-fixes got pulled to drm-fixes?
09:15sima: dolphin, I think it's been years I looked at what that script does, so no idea ...
09:15thellstrom: @dolphin, it might be that. From my understanding, if a fix is not already in drm-intel-fixes, it will be picked.
09:17sima: thellstrom, we should filter out already cherry-picked stuff though, but maybe that broke?
09:17thellstrom: We do, but it must be present in the same branch.
09:18sima: ah right ...
09:18sima: dolphin, backmerge should sort it then I guess
09:18thellstrom: I think...
09:18dolphin: well, I rebased drm-intel-fixes into -rc2 already
09:19sima: dolphin, and still all the old ones listed?
09:19dolphin: yeah, they fail for "need to pass --allow-empty"
09:20dolphin: even the one I just sent after -rc1
09:20sima: dolphin, I guess time for bash debugging, meaning --dry and lots of sprinkling echo all over the script
09:21dolphin: "dim: Generalize cherry-pick fixes" <- Could be the smoking gun
09:22thellstrom: dolphin: FWIW, I also got that "need to pass --allow-empty" for all already-present xe fixes, but they didn't end up in the end result....
09:23dolphin: yeah, not in the result but they appear as a list of "failed to cherry pick"
09:23dolphin: for which you get a summary at the end
09:23dolphin: It definitely wasn't that way in the past
09:23thellstrom: OK. I never ran it without the commit you mention above.
09:26dolphin: Yeah, reverting that and it works beautifully
09:27dolphin: demarchi: ^^ FYI, please test next time that the code actually works the way it used to before generalizing :)
09:33dolphin: the problem seems to be that there was a two-step checking that was eliminated
09:34dolphin: cherry_pick_from_branches is used incorrectly; it used to check drm-intel-fixes (falling back to drm-intel-next-fixes) to see whether something is already backported
09:35dolphin: now it's checking the -next branches
09:35dolphin: it can't really work right for drm-xe-next either
09:49dolphin: https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/31
09:49dolphin: Should fix things
09:51dolphin: jani, sima: ^^ please take a look
09:52dolphin: thellstrom: looks like you have commit rights too
09:59thellstrom: dolphin: I don't have commit rights
10:01mlankhorst: all maintainers should
10:02dolphin: anyway, the original generalization patch dropped the tr '\n' ' ' and simply used the development branches to check for fixes
10:02dolphin: above restores the tr and uses -fixes branches to check for fixes
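[editor's note] The two-step check dolphin describes can be sketched as follows; the function name, branch contents, and commit subjects below are hypothetical illustrations, not dim's actual code. The idea is that a candidate fix is only cherry-picked if it is not already present in one of the -fixes branches:

```python
# Hypothetical sketch of the check dolphin describes: a fix should only
# be cherry-picked if it is not already in one of the -fixes branches
# (e.g. drm-intel-fixes, falling back to drm-intel-next-fixes).
# Branch contents are modeled as sets of commit subjects for simplicity.

def fixes_to_pick(candidates, fixes_branches):
    """Return the candidate fixes not yet backported to any -fixes branch."""
    already_backported = set().union(*fixes_branches.values())
    return [c for c in candidates if c not in already_backported]

branches = {
    "drm-intel-fixes": {"drm/i915: fix A"},
    "drm-intel-next-fixes": {"drm/i915: fix B"},
}
pending = fixes_to_pick(["drm/i915: fix A", "drm/i915: fix C"], branches)
print(pending)  # only "drm/i915: fix C" still needs picking
```

Checking against the development (-next) branches instead, as the broken generalization did, would report every already-backported fix as still pending.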
13:06alyssa: kode54: "never learned my alphabet from eh? to zed"
13:07kode54: lol
13:07kode54: I can't tell if Mike was being serious with his comment anyway
13:14HdkR: Mike is a very serious chair
13:16kode54: right, forgot who I was talking about
14:26demarchi: dolphin: thanks for the fix. it worked for me with the previous state of the branches, but only because we didn't have any fixes to skip yet
15:17mareko: tomeu: is there any description of how to implement teflon using shaders?
15:17tomeu: no, but for GPUs, it would be better to use the GPU delegate in tflite
15:18tomeu: I do have though a WIP branch that illustrates how NPUs with programmable cores would implement operations
15:18mareko: tomeu: do you mean the GLES 3.1 delegate?
15:18tomeu: yes, or the opencl one if your driver supports that
15:19tomeu: that is the best path for real GPUs (eg. with floating point HW)
15:19tomeu: to be more precise, I'm pointing to the GPU delegate with the OpenCL or GLES backends
15:21mareko: tomeu: if I don't have OpenCL and the GLES backend doesn't generate int8 dot products, would enabling teflon for GPUs make sense?
15:22tomeu: no, because you would have to recreate the GPU backend inside mesa
15:22tomeu: but if your HW does support int8 dot products, then probably yes, as the last time I asked, the tflite people weren't planning to target integers from the gpu backend
15:22tomeu: that was a couple of years ago though
15:23tomeu: there is an ARM extension for that, afair
15:23mareko: tomeu: your gallium interface mentions int8
15:23tomeu: though of course, you don't need that for teflon, you can just implement your shader in NIR
15:24tomeu: yes, (u)int8 is what embedded TPUs typically support
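[editor's note] As an illustration of why (u)int8 support matters here, a minimal sketch of the affine-quantized dot product that tflite-style int8 models rely on; the scales, zero points, and values below are made-up examples, not from any real model:

```python
# Minimal sketch of an affine-quantized int8 dot product, the kind of
# arithmetic tflite int8 models use. A real value r is represented as
# r = scale * (q - zero_point); the integer accumulator only needs one
# rescale at the end, so the hot loop is pure int8 multiply-accumulate.

def quantized_dot(a_q, b_q, a_zero, b_zero):
    """Integer dot product of zero-point-adjusted quantized values."""
    return sum((a - a_zero) * (b - b_zero) for a, b in zip(a_q, b_q))

a_scale, a_zero = 0.5, 10
b_scale, b_zero = 0.25, 0
a_q = [12, 14, 8]   # represents [1.0, 2.0, -1.0]
b_q = [4, 8, 4]     # represents [1.0, 2.0, 1.0]
acc = quantized_dot(a_q, b_q, a_zero, b_zero)
print(acc * a_scale * b_scale)  # 1*1 + 2*2 + (-1)*1 = 4.0
```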
15:25mareko: tomeu: I see, so is there no difference between GL and OpenCL delegates because they both use float32?
15:25tomeu: well, there are differences, but yeah, because the GPU delegate targets floats, it is not adequate for TPUs, even if opencl can be implemented on top of them
15:26tomeu: I started my experiments by implementing enough of opencl to run the GPU delegate on the programmable core of the vivante NPU, and it was abysmally slow
15:26tomeu: then I realized that there wasn't any float HW in there...
15:27tomeu: at that moment I realized that I had to create a new delegate
15:28mareko: tomeu: the question is more about whether the GL delegate has the same performance as the OpenCL one, so that OpenCL one doesn't have to be used in a solution where GL is available
15:28tomeu: I have read reports of the CL backend of the GPU delegate being faster than the GLES backend of the same
15:28tomeu: but I guess that will depend on the driver implementations
15:30mareko: tomeu: do they use the same type, e.g. float32? or does the CL backend use a smaller type, e.g. float16, bf16, int16, uint16, int8, uint8?
15:30mareko: if they are both float32, it should be the same perf
15:31tomeu: I think they lower all to float32, but I'm not sure
15:31mareko: ok thanks
15:54Company: eric_engestrom: I've cc'ed you on https://gitlab.gnome.org/GNOME/gnome-build-meta/-/issues/791 - I'd be interested if you - or anyone else here really - had any opinion on the best way to do this
15:54Company: what version of Mesa to ship in the gnome nightly sdk
15:59eric_engestrom: Company: I saw the notification about being tagged earlier; reading now
16:01Company: I wasn't sure, that's why I decided to ping you
16:08eric_engestrom: yep, it could've been anyone with that username on that other GL instance :)
16:08eric_engestrom: replying now, but the short version is: I don't know 🤷
16:09Company: I was more thinking that you could've used an old email and never looked at that account again :p
16:10Company: or notifications are off or some gitlab update broke them or...
16:10alyssa: tomeu: for real GPUs and not TPUs, fixing tflite to take advantage of int8 dot products and such makes more sense than teflon, no?
16:11tomeu: I think so, yeah
16:11tomeu: guess their program manager just hadn't thought of that...
16:11tomeu: but it would be a big change
16:11alyssa: ( https://github.com/KhronosGroup/OpenCL-Registry/blob/main/extensions/arm/cl_arm_integer_dot_product.txt is theoretically easy to add to rusticl and support across mesa)
16:11alyssa: (since vulkan has those opcodes already)
16:11tomeu: yep, that is the one I pointed them to
16:13tomeu: but, a GPU executing dot products is going to be massively slower than an NPU with arrays of PEs
16:13tomeu: it is a trade-off between flexibility and performance
16:13tomeu: the NPU might not be able to run the model at all, or only with such big contortions that it is even slower
16:14tomeu: but if the model you want to run targets these "fixed-function" ASICs, then the performance difference should be huge
16:17alyssa: no argument there
16:17alyssa: (but if you don't have an NPU, I don't see where teflon would fit in)
16:17alyssa: (which seemed to be the context for mareko 's questions)
16:37austriancoder: is looking for vk deqp that does depth_stencil scissored clears
16:54mareko: tomeu: int8 dot products are much faster than FP16 on our GPUs
16:55tomeu: yeah, I was referring to the overhead of having to fetch each instruction
16:56tomeu: alyssa: yeah, that seems to be the case, but I'm happy to be surprised :)
16:58mareko: it's really just a power consumption difference
16:59mareko: it would probably be less effort to use the GL delegate and select our FP16 dot products in Mesa for now
17:08karolherbst: alyssa: dot_product is already supported in rusticl :)
17:08karolherbst: there is "cl_khr_integer_dot_product"
17:08alyssa: nice
17:08karolherbst: the arm one is probably just a pre CL2 backport
17:08karolherbst: let's see...
17:09karolherbst: heh..
17:09karolherbst: well.. cl_khr_integer_dot_product is from 2021 :D
17:09karolherbst: yeah, the arm one is older
17:09karolherbst: mhh
17:09karolherbst: maybe I want to advertise that one
17:10karolherbst: given there are 0 tests, I'd have to check what the differences are
17:10karolherbst: ohhh..
17:10karolherbst: cl_arm_integer_dot_product -> OpenCL C, where the khr one is SPIR-V only
17:11karolherbst: ohh.. maybe the LLVM SPIR-V backend now works as they fixed my bug report :)
17:12jenatali: Ooh, nice. For LLVM 18?
17:12karolherbst: yeah
17:12karolherbst: https://github.com/llvm/llvm-project/issues/72864
17:12karolherbst: ehh it wasn't me who filed it
17:12karolherbst: but I pinged on it :D
17:13karolherbst: tomeu, alyssa: do you know _anything_ using cl_arm_integer_dot_product? I'd be inclined to add it if people think there is value in it, but without tests it's kinda.. ehh.. annoying :)
17:14tomeu: mali I guess?
17:14tomeu: qcom also have their extensions for that...
17:14karolherbst: it's probably a 10 line patch (some integration with clang) to make it work or something as the compiler side is already done
17:14karolherbst: tomeu: I meant like which software
17:14karolherbst: or if they have tests for it
22:21gfxstrand: Lyude: Can you +1 my NVK 1.3 submission e-mail?
22:21gfxstrand: And any other board members who happen to see this message
22:33Company: congrats on that
22:34gfxstrand: Thanks!