02:42airlied: karolherbst, jekstrand, jenatali : fyi https://gitlab.freedesktop.org/airlied/mesa/-/commits/clover-cl3-wip
02:42airlied: just some hacks in the right direction
02:44jenatali: Awesome. Digging into Cl3.0 more is on my to-do list for the near-ish future
02:47airlied: jenatali: that might speed it up, there's a lot of return 0 from 2.x queries
02:47jenatali: airlied: Yeah, that was my impression but hadn't looked too closely, thanks
03:02airlied: karolherbst: uggh still haven't landed the IR apis
03:22jekstrand: airlied: Do you have a 32-bit clc you can throw me?
03:22jekstrand: airlied: Or do I need to learn how to build it?
03:26jekstrand:copies his spirv64 to spirv and calls it a day
03:42airlied: jekstrand: I do have one somewhere
03:43airlied: https://people.freedesktop.org/~airlied/scratch/spirv-mesa3d-.spv
03:52jekstrand: airlied: So... With the new clc loader, clover no longer fails to initialize if CLC isn't there.
03:52jekstrand: airlied: Any thoughts?
03:53jekstrand: airlied: We could make an "early" check for "do you have libclc" or something
03:54jekstrand: Where it just stats the file or something
03:54jekstrand: And hope that no one deletes a system-installed file between our stat and when we go to load for real
03:56jekstrand: Or we could just load it on clover start
03:56jekstrand: On my system, it's 0.68s if I have to compile and 0.21 if I don't
04:00airlied: hmm I think we could just add a chkec that we get a nir_shader back in clover
04:00airlied: and fail there
04:01jekstrand: It's deferred. :-/
04:01jekstrand: I've got something of a plan.
04:03airlied: ah yes
04:04airlied: a non-deferred stat the file api might be good then
04:08jekstrand: airlied: Do we know the pointer size at spirv init time?
04:09jekstrand: at clover init time rather
04:09jekstrand: Where we would want to do this check?
04:10jekstrand: Doesn't look like it
04:10jekstrand: Well, maybe....
04:11airlied: jekstrand: dev.address_bits()
04:11airlied: should be fine
04:12jekstrand: airlied: What's the overhead of a mmap if I never use the pointer?
04:14airlied: syscall and small address space usage
04:14jekstrand: airlied: Is it ok if the check mmaps the file?
04:14jekstrand: I've got stuff split so it avoids the mmap if it's not needed right now
04:14airlied: yeah that's probably fine, as long as we tear it down later
04:16airlied: okay my cl3 impl now passes test_computeinfo in CTS
04:22airlied: pmoreau, karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7039
04:26jekstrand: airlied: I thought the clc pointer was supposed to be deferred and yet clinfo invokes the compile
04:27remexre: hm, if I'm not doing anything funny with setuid, is there any reason why I'd be able to drmSetMaster, but not drmDropMaster?
04:28jekstrand: Oh, clinfo builds a program. Fantastic.
04:28jekstrand: Of course it does
04:28airlied: jekstrand: yeah it's annoying :-P
04:29jekstrand: still, it's about 0.5s on your first run so I'm not that inclined to care.
05:04remexre: nvm, looks like my issue is fixed in kernel 5.8
07:18pq: jadahl, async or non-block flag? That's a pretty important difference.
07:18jadahl: lack of async flag when doing mode sets made the EBUSY error I saw go away
07:19jadahl: err, lack of non-block
07:19jadahl: as in, I tried to *either* set ALLOW_MODESET *or* NONBLOCK
07:26pq: jadahl, sure, I was just literally concerned about the difference between async and non-block :-)
07:27pq: both exist in the UAPI, but ISTR async is very poorly supported overall
07:28pq: jadahl, yeah, id you do a blocking commit, it'll stall you until the operation is done, which means a following operation cannot EBUSY.
07:32daniels: Mesa repos are 404ing, it's transient
07:34daniels: (and it's fixed)
07:36jadahl: pq: exactly, it's either that or do them one after each other completes as you see hit EBUSYs
07:36jadahl: *one after the other after the other completes
07:41hakzsam: daniels: is the mesa-commit ML off?
07:46daniels: hakzsam: that's an indirect consequence of the mirror being broken, which I should be fixing on the weekend
07:47hakzsam: ok
12:34dschuermann: is there a generic vecN() builder helper in NIR?
12:37pendingchaos: dschuermann: nir_vec()?
12:38dschuermann: oh, thx
12:38dschuermann: didn't expect it so simple :)
13:25EdB: karolherbst: hello
13:26EdB: for the mem issue, the buffer is bind to the kernel like regular one
13:27EdB: maybe the problem lies here
15:16ajax: can i delete dri1 yet
15:20karolherbst: EdB: mhhhh
15:20karolherbst: EdB: well.. that's not wrong though
15:20karolherbst: resource_from_user_memory essentially tells the driver to make a buffer with a host pointer available to the GPU via a normal resource
15:21karolherbst: if it uses a userptr syscall to register it into the GPUs VM or does shadow copies is up to the driver
15:21karolherbst: but.. doing shadow copies usually doens't work out that well :/
15:21karolherbst: as you have some merge problems if both system change data
15:22karolherbst: so, the question is rather, what does resource_from_user_memory do on your system
15:22karolherbst: does it map it into the GPU VM via the proper syscall or does it some weird magic which wouldn't work?
15:28sravn: tzimmermann: Do you see ca6cf78322d1 ("fbdev: sbuslib: remove compat_alloc_user_space usage") in drm-misc-next?
15:28sravn: Arnd syas it in not in linux-next for some reason
15:30sravn: mripard, danvet_ ^^
15:30EdB: karolherbst: I'm not sure. I don't know this part and are no sure how to figurit out
15:30EdB: what it does is : si_buffer_from_user_memory
15:31karolherbst: EdB: is this on r600 or radeonsi?
15:31karolherbst: ahh, radeonsi
15:31EdB: that end up with :
15:31EdB: /* Convert a user pointer to a buffer. */
15:31EdB: buf->buf = ws->buffer_from_ptr(ws, user_memory, templ->width0);
15:31EdB: and then buf->gpu_address = ws->buffer_get_virtual_address(buf->buf);
15:31karolherbst: sooo
15:32karolherbst: amdgpu does the userptr syscall stuff
15:32karolherbst: so this looks fine
15:32EdB: after that, the buffer is bind as usual on clover side
15:32karolherbst: mhh, radeon seems to have something similiar
15:32karolherbst: EdB: is this with amdgpu or radeon?
15:32sravn: tzimmermann, mripard, danvet_ I think it is because said patches are lingering waiting for the merge window to conclude before you push to -next
15:33EdB: lsmod give me : drm 626688 8 gpu_sched,drm_kms_helper,amdgpu,radeon,ttm
15:33karolherbst: EdB: there might be some hardware constraints though. Maybe not all engines support using userptrs or something... I think it's best to ask inside #radeon if they know anything
15:34mripard: sravn: yeah, it's in my branch, but I'm not sure if drm-misc-next is pulled directly by linux-next
15:34karolherbst: EdB: ehh.. maybe lspci -v might tell you what driver is bound to your GPU :)
15:34mripard: I don't think so
15:34EdB: Kernel driver in use: radeon
15:34EdB: Kernel modules: radeon, amdgpu
15:34karolherbst: okay...
15:34EdB: so radeon it seems
15:35karolherbst: EdB: can you use amdgpu on your device? maybe it would be good to know if it works wiht that one...
15:35karolherbst: not sure
15:35karolherbst: anyway, not a radeon expert, so others should be able to help with those details :)
15:35karolherbst: I just know that the path works just fine for me when using SVM
15:35karolherbst: (and the CTS actually requires it for the SVM tests)
15:36EdB: I just need to find the proper kernel option for amdgpu
15:37sravn: mripard: thanks! I was afraid I had done something stupid on my side. Will answer Arnd so he knows what is goinf on
15:39arnd: sravn: I see "git://anongit.freedesktop.org/drm/drm-misc#for-linux-next" is pulled into linux-next, but that's something else, right?
15:40arnd: the last commit there is from mripard dated Sep 23
15:44danvet_: sravn, we're past feature freeze, drm-misc-next is for 5.11
15:44danvet_: arnd, ^^
15:44danvet_: so if this has to go into 5.10, it needs to be cherry-pick over to drm-misc-next-fixes
15:45arnd: no, it's not urgent, and hch sent a different series doing the same thing slightly better
15:46arnd: having it in linux-next would have perhaps avoided duplicating that work
15:57EdB: karolherbst: nice, it works :D
15:58karolherbst: EdB: ahh, so it is indeed only broken with radeon, but works with amdgpu?
15:58karolherbst: oh well :)
15:59karolherbst: I have no idea if the syscall is configured incorrectly or not
15:59karolherbst: but there is an OpenGL extension making use of it
15:59karolherbst: GL_AMD_pinned_memory
15:59karolherbst: and we have some piglit tests...
15:59EdB: well it's likes pushing bug under the carpet, but as long it's working, let go with it
15:59karolherbst: maybe those fail on radeon as well?
16:00karolherbst: maybe it's not even advertised.. dunno
16:00karolherbst: EdB: I think it is probably safer to disable user pointers if radeon is used as a driver
16:00karolherbst: or investigate why it is broken :)
16:03pcercuei: danvet_: I'm having problems with my email provider :( So I don't know if you answered. About my revert of the ingenic-drm "fully cached mmap" patch, what should I do? You said to cherry-pick -x it to drm-for-next-fixes but it does not apply cleanly there, is that OK?
16:14jekstrand: Oh... I think I know why I'm hitting stdout limits on iris but not nouveay....
16:14jekstrand: Piles of
16:14jekstrand: . evenCS SIMD8 shader: 41 inst, 0 loops, 390 cycles, 1:1 spills:fills, 2 sends, scheduled with mode top-down, Promoted 0 constants, compacted 656 to 480 bytes.
16:14jekstrand: CS SIMD16 shader: 47 inst, 0 loops, 448 cycles, 1:1 spills:fills, 2 sends, scheduled with mode top-down, Promoted 0 constants, compacted 752 to 576 bytes.
16:14jekstrand: CS SIMD32 shader: 88 inst, 0 loops, 650 cycles, 2:2 spills:fills, 2 sends, scheduled with mode top-down, Promoted 0 constants, compacted 1408 to 1232 bytes.
16:14jekstrand: CPU mapping a busy "buffer" BO stalled and took 0.236 ms.
16:18jekstrand: karolherbst: How do I shut off driver debug logging in the CTS?
16:18karolherbst: ehh
16:18jenatali: jekstrand: Build a release build?
16:18karolherbst: :p
16:18karolherbst: set some env variables?
16:18jekstrand:doesn't even know how these callbacks get registered
16:19karolherbst: we enable some mesa debugging stuff by default for debug builds
16:19karolherbst: I doubt the CTS enables anything here
16:19karolherbst: jekstrand: try MESA_DEBUG=silent
16:20jekstrand: That got rid of most of it
16:27jekstrand: Some of them are getting logged through the CTS
16:28pmoreau: airlied: I’ll try to find some time to look at it this week.
16:29jenatali: jekstrand: Clover might be invoking the callback associated with the CL context?
16:29jekstrand: It is
16:29jekstrand: It's invoking the notify callback
16:29karolherbst: jekstrand: ohh, right the compiler stuff
16:30karolherbst: oh well...
16:31jekstrand: Pass 102 Fails 4 Crashes 0 Timeouts 0
16:31jekstrand: hiloeo is still overflowing the test runner's stdout handling
16:31jenatali: Heh
16:31jekstrand: I'm kind-of surprised we're running into limits there. It just uses python's communicate()
16:31jekstrand: fails:
16:31jekstrand: basic bufferreadwriterect
16:31jekstrand: basic hiloeo
16:31jekstrand: basic kernel_call_kernel_function
16:32jekstrand: basic kernel_memory_alignment_local
16:32jekstrand: Those are all real
16:32karolherbst: jekstrand: why does it overflow it? hiloeo doens't print all that much
16:32jekstrand: except hiloeo
16:32jekstrand: karolherbst: Iris dumps a message every time you map a busy BO
16:32karolherbst: the last two are local mem alignment
16:32karolherbst: jekstrand: ehh :/
16:32karolherbst: make iris not do that? :p
16:33pmoreau: airlied: I saw you made it possible to select whether to validate or not the SPIR-V, but did you need to also modify `is_valid_spirv()`? That function’s only purpose is to perform validation, so why not call it conditionally instead of making it conditionally do nothing? 😄
16:33jekstrand: Normally, it doesn't unless you specify INTEL_DEBUG=perf
16:33karolherbst: ahh
16:33jekstrand: Or specfiy your own handler which the CTS does
16:33karolherbst: ehhh
16:33karolherbst: annoying
16:33jekstrand: pmoreau: I had the same question. :-)
16:33pmoreau: ;-)
16:44jekstrand: karolherbst: https://gitlab.freedesktop.org/karolherbst/opencl_cts_runner/-/merge_requests/1
16:48karolherbst: ahh, cool
16:48karolherbst: I feared something bigger would be required to deal with it
17:17jekstrand: airlied: Merry Christmas! https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047
17:24ajax: nice
17:25jekstrand: There's a pipe-loader patch in there that people might have opinions about
17:25jekstrand: anholt, maybe?
17:26anholt: sure
17:27anholt: I mean, clover should really just switch to the static loader like dri, imo.
17:28jekstrand: Yeah.....
17:28EdB: karolherbst: an open_cts_runner git, nice :)
17:42mareko: airlied: if there are no visible changes in the image, you can just update the diff
18:15danvet_: pcercuei, Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
18:16airlied: jekstrand: \o/ :-P
18:29airlied: jenatali: oh one other 3.0 tidbit, you do have to add all the 2.x APIs just they return invalid operation
18:29jekstrand: That's a bit annoying but how many are there? 20ish?
18:30jenatali: airlied: Yeah, I figured as much. I started CLOn12 from the loader example ICD, so I have stubs for all the APIs anyway
18:32jenatali: airlied: https://github.com/microsoft/OpenCLOn12/blob/master/src/stubs.cpp
18:32jenatali: Though I guess I should fix their error code...
18:32airlied: jenatali: yeah it's only about 20
18:32airlied: jekstrand: ^
18:41EdB: airlied: I merged the CL 1.2 stuff, most of it is in master now
18:43EdB: your 3.0 branch history is a bit rough :)
18:57airlied: EdB: yeah I'll rewrite it once I get to the end
18:57airlied: the WIP was more to let other people know not to start on it :-P
18:58EdB: airlied: there is also this "{ 300, clang::LangStandard::lang_opencl12}," but im not sure LLVM already have the 3.0 define
18:59airlied: it's also legal to only support CL C 1.2
18:59airlied: I have to read up on what exactly to return though
19:01EdB: ah yeah 3.0 is 1.2 without 2.0 stuff :D
19:02airlied: nearly passes all the api consistency test from 3.0 cts
19:03EdB: well I've grated piglit test that didn't pass 1.2 where as CTS do
19:03EdB: *created
19:03EdB: some time the CTS api check is more tolerant than the spec
19:13EdB: I think that CL_DEVICE_LATEST_CONFORMANCE_VERSION_PASSED should return "" instead of CTS
19:14airlied: EdB: probably, just awnted to give it somethihng :-P
19:20EdB: airlied: yeah CTS is tolerant
19:20EdB: clCreateBufferWithProperties should no have been ok
19:21EdB: for exemple :)
19:22airlied: EdB: it barely tests that, though why not, from what I can see no properties is the only spec case
19:23EdB: you don't fill r_err_code
19:23airlied: ah fail
19:23EdB: I've put somme comments on the MR
19:24EdB: anyway, it's a WIP do :p
19:24EdB: *so
19:28airlied: yay found one bug in get image format supported :-P
19:34airlied: EdB: thanks, will fix up those
19:41airlied: jekstrand: oh an clCloneKernel seems non-optional
19:41airlied: jenatali: ^
19:41jenatali: airlied: Good to know
19:41airlied: jekstrand: my tab complete dislikes you two :-P
19:41jenatali: You're not the only one :P
19:41jenatali: karolherbst tags me all the time when he means jekstrand
19:43karolherbst:the first two letters should be unique in IRC
19:43karolherbst: uhm..
19:43karolherbst:thinks ...
19:44HdkR: That's a very small selection of characters to choose from
19:44vsyrjala: unicode is big
19:45karolherbst: vsyrjala: some IRC servers are stupid and don't support UTF-8 usernames
19:45karolherbst: well most :p
19:46airlied:isn't sure I know enough clover to implement clCloneKernel yet
19:47EdB: airlied: also, other f° return NULL instead of nullptr
19:47karolherbst: return kernel(old_kernel) :p
19:47karolherbst: should do it, no? :D
19:48airlied: karolherbst: hehe that might be my first iml :P
19:48jenatali: Gotta make sure it appropriately refcounts all the bound objects though
19:48EdB: I didn't look too close but there are talking about ref count
19:48jenatali: I haven't read enough of Clover to know if that's going to be automatic or not :P
19:49EdB: so it may need so adjust
19:49EdB: *some
19:49EdB: anyway, should not be that hard
19:51jekstrand: jenatali: I just updated the libclc loader MR. You can now pass -Dlibclc-static=spirv64 to get just spirv64 support. If someone tries to use 32-bit, it'll try to load from disk and fail.
19:51jenatali: jekstrand: Awesome! I'll take a look
19:51pcercuei: danvet_: it's already reverted in drm-misc-next, but I can't cherry-pick it cleanly to drm-misc-next-fixes
19:52pcercuei: I can revert it in drm-misc-next-fixes too, but it will break next time you merge the two
19:53danvet_: pcercuei, yeah hence a) make sure you resolve drm-tip rebuild correctly
19:53danvet_: and b) tell drm-misc maintainers
19:53danvet_: but yes we must fix this in drm-misc-next-fixes
19:53danvet_: pcercuei, oh just noticed I didn't past the link :-/
19:54pcercuei: drm-misc maintainer is not you?
19:54danvet_: no
19:54airlied: karolherbst: we delete the kernel copy constructor :-P
19:55danvet_: pcercuei, https://lore.kernel.org/dri-devel/CAKMK7uEB7xHgnSpnT=Hd3Cw2+uwkimF=4uQuw3NOYz1DsnMY7g@mail.gmail.com/
19:55pcercuei: thanks
19:55danvet_: so yeah first part: this must be cherry-picked to drm-misc-next-fixes
19:55danvet_: should have landed there in the first place
19:56pcercuei: I know. Sorry about that
19:56pcercuei: what does "rebuilding drm-tip" means?
19:56danvet_: and misc maintainers are tzimmermann mlankhorst (not around right now) and mripard
19:56danvet_: they're also in MAINTAINERS
19:56danvet_: pcercuei, the thing that will scream at you when you run dim push
19:56danvet_: even tells you which doc section to consult
19:59pcercuei: alright, pushed to drm-misc-next-fixes. "dim push" didn't complain
20:01pcercuei: mripard: there will be some trouble in ingenic/ingenic-drm-drv.c when merging drm-misc-next... my fault, sorry about that
20:01pcercuei: also sorry about not doing this over the mailing list, my mail server is down since monday
20:02EdB: ah I think I already see those clover image fix in the branch that try to add image support :)
20:03EdB: since I was hesitating beteween adding CL3 api or looks at image support, I may goes for the second one now :)
20:03danvet_: pcercuei, indeed git seems to understand what's going on I guess
20:03pcercuei: it does? That's a good news: )
20:04jenatali: EdB: I think karolherbst was also going to look at image support
20:04karolherbst: yeah
20:04karolherbst: but fixing images looks very nasty
20:04EdB: so I'm unemployed now :p
20:04jenatali: jekstrand: New patches look great, thanks for doing that
20:04jenatali: jekstrand: And thanks for fixing up my old patches :)
20:05karolherbst: EdB: you could fix CTS fails :p
20:05karolherbst: or run applications and see how badly those fail :D
20:05EdB: I tryed blender
20:06EdB: and it look like there is some thinks to fix into LLVM
20:06airlied:has some llvmpipe bits to fix like load/save scratch
20:06EdB: and I'm not sure I'm read for that
20:07karolherbst: mhh
20:07karolherbst: yeah.. I guess fixing CTS bugs is the way to go then
20:07karolherbst: but I suspect that most bugs actually lurk in the compiler
20:07airlied: EdB: I've added some TODO to the 3.0 MR
20:08airlied: I'd not be unhappy to get clone kernel or context destructor done :-P
20:08karolherbst: mhhh
20:08karolherbst: jekstrand: images were bound via set_compute_resources, no?
20:08karolherbst: mhhhhhh
20:09karolherbst: I really think we should let clover do the same as st/mesa for iamges...
20:09karolherbst: and I think I should just fix that up
20:10karolherbst: soo.. read only images are bound via sampler states and sampler views, which is cool
20:10karolherbst: that probably just works
20:11karolherbst: yeah.. cool
20:11EdB: karolherbst: aaron have a branch with image stuff
20:12karolherbst: git?
20:12EdB: https://gitlab.freedesktop.org/awatry/mesa/-/tree/clover-image-support
20:12karolherbst: heh
20:12karolherbst: airlied: https://gitlab.freedesktop.org/awatry/mesa/-/commit/2df1af0b32ad69fe2702235deb65fddf596d93fc :p
20:13karolherbst: EdB: yeah... that's not the right thing..
20:13karolherbst: I think if we keep the way clover does images today it will cause pain
20:14karolherbst: we essentially have to get rid of what clover does, as this is different from st/mesa
20:14karolherbst: which is bad
20:15airlied: pmoreau, jekstrand : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7051
20:15airlied: karolherbst: lols
20:17EdB: maybe we should RB the aaron one one and push it to master since it have a better commit messages :)
20:17karolherbst: nope
20:17karolherbst: keeping the current stuff is the wrong path
20:18karolherbst: set_compute_resources shouldn't be used for images
20:18karolherbst: period
20:18EdB: I was talking about the image check fix
20:18karolherbst: ahhh
20:18karolherbst: yeah, we can pull that out :p
20:18EdB: the branch is clearly a WIP :)
20:18karolherbst: heh.. only clover is using set_compute_resources
20:18karolherbst: EdB: ohh, I wasn't comenting whether it is technically correct, it's just wrong to keep using set_compute_resources
20:18karolherbst: we should just remove this callback
20:19karolherbst: drivers just implement this function for no reason
20:21karolherbst: mhhh
20:21karolherbst: airlied: do you know how the image stuff differs from normal samplers inside gallium?
20:21karolherbst: feels like there isn't really a big difference
20:22karolherbst: ahh, set_shader_images
20:22airlied: its mostly like GL images vs textures
20:22karolherbst: yeah
20:22karolherbst: and we get pipe_image_view objects
20:23karolherbst: nice
20:23karolherbst: yeah.. I think I will rip this set_compute_resources out of mesa :)
20:23karolherbst: smells like a -250 + 50 loc change
20:23airlied: the only thing is if formats matter
20:23airlied: since cl images can operate different i think
20:24karolherbst: well..
20:24airlied: but it definitely a dig into it and see how far you grt
20:24karolherbst: we can always extend the API a little
20:24jenatali: CL images at the API have formats that map pretty much 1:1 with GL
20:24jenatali: It's only the bits in the kernel that don't know what the format's going to be
20:24airlied: get type of problem
20:25karolherbst: jekstrand: so with your patches the readimage tests are passing :)
20:25jenatali: That was fast
20:26karolherbst: jenatali: huh? readimages are just implemented as texture+sampler things in clover :p
20:26jenatali: Yeah, makes sense
20:26karolherbst: so this just works
20:26karolherbst: the problem is how it deals with write images
20:26EdB: airlied: CL_INVALID_OPERATION is not valid for clCloneKernel. CL_​OUT_​OF_​RESOURCES is probaly what fit best
20:26karolherbst: it uses a callback which... everybody implements
20:26karolherbst: but
20:27karolherbst: it's only used from clover
20:27karolherbst: so.. essentially it's like if the code wouldn't exist :p
20:35airlied: EdB: that was deliberate, dont want to pass any tests with out a real implementation
20:41EdB: karolherbst: I think that most of my CTS worlk now, except for printf but there is a libclc patch validaded that need to land and my pending MR
20:42EdB: there is also a bug with kernel calling kernel but it specif to llvm amd native
20:42EdB: image test are skipped do they don't fails :)
20:43EdB: *so
20:45EdB: ah I've an adjstment to make in one of the getinfo fonction when 1.2 will be set because a behavior change but there is a piglit test to track for ot
20:45EdB: *it
20:47EdB: ad for the application darktable and blender would be nice to have but they both need image support if I'm not work
20:47EdB: wrong
20:49EdB: and blender opencl function are quite heavy to deal with
20:49EdB: if you people think abour another nice application, drop it :)
20:56EdB: airlied: I can implement the context destructor callback
20:59airlied: EdB: cool
21:00airlied: im not in any hurry for the cl3 stuff was mostly trying to dig out what in depth changes are needed
21:01EdB: well I think it can be made in 15m top without test :)
21:01EdB: just give the time to track your branch
21:01EdB: :p
21:02jekstrand: karolherbst: \o/
21:03karolherbst: airlied: how can I enable opencl with llvmpipe again?
21:03airlied: LP_CL=1
21:04karolherbst: huh..
21:04EdB: huhu you have tons of branches :)
21:04karolherbst: why doesn't llvmpipe report PIPE_SHADER_CAP_MAX_SHADER_IMAGES?
21:05airlied: karolherbst: should do
21:05karolherbst: git grep says no
21:06jekstrand: llvmpipe+clover should be called pipeCLeaner
21:07karolherbst: ....
21:07karolherbst: Unsupported intrinsic: vec2 32 ssa_34 = intrinsic image_size (ssa_33, ssa_11) (1, 0, 0, 8) /* image_dim=2D */ /* image_array=false */ /* format=0 */ /* access=8 */
21:07karolherbst: ehhh
21:07jekstrand: CL puns are too easy
21:07airlied: karolherbst: its not in llvmpipe
21:07karolherbst: ahhh
21:07airlied: gallivm has caps
21:08jekstrand: karolherbst: Yeah, the CL CTS likes image_size
21:08karolherbst: ehh.. I guess I'll debug what I messed up then
21:08jekstrand: karolherbst: For read-only images, we lower it to TXS but you need to support the intrinsic for read/write images
21:08jekstrand: or for write-only. This is CL 1 after all :)
21:09karolherbst: jekstrand: :p https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e5f8ed1a494ae61d47ea8e0a698ce7654cdfccb7
21:09karolherbst: heh, yeah
21:09karolherbst: but I do in nouveau
21:09karolherbst: but I still got some memory issues :/
21:09karolherbst: I suspect I have to do more
21:09jekstrand: karolherbst: \o/
21:09jekstrand: karolherbst: compute_resources needs to die
21:09karolherbst: yeah...
21:10jekstrand: or set_surfaces
21:10karolherbst: but "legacy" paths uses it for constant buffers :p
21:10karolherbst: set_compute_resources Lo
21:10karolherbst: :p
21:10karolherbst: but once images are split out
21:10jekstrand: Yeah, it's set_compute_resources that needs to die
21:10karolherbst: we can do the real pipe_constant_buffer stuff
21:10karolherbst: and everybody will be happy
21:10jekstrand: THat'd be so nice
21:11karolherbst: but now I am wondering if I need to do something besides set_shader_images...
21:11airlied: karolherbst: llvmpipe has image_deref_size
21:11karolherbst: airlied: get rid of derefs :p
21:11jekstrand: Not sure. I just hacked up set_compute_resources and hated every minute of it. :-P
21:12karolherbst: jekstrand: right...
21:12karolherbst: jekstrand: thing is, bind_surface calls into create_surface
21:12karolherbst: but I don't see how that's relevant :D
21:12jekstrand: karolherbst: Though I may NAK any attempt to do so in nouveau to make you fix it. :-P
21:13jekstrand: dcbaker[m]: Can you take a second look at the CLC loader. I reworked a bunch of stuff and may have biffed the meson.
21:13karolherbst: jekstrand: the only time I touch set_compute_resources is when I remove it :p
21:14airlied: set compute resources is so r600
21:15jekstrand: karolherbst: Good. Keep it that way!
21:16jekstrand: airlied: Did you have any thoughts you wanted to share on the CLC loader? I reworked it today and jenatali reviewed it.
21:16jekstrand: airlied, karolherbst: I'm just trying not to step on toes here.
21:16jekstrand:can never tell when everyone is happy with hot-and-fast development or if they want to give input.
21:16airlied: jekstrand: nope all good here, land please
21:17jekstrand: Ok. I'll give dcbaker[m] a chance to tell me I'm doing meson 100% wrong and then we'll start the marge train
21:23jekstrand: karolherbst, airlied, dcbaker[m]: Right now, configure is checking for libclc but is it actually checking for a version that contains the spirv?
21:23karolherbst: I am sure it doesn't check that
21:24dcbaker[m]: probably not, I didn't know what version to check
21:24dcbaker[m]: 10?
21:24jekstrand: 12, probably
21:24jekstrand: assuming it's versioned together with LLVM
21:24karolherbst: it's not
21:24jekstrand: of course not
21:24karolherbst: libclc is 0.2 here ;p
21:25karolherbst: let's see
21:25jenatali: Yeah, maybe we should bump the version so we can tell when it has a SPIR-V blob in it?
21:25jekstrand: Then we'll probably need to get someone to do a release. Who would that someone be? Stellard?
21:25EdB: not sure libclc saw a version bump since a long time
21:25dcbaker[m]: jekstrand: I only saw two things in meson, one looks like a testing bit you left on, and one where you've changed approach and need to test for mmap
21:26karolherbst: yeah.. I think adding spirv justifies bumping the version to 0.3
21:26EdB: yeah tsellard is usally the one that deal with libclc
21:29jekstrand: dcbaker[m]: Right.... the mmap thing may cause problems on Windows
21:29airlied: i think he might be still on parental leave
21:29dcbaker[m]: jekstrand: I'm pretty sure mmap is a unix-ism, but just making configure bail seems reasonable
21:30EdB: airlied: compile time is killing my 15m promises :)
21:30jekstrand: dcbaker[m]: The problem is that, as of my refactor, we always compile all the code paths (except the zstd one) all the time
21:30dcbaker[m]: oh, yeah, that's gonna break windows
21:30jenatali: Oh, yeah, it is
21:31jekstrand: I guess I could ditch the fall-back and say you have to either be static or dynamic
21:31jekstrand: And if you say -Dlibclc-static=spir64 then 32-bit just doesn't exist.
21:32jenatali: That's probably fine
21:32jekstrand: Yeah, I think so. I'll make that change.
21:32jenatali: FWIW Windows does have equivalent functionality to mmap, it's just different API names and much more verbose
21:33jekstrand: Yeah.... It's easy enough to roll back.
21:33jekstrand: It's probably better anyway
21:34airlied: jekstrand: fixing configure will make it messy for hacky ppl to drop the files into libclc installs :-p
21:34jekstrand: airlied: Yeah... I know. Someone has to suffer. :-P
21:38karolherbst: jenatali: I guess that changes in a few years until windows finally switches over to Linux as the kernel :p
21:38jenatali: :P
21:40EdB: airlied: pushed
21:44jekstrand: dcbaker[m]: What's with the -3 comment?
21:45dcbaker[m]: never mind that one, I misread the man page
21:45dcbaker[m]: I was reading the --fast option, not the -f option
21:46jekstrand: Yeah
21:46jekstrand: -f makes it overwrite the file
21:46jekstrand: Which is necessary to avoid build breakage if it ever chagnes
21:47jekstrand:uninstalls zstd and re-builds just to sanity check
21:51jekstrand:assigns marge
22:00airlied: daniels: oops, btw it seems to run piano twice on radeonsi
22:03daniels: tomeu: ^
22:06daniels: given it takes about a year to run anything on those AMD APUs, eliminating as many traces as we can is definitely a good thing tbh
22:06EdB: clSetKernelExecInfo is a strange implementation
22:06daniels: airlied: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7052
22:07EdB: I'm having hard time understanding it purpose
22:14karolherbst: ehhh
22:14karolherbst: that images stuff should work, but somehow the shader accesses an invalid address mhh
22:17jekstrand: :-/
22:18daniels: jekstrand, karolherbst, dcbaker[m], jenatali: a version check isn't sufficient for libclc, because you have to make sure the target is actually built as well
22:18jekstrand: daniels: So we should just check that the files we're looking for exist?
22:18jekstrand: dcbaker: I assume that such a check exists in meson?
22:19dcbaker[m]: yeah, meson has the fs module for that
22:19daniels: yeah, and whoever complains that they only install the libclc spv into DESTDIR and not sysroot gets to eat the pain
22:19daniels: or just keep it as a runtime failure with a competent error message rather than doing a build-time check
22:19dcbaker[m]: of course, that was added in meson 0.53 and I think we depend on 0.52... sigh
22:20daniels: (assuming you're loading it dynamically rather than inlining it through xxd)
22:20karolherbst: jekstrand: can you give my commit a go on top of whatever you had and see if you can figure out if I did something terribly wrong?
22:20dcbaker[m]: sigh, yes, we require >= 0.52
22:21jekstrand: karolherbst: What commit?
22:21karolherbst: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e5f8ed1a494ae61d47ea8e0a698ce7654cdfccb7
22:21karolherbst: that's a bit hacky, but it sould be enough for test_basic writeimage
22:22karolherbst: but maybe iris tells us what I did wrong or something? dunno..
22:26karolherbst: ehhh wait...
22:26karolherbst: why is it a u32 image...
22:26Kayden: can we require 0.53 if you're using CLC? :D
22:26Kayden: "Oh you said that option? NOPE."
22:27karolherbst: jekstrand: could it be that our image format inside the shader is just busted?
22:28karolherbst: ahh, no, it does use write_imagef
22:28karolherbst: mhh, int coordinates
22:29jekstrand: Uh... lots of fails....
22:29karolherbst: jekstrand: but any idea what goes wrong?
22:31karolherbst: ohhh.. crap.. I think I know
22:31jekstrand: karolherbst: I've got piles of fails in my image branch. Not sure why
22:31jekstrand: Maybe too much rebasing?
22:31karolherbst: ahh
22:31karolherbst: maybe
22:31karolherbst: but
22:31karolherbst: "format=0"
22:31karolherbst: this somehow sounds very wrong
22:32jekstrand: Yeah, that sounds wrong
22:33karolherbst: and because we do in shared coords adjustments based on the format, running out of bound could be a side effect or something.. dunno
22:35karolherbst: mhhh
22:35karolherbst: OpStore %_compoundliteral23 %vecinit25 Aligned 8
22:35karolherbst: jenatali: how can I deduce what format the image has?
22:36karolherbst: ehh "OpTypeImage %void 2D 0 0 0 0 Unknown WriteOnly" :/
22:37karolherbst: what the
22:37jekstrand: Yup. It's a thing
22:37karolherbst: https://gist.github.com/karolherbst/7deb1d904f156dd43e8384ce9f44a84d
22:37karolherbst: :/
22:38karolherbst: jekstrand: I guess we have to deduce from the arg type?
22:38karolherbst: OpImageWrite %100 %103 %104
22:38karolherbst: int coords and float values
22:38jekstrand: karolherbst: You always have int coord
22:38karolherbst: okay
22:38jekstrand: karolherbst: But the value type is now the intrinsic's dest_type
22:38karolherbst: ahhh
22:39karolherbst: so I guess I should use that instead?
22:39Kayden: For...writes?
22:39jekstrand: src_type for reads
22:39jekstrand: dest_type for reads src_type for writes
22:39jekstrand: sorry
22:39karolherbst: intrinsic image_store (ssa_105, ssa_102, ssa_4, ssa_97, ssa_105) (1, 0, 0, 8, 160) /* image_dim=2D */ /* image_array=false */ /* format=0 */ /* access=8 */ /* src_type=float32 */
22:39karolherbst: ahh, okay
22:39karolherbst: jekstrand: is that some vulkan/CL only thing then?
22:39karolherbst: because for now I ignore src_type
22:40karolherbst: all I can consume for now is a pipe format :)
22:42karolherbst: mhhh
22:42jekstrand: karolherbst: Just pushed my branch again. I had to fix a few things in the rebase
22:42karolherbst: sadly that doesn't change a thing
22:44karolherbst: yeah.. let me try with iris :)
22:44jekstrand: iris is still failing a bunch
22:44karolherbst: mhh
22:44karolherbst: did you got writeimages to pass before?
22:44jekstrand: I think so
22:46jekstrand: karolherbst: Looks like writeimage is failing right now
22:46karolherbst: test_basic: ../src/intel/isl/isl_storage_image.c:196: isl_lower_storage_image_format: Assertion `!"Unknown image format"' failed. :)
22:46jekstrand: let me see if I can sort it out
22:47karolherbst: yeah.. would be cool to have something working with the old path
22:47jekstrand: karolherbst: set INTEL_DEBUG=norbc
22:47karolherbst: mhh okay
22:47karolherbst: old path works
22:48karolherbst: new path still gives me test_basic: ../src/intel/isl/isl_storage_image.c:196: isl_lower_storage_image_format: Assertion `!"Unknown image format"' failed.
22:48karolherbst: mhhhh
22:48jekstrand: let me see if I can sort that out
22:48jekstrand: ugh... right. BGRA
22:49karolherbst: I am not even sure that's correct what I am doing there...
22:49jekstrand: karolherbst: There's a thing in my "iris: HACKS" patch which fixes that for the old path by not advertising support for PIPE_BIND_COMPUTE_RESOURCE together with BGRA
22:49karolherbst: ahh
22:50jekstrand: Technically, we can handle it. We just have to throw in a swizzle and I was lazy.
22:50karolherbst: mhhh... let's see
22:50karolherbst: the API exposes what is supported though
22:51jenatali: karolherbst: Sounds like you got your answer from jekstrand? Or did you still have a question for me?
22:51karolherbst: yeah
22:52karolherbst: jekstrand: heh...
22:52karolherbst: somethign is weird
22:53daniels: Kayden: nope, that's circular - once you're running in non-uniform Meson logic, it's too late to require a specific Meson version, because the version might affect the interpretation of the conditionals
22:55karolherbst: jekstrand: ahh...
22:55karolherbst: bgra is required :D
22:55karolherbst: that's why the CTS just uses that
22:56jekstrand: karolherbst: Of course it is. :)
22:56karolherbst: but why does the old path doesn't crash...
22:56Kayden: daniels: makes sense. probably better for people just to upgrade anyway
22:56karolherbst: uhm
22:56karolherbst: not crash
22:57jenatali: What's "new path" vs "old path" here?
22:57jenatali: Just curious
22:57jenatali: I skimmed too fast apparently
22:57karolherbst: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e5f8ed1a494ae61d47ea8e0a698ce7654cdfccb7
22:57jenatali: Ahh
22:58karolherbst: mhh
22:58karolherbst: let's break in set_compute_Resources
22:58karolherbst: PIPE_FORMAT_B8G8R8A8_UNORM
22:58karolherbst: okay.. hey
23:02karolherbst: jekstrand: apparently access and shader_access matter :p
23:02karolherbst: it passes now
23:04karolherbst: yay... I guess
23:04karolherbst: so with iris getting rid of set_compute_resources is quite trivial :p
23:04karolherbst: just have to figure out those access bits
23:06karolherbst: jekstrand: LOL...
23:06karolherbst: with fixing access and shader_acces it passes with nouveau as well :D
23:07karolherbst: nice :)
23:08karolherbst: or maybe because I pulled in your entire branch.. dunno :D
23:09karolherbst: yeah...
23:09karolherbst: oh well
23:20karolherbst: nice
23:21karolherbst: with iamges enabled: Pass 91 Fails 7 Crashes 8 Timeouts 0
23:21jenatali: That's totally progress in the right direction :P
23:21karolherbst: :D
23:21karolherbst: not my fault that test passes if you don't support the feature :p
23:23karolherbst: mhh, 3d images borked oh well
23:23karolherbst: okay.. first cleaning up what I really need to maybe just submit an MR? dunno :D
23:23karolherbst: I think porting the stuff over is probably the first step...
23:24karolherbst: not sure how it bites r600 though
23:24karolherbst: or radeonsi
23:24airlied: radeonsi doesn't work anyways
23:24airlied: r600 who knows (caes? :-P
23:24airlied: cares
23:26AndrewR: as remartk, I think ffmpe (4.3, git at least) depend on OpenCL 1.2 + images for some filters ( a bit faster to build than darktable/blender? with fewer deps)
23:27karolherbst: heh... I guess when I just blender now things just break terrible
23:28karolherbst: what...
23:28karolherbst: no compatible GPUs found?
23:28karolherbst: blender lies to me
23:29karolherbst: I think blender is just annoying and checks against AMD or whatever
23:33karolherbst: let's see if luxmark works :D
23:34AndrewR: karolherbst, :}
23:35karolherbst: ahh heh
23:35karolherbst: the compiler crashes :D
23:35karolherbst: ufff
23:36karolherbst: uff
23:36karolherbst:cries in ssa
23:36karolherbst: vec1 64 ssa_19182...
23:36karolherbst: jekstrand: wanna debug a nir sahder and why nir_opt_deref messes up? :D
23:37karolherbst: oh wow
23:37karolherbst: this kernel is insane
23:38karolherbst: " vec1 64 ssa_17793 = deref_struct &ssa_17792->field5 (global struct.DecomposedTransform) /* &(*(struct.InterpolatedTransform *)ssa_17789)[0].field5 */" :D
23:38karolherbst: the spaces are real
23:39karolherbst: jenatali: mind testing your stuff with luxmark 4? :D
23:39karolherbst: :D
23:39karolherbst: http://wiki.luxcorerender.org/LuxMark_v4 :p
23:40karolherbst: I actually like those kind of CL benchmarks where you actually see stuff
23:40jenatali: Sure, why not
23:40karolherbst: have fun :p
23:42karolherbst: jekstrand: we need a function to print a single nir instruction...
23:43karolherbst: mhhh
23:43karolherbst: is_vector_bitcast_deref
23:43karolherbst: glsl_get_bit_size(parent->type)
23:43karolherbst: but parent->type is a struct
23:44karolherbst: well.. union
23:44karolherbst: and glsl_get_bit_size really doens't like that
23:44jenatali: karolherbst: How far into it did you end up getting?
23:44karolherbst: compiling kernels :p
23:45jenatali: karolherbst: I got ~50% through rendering I think but then it looks like it barfed after that?
23:45karolherbst: wait... it compiles for you?
23:45karolherbst: unfair
23:46karolherbst: the... heck?
23:46karolherbst: it casts an union to a float :D
23:46karolherbst: jekstrand: I am sure you like that :p
23:47jenatali: [LuxCore][104.922] Saving persistent PhotonGI cache: scenes/hallbench/hallbench.pgi
23:47jenatali: RenderSession starting error:
23:47jenatali: ios_base::failbit set: iostream stream error
23:47karolherbst: lol...
23:47jenatali: No idea what happened
23:48karolherbst: libclc probably
23:48karolherbst: or something else :p
23:48karolherbst: 4 is in alpha/beta whatever
23:48jenatali: Nah, that's after 105 seconds of running kernels
23:48karolherbst: maybe we should try out 3?
23:48karolherbst: http://wiki.luxcorerender.org/LuxMark#Binaries
23:48karolherbst: but 4 is the first one with real linux binaries afaik
23:48karolherbst: ahh.. 3.1 has linux as well
23:52karolherbst: ehh.. decl_var uniform INTERP_MODE_NONE sampler @0 (~0, 0, 0)
23:54karolherbst: mhhh
23:54jenatali: I'm trying food now instead of hallbench and now I see it taking a while to compile kernels
23:55karolherbst: something is wrong with assigning uniform locations
23:55karolherbst: mhhh
23:55karolherbst: https://gist.githubusercontent.com/karolherbst/2c7542aeecd641ce24cff68a4a7a0fe0/raw/0df821105fe0ec20910bc7f9328c62a87a47feff/gistfile1.txt
23:56jenatali: I'm up to 9GB of RAM used by this shader compiler...
23:57jenatali: 10GB...
23:59airlied: may be one case where inline all the things is a bad plan
23:59jenatali: Yeah... let me try adding the patches that optimize *before* inlining...