00:44jekstrand: imirkin: Yeah, but we can pack/unpack around it
01:41mareko: packing 64 bits like 2x32 and splitting them to different slots seems fine to me
01:43mareko: I would also say that 16-bit varyings are 1000x more important than 64-bit varyings
02:19Kayden: yeah, exactly...imirkin is right, they need to be aligned unless we bother splitting and recombining on both ends. which would be fine to do. but, really, -EDONTCARE about 64-bit varyings...aligning may be a bit wasteful, but *shrug*
02:19Kayden: whichever's easier really
02:19Kayden: packing 16-bit varyings is definitely a thing we want though
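[Editor's note] The "pack 64 bits like 2x32" idea discussed above can be sketched roughly as follows. This is only an illustration of splitting a 64-bit value into two 32-bit halves (which could then live in separate slots) and recombining them on the other end. The function names are hypothetical and are not taken from Mesa.

```python
MASK32 = 0xFFFFFFFF  # low 32 bits


def unpack_64_2x32(value):
    """Split an unsigned 64-bit integer into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def pack_64_2x32(low, high):
    """Recombine two 32-bit halves into one unsigned 64-bit integer."""
    return ((high & MASK32) << 32) | (low & MASK32)


if __name__ == "__main__":
    low, high = unpack_64_2x32(0x1122334455667788)
    print(hex(low), hex(high))
    print(hex(pack_64_2x32(low, high)))
```

Because the two halves are independent 32-bit values, they no longer need 64-bit alignment, at the cost of an unpack on the producer side and a pack on the consumer side.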
11:54EdB: hello karolherbst
12:02EdB: I was sure that we read in the spec that row_pitch and slice have to be filled Oo
12:02EdB: but didn't manage to find it again
12:06karolherbst: it's in clEnqueueMapImage
12:06karolherbst: image_row_pitch and image_slice_pitch args
12:07karolherbst: in imageCreate we only do it for internal reasons afaik.. and it would be important for host pointers anyway
12:07karolherbst: but I think images with host pointers are also just broken atm
12:07karolherbst: but maybe we should keep it 0 if it's not application provided?
12:08karolherbst: but we also have CL_IMAGE_SLICE_PITCH mhhh...
12:09karolherbst: I think the painful part is that the pitches can be different depending on how you get it
12:09karolherbst: clGetImageInfo might return something different compared to clEnqueueMapImage
12:09karolherbst: no idea what you need CL_IMAGE_SLICE_PITCH for anyway...
12:10karolherbst: or CL_IMAGE_ROW_PITCH
12:11EdB: maybe in order to properly read it back when a copy is made in host memory ...
12:13karolherbst: all that pitch stuff was a mistake in CL anyway...
12:13karolherbst: either you know what's perfect for the driver or you don't set it at all :)
12:13EdB: host and gpu could differ on that
12:14karolherbst: and hence all those transfer objects
12:14karolherbst: heck.. even for a simple map we create one
12:16karolherbst: although those don't have to necessarily create a copy in host memory
12:17karolherbst: we really should use persistent mapping explicitly
12:17karolherbst: ohh, there is also PIPE_MAP_DIRECTLY.
12:18EdB: ah, you removed filling the values because the values filled were incorrect, not because it's incorrect to fill them
12:18karolherbst: I have a proper patch later on
12:18karolherbst: llvmpipe has quite some padding
12:19EdB: sorry I didn't get that at first :)
12:20karolherbst: I am not sure if this just works for 1Darray though... but I will look into it once I clean up the other patches
12:21karolherbst: 1Darrays are annoying..
12:21karolherbst: gallium has the slice on the third dimension, CL on the second
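[Editor's note] A minimal sketch of why the row and slice pitches matter and why they can differ between a tightly packed host copy and a padded driver layout. The helper names, the image dimensions, and the 64-byte row alignment are all hypothetical, chosen only for illustration. They do not describe what llvmpipe or clover actually do.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def pitches(width, height, bytes_per_pixel, row_alignment=1):
    """Return (row_pitch, slice_pitch) in bytes for a given row alignment."""
    row_pitch = align_up(width * bytes_per_pixel, row_alignment)
    slice_pitch = row_pitch * height
    return row_pitch, slice_pitch


def texel_offset(x, y, z, bytes_per_pixel, row_pitch, slice_pitch):
    """Byte offset of texel (x, y, z) inside a mapping with these pitches."""
    return z * slice_pitch + y * row_pitch + x * bytes_per_pixel


if __name__ == "__main__":
    packed = pitches(86, 91, 2)                    # tightly packed rows
    padded = pitches(86, 91, 2, row_alignment=64)  # hypothetical padding
    print(packed, padded)
    print(texel_offset(1, 1, 1, 2, *packed))
    print(texel_offset(1, 1, 1, 2, *padded))
```

The same texel lands at a different byte offset in each layout, so code reading a mapping has to use the pitches that belong to that particular mapping rather than ones obtained some other way.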
12:29EdB: so .gitlab-ci/piglit/cl.txt is used as a way to check whether the behavior stays the same?
12:55EdB: karolherbst: also r600 and radeonsi images are not activated so the pipe_image_view won't break them :)
12:56EdB: what are "the proper const buffer interfaces"?
12:57karolherbst: essentially we just need to create pipe_constant_buffer objects and pass it in
12:57karolherbst: it is quite simple
12:57karolherbst: I had a patch somewhere
12:57karolherbst: but with nir we don't use this path anyway
12:58karolherbst: it would make sense to use it for the kernel input buffer
12:58karolherbst: so we could slim down the gallium launch_grid api a little
12:58karolherbst: but yeah, also makes sense to use for constant* memory which is guaranteed to be a pipe_constant_buffer
12:58karolherbst: with SVM it can also be a host buffer, which makes it quite annoying
12:59karolherbst: airlied: do you think we could just enable system SVM on llvmpipe? :D
13:29EdB: karolherbst: got info. On radeonsi I have ac_rtld error: symbol _Z15get_image_width14ocl_image2d_ro: unknown on test_kernel_image_methods 2D
13:30EdB: so it seems radeonsi is not ready yet !
13:39zmike: Lightkey: oh oops
14:18karolherbst: EdB: mhh
14:18karolherbst: EdB: ohh, right, it uses the libclc headers?
14:18karolherbst: yeah.. I think they need the 1.2 API
14:18EdB: it's a libclc issue
14:19karolherbst: I will take care of the CL 1.2 stuff in the next MR though hopefully.. I still need to handle those format bits as well
14:19karolherbst: at least for nir
14:20karolherbst: EdB: but does the image stuff work alright with pipe_image_view on radeonsi?
14:20EdB: api part seems to work
14:21karolherbst: but I don't think that's relevant :D
14:22karolherbst: EdB: mind checking if kernel_read_write works?
14:22karolherbst: I think that uses simple enough kernels
14:22karolherbst: or is it hitting image_width as well?
14:23EdB: seems to work for the most part
14:24EdB: I got some ERROR: Scanline 86,7 did not verify for image size 86,91,103 pitch 172,15652
14:24karolherbst: I also got some random fails, but the values were close enough
14:24karolherbst: most of the sub tests are passing though
14:24EdB: FAILED 24 of 304 sub-tests.
14:24EdB: FAILED 2 of 5 tests.
14:24karolherbst: I'd say this is close enough :D
14:25EdB: but 1Darray is reported as passed...
14:25EdB: let me be back to CL1.1
14:25karolherbst: do you have my old branch or what's in my MR right now?
14:27karolherbst: btw, all the fails I see are with CL_FILTER_LINEAR
14:27EdB: I have a special branch of mine with clover: use pipe_image_view for images instead of set_compute_resources
14:27karolherbst: I see
14:28karolherbst: mind testing with my MR instead? And if you need any additional patches I can include those as well
14:28EdB: + some awatry patches
14:28karolherbst: would like to "fix" it first and then add new features
14:29karolherbst: also.. the tests are soo slow.. but probably because I build with O0
14:29karolherbst: mesa and the CTS
14:31EdB: there is a lot to test
14:32EdB: for nir, will you use some stuff from libclc?
14:32karolherbst: we already do, but not the headers
14:32karolherbst: we just link against a selected list of functions
14:32karolherbst: and we link on a nir level
14:33EdB: because I'm wondering how much faster it would be to link to the amd ROC cl device lib instead of reimplementing the missing bits in libclc
14:33karolherbst: I think those are already implemented
14:33EdB: not for 1.2
14:33karolherbst: Microsoft also uses libclc for their stack and somebody added that kind of stuff
14:34karolherbst: oh, I am sure there is stuff somewhere
14:34karolherbst: maybe not for AMD
14:34karolherbst: but the generic code should be there
14:34karolherbst: not sure if it's a PR or master branch or something
14:34EdB: yeah, libclc has a target part
14:34EdB: where some target specifics are implemented
14:34karolherbst: but you can also compile from the generic code, no?
14:35EdB: but there are some intrinsics to call depending on the target
14:35EdB: for example for images
14:35EdB: %img_id = call i32 @llvm.OpenCL.image.get.resource.id.2d(
14:35karolherbst: it might be those are not implemented as we have spirv equivalents
14:36karolherbst: and the spirv-llvm-translator translates that stuff for us already
14:36karolherbst: but I suspect it's just the image functions
14:37EdB: maybe some atomics too
14:37karolherbst: ohh, right
14:37karolherbst: those as well
14:37karolherbst: maybe you need a similar approach as we do now where you can just keep using llvm intrinsics directly without having to go through libclc...
14:38karolherbst: but I also don't know how that works in detail
14:38karolherbst: just that the spirv-llvm-translator has some passes to translate some stuff
14:38karolherbst: on an llvm level
14:38karolherbst: all those functions have 1:1 mappings anyway
14:39karolherbst: and there wouldn't be an "implementation", just a mapping rather
14:42EdB: is the libclc part merged ?
14:43karolherbst: what do you mean by libclc part?
14:43EdB: for the nir stuff
14:59EdB: karolherbst: your branch seems ok for me
15:01EdB: (especially the commits from this Serge Martin one ;) )
15:02karolherbst: but cool, glad to hear
15:03karolherbst: I think I was dropping a few patches as they weren't needed anymore.. not quite sure
15:03karolherbst: but anyway. If those work for radeonsi as well, I think we should get those merged then :)
15:03karolherbst: and with the CL 1.2 MR I try to fix all the remaining issues as well
15:05karolherbst: EdB: what I am curious about is how read only images are implemented, because based on their natures those should be texture ops
15:05karolherbst: and hence allow more texture objects than images
15:05karolherbst: this is the "clover/device: use PIPE_MAX_SHADER_SAMPLER_VIEWS for max_images_read" change
15:05karolherbst: no idea if that causes any issues for radeonsi
15:06karolherbst: I think not, as we use the sampler/sampler_views API for those already
15:06karolherbst: can't test it :)
15:06EdB: as I said, I guess no one ever really uses images on amd cards
15:06karolherbst: I think some people said it works somehow on r600?
15:06karolherbst: no idea really though
15:07EdB: even this stuff can't be activated using a variable
15:07karolherbst: but seeing how many little things are broken, I doubt anybody would notice
15:07karolherbst: or I hope nobody uses it in production atm :D
15:08karolherbst: like the host ptr stuff you ran into with the radeon kernel driver
15:08EdB: it might have worked since there is some stuff in libclc
15:08karolherbst: yeah.. or it was just added to be able to use one application or something
15:08karolherbst: no idea really
15:09EdB: but that was probably when amd people thought clover was the way to go
15:10EdB: Jan Vesely seems to have some interest in having clover work for r600
15:11karolherbst: ahh, right, that's jvesely, but doesn't seem to be around atm
19:12airlied: karolherbst: llvmpipe svm is probably a good idea, esp if it helps testing
19:14EdB: karolherbst: hacking r600 libclc back didn't help :/
19:14EdB: ac_rtld error: symbol llvm.OpenCL.image.get.size.2d: unknown
19:15EdB: it seems that some stuff vanished
19:18karolherbst: airlied: yeah.. I wouldn't be surprised if it just works (tm)
19:37EdB__: ok, r600 had a specific lowering pass
19:37EdB: amdgcn has not
19:38EdB: now I need to find where the image size and format are expected to be found
19:40karolherbst: EdB: clover puts some of that inside the input buffer
19:40EdB: karolherbst: only r600 could potentially have had image support
19:40karolherbst: module::argument::image_size and module::argument::image_format handling
19:41EdB: not for amdgcn
19:41karolherbst: I guess that just happens to work
19:41karolherbst: or not at all
19:41EdB: those were bound by if (type_name == "__llvm_image_size")
19:42EdB: but those implicit args are only added by the LLVM r600 target
19:44EdB: so either I try to figure out how they do on ROC or I try to do it like it's done for other part of mesa
20:00AndrewR: karolherbst, I'm getting "ERROR: unknown nir_intrinsic_op image_deref_format" from attempt at running OpenCL-CTS/build/test_conformance/images/kernel_image_methods . Is this normal for now?
20:04karolherbst: AndrewR: yeah
20:04karolherbst: I will fix it soon enough
20:13AndrewR: karolherbst, thanks!
21:06macc24: is there anything in mesa that i could use to utilize gpu of another computer?
21:21linkmauve: macc24, while still running your program on the local CPU? What is the latency you expect?
21:22macc24: linkmauve: between server and client? ping shows ~0.1ms
21:22macc24: on gigabit ethernet link
21:23linkmauve: (There were two questions.)
21:23macc24: yes, while still running program on local cpu
21:24kisak: maybe something with virgl?
21:24linkmauve: Indirect GLX might still exist inside of Mesa, but that ties you to X11 and the performance might be terrible depending on your workload.
21:24macc24: isn't indirect glx limited to opengl 1.4?
21:25macc24: and due to driver issues i am limited to running wayland on both machines
21:25imirkin: i don't think so - GL 2.x or even GL 3.0 should work OK
21:25imirkin: and you get all the various extensions
21:26linkmauve: macc24, Xwayland exists.
21:26macc24: yeah i use it
21:27linkmauve: If you could also run the CPU side on the remote computer I’d recommend waypipe.
21:27linkmauve: I use it and it’s great!
21:27macc24: oh i used virtualgl
21:56macc24: technically indirect glx works, but the opengl program looks more like a slideshow than anything useful
21:56macc24: on local display...
21:56HdkR: That's the expected result
21:57HdkR: Remote GL is unlikely to ever be good. It's serviceable and that's it