00:44 jekstrand: imirkin: Yeah, but we can pack/unpack around it
01:41 mareko: packing 64 bits like 2x32 and splitting them to different slots seems fine to me
01:43 mareko: I would also say that 16-bit varyings are 1000x more important than 64-bit varyings
02:19 Kayden: yeah, exactly...imirkin is right, they need to be aligned unless we bother splitting and recombining on both ends. which would be fine to do. but, really, -EDONTCARE about 64-bit varyings...aligning may be a bit wasteful, but *shrug*
02:19 Kayden: whichever's easier really
02:19 Kayden: packing 16-bit varyings is definitely a thing we want though
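[editor's note] The "pack 64 bits like 2x32" idea above can be sketched in plain C. This is only an illustration of the split-and-recombine arithmetic, not Mesa code: the struct and helper names are hypothetical, chosen to echo NIR's `pack_64_2x32` / `unpack_64_2x32` opcodes, and how the two words would actually be assigned to varying slots is not shown.

```c
#include <stdint.h>

/* Two 32-bit halves of one 64-bit value. In the scheme discussed above,
 * each half could be placed in a different varying slot. (Hypothetical type.) */
struct u32_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Producer side: split a 64-bit value into its low and high 32-bit words. */
static struct u32_pair unpack_64_2x32(uint64_t v)
{
    struct u32_pair p;
    p.lo = (uint32_t)(v & 0xffffffffu);
    p.hi = (uint32_t)(v >> 32);
    return p;
}

/* Consumer side: recombine the two words into the original 64-bit value. */
static uint64_t pack_64_2x32(struct u32_pair p)
{
    return ((uint64_t)p.hi << 32) | (uint64_t)p.lo;
}
```

Because the split is lossless, the two words need no 64-bit alignment between stages; the cost is the extra unpack and pack on either end, which is the trade-off Kayden describes.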
11:54 EdB: hello karolherbst
12:00 karolherbst: hi
12:02 EdB: I was sure that we read in the spec that row_pitch and slice have to be filled Oo
12:02 EdB: but didn't manage to find it again
12:06 karolherbst: it's in clEnqueueMapImage
12:06 karolherbst: image_row_pitch and image_slice_pitch args
12:07 karolherbst: in imageCreate we only do it for internal reasons afaik.. and it would be important for host pointers anyway
12:07 karolherbst: but I think images with host pointers are also just broken atm
12:07 karolherbst: but maybe we should keep it 0 if it's not application provided?
12:08 karolherbst: but we also have CL_IMAGE_SLICE_PITCH mhhh...
12:09 karolherbst: I think the painful part is that the pitches can be different depending on how you get it
12:09 karolherbst: clGetImageInfo might return something different compared to clEnqueueMapImage
12:09 karolherbst: no idea what you need CL_IMAGE_SLICE_PITCH for anyway...
12:10 karolherbst: or CL_IMAGE_ROW_PITCH
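[editor's note] For readers unfamiliar with the pitches under discussion, here is a minimal sketch of how a row pitch and slice pitch are typically used to address a texel in mapped image memory. The helper names are hypothetical and this is not clover code. The 2-byte texel size used in the note below the code is an assumption; it happens to reproduce the "pitch 172,15652" reported for an 86x91 image in the CTS error later in this log.

```c
#include <stddef.h>

/* Byte offset of texel (x, y, z) inside a mapped image, given the pitches
 * that came back from the map call. All sizes are in bytes.
 * (Hypothetical helper, for illustration only.) */
static size_t texel_offset(size_t x, size_t y, size_t z,
                           size_t elem_size,
                           size_t row_pitch, size_t slice_pitch)
{
    return z * slice_pitch + y * row_pitch + x * elem_size;
}

/* Pitches of a tightly packed image. A driver may use larger, padded
 * pitches, so a value queried one way need not match one obtained another way. */
static size_t packed_row_pitch(size_t width, size_t elem_size)
{
    return width * elem_size;
}

static size_t packed_slice_pitch(size_t height, size_t row_pitch)
{
    return height * row_pitch;
}
```

With a width of 86, a height of 91 and 2-byte texels, the packed row pitch is 172 and the packed slice pitch is 15652. If the driver pads rows (say to 256 bytes), the same texel lands at a different offset, so code that reads mapped memory has to use the pitches returned by that particular map call.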
12:11 EdB: maybe in order to properly read it back when a copy is made in host memory ...
12:13 karolherbst: all those pitch stuff was a mistake in CL anyway...
12:13 karolherbst: *that
12:13 EdB: yeah
12:13 karolherbst: either you know what's perfect for the driver or you don't set it at all :)
12:13 EdB: host and gpu could differ on that
12:13 karolherbst: right
12:14 karolherbst: and hence all those transfer objects
12:14 karolherbst: heck.. even for a simple map we create one
12:16 karolherbst: although those don't have to necessarily create a copy in host memory
12:17 karolherbst: we really should use persistent mapping explicitly
12:17 karolherbst: ohh, there is also PIPE_MAP_DIRECTLY.
12:18 EdB: ah, you removed filling the values because the values filled in were incorrect, not because it's incorrect to fill them
12:18 karolherbst: yeah
12:18 karolherbst: I have a proper patch later on
12:18 karolherbst: llvmpipe has quite some padding
12:19 EdB: sorry I didn't get that at first :)
12:19 karolherbst: np
12:20 karolherbst: I am not sure if this just works for 1Darray though... but I will look into it once I clean up the other patches
12:21 karolherbst: 1Darrays are annoying..
12:21 karolherbst: gallium has the slice on the third dimension, CL on the second
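[editor's note] The 1D-array mismatch mentioned above can be illustrated with a small conversion helper. This is a sketch under stated assumptions: `struct box3d` is a hypothetical stand-in, not gallium's real `struct pipe_box`, and the function name is made up. It relies only on the OpenCL convention that, for a 1D image array, `origin[1]` and `region[1]` carry the array index and the number of array elements.

```c
#include <stddef.h>

/* Hypothetical stand-in for a gallium-style 3D box (not struct pipe_box). */
struct box3d {
    size_t x, y, z;
    size_t width, height, depth;
};

/* Convert a CL origin/region pair for a 1D image array into a box that
 * keeps the array layer in the third dimension. */
static struct box3d box_from_cl_1d_array(const size_t origin[3],
                                         const size_t region[3])
{
    struct box3d b;
    b.x = origin[0];
    b.width = region[0];
    b.y = 0;            /* a 1D image has no second spatial dimension */
    b.height = 1;
    b.z = origin[1];    /* CL keeps the array index in slot 1 */
    b.depth = region[1];
    return b;
}
```

A transfer covering texels 4..19 of layers 2..4 would be passed by the application as origin `{4, 2, 0}` and region `{16, 3, 1}`, and would come out of this helper with `z = 2` and `depth = 3`.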
12:29 EdB: so .gitlab-ci/piglit/cl.txt is used as a way to check whether the behavior stays the same?
12:29 karolherbst: yeah
12:55 EdB: karolherbst: also r600 and radeonsi images are not activated, so the pipe_image_view won't break them :)
12:56 EdB: what are "the proper const buffer interfaces"?
12:57 karolherbst: set_constant_buffer
12:57 karolherbst: essentially we just need to create pipe_constant_buffer objects and pass it in
12:57 karolherbst: it is quite simple
12:57 karolherbst: I had a patch somewhere
12:57 karolherbst: but with nir we don't use this path anyway
12:57 karolherbst: but
12:58 karolherbst: it would make sense to use it for the kernel input buffer
12:58 karolherbst: so we could slim down the gallium launch_grid api a little
12:58 karolherbst: but yeah, also makes sense to use for constant* memory which is guaranteed to be a pipe_constant_buffer
12:58 karolherbst: with SVM it can also be a host buffer, which makes it quite annoying
12:59 karolherbst: airlied: do you think we could just enable system SVM on llvmpipe? :D
13:29 EdB: karolherbst: got info. On radeonsi I have ac_rtld error: symbol _Z15get_image_width14ocl_image2d_ro: unknown on test_kernel_image_methods 2D
13:30 EdB: so it seems radeonsi is not ready yet !
13:39 zmike: Lightkey: oh oops
14:18 karolherbst: EdB: mhh
14:18 karolherbst: EdB: ohh, right, it uses the libclc headers?
14:18 karolherbst: yeah.. I think they need the 1.2 API
14:18 EdB: it's a libclc issue
14:19 EdB: indeed
14:19 karolherbst: I will take care of the CL 1.2 stuff in the next MR though hopefully.. I still need to handle those format bits as well
14:19 karolherbst: at least for nir
14:20 karolherbst: EdB: but does the image stuff work alright with pipe_image_view on radeonsi?
14:20 EdB: api part seems to work
14:21 karolherbst: cool
14:21 karolherbst: but I don't think that's relevant :D
14:22 karolherbst: EdB: mind checking if kernel_read_write works?
14:22 karolherbst: I think that uses simple enough kernels
14:22 karolherbst: or is it hitting image_width as well?
14:23 EdB: seems to work for most of the part
14:24 karolherbst: cool
14:24 EdB: I got some ERROR: Scanline 86,7 did not verify for image size 86,91,103 pitch 172,15652
14:24 karolherbst: I also got some random fails, but the values were close enough
14:24 karolherbst: most of the sub tests are passing though
14:24 EdB: FAILED 24 of 304 sub-tests.
14:24 EdB: FAILED 2 of 5 tests.
14:24 karolherbst: I'd say this is close enough :D
14:25 EdB: but 1Darray is reported as passed...
14:25 EdB: let me go back to CL1.1
14:25 karolherbst: do you have my old branch or what's in my MR right now?
14:27 karolherbst: btw, all the fails I see are with CL_FILTER_LINEAR
14:27 EdB: I have a special branch of mine with clover: use pipe_image_view for images instead of set_compute_resources
14:27 karolherbst: I see
14:28 karolherbst: mind testing with my MR instead? And if you need any additional patches I can include those as well
14:28 EdB: + some awatry patches
14:28 karolherbst: would like to "fix" it first and then add new features
14:29 karolherbst: also.. the tests are soo slow.. but probably because I build with O0
14:29 karolherbst: mesa and the CTS
14:31 EdB: there is a lot to test
14:32 EdB: for nir, will you use some stuff from libclc?
14:32 karolherbst: we already do, but not the headers
14:32 karolherbst: we just link against a selected list of functions
14:32 karolherbst: and we link on a nir level
14:33 EdB: because I'm wondering how much faster it would be to link to the amd ROC cl device lib instead of reimplementing the missing bits in libclc
14:33 karolherbst: I think those are already implemented
14:33 EdB: no for 1.2
14:33 EdB: not
14:33 karolherbst: Microsoft also uses libclc for their stack and somebody added that kind of stuff
14:34 karolherbst: oh, I am sure there is stuff somewhere
14:34 karolherbst: maybe not for AMD
14:34 karolherbst: but the generic code should be there
14:34 karolherbst: not sure if it's a PR or master branch or something
14:34 EdB: yeah, libclc has a target part
14:34 EdB: where some target specifics are implemented
14:34 karolherbst: but you can also compile from the generic code, no?
14:35 EdB: but there are some intrinsics to call depending on the target
14:35 EdB: for example for image
14:35 karolherbst: mhhh
14:35 EdB: %img_id = call i32 @llvm.OpenCL.image.get.resource.id.2d(
14:35 karolherbst: it might be those are not implemented as we have spirv equivalents
14:36 karolherbst: and the spirv-llvm-translator translates that stuff for us already
14:36 karolherbst: but I suspect it's just the image functions
14:37 EdB: maybe some atomics too
14:37 karolherbst: ohh, right
14:37 karolherbst: those as well
14:37 karolherbst: mhhh
14:37 karolherbst: maybe you need a similar approach as we do now where you can just keep using llvm intrinsics directly without having to go through libclc...
14:38 karolherbst: but I also don't know how that works in detail
14:38 karolherbst: just that the spirv-llvm-translator has some passes to translate some stuff
14:38 karolherbst: on an llvm level
14:38 karolherbst: all those functions have 1:1 mappings anyway
14:39 karolherbst: and there wouldn't be an "implementation", just a mapping rather
14:42 EdB: is the libclc part merged ?
14:43 karolherbst: what do you mean by libclc part?
14:43 EdB: for the nir stuff
14:43 karolherbst: yeah
14:59 EdB: karolherbst: your branch seems ok for me
15:01 EdB: (especially the commits from this Serge Martin one ;) )
15:02 karolherbst: :p
15:02 karolherbst: but cool, glad to hear
15:03 karolherbst: I think I was dropping a few patches as they weren't needed anymore.. not quite sure
15:03 karolherbst: but anyway. If those work for radeonsi as well, I think we should get those merged then :)
15:03 karolherbst: and with the CL 1.2 MR I try to fix all the remaining issues as well
15:05 karolherbst: EdB: what I am curious about is how read only images are implemented, because based on their nature those should be texture ops
15:05 karolherbst: and hence allow more texture objects than images
15:05 karolherbst: this is the "clover/device: use PIPE_MAX_SHADER_SAMPLER_VIEWS for max_images_read" change
15:05 karolherbst: no idea if that causes any issues for radeonsi
15:06 karolherbst: I think not, as we use the sampler/sampler_views API for those already
15:06 karolherbst: but..
15:06 karolherbst: can't test it :)
15:06 EdB: as I said, I guess no one ever really uses images on amd cards
15:06 karolherbst: yeah...
15:06 karolherbst: I think some people said it works somehow on r600?
15:06 karolherbst: no idea really though
15:07 EdB: even this stuff can't be activated using a variable
15:07 karolherbst: but seeing how many little things are broken, I doubt anybody would notice
15:07 karolherbst: or I hope nobody uses it in production atm :D
15:08 karolherbst: like the host ptr stuff you ran into with the radeon kernel driver
15:08 EdB: it might have worked since there is some stuff in libclc
15:08 karolherbst: yeah.. or it was just added to be able to use one application or something
15:08 karolherbst: no idea really
15:09 EdB: but that was probably when amd people thought clover was the way to go
15:10 karolherbst: maybe
15:10 EdB: Jan Vesely seems to have some interest in having clover work for r600
15:11 karolherbst: ahh, right, that's jvesely, but doesn't seem around atm
19:12 airlied: karolherbst: llvmpipe svm is probably a good idea, esp if it helps testing
19:14 EdB: karolherbst: hacking r600 libclc back didn't help :/
19:14 EdB: ac_rtld error: symbol llvm.OpenCL.image.get.size.2d: unknown
19:15 EdB: it seems that some stuff vanished
19:18 karolherbst: airlied: yeah.. I wouldn't be surprised if it just works (tm)
19:37 EdB__: ok, r600 had a specific lowering pass
19:37 EdB: amdgcn hav not
19:37 EdB: have not
19:38 EdB: now I need to find where the image size and format are expected to be found
19:40 karolherbst: EdB: clover puts some of that inside the input buffer
19:40 EdB: karolherbst: only r600 could potentially have had image support
19:40 karolherbst: module::argument::image_size and module::argument::image_format handling
19:41 EdB: not for amdgcn
19:41 karolherbst: right...
19:41 karolherbst: I guess that just happens to work
19:41 karolherbst: or not at all
19:41 EdB: those were bound by if (type_name == "__llvm_image_size")
19:42 EdB: but those implicit args are only added by the LLVM r600 target
19:44 EdB: so either I try to figure out how they do it on ROC or I try to do it like it's done for other parts of mesa
20:00 AndrewR: karolherbst, I'm getting "ERROR: unknown nir_intrinsic_op image_deref_format" from attempt at running OpenCL-CTS/build/test_conformance/images/kernel_image_methods . Is this normal for now?
20:04 karolherbst: AndrewR: yeah
20:04 karolherbst: I will fix it soon enough
20:13 AndrewR: karolherbst, thanks!
21:06 macc24: is there anything in mesa that i could use to utilize gpu of another computer?
21:21 linkmauve: macc24, while still running your program on the local CPU? What is the latency you expect?
21:22 macc24: linkmauve: between server and client? ping shows ~0.1ms
21:22 macc24: on gigabit ethernet link
21:23 linkmauve: (There were two questions.)
21:23 macc24: yes
21:23 macc24: yes, while still running program on local cpu
21:24 kisak: maybe something with virgl?
21:24 linkmauve: Indirect GLX might still exist inside of Mesa, but that ties you to X11 and the performance might be terrible depending on your workload.
21:24 macc24: isn't indirect glx limited to opengl 1.4?
21:25 linkmauve: Dunno.
21:25 macc24: and due to driver issues i am limited to running wayland on both machines
21:25 imirkin: i don't think so - GL 2.x or even GL 3.0 should work OK
21:25 imirkin: and you get all the various extensions
21:25 macc24: huh
21:26 linkmauve: macc24, Xwayland exists.
21:26 macc24: yeah i use it
21:27 linkmauve: If you could also run the CPU side on the remote computer I’d recommend waypipe.
21:27 linkmauve: I use it and it’s great!
21:27 macc24: oh i used virtualgl
21:56 macc24: technically indirect glx works, but the opengl program looks more like a slideshow than anything useful
21:56 macc24: on local display...
21:56 HdkR: That's the expected result
21:57 HdkR: Remote GL is unlikely to ever be good. It's serviceable and that's it
22:02 macc24: ...