06:03 kurufu: Is it possible to get perf to report symbols for amdgpu module stacks? They seem to be recorded normally with `sudo perf -a`, and `perf script (--show-kernel-path)` seems to report built-in modules fine. However, amdgpu was built as m instead of y, so maybe I need something different to symbolize the stacks?
06:05 kurufu: i.e. drm symbols between amdgpu stacks show up fine.
08:54 MrCooper: is it expected that importing a dma-buf for a dumb BO with eglCreateImageKHR requires passing the offset from struct drm_mode_map_dumb? (https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/4240#note_2340394)
09:31 emersion: i haven't seen that
09:52 mlankhorst: The mmap offset is used when mmaping into the drm fd, I don't think it's required for importing
09:54 mlankhorst: In fact counterproductive, you would mmap the dma-buf at offset 0
09:59 MrCooper: per the mutter discussion I linked, passing offset 0 to eglCreateImageKHR results in all black, whereas passing the offset from struct drm_mode_map_dumb results in correct output
10:04 sima: MrCooper, yeah that sounds like busted driver implementation
10:06 sima: since the map_dumb offset is within the drm_fd and doesn't make sense anywhere else, and has nothing to do with any kind of in-buffer offset you might pass when recreating an image from metadata
10:15 MrCooper: right, thanks, I guess it could actually be a mutter issue though, maybe it just "worked" in a different way than assumed
10:21 karolherbst: alyssa: sooo... I'm kinda thinking about using some of your CL C stuff, though I'm also wondering if I want it to be a bit higher level... like a CL meta thing. What I want to do is to accelerate certain CL APIs by using kernels instead of well... CPU copies 🙃 but I also don't really want to integrate any of those helpers on the rust side,
10:21 karolherbst: so I was wondering if I want to simply compile cl kernels to spirv, and then do the meta thing instead of using it as an internal helper lib like done for the other drivers.
10:52 sima: airlied, btw on missing drops and accidentally wrong nesting, quite a while ago we discussed adding lockdep annotations to Arc around kref_put so that you never drop a reference while holding locks that would cause trouble if it's the final reference
10:52 sima: I think lina had the patches once somewhere, but no idea where they are
10:52 sima: dakr, ^^
10:53 sima: it's already a pain in C, but with rust's drop mostly not being visible in the source code, it's worse there imo
11:32 krh: karolherbst: rust to spirv instead of CL C?
11:37 karolherbst: not sure I want to open that can of worms yet
11:38 glehmann: rewrite rusticl in C? :P
11:43 karolherbst: I think the simplest path is to write kernels and simply use whatever internal APIs I have to load the spirv... maybe embed it into the lib so I won't have to do weirdo fs operations
13:29 Ermine: MrCooper: I guess the driver in question is nvidia?
13:29 krh: karolherbst: yeah, kernels in rust is definitely not a well-trodden path
13:30 karolherbst: I only need them for memory copies, so writing the kernels themselves isn't the issue here anyway
13:31 karolherbst: CL allows for strided buffer copies and other funky things
13:32 MrCooper: Ermine: good guess in general, and it's close but nouveau; anyway, I suspect it's something else funky, not a driver issue
13:40 Ermine: it uses default dumb_map_offset impl, so yeah
14:03 vignesh: sima: I'm going to test this lockdep patch https://gitlab.freedesktop.org/vigneshraman/linux/-/commits/wip/vignesh/lockdep-detection-v2. Which branch did you want to test: drm-misc-next or drm-tip?
14:05 sima: vignesh, probably enough fail that it doesn't matter :-/
14:06 vignesh: sima: I will check on drm-misc-next, thanks :)
14:20 zmike: kusma: re: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438 do you remember why this checks modules_changed ?
14:22 zmike: I guess this is for internal GS detection?
14:24 eric_engestrom: karolherbst: I assume you already know about https://github.com/Rust-GPU/rust-gpu, but pointing it out just in case :)
14:24 karolherbst: yeah, I'm aware
14:25 pac85: zmike: kinda reminds me of this https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32399
14:25 karolherbst: I don't think it's worth pulling huge libs like that in, just because I want to run some loops on the gpu copying memory around
14:25 zmike: pac85: 😬
14:26 eric_engestrom: karolherbst: ack
14:39 alyssa: karolherbst: so the easy thing to do is to use vtn_bindgen2, which will let you write CL libraries (not kernels) and then it exposes nir_builder bindings for them
14:39 alyssa: which is all upstream now
14:40 alyssa: you still are responsible for wrapping it up in a nir_builder_init_simple_shader and such but it at least lets you express the logic in CL C
14:40 alyssa: the fancy stuff isn't really ready for common code to use yet (i'm working on that) and definitely not for Rust code
16:42 karolherbst: alyssa: mhh yeah.. might be a bit toooo much for what I need, which is simply to write some kernels to run against two memory objects + offset + range inputs, so maybe I just write kernels by hand and embed the spir-v then
16:55 alyssa: karolherbst: use vtn_bindgen2
16:55 alyssa: it is what you want lol
16:55 karolherbst: maybe
16:55 alyssa: see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33362
16:55 alyssa: it is like 0 integration
16:56 alyssa: you just copypaste a few lines of meson and go
16:56 karolherbst: but for that I have to use the nir_builder, no?
16:56 alyssa: yes, and..?
16:57 karolherbst: I don't use the nir_builder
16:57 alyssa: I guess you already have code to ingest spir-v's. lol. fair
16:57 karolherbst: yeah...
16:57 alyssa: at least use mesa_clc then
16:57 karolherbst: that's kinda the thing :D
16:57 alyssa: no binary blobs in tree please
16:57 karolherbst: yeah.. that was my original plan
16:57 alyssa: cool
16:57 alyssa: :P
16:58 karolherbst: I just wonder if I want to do raw CL calls (so basically creating OpenCL meta) or just my internal core API and create stuff directly...
16:58 karolherbst: well.. I should prototype it
17:28 alyssa: karolherbst: I've contemplated doing a generic Gallium meta backend for my driver CL stuff
17:28 alyssa: depending on how much of a hurry you're in
17:29 karolherbst: mhhhhhhhh
17:29 alyssa: (I've started designing this stuff, will probably start typing next week at this rate, still arguing with myself over lifetime issues & locking & such)
17:29 karolherbst: 90% of the time I spend on this will be on working on my own code anyway
17:30 karolherbst: I really only want to accelerate some memory copies 🙃
17:30 karolherbst: mostly for copies between textures and buffers, and for strided buffer copies
17:30 alyssa: karolherbst: you can also just make this the gallium driver's problem
17:30 karolherbst: feels like something hardware can do, apparently gallium doesn't have interfaces for it
17:30 alyssa: because most gallium drivers can do something better than you can
17:30 karolherbst: yeah....
17:31 karolherbst: so that's the alternative approach
17:31 karolherbst: but that means making a mess out of resource_copy_region
17:31 alyssa: any vulkan-capable hw has impls of all this in the vk driver
17:31 alyssa: it's just not plumbed into gl because nothing gallium has cared yet
17:31 karolherbst: also strided buffer copies?
17:31 karolherbst: like..
17:32 karolherbst: you have two buffers, but for some reason you hallucinate them being images, but not real images, just buffers, but you copy lines with strides around
17:32 karolherbst: (clEnqueueCopyBufferRect)
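For readers unfamiliar with the call, here is a minimal CPU reference sketch of the strided copy karolherbst describes. It follows the addressing rule from the OpenCL specification (offset = origin[2] * slice pitch + origin[1] * row pitch + origin[0], with a pitch of 0 meaning tightly packed). The function name `copy_buffer_rect` and its argument order are hypothetical and only for illustration; this is not rusticl, Gallium or OpenCL code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU reference helper; not an OpenCL, Gallium or rusticl API.
 * origin[0] and region[0] are in bytes, [1] in rows, [2] in slices. */
static void copy_buffer_rect(unsigned char *dst, const unsigned char *src,
                             const size_t dst_origin[3],
                             const size_t src_origin[3],
                             const size_t region[3],
                             size_t dst_row_pitch, size_t dst_slice_pitch,
                             size_t src_row_pitch, size_t src_slice_pitch)
{
    /* A pitch of 0 means "tightly packed", as in the OpenCL call. */
    if (src_row_pitch == 0)
        src_row_pitch = region[0];
    if (dst_row_pitch == 0)
        dst_row_pitch = region[0];
    if (src_slice_pitch == 0)
        src_slice_pitch = region[1] * src_row_pitch;
    if (dst_slice_pitch == 0)
        dst_slice_pitch = region[1] * dst_row_pitch;

    /* A row has to fit inside its pitch, otherwise rows would overlap. */
    assert(region[0] <= src_row_pitch && region[0] <= dst_row_pitch);

    for (size_t z = 0; z < region[2]; z++) {
        for (size_t y = 0; y < region[1]; y++) {
            size_t s = (src_origin[2] + z) * src_slice_pitch +
                       (src_origin[1] + y) * src_row_pitch + src_origin[0];
            size_t d = (dst_origin[2] + z) * dst_slice_pitch +
                       (dst_origin[1] + y) * dst_row_pitch + dst_origin[0];
            /* Only region[0] bytes per row move; the gaps stay untouched. */
            memcpy(dst + d, src + s, region[0]);
        }
    }
}
```

The two buffers are never images: the pitches only describe how the caller wants the bytes interpreted, which is why source and destination can use different strides.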
17:33 alyssa: uhhh ok that one is pretty wacky
17:33 karolherbst: which I also need for things like image from buffer stuff.... I was wondering if I just write it all with a kernel and then I think about using optimized driver paths
17:33 karolherbst: so I work on the fallback first, so it works everywhere
17:33 karolherbst: because atm I'm doing copies on the CPU for those things
17:34 karolherbst: and then if I care enough, I make gallium and drivers be more competent
17:34 zmike: I think at this point I've been saying for literally years to just ram it through resource_copy_region
17:34 zmike: some drivers already handle it
17:34 karolherbst: yeah...
17:34 karolherbst: but I want to nuke my CPU path
17:35 karolherbst: so I will probably do the fallback first
17:35 karolherbst: and then use resource_copy_region
17:35 karolherbst: atm the fact that I stall everything is worse than using a kernel instead of optimized driver paths
19:52 airlied: karolherbst: seems pointless to do fallback
19:52 karolherbst: airlied: I have a lot of other things drivers won't be able to provide accelerations for
19:52 airlied: since most gpus will want to use copy engines anyways
19:53 karolherbst: I'll need that infra anyway
19:53 karolherbst: I'll have to do buffer clears pretending the buffer is a 2D image, meaning I'm not allowed to touch gaps
19:53 karolherbst: it's annoying
19:53 karolherbst: there are a couple of those around
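To illustrate the constraint karolherbst mentions, here is a minimal CPU sketch of a buffer fill that treats the buffer as a 2D image and leaves the padding between rows alone. The helper name `fill_buffer_rect` and its parameters are hypothetical, and it assumes the row width is a multiple of the pattern size; a GPU kernel would do the same per work-item instead of in a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only. Fills `rows` rows of
 * `row_bytes` bytes each, starting at `offset`, with rows `row_pitch`
 * bytes apart. Bytes between row_bytes and row_pitch are never written. */
static void fill_buffer_rect(unsigned char *buf, size_t offset,
                             size_t row_pitch, size_t row_bytes, size_t rows,
                             const void *pattern, size_t pattern_size)
{
    /* Assumption: a row holds a whole number of pattern repetitions. */
    assert(pattern_size > 0 && row_bytes % pattern_size == 0);
    assert(row_bytes <= row_pitch);

    for (size_t y = 0; y < rows; y++) {
        unsigned char *row = buf + offset + y * row_pitch;
        for (size_t x = 0; x < row_bytes; x += pattern_size)
            memcpy(row + x, pattern, pattern_size);
    }
}
```

A single linear fill over the whole range would be simpler, but it would also overwrite the gap bytes, which is exactly what is not allowed here.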
19:54 karolherbst: plain image <-> buffer things should hit hardware paths, yes, but that's just one of the issues
19:56 airlied: anything vulkan exposes should hit hw paths, except in rare circumstances
19:57 karolherbst: airlied: what vulkan API should I use for clEnqueueCopyBufferRect then?
19:58 karolherbst: though I think mike knows a dirty hack for that 🙃
19:59 zmike: how dare you
19:59 zmike: such accusations will not stand
19:59 karolherbst: :D
19:59 karolherbst: though we both know which one I'm talking about
20:00 karolherbst: though there are still other APIs which are even more of an issue
20:31 airlied: unrolled copy buffers :-P
20:31 jenatali: That's what I've got in CLOn12 right now
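As a rough sketch of what "unrolled copy buffers" could look like: the strided rectangle is expanded into one plain linear copy per row, each described by a (source offset, destination offset, size) triple like the ones ordinary buffer-copy commands take. The `linear_copy` struct and `unroll_rect_copy` function are hypothetical names for illustration and are not taken from CLOn12, rusticl or any driver.

```c
#include <assert.h>
#include <stddef.h>

/* One linear copy: source offset, destination offset and size in bytes.
 * Hypothetical type for illustration. */
struct linear_copy {
    size_t src_offset;
    size_t dst_offset;
    size_t size;
};

/* Unroll a 2D strided copy into linear copies, one per row.
 * Writes at most max_out entries and returns how many are needed. */
static size_t unroll_rect_copy(size_t src_offset, size_t src_row_pitch,
                               size_t dst_offset, size_t dst_row_pitch,
                               size_t row_bytes, size_t rows,
                               struct linear_copy *out, size_t max_out)
{
    assert(row_bytes <= src_row_pitch && row_bytes <= dst_row_pitch);

    /* No gaps on either side: the whole rectangle is one contiguous copy. */
    if (rows > 0 && src_row_pitch == row_bytes && dst_row_pitch == row_bytes) {
        if (max_out > 0) {
            out[0].src_offset = src_offset;
            out[0].dst_offset = dst_offset;
            out[0].size = row_bytes * rows;
        }
        return 1;
    }

    for (size_t y = 0; y < rows; y++) {
        if (y < max_out) {
            out[y].src_offset = src_offset + y * src_row_pitch;
            out[y].dst_offset = dst_offset + y * dst_row_pitch;
            out[y].size = row_bytes;
        }
    }
    return rows;
}
```

The number of copies grows with the number of rows, so whether this beats a single kernel dispatch depends on the hardware and the region size.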
20:31 * airlied won't tell how amd transfer queues do image copies
20:34 karolherbst: I doubt that's any faster...
20:34 karolherbst: but yeah, that's what I'm doing atm
20:35 karolherbst: uhm...
20:35 karolherbst: was thinking of fill buffer
20:35 karolherbst: same issue
20:35 karolherbst: but unrolled copy buffer is better than a cpu copy for real
20:38 karolherbst: no idea why I haven't thought of that...
20:38 karolherbst: prolly I was copying clover too aggressively there 🙃
20:42 jenatali: I've got a better solution I can/should do for D3D now too, if I ever get back to CL... the image<->buffer path can technically have buffers on both sides of it which are strided like images
20:42 jenatali: But originally the strides had to be 256-byte aligned. Now that's relaxed though
20:42 karolherbst: mhhh yeah...
20:43 karolherbst: ohh, I see
20:43 karolherbst: yeah, that could work for you
20:50 alyssa: https://gitlab.freedesktop.org/asahi/mesa/-/commit/b136c59b239b53c7bdabcf3e6f425d1f5498ec24
20:50 alyssa: karolherbst: jenatali :P
20:50 alyssa: look ma, no vk_meta
20:51 jenatali: :P
21:00 alyssa: i need to give my image copies the same treatment
21:00 alyssa: though that requires plumbing a bunch of weird stuff into CL universe
21:00 alyssa: and IDK if I even want to precompile those kernels since i'm keying to pipe_formats
21:19 karolherbst: alyssa: what weird stuff?
21:19 karolherbst: but yeah.. the format stuff is a bit of a problem