01:07gfxstrand: dcbaker: woo
05:19mareko: tarceri: any comment on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392/diffs?commit_id=87ec7f1091c520cf7246a093d142ae25c7fc86c5 ? I can drop it if people prefer algebraic to be more aggressive
05:29Company: oh sweet
05:29Company: DRI is smarter than the drivers apparently
05:31Company: eglQueryDmaBufFormats() goes via dri2_yuv_dma_buf_supported() which assembles YUV formats from other formats but eglCreateImage does not, so I get claims of supported formats when the formats in fact aren't supported
07:56daniels: snark-and-quit is definitely an antipattern, yeah
08:11MrCooper: daniels: if you mean Company, he said he's only interested in real-time conversations, otherwise we should just ignore what he wrote
08:11MrCooper: if only the latter was that simple
08:21tnt: well he quit 2h later, it's not like he joins, rants, quits instantly ... not everyone has a bouncer ...
08:21sima: mdnavare, we could probably put that WARN_ON into drm_atomic_get_crtc_state
08:22sima: just need to set a flag in drm_atomic_state before we enter the driver's ->atomic_check code
08:24sima: mdnavare, https://paste.debian.net/hidden/5ae7baf2/ this should give you a nice backtrace in exactly the offending code, completely implemented in generic code
08:24sima: vsyrjala_, ^^ thoughts?
11:16karolherbst: gfxstrand: any specific person you want to see review https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22580 ? I really want to land this asap, because it does fix real issues and at least CI seems to be clean.
11:59mripard: sima: ack for https://lore.kernel.org/dri-devel/20230919121549.722965-1-mripard@kernel.org/ ?
12:09sima: mripard, uh I meant to delete the exploratory ones, not all
12:10sima: from a very quick look drm_test_mm_init|debug|once look reasonable at least
12:10sima: and drm_test_buddy_alloc_limit too
12:11sima: buddy_alloc_pessimistic/optimistic might be fine too
12:14sima: essentially keep the ones that are efficient and check some corner case like whether we can allocate to the last page forward/backwards (since those go down by page order they should be log(n) of the allocator size, which should be ok)
12:15sima: but maybe I misread some of the tests
12:15sima: mripard, it's also easier to encourage people to fill the test holes if there's a little bit there still :-)
12:22mripard: Ack :)
12:23mripard: I'll send a v2, thanks
12:48sima: mripard, I'd say in the end just measure the runtime and then sanity check the tests that are left, if they look like log(n) or less it should be fine even on more extreme cases
14:02pq: sima, I certainly don't disagree with getting something useful even if somewhat deficient, if the known desires have been evaluated and rejected. I'm only opposed to not even evaluating them, and even if they're rejected for now, reducing potential future burdens should be done if feasible.
14:16DavidHeidelberg: shadeslayer: which flight u talking from BCN to LCG? 20:20?
15:04shadeslayer: DavidHeidelberg: It's the one that leaves on Monday morning
15:04Company: pq, emersion, kusma: re my question from a few days ago about https://developer.arm.com/documentation/ka004859/latest/ - the answer is no, that code doesn't need to work.
15:05Company: Imported EGLImages have implementation-defined formats, and it is valid to, for example, consider them compressed (like Mesa does). And compressed formats are not color-renderable, so they cannot be attached to framebuffers
15:06shadeslayer: DavidHeidelberg: I land in LCG at 08:30 AM
15:06pq: Company, they can be attached, but the FBO won't be complete, and it needs to be complete for glReadPixels to work?
15:12Company: pq: yes
15:13pq: alright, good to have an answer.
15:13Company: so it's implementation-defined if the code will work
15:15MrCooper: offhand it seems odd that color-renderable would be required for glReadPixels, I guess it's plausible though
15:22Company: ndufresne: did your adventures with https://gitlab.freedesktop.org/mesa/mesa/-/issues/8112 lead anywhere?
15:23Company: oh, you made an MR - because I'm hitting that issue with Mesa git now
15:47ndufresne: Company: I made an MR, it got split in half, and it ended up in a blunt argument with the devs
15:48ndufresne: I'm not a mesa dev really, but I can recognize a mess when I see one, this driver is a real big mess
15:48Company: ndufresne: this whole code is broken on Intel and (my) Radeon
15:48ndufresne: but I guess if I'm not the only one pushing we can get this done
15:49Company: so that's a bigger mess than just your Radeon I guess
15:49Company: have you seen mutter's YUV support that robertmader[m] did?
15:50Company: because mclasen wants something similar and I don't want such code in GTK if I can avoid it
15:50ndufresne: Company: so the patch that got merged is this one, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20815
15:50ndufresne: It was merged since it fixed 2-3 piglit tests
15:52ndufresne: Company: but in my initial version, to actually fix the issue, I had this change, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20815/diffs?diff_id=2094392
15:53Company: yeah, you got further than I did trying to understand things
15:53ndufresne: The second half was rewritten into -> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20967
15:53ndufresne: apparently merged, didn't notice this one
15:54Company: right, so the remaining issue is that eglCreateImage() fails because the stride is wrong
15:54ndufresne: correct, it was halved, resulting in some sort of split view
15:54ndufresne: I would see myself twice
15:57ndufresne: Company: how do you currently reproduce this ?
15:58Company: with a custom GTK branch mclasen did
15:58ndufresne: cause in gst, the EGLImage import gets rejected apparently with my original pipeline ... not sure why
15:58Company: and https://gitlab.gnome.org/matthiasc/pipewire-media-stream/
15:58Company: yeah, same here
15:59ndufresne: it used to work with direct YUY2 import
16:02ndufresne: ah, the driver now only accepts UYVY
16:02ndufresne: my camera produces YUYV
16:04ndufresne: anyway, I'll have to debug this new DRM modifier stuff, cause it's rejected ....
16:05ndufresne: that being said, it avoids the bug very well
16:08Company: that's just a few swizzles away
16:10ndufresne: out of curiosity, what does this media-stream thingy do
16:11Company: it's a simple thing that uses pipewire to get dmabufs
16:11Company: and then uses the new GTK dmabuf support to display them
16:11Company: and GTK tries to use EGL to draw it
16:14mareko: zmike: VK_EXT_host_image_copy is pretty unlikely to happen on AMD
16:15zmike: shame
16:18alyssa: hardware hic-cup?
16:20alyssa: Could someone at Intel look at why https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625 causes some sample variable tests to fail (both iris and anv+angle, seemingly zink+anv is fine)
16:21alyssa: I don't see how it'd be possibly affected which makes me suspect an intel backend bug or something
16:21alyssa: only the opt algebraic changes could even possibly change things and, the patterns are correct and all non-intel hw is happy
16:22alyssa: Kayden: maybe ^
16:34zmike: mareko: on the other hand AMD isn't really my target for this
16:45Company: ndufresne: I think the problem here is that the code doesn't agree on whether the magic RGRB format is a 2px-block 32bit format or a 1px-block 16bit format
16:45Company: and half the code assumes the first while the other half assumes the 2nd
16:45Company: so you inevitably run into sanity checks that bail out somewhere
16:47Company: like, AMD allocates memory for a 1px-block 32bit format and then bails because the size of the dmabuf is only half of the allocated GPU buffer
16:49Company: no idea where intel gets confused
16:49Company: I might track that down next
16:49mareko: alyssa: it would be CPU intensive to do tiling coord->address conversions and mapping textures in VRAM directly would confuse memory management that could move them to RAM permanently
16:50Company: I wonder who originally wrote the code
16:51mareko: so the extra RAM->VRAM upload copy is actually good for you
16:53Kayden: alyssa: so dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.bit_count_per_sample.singlesample_rbo regresses due to the last patch, "Optimize LLVM booleans"
16:53Kayden: not sure why yet but that narrows it down a little
16:59alyssa: mareko: fair enough!
16:59alyssa: Kayden: smells like intel compiler bug
17:00Kayden: yeah, definitely
17:00Kayden: it looks like we propagated a negate into the first thing but not anywhere else
17:13ndufresne: Company: I'm starting to recall this, and yes, I dug into why we got BAD_ALLOC on image import: the reason was a bad expected stride/size, and that is the bit that never got merged; then the swizzling was wrong, which is the part that got merged as-is
17:13ndufresne: AMD requires the stride to be exactly ROUND_UP(width, 256) fyi, anything else gets discarded
17:15ndufresne: but someone reported that the old vaapisink (which is pure GL machinery inside mesa for AMD) got the swizzling wrong; I showed them what was wrong there, and they refused to correct it. Anyway, we are dropping vaapisink support in gst, so I don't have any interest in this blunt argument thingy
17:16Company: this is a whole-stack issue though
17:16ndufresne: wonder if meson devenv works these days with mesa ...
17:16Company: kinda like HDR
17:17Company: because we want GStreamer to produce dmabufs (either via pipewire/webcam or via hardware decoding), then move them through GTK and the Wayland compositor into the display hardware
17:17Company: while allowing both GTK and the compositor to unpack it if they need to
17:18Company: so putting hacks in one place won't work anyway
17:18ndufresne: ah, so you are looking at possibly having a passthrough inside some GTK code path ?
17:18Company: yes
17:18Company: that's the long-term goal
17:18ndufresne: that's a good idea, does it mean you end up with a sandwich of renders ?
17:19Company: that goal is somewhat independent of working YUV
17:19ndufresne: e.g. glrender | passthrough sub-surface | glrender ?
17:19Company: because Boxes wants it for virtio
17:19ndufresne: I see
17:20ndufresne: but swizzling issue is extremely common with RGB formats too ;-D
17:20Company: and yes, likely that's the best solution - have GTK create additional surfaces and distribute the rendering to them
17:20Company: but we can swizzle ourselves in RGB!
17:21ndufresne: well, when direct dmabuf import fails on gst side, all colors are correct :-D
17:21ndufresne: (shader side csc)
17:22ndufresne: but the direction I want to push, and you saw robertfoss work I guess, is that this GL glue in gst should not be needed
17:22ndufresne: whatever gst does, the compositor can do really
17:22ndufresne: and widget libraries can take better decisions too
17:22Company: yeah
17:23Company: so I think https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/mesa/main/formats.csv?plain=1#L96-99 is one thing that's wrong
17:23Company: that's block size 2 and 16bit - either you want block size 1 or 32bit
17:23Company: but no clue what the right choice is
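The block-size mismatch described above can be checked with a little arithmetic. This is a minimal sketch, not Mesa code: `row_stride_bytes` is a hypothetical helper, and it assumes a packed 4:2:2 row that stores two pixels in four bytes, so a 640-pixel row should take 1280 bytes.

```python
def row_stride_bytes(width_px, block_width_px, bits_per_block):
    """Bytes per row for a format built from fixed-size horizontal blocks."""
    blocks = -(-width_px // block_width_px)  # ceiling division
    return blocks * bits_per_block // 8


WIDTH = 640
# Assumption: packed 4:2:2 data, two pixels in four bytes.
EXPECTED = WIDTH * 2

# (block width in pixels, bits per block) for each candidate description.
for block_width, bits in [(2, 32), (1, 16), (2, 16), (1, 32)]:
    stride = row_stride_bytes(WIDTH, block_width, bits)
    print(block_width, bits, stride, stride == EXPECTED)
```

Under these assumptions only the (2, 32) and (1, 16) descriptions give 1280 bytes. A block width of 2 with 16 bits gives half of that, and a block width of 1 with 32 bits gives double, which would match a GPU allocation twice the size of the dmabuf.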
17:23ndufresne: wow, ok, that seems very familiar, they can't all have the same swizzling lol
17:24ndufresne: yeah, that's the scary part, when I wrote the fix for YUY2, it was written in trial and error, and I pushed the MR as a draft, cause I had no rationale
17:24ndufresne: but since it fixed all the piglit tests, they just flipped it over and pushed it
17:25Company: the swizzling part happens elsewhere I think
17:25ndufresne: Company: what I did is check dri2, check which one is picked by default, and gave that one xyz1, and then adjusted all the others accordingly
17:25Company: when it maps YUYV => RGRB or wherever
17:26ndufresne: this code is only used for internal shader based swizzling, panfrost notably does not use it, it has another swizzler which has an incompatible configuration interface
17:26ndufresne: at least this is my understanding
17:26Company: I should see what my rpi does
17:28tleydxdy: is it proper to export a vulkan binary semaphore and import it as a timeline semaphore? either via opaquefd or syncobj
17:30Company: ndufresne: but on the GTK side, the way forward is adding dmabuf import for well-working formats (read: RGB), then switch boxes to that and everything that uses GdkGLTexture now, and then look into pushing stuff straight to Wayland without going through GL's renderer
17:30Company: ndufresne: because I want to get away from GL textures and to dmabufs as a sharing mechanism - it avoids the GLContext mess and it works with Vulkan
17:31Company: which means GtkGLArea would work with a Vulkan renderer
17:35Company: and once YUV works, we can hook it in
17:35ndufresne: at least I'm happy to say it works on panfrost here ;-D
17:35Company: or if it does already on panfrost or whatever, we can hook it in now
17:36ndufresne: (testing on Chromebook Spin 513, MTK8195 running debian) GST_GL_API=gles2 GST_DEBUG="glcontext:4,gleglimage:7" gst-launch-1.0 v4l2src device=/dev/video4 ! glimagesink
17:36Company: v4l2src doesn't do dmabufs - or did you add that recently?
17:37ndufresne: yes, its a dmabuf produces (linear only)
17:37ndufresne: * producer, just like pipewire v4l2 support
17:37ndufresne: (or libcamera, but you get the point)
17:38Company: neither v4l2src nor pipewiresrc produced dmabufs for me, when I recently tried (but on stock F38)
17:38ndufresne: the difference with pipewire, is that you don't have to use the buffer before pipewire have run over the array of FDs (something to be fixed in pw)
17:38ndufresne: I've turned v4l2src in dmabuf only about 7 years ago I think
17:38ndufresne: but it's not using the dmabuf caps feature
17:39Company: glimagesink used glTexImage2D() to upload
17:39ndufresne: when dmabuf are linear, you can pass that as normal memory, we have wrapper for mmap and dmabuf sync around these fds
17:39Company: and not eglCreateImage()
17:39ndufresne: most of the time, this is because the GPU can't import it
17:39ndufresne: modern GPU are pretty limited in regard to foreign memory
17:39Company: well, this one claims it can and then breaks - but GStreamer didn't
17:40ndufresne: which platform was that ?
17:40Company: so I set some breakpoints and they were never hit
17:40Company: Fedora 38 - my AMD desktop
17:40ndufresne: e.g. on panfrost, this import thing only works with GLES2, since there is no external OES texture on big gl
17:41Company: I tried GLES
17:41ndufresne: what was the resolution ? was the width a multiple of 256 ?
17:41ndufresne: cause AMD you know ...
17:41Company: probably not
17:41ndufresne: panfrost needs 64 (or a multiple), AMD 256 (exactly)
17:42ndufresne: except that panfrost for YUV supports 2 I think
17:42ndufresne: which is because there is a specialised sampler
17:42ndufresne: (in short, we can zero-copy with YUV direct import, but can't always fall back on R8/RG88 + shader)
17:43ndufresne: sharing dmabuf is a complex world, you always need a good fallback
17:43Company: I had naively assumed that every GPU has specialized samplers
17:43ndufresne: well, maybe they do, but the drivers which have that implemented are rare
17:43Company: because they all seemed to support VkYCbCrConversion
17:44ndufresne: I think it's more commonly implemented for VK drivers, but I haven't been testing this much, gst vk doesn't have dmabuf support at all
17:44ndufresne: RPi is a good platform to check
17:44Company: I first need to fix my Vulkan code to not use features that the rpi driver doesn't support ;)
17:44MrCooper: ndufresne: yes, meson devenv works with Mesa
17:45ndufresne: MrCooper: thanks, that's good news !
17:45ndufresne: now, do I set gst as subproject of mesa, or mesa as subproject of gst ....
17:45MrCooper: ndufresne: surely AMD GPUs support multiples of 256 as well
17:45ndufresne: (hard choice eheh)
17:45ndufresne: MrCooper: you mean the HW ? cause in the YUV import path, it was pretty much validating with ==
17:46ndufresne: assuming in fact that you know your padding, and deal with the cropping
17:46MrCooper: sounds like a bug then
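The difference between the two stride checks discussed above can be sketched as follows. This is an illustration under assumptions, not the driver's actual validation: the function names are hypothetical, and the row size is taken in bytes, which the log does not specify.

```python
def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def stride_ok_exact(stride, row_bytes, alignment=256):
    """Strict check: only the smallest aligned stride is accepted."""
    return stride == round_up(row_bytes, alignment)


def stride_ok_multiple(stride, row_bytes, alignment=256):
    """Relaxed check: any aligned stride that can hold a full row is accepted."""
    return stride >= row_bytes and stride % alignment == 0


# A producer that pads rows more than strictly needed passes only the relaxed check.
print(stride_ok_exact(1792, 1300), stride_ok_multiple(1792, 1300))
```

With the strict form, a buffer whose producer chose extra padding is rejected even though its stride is still a multiple of the alignment.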
17:46ndufresne: (fun fact, the padding when its stride driven is actually lost in the pipeline, very commonly at least)
17:47ndufresne: a side effect of how we implemented things 20 years ago lol
17:47Company: considering it's failing at x16 vs x32 and blockwidth 1 vs 2, I'm not surprised stride handling is somewhat buggy
17:48Company: what confuses me is that both AMD and Intel are broken in roughly the same way, even though they're different codebases
17:49ndufresne: remind me, what do you test on Intel that breaks ? maybe I can try and repro ? I don't recall having YUV swizzling issues with Intel
17:49Company: eglCreateImage() failing
17:49ndufresne: ah, you mean it gets rejected
17:50ndufresne: that requires breaking into mesa, cause there is no indication which params got rejected
17:50ndufresne: I think it does matter though that you should support RG88 fallback
17:51ndufresne: (with your own shaders)
17:52Company: I'm not sure I want to do that
17:52Company: because then I need to encode dmabuf implementation details into my code
17:53ndufresne: how do you fall back atm ?
17:54ndufresne: you can't rely on dmabuf import to always work on all PC GTK will be running on
17:54ndufresne: even virtio have copy path
17:55ndufresne: the alternative is to never let anyone other than the GPU driver allocate the memory, but then it may fail import on the other side
17:55Company: by the time GTK imports a dmabuf, it guarantees it can download it
17:56Company: if we can't make that happen, we don't import the dmabuf
17:56Company: what we need to make that happen: no idea
17:56ndufresne: what does this mean ?
17:57ndufresne: you mean that it's guaranteed to be able to use texture2D fallback ?
17:57ndufresne: so you basically offer direct import or texture2D?
17:57ndufresne: (nothing in between ?)
17:57Company: it means it is guaranteed to be able to give you a GBytes * with pixels that look like the dmabuf
17:58ndufresne: hmm, so DRM modifiers is out of the way here ?
17:58Company: how that has to happen is still up for debate
17:58Company: nah, if we can have a path via GL that works, that is fine
17:58Company: and if that involves copying the dmabuf into a texture before downloading, that is also fine
17:59ndufresne: well, then why not "just" implement pixel upload ?
17:59Company: what do you mean "just"?
18:00Company: I want to ideally support a fast-path where host memory never touches the pixels
18:01Company: but I also want to be able to always support a path where the pixels end up in host memory
18:01ndufresne: well, the reality is that to cover a wide range of drivers for zero-copy, you need a set of methods, similar to what compositors do: direct dmabuf import (let the GL/VK stack convert), indirect dmabuf import (use well-supported R/RG/ARGB formats and shade it), and finally uploading the pixels (still needs the same shaders here)
18:01Company: oh you mean upload support for YUV pixels?
18:01ndufresne: in the case of wayland, I'd add: copy the pixels to a dmabuf, cause that can dramatically improve performance over copying into a shm
18:02ndufresne: well, even for RGBA, you should enumerate what the DMABuf import backend supports, and use a substitution with a swizzling shader to ensure wider support
18:03ndufresne: (though it's quite rare these days that the direct path does not already support all swizzling)
18:04Company: I don't think it's my job to work around broken drivers - fixing the drivers also helps more than just GTK
18:05Company: and I absolutely do not want to encode implementation details about dmabuf formats into GTK - unless they are well-defined somewhere
18:06Company: because that's a game of bug whack-a-mole that I'm not interested in
18:11ndufresne: drivers don't have to support direct YUV import, this isn't a broken HW
18:12ndufresne: *driver
18:12Company: sure
18:13ndufresne: and YUV through pixel upload is not something I've seen working anywhere
18:13Company: but if drivers don't support it, there's not much benefit in us supporting it
18:13ndufresne: so better just give up on yuv ;-P
18:13ndufresne: well, we clearly added YUV shading in compositor and gst for a reason ;-P
18:13Company: which one though?
18:14ndufresne: GPU csc is a clear winner against CPU
18:14ndufresne: I personally think we should push into compositors as much as possible, but I'm praying for my personal goals here
18:14Company: well sure, but you can just use an RGB dmabuf and attach some colorspace info to get conversion going
18:15ndufresne: it won't help you toolkit library if you need to blur the video, or other funky transforms
18:15ndufresne: *your
18:16ndufresne: e.g., it won't give you a texture in gtk to play with
18:16ndufresne: (but apps can do that for you)
18:16Company: why not?
18:17Company: I get a dmabuf and a colorspace and then the GTK color conversion routines do the YUV=>RGB conversion
18:19ndufresne: Great, then why not implement import with substitute like I suggested ?
18:19ndufresne: I smell you didn't really get the ladder around dmabuf importation ...
18:20Company: which spec lays out dmabuf substitutes?
18:21Company: as an application I am explicitly meant to treat dmabufs like a black box and just pass everything through
18:21Company: so if I start substituting random stuff, that's wrong
18:22ndufresne: the contradiction here is that you want to do color conversion routines
18:22ndufresne: if GTK knows about DRM fourcc XYVx, and the modifier is linear, you don't have to treat that as a black box
18:23Company: sure, for linear formats there's a clear definition
18:23ndufresne: e.g. I420 has 3 planes, which can be imported as 3 textures using R8 format
18:23ndufresne: you cannot do that for non-linear formats, except on Intel (but then you started coding HW specificities, which I decided not to in gst fyi)
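The three-plane substitution mentioned above depends on knowing where each plane starts in a linear buffer. A minimal sketch, assuming a tightly packed I420 buffer with no row padding and planes stored back to back in Y, U, V order; `i420_planes` is a hypothetical helper, and real buffers report their own offsets and strides:

```python
def i420_planes(width, height):
    """Return (name, offset, stride, plane_width, plane_height) for each plane.

    Assumes tightly packed linear data: one byte per sample and chroma
    planes subsampled by two in both directions.
    """
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    y_size = width * height
    c_size = chroma_w * chroma_h
    return [
        ("Y", 0, width, width, height),
        ("U", y_size, chroma_w, chroma_w, chroma_h),
        ("V", y_size + c_size, chroma_w, chroma_w, chroma_h),
    ]


for plane in i420_planes(640, 480):
    print(plane)
```

Each tuple describes a single-channel 8-bit image, which is what would be bound as an R8 texture before a shader recombines the planes into RGB.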
18:23Company: though that is for map() afaik, not for eglCreateImage()
18:24ndufresne: map or eglCreateImage have the same semantics
18:24ndufresne: just that one is a copy to GPU mem, where the other uses the memory in-place
18:24Company: do they?
18:24ndufresne: yes, as long as you have non "external only" format, which you can query
18:25ndufresne: as soon as you have an eglImage, you can shade it as if it was a pixel upload
18:26ndufresne: you may want to read weston renderer, I believe it's in pretty good and clean shape, it implements the full ladder here
18:27ndufresne: e.g., first you try and see if GL can make your buffer look like RGBA, second, try and zero-copy import as substitute before opting for slow pixel upload, the last two will use the same shaders and same substitutes
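The ladder described above can be sketched as an ordered list of strategies that are tried until one accepts the buffer. This is only an outline under assumptions: the three strategy functions are hypothetical stand-ins with made-up acceptance rules, not the weston, GStreamer or GTK implementations.

```python
class ImportRejected(Exception):
    """Raised by a strategy that cannot handle the buffer."""


def run_ladder(buffer, strategies):
    """Try each (name, strategy) in order and return the first that succeeds."""
    for name, attempt in strategies:
        try:
            return name, attempt(buffer)
        except ImportRejected:
            continue
    raise ImportRejected("no strategy accepted the buffer")


# Hypothetical stand-ins for the three rungs of the ladder.
def direct_import(buffer):
    if buffer["fourcc"] not in {"XR24", "AR24"}:
        raise ImportRejected("driver cannot sample this format directly")
    return "texture sampled as RGBA"


def substitute_import(buffer):
    if buffer["modifier"] != 0:
        raise ImportRejected("plane substitution needs a linear layout")
    return "per-plane R8/RG88 textures plus conversion shader"


def pixel_upload(buffer):
    return "CPU copy plus the same conversion shader"


LADDER = [
    ("direct", direct_import),
    ("substitute", substitute_import),
    ("upload", pixel_upload),
]

print(run_ladder({"fourcc": "NV12", "modifier": 0}, LADDER))
```

Keeping the slow upload as the last rung means a caller always gets pixels on screen, while the faster rungs stay optional per driver.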
18:28Company: yeah, something like that could be feasible
18:28Company: I'm still not a fan of it because I need to encode implementation details of drm formats for it
18:29Company: and it's not (yet) important, because too much other stuff is missing in GTK
18:30Company: and we're better off letting GStreamer do the relevant conversions to RGB
18:35ndufresne: Company: so RPi, tested gst-launch-1.0 libcamerasrc ! video/x-raw,format=NV12 ! glimagesink, it happily picked NV12 directly, but then VC4 driver crashed on the first render, ;-P
18:37ndufresne: (could be why they say wayland is experimental)
18:43tnt: On intel, modifier == 0 would be ... linear ?
18:43Company: bah, my webcam is USB-C and I have no converters
18:43tnt: Although I thought there would be at least the vendor id in the MSB.
18:44Company: modifier == 0 is always linear, no matter where
18:48tnt: Company: tx.
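Both points above hold together: the vendor id does sit in the top byte of the 64-bit modifier, and the linear modifier is vendor "none" with value 0, so it is the same on every driver. A minimal sketch of the bit layout used by `fourcc_mod_code()` in `drm_fourcc.h`; `split_modifier` is a hypothetical helper, and the vendor table is deliberately partial.

```python
DRM_FORMAT_MOD_LINEAR = 0

# Partial table of vendor ids from drm_fourcc.h.
VENDORS = {0x00: "NONE", 0x01: "INTEL", 0x02: "AMD", 0x03: "NVIDIA"}


def split_modifier(modifier):
    """Split a 64-bit DRM format modifier into (vendor id, vendor value)."""
    vendor = (modifier >> 56) & 0xFF
    value = modifier & 0x00FF_FFFF_FFFF_FFFF
    return vendor, value


def is_linear(modifier):
    return modifier == DRM_FORMAT_MOD_LINEAR


for modifier in (0x0, 0x0100_0000_0000_0001):
    vendor, value = split_modifier(modifier)
    print(hex(modifier), VENDORS.get(vendor, "unknown"), value, is_linear(modifier))
```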
19:00tnt: I'm playing with the dmabuf export of GL textures ATM. Could running under Xorg and under Xwayland lead to different modifiers ? I'm a bit surprised I'm getting a linear modifier for a gl texture, I would have expected some tiled format (this is on an a750 on wayland). And under Xorg on a 12thgen iGPU, I do get a tiled format.
19:12dj-death: tnt: not so surprising to me
19:13dj-death: tnt: if we don't know who is going to be the receiver of that dmabuf, probably the only common thing is linear
19:13tnt: dj-death: I'm just wondering why my iGPU and the A750 behave differently in that respect.
19:14tnt: The use case is for the cl-gl interop with the 'intel compute runtime' fwiw.
19:35dj-death: tnt: it's definitely different drivers
19:36dj-death: tnt: not even sure the compute runtime knows about modifiers
19:38ndufresne: tnt: modifier 0 is linear, not just intel
19:45tnt: dj-death: it doesn't know about modifiers ... ATM I pretty much just pray that whatever the compute stack decides the format is matches what mesa decided the format is.
19:46tnt: Which is obviously not great ... but I have no clue how the compute runtime memory stuff works and have not been able to find anyone that does ...
20:55daniels: tnt: don’t do export from GL - allocate externally and import
20:56tnt: daniels: It's not like I have a choice ...
20:57tnt: The app uses the CL/GL sharing extension ... I do what I can to support it.
20:57tnt: but I didn't invent it.
21:18Company: daniels: what's the best place to allocate externally?
21:18Company: or is the answer per-task?
22:33daniels: Company: gbm
22:35Company: daniels: that looks neat - would you recommend it for testsuite code (to test dmabuf import)?
22:41anholt: Company: that's what we do in piglit
22:41anholt: best way to go that I know of, short of full integration testing on a target system importing from camera/video decode blocks.
22:41Company: it sounds easy enough to use - assuming I don't need exotic permissions
22:42Company: does that work inside gitlab CI runners?
22:42Company: I suppose not because they have no device available?
22:43alatiera: it depends
22:43alatiera: there are a couple different systems, normal container builds, qemu vms, and also a hardware lab that runs tests
22:44Company: so test_skip() if it isn't available but include it
22:45Company: and then talk to the right people to make it work
22:54Company: anholt, daniels: so I take it basically https://gitlab.freedesktop.org/mesa/piglit/-/blob/main/tests/util/piglit-framework-gl/piglit_drm_dma_buf.c#L333-338 and then pass that fd to gbm_create_device() should be good enough to get going
22:58daniels: yeah that looks about right
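The skip-if-unavailable idea above can be sketched as a small device probe. This is a hedged outline, not the piglit or GTK test code: it assumes DRM render nodes show up as `/dev/dri/renderD*`, and `find_render_node` and `run_dmabuf_test` are hypothetical names. In C, the descriptor opened from the chosen path is what would be handed to `gbm_create_device()`.

```python
import glob
import os


def find_render_node(pattern="/dev/dri/renderD*"):
    """Return the first render node we can open read/write, or None."""
    for path in sorted(glob.glob(pattern)):
        if os.access(path, os.R_OK | os.W_OK):
            return path
    return None


def run_dmabuf_test(test_body):
    """Run test_body(node) when a device exists, otherwise report a skip."""
    node = find_render_node()
    if node is None:
        return "skip"
    test_body(node)
    return "pass"


print(run_dmabuf_test(lambda node: None))
```

On a container-based CI runner with no device exposed this reports a skip, and on a runner with hardware the same test runs unchanged.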