IRC Logs of #dri-devel on irc.freenode.net for 2023-10-16

00:13 Company: ndufresne: is there an issue for the multiple-of-256 radeon issue?
00:13 Company: because it affects Vulkan, too
00:14 ndufresne: But it's not a bug, this is how the HW works, or perhaps I missed something?
00:15 Company: but I don't know that
00:16 Company: all I know is that the driver claims it can do NV12
00:16 Company: and when I hand it NV12 it says "nope, can't do that"
00:17 ndufresne: It's a very annoying design indeed, on ChromeOS they created minigbm, that hardcodes some HW knowledge and lets the GPU driver allocate
00:17 Company: there could be a way for negotiating a proper stride, but I don't see one
00:18 ndufresne: Currently DRM does not give you a stride / width / height unless you ask the driver to allocate something
00:19 ndufresne: It's probably the only thing for which V4L2 does a better job :-)
00:19 Company: the way the eglCreateImage() API reads (and the Vulkan API, but the Vulkan API doesn't even define how DRM formats map to VkFormats), the driver is the one to copy the memory if it can't deal with the stride
00:21 ndufresne: In VK though, there are ways a driver can report its memory constraints (I'm just not very knowledgeable), Lynne might know more
00:22 Company: the question I'm trying to answer is "can this dmabuf be imported?"
00:23 Company: and the answer seems to be "who knows?"
00:23 ndufresne: On the EGL side, today's answer is, yes if the EGLImage import succeeds
00:23 Company: right
00:24 Company: but that doesn't help me when creating the dmabuf
00:24 Company: the question is what do I put in my GStreamer element's caps?
00:37 kode54: Company: polaris?
00:38 kode54: I found a specific commit where nv12 broke for a 720 wide video, and only on polaris
00:38 ndufresne: Company: on GST, all the formats you can support, unfortunately, you can't guarantee zero copy on any of them
00:39 Company: ndufresne: if my sink says it can do NV12 and then eglCreateImage() doesn't work in GTK, GStreamer can't help me
00:39 ndufresne: That's why we like modifiers, cause buffers with modifiers rarely fail to import
00:40 ndufresne: Company: you have to fall back to a different way to send pixels to GL
00:40 ndufresne: Well, you can, nobody is forced, you would also document that you are only offering zero-copy
00:41 Company: ndufresne: but I don't have a different way - and even if I made one, that still wouldn't work with modifiers
00:42 kode54: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9949
00:42 kode54: That seems related
00:42 kode54: Doesn’t affect RDNA2, but maybe because the block of code being hit there is gfx10+?
00:43 kode54: Possibly some code path has no way of telling non-modifier setups that the buffer has a width constraint?
00:44 Company: I think the multiple-of-256 is a general AMD thing
00:44 kode54: I just know it worked in 23.1.2 and main before that indicated commit
00:52 mclasen: Company: look at the vendor prefix of the modifier, and if it's amd, force the stride to be as needed?
00:52 Company: how do I force pipewire to use that stride?
01:08 mclasen: tell pipewire about the modifiers ?
01:10 kode54: mclasen: the card I found that bug on doesn't implement modifier support
01:46 columbarius: for cards without modifiers PipeWire supports linear and implicit (best effort)
02:07 kode54: the issue is occurring with mpv
02:07 kode54: and it's specific to a commit in Mesa
03:13 Company: columbarius: my card has modifiers, but it claims to also do 0
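
A rough sketch of the stride workaround discussed above, in Python for brevity. The vendor ID sits in the top 8 bits of a DRM format modifier, as laid out in drm_fourcc.h. The 256-byte alignment is only the figure quoted in this conversation, not a documented guarantee, and `padded_stride` and its `importer_is_amd` flag are hypothetical names. The flag exists because modifier 0 (linear) carries no vendor, which is exactly the case Company and kode54 describe.

```python
# Values mirror drm_fourcc.h.
DRM_FORMAT_MOD_VENDOR_AMD = 0x02
DRM_FORMAT_MOD_LINEAR = 0

# Assumption taken from the discussion above, not from a spec.
ASSUMED_AMD_STRIDE_ALIGN = 256


def modifier_vendor(modifier: int) -> int:
    """Return the vendor ID stored in the top 8 bits of a DRM modifier."""
    return (modifier >> 56) & 0xFF


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def padded_stride(width_px: int, bytes_per_px: int, modifier: int,
                  importer_is_amd: bool = False) -> int:
    """Hypothetical helper: pad the row stride when the buffer targets AMD.

    importer_is_amd covers the linear / no-modifier case, where the
    modifier itself says nothing about the importing device.
    """
    stride = width_px * bytes_per_px
    if importer_is_amd or modifier_vendor(modifier) == DRM_FORMAT_MOD_VENDOR_AMD:
        stride = align_up(stride, ASSUMED_AMD_STRIDE_ALIGN)
    return stride
```

For the 720-pixel-wide NV12 case mentioned earlier, the luma plane is one byte per pixel, so `padded_stride(720, 1, DRM_FORMAT_MOD_LINEAR, importer_is_amd=True)` gives 768 instead of 720. Whether the producer (PipeWire, V4L2, a decoder) can actually be made to allocate with that stride is the open question in the chat.
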
04:11 mareko: do I need to use phi for x in NIR if I have: if (cond) x = ...; other_stuff(); if (cond) use(x); ?
04:43 kode54: is there some environment variable for AMD to turn off framebuffer compression?
04:43 kode54: I found an artifact with Proton TkG running on an RDNA2 card, if I run Borderlands: GOTY Enhanced, and set the game to "Fullscreen" and set the resolution to something less than my native resolution
04:44 kode54: it went away when I switched the game to native resolution (which this card handles fine anyway)
04:45 cmarcelo: mareko: I think yes, the usage should be dominated by the definition and while we "know" cond is the same, I don't think at this level NIR can take that into consideration. you can have the phi = (x [from the if], undef [from not taking the if])
04:52 daniels: Company: eglCreateImage doesn’t copy
04:53 Company: doesn't it somehow have to get the buffer into vram?
04:56 cmarcelo: (and to be clear, I think the use() in the second cond would be use(ssa of the phi))
05:13 daniels: Company: no
05:16 Company: daniels: so who's gonna figure out then how to create a compatible buffer that radv is not failing to import?
05:17 daniels: well, if you want something that radv doesn’t fail on, you can always allocate via radv and upload to it
05:17 Company: sure
05:17 Company: but that assumes I have full control
05:17 daniels: if you want a bulletproof multi-device allocator for zerocopy buffers, we don’t have one of those
05:18 Company: but I'm just the toolkit here
05:18 daniels:shrugs
05:18 Company: and I want to offer an API to app developers that's unlikely to break if they go to v4l or pipewire, grab a dmabuf and hand it to me
05:18 daniels: the slow thing works everywhere and all the time, but is slow. the fast thing is hard.
05:19 Company: does it have to be hard though?
05:20 Company: I mean, obviously it's gonna be harder than the slow thing
05:20 daniels: no, we just like irritating people
05:20 Company: dmabufs feel like that sometimes
05:23 Company: I mean, I don't mind telling people "if you allocate a buffer with DRM_FORMAT_FOO and modifier 12345, it needs to have a stride and alignment of 256" and whatever else I need to tell people
05:23 Company: but EGL pretends like all you need to know is the drm format and the modifier
05:24 Company: or it could be that alignment is part of the modifier, but then radv probably shouldn't advertise modifier 0
05:28 daniels: nah, alignment can't be part of the modifier
05:28 daniels: I assume your API's more complicated than you're letting on, since consuming devices have to be part of the allocation negotiation; as you've picked up, format/modifier sets are part but not all of it
05:29 Company: I don't have an API yet
05:29 Company: I'm trying to come up with one
05:29 Company: the format/modifier idea was the result of looking at EGL
05:29 daniels: stride alignment (to either pixels or macroblock units) is part of it, but that isn't always a single fixed number
05:29 daniels: formats/modifiers are the common bit that everyone agrees on so far, but not a complete answer
05:30 daniels: for more information, go look at the talks from past XDCs about how cross-device allocation is hard, and the design we've agreed on but hasn't yet been implemented
05:31 Company: also Wayland only exposes dmabufs I guess?
05:31 Company: *format/modifiers for dmabufs
05:31 Company: I guess it has the device in there, too
05:33 Company: I'm trying to come up with a GtkDmabufInformation kinda object that I can hand the application that enables them to create dmabufs that GTK can consume via GL/Vulkan (or pass through to the compositor)
05:33 Company: and then GStreamer and pipewire and whoever else can take that and set up their video stuff
05:34 daniels: sure, so those talks will explain what it would be that you'd have to surface for cross-device/subsystem/vendor/etc compatibility, otoh none of the allocation APIs to make use of that exist, so
05:36 Company: so it might not quite be ready for consumption yet
05:36 Company: and needs people from different projects to get together and agree on something and actually implement it
05:45 daniels: right now the zerocopy path involves doing actual allocations then trying imports to see whether or not they succeed; if they don't, you get to fall back to copying yourself
05:46 daniels: there's no 'just do something that is guaranteed to work everywhere' bulletproof api
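
The "try the import, fall back to copying" flow daniels describes can be sketched as below. Both callables are hypothetical stand-ins: `try_import` would wrap something like an eglCreateImage() attempt and return None on failure, and `copy_upload` would be whatever slow path the toolkit has. Neither name comes from a real API.

```python
from typing import Callable, Optional, Tuple, TypeVar

Buffer = TypeVar("Buffer")
Texture = TypeVar("Texture")


def acquire_texture(
    buf: Buffer,
    try_import: Callable[[Buffer], Optional[Texture]],
    copy_upload: Callable[[Buffer], Texture],
) -> Tuple[Texture, bool]:
    """Return (texture, zero_copy).

    First attempt a zero-copy import; if the driver rejects the buffer
    (try_import returns None), fall back to the copying path.
    """
    texture = try_import(buf)
    if texture is not None:
        return texture, True
    return copy_upload(buf), False
```

The second element of the result lets a caller log or report when it silently dropped to the slow path, which seems useful given that the failure only shows up at import time.
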
06:01 Company: the hard part about "copying myself" is that I can't do that with a random dmabuf
06:01 Company: because if eglCreateImage() fails, all I have is an fd that I can't access
06:02 Company: so I need to talk to the actual creator of the dmabuf and have them do the thing for me
06:02 daniels: you can mmap dmabufs
06:03 Company: only linear ones I thought
06:19 daniels: you can mmap non-linear dmabufs too, but the results will be non-linear
06:28 Company: so I'd need to have knowledge about the formats if I wanted to copy stuff out of it
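
For the linear case only, a minimal sketch of copying pixel rows out of an mmap()ed buffer could look like this. `read_linear_plane` is a hypothetical helper. It assumes the caller already knows the plane offset, stride and row size, and it leaves out the DMA_BUF_IOCTL_SYNC calls that should bracket CPU access to a real dmabuf. Tiled layouts would need per-modifier knowledge, as Company notes.

```python
import mmap


def read_linear_plane(fd: int, offset: int, stride: int,
                      row_bytes: int, height: int) -> bytes:
    """Copy one linear plane out of a mappable fd, dropping row padding.

    Maps from the start of the buffer up to the last byte of the last
    row, then gathers row_bytes from each stride-sized row.
    """
    length = offset + stride * (height - 1) + row_bytes
    with mmap.mmap(fd, length, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ) as m:
        rows = []
        for y in range(height):
            start = offset + y * stride
            rows.append(m[start:start + row_bytes])
        return b"".join(rows)
```

An ordinary file can stand in for the dmabuf when trying this out, since the helper only needs an fd that supports mmap().
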
07:53 pq: DemiMarie, yes; do everything in more reliable ways that we can test in CI without actual hardware first, then implement hardware off-loads, and the option to drop correctness requirements to allow use of even more hardware if wanted..
07:56 pq: robclark, what do you refer to with KHR_blend_equation_advanced? On Friday I was talking about a) EGL/GL importing YUV buffers and producing RGB samples automatically, and b) the use of KMS hardware to do color conversion operations in order to use KMS blending and avoid some GPU usage.
07:57 pq: robclark, I've never used KHR_blend_equation_advanced and I don't know what I'd need it for.
08:02 robclark: pq: ignoring (b) kms/display stuff, my point was for gpu the CSC is done by adding a matrix multiply in the shader for nearly all GPUs.. so for linear YUV, having the driver do it or having app do CSC in shader amounts to same thing. (The situation is different when you add modifiers in the picture.. not because of the CSC but because of how the GPU reads the data.. the CSC is still done in the shader. We could invent an extension to
08:02 robclark: return unconverted YUV and let app do whatever it wants with it.)
08:03 pq: robclark, ah, sure. My problem is with EGL_EXT_image_dma_buf_import spec though. But if Mesa can have its own CI tests to ensure that things like EGL_YUV_COLOR_SPACE_HINT_EXT are actually respected, that would be useful.
08:03 pq: That cannot be tested in CTS AFAIU, because the spec does not require taking those hints into account at all.
08:10 pq: Since (Mesa) drivers can individually claim to support the hints but still hand-wave the implementation, if I want to promise correct YUV decoding to my users, I need to either test it on every compositor start-up or just avoid that direct YUV import path altogether. Of course, there will be users who don't care about strict correctness, so they can opt-in to use the import anyway.
08:11 pq: EGL_YUV_COLOR_SPACE_HINT_EXT is the most obvious one, but the chroma siting hints are not just the right matrix, are they?
08:16 pq: I don't particularly want raw YUV samples in my shaders per se, but since I haven't heard of any guarantees or test coverage, I have no choice. And EGL_EXT_image_dma_buf_import is at least missing the constant luminance variant, which would pull in the transfer characteristic too.
08:18 pq: Supporting raw YUV sampling might be easier than adding the constant luminance support, and also more future-proof.
08:19 pq: in the end, there is much more than just the matrix at play here
08:20 pq: There is also ICtCp.
08:21 pq: The generic model would be matrix-TF-matrix for YUV-RGB conversion, but then there is also chroma siting.
08:24 pq: alternatively, if one could program a custom matrix as EGL_YUV_COLOR_SPACE_HINT_EXT, I think that would be enough for everything.
08:25 pq: except chroma siting
08:25 gfxstrand: dcbaker: If 1.3 could have everything we need, that would be amazing! :D
08:28 Company: pq: You'll probably have to have a fallback in place anyway - considering how basically any feature of the Vulkan Ycbcr spec is optional
08:30 Company: (and assuming that Vulkan specs are written against existing hardware features)
08:30 pq: yeah, there's two levels to this: optionality in the first place, and what can be expected when the optional feature is present.
08:32 Company: I also have no idea how it influences performance
08:32 Company: ie if the matrix run by the yuv hardware is faster (and if so, how much) than a shader
08:33 pq: right, I'm not even thinking about performance yet, just correctness
08:34 Company: and afaiu all the matrices convert between YUV and the sRGB colors, so if you have different primaries, you need to run a shader anyway
08:34 pq: no, that's fortunately not that bad
08:35 pq: the YUV-RGB conversion matrix is kind of independent of the colorimetry, though it is common to define both in the same spec.
08:37 Company: fwiw, the Vulkan spec does specify the expected math that is to be performed in https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#textures-sampler-YCbCr-conversion
08:37 pq: While a YUV-RGB matrix could be somehow optimal for a certain set of primaries and white point for certain use, it does not enforce the primaries or white point in any way.
08:39 Company: and that links to https://registry.khronos.org/DataFormat/specs/1.3/dataformat.1.3.html which is a 10 year old document, so I guess we can be kinda hopeful that hardware does the right thing
08:39 Company: if it does a thing
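
To make the "matrix in the shader" step concrete, here is a sketch of the non-constant-luminance YCbCr to R'G'B' conversion for 8-bit narrow-range input, with BT.709 luma coefficients as defaults. `ycbcr_to_rgb` is an illustrative name. The sketch covers only the matrix: no transfer function, no primaries or white point handling, no chroma siting, and no constant-luminance or ICtCp variants, which is the gap pq is pointing at.

```python
def ycbcr_to_rgb(y: int, cb: int, cr: int,
                 kr: float = 0.2126, kb: float = 0.0722):
    """Convert one 8-bit narrow-range Y'CbCr sample to non-linear R'G'B'.

    kr and kb are the luma coefficients (defaults: BT.709; BT.601 would
    be kr=0.299, kb=0.114). The result is nominally in [0, 1] and is not
    clamped.
    """
    kg = 1.0 - kr - kb
    # Undo narrow-range quantisation: Y' in [16, 235], Cb/Cr in [16, 240].
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg
    return r, g, b
```

Swapping `kr` and `kb` changes the matrix without touching anything about primaries, which matches pq's point that the YUV-RGB matrix and the colorimetry are separate choices even when one spec defines both.
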
09:12 karolherbst: cmarcelo: okay :) it all works in the end after I fixed the zink side
09:55 wv: currently trying to replace DRM in an application (cog/wpewebkit) by a drm-lease from a drm-master. It looks quite good, gets detected and all. But does not pass the authenticate step?
09:55 wv: this is with drm: [3083033.269] wl_drm@5.authenticate(1)
09:55 wv: [3083033.520] -> wl_drm@5.authenticated()
09:55 wv: this is with a lease: [2887550.595] wl_drm@5.authenticate(1)
09:55 wv: [2887550.866] -> wl_display@1.error(wl_drm@5, 0, "authenticate failed")
09:55 wv: I'm not really into these. Why is authentication failing? What needs to get authenticated where?
09:57 wv: using drm-lease-manager from AGL btw
10:05 karolherbst: cmarcelo: mhh.. apparently for private memory it doesn't work, but I'm also not sure if it's supposed to work. I've done the same as for shared memory and spirv-val at least doesn't complain
10:05 karolherbst: but I also don't know if the CTS actually tests that private blocks alias
10:06 karolherbst: https://gist.github.com/karolherbst/8b95a28877ef21898100f92475d18a63
10:06 karolherbst: the nir passes kinda split it all apart and dce the store
10:08 pq: wv, that "authentication" is a legacy API for getting GPU rendering access approved by the "display server". Nowadays you don't use that, because we have render nodes that give you the same access with less hassle, and in a more secure way.
10:09 pq: wv, render nodes are /dev/dri/render*
10:10 pq: wv, I don't know if leased DRM devices are supposed to allow authentication or not.
10:11 wv: pq, what do you mean with nowadays? This is just on master... Who is initiating this authenticate? Is this mesa, or the client application?
10:12 pq: wl_drm was supposed to be a Mesa-internal Wayland extension, but it's possible that applications could be using it directly too, so I don't know.
10:15 pq: wl_drm was replaced with zwp_linux_dmabuf_v1 Wayland extension that is public, and with apps opening DRM render nodes directly (done by Mesa).
10:16 pq: wv, maybe check if you have a render node exposed and open-able by the app?
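
A quick way to do the check pq suggests, as a small sketch: list the render nodes and keep the ones the current user can open read/write. `openable_render_nodes` is a made-up helper name, and the `renderD*` pattern assumes the usual /dev/dri/renderD128-style naming.

```python
import glob
import os


def openable_render_nodes(dri_dir: str = "/dev/dri"):
    """Return render nodes under dri_dir that this process may open read/write."""
    nodes = sorted(glob.glob(os.path.join(dri_dir, "renderD*")))
    return [n for n in nodes if os.access(n, os.R_OK | os.W_OK)]
```

An empty result from the app's own user account would suggest a missing render node or a permissions problem. It would not say anything about whether a DRM lease is supposed to support the legacy authentication path.
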
10:19 pq: wv, since I guess wpewebkit might be using Wayland internally, maybe it doesn't actually implement the display server side of zwp_linux_dmabuf_v1? Or maybe you need newer cog/wpewebkit? I'm just wildly guessing here.
10:20 wv: I'll have a look. I know it can run without weston running whatsoever, so just from console (using platform drm), but it gives these wl_drm outputs anyway
10:37 pq: wv, yeah, that is what makes me believe the app uses Wayland internally; you don't run any Wayland compositor as far as you know. Hence there must be one inside the app.
10:38 pq: and it may well be that the auth fail is about the app talking to itself (components in different processes)
10:39 pq: wv, I wouldn't know if it might be even intentional that some processes of the app need to use wl_drm auth. I can imagine a reason to do so.
11:06 wv: pq, well, if I can disable this auth somehow... I don't care a thing...
11:07 wv: is it an option to patch libdrm to always return true? will it have other impacts?
11:08 wv: I'm in full control of the system, so...
11:09 pq: well, it's the kernel that says whether something is allowed or not, so patching libdrm only helps if the auth was unnecessary to begin with.
11:09 pq: try it?
15:06 tnt: So posting here in case this rings a bell in some Intel expert's mind :) In rusticl running on DG2 the upper half of the 'Y' dimension of the local group seems to not be executed. Works fine on iGPU 12th gen.
15:18 tnt: Actually localsize 1,1 / 1,2 / 1,4 / 1,8 work fine. starting from 1,16 then only the lower half of the workgroup "executes".
15:22 hch12907: Can anyone check whether they can access registry.khronos.org? For me, it just loads indefinitely until the connection times out
15:24 tnt: hch12907: works for me.
15:25 hch12907: interesting...
15:27 hch12907: huh, I can visit the site just fine if I connect to it by IP (174.138.88.143)
15:28 hch12907: it can't be ipv6, can it?
15:28 hch12907: it is ipv6. I can't ping 2604:a880:800:10::985:a001
15:29 hch12907: weird that firefox isn't falling back to ipv4 though...
15:31 fluix: ipv6 doesn't respond for me either, but FF falls back. maybe something to do with your DNS server? not sure
15:35 tnt: I'm setting a bunch of debug env var ( NIR_DEBUG=print MESA_DEBUG=1 CLC_DEBUG=verbose,dump_spirv RUSTICL_DEBUG=program,clc ) and yet nothing gets printed ...
15:37 robclark: pq, re: EGL_YUV_COLOR_SPACE_HINT_EXT .. nir_lower_tex_options does have bt709/bt2020/etc.. which look like they are wired up in mesa/st. But the yuv import test coverage isn't great, so who knows.. Adding some piglit tests for this would be helpful, right now I would have no way to know if it was/wasn't working properly. (ps. sorry for delayed response.. xdc travel this week)
15:37 hch12907: not sure what's happening - other sites fall back to ipv4 just fine on FF. But nothing the /etc/hosts file can't fix I guess
16:10 austriancoder: does anybody have an idea what mesa_glinterop.h is used for?
20:00 zmike: austriancoder: EXT_external_objects
20:00 mareko: austriancoder: Mesa-ROCm interop
20:02 tnt: Also working on making it used for intel-compute-runtime interop. And rusticl uses it for its GL interop.
20:18 austriancoder: Thx