IRC Logs of #dri-devel on irc.freenode.net for 2023-11-14

01:15 jenatali: And there's 4.4 merged and 4.5 ready to go :)
02:50 Lynne: by the way, ffmpeg 6.1 has been released with vulkan video support, so I invite those bored to run with RADV_PERFTEST=video_decode to cause troubles for me and airlied
04:10 kode54: I'll wait for Arch to package 6.1
08:13 mripard: sima, emersion: I don't think using the global CMA heap will work. You might have additional, device-specific, requirements when it comes to where the buffer is that the CMA heap won't address (and won't even know about), so it needs to be tied to the device somehow
08:13 emersion: can you elaborate on these requirements?
08:14 mripard: well, it can be as simple as things like the device can only access the lower 128MB of RAM or whatever
08:14 mripard: so your buffer needs to be within that range to be useful
08:14 emersion: and that restriction is specific to scanout capability?
08:14 emersion: what about render and video?
08:15 emersion: no restriction there?
08:15 mripard: you can have the same story there
08:15 mripard: it's all device specific, really
08:15 mripard: like on Allwinner SoC, the codec is in that situation and can only access the lower 256MB of RAM
08:15 emersion: if we're talking specifically about vc4 and v3d?
08:15 mripard: (the older ones)
08:16 mripard: I don't think vc4 and v3d have those kind of restrictions
08:16 emersion: ok, so some split render/display SoCs have this
08:16 emersion: and some not
08:16 emersion: i see
08:16 mripard: yeah
08:16 mripard: so I think your driver-specific heap makes sense
08:17 emersion: alternatively, we could have a new heap created outside of the driver based on dt info for these "lower than $addr" regions
08:17 mripard: because then you can tie that to the device that will allocate the memory buffers and make sure that it can access the buffer it allocates
08:18 mripard: all those constraints are abstracted away by the DMA API
08:18 mripard: when you call dma_alloc_* it will already make sure those constraints are met
08:18 emersion: but we will want to express the compat links between heaps and devices at some point
08:18 mripard: so doing that all over again seems like a huge duplication to me
08:19 mripard: (even more so since they aren't all expressed in DT)
08:19 mripard: so you really need the struct device to make your allocation
08:20 mripard: and for the link between heap and device, the idea swick[m] (I think?) had to make it a sysfs link sounds good to me?
08:20 emersion: no, sima had this idea
08:20 mripard: that way you can discover it easily, there's no risk of a name clash, etc.
08:20 mripard: oh, sorry
08:21 emersion: but then you need to draw links between all devices' heaps
08:21 emersion: instead of between all devices and the central heaps
08:22 emersion: each device comes with its own heaps, and each device needs to know about each other possible device to advertise the links
08:24 mripard: yeah, it's kind of the discussion I started in my mails. If you want to completely solve the buffer placement issue, it's much more difficult than just adding heaps
08:24 mripard: like, there's no guarantee that the producer device and the scanout have the same requirements
08:24 emersion: it is indeed
08:25 mripard: and you don't even have a guarantee (aside from HW designers' sanity) that the ranges they can access are overlapping
08:25 mripard: so you need to aggregate all the constraints of all the devices involved and allocate a buffer there
08:25 mripard: and we don't have an API for that
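The aggregation step mripard describes can be sketched as a plain range intersection. This is only an illustration under stated assumptions, not an existing kernel or userspace API: the device names and the DEVICE_WINDOWS table are hypothetical, and the 128 MiB and 256 MiB limits simply echo the examples given in the discussion.

```python
MiB = 1024 * 1024

# Hypothetical per-device DMA windows as (start, end) byte ranges, end exclusive.
# Nothing here is read from real hardware or from sysfs.
DEVICE_WINDOWS = {
    "codec": (0, 256 * MiB),    # "can only access the lower 256MB of RAM"
    "scanout": (0, 128 * MiB),  # "can only access the lower 128MB of RAM"
}


def common_window(windows):
    """Intersect (start, end) ranges; return None if they do not overlap."""
    windows = list(windows)
    if not windows:
        return None
    start = max(w[0] for w in windows)
    end = min(w[1] for w in windows)
    return (start, end) if start < end else None


def can_place(size, windows):
    """True if a buffer of `size` bytes fits in the window shared by all devices."""
    window = common_window(windows)
    return window is not None and window[1] - window[0] >= size
```

With the table above, common_window(DEVICE_WINDOWS.values()) yields the lower 128 MiB, and two windows that do not overlap yield None, which is the case where no single buffer can satisfy every device.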
08:26 mripard: also, KMS kind of assumes that the DRM device is *one* device when it comes to allocation
08:26 emersion: sima said that if we have placement constraints via heaps we shouldn't be too far
08:27 emersion: there is the full buffer constraints proposal sent a few years back by James Jones, but it's pretty complicated and uses heaps anyways
08:27 mripard: but ARM devices will typically have different devices to access DMA, and again there's no guarantee that all the hardware devices that can access RAM have the same constraints
08:27 mripard: (it's also true for IOMMU mapping)
08:28 mripard: so we would need a new allocation API that would be finer-grained than the one we have right now
08:28 mripard: so I really don't think we should aim at fixing that
08:28 mripard: it's going to be a nightmare
08:29 mripard: and from a practical PoV, what we have works well enough as it is
08:29 emersion: this goal is pretty much the reason i'm working on this, fwiw
08:29 mripard: (or maybe not :D)
08:29 emersion: if the DMA heaps we create are no better than dumb buffers, maybe we can just stick with dumb buffers
08:30 emersion: (and tell people running headless or systemd-less to just chmod their primary node lol)
08:30 mripard: so, yeah, I really think your solution is sound, works well enough to address your immediate needs, and would be a good stopgap measure until we have that discussion again in 5 years :)
08:32 emersion: aha, because now i think my solution is pretty much unsound :P
08:32 mripard: is it perfect? No. Do we need perfection? Also no
08:34 mripard: as long as we provide some way for userspace to discover which heap it should use for a particular device
08:34 mripard: we can revisit later and change whatever we want beneath it
08:34 mripard: (famous last words)
08:35 mripard: but seriously, everything that we would need could be done on device (getting the allocation constraints) or heaps (providing a hint to where the buffer should be allocated)
08:36 swick[m]: Isn't the question really about what a heap should represent? If it's about encoding all the device constraints etc then you need a heap for almost all devices, but if it's just about the placement then it's fine to have way fewer heaps but we need to communicate the constraints another way in the future
08:37 emersion: heaps are not designed to represent each and every constraint
08:38 emersion: they are only designed for placement
08:38 mripard: oh, right
08:40 emersion: so stuff like stride or addr align is out of scope for instance
08:40 emersion: addr lower than given value is a bit of a weird case
08:42 mripard: constraints should definitely be expressed for a given device, and once we have "hinting" for heaps, then we can just use a global one
08:43 swick[m]: A specific address range seems like a placement issue to me
08:43 swick[m]: what's hinting in this context?
08:43 emersion: in our proposal with James Jones, constraints included a set of heaps
08:47 swick[m]: I'm still not sure what you actually want to build here...
08:47 mripard: swick[m]: asking for the buffer to be allocated within a given range
08:48 swick[m]: So allocating on a specific heap?
08:49 swick[m]: Oh, hinting to the heap which range to allocate in
08:50 javierm: emersion: this is the one you are referring to https://lpc.events/event/9/contributions/615/attachments/704/1301/XDC_2020__Allocation_Constraints.pdf right ?
08:50 swick[m]: And aren't constraints different per placement?
08:50 emersion: javierm: yes
08:51 javierm: emersion: I see that ezequielg_ also proposed https://patchwork.kernel.org/project/dri-devel/patch/20200816172246.69146-1-ezequiel@collabora.com/ around the same time
08:51 emersion: oh, good find
08:52 emersion: ah i was CC'ed even
08:52 javierm: but right now it seems that people just create different variants of dma-buf heaps, like the uncached / WC dma-buf heap I shared that Android uses
08:53 emersion: yeah the discussion there is quite similar to the one we're having now
08:53 javierm: I also see that vendor trees have other dma-buf heaps that are tailored to the constraints on these platforms
08:54 mripard: I think we can have something workable if we have something like a) device access constraints in sysfs b) a lib to resolve the constraints of devices we want to "link" together c) some way to tell heaps to allocate a buffer in a particular range
08:55 mripard: I think Lucas Stach also made some work in this area some time ago
08:55 mripard: (not sure if he's on IRC)
08:56 javierm: mripard: I was wondering the other day at what point the DRM/KMS subsystem will need something like the media controller framework that media/v4l2 has
08:56 mripard: I seriously hope we never do
08:56 javierm: with kmsro, multiple display and render nodes, dma-buf heaps with constraints, etc
08:57 mripard: and the media controller API doesn't address that either
08:58 MrCooper: mripard: AFAIK Lucas' nickname is lynxeye, doesn't seem here right now though
08:59 mripard: MrCooper: oh, thanks
08:59 mripard: I couldn't recall his nick
08:59 emersion: :P https://dri.freedesktop.org/wiki/WhosWho/
08:59 javierm: mripard: yeah, that's why I said "something like", in other words some control interface to setup a display and rendering pipeline
09:00 javierm: because you also said a lib to resolve the constraints and what devices can be linked together
09:00 javierm: the best we have right now for that AFAIU is kmsro in mesa and is a lot of hardcoding when I looked at that
09:00 mripard: emersion: what?! I've never seen that page before
09:00 mripard: it's awesome, thanks
09:01 emersion: np!
09:01 javierm: emersion: nice, thanks for that indeed.
09:01 mripard: javierm: right, but it's also something that should work outside of KMS, like for camera-to-display or codec-to-gpu-to-display
09:03 mripard: and imo, atomic modesetting is already superior to the media controller API (but with simpler cases to address too tbf)
09:04 mripard: the only lacking part at the moment is bridges, and that has fairly big implications (like it would probably affect connectors)
09:05 mripard: but I don't think we will ever need to use the media controller API
09:10 javierm: mripard: yeah, how is that constraint negotiation handled right now? user-space just tries to import the dma-buf and fails or ?
09:11 mripard: allocate the buffer on one end, import it on the other end and hope for the best
09:11 mripard: which, tbf, seems to work fairly well
09:13 mripard: because the hardware designers are sane, and the constraints I've seen are most of the time on the "producer" end, not the scanout
09:13 javierm: mripard: I see. That's interesting
09:13 mripard: there was this KMS driver (contributed by paulk iirc) that had those weird constraints where framebuffers had to be at fixed addresses
09:13 mripard: logicvc I think?
09:17 mripard: right
09:17 mripard: https://lore.kernel.org/all/20220520141555.1429041-2-paul.kocialkowski@bootlin.com/
09:17 mripard: "With version 3, framebuffers are stored in a dedicated contiguous
09:17 mripard: memory area, with a base address hardcoded for each layer."
09:18 emersion: ... ouch
09:24 javierm: mripard: /dev/dma_heap/fixed_$addr :P
09:26 javierm: mripard: combining that with a codec like the one you mentioned Allwinner has that requires a dma-buf below a given address would be interesting
09:40 mripard: like I said, even though we don't always like to admit it, HW designers are sane :)
09:41 mripard: but in one of those cases, I don't see how you can solve it without a memcpy (assuming the source buffer is accessible by the CPU)
09:56 javierm: mripard: pretty sure that HW engineers don't consider SW engineers sane looking at all the layers of complexity that they create :)
09:57 mripard: haters gonna hate :)
09:57 javierm: :D
10:00 enunes: emersion itoral so I'm finding it quite hard to pass information all the way from the wsi wayland layer to the device specific vulkan allocation, it seems that mesa even uses its own small vulkan extension just to differ wsi allocations from regular vulkan memory allocations
10:00 emersion: indeed
10:01 enunes: I think first I'm going to submit a patch just removing wl_drm usage but still relying on the master fd opened at device initialization, that fixes the bug
10:01 emersion: yeah sounds good
10:01 enunes: but it still doesn't make use of the dmabuf feedback info
10:01 emersion: ideally also gracefully handle missing master FD, by degrading to regular allocations
10:03 enunes: it's a shame because the dmabuf feedback info is right there with the correct device to use, I'm still probably missing something on how it should ideally be used
10:04 dj-death: jenatali: hey, I tried your suggestion of opt_memcpy for CL structures but that didn't really help
10:04 pinchartl: javierm: a DMA heap linked to a DT reserved memory region would make sense
10:04 dj-death: jenatali: most of the variables keep getting lowered to scratch load/store
10:04 pinchartl: the problem is to figure out what heap to use
10:05 dj-death: jenatali: if you have any suggestion on how to solve this kind of issue I would be interested to hear them :)
10:06 javierm: pinchartl: yeah, I didn't know that devices could have constraints as mripard explained
10:06 pinchartl: javierm: something we've seen for cameras is the requirement for Y and UV planes to be located in different DDR banks
10:07 pinchartl: to increase memory bandwidth
10:07 mripard: javierm: to some extent, you did :) the ssd130x driver is so constrained that you have to memcpy from a random buffer into the actual one
10:07 mripard: (also, not mappable)
10:08 emersion: pinchartl: oh that's a fun one...
10:08 pinchartl: emersion: I could imagine a display device having similar constraints
10:08 mripard: I'm not sure
10:09 mripard: YUV isn't as prevalent as it is for v4l2 for example
10:09 pinchartl: it makes more sense for cameras though, as display bandwidth tends to be lower
10:09 mripard: and multi-planar RGB doesn't really make much sense
10:09 pinchartl: 8k @300fps displays are not that common :-)
10:09 pinchartl: (yet)
10:26 MrCooper: javierm: FYI, that simpledrm patch cover letter with no subject being shown in a separate thread is a gmail issue, a decent MUA will show it in a single thread with the patch and follow-ups
10:36 javierm: mripard: re: ssd130x - hehe, you are correct :)
10:38 javierm: MrCooper: it may well be. I use as MUA emacs-notmuch and fetch the email using mbsync, but it could be that gmail mangles the emails and the pulled thread is already broken
10:46 MrCooper: the e-mail headers have all the information for proper threading, Thunderbird showed them in a single thread
10:46 javierm: MrCooper: weird
10:47 MrCooper: (not using gmail in any way)
10:47 javierm: there's something wrong though, because mripard has a script to get the emails threads from lore and that failed too for me
10:48 javierm: but maybe that's a consequence of my MUA breaking the thread
10:48 mripard: stop telling everyone about the bugs in my scripts :'(
10:49 javierm: mripard: there's no bug in your script. The threading is what's broken
10:49 mripard: mutt doesn't follow the threading either though
10:50 mripard: but In-Reply-To and References are set properly
10:50 mripard: so it looks fine by me?
10:50 mripard: but somehow it confuses multiple tools
10:51 mripard: (lore, gmail and mutt)
10:51 MrCooper: most likely due to the empty subject?
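As a rough illustration of the header-based threading mentioned above, the sketch below picks a message's parent from In-Reply-To, falling back to the last entry of References. It is a simplified assumption about how a mail client might thread, not how lore, gmail or mutt actually do it, and parent_id is a hypothetical helper name.

```python
from email import message_from_string


def parent_id(raw_message):
    """Return the Message-ID this mail replies to, or None for a thread root.

    Only the headers are consulted; the Subject line plays no role here.
    """
    msg = message_from_string(raw_message)
    in_reply_to = msg.get("In-Reply-To", "").split()
    if in_reply_to:
        return in_reply_to[0]
    # By convention the last entry in References is the direct parent.
    references = msg.get("References", "").split()
    return references[-1] if references else None
```

A tool that threads this way would attach a patch to its cover letter even when the cover letter has an empty subject, which matches the observation that the headers themselves were set properly.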
11:09 karolherbst: dcbaker: ping on the proc macro native workaround somebody proposed: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25775
14:58 jani: sima: ack? https://patchwork.freedesktop.org/patch/msgid/20230921153429.3822278-1-jani.nikula@intel.com
14:59 jani: sima: or go all the way and remove the warnings too?
15:01 sima: jani, iirc nonexistent modparams don't cause module failures anymore, so I think we could just trash it outright?
15:01 sima: either way a-b: me
15:01 jani: sima: I'll do that, thanks
15:26 emersion: agd5f: should we merge the atomic async flip stuff via the amd tree or via the drm-misc tree?
15:42 emersion: tzimmermann: one of the patches in Javier's series has docs to explain the two kinds of damage
15:44 tzimmermann: emersion, patch 5? i still found it a bit undescriptive
15:50 javierm: tzimmermann: did you follow the referred links ?
15:51 tzimmermann: javierm, https://bugzilla.kernel.org/show_bug.cgi?id=218115 ?
15:51 tzimmermann: i'm going to read that
15:53 tzimmermann: ah, in patch 5. ok!
15:53 javierm: tzimmermann: great and you understood correctly that the goal was to effectively disable damage handling if there is a page flip with a new framebuffer plus damage
15:55 javierm: but that's the best we can do until there is buffer age or similar buffer damage accumulation tracking
16:05 jenatali: dj-death: If you want to paste a nir_print dump, I'd take a look at it. My psychic powers of deduction aren't giving me any more ideas right now though
16:06 tzimmermann: those links in patch 5 were helpful, thanks
16:17 dj-death: jenatali: will try to give an example :)
16:23 jenatali: \o/ GL4.5 merged
16:29 javierm: tzimmermann: you are welcome
16:29 daniels: jenatali: that's awesome!
16:30 jenatali: :)
16:56 karolherbst: gfxstrand: time for a very very quick look at my SYCL MR? It's just trivial workarounds, I just want to know what you think about those: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25701
17:20 karolherbst: pain.. I'm hitting this issue: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/1142
17:21 karolherbst: jenatali: ever ran into this as well? openvino is hitting this pattern :'(
17:21 karolherbst: I left a comment there with the code I've hit
17:22 jenatali: I haven't, that looks annoying
17:23 karolherbst: mhhh yeah..
17:24 karolherbst: _though_ might be easily fixed
17:24 karolherbst: just use Uniform constant when emitting the variable and the loads
17:24 karolherbst: but ....
17:24 karolherbst: well...
17:24 karolherbst: probably the right thing
18:56 Hazematman: Could someone add llvmpipe and lavapipe labels to my MR please? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26153
18:57 karolherbst: done
18:57 Hazematman: Thank you very much karolherbst ! :)
20:21 jenatali: Ugh. Trying to enable SPIR-V in GLOn12 has me chasing bugs in piglit and spirv-tools
20:22 dj-death: fun time
20:23 jenatali: Sure. You could say that
20:28 jenatali: Anybody here a SPIRV-Tools maintainer and want to help along https://github.com/KhronosGroup/SPIRV-Tools/pull/5477 ? :)
20:36 gfxstrand: jenatali: SPIRV-Tools folks are usually fairly responsive.
20:36 jenatali: 👍
20:45 karolherbst: I have a hack for my issue and I hate it
20:46 karolherbst: jenatali: https://github.com/karolherbst/SPIRV-LLVM-Translator/commit/08ff0a1df47738c1efc2d4793b39a63e6d3d1655
21:11 jenatali: So I don't actually expect to get any reviews on https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/850 because it's Windows stuff. So instead I'll fish for acks
21:23 dj-death: huh
21:23 dj-death: what's the variable to disable threading?
21:25 Company: mesa_glthread=false GALLIUM_THREAD=0
21:25 Company: I needed both when using sysprof
21:35 dj-death: thanks a lot
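A small sketch of applying the two variables Company mentions when launching a program, for example from a profiling script. The variable names and values come from the messages above; the helper names are hypothetical, and whether both variables are needed may depend on the driver in use.

```python
import os
import subprocess


def single_threaded_mesa_env(base=None):
    """Copy an environment and add the two settings quoted in the log."""
    env = dict(os.environ if base is None else base)
    env["mesa_glthread"] = "false"  # disable glthread
    env["GALLIUM_THREAD"] = "0"     # disable the Gallium threaded context
    return env


def run_without_mesa_threads(argv):
    """Run `argv` with Mesa threading disabled; returns a CompletedProcess."""
    return subprocess.run(argv, env=single_threaded_mesa_env(), check=False)
```

Calling run_without_mesa_threads with the command line of the application under test is equivalent to prefixing that command with mesa_glthread=false GALLIUM_THREAD=0 in a shell.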
22:25 DavidHeidelberg: anholt_: Hey! Is this change expected when going 6.4 -> 6.6 kernel: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/51580478#L764 ?
22:40 dj-death: is there anywhere in the kernel that specifies what clock is used for the drm_syncobj waits ?
22:40 dj-death: looks like it's CLOCK_PROCESS_CPUTIME_ID