08:46MrCooper: enunes emersion: there's also the issue that dumb BOs aren't supposed to be usable for HW acceleration :)
08:46emersion: sure
08:46emersion: that's my original motivation for working on this
08:47emersion: MrCooper: https://lore.kernel.org/dri-devel/20231109074545.148149-1-contact@emersion.fr/T/
08:47MrCooper: cool
08:48emersion: eh, i just notice one leftover comment from old code
08:48emersion: will fix next version
08:50javierm: emersion, pq, sima: I've implemented what we discussed yesterday to fix the issue with FB_DAMAGE_CLIPS in virtio-gpu
08:50emersion: nice!
08:50javierm: emersion, pq, sima: it would be great if you can take a look to make sure that got it correctly: https://github.com/martinezjavier/linux/commits/drm-virtgpu-damage-fixes
08:50javierm: emersion: tested on weston in a VM, was able to reproduce the issue and with those patches the flickering goes away
08:51emersion: perfect
08:51emersion: that does full damage always in that case i'd assume
08:51emersion: (weston/wlroots/mutter case)
08:52javierm: emersion: in mutter it does not, but I guess that mutter doesn't use double buffering? Or at least I wasn't able to reproduce the artifacts with mutter
08:52emersion: if you have extra time, would be nice to (1) add a note/warning in the regular damage helper for driver devs about all of this
08:52emersion: (2) add TODOs
08:53emersion: mutter does use double buffering
08:53emersion: all modern userspace i know of does
08:53javierm: emersion: I see. Wonder why isn't visible there...
08:53emersion: probably because mutter submits an over-damaged region
08:53emersion: possibly a mutter bug
08:54emersion: (it could be that they mix up frame and buffer damage)
08:54emersion: (submit buffer damage, while kernel expects frame damage)
08:59sima: javierm, clearing the damage rects feels a bit like a hack, I'd have set iter->full_damage instead
08:59sima: so that you have both a buffer_damage_iter and buffer_damage_merged version
08:59sima: and wouldn't need to open-code in vmwgfx
09:00sima: but ... eh, not feeling very strongly on this
09:00sima: javierm, the one I'd really like to see though is a doc patch adding links to the two buffer vs frame damage graphs we discussed yesterday
09:00sima: and then maybe in that section also link to the 2 set of different functions
09:01javierm: sima: sure, I'll add that. And also the TODO and note that emersion suggested
09:01sima: also I think the 2 set of functions should link to the other type with a "if your driver needs buffer/frame damage instead", which is maybe a better reason to roll out the full symmetric uapi
09:01sima: javierm, but wrt the uapi logic, I think you have the right if condition
09:02javierm: emersion: the TODO is what pq mentioned, right?
09:02sima: javierm, I also wonder whether there's a good place to put down the rule of thumb/summary I typed yesterday on irc, maybe in the damage helper DOC: section?
09:02javierm: < pq>| if it's per-buffer, you also need "buffer age" or similar damage accumulation algorithm
09:02emersion: the TODO is do something smarter than full-damage when FB_ID changes
09:02sima: I mean the upload target rule: per-plane/crtc->frame damage, per-fb/bo->buffer damage
09:02emersion: yeah, accumulating damage in a ring buffer
09:02javierm: emersion: thanks. Just wanted to make sure I got that correctly. I'll do that then, thanks!
09:03emersion: ty :)
09:03sima: yeah TODO for buffer age/damage accumulation would be good, maybe could put that into the DOC: overview?
09:03javierm: sima: Ok, I'll figure out how to add that content in the docs
09:04javierm: sima: about clearing FB_DAMAGE_CLIPS vs settting iter->full_damage = true, the problem is that with the latter we will need a custom iter_init
09:04javierm: and wanted to keep the change minimal so that could be backported as a fix...
09:04sima: javierm, yeah
09:04javierm: sima: maybe we could go with this approach and then refactor?
09:04sima: javierm, adding a custom iter_init is imo still well within backport rules
09:04javierm: sima: yes, but the driver is using the helper that does both
09:04sima: javierm, I wouldn't, this is tricky enough that a special backport patch version doesn't sound like a good idea
09:05javierm: so would have to duplicate the logic and couldn't reuse it
09:05sima: like gregkh really doesn't like custom backports ...
09:05javierm: sima: yes, that's why I thought the patches would be minimal and applied cleanly for v6.5
09:05sima: hm
09:06sima: javierm, so I think buffer_damage_iter_init could just call iter_init and then adjust like you do?
09:06sima: and for damage_merged I guess you'd need a __damage_merged with a (bool buffer_damage)
09:07sima: that doesn't look too horrendous
09:07sima: and should be still within stable limits (if we ignore the documentation part)
09:07javierm: sima: yeah, that's what I had but felt like a bigger change. I'll do that though if you prefer and is withing backporting rules
09:07sima: I think it's within, and the symmetry I think wins us enough in documentation clarity that it's justified ...
09:07javierm: sima: got it. Will do that
09:08sima: javierm, if you want to, I guess you could split out the doc patch (since damage helpers don't have a DOC: section yet)
09:08sima: and not mark that for cc: stable ...
09:08sima: but ... *shrug*
09:08javierm: sima: yeah, I was going to only add Fixes: to the helper + virtio-gpu patches and only Cc stable on those
09:09sima: yeah
09:09javierm: not even for the vmwgfx, nobody reported nothing for that one :)
09:09sima: and then a patch to add a DOC: section explaining the frame vs buffer damage and which sets of helpers you want in each
09:09javierm: sima: sure
09:09sima: javierm, I guess/hope that when you send it out vmwgfx folks hopefully test and apply themselves
09:10sima: since it's really not great when we have inconsistent uapi like this
09:10javierm: agreed
09:10sima: javierm, also might be a good idea to raise a mutter issue, if that one does indeed do buffer damage and not frame damage?
09:10javierm: sima: yes, I'll discuss with the mutter folks
09:11sima: might need a bit care to make sure the fixes have hit enough stable kernels and distros ...
09:12javierm: sima: I'm not worried because mutter has virtio-gpu (and vmwgfx) is a atomic KMS deny list, I've a local patch to test damage handling there
09:12javierm: *in a
09:12javierm: the reason for this is that the cursor hotspots patches haven't landed yet
09:12emersion: > the Pi5 will have multiple display nodes (4(?) iirc) with different capabilities, vc4 being only one of them
09:12emersion: oh. boy.
09:12emersion: daniels ^ you knew about this?
09:13sima: glorious
09:13sima: re pi5
09:13javierm: sima, emersion, pq: thanks again for your time and explanations, I learned a ton about frame vs buffer damage. I need to do some $dayjob taks but will post the patches later today :)
09:13emersion: javierm: glad i could help, thanks for all of the fixes and mess cleanup! :D
09:14emersion: now i think i really need to order a RPI5
09:14emersion: moar exotic setups
09:15daniels: emersion: lots of fun for us all :\
09:17sima: tell me they managed to create the most cursed venn diagrams of tiling modes
09:17sima: like 3 ip blocks, only joint formats between each pair
09:17sima: or something like that
09:18emersion: did they manage to outdo NVIDIA (can't render to LINEAR)?
09:19emersion: mripard: if you have more details about the RPi 5 display nodes i'd be interested
09:20mripard: did you have a coffee yet ? :)
09:20emersion:grabs popcorn
09:21mripard: so, the basic idea is that they stripped MIPI-DSI, MIPI-DPI and analog TV out of the SoC (and thus vc4) and into a discrete PCIe chip
09:21mripard: vc4 only has writeback and HDMI now
09:22mripard: the RP-1 is that PCIe chip and has a public documentation
09:22mripard: https://datasheets.raspberrypi.com/rp1/rp1-peripherals.pdf
09:24emersion: hm
09:24emersion: in terms of tiling and memory requirements for scanout, do you know what's up?
09:24mripard: yes, they only support XRGB8888 iirc
09:24emersion: LINEAR?
09:24sima: weeeeeeeee
09:24mripard: that's my understanding
09:24sima: vram?
09:25emersion: can you allocate CMA and scanout on the PCIe thing?
09:25mripard: so if you want to output any format output by the GPU or codec in hardware, you have to go through vc4, use the writeback to convert it to XRGB, and use that
09:25emersion: or does the PCIe thing have its own scanout memory?
09:25emersion: ugh
09:25mripard: no, there's no VRAM
09:25mripard: so it's all in RAM afaik
09:26emersion: do you *have* to go through writeback, or can you use GL/Vulkan instead?
09:26mripard: I was only involved on vc4, so that part is a bit blurry to me, sorry
09:27emersion: okok
09:27mripard: no, you don't need to, so as long as you have the proper shaders to convert whatever tiled YUV you have to XRGB, it should work just fine
09:27emersion: ok, that makes it easier for user-space
09:28emersion: so they use writeback as a 2D API lol
09:29mripard: I don't know if they do at the moment, but we discussed it at some point
09:30sima: I guess bit less memory bw at the cost of the external displays being one frame behind?
09:30mripard: the good thing is that it puts us pretty much in the same spot than SPI/I2C displays, so javierm is going to be super happy about whatever comes out of it :)
09:30mripard: sima: yeah, and also less power consumption if using writeback I assume
09:30javierm: mripard: haha
09:30sima: mripard, if they can scan out directly from system memory it's a lot better than most spi/i2c
09:31sima: otoh lack of framebuffer compression in a modern soc ... ugh
09:31mripard: sima: sure, I was talking about the formats mismatch between the GPU/codec output and what the display device supports
09:32sima: also pcie in your scanout path ... I do not pity the poor soul who has to figure out the watermarks for the fifos
09:32sima: afaik no one has tried that yet
09:32mripard: sima: "Framebuffer compression is for those who can't build a memory bus" or something
09:33sima: mripard, yeah, except I don't think there's anyone else than apple who can build a memory bus for a mobile/laptop soc :-P
09:33sima: and they also still have compression
09:34sima: mripard, I guess the plan is to instantiate all the devices with dt, because really it's just an arm soc?
09:34sima: behind the pcie device I mean
09:34mripard: I think it's all enumerated?
09:34sima: oh really?
09:35mripard: like I said, I wasn't really involved on that part, but that was my assumption because it's PCIe
09:35sima: I've seen a few of these "whack a soc behind a pcie bridge" designs, and they're all just "here's a pci bar for the entire thing, good luck"
09:36sima: so would be really suprising if it's all properly enumerated as pci functions
09:36emersion: mupuf, if you have a RPi 5 running with their stack, i'd be interested in a drm_info dump
09:38mripard: sima: looks like you're right: https://www.raspberrypi.com/documentation/computers/raspberry-pi-5.html#attaching-a-display
09:40mripard: sima: the code is there https://github.com/raspberrypi/linux/tree/rpi-6.1.y/drivers/gpu/drm/rp1
09:40sima: mripard, well that's just for the display, kinda makes sense to keep that
09:40mripard: and it's indeed DT based devices
09:41sima: I was wondering about all the devices on the soc, since I guess we have drivers for a lot of them already with dt bindings ...
09:41mripard: https://github.com/raspberrypi/linux/blob/rpi-6.1.y/drivers/gpu/drm/rp1/rp1-dsi/rp1_dsi.c#L513
09:41sima: mripard, ah neat
09:41sima: yeah just found the of matches
09:46kode54: Oh right, I need to supply new drm info dump, even if my hardware is less rare now
09:47kode54: (6700 XT, bet you have dumps of those already)
09:48lynxeye: sima: On DT platforms you can simply describe the soc behind the bar via the usual DT means. No idea if Rpi actually does this.
09:49sima: lynxeye, oh so you just have the dt for behind the pcie bar in your overall soc dt?
09:49mripard: https://github.com/raspberrypi/linux/blob/020ee5029ab0b11e47696f538418105ccfdb44de/arch/arm/boot/dts/bcm2712-rpi-5-b.dts#L168
09:49sima: (it's a separate pcie chip here)
09:49mripard: it looks like it's what they do
09:50sima: this feels like it would break gregkh ...
09:51lynxeye: sima: yep, describe the PCIe endpoint below below the RC in your SoC DT, add DT devices as child devices of your PCIe endpoint.
09:51mripard: sima: I'm sure you didn't meet the USB devices binding yet :)
09:52sima: well I guess it's better than the x86 approach of just having random mmio window mirrored into everything, because a driver on windows cannot access anything outside of its pcie bar ...
09:53sima: (and then the chip/fw dies because there's races)
09:54javierm: lynxeye, sima, mripard: I guess the DT binding could be similar to the ChromeOS Embedded Controllers that have peripherals behind it:
09:54javierm: e.g: https://elixir.bootlin.com/linux/latest/source/arch/arm/boot/dts/samsung/exynos5800-peach-pi.dts#L943
09:54lynxeye: sima: x86 with its fixed ACPI tables in firmware and "driver needs to know how the complex HW behind a PCI ID looks like" just feels awkward as soon as you got sufficiently deep into the DT rabbit hole ;)
09:54javierm: this is a EC in a SPI bus that exposes an I2C tunnel
09:55sima: lynxeye, oh it's very awkward
09:55sima: but mostly it's awkward because windows driver model is crap
09:59pq: javierm, you're welcome. :-)
10:08pq: emersion, writeback as a 2D hardware composition API... and color pipelines incoming. We might actually get a proper 2D API in a decade or two. :-D
10:09emersion: yeah… not sure KMS is a super great API for 2D blitters but oh well
10:12javierm: everything old is new again :)
10:12mripard: if only :)
10:44mupuf: emersion:
11:45mupuf: emersion: this time with the message: will do when I get to it. It's not the highest priority right now ;)
11:46emersion: mupuf: ty, yeah, no problem at all
12:03Company: MrCooper: re our discussion about zmike's blog post
12:03Company: while reading the spec, I just found https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#swapchain-acquire-forward-progress which says:
12:04Company: "After a successful return [of vkAcquireNextImageKHR()], the image indicated by pImageIndex and its data will be unmodified compared to when it was presented."
12:26javierm: sima, emersion: I guess that https://docs.kernel.org/gpu/drm-kms.html#damage-tracking-properties is a good place to put that buffer vs frame damage explanation and links ?
12:27emersion: probably yeah
12:27javierm: emersion: Ok
12:27emersion: and link to it from the functions
12:27emersion: ideally
12:28emersion: ask me if you don't know how to link (took me a while to figure out)
12:28javierm: emersion: yeah, I was planning to do that and also in the TODO
12:28javierm: emersion: perfect, will do
12:45mripard: sima: ack for https://lore.kernel.org/dri-devel/20231025132428.723672-1-mripard@kernel.org/ and https://lore.kernel.org/dri-devel/20231025132428.723672-2-mripard@kernel.org/ ?
12:50sima: mripard, ack on both
13:17kj: For isaspec, what would be the correct way to go about a case where all but one instruction has a specific bit used for a paricular function?
13:18kj: I've defined the field in the top most bitset and then used an override in the special case, with an expr that's always true, and provided an "x" pattern as the alternative to the override
13:19kj: Not sure if that's the correct way of doing it, or if there's a better way, without having the field defined in each instruction separately
13:28danylo: kj Alternatively you could add one more level of inheritance and place that field there. You also could probably move override to the base definition and use params to enable it
13:29danylo: But for a single edge case I guess your way is as good as any
13:38kj: Thanks. I'll have a look into how params work then
13:42MrCooper: Company: that makes more sense, thanks
13:43Company: at least all the spec binging is finally good for something!
14:34mupuf: emersion: ready for 16K memory pages on the Pi 5?
14:35mupuf: marcan won't feel alone anymore :D
14:35emersion: wew
14:35mupuf: https://www.raspberrypi.com/documentation/computers/config_txt.html#kernel
14:36karolherbst: I wonder how long we'll keep the page size a compile time option...
14:37emersion: considering how many places assume it's a constant? probably a very long time
14:37karolherbst: I meant on the kernel side tho
14:37emersion: yea
14:38karolherbst: so userspace can either run in 4k or 16k or 64k mode or whatever
14:38karolherbst: but yeah...
14:38karolherbst: it's a massive project to make it all a runtime thing
14:38emersion: sysconf(PAGESIZE) -- wonder how many user-space programs do that
14:38karolherbst: <1%
14:38emersion: optimistic!
14:39karolherbst: `#define PAGE_SIZE sysconf(PAGESIZE)` ez fix
14:39emersion: ahah
14:40karolherbst: but yeah.. I think the idea of turning it dynamic on the kernel side was shot down a few times "due to performance regressions"
14:40karolherbst: so yeah...
14:43MrCooper: folios seem to provide a lot of the same benefits
14:43jani: mripard: mlankhorst: tzimmermann_: ack? https://lore.kernel.org/r/877cn04hvq.fsf@intel.com
14:45mripard: jani: ack
14:45jani: mripard: thanks!
14:54javierm: sima, tzimmermann_: what's the difference between "Subsystem-wide refactorings" and "Core refactorings" sections in Documentation/gpu/todo.rst ?
14:55javierm: I would had thought the later is for refactoring that don't affect drivers and the former if core + driver changes are needed? But the items in each seems to be inconsistent in that regard
14:56javierm: trying to figure out in which one should fall the "Implement buffer age or other damage accumulation algorithm for buffer damage handling" item
14:56javierm: emersion: ^
15:01emersion: hm i wouldn't call it a refactoring
15:01javierm: emersion: yeah me neither but that's how is divided the TODO doc...
15:02javierm: should I add a new section then?
15:02javierm: there's also "Driver Specific", maybe there?
15:03javierm: emersion: hmm, "Bootsplash" and "Brightness handling on devices with multiple internal panels" also have their own sections, I'll add another one then
15:32tzimmermann: javierm, idk
15:32jani: mripard: aww crap. asked for ack, got ack, forgot to record it in the commit messages *facepalm*
15:34mripard: jani: don't worry about it :)
15:34jani: mripard: thanks :)
17:23enunes: emersion daniels so on rpi4 the main device we get from dmabuf feedback in the client is always the gpu render device, and at first we get only tranches still with the gpu device as target devices too, only a bit later we do get a target tranche with the display drm device
17:23enunes: it seems to be ok as the only tranche with the scanout flag is the target tranche with the display drm device
17:23emersion: yeah that's expected
17:25enunes: that info is not passed all the way to the device though, so I guess I need to store it in device specific data so its accessible at the time we do v3dv_AllocateMemory
17:28emersion: we need to stash it somewhere just like we stash the modifier list
18:02enunes[m]: zamundaaa @zamundaaa:kde.org: I guess it will be ok as long as it sends the proper device and scanout flags in the target tranches (btw I only got your message on matrix not irc)
18:03zamundaaa[m]: Ok. Yeah the IRC bridge went down / restarted / whatever today again. This message should get through
18:07emersion: it did
23:46kiroma: Hey, I've gotten myself recently an ARM board and wanted to try contributing to Mesa, what's usually used to test for breakages between patches?
23:48HdkR: piglit and Khronos CTS
23:50HdkR: Which are a nice suite of test cases to ensure the world doesn't break :)
23:59kiroma: Oh, piglit is a wee bit... large