IRC Logs of #dri-devel on irc.freenode.net for 2024-01-24

00:01 alyssa: gfxstrand: "multiple, say, occlusion queries going at once"
00:01 alyssa: wait what?!
00:01 alyssa: that is. not how the hw i'm familiar with works, like, at all?!
00:01 gfxstrand: On IMR hardware BeginQuery writes a value, EndQuery writes another value and CopyQueryPoolResults subtracts
00:01 jenatali: Hm? There's nothing wrong with that from an API standpoint
00:02 jenatali: Right
00:02 alyssa: gfxstrand: ok, no tiler i'm familiar with
00:02 alyssa: vk lets the same draw count towards multiple active occlusion queries?
00:02 gfxstrand: I thought so
00:03 alyssa: O_o
00:03 gfxstrand: But I could be wrong. I'd have to go deep diving
00:03 gfxstrand: And I don't have the spoons for that sort of deep dive tonight
00:03 jenatali: AFAIK yes
00:03 alyssa: ^ a nice summary of the VK problem
00:03 alyssa: ("Absolutely critical details hidden in some random VU")
00:05 gfxstrand: And the problem with GL is that the absolutely critical details aren't written down.
00:05 alyssa: :clown:
00:07 alyssa: oh right here
00:07 alyssa: VUID-vkCmdBeginQuery-queryPool-01922
00:07 alyssa: gfxstrand: jenatali: not allowed in vk :-)
00:07 jenatali: :O
00:08 alyssa:vaguely remembers doing that exact deep dive years ago for panvk
00:08 gfxstrand: Well, it's allowed in all my drivers. :P
00:08 alyssa: gfxstrand: >:)
00:08 HdkR: Sounds like something an extension could solve. So all games can start using the feature
00:09 alyssa: noooooooo
00:09 alyssa: implementing it on a tiler efficiently sounds.. painful
00:09 alyssa: doable though
00:09 HdkR: I /definitely/ described to Rob that ES 3.0's occlusion queries also allowed multiple of the same active and was wrong :D
00:09 alyssa: granted everything painful seems doable after compute-based geom+tess+xfb ...
00:10 alyssa: HdkR: the spicy case is, can you have a boolean query active at the same time as a precise one?
00:10 alyssa: for VK, clearly no
00:11 alyssa: for GL, I don't know and my drivers don't allow it and they passed CTS.. :p
00:11 HdkR: alyssa: Luckily the only driver that supports precise occlusion queries active in ES is the NVIDIA blob. So it can be safely ignored there
00:12 HdkR: GL land...madness
00:12 alyssa: heh
00:14 HdkR: Although, I'd still be fine with Mesa gaining support for precise occlusion queries in ES. For those poor people that haven't adopted VK yet
00:26 HdkR: Actually no, I'll change my opinion. If you're stuck on a platform with ES and not GL, just use Zink instead :D
06:13 airlied: zmike: I don't think our quantize to f16 handles outputting infinities properly, I seem to get nans
06:15 airlied: alyssa: oh you wrote it maybe you know
06:16 airlied: I wonder where the dxil lowering is
06:17 airlied: I think we need to initial compares to pass new conformance
06:34 airlied: actually I'm lost on what is going wrong, will keep digging :-P
07:21 airlied: ah need the magic split fp64 flag
07:24 airlied: zmike, alyssa : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27228 should make sense
07:27 airlied: zmike: any ideas what is up with dEQP-VK.api.version_check.unavailable_entry_points ?
07:45 airlied: hmm I wonder if this could be a layer issue
08:13 tzimmermann: javierm, if you have a bit, could you please review https://lore.kernel.org/dri-devel/20240117125527.23324-1-tzimmermann@suse.de/ ?
08:15 MrCooper: Company: you can use e.g. GBM for allocating the buffers
08:17 javierm: tzimmermann: yes, I have in my TODO but was just busy with some internal tasks. I'll do it this week for sure
08:17 tzimmermann: thank you
09:03 Company: MrCooper: yeah, I know I can use external APIs, but I was hoping for way to convince GL to do that
09:03 Company: MrCooper: because unlike me, GL gets the modifiers right
09:29 MrCooper: Company: "right" WRT what? How could GL know which modifiers are (not) suitable for your use case?
09:31 Company: my use case is mainly rendering with GL
09:32 Company: so my idea was that GL knows how to do that
09:35 emersion: but if you're using DMA-BUFs, then you're sharing that buffer with _something else_ right?
09:35 emersion: and that _something else_ may support different modifiers
09:36 Company: sure, but that'd be step 2
09:37 Company: because right now that someone else is Vulkan or a compositor, and they usually agree on things
09:38 MrCooper: GL still can't magically know what they agree on though
09:38 Company: and pretty much any modifier is gonna look better than glReadPixels()
09:39 Company: sure, and I'm perfectly fine with an API where I tell GL my preferred modifiers
09:39 Company: I guess GL can give me its modifiers
09:40 Company: or are they only for importing?
09:40 MrCooper: if you know the modifiers, you can allocate with GBM?
09:41 MrCooper: or you can create a corresponding GL extension, if you prefer
09:43 Company: I'm not sure I know the modifiers
09:43 emersion: Company: there have been situations where client GL doesn't agree with server GL (e.g. flatpak), and GL doesn't agree with Vulkan (missing format/modifier on one side)
09:43 Company: I can eglQueryDmaBufModifiersEXT() but I don't get the preferred modifier for a framebuffer target that way
09:43 emersion: it's not theoretical, i've seen this multiple times in practice
09:43 emersion: also the compositor might do arbitrary filtering
09:44 emersion: for instance gamescope does
09:44 Company: in Vulkan I can specify the format and Vulkan picks its favorite modifier
09:45 Company: actually, I give it a list of modifiers to choose from I think
09:45 Company: and then it reliably picks the weirdest one
09:45 emersion: one really must follow the format modifier negotiation dance described in
09:45 emersion: https://www.kernel.org/doc/html/next/userspace-api/dma-buf-alloc-exchange.html#formats-and-modifiers
09:45 emersion: or else there will be breakage
09:46 emersion: yeah, passing a list of formats/modifiers to the allocator is the right way to do it
09:47 Company: so I'll have to add some internal allocate_dmabuf() and then export that into GL
09:48 Company: and before that I need to figure out somehow which formats/modifiers GL supports
09:48 Company: for rendering to
09:49 Company: I guess it's all the formats that eglQueryDmaBufModifiersEXT() doesn't mark as external_only
09:49 emersion: and intersect that with the compositor's list
09:49 emersion: yes
09:49 Company: ... or with Vulkan, depending on where I want to use it
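A minimal sketch of the list-building step discussed above, assuming the GL side has already been queried with eglQueryDmaBufModifiersEXT() and the consumer (compositor or Vulkan) has reported its own format/modifier list. The function name pick_shared_modifiers, the dictionary layout, and every modifier value other than 0 (DRM_FORMAT_MOD_LINEAR) are made up for illustration; only the filter-then-intersect logic comes from the conversation.

```python
# Hypothetical helper: the data layout is an assumption, not a GTK or Mesa API.
DRM_FORMAT_MOD_LINEAR = 0  # real value; the other modifiers below are invented


def pick_shared_modifiers(gl_modifiers, peer_modifiers):
    """Return {format: [modifier, ...]} usable for GL rendering and by the peer.

    gl_modifiers:   {format: [(modifier, external_only), ...]}, in GL's order,
                    as one might collect from eglQueryDmaBufModifiersEXT().
    peer_modifiers: {format: set of modifiers} advertised by the consumer.
    """
    shared = {}
    for fmt, entries in gl_modifiers.items():
        peer = peer_modifiers.get(fmt, set())
        # external_only modifiers can be sampled but not rendered to: skip them.
        usable = [mod for mod, external_only in entries
                  if not external_only and mod in peer]
        if usable:
            shared[fmt] = usable
    return shared


# Example data (invented): GL can render XR24 with modifiers 0x1 and LINEAR,
# and only sample from 0x2; the peer accepts LINEAR and 0x2.
gl_side = {
    "XR24": [(0x1, False), (DRM_FORMAT_MOD_LINEAR, False), (0x2, True)],
    "AR24": [(0x2, True)],
}
peer_side = {"XR24": {DRM_FORMAT_MOD_LINEAR, 0x2}, "AR24": {0x2}}
candidates = pick_shared_modifiers(gl_side, peer_side)
```

The resulting list is what would be handed to the allocator (GBM, for instance), which then picks one modifier from it. A format with no common renderable modifier is dropped rather than guessed at.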
09:50 any1: It's funny that this text says that "it must query the media API it intends to use" but the most common media API doesn't support this kind of querying. :)
09:51 Company: any1: that's really proof that the docs are correct
09:51 Company: because if anything with dmabufs looks straightforward, you would need to be very suspicious
09:53 emersion: any1: 😥
09:54 Company: so, step 1: create an internal gtk_allocate_dmabuf (width, height, list_of_allowed_formats_and_modifiers);
09:54 any1: With VA-API, you need to either throw a buffer at it and pray that it accepts it or allocate the buffers via VA-API.
09:54 emersion: Company: also note that you lose dmabuf feedback handling when you do things yourself
09:54 Company: step 2: figure out how to get the right list
09:54 emersion: any1: oh i was thinking of a different API
09:55 emersion: for VA-API it has some DMA-BUF negotiation in place now
09:55 emersion: not quite complete
09:55 any1: emersion: Ahh, so there is hope for it
09:56 any1: what API were you thinking of?
09:56 emersion: my answer now is: just use vulkan video and bury VA-API
09:56 Company: emersion: feedback handling? You mean the zwp_linux_dmabuf protocol feedback?
09:56 emersion: i think V4L2 still doesn't have modifiers
09:56 emersion: Company: yeah
09:57 Company: emersion: I'm not really concerned with that yet, because for now I just want to pass GL rendered buffers to Vulkan
09:57 Company: emersion: but we'll probably grow our own feedback handling anyway, so we can make GStreamer etc hook into that
10:05 Company: once everything works well, we'll ask the compositor for the preferred formats so that gstreamer can make pipewire tell the compositor those formats
10:05 Company: and then screencasting will be real smooth
10:06 karolherbst: ohh right, I was wondering why screencasting is so slow actually.. so that's basically it? pointless format conversions/copies going on?
10:28 kode54: Yes, just use Vulkan video
10:29 kode54: I think that finally supports h264 on AMD now?
10:29 kode54: Not sure it supports anything else
10:29 kode54: Or encoding
10:31 kode54: And Intel is a special hell because they want to ship media encoding outside of Mesa
10:36 airlied: anv has h264 decode
10:36 airlied: and some h265 as well
10:50 kode54: Too bad nobody wants to make encode work fully on that extreme minority outlier, the Arc Alchemist
10:50 kode54: Alas, every Alchemist owner should just upgrade to Battlemage the instant it launches
10:51 kode54: I’m sure everything will be better this time around
10:53 Company: if hwdecode was just about the hardware
10:53 Company: and drivers
10:53 Company: but the multimedia framework APIs all have their issues, too
10:56 airlied: kode54: encoding is written
10:57 airlied: i even had some of av1 decode working last year
11:02 kode54: I meant the media firmware loading
11:02 kode54: Apparently dg2 needs it for things like bitrate control
11:03 kode54: And dg2 is somehow the only generation of hardware to use a particular form of loading method
11:05 kode54: Unless someone wants to add it to xe, or someone wants to add vm bind to i915
11:05 kode54: But I don’t think even vm bind fixes all of anv’s performance problems
11:06 kode54: Can’t tell how far it’s come now, I sold out
11:06 kode54: Sold my A770 to some eager person for full MSRP and bought a 6700 XT
11:47 CounterPillow: Thank god that after vdpau, vaapi, qsv, nvdec, mmal (lol), rkmpp (2xlol), v4l2-m2m and v4l2-requests we now finally got it right this time for sure with vulkan video
11:53 pq: forgot openmax I think
11:55 kode54: Handbrake still only supporting proprietary video codec libraries
11:56 kode54: even though their entire codebase is just a fork of ffmpeg
11:56 CounterPillow: Question: does a device that only implements Vulkan video, e.g. a hwdec IP on an embedded SoC that is completely separate from the 3D GPU and display engine aside from sharing the same memory, need to implement the entirety of Vulkan in some cursed way as well, or can it just chill as a stub device that *only* implements base Vulkan boilerplate stuff and vulkan video?
12:07 daniels: CounterPillow: the latter
12:07 CounterPillow: Okay, that's good news
12:15 Company: CounterPillow: the magic is https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkQueueFlagBits.html
12:15 Company: which defines what a command queue can do
13:29 danylo: Hi, how could I get access to traces at https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db-private/ ? Also, are these traces the same ones used in https://ci-dashboard.steamos.cloud/ ? I'd like to have more traces for us to test Turnip on.
13:30 zmike: DavidHeidelberg: ^
13:37 daniels: danylo: I've added you now, and also no
13:39 danylo: daniels Thanks
13:40 daniels: np
14:47 pcercuei: Quick (offtopic) question, https://elixir.bootlin.com/linux/latest/source/drivers/gpio/gpiolib-acpi.c#L144
14:48 pcercuei: Is the coding style now to use __free(), even if that means declaring the variables mid-block?
16:44 demarchi: jani: can you check this? https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/30
18:13 demarchi: daniels: https://lore.kernel.org/all/CAHk-=wi9XK=TQ7tk6+2ymx8Upm6r36vY6wF9REpt1sho2ySteg@mail.gmail.com/ - see the complaint about fdo mangling emails
18:14 demarchi: daniels: any idea if it's indeed fdo? I wonder if it's because of the mailman misfeature of tweaking the Cc headers
18:15 demarchi: if you cc 2 mailing lists, lore.kernel.org will end with 2 different emails (as far as Cc header is concerned), with the same msgid
19:32 robclark: alyssa: re: same draw, multiple queries, freedreno just does the same thing an IMR would do: from cmdstream, capture the counter value at start and end of query, and then math (the only one we punt on is time-elapsed queries that are read back from shader ... since that would require doing division and division is hard)
19:38 alyssa: robclark: interesting, and that.. works even with gmem?
19:39 alyssa: (It doesn't work on either Mali or Imaginapple)
19:41 robclark: with the usual caveats about ordering of rendering, yes
19:42 robclark: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/freedreno/a6xx/fd6_query.cc?ref_type=heads#L125 we just do `result += stop - start` where the query is stopped for each tile
19:42 robclark: by the end of all tiles you have the final result
19:43 robclark: qbo stuff that reads back from shader ends up flushing a tile pass, IIRC
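A toy model of the `result += stop - start` scheme described above, as a rough sketch only. It assumes a single free-running sample counter that each active query snapshots at its begin and end points within every tile, so one draw can count towards several overlapping queries. The function run_tiles, its arguments, and the sample numbers are hypothetical and do not mirror the freedreno code.

```python
def run_tiles(tiles, queries):
    """Accumulate per-query results across tiles from counter snapshots.

    tiles:   list of tiles, each a list of samples-passed per draw in that tile
             (every tile replays the same draws, so the lists have equal length).
    queries: {name: (first_draw, last_draw)}, inclusive draw index range.
    """
    counter = 0  # free-running counter, never reset between tiles
    results = {name: 0 for name in queries}
    for tile in tiles:
        starts = {}
        for draw_index, samples in enumerate(tile):
            # Query begin: snapshot the counter before the first draw it covers.
            for name, (first, _last) in queries.items():
                if draw_index == first:
                    starts[name] = counter
            counter += samples
            # Query end: snapshot after the last covered draw and accumulate.
            for name, (_first, last) in queries.items():
                if draw_index == last:
                    results[name] += counter - starts[name]
    return results


# Two tiles, three draws each; draw 1 falls inside both queries.
example = run_tiles(
    tiles=[[3, 0, 2], [1, 4, 0]],
    queries={"a": (0, 1), "b": (1, 2)},
)
# example == {"a": 8, "b": 6}
```

Because each query only ever takes differences of the shared counter, overlapping queries do not interfere with one another, and summing the per-tile differences gives the whole-frame result once the last tile finishes.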
19:59 alyssa: huh, neat
19:59 alyssa: can adreno process multiple tiles in parallel?
20:15 austriancoder: can anyone name an app/benchmark that makes extensive use of scissored clears?
20:54 robclark: alyssa: so, ignoring pipelining, it is one tile at a time.. but the tile dimensions are somewhat arbitrary and only limited by gmem size
20:56 robclark: (where gmem size can be something like maybe 256KB on smallest gpu, and up to 4MB on a690)
21:17 alyssa: fascinating
21:17 alyssa: this is all. very different from what I'm used to
21:21 robclark: it's as if, ATI took an IMR and added tiling to it :-P
21:30 alyssa: >:P
22:06 mareko: robclark: I'm confused... is adreno just an IMR where you do tiling in the driver, or something else?
22:07 mareko: binning I mean
22:08 mareko: there is also an AMD mobile GPU, FYI
22:21 robclark: mareko: you can kinda think of it like that