IRC Logs of #dri-devel on irc.freenode.net for 2024-03-28

01:06 DemiMarie: Could the needed driver support for virtio-GPU native contexts be standardized as a Vulkan extension?
01:07 DemiMarie: I’m thinking as a VK_EXT_command_serialization extension.
03:17 jenatali: karolherbst: ack. I've briefly thought about it a long time ago so I'm happy to talk through it if you want (when I'm actually around anyway)
03:18 jenatali: zmike: eric_engestrom: +1
10:05 Ermine: Hello, I'm trying to build mesa main, and I've got this error message: https://paste.sr.ht/~ermine/9ea0dd89b6db21f85fe8049b43da7fca79293ba0
10:12 dj-death: Ermine: could you file an issue with the meson arguments you're using?
10:12 dj-death: Ermine: looks like a meson dependency missing
10:13 mripard: jani: thanks for the DRM_DW_HDMI review, I'll merge it today so we can have plenty of testing time
10:13 jani: mripard: not really detailed review, but I did eyeball it through and didn't see anything obviously wrong
10:28 karolherbst: jenatali: yeah.. I think atm I don't really know _what_ I want this to look like. I've pushed a new version where I added a `spirv_create_prog_var_init_shader` function so we can do something differently there.
10:28 karolherbst: and my initial idea was to simply construct a shader from the constants instead
10:29 karolherbst: the use of the global variables is all done and I have tests passing as long as you don't have a ptrAccessChain in the specconstantop
10:29 karolherbst: the difficult part is just silling that value...
10:29 karolherbst: _but_
10:29 karolherbst: we could also just say: fuck caching it, just pass the address when compiling
10:30 karolherbst: and the buffer's location is just an input to the compilation
10:30 karolherbst: _but_
10:30 karolherbst: that also requires having a layout for the data, because ptrAccessChain...
10:34 karolherbst: so I kinda think building a shader by having a custom `vtn_handle_constant` is probably the way to go...
10:34 karolherbst: and if this shader remains empty because there is no ptrAccessChain you can just throw it away and use the initializer to init the buffer without running a kernel
10:34 karolherbst: and if there are instructions you just run the code
11:09 pq: emersion, sima, did anyone end up officially documenting that 'stride' is well defined also beyond LINEAR modifier formats? I'd like to link to such doc.
11:09 emersion: yes
11:09 emersion: hm
11:10 sima: I think we accidentally buried it in some kernel internal api
11:10 pq: you probably guess what weston issue+MR I'm replying to :-p
11:10 emersion: https://www.kernel.org/doc/html/next/userspace-api/dma-buf-alloc-exchange.html#term-stride
11:10 pq: thanks!
11:10 sima: unfortunately it has this "usually" qualifier in there :-/
11:11 emersion: maybe it's not precise enough
11:11 emersion: yeah…
11:11 sima: I thought we've had a sharper one somewhere
11:11 emersion: same…
11:18 pq: uhh, yeah, it seems to totally miss the wording "divided by tile height" for tile-row to tile-row number of bytes
11:20 sima: drm_format_info_min_pitch is probably the most authoritative source for anything that's block-based
11:20 sima: but it's not documented anywhere I could find it :-/
11:21 sima: so maybe should add that
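For readers following the stride discussion: a minimal sketch of the minimum-pitch rule for block-based formats, written from memory of how the kernel helper drm_format_info_min_pitch behaves (bytes per block times buffer width, divided by block width times block height, rounded up). The Python function name, its parameters, and the sample block sizes are illustrative assumptions, not kernel API.

```python
def min_pitch(buffer_width: int, bytes_per_block: int,
              block_w: int = 1, block_h: int = 1) -> int:
    """Smallest pitch in bytes for one plane of a block-based format.

    Assumption: this mirrors the arithmetic of the kernel helper as the
    editor remembers it; check the kernel source before relying on it.
    """
    if buffer_width < 0 or bytes_per_block <= 0 or block_w <= 0 or block_h <= 0:
        raise ValueError("invalid format description")
    numerator = buffer_width * bytes_per_block
    denominator = block_w * block_h
    # Integer ceiling division, so partial blocks still get counted.
    return (numerator + denominator - 1) // denominator


# Linear 32-bit format: 1x1 "blocks" of 4 bytes each.
linear = min_pitch(1920, 4)

# Hypothetical tiled format: 4x4 pixel blocks of 64 bytes each.
# Dividing by the block height is what makes the result a per-pixel-row
# figure rather than the byte distance between tile rows.
tiled = min_pitch(1920, 64, block_w=4, block_h=4)
```

Under these assumptions both calls give the same pitch, which illustrates the "divided by tile height" wording pq mentions above.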
12:20 karolherbst: cursed: "decl_var global INTERP_MODE_NONE none uint64_t p_var = &a_var"
12:22 karolherbst: though that makes me think...
12:22 karolherbst: do we want to allow deref chains on variable initializers?
12:23 karolherbst: the indices would be all constant
12:24 jenatali: karolherbst: for constants, why do you need an init shader? You're going to allocate a buffer for the globals, why can't you just parse out the data that needs to be uploaded to the buffer from the host side?
12:24 karolherbst: jenatali: because ptrAccessChain is legal as a specconstantop
12:24 jenatali: I guess for pointers they'd need to be relative to the buffer base address but that still seems resolvable on the host side
12:24 karolherbst: how?
12:24 karolherbst: it can do any math on the pointer
12:25 jenatali: Oh, math, I see...
12:25 karolherbst: yeah.. it's the normal specconstantop and you could cast the address to an int and...
12:25 karolherbst: normal pointer math :)
12:25 karolherbst: so yeah.. without that detail it's all trivial and already working
12:26 karolherbst: what I've seen some compilers doing is to just place it at a fixed address
12:26 karolherbst: in the C world I mean..
12:26 karolherbst: I think gcc was like that?
12:27 jenatali: CLOn12 would probably do that honestly, since our pointers are emulated anyway
12:27 karolherbst: but we could also just always allocate a big enough buffer (as there is a CL limit for the max size) and just use the buffer's address..
12:27 karolherbst: but that's wasteful
12:27 jenatali: What's the size got to do with it?
12:27 karolherbst: mhhh.. yeah.. but I can't really add an ABI like that I think? some drivers like iris could handle it, but not sure about zink and others
12:28 karolherbst: jenatali: If I want to know the buffer's address at compile time, I need to allocate the memory before compiling
12:28 karolherbst: that would also solve the issue, just has the drawback of allocating early
12:28 jenatali: Oh I see it's because the spec constant stuff compiles out in nir instead of being preserved
12:28 karolherbst: yeah...
12:29 karolherbst: vtn just resolves the entire constant and you get the final result
12:29 karolherbst: we need to support initializer/finalizer kernels anyway, so the runtime needs to be able to do that anyway
12:30 karolherbst: so I think spilling the chain if there is a pointer isn't the worst idea here...
12:30 jenatali: Could you add a way to not do that for addresses, so you can parse out a "load base address" + offset expression?
12:30 karolherbst: I was considering this, but the expression could become very complex
12:30 jenatali: But yeah putting it in an init shader also makes sense
12:30 karolherbst: at which point you could just have a shader doing it...
12:31 pq: sima, I guess it's your turn to argue about 'stride' in https://gitlab.freedesktop.org/wayland/weston/-/issues/896
12:31 jenatali: True, could do pointer math to get a difference, at which point it's no longer a trivial single offset
12:31 karolherbst: and because init shaders need to be supported anyway there isn't much to win
12:31 jenatali: Ok I'm sold
12:31 karolherbst: writing the code is just annoying :D
12:31 karolherbst: but I think I'm almost there
12:32 jenatali: Cool
12:33 karolherbst: the issue is just that the CTS only tests with a single `= &var;` thing, so I'll need to test with more complex expressions later
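To illustrate the point karolherbst and jenatali converge on: a small sketch of why an initializer that does arbitrary math on an address cannot always be reduced to "base address + offset" on the host, and can instead be evaluated once the buffer address is known (the role the init shader plays). All class and function names here are hypothetical and are not Mesa or SPIR-V API; the expression tree only stands in for a spec-constant op chain.

```python
from dataclasses import dataclass
from typing import Union

MASK64 = (1 << 64) - 1  # model 64-bit pointer arithmetic


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class AddrOf:
    offset: int  # byte offset of the variable inside the globals buffer


@dataclass(frozen=True)
class BinOp:
    op: str  # one of "add", "mul", "xor"
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, AddrOf, BinOp]

OPS = {
    "add": lambda a, b: (a + b) & MASK64,
    "mul": lambda a, b: (a * b) & MASK64,
    "xor": lambda a, b: a ^ b,
}


def needs_base(expr: Expr) -> bool:
    """True if the value depends on where the globals buffer lives."""
    if isinstance(expr, AddrOf):
        return True
    if isinstance(expr, BinOp):
        return needs_base(expr.lhs) or needs_base(expr.rhs)
    return False


def evaluate(expr: Expr, base: int) -> int:
    """Evaluate once the buffer address is known (what an init kernel would do)."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, AddrOf):
        return (base + expr.offset) & MASK64
    return OPS[expr.op](evaluate(expr.lhs, base), evaluate(expr.rhs, base))


# The simple `= &var` case: a plain base + offset.
simple = AddrOf(16)
# Arbitrary math on the address: no single offset describes this.
tricky = BinOp("xor", AddrOf(16), Const(0xFF))
```

In this model, initializers for which needs_base is false could be uploaded directly from the host, while the others have to wait until the buffer is allocated, which matches the "throw the empty shader away, otherwise run it" idea from earlier in the log.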
12:49 sima: pq, typed up something ... I kinda don't see why we need to have an even more random definition of stride
12:49 sima: at least for the upstream stack
12:50 pq: thanks
12:51 sima: "some drivers are still crap at input validation" is really not a good reason
12:51 sima: and there's a very unfortunate design issue why we can't do this check in generic code, drivers can overwrite the format_info stuff
12:51 sima: we'd need to lift a pretty big chunk of the format validation from drivers to drm core to change that
12:52 sima: plus there's the entire "guess the format_info for the implied modifier case"
12:53 DemiMarie: sima: You can probably guess what I think about moving input validation to generic code 🙂. (I like it.)
12:54 sima: we're trying ...
12:55 sima: but with a hundred or so kms drivers every tiny move becomes an entire internship project real fast unfortunately
14:01 alyssa: mareko: Arm's ISA is really sane here?
14:03 eric_engestrom: zmike, jenatali: you're welcome, and thank you for showing your appreciation... but also: what are you saying thanks to? 😅
14:03 eric_engestrom: from the timing of your message, I'm guessing the mesa releases?
14:04 jenatali: That's my assumption what Mike was going for. That's what mine was at least
14:04 eric_engestrom: :)
14:04 eric_engestrom: ❤️
14:05 zmike: mesa releases 👍
14:07 pq: emersion, sima, maybe you were thinking of https://gitlab.freedesktop.org/pq/color-and-hdr/-/blob/main/doc/glossary.md#stride ? That's hardly authoritative for the kernel though. :-)
14:16 sima: pq, yeah I tried to sign up emersion to type up a really crisp one but I guess I failed :-)
14:17 emersion: lol
14:25 sima: anyway I'll disappear for easter ...
14:42 zmike: anyone remember which desktop GL version/ext added linear float filtering?
14:46 DemiMarie: sima: hundred? Wow! Why are there so many more KMS drivers? Bad vendors that won’t open source their userspace and so their DRM driver gets NACKd?
14:46 linkmauve: zmike, GL 3.0, with GL_EXT_framebuffer_sRGB, IIRC.
14:46 zmike: nice, thanks
14:46 linkmauve: Not sure about GLES though.
14:46 zmike: ES has an extension for it
14:48 pinchartl: zmike: sounds like "there's an app for that" :-)
14:48 zmike: yep
14:53 zmike: anyone remember which desktop GL version/ext added BGRA8? or was that always core functionality
14:54 mareko: some GL 1.x
14:54 zmike: that's what I thought...
14:54 zmike: wew these tests.
14:55 mareko: GL 1.2
14:55 pq: DemiMarie, you can see 65 sub-directories under drivers/gpu/drm. Why wouldn't they be separate drivers?
14:56 pq: DemiMarie, successful upstreaming increases the number of upstream KMS drivers.
14:58 DemiMarie: pq: I thought each upstreamed DRM driver had to have a counterpart in Mesa, and that Mesa had far fewer drivers than that.
14:59 pq: KMS drivers don't have matching Mesa drivers at all
14:59 pq: I mean, KMS-only drivers
14:59 DemiMarie: pq: I am trying to understand why the fraction of KMS-only drivers is so high.
15:00 pq: lots of different display chips from lots of vendors?
15:01 pq: not just physical chips, but logical chips that manufacturers can embed in SoC
15:02 mripard: and it's much easier to design as well
15:02 mripard: so vendors will typically use their own display controllers together with an off-the-shelf GPU from a handful of vendors
15:11 mareko: alyssa: I don't really know ;)
15:40 alyssa: mareko: recent mali is legitimately nice hw
15:40 alyssa: it's just universally glued to garbage SoCs with little memory bw
16:55 DemiMarie: Are Google’s Tensor cores garbage?
16:55 DemiMarie: *Tensor SoCs
16:56 dwfreed: DemiMarie: as I understand it, they're mostly Samsung Exynos SoCs plus an NPU core
16:57 DemiMarie: dwfreed: are those any good?
16:57 pac85: I always wondered, in those ISAs that pick the lowest PC as a scalar PC when diverging and masking threads based on vector pc==scalar pc like adreno, what is the advantage compared to the AMD or the AGX approach? On nvidia the HW actually runs branches concurrently but when it's still masking one branch at a time how is the extra complexity justified?
16:57 dwfreed: They're not amazing (certainly no Snapdragon), but they're not awful, either
16:57 dwfreed: I have a Pixel 6 Pro and it works fine
16:58 zmike: mareko: how do I make glthread sync for a function?
16:58 zmike: like GetError does
17:06 zmike: punted to ticket
17:13 alyssa: cwabbott: wow @ your latest MR!
17:13 alyssa: amazing
17:15 HdkR: Congrats! More drivers with ray queries!
17:26 pac85: Amazing work!
17:34 jenatali: \o/
17:34 jenatali: really needs to add that to the DXIL backend at some point
17:41 cwabbott: thanks! although in the end it was just a looot of copying stuff from radv and hacking it until I figured out what it actually did
17:42 glehmann: adreno living up to its name
17:43 mareko: zmike: marshal="sync" in the xml
17:49 zmike: thx
20:27 karolherbst: jenatali: finally getting somewhere: https://gist.githubusercontent.com/karolherbst/48ece00b242619e82eadf95ce26f6486/raw/8dc6dbabf8932dcb73b5fde837412c8abf002970/gistfile1.txt
20:28 jenatali: Very cool
20:28 karolherbst: it's all very hacky, but I'm slowly getting an idea how to implement all of this :D
20:30 karolherbst: mhh.. I also need to figure out how to launch internal kernels, because I don't do anything of that yet...
20:42 karolherbst: I wonder if I want to do a CL meta thing...
20:53 HdkR: karolherbst: You want to do a CL meta thing.
20:53 karolherbst: I actually don't
20:54 karolherbst: though I could restructure things internally a bit to make that trivial to do
20:54 HdkR: It sounds like a good idea to me :)
20:55 karolherbst: what shader language is the vulkan meta thing using anyway?
21:00 HdkR: I must have missed the Vulkan specific meta thing rather than the CLC shenanigans
21:02 karolherbst: I need something for running memcpy shaders, but I'd rather not go through the CL API for that, because that's kinda annoying tbh :D
21:07 HdkR: I thought that was the point of the clc integration? The shaders end up being CL, but the API is just fancy NIR compute?
21:08 HdkR: Sounded like the best case scenario, aside from the whole requiring LLVM bits during compile time :D
21:09 karolherbst: sure.. but I need to run internal nir shaders
21:12 HdkR: Isn't that basically what asahi uses them for?
21:12 karolherbst: no, they have them written in OpenCL C
21:12 karolherbst: which I can't for some use cases
21:12 HdkR: Interesting
21:13 karolherbst: yeah... I need to spill spirv initializers to a shader, soo...
21:14 HdkR: wha
21:14 HdkR: Cursed
21:14 karolherbst: yes...
21:14 karolherbst: so you can have global variables being pointers and you can initialize them with pointers obviously
21:14 karolherbst: and random pointer math
21:15 HdkR: This smells a lot like cuda
21:16 karolherbst: probably that's where the idea was from..
21:17 HdkR: meta cuda compiler time
23:35 karolherbst: I think I found a spirv translator bug :'(