00:00fdobridge: <gfxstrand> I think that one works. I know the first release with ESO works, at least on my Turing.
00:02fdobridge: <redsheep> I'm thinking about these again and I realize all 3 would be a big boon to nouveau, which is nice
00:02fdobridge: <redsheep> I assume mesa compute would help make dlss possible?
00:05fdobridge: <redsheep> Is common raytracing the next major task? That seems like it might even be worthwhile before going terribly far down the general performance rabbit hole, since you'll probably end up ripping up quite a lot of compiler stuff to accommodate the weird things RT does to gpus
00:06fdobridge: <redsheep> It's also most of the rest of the feature work afaict
00:09fdobridge: <redsheep> I'd really love to help work on it any way I can, learning about GPU raytracing after turing was announced was kind of the moment that got me interested in how gpus work
00:10fdobridge: <gfxstrand> I don't have a plan for what's next. That's not really the way my brain works. I've got a bunch of things in the cooker and one of these days it'll suddenly be The Right Time and one of them will suddenly materialize as code.
00:13fdobridge: <phomes_> I wanted to use nvdump to see what NV sets in the SPH for the post_depth_coverage tests and whether it differs from ours
00:16fdobridge: <phomes_> I posted an update on what I have been experimenting with for that in the MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29194#note_2436180
00:16fdobridge: <gfxstrand> Throw me a GLSL and I'll run it after a bit
00:19fdobridge: <phomes_> `#version 450
00:19fdobridge: <phomes_>
00:19fdobridge: <phomes_> #extension GL_ARB_post_depth_coverage : require
00:19fdobridge: <phomes_> layout(early_fragment_tests) in;
00:19fdobridge: <phomes_> layout(post_depth_coverage) in;
00:19fdobridge: <phomes_> layout(location = 0) in vec4 vtxColor;
00:19fdobridge: <phomes_> layout(location = 0) out vec4 fragColor;
00:19fdobridge: <phomes_> void main (void)
00:19fdobridge: <phomes_> {
00:19fdobridge: <phomes_> const int coveredSamples = bitCount(gl_SampleMaskIn[0]);
00:19fdobridge: <phomes_> fragColor = vtxColor * (1.0 / 4 * coveredSamples);
00:19fdobridge: <phomes_> }`
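For context, the shader scales the vertex color by the fraction of covered samples (the `1.0 / 4` factor implies a 4-sample framebuffer):

$$\mathit{fragColor} = \mathit{vtxColor} \cdot \frac{\mathrm{bitCount}(\mathtt{gl\_SampleMaskIn}[0])}{4}$$

With `post_depth_coverage` plus `early_fragment_tests`, `gl_SampleMaskIn` contains only the samples that passed the depth test, so any difference in how the hardware populates the mask shows up directly as a wrong output color.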
01:08fdobridge: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1246269161005518879/pdc.nvdis?ex=665bc61a&is=665a749a&hm=773174c9d85934cfeb97fc18293464687d2cd1e33b90c7e521735db23054fa78&
07:43fdobridge: <dadschoorse> that kind of sounds like rdna3+
07:43fdobridge: <dadschoorse> hw scoreboard ensures correctness
07:44fdobridge: <dadschoorse> compiler needs to insert dependency information to not stall on instruction latency
07:45fdobridge: <dadschoorse> except when it doesn't ofc, there are quite a few hazards around that, and some of them are definitely not intended
07:47fdobridge: <dadschoorse> rdna1 and rdna2 are the same in this regard (but rdna1 had some scoreboard bugs that needed compiler intervention)
07:47fdobridge: <dadschoorse> gcn was "everything is 4 cycles"
07:48fdobridge: <dadschoorse> because the simd unit was smaller than the subgroup size, so running the same instruction 4 times already hides all latency
07:49fdobridge: <dadschoorse> makes occupancy an issue on big gpus though, because you need a lot of invocations to fill all simd units
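To put numbers on the GCN design above: a 64-wide wavefront on a 16-lane SIMD takes

$$64 / 16 = 4 \text{ cycles}$$

to issue each instruction, which matches the 4-cycle VALU latency, so back-to-back dependent instructions never stall. The occupancy cost follows from the same shape: with the standard GCN figures of 4 SIMDs per CU and up to 10 waves per SIMD, a 64-CU part needs up to $64 \times 4 \times 10 \times 64 = 163{,}840$ invocations in flight to be fully occupied.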
07:51fdobridge: <redsheep> Yeah they were stuck at 64 CUs for a weirdly long time, and I expect it was exactly that issue. Bigger chips probably just didn't help.
07:54fdobridge: <dadschoorse> I think it was partially that issue, partially an inadequate memory and cache system
07:55fdobridge: <dadschoorse> and I think rdna also improved throughput of the geometry engine a lot, so that probably also helps applications like games
07:55fdobridge: <redsheep> Clearly nvidia has found the right formula to make really big gpus work. Sure, you get fewer frames per unit of peak theoretical compute than AMD (particularly on RDNA 2), but they are getting so much more compute per area and per watt that it doesn't matter at all, and all of that works great when you aren't gaming.
07:56fdobridge: <redsheep> I understand the above is a grossly simplistic view, but it is actually a model for how to look at a gpu's "efficiency" that I have found is more useful than you might expect
07:56fdobridge: <marysaka> Note for later: We should probably reverse the new format from after 535.xx but I know that the pipeline cache one is still the same so we could probably add support for that in the tool
09:34fdobridge: <rhed0x> I vote for the common RT code 🙃
09:43fdobridge: <dadschoorse> I hope gallium2 isn't named gallium2 tho 🐸
09:43fdobridge: <karolherbst> it will be called gallium71
09:43fdobridge: <!DodoNVK (she) 🇱🇹> gallium727
09:53HdkR: Gallium64
10:10HdkR: Bridge said goodbye and hello
10:11karolherbst: it doesn't work..
10:12karolherbst: ohh wait
10:12DodoGTA: I wonder if this is a quick fix
10:12karolherbst: yeah.. I forgot to update the channel id
10:13karolherbst: mhh
10:14DodoGTA: Hello from IRC
10:16karolherbst[d]: nah
10:16karolherbst[d]: just the channel was for registered nicks only 🥲
10:19karolherbst: oh well
10:20karolherbst[d]: anyway, it should work now
10:20mohamexiety[d]: rhed0x: yeah. lets get the shiny lights party started
10:20marysaka[d]: nice
10:20karolherbst[d]: if spam becomes an issue, I might have to add an IRC bot voicing the puppets
10:21karolherbst[d]: the one issue is that the puppets can't be PMed, but I'm not quite sure how much of an issue that is
10:21karolherbst[d]: and also if oftc gets upset at some point for having too many IPv6 connections out of the same prefix 🙃
11:50pixelcluster[d]: I mean there already is quite a bit of common rt code in the works
12:18dadschoorse[d]: the question is if that is even usable for nv
13:45cwabbott: gfxstrand: by common rt, hopefully you're not thinking about reinventing https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28446
13:46cwabbott: from the quick look I had it seemed the nv compressed internal node/triangle node encoding was quite similar to qualcomm's fwiw, there was some "convergent evolution" going on
15:00gfxstrand[d]: Nah. We may want to move towards CLC at some point but my goal isn't to reinvent. I certainly don't want to duplicate all the sorting algorithms.
15:01gfxstrand[d]: But also I don't have a solid plan for what "common ray tracing" means. Just a "yeah, there's a bunch of stuff in that area"
15:17cwabbott: CLC would be nicer but I held off on trying to use CLC because all the radv stuff is written in GLSL and also we have no solid plan for a CLC to SPIR-V stack that's actually maintained and not a pile of hacks
15:18cwabbott: spirv-llvm-translator was explicitly rejected by LLVM upstream and so afaik is more in the "pile of hacks" category, I'm not sure if the SPIR-V backend is going to be maintained
15:20cwabbott: what's "common" right now is all the stuff before the HW encoding phase, so actually building the tree is 100% common code
15:22cwabbott: there's a common "IR" and all you have to do is provide a function to encode the tree
15:23cwabbott: there is a lot of commonality and copy-pasting in the encoding kernels, though it's still different enough that it would be hard to make something common for that
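As a rough illustration of the split being described: the tree build is common code, and each driver supplies an encoding kernel that walks the common IR and writes hardware nodes. A hypothetical sketch of such a kernel in GLSL, where every layout and name is invented for illustration rather than taken from the actual mesa code:

```glsl
#version 460

// Invented layout for a driver-agnostic BVH IR node; the real
// structures in mesa's common code differ.
struct IRBoxNode {
    vec4 aabb_min;     // w unused, vec4 keeps the std430 layout simple
    vec4 aabb_max;
    uint children[2];
    uint flags;
    uint pad;
};

layout(std430, binding = 0) readonly buffer CommonBVH { IRBoxNode ir_nodes[]; };
layout(std430, binding = 1) writeonly buffer HWBVH { uint hw_words[]; };
layout(push_constant) uniform PC { uint node_count; };

layout(local_size_x = 64) in;

void main()
{
    uint i = gl_GlobalInvocationID.x;
    if (i >= node_count)
        return;

    IRBoxNode n = ir_nodes[i];

    // The driver-specific part: pack the node into whatever the
    // hardware traversal unit expects. 8 words per node is made up.
    uint base = i * 8u;
    hw_words[base + 0u] = floatBitsToUint(n.aabb_min.x);
    hw_words[base + 1u] = floatBitsToUint(n.aabb_min.y);
    hw_words[base + 2u] = floatBitsToUint(n.aabb_min.z);
    hw_words[base + 3u] = floatBitsToUint(n.aabb_max.x);
    hw_words[base + 4u] = floatBitsToUint(n.aabb_max.y);
    hw_words[base + 5u] = floatBitsToUint(n.aabb_max.z);
    hw_words[base + 6u] = n.children[0];
    hw_words[base + 7u] = n.children[1];
}
```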
15:27cwabbott: to quantify it a bit more, the common MR is +9311 lines and subtracting that the turnip MR is +4664 lines (including ray query lowering and a bunch of random stuff because qualcomm decided to be annoying and add a fuse to disable RT that has to be read by the kernel and passed to userspace 🤮)
15:47karolherbst: cwabbott: rejected is quite the strong term here, people just agreed from the start that the best solution would be a proper target and until then the translator should simply remain out of tree
15:48karolherbst: unless there was a more recent discussion I've missed
15:50gfxstrand[d]: I think the bigger problem is that the whole thing is pretty laissez faire. Year after year they keep talking about how much better it's gotten while it still can't pass validation.
15:50karolherbst: yeah...
15:51karolherbst: though I get the feeling that people start to care more and more
15:51gfxstrand[d]: I think they do but it's slow going
15:51karolherbst: half my DPCPP invalid spirv bugs are still open :')
15:51karolherbst: the most annoying one: sysvals are still sometimes in the global address space
15:52karolherbst: and mesa totally chokes on that
15:52karolherbst: ohh, and this one: https://github.com/intel/llvm/issues/11531
15:52gfxstrand[d]: That said, I do think we're going to see CLC -> SPIR-V become a hard dependency in Mesa at some point.
15:52karolherbst: I wonder if we want to make bool one byte for CL....
15:52karolherbst: gfxstrand[d]: it already is
15:52gfxstrand[d]: I've heard it suggested that we just write a C compiler. 😅
15:53karolherbst: yeah......
15:53gfxstrand[d]: Not for all drivers
15:53karolherbst: right
15:53karolherbst: but for iris
15:53karolherbst: and anv, which means, it's practically required
15:53karolherbst: nobody is shipping mesa without those
15:53gfxstrand[d]: Yeah, it's required for ANV and Asahi so all the distros have to deal with it.
15:53karolherbst: (well.. except some vendored binaries, like for .. the pi4+ I guess?)
15:54karolherbst: but people want to see rusticl on those, soooo...
15:54karolherbst: (what have I done)
15:54gfxstrand[d]: Really, what we want is to write our kernels in Rust...
15:55karolherbst: yeah...
15:55karolherbst: we have somebody looking into that for zig, but what we really need is somebody to do the same for rust
15:55karolherbst: I can't imagine it to be impossible to do it via proc-macros and stuff
15:55karolherbst: but... it's going to be a lot of work
15:55karolherbst: I wonder if the OpenCL WG could sponsor somebody to work on it...
15:56karolherbst: maybe I should bring it up
15:56gfxstrand[d]: https://github.com/EmbarkStudios/rust-gpu
15:56karolherbst: mhhh
15:56karolherbst: maybe the OpenCL WG should sponsor that project
15:57karolherbst: last updated 4 months ago
15:57gfxstrand[d]: 🤷🏻♀️
15:57karolherbst: anyway, I'll bring it up and see if the WG is interested, then just find the person who wants to work on it
15:57gfxstrand[d]: The CL working group throwing $30K/year at it isn't going to do much
15:57karolherbst: better than nothing
15:58karolherbst: though I think the WG would be more interested
15:58karolherbst: or at least there is interest, and I think the WG would tackle it in a more serious manner if it agrees
16:30cwabbott: karolherbst: from what I remember there was an attempt to upstream spirv-llvm-translator and it was rejected with "this is the wrong approach, we will never support this"
16:31cwabbott: which is why llvm feels free to break it with stuff like untyped pointers with no attempt to make sure it works before making the switch
16:32cwabbott: they explicitly said that khronos should do an llvm backend instead
16:33cwabbott: so our options are something that can break at any time due to some llvm decision or something that's afaik half-complete with no funding to continue it
16:33snektron[d]: Zig mentioned
16:34snektron[d]: I did ask eddy if they were interested in compute for rust-gpu, but it was not on the agenda at that time. I don't think they were against it either, and at least for vulkan, I don't think it would be too hard either
16:39gfxstrand[d]: The LLVM backend is progressing
16:40snektron[d]: Maybe through that it would be easier to support rust on opencl 🤔
16:44cwabbott: ah, I guess intel is still working on the LLVM backend for compute
16:45cwabbott: it was the graphics/hlsl side which was done by google and that's obviously in question now
16:51djdeath3483[d]: cwabbott: I think we might have to check in translated spirv files for anv/iris, tired of telling people how to install llvm packages...
16:52cwabbott: with the spir-v backend, it should be possible to "just" call clang when building mesa and get a binary, right?
16:53djdeath3483[d]: Also some C-like language is useful if you want to double-dereference a pointer. Can't do that in glsl, right?
16:53cwabbott: you can have arbitrary pointers in GLSL but the syntax is ass
16:53djdeath3483[d]: cwabbott: Never tried it
16:54gfxstrand[d]: The problem with Intel working on the LLVM backend is that from their perspective SPIR-V is an ephemeral thing that exists to tie two instances of LLVM together.
16:54cwabbott: we have magic macros to make it slightly less ass
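For reference, the raw syntax goes through GL_EXT_buffer_reference (backed by VK_KHR_buffer_device_address at runtime). A minimal sketch of a double dereference, with all type names invented; the `.` operator does each dereference implicitly, which is presumably the part the macros wrap:

```glsl
#version 450
#extension GL_EXT_buffer_reference : require

// A "pointer to float" and a "pointer to pointer to float",
// spelled as buffer_reference block types.
layout(buffer_reference, std430) buffer FloatRef { float v; };
layout(buffer_reference, std430) buffer FloatRefRef { FloatRef inner; };

layout(set = 0, binding = 0, std430) buffer Root {
    FloatRefRef pp;
    float out_val;
};

layout(local_size_x = 1) in;

void main()
{
    // Two dereferences: pp -> FloatRef -> float.
    out_val = pp.inner.v;
}
```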
16:57gfxstrand[d]: Yes, with the SPIR-V backend, it's just a clang target
16:57gfxstrand[d]: Which is really nice
16:58gfxstrand[d]: I think that's the ideal way to do it if we want to build stuff into Mesa. Then we can parse and NIRify at run time
17:02cwabbott: yeah, I don't want to impose building llvm onto people with turnip, given all the random users trying to do stuff on android
17:03cwabbott: nvk is presumably the same logic, with random users trying to do stuff on linux instead
17:04cwabbott: I just have no idea who's going to do the work to make upstream llvm-spirv usable for mesa/NIR
17:04karolherbst: yeah.. that's the question
17:05cwabbott: until then I'll have to suffer with GLSL and nir builder
17:05karolherbst: I still have untyped pointers and function linking on my todo list, but uhh... this is the part of the stack I like working on the least
17:05snektron[d]: Wait so what's the problem now with clc and spirv-llvm-translator?
17:06i509vcb[d]: I mean at some point isn't the answer to just vendor llvm?
17:06cwabbott: karolherbst: hmm, does the spirv llvm backend have a problem with untyped pointers?
17:06karolherbst: cwabbott: no, but you may end up with incompatible function signatures when linking
17:07karolherbst: and the spirv spec says function args have to match
17:07karolherbst: like.. even the pointer types have to
17:07karolherbst: but if the translator hits a foreign function, LLVM won't tell it what pointer types the function arguments are
17:08karolherbst: there is some workaround to deduce that from how it's used, but that's not always working
17:08cwabbott: fun times
17:08cwabbott: I guess the closest analogue is function mangling in c++
17:08karolherbst: yeah...
17:08karolherbst: my hope is that a spirv backend can get that information somehow, but I think it's also broken the same way
17:09cwabbott: maybe you just have to mangle the names in clang and use that to deduce the type when producing spir-v?
17:09karolherbst: maybe?
17:09cwabbott: or something, dunno
17:09cwabbott: llvm problems...
17:09karolherbst: the annoying part is, that it doesn't matter in C or C++
17:09karolherbst: because the linker doesn't actually care
17:10karolherbst: and I doubt any other language cares
17:10cwabbott: guess you gotta raise the issue in llvm and get the popcorn out
17:10cwabbott: khronos vs. llvm, FIGHT!
17:10karolherbst: I think they don't care, but I might have to
17:12karolherbst: the thing is, it's atm the only issue left which prevents me from filing more conformance submissions with llvm-17+
17:13karolherbst: mesa vendoring LLVM because of this would be some statement tho
17:16cwabbott: given that khronos is committed to supporting SPIR-V somehow in OpenCL and the spir-v llvm backend is the only llvm-sanctioned thing that compiles CLC to SPIR-V, and it's broken, I'd assume that khronos has to care
17:17cwabbott: you could come up with some RFC for how to handle it, get it ripped to shreds and then people will care and tell you what to do
17:18cwabbott: that one's a classic
18:01esdrastarsis[d]: esdrastarsis: I think I figured out why NVK wasn't working on Wayland
18:02asdqueerfromeu[d]: esdrastarsis[d]: Why?
18:03esdrastarsis[d]: It's because mesa 24.1 NVK needs mohammed's kernel patch which is in 6.10
18:05esdrastarsis[d]: I tested it on 6.9.2 and got a segfault, and on 6.10-rc1 it's working, so that must be it
18:08esdrastarsis[d]: gamescope is working btw
18:28mohamexiety[d]: that's weird though; NVK checks if the kernel change is done or not, and if it's not detected it defaults to the old behavior, so it should've still worked o.o
18:28gfxstrand[d]: Yeah, all our behavior changes should be based on a kernel check
18:44gfxstrand[d]: It's possible there's other fixes in 6.10, though.
20:24airlied[d]: There is spirv untyped ptr ext
20:24airlied[d]: Not sure where it is, maybe still wip
20:25airlied[d]: Also the translator is technically a better approach than a backend, but llvm upstream didn't want to lock things down internally at that level to take the translator on board
20:33dadschoorse[d]: why is it the better approach?
20:34airlied[d]: SPIRV is closer to LLVM IR than traditional assembly. So you lose information in the backend infrastructure that you have to sometimes recreate
20:36airlied[d]: Like you don't really need instruction selection so things are a bit more inefficient
20:40magic_rb[d]: Has plasma 6 wayland been tested on nvk? Does it work? What kernel/mesa do i need (very new, or 6.9 with month old mesa good enough)
20:46karolherbst[d]: airlied[d]: ohh, so what an LLVM target gets is even lower than the LLVM IR stuff? uhh.. pain
20:47karolherbst[d]: airlied[d]: I still think this is all very silly to add an extension mostly to work around LLVM decisions 😢
20:48karolherbst[d]: though I'm sure the extension has other benefits
20:49karolherbst[d]: I think I'll still vendor the spirv linker and just hack around this issue, however... uhhh.. this is all very annoying
20:59airlied[d]: A backend has to look like a backend, so has to have isel and possibly some sort of RA
21:00airlied[d]: Stuff that is kinda pointless for spirv
21:37airlied[d]: They should just relax linking to be saner
21:58karolherbst: airlied[d]: yeah.. so my idea was to add an option to spirv-link to simply ignore it when the signature doesn't match for function parameters and cast the pointers to whatever we need. But like the spirv-tools have yet another IR and the current linker doesn't really use it, so it's all kinda painful to add
21:59airlied[d]: Yeah an option seems like the easiest answer
22:00karolherbst: airlied[d]: in case you have some time to figure it all out: https://github.com/KhronosGroup/SPIRV-Tools/pull/5534
22:00karolherbst: I kinda want to focus on rpi4+ support 🙃
22:00karolherbst: but I fear that spirv-link part needs a bigger rewrite
22:01karolherbst: and to use the pass manager thing they have going in other places or so
22:10karolherbst[d]: wait what, windows doesn't allow file names to be `aux`?
22:12airlied[d]: Yeah no aux no con
22:12karolherbst[d]: yeah.. there is a broken patch about that on the ML
23:14redsheep[d]: magic_rb[d]: I have been testing it, yes. If you mean on the normal GL driver, yes that works as well as it ever has. If you mean on zink, no it doesn't work at all and I'm still trying to figure out why. Even with the gfxstrand nvk branch of the kernel it fails to load the session, just a black screen. Oddly overview works
23:16redsheep[d]: As for the mesa side, main has all the relevant patches I know of
23:27rinlovesyou[d]: Didn't i just read that it works in 6.10?
23:27rinlovesyou[d]: esdrastarsis[d]: .
23:28rinlovesyou[d]: Or is this saying that nvk in general doesn't work on Wayland? Because last time i tried that wasn't the case
23:29redsheep[d]: Slightly different issues. That was certain kernels not being able to have NVK apps work in any kind of Wayland session, whereas the one we were referring to is specifically the plasma Wayland session being completely broken
23:30rinlovesyou[d]: Lovely
23:31rinlovesyou[d]: I'm still waiting on plasma 6.1 until i start using Wayland so all my testing has been on x11
23:31redsheep[d]: Yeah the zink+NVK plasma x11 session is decently usable now
23:31rinlovesyou[d]: Yee
23:32redsheep[d]: Though I've had to resort to shutting off zink for discord and chromium, they have corruption. And I'm still seeing sometimes really aggressive flicker from what looks like sync issues, but it's really hard to replicate
23:32rinlovesyou[d]: Is Wayland sessions not working a zink or nvk thing?
23:33redsheep[d]: When it happens it's really obvious and bad, but you can go an hour or two without it
23:33rinlovesyou[d]: redsheep[d]: I haven't been able to replicate either of those whenever i go to test
23:33redsheep[d]: rinlovesyou[d]: Both? You're only using NVK for the session if you use zink
23:34rinlovesyou[d]: Right but theoretically i could run a zink session with Nvidia prop drivers
23:34redsheep[d]: Unless you use the vulkan renderer in which case some of it is vulkan, and also broken
23:34rinlovesyou[d]: If i did that, would that work? Because if not it's a zink issue
23:34rinlovesyou[d]: If it does work it's a nvk issue
23:35rinlovesyou[d]: That's what i was asking
23:36redsheep[d]: rinlovesyou[d]: Put the system under load, it should happen. Or maybe it's ada or dual monitors specific? Dunno
23:36rinlovesyou[d]: redsheep[d]: Idk i have discord on my second monitor at all times and it never showed any corruption while i was testing games either
23:37redsheep[d]: It takes time for the corruption to build up and you have to interact with it during the load
23:38rinlovesyou[d]: I'll try to cause it next time i test
23:39redsheep[d]: rinlovesyou[d]: I'm pretty sure it is an NVK issue but it's hard to say for sure. I don't know if zink sessions on Nvidia prop work, I haven't tried
23:40rinlovesyou[d]: Will try that too
23:41redsheep[d]: What are you doing to enable zink on the session?
23:41rinlovesyou[d]: Well ok for nvk I just set NOUVEAU_USE_ZINK in the plasma env
23:43redsheep[d]: Hmm I have been putting it in /etc/environment. Wonder if it makes a difference. Are you using the discord Flatpak? I wonder if that might keep the zink environment variable from reaching discord
23:43rinlovesyou[d]: Idk it reaches Minecraft at least
23:43rinlovesyou[d]: And any other opengl app
23:44rinlovesyou[d]: Then again doesn't chromium use vulkan anyways?
23:58redsheep[d]: Clearly it does something to it to have it use zink, so at least partly no
23:58redsheep[d]: If you're using a Flatpak though maybe try native