02:08 fdobridge: <m​henning> I think kepler is pretty similar.
02:09 fdobridge: <m​henning> codegen uses the same lowering across gens for the AL2P thing https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#L3383
02:10 fdobridge: <m​henning> Most notable difference I'm aware of is that before Volta we have actual indirects for fragment inputs
16:26 fdobridge: <g​fxstrand> PSA: Just force-pushed nak/main. It now includes a getparam change for working on the latest kernels as well as all my tessellation work. I'm running CTS now and will post results once I get them but I expect them to be 2 tests improved from the ones I just posted
16:26 fdobridge: <g​fxstrand> @marysaka Sorry if some of that makes hash of the GS stuff. I totally reworked attribute I/O, adding a whole NIR lowering pass.
16:30 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> I wonder if it's time to run `NVK_USE_NAK=all` tests
16:33 fdobridge: <g​fxstrand> Nope. @marysaka needs to finish GS.
16:33 fdobridge: <g​fxstrand> But that's all that's left before we can run with `=all`
16:34 fdobridge: <r​hed0x> ~~what about raygen, anyhit and closest hit?~~
16:35 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Ekstrand already said it's a different beast
16:36 fdobridge: <g​fxstrand> I think that's why they had a line through it. 😉
16:39 fdobridge: <m​arysaka> No worries! I will do a proper rebase tomorrow
16:44 fdobridge: <m​arysaka> normally it should be mostly ready, I had 40 fails but they were related to divergence issues, tho I haven't done a full CTS run as my system always locks up after an hour or two 😅
16:45 fdobridge: <r​hed0x> i am curious though how nvidias RT implementation is different than how it works on AMD
16:46 fdobridge: <g​fxstrand> Very different
16:47 fdobridge: <g​fxstrand> I'm pretty sure NVIDIA has hardware for BVH traversal as well as a HW scheduler.
16:47 fdobridge: <g​fxstrand> AMD is "Here's two instructions. Go build it in compute shaders!"
16:52 fdobridge: <g​fxstrand> There are going to be a lot of similarities, though.
16:52 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> So it's basically software ray-tracing?
16:52 fdobridge: <p​ixelcluster> no
16:53 fdobridge: <p​ixelcluster> I'd still guess there's a lot of compute stack work involved though, not just for bvh building (edited)
16:53 fdobridge: <p​ixelcluster> where probably common mesa stuff could provide benefits
16:55 fdobridge: <g​fxstrand> No. It's sort of the minimal added bit of hardware to let you implement the rest efficiently. Also, just because they don't have things like a HW scheduler doesn't mean it's as slow as you'd get if you tried to implement it in what you can do through GLSL. There's a lot of compiler tricks you can play to emulate the HW scheduler pretty efficiently in a SW solution if said SW solution is targeted directly to one particular hardware.
16:55 fdobridge: <p​ixelcluster> yeah I'm trying my way around fancy sw scheduling atm
16:56 fdobridge: <p​ixelcluster> hopefully I'll get some of it into the xdc talk but it's probably too early to say anything meaningful other than "looking into it, kinda hard ngl" (edited)
16:57 fdobridge: <g​fxstrand> We should chat about it at XDC. I'm curious to know what all you're thinking there. I've thought about it some but I've also always been on hardware that has a HW scheduler so it's never gone beyond the brain prototype stage.
16:58 fdobridge: <p​ixelcluster> definitely
16:59 fdobridge: <g​fxstrand> If we want to do efficient RT on turnip or Mali, we're probably looking at similar solutions.
17:00 fdobridge: <g​fxstrand> Wait... Did I just say "efficient RT on Mali"? 🤡 (edited)
17:00 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Will NVK survive this with NAK?: https://youtu.be/x_d54SejEHA?t=185
17:00 fdobridge: <g​fxstrand> IDK
17:00 fdobridge: <g​fxstrand> Only one way to find out!
17:01 cwabbott: Qualcomm doesn't have a hardware scheduler, but I'm going to guess they will in a future HW generation
17:01 fdobridge: <r​hed0x> is it bad?
17:02 cwabbott: they're moving in that direction, removing things that would cause problems with a bindless compute shader thing
17:03 fdobridge: <g​fxstrand> RT is very bandwidth-intensive. If you can find me a Mali attached to real RAM, it might be okay... Most of them are attached to garbage.
17:03 fdobridge: <p​ixelcluster> so far I think I at least managed to get the compiler to swallow the basic concepts of repacking (you call a magic intrinsic, and suddenly lanes you thought dead come back to life and look for more work 😱)
17:03 fdobridge: <p​ixelcluster> at least it fits a halloween theme
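A toy illustration of the repacking idea described above, as a minimal Python sketch. It is not how RADV or any real compiler implements it: the function names, the work queue, and the "steps" cost model are all assumptions made for illustration. It only models the part where lanes that have finished their work "come back to life and look for more work" instead of idling until the slowest lane in the wave is done.

```python
from collections import deque


def steps_without_repacking(work, wave_size):
    """Each wave takes a fixed batch and runs until its slowest lane finishes."""
    total = 0
    for i in range(0, len(work), wave_size):
        total += max(work[i:i + wave_size])
    return total


def steps_with_repacking(work, wave_size):
    """One persistent wave: a lane that runs out of work pulls the next item."""
    queue = deque(work)
    lanes = [0] * wave_size  # remaining steps per lane; 0 means the lane is idle
    steps = 0
    while True:
        # Hypothetical "repack" point: idle lanes look for more work.
        for i in range(wave_size):
            while lanes[i] == 0 and queue:
                lanes[i] = queue.popleft()
        if not any(lanes):
            break  # every lane is idle and the queue is empty
        # One lockstep step: every busy lane makes one unit of progress.
        lanes = [n - 1 if n > 0 else 0 for n in lanes]
        steps += 1
    return steps
```

With `work = [8, 1, 1, 1, 8, 1, 1, 1]` and a wave size of 4, the fixed-batch version needs 16 steps while the repacking version needs 9, because the short items no longer wait on the long ones.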
17:04 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Does NAK resolve `ERROR: modifiers currently not supported on nir_alu_src` limitation of codegen? :ferris:
17:04 fdobridge: <g​fxstrand> Spooky!
17:04 fdobridge: <g​fxstrand> Yes
17:04 fdobridge: <g​fxstrand> Why are we using modifiers at all?!?
17:04 fdobridge: <k​arolherbst🐧🦀> I think that's the question here
17:05 fdobridge: <k​arolherbst🐧🦀> somehow they end up in the nir
17:05 fdobridge: <g​fxstrand> WTH?
17:05 fdobridge: <g​fxstrand> That sounds unpossible
17:05 fdobridge: <k​arolherbst🐧🦀> yeah, same
17:05 fdobridge: <g​fxstrand> https://tenor.com/view/inconceivable-princessbride-vizzini-gif-4835840
17:05 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Only DOOM 2016 does (for some mysterious rip-and-tear functionality probably /s)
17:06 fdobridge: <k​arolherbst🐧🦀> probably some weird corner case "nothing should hit"
17:06 fdobridge: <k​arolherbst🐧🦀> but yeah.. having modifiers there sounds sketchy
17:06 fdobridge: <k​arolherbst🐧🦀> mind running it with gdb and `nir_print_shader(nir, stdout)` the nir?
17:07 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> I deleted DOOM a long time ago because of disk space constraints
17:07 fdobridge: <k​arolherbst🐧🦀> ehh wait, that's with wine? pain
17:07 fdobridge: <k​arolherbst🐧🦀> just add it to the code 🙃
17:07 fdobridge: <k​arolherbst🐧🦀> ahh
17:07 fdobridge: <p​ixelcluster> you can gdb with wine too
17:07 fdobridge: <k​arolherbst🐧🦀> pain
17:07 fdobridge: <k​arolherbst🐧🦀> is what it is
17:07 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> gdb somewhat works on Wine but it requires some patches for better functionality
17:07 fdobridge: <p​ixelcluster> it's a bit more involved but definitely possible (and not that bad)
17:07 fdobridge: <g​fxstrand> I can't find that error string in Mesa anywhere
17:08 fdobridge: <g​fxstrand> https://gitlab.freedesktop.org/gfxstrand/mesa/-/commit/df8ad952c03399b0b93ec9098e5289044bfe9ba6
17:08 fdobridge: <p​ixelcluster> you simply gdb attach and give it the wine64 binary as the symbol file
17:09 fdobridge: <k​arolherbst🐧🦀> ohh, I think we nuked that part?
17:09 fdobridge: <g​fxstrand> Yeah. I suspect you should just try again. We torched modifiers from NIR entirely.
17:10 fdobridge: <p​ixelcluster> for example, if doom were running as "DOOM2016.exe" under proton experimental, you'd have the gdb invocation look something like this
17:10 fdobridge: <p​ixelcluster> `gdb -p $(pgrep DOOM2016.exe) ~/".local/share/Steam/steamapps/Proton - Experimental/files/bin/wine64"` (edited)
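The attach recipe above can also be sketched in Python, which avoids the shell quoting problems: `~` is not expanded inside double quotes, and the Proton path contains spaces. The helper name `gdb_attach_command` is hypothetical and the example paths are placeholders. The argument order simply follows the invocation quoted in the chat.

```python
import os
import shlex


def gdb_attach_command(pid, wine_binary):
    """Build the argv for attaching gdb to a running Wine process.

    pid:         process ID of the game (e.g. found with pgrep)
    wine_binary: path to the wine64 binary used as the symbol file;
                 a leading "~" is expanded here, not by the shell.
    """
    return ["gdb", "-p", str(pid), os.path.expanduser(wine_binary)]


def as_shell_line(argv):
    """Render the argv as one safely quoted shell command line."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run` sidesteps quoting entirely, while `as_shell_line` gives something to paste into a terminal. If `pgrep` returns more than one PID, the right one has to be picked by hand.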
17:10 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> I tested DOOM a long time ago (back in March)
17:10 fdobridge: <p​ixelcluster> I seem to need to run gdb as root for some reason, otherwise it'll error on opening some .so-s (edited)
17:10 fdobridge: <p​ixelcluster> but that's about it
17:10 fdobridge: <k​arolherbst🐧🦀> I'm not interested in a solution, I just want to rant about how painful it is to debug applications with wine 🙃 😄
17:11 fdobridge: <p​ixelcluster> yes it's still pain as heck
17:11 fdobridge: <k​arolherbst🐧🦀> I actually have all the scripts in place to do it
17:11 fdobridge: <k​arolherbst🐧🦀> it's just pain
17:11 fdobridge: <p​ixelcluster> hehe crash at startup go brrrrrrrrr
17:11 fdobridge: <k​arolherbst🐧🦀> it physically hurts me every time I have to debug it
17:14 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> So I should probably do a more destructive cleanup or get a massive storage pool
18:27 fdobridge: <m​ohamexiety> @marysaka sorry for the ping but do you use the GSP while developing? or are you on normal firmware
18:27 fdobridge: <m​ohamexiety> was curious if it's now usable with no issues
18:28 fdobridge: <p​homes> I think that I have a working pipeline shader cache now. I will test some more and then clean it up and submit a MR tomorrow
18:29 fdobridge: <m​arysaka> Was on normal firmware initially but now on GSP
18:29 fdobridge: <m​arysaka> my setup is even more unstable because it's eGPU based 😅
18:29 fdobridge: <m​ohamexiety> ooo. ic. that's really good then. thanks!
18:30 fdobridge: <m​arysaka> Maybe I should try to run CTS tests on my laptop to see how it behaves
18:31 fdobridge: <m​ohamexiety> my setup is a bit cursed as well. the main GPU refuses to work with the normal firmware so I used a little 1030 for development (+ old mesa version without Ampere support as a base) but that doesn't work at all since the old uAPI path was removed (edited)
18:33 fdobridge: <m​ohamexiety> how's GSP installation now? which kernel do I need?
18:47 fdobridge: <a​irlied> @marysaka Ah you are probably hitting the GSP kills everything lockup
18:47 fdobridge: <a​irlied> See if dmesg mentions VMM allocations
18:48 fdobridge: <m​arysaka> yeah I got that
21:53 fdobridge: <b​utterflies> note that GSP proper crashdumps are now available too