02:08fdobridge: <mhenning> I think kepler is pretty similar.
02:09fdobridge: <mhenning> codegen uses the same lowering across gens for the AL2P thing https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#L3383
02:10fdobridge: <mhenning> Most notable difference I'm aware of is that before Volta we have actual indirects for fragment inputs
16:26fdobridge: <gfxstrand> PSA: Just force-pushed nak/main. It now includes a getparam change for working on the latest kernels as well as all my tessellation work. I'm running CTS now and will post results once I get them but I expect them to be 2 tests improved from the ones I just posted
16:26fdobridge: <gfxstrand> @marysaka Sorry if some of that makes hash of the GS stuff. I totally reworked attribute I/O, adding a whole NIR lowering pass.
16:30fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I wonder if it's time to run `NVK_USE_NAK=all` tests
16:33fdobridge: <gfxstrand> Nope. @marysaka needs to finish GS.
16:33fdobridge: <gfxstrand> But that's all that's left before we can run with `=all`
16:34fdobridge: <rhed0x> ~~what about raygen, anyhit and closest hit?~~
16:35fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Ekstrand already said it's a different beast
16:36fdobridge: <gfxstrand> I think that's why they had a line through it. 😉
16:39fdobridge: <marysaka> No worries! I will do a proper rebase tomorrow
16:44fdobridge: <marysaka> normally it should be mostly ready, I had 40 failures but they were related to divergence issues, tho I haven't done a full CTS run as my system always locks up after an hour or two 😅
16:45fdobridge: <rhed0x> i am curious though how nvidias RT implementation is different than how it works on AMD
16:46fdobridge: <gfxstrand> Very different
16:47fdobridge: <gfxstrand> I'm pretty sure NVIDIA has hardware for BVH traversal as well as a HW scheduler.
16:47fdobridge: <gfxstrand> AMD is "Here's two instructions. Go build it in compute shaders!"
16:52fdobridge: <gfxstrand> There are going to be a lot of similarities, though.
16:52fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> So it's basically software ray-tracing?
16:52fdobridge: <pixelcluster> no
16:53fdobridge: <pixelcluster> I'd still guess there's a lot of compute stack work involved though, not just for bvh building (edited)
16:53fdobridge: <pixelcluster> where probably common mesa stuff could provide benefits
16:55fdobridge: <gfxstrand> No. It's sort of the minimal added bit of hardware to let you implement the rest efficiently. Also, just because they don't have things like a HW scheduler doesn't mean it's as slow as you'd get if you tried to implement it in what you can do through GLSL. There's a lot of compiler tricks you can play to emulate the HW scheduler pretty efficiently in a SW solution if said SW solution is targeted directly to one particular hardware.
16:55fdobridge: <pixelcluster> yeah I'm trying my way around fancy sw scheduling atm
16:56fdobridge: <pixelcluster> hopefully I'll get some of it into the xdc talk but it's probably too early to say anything meaningful other than "looking into it, kinda hard ngl" (edited)
16:57fdobridge: <gfxstrand> We should chat about it at XDC. I'm curious to know what all you're thinking there. I've thought about it some but I've also always been on hardware that has a HW scheduler so it's never gone beyond the brain prototype stage.
16:58fdobridge: <pixelcluster> definitely
16:59fdobridge: <gfxstrand> If we want to do efficient RT on turnip or Mali, we're probably looking at similar solutions.
17:00fdobridge: <gfxstrand> Wait... Did I just say "efficient RT on Mali"? 🤡 (edited)
17:00fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Will NVK survive this with NAK?: https://youtu.be/x_d54SejEHA?t=185
17:00fdobridge: <gfxstrand> IDK
17:00fdobridge: <gfxstrand> Only one way to find out!
17:01cwabbott: Qualcomm doesn't have a hardware scheduler, but I'm going to guess they will in a future HW generation
17:01fdobridge: <rhed0x> is it bad?
17:02cwabbott: they're moving in that direction, removing things that would cause problems with a bindless compute shader thing
17:03fdobridge: <gfxstrand> RT is very bandwidth-intensive. If you can find me a Mali attached to real RAM, it might be okay... Most of them are attached to garbage.
17:03fdobridge: <pixelcluster> so far I think I at least managed to get the compiler to swallow the basic concepts of repacking (you call a magic intrinsic, and suddenly lanes you thought dead come back to life and look for more work 😱)
17:03fdobridge: <pixelcluster> at least it fits a halloween theme
17:04fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Does NAK resolve `ERROR: modifiers currently not supported on nir_alu_src` limitation of codegen? :ferris:
17:04fdobridge: <gfxstrand> Spooky!
17:04fdobridge: <gfxstrand> Yes
17:04fdobridge: <gfxstrand> Why are we using modifiers at all?!?
17:04fdobridge: <karolherbst🐧🦀> I think that's the question here
17:05fdobridge: <karolherbst🐧🦀> somehow they end up in the nir
17:05fdobridge: <gfxstrand> WTH?
17:05fdobridge: <gfxstrand> That sounds unpossible
17:05fdobridge: <karolherbst🐧🦀> yeah, same
17:05fdobridge: <gfxstrand> https://tenor.com/view/inconceivable-princessbride-vizzini-gif-4835840
17:05fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Only DOOM 2016 does (for some mysterious rip-and-tear functionality probably /s)
17:06fdobridge: <karolherbst🐧🦀> probably some weird corner case "nothing should hit"
17:06fdobridge: <karolherbst🐧🦀> but yeah.. having modifiers there sounds sketchy
17:06fdobridge: <karolherbst🐧🦀> mind running it with gdb and `nir_print_shader(nir, stdout)` the nir?
17:07fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I deleted DOOM a long time ago because of disk space constraints
17:07fdobridge: <karolherbst🐧🦀> ehh wait, that's with wine? pain
17:07fdobridge: <karolherbst🐧🦀> just add it to the code 🙃
17:07fdobridge: <karolherbst🐧🦀> ahh
17:07fdobridge: <pixelcluster> you can gdb with wine too
17:07fdobridge: <karolherbst🐧🦀> pain
17:07fdobridge: <karolherbst🐧🦀> is what it is
17:07fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> gdb somewhat works on Wine but it requires some patches for better functionality
17:07fdobridge: <pixelcluster> it's a bit more involved but definitely possible (and not that bad)
17:07fdobridge: <gfxstrand> I can't find that error string in Mesa anywhere
17:08fdobridge: <gfxstrand> https://gitlab.freedesktop.org/gfxstrand/mesa/-/commit/df8ad952c03399b0b93ec9098e5289044bfe9ba6
17:08fdobridge: <pixelcluster> you simply gdb attach and give it the wine64 binary as the symbol file
17:08fdobridge: <gfxstrand> I can't find that error string anywhere in Mesa.
17:09fdobridge: <karolherbst🐧🦀> ohh, I think we nuked that part?
17:09fdobridge: <gfxstrand> Yeah. I suspect you should just try again. We torched modifiers from NIR entirely.
17:10fdobridge: <pixelcluster> for example, if doom were running as "DOOM2016.exe" under proton experimental, you'd have the gdb invocation look something like this
17:10fdobridge: <pixelcluster> `gdb -p $(pgrep DOOM2016.exe) "$HOME/.local/share/Steam/steamapps/Proton - Experimental/files/bin/wine64"` (edited)
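[Editor's sketch, not from the chat] The attach step above boils down to "find the PID, then hand gdb the wine64 binary as the symbol file". A minimal sketch, assuming a POSIX shell; `wine_gdb_cmd` is a hypothetical helper name, and it only prints the command so you can check the quoting before running it:

```shell
#!/bin/sh
# Sketch only: build the gdb attach command for a game running under Wine.
# wine_gdb_cmd is a hypothetical helper, not part of gdb, Wine, or Proton.
wine_gdb_cmd() {
    pid="$1"      # PID of the game process, e.g. from pgrep
    wine64="$2"   # path to the wine64 binary used as the symbol file
    # Quote the path: "Proton - Experimental" contains spaces.
    printf 'gdb -p %s "%s"\n' "$pid" "$wine64"
}

# Example (prints the command instead of running it):
# wine_gdb_cmd "$(pgrep -n DOOM2016.exe)" "$HOME/.local/share/Steam/steamapps/Proton - Experimental/files/bin/wine64"
```

`pgrep -n` is used in the example because a plain `pgrep` may print several PIDs, and `gdb -p` takes exactly one. The path is the one quoted in the chat and may differ on your install.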
17:10fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I tested DOOM a long time ago (back in March)
17:10fdobridge: <pixelcluster> I seem to need to run gdb as root for some reason
17:10fdobridge: <pixelcluster> but that's about it
17:10fdobridge: <karolherbst🐧🦀> I'm not interested in a solution, I just want to rant about how painful it is to debug applications with wine 🙃 😄
17:10fdobridge: <pixelcluster> I seem to need to run gdb as root for some reason, otherwise it'll error on opening some .so-s (edited)
17:11fdobridge: <pixelcluster> yes it's still pain as heck
17:11fdobridge: <karolherbst🐧🦀> I actually have all the scripts in place to do it
17:11fdobridge: <karolherbst🐧🦀> it's just pain
17:11fdobridge: <pixelcluster> hehe crash at startup go brrrrrrrrr
17:11fdobridge: <karolherbst🐧🦀> it physically hurts me every time I have to debug it
17:14fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> So I should probably do a more destructive cleanup or get a massive storage pool
18:27fdobridge: <mohamexiety> @marysaka sorry for the ping but do you use the GSP while developing? or are you on normal firmware
18:27fdobridge: <mohamexiety> was curious if it's now usable with no issues
18:28fdobridge: <phomes> I think that I have a working pipeline shader cache now. I will test some more and then clean it up and submit a MR tomorrow
18:29fdobridge: <marysaka> Was on normal firmware initially but now on GSP
18:29fdobridge: <marysaka> my setup is extra unstable because it's eGPU based 😅
18:29fdobridge: <mohamexiety> ooo. ic. that's really good then. thanks!
18:30fdobridge: <marysaka> Maybe I should try to run CTS tests on my laptop to see how it behaves
18:31fdobridge: <mohamexiety> my setup is a bit cursed as well. the main GPU refuses to work with the normal firmware so I used a little 1030 for development (+ old mesa version without Ampere support as a base) but that doesn't work at all since the old uAPI path was removed (edited)
18:33fdobridge: <mohamexiety> how's GSP installation now? which kernel do I need?
18:47fdobridge: <airlied> @marysaka Ah you are probably hitting the GSP kills everything lockup
18:47fdobridge: <airlied> See if dmesg mentions VMM allocations
18:48fdobridge: <marysaka> yeah I got that
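[Editor's sketch, not from the chat] Checking the kernel log for the lockup signature mentioned above could look roughly like this. `find_vmm_lines` is a hypothetical helper, and the case-insensitive `vmm` pattern is an assumption, since the chat does not quote the exact message text:

```shell
#!/bin/sh
# Sketch only: pick out kernel-log lines that mention VMM.
# find_vmm_lines is a hypothetical helper; the "vmm" pattern is a guess at
# what the relevant nouveau messages contain.
find_vmm_lines() {
    # Reads log text on stdin, so it works on live dmesg or a saved log.
    grep -i 'vmm'
}

# Example: sudo dmesg | find_vmm_lines
```

Reading from stdin keeps it usable on a log saved before the machine locked up, not just on a live `dmesg`.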
21:53fdobridge: <butterflies> note that GSP proper crashdumps are now available too