00:17 fdobridge: <b​enjaminl> https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/256 took a shot at the carry register file thing
00:24 fdobridge: <b​enjaminl> hmm, just realized I never actually tested what happens when regalloc fails, and didn't explicitly add anything to stop it from trying to spill carry to memory or something...
00:29 fdobridge: <m​henning> @karolherbst ah, you're right. I misread the CUDA docs earlier - on turing+ at least, we hit the same L1 cache as without the CONSTANT flag on LDG
00:36 fdobridge: <m​henning> @gfxstrand The vulkan memory model has a specific rule that shader i/o has to be visible across shader stages, so I'm guessing we don't need to do the same for SSBO access https://registry.khronos.org/vulkan/specs/1.1-extensions/html/vkspec.html#memory-model-shader-io
00:38 fdobridge: <g​fxstrand> I'm not sure. I need to give it a good, hard think. I seem to recall this actually came up in a call recently.
00:39 fdobridge: <m​henning> Fair enough. I'll admit I haven't read the rest of the memory model
00:39 fdobridge: <g​fxstrand> It'll just panic in that case. That's fine.
00:40 fdobridge: <g​fxstrand> Neither have most people. 🤣
00:41 fdobridge: <g​fxstrand> If it's the same L1 cache on Turing+ only without as many coherency guarantees, that may be fine.
01:51 fdobridge: <m​henning> I think that's true, but the CUDA docs are confusing
01:52 fdobridge: <m​henning> anyway, new version removes the controversial part https://gitlab.freedesktop.org/gfxstrand/mesa/-/merge_requests/53
01:52 fdobridge: <m​henning> still haven't run cts though
01:52 fdobridge: <g​fxstrand> Cool. I'll take a look Monday
14:34 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Meson 1.3.0rc1 has some Rust crate bug :ferris:
14:34 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> So you need 1.3.0rc2 for NAK to compile
14:39 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> NAK is already very slow at the PPSSPP main menu (I'll try the mhenning patch soon)
14:39 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1173270934577233990/Screenshot_20231112_163907.png?ex=656358b5&is=6550e3b5&hm=0d780aae60bdf58df679eb3275d51167e9a998b79fd5c9d9026305a1c98b3e7f&
14:46 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Non-GSP NAK beating GSP codegen 🚀
14:46 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1173272588290629642/Screenshot_20231112_164506.png?ex=65635a3f&is=6550e53f&hm=dff752cc5330f71f2c0614dd53486f7cca86c5742809f926e8eb911871420583&
15:18 fdobridge: <a​irlied> @gfxstrand trying to build nak/main, let it use wraps, but getting a bunch of "can't find crate" errors for proc_macro2
15:19 fdobridge: <a​irlied> oh maybe rebasing meson fixed it
15:19 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> @ airlied ^
15:35 fdobridge: <d​wlsalmeida> @gfxstrand finally got rid of encode_alu() on maxwell: https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/257
15:35 fdobridge: <d​wlsalmeida> cc @marysaka (edited)
15:39 fdobridge: <m​arysaka> amazing 🎉
15:47 fdobridge: <g​fxstrand> I'll give the code a read on Monday and merge it.
16:49 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> That's pretty boring compared to mhenning's MR but if it makes the code cleaner then it's fine 🐸
16:52 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> i turned down the resolution to 3x PSP here (it's now playable with NAK; codegen couldn't achieve that)
16:52 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1173304385141485569/Screenshot_20231112_183815.png?ex=656377dc&is=655102dc&hm=6a66caab130946049b44c60b1f82e476ad9505e94eb3f168bc9cad6bcdfe7665&
16:56 fdobridge: <m​arysaka> That's for Maxwell / Pascal.