10:38mohamexiety[d]: HdkR: https://videocardz.com/newz/nvidia-dgx-spark-features-6144-cuda-cores-just-as-many-as-rtx-5070 hm this is actually a big difference compared to Thor (which allegedly is 2560 CUDA cores/20SM from a NV GTC presentation... but somehow has twice the [AI] flops of Spark)
14:31karolherbst[d]: mhenning[d]: your MR seems to get rid of pointless movs a lot better
14:31karolherbst[d]: 925 -> 861 instructions
14:32karolherbst[d]: though I think the main issue is that the shaders are burning regs like no tomorrow
14:32karolherbst[d]: 125 regs...
14:33karolherbst[d]: https://gist.githubusercontent.com/karolherbst/891fad3dff6f83cc613a05548e43a877/raw/3af66c29fa57c208f4eb3c84fadfc146a30dc6ab/gistfile1.txt is one of those shaders in case you have good ideas
14:34karolherbst[d]: the last block is... not very optimal I'd say
14:35karolherbst[d]: also need to figure out what to do with those hundreds of prmt
14:39karolherbst[d]: mhh... there is a bit of weird stuff going on...
14:45karolherbst[d]: I think we might split/merge fp16 values too often somewhere, will have to check it out tomorrow
16:50mhenning[d]: Yeah, the prmt I think is an issue with how we convert nir to nak - we often zero out the top bits of 16-bit values where it isn't really necessary
16:52mhenning[d]: for the movs at the end, I'd need to look at the shader more closely but it's possible we fail to use the register bias for some reason
16:52mhenning[d]: or maybe even they're from cssa conversion?
16:53mhenning[d]: would need to examine the shader in more detail
17:04gfxstrand[d]: Okay, I need to stop getting nerdsniped by alignments and finish reviewing the MR.
17:06gfxstrand[d]: Okay, I'll leave it for a bit. I asked for align_mul and offset to be swapped and that's gonna need super careful reading.
17:12gfxstrand[d]: gfxstrand[d]: Now I remember why I didn't do this. It's ridiculously hard to come up with a mental model that's consistent. 🤯
17:41mhenning[d]: karolherbst[d]: oh, also https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33306 can help with register pressure in some cases you could try that out and see if it helps
17:47karolherbst[d]: mhenning[d]: causes MMU faults
17:47karolherbst[d]: didn't even reduce register usage
18:09mhenning[d]: oh, that's not good
18:12gfxstrand[d]: snowycoder[d]: Do you have a Kepler A? Or are you relying on me for testing and debug? I don't care much either way. I'm just wondering.
18:33HdkR: mohamexiety[d]: Should be a good platform for most people.
19:26gfxstrand[d]: Wow. There is so much nonsense in these surface instructions that just isn't necessary. :silvy_sweat:
19:57snowycoder[d]: gfxstrand[d]: I only have a 710 (Kepler B) sorry.
19:57gfxstrand[d]: No worries
20:00snowycoder[d]: gfxstrand[d]: You mean Kepler surface ops? Because I'm still not sure about how the lower 9 bits work
20:11gfxstrand[d]: The thing I was specifically referring to is signed vs. unsigned clamping. There isn't really a difference. A negative number is just a way-too-big unsigned number.
20:11gfxstrand[d]: Maybe it matters for 24 and 16-bit things, I guess.
20:11gfxstrand[d]: If you want the bound to be more than 24 or 16 bits
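[Editor's sketch, not from the log] The point about signed vs. unsigned can be illustrated with a plain range check. This is a minimal sketch that assumes 32-bit coordinates and a bound no larger than i32::MAX; the function names are hypothetical and nothing here is taken from NAK or the hardware's surface instructions.

```rust
/// Signed range check: is `x` inside [0, bound)?
/// Assumes `bound` fits in the positive range of i32.
pub fn in_bounds_signed(x: i32, bound: u32) -> bool {
    x >= 0 && (x as u32) < bound
}

/// Unsigned range check on the same bits.
/// A negative `x` reinterprets as a value >= 2^31, which is already
/// past any bound that fits in i32, so no separate `x >= 0` test is needed.
pub fn in_bounds_unsigned(x: i32, bound: u32) -> bool {
    (x as u32) < bound
}

/// Returns true when both checks agree for the given input.
pub fn checks_agree(x: i32, bound: u32) -> bool {
    in_bounds_signed(x, bound) == in_bounds_unsigned(x, bound)
}
```

Under that assumption the two checks agree for every input. Whether the same holds for the narrower 24-bit and 16-bit cases mentioned above depends on how the value is extended to full width, which this sketch does not model.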
20:14gfxstrand[d]: Okay, I think that's enough for now. My brain is mush and I'm about to the end of my train.
20:14gfxstrand[d]: REALLY good work, snowycoder[d].
20:40snowycoder[d]: gfxstrand[d]: Thank you :3
20:43snowycoder[d]: gfxstrand[d]: Yes that is weird, but it also does a lot of weird bitcasts.
20:43snowycoder[d]: Codegen and the new compiler always use unsigned.
21:42mangodev[d]: does anyone know if steam works in hardware accel mode yet? or does it still result in a black screen
21:50mangodev[d]: i'm scared to turn the setting on because last time i did so i had to go through a really lengthy process to reverse the setting (thanks, valve)
22:19mhenning[d]: Steam still gives a black screen on nvk+zink. It can be temporarily worked around by setting LIBGL_KOPPER_DISABLE=true for steam
22:19mhenning[d]: Issue report is here https://gitlab.freedesktop.org/mesa/mesa/-/issues/11901