00:01 mhenning[d]: loryruta[d]: Maybe try denorm preserve?
00:01 sonicadvance1[d]: If you want bit-accurate results, you're going to at some point need to reconcile spec language like `The maximum error is less than or equal to 1.5 * 2^-12 times the true reciprocal.`
00:04 dwfreed: "float is a hot mess" full stop :)
00:16 loryruta[d]: sonicadvance1[d]: yes, but how come _everything_ else i tested behaves roughly the same (roughly, as there's still a bit of numerical instability)
00:17 loryruta[d]: and nvidia on desktop behaves that differently!?
00:17 loryruta[d]: 🤷‍♂️
00:17 loryruta[d]: float is a hot mess but they're making it messier
00:20 sonicadvance1[d]: It's a reason why online games that need to sync state between PCs all use fixed-point math (Except when they don't and we get to laugh when it breaks)
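A minimal sketch of the fixed-point idea mentioned above, assuming a Q16.16 layout. The format and the helper names (`to_fixed`, `fixed_mul`, `to_float`) are illustrative and not taken from any particular game engine. Since every step is integer arithmetic, the result does not depend on how a compiler or GPU rounds, fuses, or reorders floating-point operations.

```python
FRAC_BITS = 16          # assumed Q16.16: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16 (only done once, at the edges of the simulation)."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values.

    The raw product carries 32 fractional bits, so shift back down to 16.
    Python's >> rounds toward negative infinity for negative values.
    """
    return (a * b) >> FRAC_BITS


def to_float(a: int) -> float:
    """Convert back to float, for display only."""
    return a / ONE


# One integration step: pos += vel * dt, entirely in integers.
pos = to_fixed(1.5)
vel = to_fixed(0.25)
dt = to_fixed(0.5)
pos = pos + fixed_mul(vel, dt)
```

Every peer that runs the same integer operations in the same order ends up with the same `pos`, bit for bit.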
00:45 karolherbst[d]: sonicadvance1[d]: well CL defines accuracy in terms of ULPs for the builtins at least
00:46 sonicadvance1[d]: I definitely enjoy ULPs over..whatever the heck the x86 manuals do
00:47 karolherbst[d]: anyway, I can only recommend verifying each calculation and checking where the error is significantly different, and whether it's within API limits or not
00:49 karolherbst[d]: those kinds of inaccuracies are generally a result of more aggressive optimizations, and nvidia is pretty aggressive in that area
09:32 loryruta[d]: karolherbst[d]: i’m wondering if they can be disabled though
09:32 loryruta[d]: if not through the vulkan api, by compiling the spv to ptx and sending it to the driver
10:04 karolherbst[d]: yeah, they have private options for that
10:05 karolherbst[d]: loryruta[d]: https://registry.khronos.org/OpenCL/extensions/nv/cl_nv_compiler_options.txt
10:06 karolherbst[d]: well on the cl side that is
11:26 phomes_[d]: gfxstrand[d]: I am not sure how to answer that. The swapchain image format is VK_FORMAT_B8G8R8A8_UNORM. Is that the right thing to check?
11:46 pac85[d]: loryruta[d]: Well it's up to you to write algos such that rounding errors don't accumulate. You can model the error and its propagation and rearrange ops to avoid them. Unless you mean that optimizations make that impossible.
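One well-known instance of rearranging ops so that rounding errors do not accumulate is Kahan compensated summation. It is used here only as an illustration and is not necessarily what pac85 had in mind. The sketch compares it against a naive running sum, with `math.fsum` as the reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation; rounding error grows with length."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan compensated summation.

    `comp` tracks the low-order bits lost in each addition and feeds
    them back into the next one.
    """
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp            # apply the correction from the previous step
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)   # correctly rounded sum of the stored doubles
naive_err = abs(naive_sum(values) - reference)
kahan_err = abs(kahan_sum(values) - reference)
```

The compensation term is algebraically zero, so a compiler that is allowed to reassociate floating-point math can legally optimize it away. That is one concrete way aggressive optimizations can undo this kind of careful ordering.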
12:49 karolherbst[d]: con 32 %137 = iadd %136 (0x7f), %135
12:49 karolherbst[d]: con 32 %139 = ushr %137, %138 (0x7)
12:49 karolherbst[d]: It's kinda impressive how many things I'm finding that could be optimized more with range analysis/uub
12:49 karolherbst[d]: ehh wait.. that would be illegal to simplify..