02:06airlied: jenatali: how does d3d12 and cl on it handle non-default floating point modes?
02:07jenatali: airlied: You mean the pragma which switches the rounding mode for non-conversion ALU ops?
02:07jenatali: That's considered deprecated IIRC
02:08airlied: I've been looking into the llvm constrained intrinsics, but I'm not sure if they are the best solution
02:09airlied: though not sure it's affecting CL tests so much, but I'm seeing vulkan fails on 16-bit rtz
02:09jenatali: Yeah CL doesn't support wholly changing floating point rounding modes anymore
02:11HdkR: I know some people that want per op rounding mode support in GL/Vulkan. So there is at least one user of it :P
02:11airlied: the big fun with llvm is you have to use constrained everywhere if you use it anywhere
02:12jenatali: Oof
02:12jenatali: DXIL's based on an old enough LLVM that it doesn't have those constrained intrinsics :(
02:16airlied: I should probably confirm my bug is definitely in this area before rewriting all the intrinsics :-P
02:17jenatali: Probably :P
02:18airlied: all 32/64->16 rtz conversions fail
02:19jenatali: Oh, only conversions?
02:19airlied: <Text>Error: found unmatched 32-bit and 16-bit floats: 947977984 vs 1031 1032</Text>
02:19airlied: well 16-bit storage ext only does conversions
02:20airlied: I suppose I could enable the other ext and see what falls out
02:20jenatali: How does lavapipe implement the nir intrinsics for the fp16 conversions with explicit rounding?
02:20jenatali: Er, they're not even intrinsics, alu ops
02:21airlied: at the moment it just calls llvm fptrunc; for explicit rounding like f2f16_rtz I've started to look at the constrained intrinsics
02:21jenatali: Ah, yeah that'd do it
02:21jenatali: In the CLOn12 stack I've got NIR lowering for those, but you probably want the constrained intrinsics instead of that
02:23airlied: jenatali: ah the half_rounded stuff?
02:24jenatali: airlied: If you want it, look up float_to_half_impl
02:24airlied: yeah that might be easier than porting everything to constrained
02:24jenatali: Oh, yeah, half_rounded is the helper above that
02:24jenatali: Feel free to move it to core nir if it's useful to you
02:52airlied: jenatali: yeah ripping off that code passes the tests at least
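[Editor's sketch] The rtz failure discussed above can be illustrated with a small bit-level conversion. This is a minimal sketch, not the float_to_half_impl helper from the CLOn12 stack and not lavapipe's code; the function name f32_bits_to_f16_bits_rtz is hypothetical. It assumes the numbers in the quoted error are raw bit patterns (a 32-bit float 947977984 and 16-bit results 1031 and 1032). Under that reading, truncating toward zero gives 1031, while a round-to-nearest-even conversion gives 1032.

```python
import struct


def f32_bits_to_f16_bits_rtz(bits: int) -> int:
    """Convert a float32 bit pattern to a float16 bit pattern,
    rounding toward zero (hypothetical helper for illustration)."""
    sign = (bits >> 16) & 0x8000
    exp = (bits >> 23) & 0xFF
    mant = bits & 0x7FFFFF

    if exp == 0xFF:
        if mant == 0:
            return sign | 0x7C00                # infinity stays infinity
        return sign | 0x7E00 | (mant >> 13)     # NaN, forced quiet

    e = exp - 127 + 15                          # rebias exponent for half
    if e >= 0x1F:
        # Finite overflow: round-toward-zero clamps to the largest
        # finite half instead of producing infinity.
        return sign | 0x7BFF
    if e <= 0:
        # Result is a half subnormal or zero.
        if e < -10:
            return sign                         # magnitude truncates to zero
        mant |= 0x800000                        # restore the implicit 1 bit
        return sign | (mant >> (14 - e))        # drop low bits (truncate)
    # Normal case: keep the top 10 mantissa bits, discard the rest.
    return sign | (e << 10) | (mant >> 13)


def f32_bits_to_f16_bits_rtne(bits: int) -> int:
    """Reference round-to-nearest-even conversion via the struct 'e' format."""
    value = struct.unpack("<f", struct.pack("<I", bits))[0]
    return struct.unpack("<H", struct.pack("<e", value))[0]


if __name__ == "__main__":
    src = 947977984                             # bit pattern from the quoted error
    print(f32_bits_to_f16_bits_rtz(src), f32_bits_to_f16_bits_rtne(src))
```

The two results differ by one unit in the last place whenever the discarded low 13 mantissa bits are more than half of a half-precision ulp, which is the kind of mismatch a plain fptrunc with default rounding would produce against an rtz expectation.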
21:17zmike: mareko: I did get it working btw, the difference in nir output is amazing
21:18zmike: any reason you restricted it to single component 32bit loads or just testing/profiling?