03:33alyssa: dschuermann_: nir: Use SM5 properties to optimize shift(a@32, iand(31, b))
03:33alyssa:
03:33alyssa: This is a common pattern from HLSL->SPIRV translation
03:33alyssa: and supported in HW by all current NIR backends.
03:33alyssa: So... about that.. we've since added a NIR backend for AGX which doesn't define shifts this way
03:33alyssa: which means Asahi is unsound
03:33alyssa:sweats
03:34alyssa: I don't want to add back the `iand` to every shift in the backend, since it's UB in the GLSL spec, that'd be pointless
03:35alyssa: but I don't want to revert that optimization since it's doing good for every other backend (including Panfrost)
03:36alyssa: options include defining more specialized `ishl_sm5`, etc opcodes that regular ishl gets lowered to... but that requires touching every backend
03:36alyssa: or defining `ishl_nonsm5` that lowers to `ishl`... but that requires touching every frontend
03:37alyssa: or gating that optimization behind an option, but I've NAK'd exactly that before and I'm NAK'ing it from myself here now too :-p
03:37alyssa: or closing my eyes and pretending AGX is less special
03:41alyssa: Apple's compiler leaves the iand in when compiling `1 << (x & 31)`, so it's not like there's a special DirectX mode the instruction has
03:42alyssa: oh, wait, maybe Apple *inserts* the iand for any dynamic shift operand, uh
03:43alyssa: yeah, ok. apple is adding the iand in the backend for shifts. that's... curious.
03:44alyssa: oh, because Metal's shading language deviates from C++ and instead defines shifts as:
03:44alyssa: "The result of E1 << E2 is E1 left-shifted by the log2(N) least significant bits in E2 viewed as an unsigned integer value"
03:45alyssa: and likewise for >>
03:45alyssa: i.e. the NIR/SM5 definition
03:45alyssa: Kinda curious. Why would Apple define their hw to work differently than Metal is defined?
03:50alyssa: shrug. if it's good enough for apple, it's good enough for me
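The Metal wording quoted above (shift by the log2(N) least significant bits of the shift operand) can be modelled in a few lines of Python. This is only an illustrative sketch for 32-bit values: the function names are made up here, and `ishl_unmasked` is a purely hypothetical stand-in for hardware that does not mask the shift amount; the log does not say what AGX actually returns for shifts of 32 or more.

```python
MASK32 = 0xFFFFFFFF

def ishl_sm5(a: int, b: int) -> int:
    """32-bit left shift that only looks at the low 5 bits of b (SM5/Metal style)."""
    return (a << (b & 31)) & MASK32

def ushr_sm5(a: int, b: int) -> int:
    """32-bit logical right shift with the same masking of the shift amount."""
    return (a & MASK32) >> (b & 31)

def ishl_unmasked(a: int, b: int) -> int:
    """Hypothetical model (an assumption, not AGX's documented behaviour):
    shifting by 32 or more simply yields 0."""
    return (a << b) & MASK32 if b < 32 else 0

# Under the SM5 definition the explicit iand is redundant, which is why
# shift(a, iand(31, b)) can be rewritten to shift(a, b).
assert all(ishl_sm5(1, b & 31) == ishl_sm5(1, b) for b in range(256))

# On the hypothetical unmasked model the same rewrite changes the result.
print(ishl_sm5(1, 33), ishl_unmasked(1, 33 & 31), ishl_unmasked(1, 33))
```

The last line shows the soundness problem in miniature: dropping the mask is only safe when the shift instruction itself ignores the high bits of the shift amount.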
05:04jenatali: alyssa: Add a nir option to turn off that optimization for AGX?
05:04jenatali: Oh sorry I missed in your wall of text you don't want to do that :P
05:05jenatali: Still seems like the sanest option to me
13:48jenatali: Linux Hackerman: x512: Neither of your messages made it to IRC
14:20glehmann: alyssa: use nir_unsigned_upper_bound in backend isel to determine if the shift needs masking?
14:20glehmann: and hope that apple fixes it in the next hw gen
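The idea of consulting an upper bound during instruction selection can be sketched with a toy model. The following is not Mesa's `nir_unsigned_upper_bound`; `upper_bound` and `shift_needs_mask` are hypothetical helpers over a made-up tuple expression format, meant only to show the shape of the check: emit the mask only when the shift amount cannot be proven smaller than the bit size.

```python
UINT32_MAX = 0xFFFFFFFF

def upper_bound(expr) -> int:
    """Toy unsigned upper bound.

    expr is an int constant, a str naming an unknown 32-bit value,
    or a tuple ("iand", x, y).
    """
    if isinstance(expr, int):
        return expr & UINT32_MAX
    if isinstance(expr, tuple) and expr[0] == "iand":
        # For unsigned values, x & y can never exceed either operand.
        return min(upper_bound(expr[1]), upper_bound(expr[2]))
    # Anything else is treated as completely unknown.
    return UINT32_MAX

def shift_needs_mask(shift_expr, bit_size: int = 32) -> bool:
    """True if the shift amount might be >= bit_size, so a mask must be emitted."""
    return upper_bound(shift_expr) >= bit_size

print(shift_needs_mask("b"))                # unknown amount: mask required
print(shift_needs_mask(("iand", "b", 31)))  # already bounded: no mask needed
```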
16:06hays: how does vulkan fit in with the rest of mesa architecture
16:10hays: I found this diagram -- https://vulkan.lunarg.com/doc/view/1.3.204.1/mac/LoaderInterfaceArchitecture.html
16:11X512: hays: Vulkan has its own architecture. Mesa provides Vulkan drivers that are loaded by the standard Vulkan-Loader (libvulkan.so).
16:12X512: Vulkan drivers are not required to be provided by Mesa, drivers from various vendors can coexist.
16:12hays: who writes the loader? would that be part of a proprietary package?
16:12X512: https://github.com/KhronosGroup/Vulkan-Loader
16:13hays: ok.. So this implies Vulkan Loader is open and GPU vendor just provides a driver
16:13X512: Vulkan-Loader is open source (Apache License 2.0). It is cross platform and used by many operating systems including Windows.
16:14hays: I got this cryptic email from a GPU/SOC vendor: "the mali vulkan is not released yet and only support drm now(no wayland or x11)"
16:14X512: In Mesa Vulkan drivers are located at src/<vendor>/vulkan.
16:14hays: im trying to make sense of that email --it seems like the vulkan driver has no knowledge of windowing system etc
16:15X512: And generic code used by all Mesa Vulkan drivers in src/vulkan.
16:16X512: Mesa Vulkan drivers have knowledge of windowing system (WSI). But it is also possible to implement windowing system support in separate Vulkan layer module.
16:17hays: i suppose doing so would require looking at the shared library and seeing what function calls were not implemented and then writing that layer
16:17X512: Example of separate Vulkan windowing system layer module: https://gitlab.freedesktop.org/mesa/vulkan-wsi-layer. Seems not used by regular Linux distributions.
16:18hays: oh--cool.
16:19X512: You can write Vulkan WSI layer to add windowing system support to existing driver.
16:20hays: yeah, or maybe it sounds like one already exists and I could try to use it
16:21hays: I am currently testing a hypothesis that the Mali vendor blob wayland-gbm is unnecessary when used with wayland because they cleaned up the interface with libwayland-egl.so
16:22hays: My hope is that, if this holds, when they release vulkan support it can be similarly adapted (maybe via a WSI layer) to use
16:24alyssa: jenatali: The problem there is that the NIR opcode (ishl etc) then has different behaviour/meaning depending on the backend
16:24alyssa: and having an inconsistent NIR is not a good idea
16:25jenatali: Fair
16:27alyssa: glehmann: The problem there is that we still lose information
16:27hays: then of course once v10 comes out we switch over to open drivers :)
16:27alyssa: if the shift came from GLSL or SPIR-V, we may assume wlog that the shift is < 32
16:28alyssa: but just looking at the ishl instruction, range analysis can't assume that because of the sm-5 definition
16:28alyssa: again, Apple is masking every shift, so I may as well do the same.
16:28alyssa: I'm not going to be iand-due-to-nonconstant-shift bound, lol
16:28alyssa: non-constant shifts aren't super common anyway
16:28alyssa: got bigger fish to fry
16:29alyssa: jenatali: I think the right way to do this, if this mattered for a hot path, would be to introduce new `{ishl,ishr,ushr}_khr` NIR opcodes that are undefined for shift >= 32
16:29alyssa: generate _khr versions in GLSL-to-NIR and SPIRV-to-NIR, where we have a spec reference saying it's safe to do so
16:30alyssa: (but don't replace it throughout the tree, maybe there's a common lowering pass somewhere that relies on the existing SM5 semantics of ishl, do we really want to audit the tree due to walking back on a previous opcode definitional guarantee?)
16:31alyssa: replace the SM5 optimizations to be "ishl_khr(1, a & 31) -> ishl(1, a)"
16:32alyssa: have a "has_khr_shifts" compiler option that indicates a backend can accept the _khr versions *in addition* to the regular sm5 versions, which only AGX would set
16:32alyssa: AGX would have a late backend lowering "ishl(a, b) -> ishl_khr(a, b & 31)"
16:32alyssa: and AGX would only handle the _khr versions, while everyone else would only handle the nonsuffixed sm5 versions
16:33alyssa: that being said, that sounds like a big pile of work for something Apple doesn't bother to do, for the purpose of exploiting UB and that's maybe not great
16:34alyssa: so instead I think I'll do the subset: add _khr opcodes, lower to them in the backend, and handle only _khr versions
16:35alyssa: and if someone really wants to get rid of the extra iands they can plumb that through the glsl/spirv translators
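The "subset" plan above (lower every SM5-style shift to a `_khr` shift with an explicit mask, late in the backend) can be sketched as a small tree rewrite. This is a toy model, not Mesa code: the tuple expression format and `lower_shifts` are assumptions made for illustration, and the `_khr` opcode names are the ones proposed in the discussion rather than opcodes known to exist in NIR.

```python
# Proposed opcode names from the discussion; the mapping itself is illustrative.
SM5_TO_KHR = {"ishl": "ishl_khr", "ishr": "ishr_khr", "ushr": "ushr_khr"}

def lower_shifts(expr, bit_size: int = 32):
    """Rewrite op(a, b) -> op_khr(a, iand(b, bit_size - 1)) for every shift.

    expr is an int constant, a str naming a value, or a tuple (op, *sources).
    """
    if not isinstance(expr, tuple):
        return expr
    op, *srcs = expr
    # Lower the sources first so nested shifts are handled too.
    srcs = [lower_shifts(s, bit_size) for s in srcs]
    if op in SM5_TO_KHR:
        value, amount = srcs
        # The _khr shift is undefined for amounts >= bit_size, so mask explicitly.
        return (SM5_TO_KHR[op], value, ("iand", amount, bit_size - 1))
    return (op, *srcs)

print(lower_shifts(("ishl", 1, "x")))
print(lower_shifts(("iadd", ("ushr", "a", "b"), 4)))
```

With this in place a backend would only need to handle the `_khr` forms, at the cost of one extra `iand` per non-constant shift until the translators are taught to emit `_khr` shifts directly.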
18:40hays: I am hoping this is a correct interpretation of Mesa/Linux/Vendor-Blob architecture: https://file.st5ve.com/image.png
18:47alyssa: the annoying thing is that new opcodes => duplicating all the existing algebraic rules we have for ishl :|
19:03X512: hays: Vulkan is fully independent from EGL and GBM.
19:04hays: yeah i thought i drew it that way, but maybe its confusing because the vendor ships it as a big blob
19:05X512: Mesa's Vulkan WSI implementation sends a dma-buf FD over the Wayland protocol.
19:07hays: Ahh--so what I've drawn is impossible if the application is a wayland application. some of the stuff im running might just be without wayland
19:17hays: X512: so if I understand, more like this -- https://file.st5ve.com/image1.png
19:17hays: i mean this client is for some reason doing both EGL and Vulkan at same time it would obv pick one
19:19X512: Maybe the proprietary driver uses EGL as its WSI implementation, I don't know. It is possible to convert VkImage/VkDeviceMemory to EGLImage and display it with EGL mechanisms.
19:29hays: well I think they just blob it together within the same driver even if they are different. the way GLESv2, GBM, and EGL are blobbed together
19:35hays: right now im hearing from rockchip they are not supporting wayland+vulkan just straight drm
19:37hays: but doesn't that mean IF wayland were to implement vulkan-wsi-layer, the system would essentially be wayland enabled?
22:09bb2045: Hi all. I've written a 3D engine. Source code not public atm. With task/mesh shaders, I have encountered NIR validation errors. Could these shader files potentially be of use to mesa's development?