02:56graphitemaster: Not sure where to ask this, but I'm trying to figure out where segmentationMask comes from in NV_shader_thread_shuffle when using the regular shuffleNV shader intrinsic, since there is no mask argument. Even CUDA's __shfl_ intrinsics have an unsigned mask parameter. This value seemingly comes out of thin air.
03:12HdkR: graphitemaster: Looks like it is only defined for the glasm rather than the glsl functions
03:14graphitemaster: HdkR, Yeah. It's just that NV's GLSL compiler somehow turns shuffleNV into SHFIDX instructions in GLASM. Where is it deriving the segmentationMask from? I assume it's just 0xFFFF_FFFF always?
03:16HdkR: Should mean that all active lanes are participating in the shuffle
03:19HdkR: The glasm definition is weird to read compared to the cuda implementation though
03:25graphitemaster: I have a shader with a bunch of shuffleNV and shuffleXorNV usage which always passes gl_SubgroupSize as the width, and I'm just trying to see if I can basically implement that behavior in terms of a regular ballot loop. Just trying to understand what is actually happening with it.
03:34graphitemaster: I suspect segmentationMask is just 32, because width = 32 >> bitCount(mask & 31), and if width = 32, solving for mask is 32
03:35graphitemaster: Since 32 >> bitCount(32 & 31) => 32 >> 0 => 32
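The arithmetic in the two messages above can be checked on the CPU. This is a minimal Python sketch that takes the relation width = 32 >> bitCount(mask & 31) exactly as quoted in the chat; it has not been checked against the NV_shader_thread_shuffle text, and the helper name width_from_mask is hypothetical.

```python
def width_from_mask(mask: int) -> int:
    """Evaluate width = 32 >> bitCount(mask & 31), the relation quoted in the chat.

    Assumption: this formula is taken from the conversation, not verified
    against the extension specification.
    """
    low_bits = mask & 31                 # only the low five bits take part
    return 32 >> bin(low_bits).count("1")


if __name__ == "__main__":
    for mask in (0, 32, 16, 24, 0xFFFFFFFF):
        print(hex(mask), "->", width_from_mask(mask))
```

Under this relation any mask whose low five bits are zero (0 as well as 32) yields width 32, while 0xFFFFFFFF would yield width 1, so the "& 31" is what makes 32 a valid solution.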
03:39HdkR: If you're doing reductions then it should be similar to walking the active mask and readInvocation.
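Walking the active mask, as suggested above, might look like the following CPU simulation in Python. It is a sketch under stated assumptions: the subgroup is modeled as plain lists, ballot and reduce_by_walking_mask are hypothetical helper names, and the list lookup data[lane] stands in for a readInvocation-style uniform read.

```python
def ballot(active):
    """Pack per-lane booleans into a bitmask, one bit per active lane."""
    return sum(1 << lane for lane, on in enumerate(active) if on)


def reduce_by_walking_mask(data, active):
    """Sum data over active lanes by visiting the set bits of the ballot mask."""
    mask = ballot(active)
    total = 0
    while mask:
        lane = (mask & -mask).bit_length() - 1  # index of the lowest set bit
        total += data[lane]                     # stand-in for readInvocation(data, lane)
        mask &= mask - 1                        # clear that bit and continue
    return total


if __name__ == "__main__":
    print(reduce_by_walking_mask([1, 2, 3, 4], [True, False, True, True]))
```

The loop runs once per active lane, so its cost grows linearly with the number of participating invocations.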
03:40HdkR: https://developer.nvidia.com/reading-between-threads-shader-intrinsics Still one of the better posts to try and map the features
03:45HdkR: GL_KHR_shader_subgroup_arithmetic is also quite nice for letting the driver do the magic reduction for you :)
03:45graphitemaster: What if I told you I'm the driver (not really, but I'm writing a shader transpiler)
03:46HdkR: Then you get to implement the driver magic
03:47HdkR: Nvidia will just butterfly shuffle those ops, which is probably what you're seeing
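A butterfly reduction built from XOR shuffles can be simulated on the CPU as follows. This is an illustrative Python sketch, not the code any driver actually emits: shuffle_xor and butterfly_sum are hypothetical names, all lanes are assumed active, and the lane count is assumed to be a power of two.

```python
def shuffle_xor(values, lane_mask):
    """Each lane reads the value held by lane (lane ^ lane_mask)."""
    return [values[lane ^ lane_mask] for lane in range(len(values))]


def butterfly_sum(values):
    """All-lanes sum in log2(n) XOR-shuffle steps; every lane ends with the total."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a power of two")
    acc = list(values)
    offset = n // 2
    while offset > 0:
        partner = shuffle_xor(acc, offset)          # exchange with the partner lane
        acc = [a + b for a, b in zip(acc, partner)]  # combine both halves
        offset //= 2
    return acc


if __name__ == "__main__":
    print(butterfly_sum(list(range(32)))[0])
```

For 32 lanes this takes five shuffle steps, compared with up to 32 iterations when walking the active mask one lane at a time.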
03:48graphitemaster: It's what I'm doing sort of
03:49graphitemaster: e.g https://gist.githubusercontent.com/graphitemaster/95cca04a3b8093aecccf3fd618f171ef/raw/057716b6bc54486e8a4eb7902283a2098a56f933/gistfile1.txt
03:49HdkR: :D
03:51graphitemaster: Want to emulate __shuffleXor and __shuffle in terms of a ballot loop now XD
03:51graphitemaster: We're going deep with the loops
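One way to express shuffle and shuffle-XOR as a ballot loop is sketched below as a Python simulation. This is not graphitemaster's transpiler output: the function names are hypothetical, the subgroup is modeled as lists, indices are assumed to be in range, and reading from an inactive source lane (undefined on real hardware) simply returns the stored value here.

```python
def shuffle_via_ballot_loop(data, index, active):
    """Emulate a shuffle: each active lane gets data[index[lane]].

    Each iteration serves every pending lane that wants the same source
    as the first pending lane, so the read is uniform across the subgroup.
    """
    result = [None] * len(data)
    pending = [lane for lane in range(len(data)) if active[lane]]  # the ballot
    while pending:
        src = index[pending[0]]          # source wanted by the first pending lane
        value = data[src]                # uniform read from a single lane
        for lane in pending:
            if index[lane] == src:
                result[lane] = value     # these lanes are done
        pending = [lane for lane in pending if index[lane] != src]
    return result


def shuffle_xor_via_ballot_loop(data, lane_mask, active):
    """Shuffle-XOR is a shuffle whose source index is lane ^ lane_mask."""
    index = [lane ^ lane_mask for lane in range(len(data))]
    return shuffle_via_ballot_loop(data, index, active)


if __name__ == "__main__":
    print(shuffle_via_ballot_loop([10, 20, 30, 40], [3, 3, 0, 1], [True] * 4))
    print(shuffle_xor_via_ballot_loop([10, 20, 30, 40], 1, [True] * 4))
```

The loop runs once per distinct source index, so a shuffle-XOR, where every lane reads a different source, takes as many iterations as there are active lanes.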
09:00HdkR: n/4
15:24fdobridge_: <Sid> can't run DOOM yet
15:24fdobridge_: <Sid> 2/10 driver /j
17:16fdobridge_: <tom3026> Vkdoom runs 😄
17:17fdobridge_: <Sid> I was talking of the 2016 reboot 😅
17:17fdobridge_: <Sid> I've already played gzdoom vulkan renderer on nvk :>
18:17fdobridge_: <rinlovesyou> what the hell do you mean Doom 2016 is an opengl game
18:17fdobridge_: <rinlovesyou> since when is it opengl??
18:19fdobridge_: <rhed0x> 2016 supports both opengl and vulkan
18:19fdobridge_: <rinlovesyou> i didn't see a vulkan switch in the settings
18:19fdobridge_: <rinlovesyou> perhaps the demo doesn't have it
18:19fdobridge_: <rhed0x> idk about the demo
18:19fdobridge_: <rinlovesyou> let me check again
18:20fdobridge_: <rinlovesyou> aha, advanced settings
18:24fdobridge_: <Sid> since release
18:24fdobridge_: <rinlovesyou> damn
18:24fdobridge_: <Sid> the vulkan renderer was added in a patch
18:25fdobridge_: <rinlovesyou> sometimes i forget how good opengl can look
18:25fdobridge_: <rinlovesyou> i just instantly assumed it's dx/vulkan
18:25fdobridge_: <huntercz122> iirc opengl in doom 2016 performs kinda poorly
18:25fdobridge_: <Sid> well, no
18:25fdobridge_: <rinlovesyou> well, on nouveau's old opengl impl it runs pretty shit at least
18:25fdobridge_: <Sid> but, the vk renderer does run better, yes
18:25fdobridge_: <rinlovesyou> on zink it doesn't have a good time at all
18:26fdobridge_: <rinlovesyou> on zink+nvk it doesn't have a good time at all (edited)
18:26fdobridge_: <huntercz122> well on vk i get constant locked 200fps
18:26fdobridge_: <huntercz122> opengl likes to jump around 90fps
21:34fdobridge_: <airlied> @gfxstrand I have a complete set of passing logs 32 and 64-bit, the only concern is I have to revert the 3 KHR enablements on mesa
21:38fdobridge_: <gfxstrand> Unless someone throws a stink about not having WSI in the ones I've submitted, I think we'll just leave it. Throw me a link, though, just in case.
22:36fdobridge_: <orowith2os> that doesn't mean it's performing poorly
22:37fdobridge_: <orowith2os> just worse in comparison
22:37fdobridge_: <orowith2os> you can probably use the same logic with any situation of vulkan vs opengl; or a poor Vulkan implementation
22:48fdobridge_: <karolherbst🐧🦀> if it's about nouveau gl vs nvk, it's because of the gl driver :ferrisUpsideDown:
22:58fdobridge_: <orowith2os> womp womp
22:59fdobridge_: <orowith2os> I'd think the logic would apply, basically anywhere
23:22gameoverX: I am dead in couple of years, cause couple of men who got away both from prison, managed to assault me that bad, that my neck is broken entirely, jaw as well, behind the back they did it both have jewish ugly faces, just to warn you from such countries as cambodia, all can be happy but criminals are not sentenced, and there is a code broken, such assaults are highly against code , i am not sure if i live enough to see those both
23:22gameoverX: slaughtered or imprisoned. Jack Dedman was one hero Alex Enrico sif another one, some ladies who ordered it and it seems it's game over to me. It's disgusting shit.
23:23gameoverX: and you can see that his posters have joss
23:23gameoverX: so it's one of your guys
23:33gameoverX: you sneak behind the back and land the bottle , machete to the neck, you will get your last lesson, that this ain't the code that underground respects
23:42gameoverX: so i expect to show and see very big brutality against such
23:43gameoverX: i saw those british harassing me, with sneakers and nike shoes thought they are big deal in sports and martial arts
23:44gameoverX: but i have seen bigger ones, and i hope to connect with my story to those
23:44gameoverX: that some man break the code in all departments in life