10:54airlied[d]: Sounds like zink should be lowering it
10:55karolherbst[d]: yeah, I already have the patch written
10:58karolherbst[d]: some other tests just crash the GPU context :blobcatnotlikethis:
11:11karolherbst[d]: I'd really like to know what all those errors mean `gsp: rc engn:00000001 chid:56 type:13 scope:1 part:233` :ferrisUpsideDown:
11:54karolherbst[d]: gfxstrand[d]: TIL, `FindILsb` isn't 32 bit only in vulkan
11:55karolherbst[d]: but anyway... nak should probably set `nir_lower_find_lsb64` 😄
11:56karolherbst[d]: ehh wait.. the issue is something else
11:57karolherbst[d]: I'll submit the patch regardless
13:07karolherbst[d]: let's see how this second run goes
13:08karolherbst[d]: it's sad that it takes so long....
13:08karolherbst[d]: I wish we could do something to lower the cost of creating a GPU context, but it's also super slow with the nvidia stack
13:39gfxstrand[d]: karolherbst[d]: Sure. I don't think we have any NV tricks for that one
13:41karolherbst[d]: yeah.. I've already submitted an MR
13:42karolherbst[d]: I just had to teach zink to deal with it which wouldn't require lowering
14:02karolherbst[d]: yeah well.. `Pass 2598 Fails 93 Crashes 11 Timeouts 0`
14:02karolherbst[d]: though those crashes are all the GPU context getting nuked
14:42dadschoorse[d]: karolherbst[d]: It is 32bit only though?
14:42karolherbst[d]: it isn't
14:42dadschoorse[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1269666827601055877/grafik.png?ex=66b0e4e0&is=66af9360&hm=c96b52a52b32c22f6894b521bf6430023b71782f4190e2fcee72a71d535622ec&
14:42dadschoorse[d]: > This instruction is currently limited to 32-bit width components.
14:43dadschoorse[d]: couldn't be clearer
14:43karolherbst[d]: huh...
14:43karolherbst[d]: ohh.. google gave me the 1.0 spec
14:43karolherbst[d]: where that sentence didn't exist :ferrisUpsideDown:
14:43karolherbst[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1269667116781539464/image.png?ex=66b0e525&is=66af93a5&hm=60a540dc458af1420216936b0058acedf0db419b7bc716ccd023ee7ee09cf6c1&
14:43dadschoorse[d]: 🐸
14:44dadschoorse[d]: > Khronos SPIR-V Issue #211: As with FindSMsb and FindUMsb, FindILsb needs 32-bit components
14:44karolherbst[d]: yeah.. guess I should assume 32 then 😄
14:44dadschoorse[d]: when I search for `glsl.std.450` the first result is the unified spec, not 1.0 🤔
14:45karolherbst[d]: yeah.. dunno
14:45karolherbst[d]: maybe it was my browser history
14:45dadschoorse[d]: it's pretty silly that these integer ops aren't supported for all bit sizes
14:45dadschoorse[d]: lowering is trivial
14:46karolherbst[d]: the thing is, that spirv-val doesn't print an error 😄
14:46karolherbst[d]: but it does do that on msb
14:46dadschoorse[d]: dadschoorse[d]: but the same can be said for all integer bit sizes, and vk still doesn't require them for some reason (stupid imo)
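For reference, the "lowering is trivial" remark about a 64-bit `FindILsb` can be illustrated with a minimal sketch in plain C. It assumes the GLSL.std.450 convention that an input of 0 yields -1, and splits the 64-bit value into two 32-bit halves. The helper names `find_lsb32` and `find_lsb64_lowered` are hypothetical and only for illustration; this is not the code of `nir_lower_find_lsb64` or of any zink/NAK patch mentioned above.

```c
#include <stdint.h>

/* Hypothetical 32-bit primitive: index of the lowest set bit,
 * or -1 when the input is 0 (assumed FindILsb convention). */
static int32_t find_lsb32(uint32_t x)
{
    if (x == 0)
        return -1;

    int32_t index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Sketch of a 64-bit find_lsb expressed only with 32-bit find_lsb ops. */
static int32_t find_lsb64_lowered(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* bits 0..31  */
    uint32_t hi = (uint32_t)(x >> 32);  /* bits 32..63 */

    /* Any set bit in the low half wins outright. */
    if (lo != 0)
        return find_lsb32(lo);

    /* Otherwise use the high half, offset by 32; keep -1 for x == 0. */
    int32_t hi_lsb = find_lsb32(hi);
    return hi_lsb < 0 ? -1 : hi_lsb + 32;
}
```

In a shader compiler the branch would typically become a select on `lo != 0`, but the structure (two 32-bit ops plus an offset) stays the same.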
18:15airlied[d]: karolherbst[d]: I assume a chunk of the context create overhead is probably gsp interaction
18:21karolherbst[d]: probably
18:21karolherbst[d]: it's just that a run on my other GPUs is like 10 minutes, on nvk it's like 40 minutes
18:37airlied[d]: Might be worth profiling to see where it stalls. Running in parallel will likely hit gsp locks. I think NVIDIA mentioned this was a problem for them in the past, not sure if they did anything to alleviate it
18:42karolherbst[d]: yeah...
18:42karolherbst[d]: I think with the nvidia driver I reduced how much runs in parallel and it made things go faster