01:07fdobridge: <gfxstrand> 32-bit NVK userspace works fine.
01:08HdkR: Would be very sad if it didn't; it's needed for all those old D3D8/9/10/11 games :)
01:53fdobridge: <gfxstrand> Ugh... breaks...
01:55fdobridge: <gfxstrand> This part of the hardware really does hate SSA
02:04fdobridge: <karolherbst🐧🦀> yes
02:04fdobridge: <karolherbst🐧🦀> 😄
02:06fdobridge: <karolherbst🐧🦀> ~~could just handle them as intrinsics~~
02:06fdobridge: <karolherbst🐧🦀> ~~and the barrier file as memory~~
02:14fdobridge: <gfxstrand> Yeah, the problem with that is that all my spilling assumes SSA.
02:15fdobridge: <gfxstrand> Maybe I should just RE `bmov.32` and figure out what the value is
02:16fdobridge: <gfxstrand> If they really are just a mask (I still don't know if I believe that), then `bssy` is just `|=` and `break` is just `&= ~ACTIVE`
02:19fdobridge: <karolherbst🐧🦀> see what I wrote earlier
02:20fdobridge: <karolherbst🐧🦀> there are those `TS_THREAD_STATE_ENUM0` to `TS_THREAD_STATE_ENUM4` things and those instructions do something funky with that
02:20fdobridge: <karolherbst🐧🦀> so if you want to RE it, you probably also have to query those 5 values
02:22fdobridge: <karolherbst🐧🦀> it's apparently unsafe to modify those 5 values if interrupts are enabled
02:25fdobridge: <gfxstrand> Yeah, I've got to think about this more. 😕
02:25fdobridge: <gfxstrand> I think I may be able to play some tricks and make it work.
02:25fdobridge: <gfxstrand> It's just going to be annoying.
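A minimal sketch of the mask hypothesis floated above, assuming (unverified, and doubted in the chat itself) that a barrier register holds nothing more than a per-thread bitmask. The function names, the 32-thread warp width and the `active` parameter are illustrative assumptions, not reverse-engineered hardware behaviour, and the sketch ignores the `TS_THREAD_STATE_ENUM*` state mentioned above.

```python
WARP_SIZE = 32                      # assumed warp width
FULL_MASK = (1 << WARP_SIZE) - 1

def bssy(barrier: int, active: int) -> int:
    """Hypothesised BSSY: barrier |= ACTIVE."""
    return (barrier | active) & FULL_MASK

def brk(barrier: int, active: int) -> int:
    """Hypothesised BREAK: barrier &= ~ACTIVE."""
    return barrier & ~active & FULL_MASK

# Four threads enter the region, then two of them break out of it.
bar = bssy(0, 0b1111)
bar_after = brk(bar, 0b0011)
print(f"{bar:#b} -> {bar_after:#b}")
```

If the real value turns out to carry more than a mask, this model does not apply.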
03:21fdobridge: <gfxstrand> It's break that's the really annoying part
03:29fdobridge: <karolherbst🐧🦀> @gfxstrand could just implement `B1 = BREAK B0` as `BMOV B1 B0 + BREAK B1` and if RA makes `BMOV B0 B0` + `BREAK B0` out of it so be it :ferrisUpsideDown:
03:38fdobridge: <gfxstrand> Yeah...
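The lowering suggested above can be sketched roughly as follows. This is a toy model, not NAK code: the `Instr` type, the `break_ssa` pseudo-op, and the `assign_registers` helper standing in for RA are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Instr:
    op: str
    dst: str
    srcs: Tuple[str, ...]

def lower_ssa_break(instr: Instr) -> List[Instr]:
    """Turn the SSA form `dst = BREAK src` into a copy plus an in-place BREAK."""
    if instr.op != "break_ssa":
        return [instr]
    (src,) = instr.srcs
    return [
        Instr("bmov", instr.dst, (src,)),         # BMOV B1 B0
        Instr("break", instr.dst, (instr.dst,)),  # BREAK B1 (modifies B1 in place)
    ]

def assign_registers(instrs: List[Instr], regs: Dict[str, str]) -> List[Instr]:
    """Stand-in for RA: rename SSA values to barrier registers."""
    return [Instr(i.op, regs[i.dst], tuple(regs[s] for s in i.srcs)) for i in instrs]

lowered = lower_ssa_break(Instr("break_ssa", "b1", ("b0",)))
# If RA happens to put both values in B0, the copy becomes a self-move.
same_reg = assign_registers(lowered, {"b0": "B0", "b1": "B0"})
for i in same_reg:
    print(i.op.upper(), i.dst, *i.srcs)
```

The self-move that falls out when both values share a register is redundant but harmless, which is the "so be it" case from the message above.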
06:13fdobridge: <!DodoNVK (she) 🇱🇹> Implementing EXT_map_memory_placed also works
21:39fdobridge: <gfxstrand> This should help with perf a decent bit: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26408
21:41fdobridge: <gfxstrand> CTS run looks mostly good but it looks like there's a small handful of fails yet.
21:49fdobridge: <gfxstrand> This is entertaining...
21:50fdobridge: <gfxstrand> ```
21:50fdobridge: <gfxstrand> fifo: PBDMA0: 01000000 [] ch 3 [007fb10000 deqp-vk[275722]] subc 0 mthd 0000 data 00000000
21:50fdobridge: <gfxstrand> ```
21:51fdobridge: <gfxstrand> It looks like it maybe shaves a few minutes off a CTS run. That's fun.
21:54fdobridge: <marysaka> Can't wait to have the mesh shader tests double it :vReiAgony:
21:55fdobridge: <karolherbst🐧🦀> are there that many mesh shader tests?
21:57fdobridge: <pixelcluster> just wait for raytracing :frog_demon:
21:59fdobridge: <airlied> @gfxstrand that's a lot of 0s
22:00fdobridge: <marysaka> the txt file with the test cases is 4MB soo (edited)
22:01fdobridge: <karolherbst🐧🦀> pain
22:01fdobridge: <marysaka> I will try to do a run tomorrow and see how things go
22:54Lyude: you know
22:54Lyude: having openrm's source kind of fucks
23:12karolherbst: Lyude: that reminds me.. we should scan the repo for all those PCI workarounds as well
23:13Lyude: oh yeah good point
23:13karolherbst: there are some laptops where runpm is still a bit broken
23:13Lyude: (for those wondering: I found a Big list of displayport workarounds in nvidia's codebase :DD)
23:13karolherbst: but I also wanted to wait until they officially support runpm
23:13Lyude: fair
23:14Lyude: I'm still trying to figure out link training btw, been going through nvidia's link training code and taking notes on the stuff they do. at least I've got a list of stuff to start trying now
23:35fdobridge: <gfxstrand> `Pass: 556678, Fail: 108, Crash: 84, Warn: 4, Skip: 3431474, Flake: 20, Duration: 52:02`
23:35fdobridge: <gfxstrand> So, it needs some fixing but not bad. Also, that's 10-15 min faster than a run without it.
23:46fdobridge: <gfxstrand> Okay, why does flushing compute caches at the end of `EndTransformFeedback()` do something?!? If I replace it with a 3D flush, even WFI, it doesn't work. (edited)
23:46fdobridge: <gfxstrand> WTH?!?
23:47fdobridge: <gfxstrand> It was even a NO_WFI invalidate. Somehow switching to compute flushes *something*
23:54fdobridge: <karolherbst🐧🦀> 3D and compute state aliases
23:54fdobridge: <karolherbst🐧🦀> well.. some parts of those
23:54fdobridge: <karolherbst🐧🦀> and a subchannel switch causes a full WFI afaik
23:56fdobridge: <pixelcluster> ye that was what I heard too (that subchannel switch causes WFI) (edited)