00:00mhenning[d]: karolherbst[d]: there was a synchronization fix that might slow some things down a few weeks ago. would be good to bisect though
07:40karolherbst[d]: yeah.. that could do that.. but not sure
07:47karolherbst[d]: the issue is, that the perf regression doesn't seem to be huge, and the benchmark I'm using is already a bit fuzzy there, but it seems to be a drop of around 5% or so?
07:48karolherbst[d]: mhh more like 10% for IMMA testing
08:15karolherbst[d]: mhhh looks kinda more like a change in the compiler but...
08:21karolherbst[d]: `Max warps/SM: 16` => `Max warps/SM: 12` yeah... that would do it 🙃
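(Editor's sketch, not from the log.) A drop from 16 to 12 warps per SM is what a register-limited occupancy estimate gives when per-thread register use creeps just past 128. The register file size, the allocation granularity, the warp granularity and the hardware warp limit below are all assumptions chosen for illustration, not values confirmed for the GPU or compiler discussed here.

```python
def max_warps_per_sm(regs_per_thread,
                     regfile_regs=65536,   # assumed: 64K 32-bit registers per SM
                     warp_size=32,
                     reg_granularity=8,    # assumed: per-thread allocation rounds up to 8
                     warp_granularity=4,   # assumed: warps are handed out in groups of 4
                     hw_max_warps=48):     # assumed hardware limit
    """Estimate the register-limited number of resident warps per SM."""
    # Round the per-thread register count up to the allocation granularity.
    regs = -(-regs_per_thread // reg_granularity) * reg_granularity
    # How many whole warps fit in the register file.
    warps = regfile_regs // (regs * warp_size)
    # Round down to the warp allocation granularity.
    warps = (warps // warp_granularity) * warp_granularity
    return min(warps, hw_max_warps)


if __name__ == "__main__":
    for regs in (128, 129, 168, 169):
        print(regs, max_warps_per_sm(regs))
```

Under these assumptions 128 registers per thread gives 16 warps, and anything from 129 to 168 gives 12, so a single extra live register after an optimization change would be enough to explain the number above.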
08:23karolherbst[d]: kinda curious which opt is responsible here
08:23karolherbst[d]: but we also need better RA targets or something dunno...
08:52karolherbst[d]: mhh https://gitlab.freedesktop.org/mesa/mesa/-/commit/b3615e5d6ffc64b55c95ddbd4d0c4aa264c45a1d
08:53karolherbst[d]: guess not much to do there and just accept the losses and hope with my other opts it's all gone 😄
12:39karolherbst[d]: okay.. all done, proper cmat vectorization in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37998/
19:46_lyude[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1440790376049934446/image.png?ex=691f701a&is=691e1e9a&hm=d680b28854b5be734f8b8af3b524933957dfed14895bc56d8fc786ae07bdede8&
19:46_lyude[d]: Anyone seen this cool hip new bug that seems to only happen sometimes with DRI_PRIME offloading on AD107?
19:48sonicadvance1[d]: I personally enjoy only copying every other tile. Cuts my workload in half.
19:48_lyude[d]: hehe
19:48_lyude[d]: if I get any spare time soon I'd be happy to take a look
19:49_lyude[d]: especially since I think(?) upstream may have finally fixed my laptop's discrete GPU crashing
19:51airlied[d]: What kernel is it? And is it rendered on NVIDIA displayed on Intel?
19:54_lyude[d]: airlied[d]: drm-next as of f3a1d69f9b388271986f4efe1fd775df15b443c1 . It's rendered on nvidia (AD107) and displayed on Intel (ADL-P)
19:56airlied[d]: So it could be big pages and compression support?
19:56airlied[d]: Might be worth reverting those patches one by one to see
19:57_lyude[d]: maybe - but, it's worth noting it doesn't actually start like this interestingly enough
19:57_lyude[d]: I think it took like 30-40 vkcube invocations (was trying to see if my GPU still hangs randomly on s/r) before it started doing this, and now it does it every time
19:59_lyude[d]: gonna try reverting those patches now
20:06mhenning[d]: What mesa are you on?
20:06_lyude[d]: mhenning[d]: version 25.1.9-1.fc42
20:07_lyude[d]: oh ! about to reboot but if I keep rerunning it eventually gets worse
20:08mhenning[d]: Maybe try a newer mesa? The artifact looks a little like the shadows in https://gitlab.freedesktop.org/mesa/mesa/-/issues/13909 and I don't think the fix for that got backported
20:08_lyude[d]: ooooo there's something very entertaining about seeing the same bug but in a shadow instead
20:09_lyude[d]: I'll give it a shot and see if I can backport it
21:04pac85[d]: sonicadvance1[d]: It's 1 every 4
21:09sonicadvance1[d]: Dang, not efficiently dropping things on the floor enough
21:17pac85[d]: Gotta improve the bug
23:18_lyude[d]: karolherbst[d]: the only thing that we needed for dynamic reclocking on older gens was figuring out the iso buffer for display, right?
23:18karolherbst[d]: uhhh.. I _think_ so.. well besides making it all correct and stable
23:18_lyude[d]: And like, actually watching the GPU usage to know when to reclock
23:19karolherbst[d]: that part is trivial
23:19karolherbst[d]: I already have the code
23:19_lyude[d]: karolherbst[d]: so like do you remember that one time Nvidia left on a bit in the vbios that made it so underruns were colored
23:19_lyude[d]: and we had a workaround to disable it
23:19karolherbst[d]: _lyude[d]: https://github.com/karolherbst/nouveau/commits/pmu_counters_v4/
23:20_lyude[d]: what if we just enabled it and then captured the video output with my Chamelium and used the location of the underrun color along with varying iso buffer values to reverse engineer the algorithm for actually calculating the watermarks for it
23:20_lyude[d]: since we could literally see how many scanlines/whatevers off our watermarks are
23:21_lyude[d]: I've been thinking about doing this for ages but realized I've never actually posted this idea here
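(Editor's sketch, not from the log.) The measurement step of this idea could look roughly like the following: for each iso buffer value tried, take a captured frame and find the first scanline that contains the underrun color. The frame representation (a list of scanlines of RGB tuples), the underrun color value and the function names are hypothetical; no real Chamelium or nouveau API is used here.

```python
def first_underrun_scanline(frame, underrun_rgb, tolerance=0):
    """Return the index of the first scanline containing the underrun
    color, or None if the frame is clean.

    frame: list of scanlines, each a list of (r, g, b) tuples (hypothetical
    capture format).
    """
    for y, row in enumerate(frame):
        for pixel in row:
            if all(abs(c - u) <= tolerance for c, u in zip(pixel, underrun_rgb)):
                return y
    return None


def sweep(captures, underrun_rgb):
    """Map each tried iso buffer value to the first underrun scanline.

    captures: dict of {iso_buffer_value: frame} (hypothetical).
    """
    return {value: first_underrun_scanline(frame, underrun_rgb)
            for value, frame in captures.items()}


if __name__ == "__main__":
    MAGENTA = (255, 0, 255)   # assumed underrun color, for illustration only
    clean = [[(0, 0, 0)] * 4 for _ in range(3)]
    bad = [row[:] for row in clean]
    bad[2][1] = MAGENTA
    print(sweep({16: bad, 32: clean}, MAGENTA))
```

Plotting the resulting scanline against the iso buffer value is what would show how far off a given watermark is.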
23:21karolherbst[d]: I have no idea how this buffer works
23:21_lyude[d]: I have a guess
23:22_lyude[d]: If it's anything like how watermarks work on Intel display hardware (and it probably is) I have a general idea
23:27_lyude[d]: I think Martin Roukala knows more but: it's likely just telling the hardware how much pixel data to prefetch for each scanout frame, and how long you lose access to memory during reclocks
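(Editor's sketch, not from the log.) If that guess is right, the amount of data to prefetch is roughly the amount scanout consumes while memory is unavailable. The arithmetic below is only that back-of-the-envelope estimate. It ignores blanking intervals and any hardware rounding, and the 20 µs blackout in the example is an invented figure, not a measured reclock time.

```python
def prefetch_bytes(pixel_clock_khz, bytes_per_pixel, blackout_us):
    """Bytes scanout consumes during a memory blackout of blackout_us
    microseconds, rounded up. Integer math only.

    pixels = pixel_clock_khz * 1000 [px/s] * blackout_us * 1e-6 [s]
           = pixel_clock_khz * blackout_us / 1000
    """
    numerator = pixel_clock_khz * bytes_per_pixel * blackout_us
    return -(-numerator // 1000)   # ceiling division


if __name__ == "__main__":
    # 148.5 MHz pixel clock (1080p60), 4 bytes per pixel, assumed 20 us blackout.
    print(prefetch_bytes(148_500, 4, 20))
```

The real watermark calculation is exactly the unknown being discussed, so this only shows the scale of the numbers involved.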
23:31karolherbst[d]: somebody mentioned that it would be beneficial to use it also for normal operation