IRC Logs of #nouveau on irc.freenode.net for 2024-05-22

01:54 fdobridge: <gfxstrand> Once again debating whether I want uniform ops to be different opcodes or re-use good ol' iadd, etc.
01:57 fdobridge: <zmike.> don't overthink it
02:01 fdobridge: <redsheep> Just found this in the docs, isn't this the answer?
02:01 fdobridge: <redsheep> ```On Turing, the compiler has the option to push these integer operations
02:01 fdobridge: <redsheep> onto the separate uniform datapath, out of the way of the main datapath. To
02:01 fdobridge: <redsheep> do so, the compiler must emit uniform datapath instructions.```
02:01 fdobridge: <gfxstrand> Not really
02:02 fdobridge: <gfxstrand> That's a description of maybe how NVIDIA's compiler works on the inside, not a mandate for what we need to do.
02:02 fdobridge: <redsheep> So you can still get full utilization of parallel int and fp without that on turing?
02:03 fdobridge: <gfxstrand> It's not really about parallelism AFAIK
02:03 fdobridge: <gfxstrand> It just uses a different register file
02:03 fdobridge: <redsheep> I am reading 3.5.2 here: https://arxiv.org/pdf/1903.07486
02:04 fdobridge: <redsheep> The extra path turing added to get integer instructions out of the way of the fp pipes needs the separate instructions, unless I am completely misunderstanding
02:04 fdobridge: <redsheep> Otherwise you'd get ~pascal perf with mixed int and fp workloads
02:05 fdobridge: <redsheep> Nvidia had said at the time it improved IPC 1.36x
02:06 fdobridge: <gfxstrand> I mean, it can increase float throughput in the sense that it lets you reduce register pressure and increase occupancy.
02:11 fdobridge: <redsheep> It would be interesting to see what instructions nvidia spits out for RT shaders, because IIRC this stuff was added with the aim towards RT related stuff
02:14 fdobridge: <redsheep> Just reread, I see what you mean. Using the dedicated int path doesn't require them. So you're just saying that increased occupancy is a result of less register pressure, and not from actually depending on using these instructions
02:17 fdobridge: <gfxstrand> Yup
02:19 fdobridge: <gfxstrand> I don't know that it was added for RT. Maybe? I think it's more targeted towards bindless UBOs and other address calculations
02:19 fdobridge: <redsheep> IIUC it was address calculations for RT stuff because that part of the workload got much heavier than traditional raster?
02:20 fdobridge: <redsheep> I dunno exactly how it works, I do remember address calculations being mentioned somewhere though
02:21 fdobridge: <redsheep> I still don't understand what half this stuff is, so take all this with a grain of salt of course. Trying to understand gpu architecture from scratch is really hard...
02:29 fdobridge: <redsheep> What have you been using to figure out what works with your NAK improvements? Do you have some way to test or profile things?
02:33 fdobridge: <gfxstrand> Not really. Mostly just my own mental model of the hardware.
02:48 fdobridge: <Misyl with Max-Q Design> Worth making something like the fossil metrics on aco?
06:14 fdobridge: <marysaka> that would be a great idea
10:45 fdobridge: <ahuillet> Please share complete instructions on how to repro. You can DM me, as this probably doesn't belong here
11:08 fdobridge: <magic_rb.> will do
22:49 fdobridge: <redsheep> So is the plan to merge the MR defaulting zink now that CI for NVK+Zink exists?
22:49 fdobridge: <redsheep> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29232
22:52 fdobridge: <redsheep> I'm kind of split on what I think of this at the moment. Obviously there are issues, given I've opened 4 myself, but merging it would prompt quite a bit of testing on more varied setups before 24.2
22:57 fdobridge: <airlied> it still relies on a kernel patch that hasn't landed anywhere useful yet, so I'm not going to do it until after that
22:58 fdobridge: <redsheep> Ok. Hopefully it's not too much of a flood of issues getting opened, and hopefully people testing mesa git know how to disable it if things break for them
23:00 fdobridge: <airlied> I'll enable it in rawhide at some point, I think I've seen some triangle tearing on gdm
23:01 fdobridge: <redsheep> Right, I'm pretty sure there are sync issues somewhere
23:07 fdobridge: <redsheep> Hmm. I wonder if the sync bug causing flicker for me is actually the same as the chromium/discord corruption? Multi process rendering means the application has to sync internally, right?
23:17 fdobridge: <redsheep> Come to think of it, it seems plausible to me that all 4 of my issues could be the same bug, since my stack trace for session crashes mentions DRI
23:33 fdobridge: <redsheep> I have never used gdb before, that is the appropriate tool to try to understand what went wrong with my session crash coredumps, right?
23:39 fdobridge: <redsheep> Ah, right as I was running into issues trying to use GDB, because I had rebuilt mesa since the last crash, the debug gods smiled upon me and crashed my session 🤣
23:46 fdobridge: <mhenning> yes, gdb is the right tool to get a backtrace from a coredump
23:50 fdobridge: <redsheep> Hmm yeah this appears to confirm this is a sync issue, it is segfaulting at vulkan/runtime/vk_synchronization.c:399
23:54 fdobridge: <redsheep> Am I interpreting that right?
23:54 fdobridge: <redsheep> ```Core was generated by `/usr/bin/plasmashell --no-respawn'.
23:54 fdobridge: <redsheep> Program terminated with signal SIGSEGV, Segmentation fault.
23:54 fdobridge: <redsheep> #0 0x00007f46a1e9db73 in vk_common_QueueSubmit (_queue=<optimized out>, submitCount=1, pSubmits=<optimized out>, fence=0x7f4680001030) at ../mesa/src/vulkan/runtime/vk_synchronization.c:399
23:54 fdobridge: <redsheep> 399 .semaphore = pSubmits[s].pWaitSemaphores[i],
23:54 fdobridge: <redsheep> [Current thread is 1 (Thread 0x7f4687e006c0 (LWP 86065))]```