00:23 airlied[d]: _lyude[d]: btw you saw that last atomic patch you added got reverted?
06:33 _lyude[d]: airlied[d]: which one.
06:41 _lyude[d]: Sigh. I wish I had noticed that, that means we reintroduced the suspend/resume bug 😒
06:42 _lyude[d]: I honestly didn't see the patch until just now
06:43 _lyude[d]: The actual fix would have been really simple; just use a different vtable for <nv50.
06:55 _lyude[d]: ...also, there's no review tags on this
06:57 airlied[d]: it was a regression in rc8, review isn't going to keep the patch in the tree at that point
06:57 _lyude[d]: ah ok
06:57 _lyude[d]: that makes sense
06:58 _lyude[d]: (also thank you for letting me know in the first place!)
19:06 mohamexiety[d]: has anyone poked around perfetto integration in nvk before? stalled a bit on the hwmon stuff and need to think a bit more so was looking around for userspace stuff to do in the meanwhile and been looking more into this the past few days
19:08 mhenning[d]: There's an issue for it: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14020
19:08 mhenning[d]: looks like marysaka[d] is assigned
19:09 marysaka[d]: I'm not planning to look at it for a while and didn't start anything so it should be unassigned
19:11 marysaka[d]: (currently reworking the mesh shader MR and related bits)
19:39 phomes_[d]: I did some things with a vulkan layer for perfetto tracking. I have some wip to do that as an internal layer like the hud layer I posted about a few days ago. I can push that tomorrow. It logs every vulkan api call
19:49 phomes_[d]: I started to look at u_trace support too after talking with Faith at fosdem, and also a suggestion to do so from Emma. I am trying to make the game perf testing automatic by running captures from gfxreconstruct and renderdoc. Running the captures gave very inconsistent timings and u_trace helped other drivers stabilize the timings
20:21 mhenning[d]: For the trace timings, I'd be curious if we're clocking down
20:24 phomes_[d]: locking the clocks was also a suggestion from Emma. I don't think that it is a task that I can do though
20:24 mhenning[d]: Yeah, it requires a kernel patch
20:26 phomes_[d]: mohamexiety[d]: for perfetto/u_trace what if I push what I have tomorrow and then you can see if it fits into the bigger picture?
20:27 mohamexiety[d]: it's not difficult as far as kernel patches go but i am not sure what the interface should look like. there are a bunch of controls in openrm that can set the clock to the base, prevent boost, etc etc and we can wire that up in nouveau
20:27 mohamexiety[d]: phomes_[d]: sure! up to you. if you'd rather continue on it, I can also look at something else. no worries at all there
20:27 karolherbst[d]: yeah we need it for benchmarking anyway
20:28 karolherbst[d]: if we don't have any GPU APIs that expose this kind of functionality, we should probably just have a debugfs file...
20:28 karolherbst[d]: or try to standardize an interface
20:32 mhenning[d]: yeah, debugfs sounds reasonable to me
20:36 karolherbst[d]: I wonder if GSP has a method to tell if the GPU dropped below base clocks
20:36 karolherbst[d]: because I'm sure that furmark will make the GPU drop below base clocks
20:39 redsheep[d]: That depends a lot on which GPU generation. Furmark dropped my 3090 to like 1200 MHz but my 4090 is quite happy at like 2.6 GHz with furmark on
20:39 karolherbst[d]: I wonder if we can also get power consumption reports..
20:40 karolherbst[d]: redsheep[d]: interesting...
20:41 karolherbst[d]: though I wonder how the power draw is ..
20:50 redsheep[d]: Basically tbp limit on both, near 450 watts
22:36 karolherbst[d]: Do we have any use for IPA with a uniform address?
22:36 karolherbst[d]: can do things like `IPA.PASS R0, P0, a[UR16+0x40] ;`
22:37 karolherbst[d]: also the output predicate sounds interesting
22:50 karolherbst[d]: mhhhh.. `OPT(nir, nir_lower_indirect_derefs_to_if_else_trees`
22:50 karolherbst[d]: I wonder.....
22:50 karolherbst[d]: maybe I look into it...
22:54 karolherbst[d]: though not sure how relevant it is and I'd rather get my other stuff finished 😄
23:05 mhenning[d]: Oh, yeah I actually didn't know we could still do indirect indexing of inputs
23:06 karolherbst[d]: well.. there is a twist: it has to be uniform
23:07 karolherbst[d]: but that's still massively better than if else ladders
23:07 mhenning[d]: codegen says:
23:07 mhenning[d]: /* HW doesn't support indirect addressing of fragment program inputs
23:07 mhenning[d]: * on Volta. The binary driver generates a function to handle every
23:07 mhenning[d]: * possible indirection, and indirectly calls the function to handle
23:07 mhenning[d]: * this instead.
23:07 mhenning[d]: */
23:07 karolherbst[d]: yeah..
23:07 karolherbst[d]: which makes sense
23:08 mhenning[d]: and I've been taking that at face value
23:08 karolherbst[d]: well
23:08 karolherbst[d]: it is true
23:08 karolherbst[d]: volta doesn't have UGPRs
23:08 mhenning[d]: right, but the check it uses is `chipset >= NVISA_GV100_CHIPSET`
23:08 mhenning[d]: I assumed they were gone for good
23:09 karolherbst[d]: mhhh
23:09 karolherbst[d]: the code might have been written before turing existed...
23:10 mhenning[d]: right maybe we only tested volta
23:10 karolherbst[d]: yeah and also it's the gallium driver, nobody spent time perf optimizing it for anything newer than kepler
23:10 karolherbst[d]: well maybe maxwell as well
23:11 karolherbst[d]: but yeah..
23:11 karolherbst[d]: also codegen doesn't support UGPRs 😄
23:12 mhenning[d]: but anyway yeah, would be good to wire up if we care about that case (not sure how common it is)
23:12 karolherbst[d]: yeah...
23:12 karolherbst[d]: I think the output predicate of IPA is more useful...
23:12 karolherbst[d]: probably
23:13 mhenning[d]: what does that do?
23:14 karolherbst[d]: either tells if the attribute is 1.0 or if you can skip adjusting for perspective according to the per attribute state
23:16 karolherbst[d]: former for pass and constant, latter for state
23:16 karolherbst[d]: so like feq(ipa, 1.0) -> use predicate instead
23:18 mhenning[d]: does that happen a lot? seems like an odd thing to specialize
23:18 karolherbst[d]: yeah no idea
23:18 karolherbst[d]: maybe?
23:19 karolherbst[d]: but also it's in fragment shaders, so every tiny bit helps
23:22 karolherbst[d]: well maybe everything together gives us like 0.1% 😄
23:28 mmarchini: has anyone looked into or is anyone currently working on https://gitlab.freedesktop.org/drm/nouveau/-/issues/336 (performance metrics)? Been looking for something to start contributing to the project and this one piqued my interest
23:32 mhenning[d]: I think mohamexiety[d] has already started that one
23:51 karolherbst[d]: I love how changing the order in which legalize spills UGPRs to GPRs is making the shader stats worse 🥲