01:07fdobridge: <airlied> yes that pretty much
16:41fdobridge: <gfxstrand> @karolherbst What's the difference between `getLatency()` and `getReadLatency()`?
16:43fdobridge: <karolherbst🐧🦀> good question, it's not used, but if I had to guess it is to account for the gap between when an instruction starts executing and when it starts to read sources
16:44fdobridge: <karolherbst🐧🦀> there are some conditions where this latency can even differ
16:46fdobridge: <gfxstrand> Right
16:46fdobridge: <gfxstrand> That's what I thought
16:47fdobridge: <karolherbst🐧🦀> @gfxstrand btw, most of that stuff was taken from here: https://github.com/NervanaSystems/maxas
16:51fdobridge: <karolherbst🐧🦀> btw, there are weird instructions which read/write to sources/dests later depending on the flags 😢
19:27fdobridge: <gfxstrand> This is looking like it's not going to be the long pole I thought it was. I might be merging something by EOD. We'll see how this run goes.
19:53fdobridge: <gfxstrand> 27 minutes in and this run is looking really good
20:51fdobridge: <gfxstrand> All that worrying over opcode latencies and... it's all just 6?!?
21:18DemiMarie: At Qubes OS we get occasional bug reports that appear to be Nouveau-related. Should we ask our users to send these bugs to Nouveau instead?
21:22karolherbst: Yeah, probably
21:22karolherbst: unless it's EoL software in use
21:25DemiMarie: In Qubes we generally have a long since EoL Mesa but a quite recent kernel
21:26DemiMarie: Though as it happens for R4.2 we currently have a (hopefully!) not EoL Mesa in dom0 for now
21:29fdobridge: <gfxstrand> Except for predicates, apparently.
21:30DemiMarie: What is fdobridge?
21:30fdobridge: <gfxstrand> It's a bridge from Discord
21:31HdkR: Bridge so that when Discord inevitably bans an account you can still fall back to IRC :P
21:32karolherbst: I kinda wished matrix would be an alternative... but... well it's not terrible at least not anymore
21:39DemiMarie: https://github.com/QubesOS/qubes-issues/issues/8506 looks like an Nvidia GPU firmware regression.
21:56fdobridge: <benjaminl> are there general guidelines for whether to clobber inputs vs using scratch space when running out of registers on fermi MME stuff?
21:57fdobridge: <benjaminl> I'm working on indirect dispatch for pre-turing, and have a working implementation that doesn't need scratch space, but it _does_ involve some input-clobbering helper functions that might be a pain to maintain
21:57fdobridge: <benjaminl> I don't have a good idea of what the runtime cost for scratch space is like
21:58DemiMarie: Is there anywhere I can follow progress on the GSP firmware bringup? Until that is implemented there should probably be a “don’t buy Nvidia” note somewhere in the Qubes docs.
22:04fdobridge: <gfxstrand> I think we can assume scratch is cheap
22:05fdobridge: <gfxstrand> Really, someone needs to implement RA for pre-Turing MME...
22:05fdobridge: <gfxstrand> But ugh
22:05fdobridge: <benjaminl> like, as a second pass in the builder?
22:05fdobridge: <gfxstrand> I don't want to think about that and I don't think I want to review it right now, either. 😅
22:05fdobridge: <benjaminl> lol reasonable
22:05fdobridge: <gfxstrand> Yeah, something that'd happen as part of `_finish()`
22:07fdobridge: <benjaminl> for the indirect dispatch stuff, I _think_ you'd also need a pass to remove unnecessary copies to get the bit I'm looking at under the limit
22:07fdobridge: <benjaminl> (without having the helper functions clobber inputs)
22:08fdobridge: <gfxstrand> I mean, having helpers clobber inputs isn't the end of the world as long as it's really clear what's going on.
22:08fdobridge: <gfxstrand> If they're going to clobber, they should free as well.
22:08fdobridge: <benjaminl> yeah, that makes sense
22:08DemiMarie: Is the bridge 1-way or 2-way?
22:08fdobridge: <gfxstrand> DemiMarie 2-way
22:09DemiMarie: gfxstrand: thanks
22:09airlied: DemiMarie: phoronix? they will probably publish articles as things progress
22:09fdobridge: <benjaminl> a bit more context: half of the helpers are integer multiplication functions. I'm thinking it would be nice to have those in the shared `mme_fermi_builder.h` file to use other places, but don't want to do that with the clobbering
22:10fdobridge: <benjaminl> because I'm worried it would be easy to miss that part when using them
22:10DemiMarie: airlied: good idea
22:10DemiMarie: airlied: I looked at `lore.kernel.org` and found zilch
22:10fdobridge: <gfxstrand> Yeah, I think I ran into that too, last I looked at this.
22:10fdobridge: <gfxstrand> There's a reason I didn't merge anything yet. 🙃
22:11fdobridge: <gfxstrand> But I'm happy for you to figure out something sensible.
22:11fdobridge: <benjaminl> oh, you already did the implementation?
22:11fdobridge: <gfxstrand> I started typing and got frustrated and walked away
22:12fdobridge: <benjaminl> ah, yeah that's also reasonable lol
22:12fdobridge: <gfxstrand> nvk/fermi-dispatch-indirect in my repo if there's anything useful in there.
22:12fdobridge: <gfxstrand> Looks like I didn't get all that far
22:12fdobridge: <benjaminl> it made the puzzle-loving part of my brain happy for a bit, but now I'm just like "why are there _yet more problems_"
22:13fdobridge: <gfxstrand> 😹
22:13fdobridge: <benjaminl> dedicated instruction for AND_NOT, but no MULU...
22:40fdobridge: <gfxstrand> MUL is complicated
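
The clobber-versus-scratch trade-off discussed above can be illustrated with a shift-and-add multiply written in plain C. This is only a sketch under assumptions: `mul_u32_clobber` and `mul_u32` are hypothetical names, not functions from `mme_fermi_builder.h`, and C pointers stand in for MME registers. The first helper destroys both inputs, while the wrapper spends two extra temporaries to keep them intact.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the mme_fermi_builder.h API): multiply using only
 * shift, AND and add. Both *a and *b are overwritten, modelling a helper
 * that clobbers its input registers instead of using scratch registers. */
static uint32_t mul_u32_clobber(uint32_t *a, uint32_t *b)
{
    assert(a != b); /* aliased inputs would corrupt the loop state */

    uint32_t acc = 0;
    while (*b != 0) {
        if (*b & 1)
            acc += *a;  /* unsigned add wraps modulo 2^32 */
        *a <<= 1;       /* next power-of-two multiple of the multiplicand */
        *b >>= 1;       /* consume one bit of the multiplier */
    }
    return acc;
}

/* Non-clobbering wrapper: copies the inputs into two temporaries first,
 * which is the extra register (or scratch) cost of preserving them. */
static uint32_t mul_u32(uint32_t a, uint32_t b)
{
    uint32_t ta = a, tb = b;
    return mul_u32_clobber(&ta, &tb);
}
```

Calling `mul_u32(6, 7)` leaves the caller's values untouched, whereas after `mul_u32_clobber(&x, &y)` the multiplier `y` is always zero and `x` has been shifted left, which is the behaviour that is easy to miss if such a helper lives in a shared header.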
23:51fdobridge: <gfxstrand> PSA: `NAK_DEBUG=serial` is no longer required. 🎉
23:53fdobridge: <gfxstrand> _runs with_ `NVK_USE_NAK=fs`