03:02fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Kepler support is "here be dragons" 🐉
03:08fdobridge: <gfxstrand> Because of course there's a CTS test with 16-deep nested loops... 🙄
03:09fdobridge: <gfxstrand> Kepler support requires `NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1` and, well... it's as advertised. 😂
05:54benjaminl: made an MR for a bunch of SM50 stuff here: https://gitlab.freedesktop.org/marysaka/mesa/-/merge_requests/2
05:55benjaminl: I could also split this up into separate MRs if it would be easier
14:07fdobridge: <gfxstrand> benjaminl: Did you notice I fixed the register ranges?
15:05fdobridge: <gfxstrand> I have no idea why but my CTS runs are suddenly going way faster. 🤯
15:13fdobridge: <orowith2os> huh, NAK uses plain Display prints on debug stuff?
15:13fdobridge: <orowith2os> funky
15:14fdobridge: <gfxstrand> Yeah
15:14fdobridge: <gfxstrand> We should maybe use debug instead
15:14fdobridge: <orowith2os> yep
15:14fdobridge: <orowith2os> would give you more info, probably
15:15fdobridge: <orowith2os> maybe
15:15fdobridge: <gfxstrand> Eh... More that it'd save us some typing
15:15fdobridge: <gfxstrand> Because enums debug automatically
15:15fdobridge: <orowith2os> not sure if there's any specific reasoning for `eprintln()` over `println()`
15:15fdobridge: <gfxstrand> Most drivers do that sort of logging to stderr
15:15fdobridge: <orowith2os> oh, i see
15:15fdobridge: <orowith2os> kk
15:16fdobridge: <gfxstrand> And it's nice when you're running like a CTS test or something because you can do `2> debug` and still see the test results in your terminal.
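A minimal sketch of the two points above, assuming a made-up `RegFile` enum (it is not an actual NAK type): `Debug` can be derived, whereas `Display` needs a hand-written arm per variant, and `eprintln!` writes to stderr so `2> debug` separates the logging from stdout.

```rust
use std::fmt;

// Hypothetical stand-in for an IR enum; not an actual NAK type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RegFile {
    Gpr,
    Pred,
    Bar,
}

// Display has to be written by hand, one arm per variant.
// Debug above comes for free from the derive.
impl fmt::Display for RegFile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let s = match self {
            RegFile::Gpr => "GPR",
            RegFile::Pred => "Pred",
            RegFile::Bar => "Bar",
        };
        write!(f, "{}", s)
    }
}

fn main() {
    // stdout: stands in for test results that should stay on the terminal.
    println!("result: pass");
    for file in [RegFile::Gpr, RegFile::Pred, RegFile::Bar] {
        // stderr: debug logging, which `2> debug` sends to a file.
        eprintln!("display: {}  debug: {:?}", file, file);
    }
}
```

Running the binary as `./demo 2> debug` would leave only `result: pass` on the terminal.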
15:16fdobridge: <gfxstrand> Caio had a plan at one point to make a really fancy compiler debug logging thing which I would really like to see happen but it's off in future land somewhere.
15:17fdobridge: <orowith2os> mmm, I *was* gonna look at using `dbg!()` like `dbg!("test: {}", 2)` but that's two separate prints
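A short sketch of why `dbg!("test: {}", 2)` gives two prints: `dbg!` does not take a format string. Each argument is treated as its own expression, printed separately to stderr, and the values come back as a tuple. The `demo` function name is only for illustration.

```rust
// Returns what dbg! hands back when given two arguments: a tuple of both.
fn demo() -> (&'static str, i32) {
    // Prints two separate lines on stderr, one per argument.
    dbg!("test: {}", 2)
}

fn main() {
    let (fmt_str, value) = demo();
    // The "format string" is just a string literal as far as dbg! is concerned.
    assert_eq!(fmt_str, "test: {}");
    assert_eq!(value, 2);

    // For a single formatted line on stderr, eprintln! is the usual tool.
    eprintln!("test: {}", value);

    // With one expression, dbg! returns that expression, so it can wrap
    // a sub-expression in place.
    let doubled = dbg!(value * 2);
    assert_eq!(doubled, 4);
}
```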
15:18fdobridge: <orowith2os> I'm used to hacky-ish printlns and dbgs, `tracing` if I need something nicer
15:18fdobridge: <orowith2os> maybe take a peek at that
15:18fdobridge: <karolherbst🐧🦀> I should be better about Debug in rusticl :ferrisUpsideDown:
15:18fdobridge: <orowith2os> two Mesa contributions then :v
15:18fdobridge: <karolherbst🐧🦀> _apparently_ there is a way to fetch debugging information and that might be another cool project
15:18fdobridge: <karolherbst🐧🦀> but that's like a huge project
15:19fdobridge: <orowith2os> anyways, don't worry about it now, bug me later and I'll get on swapping it over
15:19fdobridge: <orowith2os> :ferris_happy:
15:19fdobridge: <orowith2os> ~~brownie points~~
15:20fdobridge: <gfxstrand> @karolherbst do we have any info on how to spill/fill barriers? It looks like BMOV can move the barrier into a GPR but IDK how to get it back into a barrier
15:21fdobridge: <karolherbst🐧🦀> `B2R` and `R2B`
15:22fdobridge: <karolherbst🐧🦀> I think?
15:22fdobridge: <karolherbst🐧🦀> huh.. maybe it's for the BAR barriers...
15:23fdobridge: <karolherbst🐧🦀> ahh yeah..
15:23fdobridge: <karolherbst🐧🦀> that's for the other barrier
15:23fdobridge: <karolherbst🐧🦀> @gfxstrand BMOV can do both sides
15:23fdobridge: <karolherbst🐧🦀> just different opcode
15:23fdobridge: <karolherbst🐧🦀> I have a patch somewhere...
15:24fdobridge: <karolherbst🐧🦀> @gfxstrand https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/3d9bb5c1b835ef2824cc55fb0d554990c92cf22e
15:24fdobridge: <karolherbst🐧🦀> I hope the opcodes there help
15:29fdobridge: <karolherbst🐧🦀> mhhh
15:29fdobridge: <karolherbst🐧🦀> maybe there was something else?
15:29fdobridge: <karolherbst🐧🦀> nah.. there is an example
15:29fdobridge: <karolherbst🐧🦀> BMOV.32 B0 R0
15:30fdobridge: <karolherbst🐧🦀> the barrier file is just weird
15:30fdobridge: <karolherbst🐧🦀> because you have 16 fixed values
15:30fdobridge: <karolherbst🐧🦀> and 16 general purpose ones
15:30fdobridge: <karolherbst🐧🦀> and it's kinda weird
15:31fdobridge: <gfxstrand> Yeah
15:31fdobridge: <karolherbst🐧🦀> B0 might be a BMOV into 16
15:31fdobridge: <karolherbst🐧🦀> ehh no, it's the other way around
15:31fdobridge: <karolherbst🐧🦀> the fixed ones start at 16
15:31fdobridge: <gfxstrand> Yeah
15:31fdobridge: <karolherbst🐧🦀> anyway, BMOV should be able to do both
15:32fdobridge: <karolherbst🐧🦀> and `B2R`/`R2B` are for the `BAR` barriers
15:33fdobridge: <karolherbst🐧🦀> mhh
15:33fdobridge: <karolherbst🐧🦀> `.CLEAR` is needed if you move from barrier to barrier
15:33fdobridge: <gfxstrand> I'm still very confused by all that
15:33fdobridge: <gfxstrand> Like, do we have any docs on the semantics of any of this?
15:33fdobridge: <karolherbst🐧🦀> mhhhh
15:33fdobridge: <karolherbst🐧🦀> no
15:33fdobridge: <gfxstrand> Of course...
15:34fdobridge: <gfxstrand> I may just have to run some nasty tests through the blob and see what they do
15:34fdobridge: <karolherbst🐧🦀> you need some of it for texgrad lowering though
15:34fdobridge: <karolherbst🐧🦀> `.PQUAD` does a promotion to quad operation, best if you just check what codegen does there
15:34fdobridge: <karolherbst🐧🦀> https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/8eb017a67992d60b9931c9b5e74586defa45749c
15:35fdobridge: <karolherbst🐧🦀> not sure you'll need anything besides that
15:35fdobridge: <karolherbst🐧🦀> I have the names here: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/a8915e92ef0e387abaf54b6be56bba5a7c58d428
15:35fdobridge: <karolherbst🐧🦀> but they are also in codegen
15:36fdobridge: <karolherbst🐧🦀> some of the things I can tell you what they do though
15:37fdobridge: <karolherbst🐧🦀> `MACTIVE` is the mask of active threads at least 😄
15:38fdobridge: <karolherbst🐧🦀> and that `.PQUAD` makes it so, that you have quads everywhere
15:39fdobridge: <karolherbst🐧🦀> so what you need for those quad operation things, is to save the current mask, write it back with `.PQUAD` to force quads, and then restore the mask later
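The save / force-quads / restore sequence described above can be sketched as a toy CPU-side model. Everything here is an assumption for illustration: the 32-lane mask, the function names, and in particular the reading of "promotion to quad" as "any quad with at least one active lane becomes fully active". None of it is documented hardware behaviour or actual codegen/NAK code.

```rust
/// Toy model (assumption): widen a 32-bit active-thread mask so that every
/// quad (4 consecutive lanes) with at least one active lane is fully active.
fn promote_to_quads(mask: u32) -> u32 {
    let mut out = 0u32;
    for quad in 0..8 {
        let quad_bits = 0xFu32 << (quad * 4);
        if mask & quad_bits != 0 {
            out |= quad_bits;
        }
    }
    out
}

/// Runs `body` with whole quads forced on, then puts the old mask back.
fn with_full_quads<T>(active: &mut u32, body: impl FnOnce(u32) -> T) -> T {
    let saved = *active; // 1. save the current mask
    *active = promote_to_quads(saved); // 2. write it back, forcing quads
    let result = body(*active); // 3. do the quad operation
    *active = saved; // 4. restore the original mask
    result
}

fn main() {
    let mut active = 0x0000_0210u32; // lanes 4 and 9 active
    let seen = with_full_quads(&mut active, |mask| mask);
    eprintln!("inside: {:#010x}, after: {:#010x}", seen, active);
}
```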
15:39fdobridge: <karolherbst🐧🦀> but that's the only one I've dealt with
15:39fdobridge: <karolherbst🐧🦀> `MATEXIT` is the exit handler, never used it though
15:40fdobridge: <karolherbst🐧🦀> `ATEXIT` is the only one you can write to with `.64`
15:41fdobridge: <gfxstrand> Yeah, I have all the names too. That doesn't help.
15:42fdobridge: <gfxstrand> I need to know things like "If I do a R2B from non-uniform control-flow, what happens?"
15:42fdobridge: <karolherbst🐧🦀> ahh
15:42fdobridge: <karolherbst🐧🦀> I know
15:43fdobridge: <karolherbst🐧🦀> an arbitrary thread is chosen to do the write
15:43fdobridge: <karolherbst🐧🦀> or rather
15:43fdobridge: <karolherbst🐧🦀> the order is arbitrary
15:43fdobridge: <karolherbst🐧🦀> but each thread executes it
15:44fdobridge: <karolherbst🐧🦀> anyway, gotta eat
15:45fdobridge: <karolherbst🐧🦀> so in case the flow is non uniform, you see whatever was valid at the time the thread read it/wrote to it
15:59fdobridge: <benjaminl> if we mark most of the NAK IR types as `Debug` we can use stuff like `assert_eq` and get nicer errors without very much work
15:59fdobridge: <gfxstrand> Yeah, probably a good idea.
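A small sketch of what deriving `Debug` buys here, using hypothetical `Src` and `Mov` types and a made-up `fold_zero` pass (none of these are the real NAK definitions). `assert_eq!` needs `PartialEq` to compare and `Debug` to print both sides when the comparison fails.

```rust
// Hypothetical IR types for illustration; not the real NAK definitions.
#[derive(Debug, Clone, PartialEq)]
enum Src {
    Zero,
    Imm(u32),
    Reg(u8),
}

#[derive(Debug, Clone, PartialEq)]
struct Mov {
    dst: u8,
    src: Src,
}

// Made-up pass: replace an immediate 0 with the dedicated zero source.
fn fold_zero(op: Mov) -> Mov {
    match op.src {
        Src::Imm(0) => Mov { dst: op.dst, src: Src::Zero },
        _ => op,
    }
}

fn main() {
    let folded = fold_zero(Mov { dst: 3, src: Src::Imm(0) });
    // On failure this prints both values via their Debug impls, e.g.
    // `left: Mov { dst: 3, src: Imm(0) }`, with no hand-written formatting.
    assert_eq!(folded, Mov { dst: 3, src: Src::Zero });

    let kept = fold_zero(Mov { dst: 1, src: Src::Reg(7) });
    assert_eq!(kept.src, Src::Reg(7));
}
```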
16:20fdobridge: <orowith2os> `debug_assert` if you don't want them in release builds btw
16:22fdobridge: <orowith2os> I'd generally prefer those since you can change whether they show in release builds or not
16:23fdobridge: <gfxstrand> Generally, I've been trying to move more sanity check stuff into `debug_assert!()` and reserving `assert!()` for stuff which is actual invariants which will cause the compiler to blow up if invalidated.
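A rough sketch of that split, with a hypothetical `reg_at` helper (not taken from NAK): `assert!` guards the invariant that must hold in every build, while `debug_assert!` carries a sanity check that is compiled out when debug assertions are disabled (rustc's `-C debug-assertions` setting, which is off by default in optimized builds).

```rust
/// Hypothetical helper: pick the register at `idx` from the range
/// starting at `base` with `count` registers.
fn reg_at(base: u8, count: u8, idx: u8) -> u8 {
    // Hard invariant: checked in every build.
    assert!(idx < count, "register index {} out of range {}", idx, count);
    // Sanity check: only evaluated when debug assertions are enabled.
    debug_assert!(base.checked_add(count).is_some(), "register range wraps");
    base + idx
}

fn main() {
    eprintln!("debug assertions enabled: {}", cfg!(debug_assertions));
    assert_eq!(reg_at(4, 4, 2), 6);
}
```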
16:27fdobridge: <benjaminl> how much of a concern is compiler perf?
16:29fdobridge: <orowith2os> the more done at compile-time, the better :)
16:30fdobridge: <orowith2os> I thought `assert!()` is a runtime check, not a compile-time check?
16:30fdobridge: <orowith2os> or does that depend on the arguments
16:33fdobridge: <orowith2os> iouno
16:36fdobridge: <orowith2os> I figure, if something's wrong, you'll end up blowing up the user's computer rather than the compiler
16:36fdobridge: <benjaminl> I think confusion is probably from whether the compiler we're talking about is rustc or NAK
16:37fdobridge: <benjaminl> `assert!()` is (for all practical purposes) a runtime check from the perspective of rustc, but if we're putting `assert!()` in the NAK code it happens at compile-time for shaders on the user's computer
16:38fdobridge: <orowith2os> I see
16:38fdobridge: <orowith2os> nice thing to know :)
16:38fdobridge: <orowith2os> wouldn't you still not want that assert in NAK, for performance reasons?
16:39fdobridge: <orowith2os> or is that not really a worry
16:39fdobridge: <orowith2os> (or something for Later TM)
16:39fdobridge: <benjaminl> yeah, I was asking gfxstrand about that, my naive guess is that it doesn't matter in practice
16:40fdobridge: <benjaminl> because all the asserts that we actually have are very cheap (and some are probably eliminated by rustc)
16:40fdobridge: <orowith2os> mm, mm
16:40fdobridge: <orowith2os> Later, then
16:40fdobridge: <benjaminl> I don't really have much experience with graphics drivers and shader compilers though, it's possible that I don't know the tradeoffs for perf stuff very well
16:40fdobridge: <orowith2os> can see if it actually has enough impact that it's worth changing
16:41fdobridge: <benjaminl> in my own stuff, I usually only switch to `debug_assert!` if it's in a tight inner loop or something, and profiling shows an actual difference
16:41fdobridge: <benjaminl> which is... usually not the case
16:50fdobridge: <orowith2os> I wonder what benchmarking NAK and rusticl would be like, actually
16:50fdobridge: <orowith2os> you can't really `cargo bench`
16:51fdobridge: <orowith2os> sysprof, I guess?
17:15fdobridge: <karolherbst🐧🦀> yeah
17:15fdobridge: <karolherbst🐧🦀> I've used sysprof and hotspot
17:15fdobridge: <karolherbst🐧🦀> sysprof was kinda broken for a long time, but it seems much better now
17:23fdobridge: <orowith2os> mm, I gotta get used to sysprof anyways, considering it's a common occurrence in benchmarking GNOME's stuff like Mutter :v
17:24fdobridge: <karolherbst🐧🦀> ahh yeah, fair
22:46fdobridge: <gfxstrand> I have no idea why but my NAK runs are now suddenly down under an hour. 🤔
22:46fdobridge: <gfxstrand> The only thing I know I changed was not lowering indirect clip distances. That can't possibly be it, though, can it?!?
22:46fdobridge: <gfxstrand> If so, WTF was that slow?!?