03:02 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Kepler support is "here be dragons" 🐉
03:08 fdobridge: <g​fxstrand> Because of course there's a CTS test with 16-deep nested loops... 🙄
03:09 fdobridge: <g​fxstrand> Kepler support requires `NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1` and, well... it's as advertised. 😂
05:54 benjaminl: made an MR for a bunch of SM50 stuff here: https://gitlab.freedesktop.org/marysaka/mesa/-/merge_requests/2
05:55 benjaminl: I could also split this up into separate MRs if it would be easier
14:07 fdobridge: <g​fxstrand> benjaminl: Did you notice I fixed the register ranges?
15:05 fdobridge: <g​fxstrand> I have no idea why but my CTS runs are suddenly going way faster. 🤯
15:13 fdobridge: <o​rowith2os> huh, NAK uses plain Display prints on debug stuff?
15:13 fdobridge: <o​rowith2os> funky
15:14 fdobridge: <g​fxstrand> Yeah
15:14 fdobridge: <g​fxstrand> We should maybe use debug instead
15:14 fdobridge: <o​rowith2os> yep
15:14 fdobridge: <o​rowith2os> would give you more info, probably
15:15 fdobridge: <o​rowith2os> maybe
15:15 fdobridge: <g​fxstrand> Eh... More that it'd save us some typing
15:15 fdobridge: <g​fxstrand> Because enums debug automatically
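A minimal sketch of the "enums debug automatically" point above. `RegFile` is a hypothetical stand-in, not an actual NAK type: `Debug` comes from a one-line derive, while `Display` needs a hand-written arm for every variant.

```rust
use std::fmt;

// Hypothetical stand-in for a compiler IR enum; not an actual NAK type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegFile {
    Gpr,
    Pred,
    Bar,
}

// Display has to be written by hand, one arm per variant.
impl fmt::Display for RegFile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            RegFile::Gpr => "r",
            RegFile::Pred => "p",
            RegFile::Bar => "b",
        };
        write!(f, "{}", name)
    }
}

// Returns the (Display, Debug) renderings of the same value.
fn describe(file: RegFile) -> (String, String) {
    (format!("{}", file), format!("{:?}", file))
}
```

Adding a variant to the enum updates the `{:?}` output for free, whereas the `Display` impl stops compiling until a new arm is added.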
15:15 fdobridge: <o​rowith2os> not sure if there's any specific reasoning for `eprintln!()` over `println!()`
15:15 fdobridge: <g​fxstrand> Most drivers do that sort of logging to stderr
15:15 fdobridge: <o​rowith2os> oh, i see
15:15 fdobridge: <o​rowith2os> kk
15:16 fdobridge: <g​fxstrand> And it's nice when you're running like a CTS test or something because you can do `2> debug` and still see the test results in your terminal.
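A small sketch of the stdout/stderr split described above. The function name and the messages are made up for illustration. Passing the two streams in as parameters keeps the example testable; in a real program they would be `io::stdout()` and `io::stderr()`, which is what `println!` and `eprintln!` write to.

```rust
use std::io::{self, Write};

// Hypothetical helper: results go to one stream, debug chatter to another.
fn report(results: &mut impl Write, log: &mut impl Write) -> io::Result<()> {
    // Debug output, the kind of thing eprintln! would send to stderr.
    writeln!(log, "debug: lowering pass ran")?;
    // The actual result, the kind of thing println! would send to stdout.
    writeln!(results, "test: Pass")?;
    Ok(())
}
```

Called as `report(&mut io::stdout(), &mut io::stderr())`, running the binary with `2> debug` sends only the debug line to the file and leaves the result line in the terminal.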
15:16 fdobridge: <g​fxstrand> Caio had a plan at one point to make a really fancy compiler debug logging thing which I would really like to see happen but it's off in future land somewhere.
15:17 fdobridge: <o​rowith2os> mmm, I *was* gonna look at using `dbg!()` like `dbg!("test: {}", 2)` but that's two separate prints (edited)
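For reference, a short sketch of why `dbg!("test: {}", 2)` yields two prints: `dbg!` does not interpret format strings. It treats each argument as a separate expression, prints each one to stderr, and returns them as a tuple. The function names below are only for illustration.

```rust
// dbg! with several arguments prints each one separately and returns a tuple.
fn label_and_value() -> (&'static str, i32) {
    dbg!("test: {}", 2)
}

// With a single argument, dbg! prints it and passes the value through,
// so it can wrap an expression in place.
fn traced_sum(a: u32, b: u32) -> u32 {
    let (x, y) = dbg!(a, b);
    dbg!(x + y)
}
```

To get a single formatted line on stderr, `eprintln!("test: {}", 2)` is the usual choice.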
15:18 fdobridge: <o​rowith2os> I'm used to hacky-ish printlns and dbgs, `tracing` if I need something nicer (edited)
15:18 fdobridge: <o​rowith2os> maybe take a peek at that
15:18 fdobridge: <k​arolherbst🐧🦀> I should be better about Debug in rusticl :ferrisUpsideDown:
15:18 fdobridge: <o​rowith2os> two Mesa contributions then :v
15:18 fdobridge: <k​arolherbst🐧🦀> _apparently_ there is a way to fetch debugging information and that might be another cool project
15:18 fdobridge: <k​arolherbst🐧🦀> but that's like a huge project
15:19 fdobridge: <o​rowith2os> anyways, don't worry about it now, bug me later and I'll get on swapping it over
15:19 fdobridge: <o​rowith2os> :ferris_happy:
15:19 fdobridge: <o​rowith2os> ~~brownie points~~
15:20 fdobridge: <g​fxstrand> @karolherbst do we have any info on how to spill/fill barriers? It looks like BMOV can move the barrier into a GPR but IDK how to get it back into a barrier
15:21 fdobridge: <k​arolherbst🐧🦀> `B2R` and `R2B`
15:22 fdobridge: <k​arolherbst🐧🦀> I think?
15:22 fdobridge: <k​arolherbst🐧🦀> huh.. maybe it's for the BAR barriers...
15:23 fdobridge: <k​arolherbst🐧🦀> ahh yeah..
15:23 fdobridge: <k​arolherbst🐧🦀> that's for the other barrier
15:23 fdobridge: <k​arolherbst🐧🦀> @gfxstrand BMOV can do both sides
15:23 fdobridge: <k​arolherbst🐧🦀> just different opcode
15:23 fdobridge: <k​arolherbst🐧🦀> I have a patch somewhere...
15:24 fdobridge: <k​arolherbst🐧🦀> @gfxstrand https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/3d9bb5c1b835ef2824cc55fb0d554990c92cf22e
15:24 fdobridge: <k​arolherbst🐧🦀> I hope the opcodes there help
15:29 fdobridge: <k​arolherbst🐧🦀> mhhh
15:29 fdobridge: <k​arolherbst🐧🦀> maybe there was something else?
15:29 fdobridge: <k​arolherbst🐧🦀> nah.. there is an example
15:29 fdobridge: <k​arolherbst🐧🦀> BMOV.32 B0 R0
15:30 fdobridge: <k​arolherbst🐧🦀> the barrier file is just weird
15:30 fdobridge: <k​arolherbst🐧🦀> because you have 16 fixed values
15:30 fdobridge: <k​arolherbst🐧🦀> and 16 general purpose ones
15:30 fdobridge: <k​arolherbst🐧🦀> and it's kinda weird
15:31 fdobridge: <g​fxstrand> Yeah
15:31 fdobridge: <k​arolherbst🐧🦀> B0 might be a BMOV into 16
15:31 fdobridge: <k​arolherbst🐧🦀> ehh no, it's the other way around
15:31 fdobridge: <k​arolherbst🐧🦀> the fixed ones start at 16
15:31 fdobridge: <g​fxstrand> Yeah
15:31 fdobridge: <k​arolherbst🐧🦀> anyway, BMOV should be able to do both
15:32 fdobridge: <k​arolherbst🐧🦀> and `B2R`/`R2B` are for the `BAR` barriers
15:33 fdobridge: <k​arolherbst🐧🦀> mhh
15:33 fdobridge: <k​arolherbst🐧🦀> `.CLEAR` is needed if you move from barrier to barrier
15:33 fdobridge: <g​fxstrand> I'm still very confused by all that
15:33 fdobridge: <g​fxstrand> Like, do we have any docs on the semantics of any of this?
15:33 fdobridge: <k​arolherbst🐧🦀> mhhhh
15:33 fdobridge: <k​arolherbst🐧🦀> no
15:33 fdobridge: <g​fxstrand> Of course...
15:34 fdobridge: <g​fxstrand> I may just have to run some nasty tests through the blob and see what they do
15:34 fdobridge: <k​arolherbst🐧🦀> you need some of it for texgrad lowering though
15:34 fdobridge: <k​arolherbst🐧🦀> `.PQUAD` does a promotion to quad operation, best if you just check what codegen does there
15:34 fdobridge: <k​arolherbst🐧🦀> https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/8eb017a67992d60b9931c9b5e74586defa45749c
15:35 fdobridge: <k​arolherbst🐧🦀> not sure you'll need anything besides that
15:35 fdobridge: <k​arolherbst🐧🦀> I have the names here: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/a8915e92ef0e387abaf54b6be56bba5a7c58d428
15:35 fdobridge: <k​arolherbst🐧🦀> but they are also in codegen
15:36 fdobridge: <k​arolherbst🐧🦀> some of the things I can tell you what they do though
15:37 fdobridge: <k​arolherbst🐧🦀> `MACTIVE` is the mask of active threads at least 😄
15:38 fdobridge: <k​arolherbst🐧🦀> and that `.PQUAD` makes it so, that you have quads everywhere
15:39 fdobridge: <k​arolherbst🐧🦀> so what you need for those quad operation things, is to save the current mask, write it back with `.PQUAD` to force quads, and then restore the mask later
15:39 fdobridge: <k​arolherbst🐧🦀> but that's the only one I've dealt with
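The save / force quads / restore sequence described above can be modelled on a plain bitmask. This is only a software model under stated assumptions, not hardware semantics and not codegen's or NAK's implementation: it assumes a 32-lane warp, assumes a quad is four adjacent lanes, and the function names are hypothetical.

```rust
// Model only: widen an active-lane mask so that every quad (assumed to be
// 4 adjacent lanes) with at least one active lane becomes fully active.
fn promote_to_quads(active: u32) -> u32 {
    let mut out = 0u32;
    for quad in 0..8 {
        let quad_mask = 0xFu32 << (quad * 4);
        if active & quad_mask != 0 {
            out |= quad_mask;
        }
    }
    out
}

// Save the current mask, run `body` with whole quads forced on,
// then restore the saved mask.
fn with_full_quads<T>(active: &mut u32, body: impl FnOnce(u32) -> T) -> T {
    let saved = *active; // save the current mask
    *active = promote_to_quads(saved); // write it back with quads forced
    let result = body(*active);
    *active = saved; // restore the mask later
    result
}
```

For example, a mask with only lanes 2 and 8 active (`0x104`) is widened to lanes 0-3 and 8-11 (`0xF0F`) for the duration of `body`.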
15:39 fdobridge: <k​arolherbst🐧🦀> `MATEXIT` is the exit handler, never used it though
15:40 fdobridge: <k​arolherbst🐧🦀> `ATEXIT` is the only one you can write to with `.64`
15:41 fdobridge: <g​fxstrand> Yeah, I have all the names too. That doesn't help.
15:42 fdobridge: <g​fxstrand> I need to know things like "If I do a R2B from non-uniform control-flow, what happens?"
15:42 fdobridge: <k​arolherbst🐧🦀> ahh
15:42 fdobridge: <k​arolherbst🐧🦀> I know
15:43 fdobridge: <k​arolherbst🐧🦀> an arbitrary thread is chosen to do the write
15:43 fdobridge: <k​arolherbst🐧🦀> or rather
15:43 fdobridge: <k​arolherbst🐧🦀> the order is arbitrary
15:43 fdobridge: <k​arolherbst🐧🦀> but each thread executes it
15:44 fdobridge: <k​arolherbst🐧🦀> anyway, gotta eat
15:45 fdobridge: <k​arolherbst🐧🦀> so in case the flow is non uniform, you see whatever was valid at the time the thread read it/wrote to it
15:59 fdobridge: <b​enjaminl> if we mark most of the NAK IR types as `Debug` we can use stuff like `assert_eq` and get nicer errors without very much work
15:59 fdobridge: <g​fxstrand> Yeah, probably a good idea.
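A sketch of the `assert_eq` point above. `Src` and `fold_add` are hypothetical and far simpler than anything in NAK. `assert_eq!` needs `PartialEq` to compare the two sides and `Debug` to print them on failure, so deriving both is all the setup required.

```rust
// Hypothetical IR operand type for illustration; not NAK's actual definition.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Src {
    Zero,
    Imm(u32),
    Reg(u8),
}

// Toy folder: handles x + 0 and immediate + immediate, gives up otherwise.
fn fold_add(a: &Src, b: &Src) -> Option<Src> {
    match (a, b) {
        (Src::Zero, other) | (other, Src::Zero) => Some(other.clone()),
        (Src::Imm(x), Src::Imm(y)) => Some(Src::Imm(x.wrapping_add(*y))),
        _ => None,
    }
}
```

A check such as `assert_eq!(fold_add(&Src::Imm(1), &Src::Imm(2)), Some(Src::Imm(3)))` then reports both sides in their `Debug` form if it ever fails, with no hand-written message.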
16:20 fdobridge: <o​rowith2os> `debug_assert` if you don't want them in release builds btw
16:22 fdobridge: <o​rowith2os> I'd generally prefer those since you can change whether they show in release builds or not
16:23 fdobridge: <g​fxstrand> Generally, I've been trying to move more sanity check stuff into `debug_assert!()` and reserving `assert!()` for stuff which is actual invariants which will cause the compiler to blow up if invalidated.
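A minimal sketch of that split, using a made-up register-tracking helper rather than real NAK code. `assert!` is always compiled in; `debug_assert!` is compiled in only when `debug_assertions` is enabled, which is the default for debug builds and can be switched on for release builds through the build profile.

```rust
// Hypothetical helper for illustration only.
fn mark_allocated(used: &mut [bool], idx: usize) {
    // Hard invariant: checked in every build.
    assert!(idx < used.len(), "register index {} out of range", idx);
    // Sanity check: skipped when debug assertions are disabled.
    debug_assert!(!used[idx], "register {} allocated twice", idx);
    used[idx] = true;
}

// Reports whether debug_assert! checks are active in this build.
fn sanity_checks_enabled() -> bool {
    cfg!(debug_assertions)
}
```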
16:27 fdobridge: <b​enjaminl> how much of a concern is compiler perf?
16:29 fdobridge: <o​rowith2os> the more done at compile-time, the better :)
16:30 fdobridge: <o​rowith2os> I thought `assert!()` is a runtime check, not a compile-time check?
16:30 fdobridge: <o​rowith2os> or does that depend on the arguments
16:33 fdobridge: <o​rowith2os> iouno
16:36 fdobridge: <o​rowith2os> I figure, if something's wrong, you'll end up blowing up the user's computer rather than the compiler
16:36 fdobridge: <b​enjaminl> I think confusion is probably from whether the compiler we're talking about is rustc or NAK
16:37 fdobridge: <b​enjaminl> `assert!()` is (for all practical purposes) a runtime check from the perspective of rustc, but if we're putting `assert!()` in the NAK code it happens at compile-time for shaders on the user's computer
16:38 fdobridge: <o​rowith2os> I see
16:38 fdobridge: <o​rowith2os> nice thing to know :)
16:38 fdobridge: <o​rowith2os> wouldn't you still not want that assert in NAK, for performance reasons?
16:39 fdobridge: <o​rowith2os> or is that not really a worry
16:39 fdobridge: <o​rowith2os> (or something for Later TM)
16:39 fdobridge: <b​enjaminl> yeah, I was asking gfxstrand about that, my naive guess is that it doesn't matter in practice (edited)
16:40 fdobridge: <b​enjaminl> because all the asserts that we actually have are very cheap (and some are probably eliminated by rustc)
16:40 fdobridge: <o​rowith2os> mm, mm
16:40 fdobridge: <o​rowith2os> Later, then
16:40 fdobridge: <b​enjaminl> I don't really have much experience with graphics drivers and shader compilers though, it's possible that I don't know the tradeoffs for perf stuff very well
16:40 fdobridge: <o​rowith2os> can see if it actually has any impact that it's worth changing
16:41 fdobridge: <b​enjaminl> in my own stuff, I usually only switch to `debug_assert!` if it's in a tight inner loop or something, and profiling shows an actual difference
16:41 fdobridge: <b​enjaminl> which is... usually not the case
16:50 fdobridge: <o​rowith2os> I wonder what benchmarking NAK and rusticl would be like, actually
16:50 fdobridge: <o​rowith2os> you can't really `cargo bench`
16:51 fdobridge: <o​rowith2os> sysprof, I guess?
17:15 fdobridge: <k​arolherbst🐧🦀> yeah
17:15 fdobridge: <k​arolherbst🐧🦀> I've used sysprof and hotspot
17:15 fdobridge: <k​arolherbst🐧🦀> sysprof was kinda broken for a long time, but it seems much better now
17:23 fdobridge: <o​rowith2os> mm, I gotta get used to sysprof anyways, considering it's a common occurrence in benchmarking GNOME's stuff like Mutter :v
17:24 fdobridge: <k​arolherbst🐧🦀> ahh yeah, fair
22:46 fdobridge: <g​fxstrand> I have no idea why but my NAK runs are now suddenly down under an hour. 🤔
22:46 fdobridge: <g​fxstrand> The only thing I know I changed was not lowering indirect clip distances. That can't possibly be it, though, can it?!?
22:46 fdobridge: <g​fxstrand> If so, WTF was that slow?!?