01:05fdobridge: <Joshie with Max-Q Design> I have a friend whose middle name is literally danger
07:47fdobridge: <ahuillet> what does NIL stand for?
07:50fdobridge: <karolherbst🐧🦀> probably nouveau/nvidia image layout. It's the same type of library as ISL, which stands for Intel Surface Layout
07:52fdobridge: <karolherbst🐧🦀> anyway, it deals with all the tiling and image/sampler header stuff
09:38fdobridge: <ahuillet> more TLAs for me to learn!
09:41fdobridge: <ahuillet> so what's CSV?
09:41fdobridge: <redsheep> There will always be more top level acceleration structures /j
09:41fdobridge: <ahuillet> j/k
10:35fdobridge: <redsheep> Can confirm that emptying out DEBUG_RUSTFLAGS="-C debuginfo=2" also results in a working build when debug is on
10:38fdobridge: <redsheep> So, it's specifically rust that's the issue, which probably confirms where it said a compute shader triggered the crash. NAK built with those rust flags is breaking.
10:42fdobridge: <redsheep> Oh I realize now that's actually only one flag.
10:57fdobridge: <redsheep> It's also specifically debuginfo set to 2, setting 1 or 0 works fine.
11:19fdobridge: <redsheep> Yeah, I don't know how I would get a full stack trace here, but I can at least get the addr2line for my latest build with rust debug 2, and disabled stripping and lto:
11:19fdobridge: <redsheep>
11:19fdobridge: <redsheep> ```addr2line -e libvulkan_nouveau.so -a 0x4a9c9b
11:19fdobridge: <redsheep> 0x00000000004a9c9b
11:19fdobridge: <redsheep> /usr/src/debug/rust/rustc-1.77.0-src/library/core/src/num/mod.rs:1181```
11:19fdobridge: <redsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1223229956713091083/steam-782330.log?ex=661918ac&is=6606a3ac&hm=7a2d1216a4cef6e95c8d3bf5d59df7da12b7248999f4556b7db5cdab2e314e69&
11:21fdobridge: <redsheep> For good measure here is the .so in question, and I have confirmed this is what is getting loaded.
11:21fdobridge: <redsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1223230543433306124/libvulkan_nouveau.so?ex=66191938&is=6606a438&hm=a28e006e7da21d7913a8bcca7e619ea64ec497388bb07a87b2f97656c4477bde&
11:37fdobridge: <ahuillet> what does NAK stand for? :)
11:37fdobridge: <Sid> Nouveau/Nvidia Awesome Kompiler
11:37fdobridge: <redsheep> @gfxstrand Since this is a NAK and rust issue I assume you will want to be aware of the above. At least on EndeavourOS they're defaulting to having rust stuff build with -C debuginfo=2 in the arguments, and that seems to break NAK being able to deal with compute shaders, or at least one of the ones in Doom Eternal. Maybe worth spinning up a build like that and checking that compute shaders work?
11:37fdobridge: <Sid> it's the compiler NVK uses
11:38fdobridge: <Sid> ok it's Nvidia Awesome Kompiler, not nouveau
11:38fdobridge: <karolherbst🐧🦀> have you tried to compile with a toolchain installed via rustup?
11:39fdobridge: <!DodoNVK (she) 🇱🇹> Nouveau Advanced Kompiler
11:39fdobridge: <Sid> oh it changed?
11:39fdobridge: <Sid> when? where? :o
11:39fdobridge: <redsheep> Oh I thought it was Awesome as well
11:42fdobridge: <redsheep> No, I am not really sure how all this works yet. Honestly I am not even sure *what* is receiving problematic flag, I just know something is
11:44fdobridge: <mohamexiety> proprietary wouldn't dream of having one that cool 😎
11:46fdobridge: <karolherbst🐧🦀> I'd just test with rustup
11:46fdobridge: <karolherbst🐧🦀> if that works, then it's a distribution bug
11:46fdobridge: <karolherbst🐧🦀> and you file the bug and move on
11:54fdobridge: <redsheep> I will try and see if I can figure out using rustup this weekend, guess my mesa builds will get more complicated once again lol...
11:54fdobridge: <Sid> it's not hard
11:54fdobridge: <Sid> uninstall rust/rustc from the package manager, install rustup
11:55fdobridge: <Sid> then run `rustup default stable`
11:55fdobridge: <Sid> and compile as you normally would
12:02fdobridge: <redsheep> I really need to figure out replicating this with my dev build instead of using the PKGBUILD so I can remove pacman from the equation. If it happens there then I don't think it's a distro bug, it would be a rust bug that debug stuff breaks things, or it would be a NAK bug where having debug exposes an issue that is otherwise hiding.
12:03fdobridge: <redsheep> I will get to the bottom of this later
13:08fdobridge: <triang3l> every modern Mesa driver technically has "SFN" btw
13:08fdobridge: <Sid> Source FilmNaker?
13:08fdobridge: <Sid> Society For Neuroscience?
13:08fdobridge: <Sid> wait
13:08fdobridge: <Sid> wait
13:08fdobridge: <Sid> Single Frequency Network :D
13:09fdobridge: <triang3l> That one SFN really emits networks of instructions, but at different frequencies
13:10fdobridge: <triang3l> The compiler in the R600 driver is called "shader from NIR"
13:10fdobridge: <Sid> ah
13:14fdobridge: <gfxstrand> Yeah everyone has a weird acronym for something or other.
13:15fdobridge: <redsheep> I love that TLAs are so overloaded that "TLAs" itself has gotten overloaded
13:15fdobridge: <gfxstrand> Thanks for the heads up. I missed some of the chatter yesterday because by the time I got up there were like 300 messages and I declared bankruptcy
13:16fdobridge: <gfxstrand> Meh. It can go either way.
13:16fdobridge:<Sid> takes notes
13:32fdobridge: <gfxstrand> NVK almost got named NouVulkan but I think it's just NVK with no meaning beyond what people project onto it.
13:33fdobridge: <!DodoNVK (she) 🇱🇹> It's Nouveau VulKan :triangle_nvk:
13:33fdobridge: <ahuillet> sorry about the role I played in that - feels like it's worth debugging if the toolchain can produce that sort of issue (edited)
13:34fdobridge: <gfxstrand> You're fine. I should read back through the chatter today if I can.
13:34fdobridge: <gfxstrand> But also the new control flow pass has some bugs so it might just be those.
13:34fdobridge: <gfxstrand> I fixed one last night but there's another
13:37fdobridge: <ahuillet> https://github.com/rust-lang/rust/blob/master/library/core/src/num/mod.rs#L1181
13:37fdobridge: <ahuillet> does that.. make any sense?
13:39fdobridge: <karolherbst🐧🦀> mhh yeah... so the line resolver and macros are kinda painful
13:40fdobridge: <ahuillet> what does the stuff I linked even mean anyway, doesn't look like code?
13:41fdobridge: <karolherbst🐧🦀> rust macros can match any pattern
13:41fdobridge: <karolherbst🐧🦀> so you can put literally anything into it and the macro parses it out
13:41fdobridge: <karolherbst🐧🦀> well
13:41fdobridge: <karolherbst🐧🦀> according to the macro that is
13:41fdobridge: <ahuillet> this is a macro, and its name is "u64"?
13:42fdobridge: <karolherbst🐧🦀> `uint_impl!` is the macro invocation
13:43fdobridge: <ahuillet> what's a "syscall" in Rust, is it what I know as a syscall or did they repurpose the word to mean something else?
13:43fdobridge: <karolherbst🐧🦀> I don't think they repurposed the term for anything
13:44fdobridge: <ahuillet> ok, I'm confused because the log mentions a syscall stack overflow
13:44fdobridge: <ahuillet> the fault address looks like a hash function
13:45fdobridge: <ahuillet> conceivably at a point where there may be an actual stack overflow
13:45fdobridge: <ahuillet> @redsheep what's your ulimit -s?
13:45fdobridge: <karolherbst🐧🦀> if nothing makes sense I usually throw libasan at the problem, but that's kinda a bit annoying with wine as well
13:47fdobridge: <ahuillet> I don't understand the .rs, but the fault address in the given binary is consistent with a stack overflow, the segfault appears upon writing at the top of the stack right after a sub rsp, 1B0h
13:47fdobridge: <ahuillet> so maybe something in debug builds eats a lot of stack in which case it's arguably not even a bug
13:47fdobridge: <karolherbst🐧🦀> could be
13:47fdobridge: <karolherbst🐧🦀> but rust in debug builds also enables some protections
13:47fdobridge: <karolherbst🐧🦀> like overflow detection
13:48fdobridge: <karolherbst🐧🦀> but that should just print an error or so
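The overflow detection mentioned here is Rust's integer overflow checking. With overflow checks enabled (the default for debug builds), a plain `a + b` that overflows panics with an "attempt to add with overflow" message rather than faulting. A minimal sketch of the explicit alternatives, which behave the same in every build profile; the helper names `add_checked` and `add_wrapping` are made up for illustration:

```rust
/// Returns `None` instead of panicking or wrapping when the sum does not fit.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Always wraps modulo 256, regardless of whether overflow checks are on.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}
```

Neither variant touches the stack in any unusual way, which is consistent with the point that overflow checks alone would print an error, not segfault.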
13:48fdobridge: <ahuillet> I don't think that's consistent with what I see
13:48fdobridge: <ahuillet> https://pastebin.com/Ppu46Mbc exception happens on the first MOV there
13:49fdobridge: <nanokatze> syscall in wine means call to unix iirc
13:49fdobridge: <ahuillet> function entrypoint. feels like an actual stack overflow. I'd check ulimit -s and bump it to something huge first
13:49fdobridge: <nanokatze> there's like 2 mechanisms or something, one slower than the other
13:49fdobridge: <karolherbst🐧🦀> using a debugger with wine is also always pain :ferrisUpsideDown: but something one can actually make work with a "little" script
13:49fdobridge: <nanokatze> and the overflow might be because the stacks in some cases are really tiny, in the case of the faster mechanism
13:49fdobridge: <ahuillet> arguably my hypothesis doesn't cover the "syscall" part, but maybe this is happening as part of a syscall call stack, we don't have a call stack so can't be sure
13:50fdobridge: <karolherbst🐧🦀> https://gist.github.com/karolherbst/4bee8680a323863a875d7431bba05747
13:50fdobridge: <ahuillet> right. wouldn't hurt to attach GDB and catch this live, @redsheep
13:50fdobridge: <karolherbst🐧🦀> just needs some magic handling like the script I have to load debug symbols
13:51fdobridge: <karolherbst🐧🦀> or well..
13:51fdobridge: <karolherbst🐧🦀> get useful stacks at all
13:51fdobridge: <karolherbst🐧🦀> but yeah.. tracking stack overflows is a bit of a pain :ferrisUpsideDown:
13:52fdobridge: <redsheep> 8192
13:52fdobridge: <ahuillet> is that @prop_energy_ball 's script or a variant thereof?
13:52fdobridge: <karolherbst🐧🦀> possible
13:52fdobridge: <ahuillet> @redsheep meh, seems not so bad, but try 1048576
13:53fdobridge: <ahuillet> and either way, good to catch the crash in gdb using the script above.
13:53fdobridge: <nanokatze> @ ahuillet pretty sure there was a similar issue once in the past already, where a nir pass recursed very deeply and was running out of wine syscall/unix stack
13:53fdobridge: <!DodoNVK (she) 🇱🇹> (C) 2020 Remi Bernon (OG Whacker): https://gist.github.com/rbernon/cdbdc1b0e892f91e7449fcf3dda80bb7
13:54fdobridge: <nanokatze> so I suppose you could search in LGD or poke people in there that were dealing with it
13:54fdobridge: <ahuillet> also, if anybody feels like explaining to this dude <- what the Rust stuff linked does and how it does it, I'm curious, because the ASM I see is definitely some sort of hashing thing
13:54fdobridge: <ahuillet> @nanokatze right, but was it recursing deeply /legitimately/ or not?
13:54fdobridge: <karolherbst🐧🦀> it's just a macro to roll out a lot of impls for integer types
13:54fdobridge: <nanokatze> yes, legitimately
13:54fdobridge: <karolherbst🐧🦀> because they are structurally the same, but operate on different types
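To make the "macro that rolls out impls for integer types" idea concrete, here is a small sketch loosely modelled on the `Self = ..., BITS = ...` invocation style. The trait `Describe` and the macro `my_uint_impl!` are hypothetical names invented for this example; they are not what `core` defines.

```rust
/// Hypothetical trait standing in for the many inherent methods that the
/// real macro generates for each integer type.
trait Describe {
    const BITS: u32;
    fn half_max() -> Self;
}

/// The pattern matches literal tokens (`Self`, `=`, `BITS`) plus fragments,
/// so the invocation does not have to look like ordinary Rust code.
macro_rules! my_uint_impl {
    (Self = $ty:ty, BITS = $bits:expr $(,)?) => {
        impl Describe for $ty {
            const BITS: u32 = $bits;
            fn half_max() -> Self {
                <$ty>::MAX / 2
            }
        }
    };
}

// One invocation per type: same structure, different concrete type.
my_uint_impl! { Self = u8, BITS = 8 }
my_uint_impl! { Self = u64, BITS = 64 }
```

Because every generated function originates from a macro expansion, a line resolver may attribute addresses to the invocation site, which is probably why `addr2line` pointed at a line that "doesn't look like code".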
13:54fdobridge: <nanokatze> it was running out of stack because syscall stack was too short
13:54fdobridge: <ahuillet> if Rust when building with debug symbols uses huge stacks then it's maybe a Rust bug that it uses so much
13:55fdobridge: <karolherbst🐧🦀> @ahuillet it's to get all those methods: https://doc.rust-lang.org/std/primitive.u64.html
13:55fdobridge: <nanokatze> hmm, right, perhaps
13:55fdobridge: <nanokatze> maybe there are other changes than just debug symbols
13:55fdobridge: <karolherbst🐧🦀> and if you click "source" you get the same macro invocation 😄
13:55fdobridge: <redsheep> How do I actually set the limit instead of just reading it?
13:55fdobridge: <karolherbst🐧🦀> the actual macro is somewhere, but I never bothered checking
13:55fdobridge: <ahuillet> is there hash stuff in there? looks like there might be
13:55fdobridge: <karolherbst🐧🦀> yes
13:56fdobridge: <ahuillet> <core__hash__sip__Sip13Rounds as core__hash__sip__Sip>::d_rounds::h5887e6c8058ff508
13:56fdobridge: <ahuillet> that's the name I see, also f*ck discord
13:56fdobridge: <karolherbst🐧🦀> use ` ` `
13:56fdobridge: <karolherbst🐧🦀> use ` \` ` (edited)
13:56fdobridge: <karolherbst🐧🦀> mhh
13:56fdobridge: <karolherbst🐧🦀> '`'
13:57fdobridge: <karolherbst🐧🦀> 😄
13:57fdobridge: <nanokatze> \`
13:57fdobridge: <nanokatze> ur welcome
13:57fdobridge: <!DodoNVK (she) 🇱🇹> Also known as `uint64_t` in C
13:57fdobridge: <karolherbst🐧🦀> well
13:57fdobridge: <ahuillet> `<core__hash__sip__Sip13Rounds as core__hash__sip__Sip>::d_rounds::h5887e6c8058ff508`
13:57fdobridge: <karolherbst🐧🦀> ahh
13:57fdobridge: <karolherbst🐧🦀> yeah
13:57fdobridge: <karolherbst🐧🦀> sounds like hashing
13:57fdobridge: <ahuillet> does that connect with what you linked in any way?
13:57fdobridge: <karolherbst🐧🦀> yeah
13:57fdobridge: <ahuillet> yeah it's definitely a hash thing. xor and rol, no need to look deeper :)
13:58fdobridge: <karolherbst🐧🦀> you have a hasher you provide an array of data to
13:58fdobridge: <ahuillet> if the XOR density is greater than really not very much, it's a hash or crypto thing.
13:58fdobridge: <!DodoNVK (she) 🇱🇹> ~~Roblox On Linux instruction (real)~~
13:58fdobridge: <karolherbst🐧🦀> and a `hash` impl takes the hasher and passes it through
13:58fdobridge: <ahuillet> you're just trolling me now
13:58fdobridge: <ahuillet> this. does what you said?
13:58fdobridge: <karolherbst🐧🦀> e.g. Vec's `hash` looks like this: https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#2754
13:59fdobridge: <karolherbst🐧🦀> ehh wait
13:59fdobridge: <karolherbst🐧🦀> that just calls slice::hash
13:59fdobridge: <karolherbst🐧🦀> https://doc.rust-lang.org/src/core/hash/mod.rs.html#929
13:59fdobridge: <karolherbst🐧🦀> anyway
13:59fdobridge: <karolherbst🐧🦀> it's somewhere deeper
13:59fdobridge: <karolherbst🐧🦀> doesn't matter
14:00fdobridge: <ahuillet> love the random looking name at the end btw, was there a design challenge in Rust to make things as hard to debug as possible or that just cherry on the cake? 🌶️
14:01fdobridge: <karolherbst🐧🦀> ohh.. I found where `uint_impl` is
14:01fdobridge: <nanokatze> yes, and there's nothing there
14:01fdobridge: <karolherbst🐧🦀> well.. it debugs fine in a debugger 😛
14:01fdobridge: <ahuillet> ah cool maybe we can see the actual hashing code see if it matches the ASM?
14:01fdobridge: <redsheep> Ok ulimit -s lets you set a number after that argument and doesn't say as much in the help, stupid thing
14:01fdobridge: <ahuillet> series of rol/xor
14:02fdobridge: <ahuillet> @redsheep yeah or "unlimited" as a special number but let's maybe not do that
14:02fdobridge: <nanokatze> <https://github.com/rust-lang/rust/blob/45796d1c24445b298567752519471cef2cff3298/library/core/src/hash/sip.rs>
14:02fdobridge: <ahuillet> changing may or may not work depending on /etc/security/limits.conf
14:02fdobridge: <ahuillet> https://github.com/rust-lang/rust/blob/45796d1c24445b298567752519471cef2cff3298/library/core/src/hash/sip.rs#L75
14:02fdobridge: <nanokatze> the rol/xor/whatever you're looking at are probably a part of some hasher impl, after lots of inlining too probably (edited)
14:02fdobridge: <ahuillet> ASM matches that
14:03fdobridge: <ahuillet> unrolled 3 times
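The rol/xor pattern being matched against the disassembly is the SipHash round. Below is a sketch of one round as given in the published SipHash reference algorithm, with the finalization step of the 1-3 variant applying it three times. The function names and the array layout are illustrative and are not copied from `core`.

```rust
/// One SipRound over the four 64-bit state words: adds, rotates and xors only.
fn sip_round(v: [u64; 4]) -> [u64; 4] {
    let [mut v0, mut v1, mut v2, mut v3] = v;

    v0 = v0.wrapping_add(v1);
    v1 = v1.rotate_left(13);
    v1 ^= v0;
    v0 = v0.rotate_left(32);

    v2 = v2.wrapping_add(v3);
    v3 = v3.rotate_left(16);
    v3 ^= v2;

    v0 = v0.wrapping_add(v3);
    v3 = v3.rotate_left(21);
    v3 ^= v0;

    v2 = v2.wrapping_add(v1);
    v1 = v1.rotate_left(17);
    v1 ^= v2;
    v2 = v2.rotate_left(32);

    [v0, v1, v2, v3]
}

/// Finalization rounds for the 1-3 variant: the same round three times,
/// which a compiler would typically inline and unroll.
fn d_rounds_13(v: [u64; 4]) -> [u64; 4] {
    sip_round(sip_round(sip_round(v)))
}
```

Three back-to-back copies of this body would explain a function that is nothing but `rol`, `xor` and `add`, and, in an unoptimized build, one that spills a lot of temporaries to the stack.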
14:03fdobridge: <karolherbst🐧🦀> mhhh
14:03fdobridge: <karolherbst🐧🦀> why is it using the sip hasher though.. doesn't matter
14:03fdobridge: <ahuillet> not sure it matters immensely: if rust is using too much stack things will blow up pretty much randomly
14:03fdobridge: <nanokatze> it's using too much stack for wine's syscall stack
14:03fdobridge: <redsheep> Yeah setting ulimit -s 1048576 before running steam from terminal makes for broken steam, it's crashing
14:03fdobridge: <nanokatze> it's on the order of few hundred k as opposed to normal 8M
14:04fdobridge: <ahuillet> it just so happens that this takes 0x1B0 bytes on the stack which is not small so it's statistically a good place for things to blow
14:04fdobridge: <nanokatze> it might be using too much stack for wine's syscall stack (edited)
14:04fdobridge: <ahuillet> @nanokatze how do they enforce the smaller stack, do you know?
14:04fdobridge: <nanokatze> no
14:04fdobridge: <nanokatze> you might want to ask some wine people in LGD discord
14:04fdobridge: <ahuillet> if there's an mprotect we could trace it
14:04fdobridge: <ahuillet> @redsheep alright, try progressively lower values I guess
14:05fdobridge: <ahuillet> @nanokatze seems to be suggesting that there is a separate special stack for "syscalls" done through Wine, so that could be a problem too that ulimit -s won't help with
14:05fdobridge: <ahuillet> gdb call stack might help figure things out a bit. :)
14:05fdobridge: <ahuillet> or maybe not if it's actually a different stack, because, you know, different stack.
14:07fdobridge: <redsheep> Steam launches with 65536, checking doom now
14:07fdobridge: <karolherbst🐧🦀> you really should just get gdb to work proper, because otherwise it's just painful to figure out what's going on
14:08fdobridge: <nanokatze> is ulimit -s applicable to things that create stacks by themselves?
14:08fdobridge: <nanokatze> because I suspect wine is
14:08fdobridge: <nanokatze> as opposed with pthread_create or whatever
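On the question of stacks that `ulimit -s` does not govern: code that creates its own threads can request an explicit stack size, and that request wins over the shell limit. The sketch below shows this with Rust's `std::thread::Builder::stack_size`. It is only an analogy for what Wine might be doing with its syscall stack; `recurse`, `run_on_stack` and the sizes used are illustrative.

```rust
use std::thread;

/// Recursion that burns roughly a few hundred bytes of stack per frame.
/// `black_box` keeps the padding from being optimized away.
fn recurse(depth: u32) -> u32 {
    let pad = std::hint::black_box([0u8; 256]);
    if depth == 0 {
        pad[0] as u32
    } else {
        1 + recurse(depth - 1) + pad[255] as u32
    }
}

/// Runs `recurse(depth)` on a freshly spawned thread whose stack size is
/// chosen by the caller rather than inherited from the process limit.
fn run_on_stack(stack_bytes: usize, depth: u32) -> u32 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || recurse(depth))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

For example, `run_on_stack(32 * 1024 * 1024, 10_000)` completes comfortably, while the same depth on a stack of a few hundred KiB would be expected to overflow no matter what `ulimit -s` says.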
14:08fdobridge: <redsheep> Yeah. I'll mess with it when I have some proper time, have to work or something for some reason.
14:08fdobridge: <karolherbst🐧🦀> also.. is wineserver crashing?
14:09fdobridge: <redsheep> The proton logs are up there, I imagine those would say?
14:09fdobridge: <karolherbst🐧🦀> soo.. do you know where wineserver crashes?
14:09fdobridge: <karolherbst🐧🦀> because if wine just kills the process, because the server goes away, you can get the crash anywhere
14:10fdobridge: <karolherbst🐧🦀> and `RPC_S_SERVER_UNAVAILABLE` is kinda a weird error to have
14:12fdobridge: <karolherbst🐧🦀> and is the crash even always in the same place?
14:12fdobridge: <Sid> reaper kills proton's wineserver, afaik
14:13fdobridge: <redsheep> What I can say now without time to dig in is that the rust debug build results in it being extremely laggy for the few seconds it's open before the crash, but a non debug build is butter smooth
14:13fdobridge: <Sid> sounds right
14:13fdobridge: <redsheep> The error seems to always be the same, and time wise yes, as close as I can figure
14:13fdobridge: <karolherbst🐧🦀> ohh wait
14:14fdobridge: <karolherbst🐧🦀> I was looking at the wrong place in the log 🥲
14:14fdobridge: <redsheep> It's a big log
14:14fdobridge: <redsheep> Redwood trees are jealous
14:16fdobridge: <ahuillet> line 156102
14:17fdobridge: <!DodoNVK (she) 🇱🇹> That's harmless I think
14:19fdobridge: <ahuillet> there's so many non-error errors in that log, just finding the one of interest is hard
14:19fdobridge: <karolherbst🐧🦀> wine moments
14:20fdobridge: <ahuillet> takes a trained eye to pattern-match 0xc0000005 while fast-scrolling through the log :D
14:20fdobridge: <magic_rb.> TIL that turning off eDP with xrandr deadlocks X11
14:20fdobridge: <!DodoNVK (she) 🇱🇹> That's why search function/`grep` exists
14:20fdobridge: <zmike.> @gfxstrand I didn't notice !25916 before
14:20fdobridge: <karolherbst🐧🦀> mhhh yeah
14:20fdobridge: <karolherbst🐧🦀> sounds like an "out of stack" situation
14:21fdobridge: <magic_rb.> Seems to be back? Reminder to self, do not disable the last x11 output
14:22fdobridge: <zmike.> your `get_io_offset()` change is very humorous since I put up an identical change at some point years ago and got rejected
14:22fdobridge: <gfxstrand> @karolherbst @marysaka @dwlsalmeida So I had a crazy idea this morning...
14:22fdobridge: <gfxstrand>
14:22fdobridge: <gfxstrand> We've been talking about trying to port NVK to Rust one day. However, in order to do that we're going to need to build a LOT of NVIDIA driver Rust infra. @dwlsalmeida has been working on a Rust version of `nv_push` as well as a Rust port of NIL. But we're going to need more: ioctl wrappers, Rust wrappers around nouveau/winsys and/or a rust impl (it's not that big so maybe re-implement?), etc. It would also be good if we had a driver for wha
14:22fdobridge: <gfxstrand>
14:22fdobridge: <gfxstrand> So... What if we wrote a new compute-only gallium driver in Rust? It would just be for rusticl and would use NAK instead of codegen. That way we could have a sandbox to play with all the Rust things we want without disturbing NVK until we're ready to do a full port.
14:23fdobridge: <karolherbst🐧🦀> maybe?
14:23fdobridge: <ahuillet> that's kind of offtopic but I'm curious to understand - what makes Rust a purpose?
14:23fdobridge: <karolherbst🐧🦀> but it would still implement the C gallium API, right?
14:23fdobridge: <!DodoNVK (she) 🇱🇹> https://www.youtube.com/watch?v=NT6tq0s5Lx0 :opencl:
14:23fdobridge: <karolherbst🐧🦀> not having to debug memory issues
14:23fdobridge: <ahuillet> such as the stack overflow we're looking at right now?
14:24fdobridge: <karolherbst🐧🦀> more like heap overflow
14:24fdobridge: <ahuillet> (snark so fully intended I can't even try to hide it)
14:24fdobridge: <karolherbst🐧🦀> but no language can protect you from stack overflows 😛
14:24fdobridge: <gfxstrand> Yes
14:24fdobridge: <karolherbst🐧🦀> array OOBs are fine
14:24fdobridge: <karolherbst🐧🦀> but if your OS limits the stack size you are kinda on your own
14:25fdobridge: <karolherbst🐧🦀> yeah.. I'm all for it, we need a good push buffer API for the kernel side anyway
14:25fdobridge: <ahuillet> k. just being curious about the motivation
14:25fdobridge: <marysaka> I mean sure why not, I'm not too well versed in gallium still but I can take more of a peek at it :SoniiPray:
14:25fdobridge: <karolherbst🐧🦀> it's just an API anyway
14:25fdobridge: <karolherbst🐧🦀> and the things you need for compute are very limited
14:25fdobridge: <karolherbst🐧🦀> one can even ignore images for now
14:26fdobridge: <marysaka> It would be lovely to have an assembler at some point for SM7x too (like giving some assembly representation like what we print with NAK atm to produce some blob)
14:27fdobridge: <karolherbst🐧🦀> once you get past the "fighting against wrong learned things" it's quite enjoyable even. But yeah.. not having to debug silly overflows, or silly data races because you forgot to lock helps a lot
14:27fdobridge: <karolherbst🐧🦀> as those bugs are for the most part just gone entirely
14:27fdobridge: <karolherbst🐧🦀> I think in the 2 years I had one overflow bug in rusticl, and it was a stack one, and it was a bug in my C wrapper code
14:27fdobridge: <gfxstrand> Yeah, I would really like to have a full assembler/disassembler that isn't the CUDA blob. That way when we start adding secret instructions like for RTX we have something that can disassemble anything NAK can produce.
14:28fdobridge: <karolherbst🐧🦀> is nvdisasm hiding the RTX stuff?
14:28fdobridge: <karolherbst🐧🦀> pain
14:29fdobridge: <marysaka> I started something that generates some XML at some point based on my bit banging output but I didn't go that far
14:29fdobridge: <gfxstrand> For the compiler, it was mostly a test bed and because Rust is a REALLY nice language to work with. Especially if most of what your code does is `match` (which is true for compilers), Rust really excels at that. Forcing myself to make it all borrow checker happy did also preemptively catch some bugs but it wasn't the main focus.
14:29fdobridge: <karolherbst🐧🦀> mhhhh
14:29fdobridge: <karolherbst🐧🦀> using a macro for the push buffer stuff would be kinda cool
14:30fdobridge: <gfxstrand> For an actual driver, though, the real benefit is in concurrency. Once you successfully map Vulkan's concept of internally and externally synchronized to Rust's shared/mutable reference semantics, Rust will prove your locking correct.
14:30fdobridge: <karolherbst🐧🦀> because that could eliminate the dirty hack we are doing for setting the length in the header
14:30fdobridge: <karolherbst🐧🦀> though mhhh...
14:30fdobridge: <karolherbst🐧🦀> it's still annoying if you have loops and such
14:30fdobridge: <gfxstrand> Yeah, I don't really like the length hack
14:31fdobridge: <karolherbst🐧🦀> yeah...
14:31fdobridge: <gfxstrand> With a Rust wrapper, we can implement stuff in `Drop` 🙂
14:31fdobridge: <karolherbst🐧🦀> though I hate being explicit with it even more as this is even more annoying 😄
14:31fdobridge: <karolherbst🐧🦀> ohhh
14:31fdobridge: <karolherbst🐧🦀> mhhh
14:31fdobridge: <karolherbst🐧🦀> yeah.. maybe
14:32fdobridge: <gfxstrand> And as long as `PushRef` owns a mutable reference to `PushBuf` (names TBD), only one of them is able to exist at a time, fixing the potential C issues where you accidentally have `nv_push` lifetimes that overlap.
14:32fdobridge: <karolherbst🐧🦀> yeah...
14:32fdobridge: <karolherbst🐧🦀> but we need to track it per "header"
14:32fdobridge: <gfxstrand> Yeah, that's fine
14:32fdobridge: <karolherbst🐧🦀> so you'd need to do multiple per function
14:32fdobridge: <gfxstrand> Nah
14:33fdobridge: <karolherbst🐧🦀> I mean.. how do you do it with loops then?
14:33fdobridge: <gfxstrand> The moment you do a `PushRef::mthd()`, it flushes out the old header and focuses on the next one. Then `drop()` flushes out the last one.
14:33fdobridge: <gfxstrand> The reason why we can't do that today with C is because we don't have `drop()`
14:33fdobridge: <karolherbst🐧🦀> yeah, but then you need to use `drop` 😄
14:33fdobridge: <karolherbst🐧🦀> ohh wait
14:33fdobridge: <karolherbst🐧🦀> mhhh
14:33fdobridge: <karolherbst🐧🦀> mhhhhhhh
14:33fdobridge: <karolherbst🐧🦀> yeah, I see it now
14:34fdobridge: <karolherbst🐧🦀> yeah.. so you just do a by value thing, and the next `mthd` overwrites an internal `current_mthd` thing, where its `drop` just builds the header and the values
14:34fdobridge: <gfxstrand> yup
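The `Drop`-based design discussed above can be sketched roughly as follows. This is a toy under stated assumptions, not the planned API: `PushBuf`, `PushRef`, `mthd` and `data` follow the names floated in the chat (which were "TBD"), and the header encoding (dword count in the upper 16 bits, method in the lower 16) is invented for illustration and is not the real NVIDIA push buffer format.

```rust
/// Toy push buffer: just a growable list of dwords.
struct PushBuf {
    dw: Vec<u32>,
}

/// Holds the only mutable borrow of the buffer, so two `PushRef`s over the
/// same `PushBuf` cannot be alive at once.
struct PushRef<'a> {
    buf: &'a mut PushBuf,
    /// Index of the placeholder header and the method it belongs to.
    current: Option<(usize, u16)>,
}

impl PushBuf {
    fn new() -> Self {
        PushBuf { dw: Vec::new() }
    }

    fn push_ref(&mut self) -> PushRef<'_> {
        PushRef { buf: self, current: None }
    }
}

impl PushRef<'_> {
    /// Starts a new method: patches the previous header, then reserves a
    /// placeholder dword for the new one.
    fn mthd(&mut self, mthd: u16) {
        self.flush();
        self.current = Some((self.buf.dw.len(), mthd));
        self.buf.dw.push(0);
    }

    /// Appends one data dword to the current method.
    fn data(&mut self, value: u32) {
        self.buf.dw.push(value);
    }

    /// Writes the now-known length into the pending header, if there is one.
    fn flush(&mut self) {
        if let Some((idx, mthd)) = self.current.take() {
            let count = (self.buf.dw.len() - idx - 1) as u32;
            // Hypothetical encoding: count in bits 31:16, method in bits 15:0.
            self.buf.dw[idx] = (count << 16) | u32::from(mthd);
        }
    }
}

impl Drop for PushRef<'_> {
    /// The last header is patched automatically when the `PushRef` goes away.
    fn drop(&mut self) {
        self.flush();
    }
}
```

Loops fall out naturally in this shape: calling `data` any number of times between two `mthd` calls needs no explicit length, since the count is computed when the next `mthd` or the final `drop` runs. Trying to call `push_ref` a second time while the first `PushRef` is still alive is rejected by the borrow checker.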
14:34fdobridge: <karolherbst🐧🦀> yeah...
14:34fdobridge: <karolherbst🐧🦀> sounds like a plan then
14:34fdobridge:<gfxstrand> has thought about this a bit too much
14:35fdobridge: <karolherbst🐧🦀> could even do a proper state machine thing
14:35fdobridge: <gfxstrand> But yeah, having a sandbox to prove out and polish all these ideas before we do an NVK port would be really good.
14:35fdobridge: <karolherbst🐧🦀> yeah...
14:36fdobridge: <karolherbst🐧🦀> yeah, would be kinda cool tbh
14:36fdobridge: <karolherbst🐧🦀> though...
14:36fdobridge: <karolherbst🐧🦀> the question is if it ever becomes a 3D driver 😄
14:36fdobridge: <gfxstrand> Nah
14:36fdobridge: <gfxstrand> I don't see a need
14:37fdobridge: <gfxstrand> Images, sure. 3D, no.
14:37fdobridge: <gfxstrand> I mean it could. 3D isn't that hard
14:37fdobridge: <karolherbst🐧🦀> okay, because I think we'd kinda need to be explicit about it, so people don't think we'd start using the 3D stuff 😄
14:37fdobridge: <karolherbst🐧🦀> but
14:37fdobridge: <karolherbst🐧🦀> that might happen anyway
14:37fdobridge: <karolherbst🐧🦀> at some point
14:37fdobridge: <karolherbst🐧🦀> if somebody gets bored
14:37fdobridge: <karolherbst🐧🦀> yeah.. I'm not concerned about implementing it
14:37fdobridge: <karolherbst🐧🦀> I'm more concerned having a rust driver using the 3D apis
14:37fdobridge: <karolherbst🐧🦀> and other devs getting too frustrated
14:38fdobridge: <gfxstrand> Yeah...
14:38fdobridge: <gfxstrand> And honestly, IDK if we should even plan to merge it or if it should just live in nouveau/mesa for a while.
14:39fdobridge: <gfxstrand> The NAK interfaces aren't likely to change much (they're pretty damn simple) and it won't touch src/nouveau/vulkan so I doubt it'll really hurt anything to live out-of-tree for a while.
14:40fdobridge: <karolherbst🐧🦀> I mean.. merging a compute only driver is fine
14:40fdobridge: <karolherbst🐧🦀> but yeah.. I mean we can also discuss merging it later
14:44fdobridge: <gfxstrand> Yeah, as long as we leave it `-experimental`, it should be fine. I'm just not sure I want Phoronix articles.
14:44fdobridge: <karolherbst🐧🦀> you don't have to read the comments tho
14:45fdobridge: <gfxstrand> But having a `-Dgallium-drivers=nrc-experimental` (NRC: Nouveau Rust Compute) might actually help with some of the PR if we're explicit that this is a sandbox for playing with Rust and drivers.
14:45fdobridge: <gfxstrand> Like, don't even build it in CI and say it's okay if gallium changes break it.
14:45fdobridge: <karolherbst🐧🦀> ohh
14:45fdobridge: <karolherbst🐧🦀> yeah, maybe that would be fine
14:45fdobridge: <karolherbst🐧🦀> we can always reevaluate later
14:46fdobridge: <karolherbst🐧🦀> it could also be a CI job which is allowed to fail
14:46fdobridge: <karolherbst🐧🦀> and then it either gets fixed or not
14:53fdobridge: <leopard1907> Damn, even redsheep referenced Clash of Clans issue for Doom 2016
14:57fdobridge: <redsheep> It's really dumb that gitlab doesn't tell you that's going to happen before you open the issue, all it takes is one stray #1
14:57fdobridge: <leopard1907> Kek
14:58fdobridge: <leopard1907> Codeblocking that stuff should prevent that i guess
15:02fdobridge: <leopard1907> Does NVK support DCC stuff yet?
15:09fdobridge: <redsheep> Iirc delta color compression is among the list of optimization things that have yet to be turned on with the driver still focused on correctness/feature support. Zcull and tile based rasterization are on that list too.
15:09fdobridge: <leopard1907> 👍
15:10fdobridge: <leopard1907> So you guys still have time to see this.
15:10fdobridge: <leopard1907> https://cdn.discordapp.com/attachments/1034184951790305330/1223288123027620042/Ekran_Goruntusu_-_2020-11-25_12-13-36.jpg?ex=66194ed8&is=6606d9d8&hm=ddaf0ab9f66ca0e4811975679fa637b7520ccea60592d9433be4e0d18885d2f6&
15:12fdobridge: <huntercz122> how old is that trace?
15:13fdobridge: <Sid> from november 2020
15:13fdobridge: <Sid> says so in the titlebar
15:13fdobridge: <leopard1907> Ye
15:13fdobridge: <leopard1907> <https://forums.developer.nvidia.com/t/doom-2016-vulkan-renderer-is-broken-since-440-drivers-optimus/160332/2>
15:13fdobridge: <leopard1907> From here, at that time app profiles were going borked with some NV env vars
15:14fdobridge: <leopard1907> Which causes that
15:14fdobridge: <leopard1907> <https://gitlab.freedesktop.org/mesa/mesa/-/issues/5024#note_980254>
15:14fdobridge: <leopard1907> Because those games are buggy
15:15fdobridge: <leopard1907> Radv disables dcc for them, NV prop seems to be doing the same when it has app profiles working correctly
15:38fdobridge: <valentineburley> So do you prefer the for loop variant over the functions for the descriptor binding commands? @gfxstrand
15:38fdobridge: <valentineburley> It looks nice but just doing the same thing as the other drivers also has its value 🤷‍♂️
15:38fdobridge: <valentineburley> CTS looks the same between the two
15:41fdobridge: <zmike.> @gfxstrand good news, I found you an expert reviewer for https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25916
15:41fdobridge: <zmike.> you'll just have to wait...probably an hour or two for them to manifest
15:41fdobridge: <gfxstrand> I think either is fine. Sorry for my misunderstanding of the first version.
15:42fdobridge: <gfxstrand> In fact, I think the first version is probably nicer with the bind points the way things are now and will be pretty natural if we want per-stage
15:42fdobridge: <gfxstrand> @karolherbst Can LDC take the cbuf index as a source?
15:43fdobridge: <karolherbst🐧🦀> as a register you mean?
15:43fdobridge: <gfxstrand> yeah
15:43fdobridge: <gfxstrand> non-bindless
15:43fdobridge: <karolherbst🐧🦀> uniform reg, yes
15:43fdobridge: <karolherbst🐧🦀> but
15:44fdobridge: <karolherbst🐧🦀> you can also use non uniform if you use a different addressing mode
15:44fdobridge: <gfxstrand> ?
15:44fdobridge: <karolherbst🐧🦀> `.IL` e.g., where the top 16 bits are added to the cb index
15:44fdobridge: <karolherbst🐧🦀> but the immediate applies to the full 32 bit
15:45fdobridge: <karolherbst🐧🦀> `.IS` prevents this behavior, so the offset can't overflow into the index
15:45fdobridge: <gfxstrand> Right
15:45fdobridge: <karolherbst🐧🦀> soo `.IL` is first add, then split, and `.IS` is first splitting
15:46fdobridge: <gfxstrand> Well in this case I would be constructing the offset with a PRMT so I would know the top bits are only index and the bottom bits are only offset
15:46fdobridge: <karolherbst🐧🦀> but if you want a single indirect index, then you need to use a uniform register
15:48fdobridge: <gfxstrand> Just so I get this right... `.IL` means `c[idx+offset[31:16]][offset[15:0]]`
15:48fdobridge: <karolherbst🐧🦀> no
15:49fdobridge: <karolherbst🐧🦀> .IL means you take the immediate offset and add it to the register
15:49fdobridge: <karolherbst🐧🦀> and then you split it apart
15:49fdobridge: <gfxstrand> Ok...
15:49fdobridge: <karolherbst🐧🦀> `.IS` is what you wrote
15:49fdobridge: <karolherbst🐧🦀> just + immediate on the offset
15:50fdobridge: <karolherbst🐧🦀> I think...
15:50fdobridge: <karolherbst🐧🦀> I mean..
15:50fdobridge: <karolherbst🐧🦀> you left out the register 😄
15:50fdobridge: <gfxstrand> Okay, let me try this again...
15:51fdobridge: <gfxstrand> I think this is what you're saying `.IS` means:
15:51fdobridge: <gfxstrand> ```
15:51fdobridge: <gfxstrand> idx = imm_idx + reg[31:16]
15:51fdobridge: <gfxstrand> offset = imm_offset + reg[15:0]
15:51fdobridge: <gfxstrand> ldc c[idx][offset]
15:51fdobridge: <gfxstrand> ```
15:51fdobridge: <karolherbst🐧🦀> right.. `offset` being an unsigned 16-bit value
15:51fdobridge: <karolherbst🐧🦀> for overflow purposes
15:51fdobridge: <gfxstrand> right
15:53fdobridge: <gfxstrand> So what is `.IL`?
15:53fdobridge: <gfxstrand> And what is `.ISL`?
15:54fdobridge: <karolherbst🐧🦀> `.IL` is where the overflow of the offset applies to the index
15:54fdobridge: <karolherbst🐧🦀> so you just treat both as two 32 bit values and just add them
15:54fdobridge: <karolherbst🐧🦀> and then split it
15:54fdobridge: <gfxstrand> Okay, now I see what you mean by "add then split"
15:54fdobridge: <karolherbst🐧🦀> yeah
15:54fdobridge: <karolherbst🐧🦀> `.ISL` has some index bound check on top of `.IS`
15:55fdobridge: <gfxstrand> That might actually be what I want
15:55fdobridge: <karolherbst🐧🦀> well..
15:55fdobridge: <karolherbst🐧🦀> it's a constant
15:55fdobridge: <gfxstrand> ?
15:55fdobridge: <karolherbst🐧🦀> I _think_ it's always 14
15:56fdobridge: <karolherbst🐧🦀> might be configurable somewhere, but not as part of the instruction
15:57fdobridge: <karolherbst🐧🦀> I assume it was especially added for a special driver just having internal cbs at 14+, so those operations never overflow into those
15:57fdobridge: <karolherbst🐧🦀> I doubt it's of any use for us
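The difference between the two modes discussed above can be sketched in a few lines of Python. This is only a model of the description in the chat, not a statement of what the hardware does: the function names are hypothetical, the packing of the immediate as `(imm_idx << 16) | imm_offset` for `.IL` is an assumption, and the `.ISL` index bound check is not modelled at all.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def ldc_is(reg, imm_idx, imm_offset):
    """Segmented (.IS) as described above: split first, then add each half.

    The offset is treated as an unsigned 16-bit value, so it wraps
    instead of carrying into the cbuf index.
    """
    reg &= MASK32
    idx = imm_idx + (reg >> 16)
    offset = (imm_offset + (reg & MASK16)) & MASK16
    return idx, offset

def ldc_il(reg, imm_idx, imm_offset):
    """Linear (.IL) as described above: add as 32-bit values, then split.

    Assumption: the immediate is packed the same way as the register,
    index in the top 16 bits and offset in the bottom 16 bits.
    """
    imm = ((imm_idx << 16) | imm_offset) & MASK32
    total = ((reg & MASK32) + imm) & MASK32
    return total >> 16, total & MASK16

# Register holds index 2 in the top half and offset 0xFFF0 in the bottom half.
reg = (2 << 16) | 0xFFF0
print(ldc_is(reg, 1, 0x20))  # offset wraps, index unaffected
print(ldc_il(reg, 1, 0x20))  # offset overflow carries into the index
```

With these inputs the segmented variant yields index 3 while the linear variant yields index 4, because the carry out of the 16-bit offset is added to the index only in the "add then split" case.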
15:58fdobridge: <ahuillet> DCC is an AMD term, is it not? I don't think I've heard it at NV. What is that referring to?
15:59fdobridge: <gfxstrand> In this case, I think it just maps to color compression
15:59fdobridge: <gfxstrand> Which we indeed haven't turned on.
15:59fdobridge: <gfxstrand> Well, we're setting some bits someplace but I'm very sure it isn't actually enabled properly.
15:59fdobridge: <ahuillet> right, we don't call it DCC.
16:00fdobridge: <karolherbst🐧🦀> codegen never used ISL for anything either
16:00fdobridge: <gfxstrand> AMD calls it DCC, Intel calls it CCS/MCS, NVIDIA just calls it "compression" as far as I can tell.
16:00fdobridge: <ahuillet> yup
16:01fdobridge: <gfxstrand> @karolherbst Any idea what we should call the mode that isn't any of those 3? The disassembler doesn't have a name.
16:01fdobridge: <gfxstrand> And `None` feels wrong
16:02fdobridge: <karolherbst🐧🦀> `.I` 😄
16:02fdobridge: <karolherbst🐧🦀> or `.IA` if you want the more official one, but that one is less consistent
16:02fdobridge: <gfxstrand> Sure
16:02fdobridge: <leopard1907> <https://forums.developer.nvidia.com/t/doom-2016-vulkan-renderer-is-broken-since-440-drivers-optimus/160332/7>
16:02fdobridge: <leopard1907>
16:02fdobridge: <leopard1907> What do you call it in app profile? Because root cause was found after doing this 🐸
16:03fdobridge: <karolherbst🐧🦀> `A` means Addressing.. but uhh those are all addressing modes
16:03fdobridge: <gfxstrand> What do I, S, and L mean?
16:03fdobridge: <karolherbst🐧🦀> Indexed, Segmented, Linear
16:03fdobridge: <karolherbst🐧🦀> so kinda literally what they do
16:04fdobridge: <ahuillet> there's no relationship between that and compression IIRC, but it's just "compression", let me dig up the whitepaper
16:04HdkR: I still wonder who is taking advantage of segmented mode instead of using an SSBO
16:04fdobridge: <ahuillet> interesting, the whitepaper does refer to it as delta color compression, hehe.
16:05fdobridge: <karolherbst🐧🦀> codegen uses segmented 😄
16:05HdkR: hah
16:06fdobridge: <karolherbst🐧🦀> heh..
16:06fdobridge: <karolherbst🐧🦀> wait..
16:06fdobridge: <karolherbst🐧🦀> it makes no sense
16:06fdobridge: <karolherbst🐧🦀> codegen does what the hardware does, but in software :ferrisUpsideDown:
16:06fdobridge: <karolherbst🐧🦀> ehh wait
16:06fdobridge: <karolherbst🐧🦀> it _reverts_ it and overflows anyway
16:06fdobridge: <karolherbst🐧🦀> don't ask me
16:07fdobridge: <karolherbst🐧🦀> cursed code: `i->src(0).get()->reg.data.offset = (int)(short)offset;`
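For readers unfamiliar with the idiom, the `(int)(short)offset` cast quoted above keeps only the low 16 bits of the value and then sign-extends them. A minimal Python sketch of that effect follows; it assumes a typical two's-complement target, and `sign_extend_16` is a hypothetical helper name, not anything from codegen.

```python
def sign_extend_16(value):
    """Mimic C's (int)(short)value on a two's-complement target."""
    low = value & 0xFFFF          # truncate to 16 bits, like the cast to short
    if low & 0x8000:              # top bit set: the short is negative
        return low - 0x10000      # sign-extend back to a wider int
    return low

print(sign_extend_16(0x0010))   # small positive offsets are unchanged
print(sign_extend_16(0xFFF0))   # large 16-bit offsets become negative
print(sign_extend_16(0x18000))  # bits above 16 are discarded first
```

So an offset of 0xFFF0 comes out as -16 rather than 65520, which appears to be the behaviour the "reverts it and overflows anyway" remark refers to.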
16:07fdobridge: <karolherbst🐧🦀> anyway...
16:08fdobridge: <karolherbst🐧🦀> ohh codegen doesn't use `.IL`... that explains
16:08fdobridge: <karolherbst🐧🦀> _anyway_...
16:09fdobridge: <redsheep> Yeah I'm just going off names from disclosures like this one https://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/8
16:09fdobridge: <redsheep> Was a big topic around Maxwell and Pascal
16:10fdobridge: <leopard1907> Kek
16:15fdobridge: <gfxstrand> @karolherbst With ISL, does the HW bounds-check the offset or does it roll over?
16:15fdobridge: <gfxstrand> For that matter, with `.I` does it bounds check the offset?
16:16fdobridge: <karolherbst🐧🦀> I don't think there is any bound checking happening
16:16fdobridge: <karolherbst🐧🦀> only on the index
16:16fdobridge: <valentineburley> First version now with a function for nvk_CmdPushConstants2KHR too:
16:16fdobridge: <valentineburley> https://gitlab.freedesktop.org/Valentine/mesa/-/commit/d6744e701c05271cdd2d58023daff3463f4dd392
16:17fdobridge: <karolherbst🐧🦀> no
16:17fdobridge: <gfxstrand> @valentineburley Care to make that an MR? Or do you want me to pull that commit into my MR?
16:17fdobridge: <karolherbst🐧🦀> you won't get a fault either though
16:17fdobridge: <valentineburley> Up to you
16:17fdobridge: <karolherbst🐧🦀> dunno if you _could_ get a fault, but it will simply return 0 afaik
16:17fdobridge: <valentineburley> Do you want me to open a new MR?
16:26fdobridge: <gfxstrand> I'll pull it into mine.
16:26fdobridge: <gfxstrand> I just kicked off a CTS run for a NAK thing. I'll CTS maintenance6 with your patch next
16:42fdobridge: <gfxstrand> @karolherbst This should do the trick: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28474
16:49fdobridge: <karolherbst🐧🦀> gitlab seems broken 😄
16:49fdobridge: <karolherbst🐧🦀> but it's also public holy day here, so that's just fair
16:54fdobridge: <gfxstrand> Weird. It's working fine for me.
16:54fdobridge: <leopard1907> Because you ain't in a place with holy day
16:54fdobridge: <leopard1907> I wrote it that way because Karol did
16:54fdobridge: <leopard1907> Looks weird tho
16:55fdobridge: <leopard1907> But yes, also works here too
16:55fdobridge: <leopard1907> But there is no holy day stuff going here as well
16:55fdobridge: <leopard1907> So holy day theory sounds solid
16:56fdobridge: <gfxstrand> Maybe GitLab is cursed and somehow repelled by holy days?
16:56fdobridge: <gfxstrand> I mean, it definitely is cursed....
16:56fdobridge: <leopard1907> :evil_gears:
16:56fdobridge: <karolherbst🐧🦀> seems to work now
16:57fdobridge: <leopard1907> Unholy day confirmed
17:05fdobridge: <karolherbst🐧🦀> you could also support a constant index base, but not sure if nir can produce those
17:37fdobridge: <mhenning> @gfxstrand Have any time to review an MR? I'd really like to see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27798 or something like it land
17:37fdobridge: <mhenning> The descriptor set allocator we have on main is bad enough that it violates the spec and breaks multiple games
17:49fdobridge: <gfxstrand> Ugh... yeah.
19:24fdobridge: <triang3l> meme feature
19:34fdobridge: <huntercz122> 🐸 I'm so dumbo
19:36fdobridge: <gfxstrand> Pulled into my branch and running now
20:07fdobridge: <valentineburley> Nice fixups
20:21fdobridge: <valentineburley> And you might want to get features.txt if it all goes well
20:22fdobridge: <gfxstrand> right
20:22fdobridge: <gfxstrand> thanks
22:57fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28482
22:58fdobridge: <gfxstrand> I really hate the way RADV hand-rolls an allocator when there's literally one right there in util/ that actually handles holes properly.