00:23skeggsb9778[d]: I did a bit last week, but it works for me 😛 I have one more system I can try on, but I hit another bug there which I haven't solved yet
01:12kuter7639[d]: Probably not very useful for nouveau at this point but here is 4090 ISA https://kuterdinel.com/nv_isa_sm89/
01:12kuter7639[d]: some weird surface instructions are missing
01:22HdkR: kuter7639[d]: Where's the ray tracing instructions :P
01:24kuter7639[d]: HdkR: If you send me some binaries with those instructions used I will add them. I mutate instructions from libcublasLt + enumerate opcodes but that doesn't work all the time
01:25HdkR: I see
01:26kuter7639[d]: sometimes you need some bits set (like modifiers and flags) for the instruction to decode. which increases the search space significantly
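The mutate-and-decode idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from nv_isa_solver: the 128-bit word size is an assumption, `toy_decode` is a hypothetical stand-in for a real disassembler, and the opcode names are made up. Each bit of a known-good instruction word is flipped in turn and the decoder output is compared with the baseline, which hints at which bits are opcode, modifier, or ignored bits.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

INSTR_BITS = 128  # assumption: fixed-width 128-bit instruction words

Decoder = Callable[[int], Optional[str]]


def single_bit_mutants(word: int, bits: int = INSTR_BITS) -> Iterator[Tuple[int, int]]:
    """Yield (bit index, mutated word) for every single-bit flip of `word`."""
    for bit in range(bits):
        yield bit, word ^ (1 << bit)


def probe_bits(word: int, decode: Decoder, bits: int = INSTR_BITS) -> Dict[int, Optional[str]]:
    """Map each bit whose flip changes the decoded text to the new text.

    A value of None means the mutated word no longer decodes at all.
    """
    base = decode(word)
    changed: Dict[int, Optional[str]] = {}
    for bit, mutant in single_bit_mutants(word, bits):
        text = decode(mutant)
        if text != base:
            changed[bit] = text
    return changed


def toy_decode(word: int) -> Optional[str]:
    """Hypothetical decoder: low 4 bits are the opcode, bit 4 is a flag."""
    names = {0x1: "OP_A", 0x3: "OP_B"}  # made-up opcodes for illustration
    name = names.get(word & 0xF)
    if name is None:
        return None
    return name + (".F" if word & 0x10 else "")
```

With a real disassembler behind `decode`, words that fail to decode until several modifier bits are set together are exactly the case where single-bit flips are not enough and the search space grows.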
01:27mhenning[d]: kuter7639[d]: That probably will be useful for nouveau - every once in a while we want to add new opcodes and we've mostly been doing the RE work as we need it. Having a reference will speed that up
01:28mhenning[d]: As for raytracing instructions, my understanding is that nvdisasm can't decode them, so they'll be a little harder to figure out
01:28kuter7639[d]: Cool. Reading nouveau code taught me a lot. It's great to be able to contribute back
01:30kuter7639[d]: https://github.com/kuterd/nv_isa_solver this is the code that creates those ISA spec files. In a bit of an experimental state 😄
01:30kuter7639[d]: As long as NVIDIA doesn't remove the nvdisasm tool from cuda we should be able to decode any future ISA as well
01:34HdkR: Yea, nvdisasm refuses to disassemble the RT stuff. It's cute obfuscation
02:30skeggsb9778[d]: That's a way fancier version of how I fuzzed nvdisasm to do the bring-up for the last couple of ISA changes
02:31skeggsb9778[d]: Except I had a couple of bash scripts, and manually entered the data into envydis 😛
02:31skeggsb9778[d]: This seems easier
03:02kuter7639[d]: skeggsb9778[d]: am I correct in citing the https://doi.org/10.1145/3018743.3018755 paper for the original fuzzing technique?
03:21skeggsb9778[d]: No idea. I've never read it. As I said, what you've done goes *well* beyond what I was doing
03:35redsheep[d]: gfxstrand[d]: Do you mean like the shader model 6.6 and 6.7 stuff from the tracker? That and fs interlock would get vkd3d up to as far as will really be possible without understanding the RT instructions
03:36gfxstrand[d]: No, hardware shader models
03:37gfxstrand[d]: Volta vs. Turing, etc.
03:37redsheep[d]: Oh? So like making better use of latest isa, or what?
03:37gfxstrand[d]: It's not well abstracted right now and leaks all over everywhere.
03:37redsheep[d]: Ah I see
03:38gfxstrand[d]: When I first wrote NAK, I didn't have a good sense for how hardware generations would impact the compiler. Now I do, or at least I understand it better.
03:41skeggsb9778[d]: gfxstrand[d]: out of curiosity, what are the biggest performance blockers right now?
03:42skeggsb9778[d]: in nvk in general, i mean
03:42skeggsb9778[d]: particularly if the kernel is involved
03:43redsheep[d]: It's unfortunate the RT instructions are shrouded in so much secrecy. They seem like they could be tricky from a compiler perspective so being able to account for that sooner than later would be nice
03:43gfxstrand[d]: Compiler tuning is probably a big one. I also know that descriptors need more work. I finally landed the deep compiler work that was blocking all that a few weeks ago but I haven't figured out the right heuristic yet.
03:45gfxstrand[d]: I also need a better story with images. I know bound images are faster, even if I don't know why yet. But promoting to bound without adding stalls is tricky.
03:46gfxstrand[d]: I'm not aware of any big kernel things affecting perf just yet. I do have a short wishlist but nothing that's absolutely breaking us right now.
03:49airlied[d]: I still think zcull will help
03:50skeggsb9778[d]: Yeah, and I suspect compression in general would too (it definitely did back when we added it to earlier GPUs)
03:50airlied[d]: And not sure where we are at with compression
03:51redsheep[d]: Regardless of whether they will have huge impact that will be needed to get to matching perf
03:52redsheep[d]: I don't think tiled cache is huge on ada just based on you generally being able to hold your entire screen in one or two tiles in most configurations
03:53redsheep[d]: Even at 4k I had to set up trianglebin in some really silly ways to get it to use more than 4 tiles
03:55skeggsb9778[d]: gfxstrand[d]: What's the wish-list?
03:56gfxstrand[d]: skeggsb9778[d]: Userptr and some sort of engine version query that doesn't require creating channels. I think those are the big ones.
03:57gfxstrand[d]: I've got ideas on what I'd like for both of them. I think I wrote up something in a userptr issue. I haven't made an issue about the other yet.
03:57skeggsb9778[d]: Ack. I suspect userptr would require more discussion and planning. But the latter should be do-able
03:57skeggsb9778[d]: Ideas on the UAPI you'd like would be great though
03:58skeggsb9778[d]: No rush though 😉
04:01gfxstrand[d]: I've got two thoughts:
04:01gfxstrand[d]: 1. A getparam per engine type. It's a bit verbose but it's straightforward and getparam enums are cheap.
04:01gfxstrand[d]: 2. The kernel to return a list of all engines that both it and the hardware supports as an array of `u16`s. I can get the kind of engine from the bottom byte and it avoids having anything super fixed like getparams.
04:02gfxstrand[d]: I kinda like 2 but it also feels a bit too clever for its own good. I also need to take a good hard look at the context creation API again and make sure that whatever we do makes sense in the context of that. But those are the thoughts kicking around my head.
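Option 2 could look something like the sketch below from the userspace side. Only "the kind of engine comes from the bottom byte" is taken from the discussion; the meaning of the upper byte is not specified there, so it is treated as an opaque value here, and the function names and example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_engine_id(value: int) -> Tuple[int, int]:
    """Split a u16 engine id into (kind, upper byte).

    The bottom byte is the engine kind; the upper byte is left opaque
    because the discussion does not define it.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"not a u16: {value!r}")
    return value & 0xFF, value >> 8


def group_by_kind(engine_ids: Iterable[int]) -> Dict[int, List[int]]:
    """Group the upper bytes of all reported engine ids by engine kind."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for value in engine_ids:
        kind, upper = split_engine_id(value)
        groups[kind].append(upper)
    return dict(groups)
```

The appeal over a getparam per engine type is that userspace iterates over whatever the kernel returns, so a new engine kind needs no new enum.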
04:04skeggsb9778[d]: So. What NVKM reports to the DRM driver is:
04:04skeggsb9778[d]: - A list of all engines, and the classes each supports
04:04skeggsb9778[d]: - A list of available runlists, and the engines available from them
04:05gfxstrand[d]: What's the distinction between a runlist, engine, and class? All I see for the most part are classes.
04:06skeggsb9778[d]: Yeah, nouveau's UAPI for channels is out of date since kepler 😛
04:06skeggsb9778[d]: When you create a channel, you create it on a specific runlist
04:07skeggsb9778[d]: That runlist can reach certain engines (as mapped to subchannels, which is fixed these days)
04:07skeggsb9778[d]: And each engine supports a set of classes
04:08skeggsb9778[d]: (which define the method layout on the subchannel)
04:13gfxstrand[d]: I guess the thing I'm missing is "why?" Why wouldn't I just always use the one runlist that can reach all of the engines?
04:13skeggsb9778[d]: Because there isn't one
04:13gfxstrand[d]: Hehe. Fair.
04:13gfxstrand[d]: Okay, more targeted question: are compute and 3D different engines?
04:14gfxstrand[d]: Or is there one engine that supports both classes?
04:14skeggsb9778[d]: Yes and no 😛
04:14gfxstrand[d]: In that order?
04:15skeggsb9778[d]: The graphics runlist has two "run queues", the first supports 2d+3d+i2m+compute, the second only supports compute
04:15skeggsb9778[d]: (both have their own copy engine)
04:15skeggsb9778[d]: which is why you can do 2d+3d+i2m+compute+copy from one channel
04:16skeggsb9778[d]: you need a channel on a different runlist to, say, use nvdec
04:17skeggsb9778[d]: Using the compute-only pipe gets you async compute
04:21gfxstrand[d]: Okay, so what's the difference between a list and a queue? 😅
04:21gfxstrand[d]: I think it would help if I could see all this laid out in a tree or something.
04:22skeggsb9778[d]: Haha. The runlist has the dma fetcher etc for all the channels on it, the queue would be the interface to the engine (so, gr has two, so it can do graphics+compute simultaneously)
04:24gfxstrand[d]: Okay, so the runlist is the thing that processes my command stream and directs stuff to the different engines based on subchannel?
04:24skeggsb9778[d]: Yes
04:25gfxstrand[d]: So how do I select between the "main" run queue for 3D and the async one?
04:27skeggsb9778[d]: You can't with current UAPI. But, in general: you create a channel group, then you create multiple channels inside that. One channel can select runqueue 0 (2d+3d+i2m+compute+copy), and then some number of channels can select runqueue 1 (compute+copy)
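Laid out as a tree, the topology described above might be modelled like this. It is only a sketch of the relationships mentioned in the conversation (runlist, run queue, reachable engines); the type names, the "video" runlist name, and the engine labels are illustrative assumptions, and classes are left out because no class numbers were given.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class RunQueue:
    index: int
    engines: FrozenSet[str]  # engines reachable through this run queue


@dataclass(frozen=True)
class Runlist:
    name: str
    runqueues: Tuple[RunQueue, ...]


def queues_reaching(runlists: Iterable[Runlist], needed: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (runlist name, runqueue index) pairs that reach all needed engines."""
    wanted = frozenset(needed)
    return [
        (rl.name, rq.index)
        for rl in runlists
        for rq in rl.runqueues
        if wanted <= rq.engines
    ]


# Hypothetical topology mirroring the description: one graphics runlist with
# two run queues, and a separate runlist for video decode.
TOPOLOGY = (
    Runlist("graphics", (
        RunQueue(0, frozenset({"2d", "3d", "i2m", "compute", "copy"})),
        RunQueue(1, frozenset({"compute", "copy"})),
    )),
    Runlist("video", (
        RunQueue(0, frozenset({"nvdec"})),
    )),
)
```

In this model, asking for `{"compute"}` returns both graphics run queues (the second being the async compute path), while `{"3d", "nvdec"}` returns nothing, matching "there isn't one" runlist that reaches every engine.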
04:28gfxstrand[d]: Kk
04:29skeggsb9778[d]: NVKM supports this on HW (from gk110 onwards), but not GSP currently. It's sorta why I'm asking about UAPI stuff around engines etc, because if I fix it to work on GSP it'd be good to plumb it to userspace finally
04:30gfxstrand[d]: I need to digest this some more but I think this just moved further up the priority list. We need proper async compute.
04:30skeggsb9778[d]: Cool 🙂 As I said, there's no rush, but I'd love to have your thoughts here
04:31gfxstrand[d]: Yeah, I need to digest the details of the topology more. From a first brush, it sounds like there's more detail there than I care about. On the other hand, I have a feeling that those details will matter.
04:34gfxstrand[d]: But it's late here so I need to sign off. We'll talk about this more for sure.
04:35skeggsb9778[d]: No problem 🙂 Have a good evening
13:18gfxstrand[d]: https://www.phoronix.com/news/NVK-Platform-Abstraction
13:19gfxstrand[d]: Okay, that was a way better article than I expected when I saw the headline. He immediately took it in a Nova direction, not theorizing about OGK or anything like that.
13:19karolherbst[d]: well.. he also did that later, but yeah
13:27gfxstrand[d]: Well, naturally. But he assumed something innocuous instead of going straight for the conspiracy theory. 😂
13:34rayquaza: I'm having issues running nouveau, I keep getting some error: [ 33.300675] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:8 type:45 scope:1 part:233
13:35rayquaza: here is my dmesg: https://pastebin.com/9aPsrgrg
13:52rayquaza: I've enabled the GSP in my kernel parameters
13:55clangcat[d]: rayquaza: Does the card boot without error if you disable GSP.
13:55clangcat[d]: Also is the error something that happens on boot or does it happen after trying to use the card? Like what do you see as a user because of the error?
13:56rayquaza: I have hybrid graphics if that's important, these errors appear when I boot up my system and look at my dmesg
13:56rayquaza: I'll try disabling the firmware and see what happens
13:57tiredchiku[d]: I mean, disabling the gsp will 100% get rid of the gsp messaging in the dmesg
13:57rayquaza: so I shouldn't do it?
13:58tiredchiku[d]: what issue are you facing, what are the symptoms
13:59clangcat[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1260233836134928454/grafik.png?ex=668e93b8&is=668d4238&hm=dbea56f80148c8c545c616d678464c3fd8b4eed665295ecbacb0248ed279824f&
13:59clangcat[d]: tiredchiku[d]: I know that but I wanna know if these go away.
13:59rayquaza: I get a bunch of errors on dmesg, I honestly have no idea how to check which gpu is being used when I run a steam game
13:59clangcat[d]: rayquaza: Also another question: when you're in your system, does the card show up under `/dev/dri/card*`?
13:59tiredchiku[d]: display init failing because it's a laptop with no external display connected I imagine
14:00clangcat[d]: I'd assume it doesn't by the init failed message
14:00clangcat[d]: tiredchiku[d]: No mine only displays that message when the actual nouveau driver fails for me. and the card is then missing from `/dev/dri`
14:00tiredchiku[d]: I see
14:01tiredchiku[d]: actually
14:01tiredchiku[d]: it might be the same GPU as yours clangcat[d]
14:01tiredchiku[d]: GA107
14:01clangcat[d]: Oh it might?
14:01tiredchiku[d]: rayquaza is this an RTX 3050 laptop?
14:01rayquaza: I see 2 gpus
14:01rayquaza: yes I think, give me a sec
14:02rayquaza: from fastfetch: GPU 1: NVIDIA GeForce RTX 3050 Ti Mobile [Discrete]
14:02karolherbst[d]: worst case it's one of the GPUs where we need an updated GSP...
14:02rayquaza: rip
14:02tiredchiku[d]: _almost_ the same GPU
14:02tiredchiku[d]: same die though, meaning same GSP firmware
14:03tiredchiku[d]: GA107
14:03rayquaza: let me reboot after disabling GSP and see what happens with dmesg
14:03karolherbst[d]: rayquaza: ever tried the open source nvidia driver with that? I'm curious if 535.113.01 loads on it using GSP
14:03tiredchiku[d]: you can't build 535 with the latest kernel anymore
14:03tiredchiku[d]: some function names changed somewhere
14:03rayquaza: open source driver? nvidia-open?
14:04tiredchiku[d]: yes, nvidia-open
14:04karolherbst[d]: mhh
14:04clangcat[d]: rayquaza: very similar card to mine, you might need a patch from patchwork as that's what fixed it for me.
14:04karolherbst[d]: right...
14:04karolherbst[d]: clangcat[d]: ohh.. or that
14:04karolherbst[d]: what patch was it?
14:04rayquaza: I'll try reboot then install the nvidia-open
14:05tiredchiku[d]: https://patchwork.freedesktop.org/patch/591858/?series=132966&rev=2
14:05tiredchiku[d]: I think
14:05tiredchiku[d]: Sid127: I kinda asked about it too
14:05karolherbst[d]: mhh, that's a different issue tho
14:05tiredchiku[d]: but got no response
14:05clangcat[d]: tiredchiku[d]: No I think that's a different one as this person doesn't have SG_DEBUG problem.
14:05tiredchiku[d]: I see
14:06clangcat[d]: karolherbst[d]: Yea I have an issue that I clear my browser history :p and I've not rebuilt my kernel in a while.
14:06clangcat[d]: Mainly cause low memory kernel building is a headache.
14:07tiredchiku[d]: do you have the patch file
14:07tiredchiku[d]: but yeah, it does appear GA107 is rather problematic currently: https://www.techpowerup.com/gpu-specs/nvidia-ga107.g988
14:07rayquaza: Here is the new dmesg with GSP disabled: https://pastebin.com/t9XUusQq
14:07clangcat[d]: tiredchiku[d]: Not in my history.
14:08tiredchiku[d]: I could try it out in a liveISO on my brother's laptop (3050Ti) over the weekend
14:08clangcat[d]: tiredchiku[d]: Could be an idea.
14:08tiredchiku[d]: he'd kill me if I nuked windows on it or did a dual boot
14:08tiredchiku[d]: but I think he should be ok with letting me borrow it for liveISO shenanigans
14:09karolherbst[d]: rayquaza: it's still using GSP
14:09tiredchiku[d]: I can even roll my own iso with custom kernel builds, no big deal
14:09clangcat[d]: rayquaza: The GSP messages are still there question did you always get these messages or only on this kernel?
14:09karolherbst[d]: rayquaza: you need to boot with `nouveau.config=NvGspRm=0` to disable it
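A quick way to confirm that the option actually made it onto the kernel command line is to parse it, as in the sketch below. It assumes `nouveau.config` takes comma-separated `key=value` pairs; the helper name is hypothetical, and it only inspects the boot parameters (not options set through modprobe configuration).

```python
from typing import Dict

PREFIX = "nouveau.config="


def parse_nouveau_config(cmdline: str) -> Dict[str, str]:
    """Collect key/value pairs from every nouveau.config= token in a cmdline."""
    options: Dict[str, str] = {}
    for token in cmdline.split():
        if not token.startswith(PREFIX):
            continue
        # Assumption: multiple options are separated by commas.
        for pair in token[len(PREFIX):].split(","):
            key, _, value = pair.partition("=")
            if key:
                options[key] = value
    return options
```

On a running system the input would come from `/proc/cmdline`, for example `parse_nouveau_config(open("/proc/cmdline").read()).get("NvGspRm")`, which yields `"0"` when GSP was disabled at boot.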
14:10rayquaza: clangcat[d] This is the second kernel I'm testing it on, tried the TKG linux-git first
14:10rayquaza: will do karolherbst[d]
14:10rayquaza: how do you reply in IRC?
14:10karolherbst[d]: rayquaza: what do you mean?
14:11rayquaza: like ping someone
14:11clangcat[d]: rayquaza: Uhhh I don't think IRC has like a reply feature. But @ us works here.
14:11rayquaza: @clangcat[d] does this work?
14:11tiredchiku[d]: rayquaza: you just type out their username
14:11karolherbst[d]: rayquaza: you already did
14:11tiredchiku[d]: no need to @ either
14:11clangcat[d]: rayquaza: Yea username works fine.
14:11rayquaza: ah I see
14:11clangcat[d]: rayquaza: Yup
14:11karolherbst[d]: normally on IRC you do $username: ...
14:11karolherbst[d]: but clients usually highlight if they see the username written anywhere
14:11tiredchiku[d]: anyway, I'll try to take a look at it over the weekend
14:12clangcat[d]: tiredchiku[d]: It would be nice if you fixed my issues.
14:12clangcat[d]: Or
14:12tiredchiku[d]: I know sweetie
14:12clangcat[d]: if Nova would finally come out and possibly fix them.
14:12clangcat[d]: XD
14:12tiredchiku[d]: gonna be interesting trying to get 535 openrm driver into a live iso
14:13clangcat[d]: It should be possible
14:13tiredchiku[d]: I know
14:13tiredchiku[d]: just interesting :P
14:13tiredchiku[d]: I'll figure it out
14:13tiredchiku[d]: no promise I'll look at it this weekend, since it's gonna be my first weekend back at home, but I will take a look at it for sure
14:14clangcat[d]: Sid127: Also tiredchiku[d] did airlied ever get back to you about this
14:14rayquaza: thanks man
14:14tiredchiku[d]: nope
14:14clangcat[d]: :aqua_Cry_Sad:
14:14rayquaza: here is the actual disabled GSP: https://pastebin.com/rqjC43sS
14:14karolherbst[d]: rayquaza: so, disabling GSP makes it work? cursed
14:15tiredchiku[d]: yeah, GA107 GSP is troublesome on nouveau
14:15clangcat[d]: karolherbst[d]: Well not really it goes down a different code path :p
14:15tiredchiku[d]: and I believe 535 is broken on current kernels
14:15tiredchiku[d]: maybe on 6.6 LTS...
14:15tiredchiku[d]: tiredchiku[d]: openrm, that is
14:15karolherbst[d]: yeah.. I wouldn't be surprised if we really have to update GSP
14:15clangcat[d]: But yea uhh GSP is problematic at times.
14:15clangcat[d]: cool
14:15tiredchiku[d]: I shall investigate and report my findings posthaste
14:16tiredchiku[d]: out of
14:16clangcat[d]: Especially on 3050's
14:16tiredchiku[d]: both boredom and love xD
14:16clangcat[d]: it seems like
14:16rayquaza: wait isn't there like a way to extract the GSP from the graphics driver, would that be any different?
14:16karolherbst[d]: rayquaza: no, but nouveau would have to support the version
14:16rayquaza: ah I see
14:18rayquaza: karolherbst[d] yeah trying mangohud vkcube now defaults to my nvidia gpu instead of intel but got some errors
14:19rayquaza: dunno if this is useful or not:
14:19rayquaza: Selected GPU 1: NVIDIA GeForce RTX 3050 Ti Laptop GPU (NVK GA107), type: DiscreteGpu
14:19rayquaza: [2024-07-09 16:18:35.929] [MANGOHUD] [error] [loader_nvml.cpp:42] Failed to open 64bit libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
14:19rayquaza: [2024-07-09 16:18:35.929] [MANGOHUD] [error] [nvml.cpp:46] Failed to load NVML
14:19rayquaza: Authorization required, but no authorization protocol specified
14:19rayquaza: [2024-07-09 16:18:35.934] [MANGOHUD] [error] [nvctrl.cpp:56] XNVCtrl didn't find the correct display
14:19tiredchiku[d]: no, that's just normal mangohud logging
14:19rayquaza: ah I see
14:19clangcat[d]: rayquaza: One more question who makes your laptop?
14:19karolherbst[d]: with GSP disabled performance will be poor anyway, so not quite sure you'll be having fun playing games like that
14:19rayquaza: back to square one I guess
14:20clangcat[d]: Just wondering if there is a pattern
14:20tiredchiku[d]: I doubt there's a pattern
14:20rayquaza: clangcat[d] IdeaPad Gaming 3 15IAH7
14:20tiredchiku[d]: since GSP is die-dependent
14:20clangcat[d]: Yea no but neat.
14:20clangcat[d]: tiredchiku[d]: Yea was unlikely but was curious if it was another Dell.
14:20tiredchiku[d]: well, somewhat die-dependent
14:20karolherbst[d]: yeah.. you can't really compare nvidia cards by model name
14:20karolherbst[d]: they could have the same name even and just behave differently
14:21tiredchiku[d]: either way
14:21tiredchiku[d]: will look at it
14:21tiredchiku[d]: c:
14:21clangcat[d]: karolherbst[d]: Yea but it's also why I was curious if it was another Dell one. Mainly just cause yea would be interesting to know.
14:22clangcat[d]: tiredchiku[d]: Does mangohud just try and open the prop drivers libraries if they exist?
14:22clangcat[d]: cause I know EGL does that for me. Even though the card is running on nouveau
14:22karolherbst[d]: yeah
14:22tiredchiku[d]: correct
14:22tiredchiku[d]: mangohud isn't wired up for nouveau very well yet
14:25clangcat[d]: karolherbst[d]: It's great cause the `libnvidia_EGL` reports a memory error in my checking tools but it's only happening because Nvidia's driver isn't present.
14:26clangcat[d]: And I need to always remember "ahhh yea that one isn't fixable by me"
14:26karolherbst[d]: it's even worse. some nvidia libs call into `nvidia-modprobe` which is a setuid binary trying to load the nvidia kernel module and spams `dmesg`
14:26clangcat[d]: Probably not even by Mesa/EGL as a whole. Seeing as it happens inside the libnvidia files.
14:27clangcat[d]: karolherbst[d]: I've not had that yet but yikes
14:27clangcat[d]: I just have both prop and FOSS so I can test soilleir on both
14:28tiredchiku[d]: actually
14:28tiredchiku[d]: I won't make a live ISO
14:28tiredchiku[d]: I'll just make a proper installation on a USB
14:29tiredchiku[d]: :wolfFIRE:
14:31rayquaza: is there a discord server for this channel?
14:32tiredchiku[d]: https://discord.com/invite/QvFhUPPq
14:32rayquaza: this me thanks you
15:47notthatclippy[d]: tiredchiku[d]: Some community maintained patches for basically any driver version and any kernel version at <https://github.com/Frogging-Family/nvidia-all>. YMMV on actual runtime, but these should make it build.
15:51redsheep[d]: That's useful, maybe I can actually get somewhere with seeing if the DP audio issue is just a gsp bug then
15:53notthatclippy[d]: You can actually apply these to the proprietary closed source driver as well, and then test the no-GSP NV variant of the same version.
16:18tiredchiku[d]: notthatclippy[d]: am familiar with that, yeah
21:12skeggsb9778[d]: airlied[d]: i pushed fixes for the dma mask issue, and missing module device table yesterday btw
21:15skeggsb9778[d]: the bl fix is still outstanding, as somehow they break my laptop panel harder than it already is, and i'd like to understand why more before i do a fix for that
21:15airlied[d]: cool I'll try and page back in where I was and test it, been off for a week and forgotten everything I was doing 🙂
21:16skeggsb9778[d]: hehe, that can happen 😛
21:17skeggsb9778[d]: i was able to reprod the dma mask issue by disabling iommu in sbios