IRC Logs of #nouveau on irc.freenode.net for 2025-12-18

17:54 mhenning[d]: compression and prepass scheduling both landed 🎉
17:55 mhenning[d]: I think transfer queues are also pretty much ready if anyone else wants to review: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36617 If there are no more comments I'll probably marge tomorrow
18:05 karolherbst[d]: nice nice
18:13 karolherbst[d]: I've noticed that I have 40 open MRs :blobcatnotlikethis:
18:16 karolherbst[d]: Also I'm terrible at PTO, at least I'm not reviewing anything 🙃
18:17 karolherbst[d]: but I can do benchmarks, because that doesn't count as work!
18:18 karolherbst[d]: the scheduling stuff has potential to really mess with the microbenchmarks I'm looking at.. it's wild how sensitive those are to register usage and occupancy in general
18:21 karolherbst[d]: 10% perf regression in one of them 😢
18:22 karolherbst[d]: well since d06aff22438 to be fair, which is like 3 weeks of MRs
18:23 karolherbst[d]: but I'm sure the scheduling could make the phi vec RA issue worse
18:23 mhenning[d]: possibly yeah
18:43 mohamexiety[d]: are there more things waiting for review?
18:46 mhenning[d]: I have a tiny fix here that could use review: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39021
18:54 marysaka[d]: was looking at it but really confused about why we report only 31-bit at the moment to be honest
19:05 mhenning[d]: gfxstrand[d]: Do you remember why `NVK_MAX_BUFFER_SIZE` is `(1ull << 31)`?
19:07 gfxstrand[d]: Because we use 32-bit offsets in shaders.
19:08 gfxstrand[d]: Also, I think there's some nouveau BO size limitations. But mostly it's the shader thing.
19:09 mhenning[d]: Right, I'm aware of that limit
19:09 gfxstrand[d]: We could probably bump it to u32 max without much harm
19:09 gfxstrand[d]: But powers of two are nice
19:10 mhenning[d]: If that's the only worry then I think 1 << 32 should be fine? That's what nvidia exposes for maxStorageBufferRange
19:10 gfxstrand[d]: There's a Khronos NVK branch which has 64-bit indexing mostly implemented
19:10 gfxstrand[d]: mhenning[d]: Our size is also 32 bits and isn't -1
19:11 gfxstrand[d]: Which could also be changed but ugh...
19:14 mhenning[d]: Okay. Maybe I should be increasing maxBufferSize to UINT32_MAX then rather than lowering maxStorageBufferRange to 1 << 31
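[Editor's sketch, not part of the log.] The size argument above can be illustrated with a few lines of C. This is only a sketch under stated assumptions: the macro name `QUOTED_MAX_BUFFER_SIZE` and both helper functions are hypothetical and do not come from NVK; only the value `(1ull << 31)` is taken from the discussion. It shows why `1 << 32` cannot be held in a 32-bit size field while `UINT32_MAX` can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in the log for NVK_MAX_BUFFER_SIZE; the macro name here is
 * hypothetical and used for illustration only. */
#define QUOTED_MAX_BUFFER_SIZE (1ull << 31)

/* Hypothetical helper: true if `size` can be stored in a 32-bit size field
 * without losing information. */
static bool size_fits_u32(uint64_t size)
{
    return size <= UINT32_MAX;
}

/* Hypothetical helper: the value a 32-bit size field would actually hold
 * if `size` were stored in it (high bits are discarded). */
static uint32_t truncate_to_u32(uint64_t size)
{
    return (uint32_t)size;
}
```

With these helpers, `size_fits_u32(1ull << 32)` is false and `truncate_to_u32(1ull << 32)` is 0, whereas both `QUOTED_MAX_BUFFER_SIZE` and `UINT32_MAX` fit. That matches the suggestion of raising the limit to `UINT32_MAX` rather than to a full `1 << 32`.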
19:23 marysaka[d]: I see... in any case my comment about address space still applies, that's something I did hit badly with my RM backend 😅
19:25 _lyude[d]: karolherbst[d]: honestly I'm scared of looking at the nouveau gitlab issue list ._.
19:25 _lyude[d]: i haven't done it for quite a while and i've had enough stuff on my plate to keep up with
19:26 _lyude[d]: also wow, there is a heck of a lot of code we're missing for calculating firmware sizing in nouveau
19:30 esdrastarsis[d]: There are people watching phomes update the spreadsheet (myself included) lol
19:31 mhenning[d]: ooh spreadsheet watch party
19:38 _lyude[d]: -and- apparently the nvidia driver can actually retry the gsp loading process multiple times with different parameters,
19:38 _lyude[d]: wow
20:04 marysaka[d]: _lyude[d]: I think the whole context size allocation also over-allocates on nouveau atm (always allocating 64 subctx when it's not always required)
20:04 marysaka[d]: that and also I think we forget to set the hypervisor type field for the system info part and we end up as hyper-v from memory
20:04 marysaka[d]: *needs to pull out her notes from when she was lost diffing stuff*
20:11 phomes_[d]: sorry I had to go do something else. Back to spreadsheet updating now 🙂
21:06 _lyude[d]: marysaka[d]: sfdafafsdasfdsfad
21:06 _lyude[d]: we should probably fix this
21:07 _lyude[d]: also, oooooooooooooooh boy. yeah. openrm is definitely calculating wpr meta twice, with different values the second time
21:25 airlied[d]: just a passing note: make sure you are on the 570.144 branch of openrm 🙂
21:44 _lyude[d]: airlied[d]: yeah I am about to double check that but unfortunately nvidia seems to make 0 mention of "what is the newest version this version of the driver compiles on".
21:44 _lyude[d]: also, it's kind of difficult to do this when i'm trying to avoid polluting my system with their .run driver
21:47 _lyude[d]: airlied[d]: do you have any idea where I can find this? nvidia keeps claiming "kernel 4.15 or newer" which obviously is false because they can't guarantee their kernel driver actually compiles for versions of the kernel newer than the ones they tested :S. so I'm stuck getting random build errors that presumably are from having too new of a kernel
21:55 airlied[d]: I don't think that info exists anywhere, they just release newer 570 branches that will work on later kernels, but there is no way for them to time travel back and edit end dates
21:55 airlied[d]: but unfortunately for debugging nouveau we need to use 570.144
21:55 _lyude[d]: are the gsp versions the same
21:55 notthatclippy[d]: You can probably just use latest 570.xx. The differences in this area will be absolutely minimal.
21:55 airlied[d]: or at least just read the code 🙂
21:55 _lyude[d]: airlied[d]: i have been lol
21:55 _lyude[d]: but I need to see the values it's coming up with because there's a -lot- of changes that need to happen
21:56 airlied[d]: yeah 570 at least is more stable than 535 where the whole rpc layer got rewritten
21:56 notthatclippy[d]: _lyude[d]: No, but all 570.xx versions after 144 are ABI-stable in all the ways nouveau cares about at least.
21:56 _lyude[d]: good enough for me 🙂
21:56 notthatclippy[d]: You should be able to trivially load newer 570 gsp.bin in nouveau
22:00 _lyude[d]: but yeah - most of me trying to get this info is because there's quite a lot of differences in the code between the two drivers, and it's a bit hard to tell all of the changes that we're actually missing (primarily because of nouveau using raw register offsets, and having a much different organization for how it calculates all of its offsets for loading the GSP in the first place). if I can see at
22:00 _lyude[d]: least which WPR meta-data variables are different though that will at least direct me towards which bits I need to port over
22:00 _lyude[d]: so far I at least know we're missing the MMU_LOCK bits
22:05 _lyude[d]: (also, thank you again for the info on openrm the other day notthatclippy[d], it has made going through openrm way easier 🙂)