00:12redsheep[d]: marysaka[d]: It's very exciting seeing that nearing the finish line. I think that "just" leaves fs interlock and the various RT related extensions before vkd3d-proton can advertise 12_2
00:41airlied[d]: is there any useful info on how NVIDIA do RT?
00:41airlied[d]: I remember hearing different class objects, but I've no idea how that impacts from the POV of how other vendors do it in mesa etc
00:43airlied[d]: like if it's just a compute class that has launch parameters, but you still have to build all the same shaders
00:44mhenning[d]: Konstantin did some RE work, but it's all fairly early https://gitlab.freedesktop.org/KonstantinSeurer/bvhre/-/tree/main/source/nvk?ref_type=heads
00:50karolherbst[d]: looks like it has its own instruction set?
00:53mhenning[d]: I mean, it has its own instructions
00:53mhenning[d]: but not a brand new instruction set no
00:55karolherbst[d]: ahh
01:00gfxstrand[d]: airlied[d]: Pretty sure it's just compute with some new instructions.
01:07gfxstrand[d]: Not sure how shader calls work. Probably just function calls?
01:18airlied[d]: sounds like you have to give the hw a bunch of shader entrypoints, and it does most of the calls
01:25mohamexiety[d]: Yeah they do traversal in HW and intersection/sorting
01:26mohamexiety[d]: They’re by far the most advanced with RT stuff so RT work will be either the simplest compared to other drivers or it will be the most complex, no middle :frog_pregnant:
03:44mangodev[d]: i wonder if software RT would be possible some day, kind of like how RADV does on Polaris
03:54airlied[d]: probably could pull that off now with the compute shader support we have
03:54airlied[d]: maybe with function calling
09:11Hypfer: Good day, I suspect that I might be asking a question that is seeing a lot of asking, but having exhausted the non-AI-hallucinated google results, I guess it's time to show up at the source of truth
09:12Hypfer: So what I saw is that nouveau on older nvidia has been difficult, due to something something signed firmwares, preventing foss from doing a lot of stuff such as reclocking and other power management
09:13Hypfer: In 2022, there was this LWN article https://lwn.net/Articles/910343/ stating something about some new component having been added, that might possibly change something about that
09:13Hypfer: I also found this wiki page which says "todo" for pascal https://nouveau.freedesktop.org/PowerManagement.html
09:14Hypfer: Q: What's the situation atm? Did that GSP thing lead anywhere, or was that met with more roadblocks? Do people care, or did everyone cut their losses and move to less hostile hardware?
09:15Hypfer: For context, my hw is the gtx 1070 and I've been "experiencing" the proprietary nvidia driver over the last few days
09:16Mary: Hypfer: On Turing and later (RTX 2XXX), we use the GSP firmware meaning that we now have reclocking and power management working
09:16Hypfer: Oh the GSP thing only applies to newer stuff? I see
09:16Mary: Yes sadly...
09:25Hypfer: So if I understand it correctly, the firmware on the GPU only accepts signed commands? And the proprietary nvidia driver does.. somehow do that?
09:25Hypfer: (Not bargaining, genuine curiosity. I can just buy a new card and might actually do that, but it sounds interesting)
09:26Hypfer: What confuses me in that understanding is that the driver is code that rests on my disk, so.. the secret that signs also is on my disk? Or do I misunderstand?
09:28Hypfer: LLM tells me that it's actually a firmware blob that is signed, which exposes an interface to the driver, which is entirely undocumented and changes all the time, but LLM is also a lying bastard and not to be trusted
09:30Hypfer: That claim raises questions like "how can that protocol to the signed firmware blob be undocumented if the interfacing part compiles on my machine?"
09:31Hypfer: And if that is answered with "because the compiled part just talks to a blob", does that mean that the driver only works on x86, and can't I "just" throw that blob into ghidra?
09:32Hypfer: Or did they invent their own fake arch and the driver is actually an emulator for a cpu that doesn't exist? But why would they do that? That would sound insane and incompatible with performance
09:33sonicadvance1[d]: The firmware is signed and runs on a RISC-V coprocessor on the GPU itself.
09:33sonicadvance1[d]: The hardware verifies the signature of course so you can't change it without resigning.
09:34Hypfer: Q: Why would you want to modify the firmware?
09:34sonicadvance1[d]: NVIDIA used to not provide firmwares, so open-source projects had to build their own.
09:35sonicadvance1[d]: Older architecture firmwares weren't signed and ran on a custom coprocessor with a custom ISA.
09:35sonicadvance1[d]: Signing killed that idea.
09:37Hypfer: But is that "just" a licensing thing? Because I'd suspect that that blob is somewhere in the proprietary driver and at least the "push blob to card" part is probably doable. So the issue might be "talking to that blob"? But why?
09:38sonicadvance1[d]: Licensing yes, there were some scripts to rip them out of the installer because they couldn't be redistributed. Back when NVIDIA was more hostile towards OSS and Linux.
09:39Hypfer: so the LLM is just making stuff up when it tells me that the interface to that blob would be "highly convoluted and constantly changing"? I mean it must be. The driver is a stationary object. it doesn't change. The cards are long EOL
09:42sonicadvance1[d]: The firmware itself usually doesn't change that dramatically over time. It could break but not without reason.
09:42airlied[d]: It was constantly changing when the driver was under development
09:42airlied[d]: Probably not any more
09:43Hypfer: So are there still any hard roadblocks there then? Or would, hypothetically, saying "okay, final version. We take this one, and then the user has to figure out themselves where they get that blob from" be something that would work if someone cared?
09:43airlied[d]: But it's not just one firmware it's a bunch and they have to be loaded in a certain order and they all interact with each other
09:44airlied[d]: Then you have to RE the API
09:45sonicadvance1[d]: NVIDIA actually uploads most firmwares to linux-firmware these days right?
09:45Hypfer: Which makes perfect sense as that is how it works all the time
09:47Hypfer: Okay so it's 2 things then? 1) NVIDIA was actively hostile and people were scared to rip out that blob and use it; and 2) The interface changed all the time, and those not yet scared eventually grew tired (like with flutter RE)
09:53Hypfer: Okay, remaining question: What makes GSP different? The fact that it is being distributed by nvidia themselves?
09:56Hypfer: And I guess it comes with actual documentation?
09:57x512[m]: To make official NVIDIA kernel driver satisfy open source conditions and be able to use GPL Linux API I suppose.
09:58x512[m]: No, documentation for GSP is not available, only source code showing how to use it.
10:00Hypfer: Right, imprecise, sorry
10:07Hypfer: Okay so essentially, all it takes now would "just" be someone taking that final blob and doing the long, tedious and thankless RE dance?
10:08Hypfer: Which won't be happening in the west (exceptions apply to those that do it for the sake of it), because the path of least resistance is just to buy a new card that works for idk 100€
10:08Hypfer: Thanks!
10:09airlied[d]: Pretty much, find last driver and go rip it up. NVIDIA won't allow us to redist the fws so you will need a cutter
10:21esdrastarsis[d]: airlied[d]: I don't think performance on Turing will be optimal atm, due to the lack of compute mme support 🙁
10:21airlied[d]: Doing sw rt would probably not be optimal anyways since there is hw
10:26esdrastarsis[d]: makes sense 🐸
14:01karolherbst[d]: any further comments/reviews on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40293 (better vec load/store alignment) and https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40298 (vec2 f2f16)?
23:58karolherbst[d]: mhh we haven't wired up I2IP yet? mhh
23:58karolherbst[d]: I see it could potentially be helpful with f2i/f2u lowering..
23:59karolherbst[d]: ohh and there is `nir_intrinsic_convert_alu_types` as well...