00:06 fdobridge: <g​fxstrand> Don't tell the internet but I honestly don't know if NVK will ever support kepler.
00:07 fdobridge: <g​fxstrand> Images are a PITA and the hardware is getting increasingly irrelevant as time goes on.
00:08 fdobridge: <S​id> probably best if we stick to the GSP babes
00:08 fdobridge: <S​id> maaaayyyybe pascal
00:12 fdobridge: <g​fxstrand> Maxwell is practical
00:12 fdobridge: <g​fxstrand> And that gives us Pascal
00:13 fdobridge: <g​fxstrand> They're basically the same for most things.
00:15 fdobridge: <S​id> hmm
00:17 fdobridge: <g​fxstrand> It's Kepler where we have another instruction set, we have to emulate storage images, and we have different texture headers that come with restrictions.
00:17 HdkR: Just supporting Turing+, it's where the sanity begins :P
00:20 HdkR: Ignore anything with classic Falcon :D
00:21 fdobridge: <g​fxstrand> Yeah.
00:22 fdobridge: <g​fxstrand> I've mostly been hacking on Maxwell this week to prove that my subgroups bug *doesn't* repro there.
00:59 fdobridge: <J​oshie with Max-Q Design> Seem to have had something regress gsp wise
00:59 fdobridge: <J​oshie with Max-Q Design> after updating my system
00:59 fdobridge: <J​oshie with Max-Q Design> https://cdn.discordapp.com/attachments/1034184951790305330/1215826350913355946/message.txt?ex=65fe2987&is=65ebb487&hm=e6d7384fd59f296310d7c1a60e1e97ed01c077e4f498c61f8feddb1c27eabb33&
01:01 fdobridge: <J​oshie with Max-Q Design> Ends up making gfx not work on my system
01:01 fdobridge: <S​id> kernel version?
01:02 fdobridge: <J​oshie with Max-Q Design> `Linux nvkvm 6.7.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 08 Mar 2024 01:59:01 +0000 x86_64 GNU/Linux`
01:02 fdobridge: <J​oshie with Max-Q Design> stock arch 6.7.9 (edited)
01:02 fdobridge: <J​oshie with Max-Q Design> gonna build with -git and debug kernel to see if there is more info that can be pulled (edited)
01:02 fdobridge: <J​oshie with Max-Q Design> cc: @airlied ? 🐸
01:02 fdobridge: <g​fxstrand> Laptop or desktop?
01:02 fdobridge: <J​oshie with Max-Q Design> 2060 Super on desktop
01:03 fdobridge: <S​id> mmm, if it's stock arch kernel I can poke around tomorrow
01:03 fdobridge: <g​fxstrand> Not sure then. I'd try -git. That or I have an nvk branch on gitlab.freedesktop.org/gfxstrand/linux which is what I'm running right now.
01:03 fdobridge: <S​id> it's not the suspend issue
01:04 fdobridge: <J​oshie with Max-Q Design> I'll try your nvk branch, sounds fun
01:04 fdobridge: <S​id> calltrace is different
01:05 fdobridge: <J​oshie with Max-Q Design> yeah def not suspend related, this is just me starting it in a VM
01:05 fdobridge: <S​id> oh, vm
01:05 fdobridge: <S​id> interesting
01:05 fdobridge: <S​id> because calltrace suggests an error on boot time init
01:05 fdobridge: <S​id> still, I'll poke around tomorrow
01:06 fdobridge: <J​oshie with Max-Q Design> ```
01:06 fdobridge: <J​oshie with Max-Q Design> [ 2.821417] nouveau 0000:04:00.0: NVIDIA TU106 (166000a1)
01:06 fdobridge: <J​oshie with Max-Q Design> [ 3.037922] nouveau 0000:04:00.0: bios: version 90.06.60.00.00
01:06 fdobridge: <J​oshie with Max-Q Design> ```
01:06 fdobridge: <J​oshie with Max-Q Design> for reference
01:07 fdobridge: <g​fxstrand> There was a GSP display issue a while back
01:07 fdobridge: <g​fxstrand> That tended to kill nouveau on boot
01:07 fdobridge: <g​fxstrand> It's been fixed for a while, though.
01:09 fdobridge: <J​oshie with Max-Q Design> Kinda annoying that the only cts tests for QUEUE_FOREIGN_EXT are with modifiers (edited)
01:10 fdobridge: <J​oshie with Max-Q Design> but kinda also makes sense
01:10 fdobridge: <g​fxstrand> Yeah, it's about the only place where it matters
01:13 fdobridge: <J​oshie with Max-Q Design> I was surprised we do *nothing* for QUEUE_EXTERNAL
01:13 fdobridge: <J​oshie with Max-Q Design> heh
01:47 fdobridge: <J​oshie with Max-Q Design> Same issue there
01:50 fdobridge: <g​fxstrand> Ugh...
06:07 cernico: I had seen Scott Gray's maxas project on GitHub and his articles. So Maxwell introduced modified/improved scheduling over Kepler and earlier SIMD hw, and that's where control codes in scheduling blocks came to light.
11:43 fdobridge: <k​arolherbst🐧🦀> how much of that is required to play some games through proton?
13:47 fdobridge: <m​ohamexiety> well for starters there's no sparse as kepler can't do standard shapes properly
13:47 fdobridge: <m​ohamexiety> then again I guess d3d12 isn't exactly going to be viable on kepler anyway
13:47 fdobridge: <k​arolherbst🐧🦀> yeah.. not really caring about d3d12 here
13:47 fdobridge: <k​arolherbst🐧🦀> more about d3d10 and d3d11 games
13:52 fdobridge: <k​arolherbst🐧🦀> on the other hand the `780 Ti` is like a `GeForce GTX 1660`, which isn't all that slow actually
13:52 fdobridge: <k​arolherbst🐧🦀> maybe closer to a `GTX 1650 SUPER`
14:00 fdobridge: <m​ohamexiety> it's closer to the 1650 on windows. kepler really didn't age well
14:00 fdobridge: <m​ohamexiety> maxwell was a big jump (and also a big shift)
14:12 fdobridge: <g​fxstrand> Images are pretty important...
14:14 fdobridge: <g​fxstrand> Honestly, images are totally doable. @mohamexiety is going to get to write the CPU detiling code pretty soon and it's just porting that to the GPU. It's just annoying.
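As a rough illustration of what CPU detiling means here, the sketch below copies a tiled byte buffer into linear (row-major) order. It is a minimal sketch under loud assumptions: the tile layout (plain row-major tiles, row-major bytes inside each tile) is a simplified stand-in and is not the real NVIDIA block-linear swizzle, and `detile`, `tile_w`, and `tile_h` are hypothetical names, not anything from NVK.

```python
def detile(tiled: bytes, width: int, height: int,
           tile_w: int = 4, tile_h: int = 2) -> bytes:
    """Convert a tiled image (1 byte per texel) to linear row-major order.

    Assumed layout (hypothetical, NOT the real hardware swizzle):
    tiles are stored one after another in row-major tile order, and
    the texels inside each tile are themselves row-major.
    """
    if width % tile_w or height % tile_h:
        raise ValueError("image size must be a multiple of the tile size")

    tiles_per_row = width // tile_w
    tile_size = tile_w * tile_h
    linear = bytearray(width * height)

    for y in range(height):
        for x in range(width):
            # Which tile does (x, y) fall in, and where inside that tile?
            tile_index = (y // tile_h) * tiles_per_row + (x // tile_w)
            within = (y % tile_h) * tile_w + (x % tile_w)
            linear[y * width + x] = tiled[tile_index * tile_size + within]

    return bytes(linear)
```

Porting this to the GPU would mean running the same address computation per texel in a shader instead of in a CPU loop; only the address math would need to change to match the actual hardware layout.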
14:14 fdobridge: <g​fxstrand> Actually, we might not need to do shader tiling. Maybe it's just format conversion. 🤔
14:15 fdobridge: <k​arolherbst🐧🦀> yeah.. it should only be format conversion
14:15 fdobridge: <k​arolherbst🐧🦀> sucks for perf, but whatever
14:15 fdobridge: <k​arolherbst🐧🦀> it's only relevant if it's a real shader image and not a texture
14:15 fdobridge: <g​fxstrand> But still, that's enough to be a headache. Most games really want `shaderStorageImageRead/WriteWithoutFormat`
14:15 fdobridge: <k​arolherbst🐧🦀> mhhh
14:16 fdobridge: <k​arolherbst🐧🦀> but one of those has to be emulated on newer gens anyway, no?
14:16 fdobridge: <g​fxstrand> Nope
14:16 fdobridge: <k​arolherbst🐧🦀> ohh right..
14:16 fdobridge: <g​fxstrand> On Maxwell+, you just suld/sust and off you go
14:16 fdobridge: <k​arolherbst🐧🦀> then on older gens only one of them is missing
14:17 fdobridge: <g​fxstrand> Kepler may have stores. Those are required by GL.
14:17 fdobridge: <k​arolherbst🐧🦀> yeah
14:17 fdobridge: <k​arolherbst🐧🦀> so only reads without format need to be emulated
14:18 fdobridge: <g​fxstrand> Yeah but those are the ones that cost you perf. 😭
14:18 fdobridge: <g​fxstrand> And the big problem is that emulating them requires me to figure out virtual function calls.
14:19 fdobridge: <k​arolherbst🐧🦀> what part of that
14:19 fdobridge: <g​fxstrand> That or emit a pile of ifs. Honestly that's probably the easy way.
14:19 fdobridge: <k​arolherbst🐧🦀> there isn't really much to function calls in the ISA in the first place, it's pretty straightforward
14:20 fdobridge: <k​arolherbst🐧🦀> it's mostly just work on the IR
14:20 fdobridge: <g​fxstrand> Looking at just how much garbage the NVIDIA compiler dumps out for certain things makes me feel less bad about nonsense like image format conversion.
14:20 fdobridge: <g​fxstrand> Yeah and that's not trivial.
14:20 fdobridge: <k​arolherbst🐧🦀> the expensive part is the load itself anyway
14:20 fdobridge: <g​fxstrand> I want to do it anyway, though.
14:20 fdobridge: <k​arolherbst🐧🦀> register saving is probably the only hard part here
14:21 fdobridge: <g​fxstrand> The expensive part is the stall which you can't schedule away if it's in a vfunc. 🫤
14:21 fdobridge: <g​fxstrand> But whatever. We'll figure it out if we have to.
14:22 fdobridge: <k​arolherbst🐧🦀> what do you mean by virtual function anyway? Call to functions aren't really that much different to normal branches
14:24 fdobridge: <k​arolherbst🐧🦀> on pre turing call just pushes to the stack as well
14:24 fdobridge: <k​arolherbst🐧🦀> and you have an API depth system value you may or may not increase
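To make the "pile of ifs" option from the exchange above concrete: emulating a formatless image read means loading the raw texel bits and then branching on the image's actual format at run time to convert them. The sketch below shows that shape on the CPU. It is only an illustration under stated assumptions: `decode_texel` and `unorm` are hypothetical names, the format tags borrow Vulkan format names for readability, only three formats are covered, and a real driver would emit the equivalent logic in shader code rather than Python.

```python
import struct


def unorm(bits: int, width: int) -> float:
    """Map an unsigned normalized integer of `width` bits to [0.0, 1.0]."""
    return bits / ((1 << width) - 1)


def decode_texel(raw: int, fmt: str) -> tuple:
    """Convert raw little-endian texel bits to an (r, g, b, a) float tuple.

    Components missing from the format are filled with 0.0 for color
    channels and 1.0 for alpha, as Vulkan does for image loads.
    """
    if fmt == "R8G8B8A8_UNORM":
        # Four 8-bit channels, R in the lowest byte.
        return tuple(unorm((raw >> (8 * i)) & 0xFF, 8) for i in range(4))
    if fmt == "R16G16_UNORM":
        # Two 16-bit channels, R in the low half.
        return (unorm(raw & 0xFFFF, 16),
                unorm((raw >> 16) & 0xFFFF, 16),
                0.0, 1.0)
    if fmt == "R32_FLOAT":
        # Reinterpret the 32 bits as an IEEE-754 single.
        (r,) = struct.unpack("<f", struct.pack("<I", raw & 0xFFFFFFFF))
        return (r, 0.0, 0.0, 1.0)
    raise ValueError(f"unsupported format: {fmt}")
```

Each branch is cheap on its own; as noted in the discussion, the cost is more about the load and the stall around it than the conversion arithmetic.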
20:07 fdobridge: <b​utterflies> fun fact: NVIDIA somehow shipped a D3D12 driver on Fermi (edited)
20:08 fdobridge: <b​utterflies> D3D12 on Fermi came in the very last releases before EOL though, FL11_0
20:50 fdobridge: <m​ohamexiety> yeah
20:50 fdobridge: <m​ohamexiety> kepler is FL11_0 too iirc
20:51 fdobridge: <m​ohamexiety> I used the Fermi D3D12 driver a few times when I had a GT 525M laptop. it was actually usually worse LOL