00:06fdobridge: <gfxstrand> Don't tell the internet but I honestly don't know if NVK will ever support kepler.
00:07fdobridge: <gfxstrand> Images are a PITA and the hardware is getting increasingly irrelevant as time goes on.
00:08fdobridge: <Sid> probably best if we stick to the GSP babes
00:08fdobridge: <Sid> maaaayyyybe pascal
00:12fdobridge: <gfxstrand> Maxwell is practical
00:12fdobridge: <gfxstrand> And that gives us Pascal
00:13fdobridge: <gfxstrand> They're basically the same for most things.
00:15fdobridge: <Sid> hmm
00:17fdobridge: <gfxstrand> It's Kepler where we have another instruction set, we have to emulate storage images, and we have different texture headers that come with restrictions.
00:17HdkR: Just supporting Turing+, it's where the sanity begins :P
00:20HdkR: Ignore anything with classic Falcon :D
00:21fdobridge: <gfxstrand> Yeah.
00:22fdobridge: <gfxstrand> I've mostly been hacking on Maxwell this week to prove that my subgroups bug *doesn't* repro there.
00:59fdobridge: <Joshie with Max-Q Design> Seem to have had something regress gsp wise
00:59fdobridge: <Joshie with Max-Q Design> after updating my system
00:59fdobridge: <Joshie with Max-Q Design> https://cdn.discordapp.com/attachments/1034184951790305330/1215826350913355946/message.txt?ex=65fe2987&is=65ebb487&hm=e6d7384fd59f296310d7c1a60e1e97ed01c077e4f498c61f8feddb1c27eabb33&
01:01fdobridge: <Joshie with Max-Q Design> Ends up making gfx not work on my system
01:01fdobridge: <Sid> kernel version?
01:02fdobridge: <Joshie with Max-Q Design> `Linux nvkvm 6.7.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 08 Mar 2024 01:59:01 +0000 x86_64 GNU/Linux`
01:02fdobridge: <Joshie with Max-Q Design> stock arch 6.7.9 (edited)
01:02fdobridge: <Joshie with Max-Q Design> gonna build with -git and debug kernel to see if there is more info that can be pulled (edited)
01:02fdobridge: <Joshie with Max-Q Design> cc: @airlied ? 🐸
01:02fdobridge: <gfxstrand> Laptop or desktop?
01:02fdobridge: <Joshie with Max-Q Design> 2060 Super on desktop
01:03fdobridge: <Sid> mmm, if it's stock arch kernel I can poke around tomorrow
01:03fdobridge: <gfxstrand> Not sure then. I'd try -git. That or I have an nvk branch on gitlab.freedesktop.org/gfxstrand/linux which is what I'm running right now.
01:03fdobridge: <Sid> it's not the suspend issue
01:04fdobridge: <Joshie with Max-Q Design> I'll try your nvk branch, sounds fun
01:04fdobridge: <Sid> calltrace is different
01:05fdobridge: <Joshie with Max-Q Design> yeah def not suspend related, this is just me starting it in a VM
01:05fdobridge: <Sid> oh, vm
01:05fdobridge: <Sid> interesting
01:05fdobridge: <Sid> because calltrace suggests an error on boot time init
01:05fdobridge: <Sid> still, I'll poke around tomorrow
01:06fdobridge: <Joshie with Max-Q Design> ```
01:06fdobridge: <Joshie with Max-Q Design> [ 2.821417] nouveau 0000:04:00.0: NVIDIA TU106 (166000a1)
01:06fdobridge: <Joshie with Max-Q Design> [ 3.037922] nouveau 0000:04:00.0: bios: version 90.06.60.00.00
01:06fdobridge: <Joshie with Max-Q Design> ```
01:06fdobridge: <Joshie with Max-Q Design> for reference
01:07fdobridge: <gfxstrand> There was a GSP display issue a while back
01:07fdobridge: <gfxstrand> That tended to kill nouveau on boot
01:07fdobridge: <gfxstrand> It's been fixed for a while, though.
01:09fdobridge: <Joshie with Max-Q Design> Kinda annoying that the only cts tests for QUEUE_FOREIGN_EXT are with modifiers (edited)
01:10fdobridge: <Joshie with Max-Q Design> but kinda also makes sense
01:10fdobridge: <gfxstrand> Yeah, it's about the only place where it matters
01:13fdobridge: <Joshie with Max-Q Design> I was surprised we do *nothing* for QUEUE_EXTERNAL
01:13fdobridge: <Joshie with Max-Q Design> heh
01:47fdobridge: <Joshie with Max-Q Design> Same issue there
01:50fdobridge: <gfxstrand> Ugh...
06:07cernico: I had seen Scott Gray's maxas project on GitHub and his articles. So Maxwell introduces modified/improved scheduling over Kepler and earlier SIMD hw. So the introduction of control codes in scheduling blocks came to light.
11:43fdobridge: <karolherbst🐧🦀> how much of that is required to play some games through proton?
13:47fdobridge: <mohamexiety> well for starters there's no sparse as kepler can't do standard shapes properly
13:47fdobridge: <mohamexiety> then again I guess d3d12 isn't exactly going to be viable on kepler anyway
13:47fdobridge: <karolherbst🐧🦀> yeah.. not really caring about d3d12 here
13:47fdobridge: <karolherbst🐧🦀> more about d3d10 and d3d11 games
13:52fdobridge: <karolherbst🐧🦀> on the other hand the `780 Ti` is like a `GeForce GTX 1660`, which isn't all that slow actually
13:52fdobridge: <karolherbst🐧🦀> maybe closer to a `GTX 1650 SUPER`
14:00fdobridge: <mohamexiety> it's closer to the 1650 on windows. kepler really didn't age well
14:00fdobridge: <mohamexiety> maxwell was a big jump (and also a big shift)
14:12fdobridge: <gfxstrand> Images are pretty important...
14:14fdobridge: <gfxstrand> Honestly, images are totally doable. @mohamexiety is going to get to write the CPU detiling code pretty soon and it's just porting that to the GPU. It's just annoying.
14:14fdobridge: <gfxstrand> Actually, we might not need to do shader tiling. Maybe it's just format conversion. 🤔
14:15fdobridge: <karolherbst🐧🦀> yeah.. it should only be format conversion
14:15fdobridge: <karolherbst🐧🦀> sucks for perf, but whatever
14:15fdobridge: <karolherbst🐧🦀> it's only relevant if it's a real shader image and not a texture
14:15fdobridge: <gfxstrand> But still, that's enough to be a headache. Most games really want `shaderStorageImageRead/WriteWithoutFormat`
14:15fdobridge: <karolherbst🐧🦀> mhhh
14:16fdobridge: <karolherbst🐧🦀> but one of those has to be emulated on newer gens anyway, no?
14:16fdobridge: <gfxstrand> Nope
14:16fdobridge: <karolherbst🐧🦀> ohh right..
14:16fdobridge: <gfxstrand> On Maxwell+, you just suld/sust and off you go
14:16fdobridge: <karolherbst🐧🦀> then on older gens only one of them is missing
14:17fdobridge: <gfxstrand> Kepler may have stores. Those are required by GL.
14:17fdobridge: <karolherbst🐧🦀> yeah
14:17fdobridge: <karolherbst🐧🦀> so only reads without format need to be emulated
14:18fdobridge: <gfxstrand> Yeah but those are the ones that cost you perf. 😭
14:18fdobridge: <gfxstrand> And the big problem is that emulating them requires me to figure out virtual function calls.
14:19fdobridge: <karolherbst🐧🦀> what part of that
14:19fdobridge: <gfxstrand> That or emit a pile of ifs. Honestly that's probably the easy way.
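A minimal sketch of the "pile of ifs" approach to emulating a read without format, written in Python for illustration only. It assumes the raw 32-bit texel has already been fetched and that the image format is known only at run time. The `Fmt` tags and `load_without_format` are hypothetical names, not NVK or Mesa identifiers, and a real driver would emit the equivalent branches in shader IR rather than run anything like this on the CPU.

```python
import struct
from enum import Enum


class Fmt(Enum):
    """Hypothetical run-time format tags (not NVK's real enum)."""
    RGBA8_UNORM = 0
    RG16_UNORM = 1
    R32_FLOAT = 2
    R32_UINT = 3


def load_without_format(raw: int, fmt: Fmt):
    """Convert one raw 32-bit texel to (r, g, b, a) by branching on fmt."""
    if fmt is Fmt.RGBA8_UNORM:
        # Four 8-bit channels, lowest byte first, normalized to [0, 1].
        return tuple(((raw >> (8 * i)) & 0xFF) / 255.0 for i in range(4))
    if fmt is Fmt.RG16_UNORM:
        r = (raw & 0xFFFF) / 65535.0
        g = ((raw >> 16) & 0xFFFF) / 65535.0
        return (r, g, 0.0, 1.0)  # missing channels filled with (0, 1)
    if fmt is Fmt.R32_FLOAT:
        # Reinterpret the 32 bits as an IEEE-754 float.
        (r,) = struct.unpack("<f", struct.pack("<I", raw))
        return (r, 0.0, 0.0, 1.0)
    if fmt is Fmt.R32_UINT:
        return (raw, 0, 0, 1)
    raise ValueError(f"unhandled format: {fmt}")
```

Each extra supported format adds another branch after the load, which is the cost being weighed here against routing the conversion through a function call.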
14:19fdobridge: <karolherbst🐧🦀> there isn't really much to function calls in the ISA in the first place, it's pretty straightforward
14:20fdobridge: <karolherbst🐧🦀> it's mostly just work on the IR
14:20fdobridge: <gfxstrand> Looking at just how much garbage the NVIDIA compiler dumps out for certain things makes me feel less bad about nonsense like image format conversion.
14:20fdobridge: <gfxstrand> Yeah and that's not trivial.
14:20fdobridge: <karolherbst🐧🦀> the expensive part is the load itself anyway
14:20fdobridge: <gfxstrand> I want to do it anyway, though.
14:20fdobridge: <karolherbst🐧🦀> register saving is probably the only hard part here
14:21fdobridge: <gfxstrand> The expensive part is the stall which you can't schedule away if it's in a vfunc. 🫤
14:21fdobridge: <gfxstrand> But whatever. We'll figure it out if we have to.
14:22fdobridge: <karolherbst🐧🦀> what do you mean by virtual function anyway? Calls to functions aren't really that much different from normal branches
14:24fdobridge: <karolherbst🐧🦀> on pre turing call just pushes to the stack as well
14:24fdobridge: <karolherbst🐧🦀> and you have an API depth system value you may or may not increase
20:07fdobridge: <butterflies> fun fact: NVIDIA somehow shipped a D3D12 driver on Fermi (edited)
20:08fdobridge: <butterflies> D3D12 on Fermi came in the very last releases before EOL though, FL11_0
20:50fdobridge: <mohamexiety> yeah
20:50fdobridge: <mohamexiety> kepler is FL11_0 too iirc
20:51fdobridge: <mohamexiety> I used the Fermi D3D12 driver a few times when I had a GT 525M laptop. it was actually usually worse LOL