00:10fdobridge: <gfxstrand> @airlied While we're messing about, it'd be great if you could rebase the linux-firmware copr
00:11fdobridge: <gfxstrand> I've updated grub and am attempting an install
00:27fdobridge: <airlied> okay throwing into copr in a few mins
00:35fdobridge: <airlied> https://copr.fedorainfracloud.org/coprs/airlied/nouveau-gsp/build/6593115/ will be it when it finished
00:38fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> NAK is still too slow to replace codegen (and it only supports Turing)
00:39fdobridge: <gfxstrand> Thanks!
00:52fdobridge: <airlied> okay dropped in on my f39 machine and installed a kernel with nouveau.config=NvGspRm=1 on command line and it at least boots 🙂
01:04fdobridge: <gfxstrand> Woo
01:09fdobridge: <gfxstrand> I've got my 780 in my box right now and not enough power for both that and the 2060 (1 kW PSU coming tomorrow) so I'll play with it later.
03:17fdobridge: <airlied> okay sent the gsp pullreq to Linus, we shall see if we make 6.7 or 6.8
03:30fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Hopefully he doesn't accidentally drop it 🍁
03:32fdobridge: <airlied> I said it was optional for 6.7 so he will hopefully tell me yes or no 🙂
08:54fdobridge: <rhed0x> assuming it does get merged, what are the next steps?
08:58fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Basically you have to install the GSP firmware and set a kernel command line option on pre-Ada (once you've installed the new kernel)
09:00fdobridge: <rhed0x> no, I mean what work needs to happen on it
09:04fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> The first part (that has various display fixes) has already been merged so hwmon support is one of the missing features when using GSP
09:49fdobridge: <airlied> Not sure how hwmon would work, also svm faults need work
10:36fdobridge: <karolherbst🐧🦀> could have a worker thread which polls GSP every 10 seconds or so
10:36fdobridge: <karolherbst🐧🦀> and with `*sleep_range` might not even hurt too much
10:36fdobridge: <karolherbst🐧🦀> and then hwmon just returns cached values
10:42fdobridge: <karolherbst🐧🦀> what's the actual problem here btw?
10:42fdobridge: <karolherbst🐧🦀> we do have the interfaces to GSP for that afaik
11:24fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> 10 seconds 🐢
11:24fdobridge: <karolherbst🐧🦀> well.. it's good enough
11:24fdobridge: <karolherbst🐧🦀> I think nvidia polls like every 5 seconds?
11:24fdobridge: <karolherbst🐧🦀> or maybe they just cache it?
11:25fdobridge: <karolherbst🐧🦀> something something
12:30mupuf: airlied: oh, I was just coming to ask about the GSP situation :D
12:31mupuf: "hwmon support is one of the missing features when using GSP" --> I feel called out
12:32mupuf: karolherbst: please don't do that. Can't you call the GSP every time someone asks for the temperature?
12:32mupuf: Just set the update property to 5s, and the userspace will know how often to poll
12:33karolherbst: the thing is, that some people use like UI applications
12:33mupuf: whoever gets to it first, I have been allowed to spend some time doing review for Nouveau... so I guess hwmon would be something I can actually help with
12:33karolherbst: and they poll everything at once
12:33karolherbst: and they might do so every second
12:33karolherbst: and each value probably would be one RPC call
12:33mupuf: and? It's not efficient but... does it matter?
12:33karolherbst: it might be too slow
12:34karolherbst: or causes other issues
12:34karolherbst: also not very efficient anyway
12:34mupuf: I would hope the RPC interface would be fast-enough for these not to matter
12:34karolherbst: I think there is just one RPC to give us everything at once
12:34mupuf: premature optimization is the root of all evil ;)
12:34mupuf: well, at least there is that
12:34karolherbst: we could also just cache it for x seconds
12:34mupuf: yeah, that sounds sane
12:35karolherbst: like hwmon does a request, we fetch everything and then just use that for a while
12:35mupuf: +1 for that
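The caching idea agreed on above (one bulk fetch, then serve reads from it for a while) could look roughly like the sketch below. This is only an illustration in Python, not nouveau code: `fetch_all` is a hypothetical stand-in for the single GSP RPC that karolherbst believes returns everything at once, and the `temp1_input`/`fan1_input` keys in the test are just example names.

```python
import time
from typing import Callable, Dict, Optional


class CachedSensors:
    """Serve sensor reads from one bulk fetch until it is older than max_age_s."""

    def __init__(self, fetch_all: Callable[[], Dict[str, int]],
                 max_age_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_all = fetch_all   # hypothetical: one call returns every value
        self._max_age_s = max_age_s
        self._clock = clock           # injectable so the expiry logic can be tested
        self._values: Dict[str, int] = {}
        self._stamp: Optional[float] = None

    def read(self, name: str) -> int:
        now = self._clock()
        # Refetch only on the first read or once the cached snapshot has expired.
        if self._stamp is None or now - self._stamp >= self._max_age_s:
            self._values = self._fetch_all()
            self._stamp = now
        return self._values[name]
```

With this shape, a UI tool that polls every attribute once a second triggers at most one fetch per `max_age_s`, no matter how many attributes it reads.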
12:37mupuf: Honestly, the real question here is going to be the mapping from GSP to HWMON's interface
12:37karolherbst: yeah
12:38karolherbst: some things like temperature are probably straightforward
12:38karolherbst: maybe we even have fan rpms
12:38karolherbst: power consumption? probably as well, but the question is do we get one or multiple values?
12:38karolherbst: a lot of unknowns :)
12:39mupuf: oh, it isn't documented what they expose?
12:39mupuf: I would expect somewhat the same structure as what the thermal vbios table would container
12:39karolherbst: uhh.. I just don't remember, I've looked at the header though
12:39mupuf: contain*
12:39karolherbst: let me find it
12:46karolherbst: can't find it anymore :'(
14:24fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I just updated the nouveau-fw-gsp package to the more recent firmware: https://aur.archlinux.org/cgit/aur.git/commit/PKGBUILD?h=nouveau-fw-gsp&id=59b755901bdecfa6c93d7eeaed7d2edc56dc5b90
15:21fdobridge: <gfxstrand> Annoyingly, `NVK_DEBUG=push_dump` doesn't dump QMDs...
15:21fdobridge: <karolherbst🐧🦀> yeah.. I've talked with @marysaka about that
15:21fdobridge: <karolherbst🐧🦀> the dumper might want to get a context and parse some of the interesting things
15:21fdobridge: <karolherbst🐧🦀> so it could dump QMDs, shaders, macros, etc...
15:22fdobridge: <karolherbst🐧🦀> or maybe a callback to the runtime to return the blob
15:22fdobridge: <karolherbst🐧🦀> something something
15:22fdobridge: <gfxstrand> Yeah, that's the way the Intel dumper works.
15:27fdobridge: <marysaka> yeah...
15:33fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> That term seems pretty obscure (does it mean `Queue Meta Data`?)
15:34fdobridge: <karolherbst🐧🦀> correct
15:38fdobridge: <gfxstrand> Ugh... we're not handling relocs
15:38fdobridge: <karolherbst🐧🦀> do we need them for kepler?
15:38fdobridge: <gfxstrand> Where do those even come from?
15:39fdobridge: <karolherbst🐧🦀> ohh.. you mean the codegen stuff? uhhh
15:39fdobridge: <karolherbst🐧🦀> that shouldn't happen after tesla...
15:39fdobridge: <karolherbst🐧🦀> ehh wait..
15:39fdobridge: <karolherbst🐧🦀> we do...
15:39fdobridge: <karolherbst🐧🦀> pain
15:39fdobridge: <karolherbst🐧🦀> it's for the builtin stuff 🥲
15:40fdobridge: <karolherbst🐧🦀> @gfxstrand just lower opcodes
15:40fdobridge: <karolherbst🐧🦀> in nir
15:40fdobridge: <gfxstrand> Only for OP_CALL
15:40fdobridge: <karolherbst🐧🦀> yeah
15:40fdobridge: <karolherbst🐧🦀> that's for integer division mostly I think
15:40fdobridge: <karolherbst🐧🦀> or so..
15:40fdobridge: <karolherbst🐧🦀> anyway
15:40fdobridge: <karolherbst🐧🦀> lower it on the nir level
15:42fdobridge: <gfxstrand> Yeah, I'm not worried about that. I was more worried I had relocations
15:42fdobridge: <gfxstrand> But I don't. I added an assert
15:48fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> How important is lowering for NAK? :ferris:
15:49fdobridge: <gfxstrand> Every compiler employs some amount of lowering somewhere. NAK tries to do as much up-front in NIR as it can.
15:49fdobridge: <gfxstrand> There isn't even control-flow in this shader. How is it going wrong depending on address?!?
16:00fdobridge: <karolherbst🐧🦀> that would rather sound like an alignment issue....
16:00fdobridge: <karolherbst🐧🦀> but yeah...
16:25fdobridge: <gfxstrand> Alignment doesn't make sense either
16:42fdobridge: <gfxstrand> 0x000, 0x100, 0x200, 0x300 all work, 0x400 doesn't
16:43fdobridge: <gfxstrand> I wonder if the GPU is tiling my shader buffer
16:43fdobridge: <gfxstrand> I wonder if the GPU is tiling my shader heap (edited)
16:55fdobridge: <karolherbst🐧🦀> there is some cache involved, but not sure how much that one is relevant here
17:00fdobridge: <gfxstrand> Okay, playing with PTE kinds is definitely doing something. 🤦🏻♀️
17:01fdobridge: <gfxstrand> IDK if I should be poking at the old UAPI or not...
17:02fdobridge: <gfxstrand> PTE stuff definitely changed at Maxwell
17:02fdobridge: <gfxstrand> pascal, rather
17:04fdobridge: <karolherbst🐧🦀> yeah.. might be worth a try to see if it's indeed an UAPI issue or not
17:04fdobridge: <karolherbst🐧🦀> but also.. pain
17:04fdobridge: <karolherbst🐧🦀> not sure why it would really matter
17:05fdobridge: <gfxstrand> Like I said, this feels a whole lot like someone is tiling my shader heap
17:05fdobridge: <gfxstrand> Which sounds like PTEs messed up somehow
17:05fdobridge: <gfxstrand> the old UAPI takes a different, presumably working, path.
17:06fdobridge: <karolherbst🐧🦀> mhhh.. yeah, could be
17:09fdobridge: <gfxstrand> Old API doesn't seem any better
17:13fdobridge: <gfxstrand> I wonder if this is why the GL driver likes to push so much stuff through DMAs
17:17fdobridge: <gfxstrand> Like, it can't be my map being screwed up. 0x400 is still inside a tile
17:18fdobridge: <gfxstrand> Like, it can't be my map being screwed up. 0x400 is still inside a page (edited)
17:19fdobridge: <gfxstrand> I bet my shader at 0x400 is still somewhere. I just need to find it.
17:39fdobridge: <karolherbst🐧🦀> but where would it be honestly?
17:40fdobridge: <karolherbst🐧🦀> maybe it's just some cache flushing after all
17:45fdobridge: <gfxstrand> If I fill the whole heap with my shader....
17:46fdobridge: <gfxstrand> It's at 0x0000, 0x0100, 0x0200, 0x0300, 0x2000, 0x1f00, 0x1e00, 0x1d00, 0x1c00
17:46fdobridge: <gfxstrand> I can't find it anywhere else.
17:46fdobridge: <karolherbst🐧🦀> odd
17:46fdobridge: <gfxstrand> It's not at 0x0400, 0x0500, 0x0600, 0x0700, 0x1a00, or 0x1b00
17:47fdobridge: <karolherbst🐧🦀> what if you upload it via dma?
17:47fdobridge: <karolherbst🐧🦀> like the command buffer stuff
17:47fdobridge: <gfxstrand> I haven't tried that
17:47fdobridge: <gfxstrand> We don't really have infra for that in NVK right now
17:47fdobridge: <gfxstrand> I'm debating if I type it or keep going
17:47fdobridge: <karolherbst🐧🦀> mhh, yeah...
17:48fdobridge: <karolherbst🐧🦀> it's good to have as a helper anyway
17:48fdobridge: <karolherbst🐧🦀> is the buffer coherent and everything?
17:48fdobridge: <karolherbst🐧🦀> (and also mapped as such)
17:48fdobridge: <gfxstrand> When my shader is at 0x0000, everything works.
17:48fdobridge: <gfxstrand> Shader executes, dEQP checks the result buffer okay
17:48fdobridge: <gfxstrand> And the shader writes 9.0f to a buffer, so I'm pretty sure it's doing things
17:49fdobridge: <karolherbst🐧🦀> anyway, I'd check how the buffer is created and if it's a coherent one, because if not, that _might_ explain things
17:49fdobridge: <gfxstrand> hrm...
17:50fdobridge: <gfxstrand> Yeah, placing it in GART works.
17:50fdobridge: <gfxstrand> Maybe PCIe maps are broken
17:51fdobridge: <karolherbst🐧🦀> pain
17:51fdobridge: <gfxstrand> Not hitting some VRAM banks or something
17:51fdobridge: <karolherbst🐧🦀> but yeah.. uploading data via the pushbuffer solves all those caching issues :ferrisUpsideDown:
17:51fdobridge: <gfxstrand> Yeah but it also sucks. 😦
17:52fdobridge: <karolherbst🐧🦀> yeah...
17:52fdobridge: <karolherbst🐧🦀> it's fine in GL, but I can see how it's annoying in vulkan
17:53fdobridge: <gfxstrand> Yeah, in Vulkan, I basically have to have a queue of uploads that gets flushed periodically
17:53fdobridge: <karolherbst🐧🦀> yeah
17:53gfxstrand: dakr: ^^
17:53gfxstrand: If you want something to look into
17:53fdobridge: <karolherbst🐧🦀> I wonder if we can make VRAM mappings coherent or something...
17:53gfxstrand: I can get VRAM maps on Kepler but they seem very weirdly busted.
17:54fdobridge: <gfxstrand> I mean, typically they're write-combine
17:54fdobridge: <gfxstrand> But who knows if they actually work.
17:54fdobridge: <karolherbst🐧🦀> yeah...
17:54fdobridge: <karolherbst🐧🦀> no idea
17:54fdobridge: <karolherbst🐧🦀> I'm sure there is a reason we've never used it in gl
17:54fdobridge: <karolherbst🐧🦀> like.. nvidia probably also never did it in gl this way
17:54fdobridge: <gfxstrand> I'm sure there's *a* reason. I'm not sure it's a good one.
17:54fdobridge: <gfxstrand> 😝
17:55fdobridge: <karolherbst🐧🦀> right
17:55fdobridge: <karolherbst🐧🦀> but some parts are also just "copy nvidia" 😄
17:55fdobridge: <karolherbst🐧🦀> but yeah...
17:55fdobridge: <karolherbst🐧🦀> I think in GL nvidia generally uploads via the push buffers
17:55fdobridge: <karolherbst🐧🦀> maybe they changed it
17:55fdobridge: <karolherbst🐧🦀> but I know that in the traces I've seen in the past they don't map upload them
17:58fdobridge: <gfxstrand> The thing that really sucks about having to do them via DMA is that shader deletes have to check if there's an upload pending for that shader and flush the queue if there is.
17:58fdobridge: <gfxstrand> Or we have to refcount or something
17:58fdobridge: <karolherbst🐧🦀> yeah...
17:58fdobridge: <karolherbst🐧🦀> or just use GART on kepler 🤷
17:58fdobridge: <gfxstrand> Which would mean a delete can fail. Yeah, we don't want that.
17:58fdobridge: <gfxstrand> GART on kepler is the short-term plan
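The upload-queue bookkeeping described above (batch the uploads, and make a shader delete flush the queue if that shader still has an upload pending) can be sketched as follows. This is a Python illustration only, not NVK code: `submit` is a hypothetical callback standing in for "record the DMA copies, submit them, and wait", and the sketch assumes queued copies cannot be dropped one by one once they have been recorded.

```python
from typing import Callable, Dict, List, Tuple


class ShaderUploadQueue:
    """Batch shader uploads; flush before freeing a shader that is still queued."""

    def __init__(self, submit: Callable[[List[Tuple[int, bytes]]], None]) -> None:
        self._submit = submit                 # hypothetical: does the copies and waits
        self._pending: Dict[int, bytes] = {}  # heap offset -> shader bytes

    def enqueue(self, offset: int, code: bytes) -> None:
        self._pending[offset] = code

    def flush(self) -> None:
        # Submit everything queued so far as one batch, ordered by heap offset.
        if self._pending:
            batch = sorted(self._pending.items())
            self._pending.clear()
            self._submit(batch)

    def free(self, offset: int) -> None:
        # Assumption: a queued copy is already recorded and cannot be removed,
        # so it must land before the heap range may be reused.
        if offset in self._pending:
            self.flush()
```

The cost gfxstrand points out is visible here: `free` can end up doing a submit, which is exactly the place where a failure would be unwelcome.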
17:59fdobridge: <karolherbst🐧🦀> mhhh
17:59fdobridge: <karolherbst🐧🦀> could track dirty memory ranges and do the upload where the heap gets bound as well
17:59fdobridge: <karolherbst🐧🦀> ehh wait
18:00fdobridge: <karolherbst🐧🦀> it keeps the same VM address, no?
18:00fdobridge: <gfxstrand> yeah
18:00fdobridge: <karolherbst🐧🦀> mhhh
18:00fdobridge: <gfxstrand> well, sort-of
18:00fdobridge: <gfxstrand> Ugh... resizes
18:00fdobridge: <karolherbst🐧🦀> 🙂
18:00fdobridge: <gfxstrand> But that's okay. I can flush the queue on resize.
18:00fdobridge: <karolherbst🐧🦀> yeah..
18:00fdobridge: <karolherbst🐧🦀> what you could do is to upload dirty ranges of the shader heap bo on dynamic state tracking
18:00fdobridge: <karolherbst🐧🦀> and use those helpers we have in mesa for that
18:00fdobridge: <gfxstrand> But ugh... This is a hidden queue which means that in a multi-queue setup, every vkQueueSubmit() has to flush it and wait on it with a syncobj.
18:01fdobridge: <gfxstrand> yuck
18:01fdobridge: <gfxstrand> I can abstract around it but yuck
18:01fdobridge: <karolherbst🐧🦀> uhhh..
18:01fdobridge: <gfxstrand> GART for now seems better
18:01fdobridge: <karolherbst🐧🦀> right
18:01fdobridge: <karolherbst🐧🦀> yeah... pre turing that all sucks 😄
18:01fdobridge: <karolherbst🐧🦀> I can totally see why nvidia moved aware from that
18:01fdobridge: <karolherbst🐧🦀> *away
18:02fdobridge: <gfxstrand> Maxwell seems fine
18:03fdobridge: <karolherbst🐧🦀> pain
18:03fdobridge: <karolherbst🐧🦀> I think there have been significant changes in the memory controller on maxwell, so that might be related
18:05fdobridge: <gfxstrand> ```
18:05fdobridge: <gfxstrand> Test run totals:
18:05fdobridge: <gfxstrand> Passed: 283/303 (93.4%)
18:05fdobridge: <gfxstrand> Failed: 0/303 (0.0%)
18:05fdobridge: <gfxstrand> Not supported: 20/303 (6.6%)
18:05fdobridge: <gfxstrand> Warnings: 0/303 (0.0%)
18:05fdobridge: <gfxstrand> Waived: 0/303 (0.0%)
18:05fdobridge: <gfxstrand> ```
18:05fdobridge: <gfxstrand> That's ssbo.basic_types
18:06fdobridge: <gfxstrand> That would certainly explain it, yes.
18:07fdobridge: <karolherbst🐧🦀> not sure which gen it was exactly, but they changed a lot there
18:07fdobridge: <karolherbst🐧🦀> we kinda had 2.5 gens of kepler and 2 gens of maxwell
18:07fdobridge: <karolherbst🐧🦀> what kepler and maxwell did you test it on?
18:08fdobridge: <karolherbst🐧🦀> also
18:08fdobridge: <karolherbst🐧🦀> looks like 2nd gen maxwell got the new mmu code
18:09fdobridge: <karolherbst🐧🦀> but yeah..
18:10fdobridge: <karolherbst🐧🦀> 2nd gen maxwell also has updated PTE types
18:15fdobridge: <gfxstrand> But my 750 TI is fine
18:15fdobridge: <gfxstrand> That should be Maxwell A, shouldn't it?
18:16fdobridge: <gfxstrand> NVIDIA website says it
18:16fdobridge: <gfxstrand> NVIDIA website says it is (edited)
18:21gfxstrand: dakr: My NVIDIA guy who knows everything doesn't remember VRAM mappings being broken on Kepler so he suspects we're doing something wrong in nouveau.ko
18:21gfxstrand: dakr: Sadly, I don't think we really have a good reference to look at for HW that old. OGK doesn't even have Kepler headers.
18:22fdobridge: <gfxstrand> ```
18:22fdobridge: <gfxstrand> Test run totals:
18:22fdobridge: <gfxstrand> Passed: 5952/12160 (48.9%)
18:22fdobridge: <gfxstrand> Failed: 17/12160 (0.1%)
18:22fdobridge: <gfxstrand> Not supported: 6191/12160 (50.9%)
18:22fdobridge: <gfxstrand> Warnings: 0/12160 (0.0%)
18:22fdobridge: <gfxstrand> Waived: 0/12160 (0.0%)
18:22fdobridge: <gfxstrand> ```
18:22fdobridge: <gfxstrand> That's all the SSBO tests. The missing 17 are probably codegen failing to RA. 🙃
18:23fdobridge: <gfxstrand> We're off to the races now!
18:25fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26035
18:25fdobridge: <gfxstrand> Time to do another CTS run
18:26fdobridge: <karolherbst🐧🦀> mhhhh.. yeah, that's odd, as that one should use the same kernel code as kepler...
18:26fdobridge: <karolherbst🐧🦀> but it can still be a nouveau bug
18:30fdobridge: <gfxstrand> imageLoad seems busted
18:33fdobridge: <gfxstrand> Does suld not handle formats on Kepler?
18:34fdobridge: <gfxstrand> vulkaninfo says no...
18:38fdobridge: <gfxstrand> I guess it's possible bindless is busted
18:42fdobridge: <gfxstrand> What is it fetching from c0?!?
18:42fdobridge: <gfxstrand> What is it fetching from c1?!? (edited)
18:44fdobridge: <gfxstrand> Oh, this is the image lowering nonsense
18:45fdobridge: <karolherbst🐧🦀> correct
18:46fdobridge: <gfxstrand> Yeah, there's no way we can plumb that through right now
18:46fdobridge: <gfxstrand> That sucks
18:46fdobridge: <karolherbst🐧🦀> I think nvidia limited kepler to vk 1.2
18:46fdobridge: <gfxstrand> That was for other reasons
18:46fdobridge: <karolherbst🐧🦀> right..
18:46fdobridge: <karolherbst🐧🦀> but yeah... for kepler we can't do format lowering in any sane way
18:47fdobridge: <karolherbst🐧🦀> *conversion
18:47fdobridge: <gfxstrand> The problem is that we need to pull all this stuff from the descriptor set, not some compiler-decided cbuf
18:47fdobridge: <gfxstrand> We solved this on Intel because everything pre-SKL needs lowering
18:47fdobridge: <karolherbst🐧🦀> pain
18:48fdobridge: <gfxstrand> It's not too bad. You just do the lowering in NIR
18:49fdobridge: <karolherbst🐧🦀> if we already have it, it might indeed be not too bad
18:50fdobridge: <gfxstrand> We did that lowering in NIR because then we could share between vec4 and fs back-ends
18:50fdobridge: <gfxstrand> Then I just had to rework it a bit and I could deploy it for VK
19:05fdobridge: <gfxstrand> It sucks that so much stuff uses storage images
19:05fdobridge: <gfxstrand> `Pass: 12160, Fail: 6, Crash: 2202, Skip: 130130, Flake: 2, Duration: 3:25, Remaining: 1:31:04`
19:06fdobridge: <karolherbst🐧🦀> oof
19:07fdobridge: <gfxstrand> Still, only 6 fails in the first 3.5 minutes is pretty encouraging.
19:08fdobridge: <gfxstrand> If it's less than 200 or so and most of my crashes are the assert I just added, I'm going to say Kepler works.
19:13fdobridge: <karolherbst🐧🦀> yeah.. there aren't really good reasons why kepler shouldn't 🙂
19:13fdobridge: <gfxstrand> Well, VRAM maps. 😛
19:15fdobridge: <karolherbst🐧🦀> right.. but that's an unexpected reason 😛
19:15fdobridge: <karolherbst🐧🦀> I'd be more worried about the nil stuff
19:15fdobridge: <karolherbst🐧🦀> but also...
19:24fdobridge: <gfxstrand> NIL checks out according to my image layout fuzzer
19:25fdobridge: <karolherbst🐧🦀> col
19:25fdobridge: <karolherbst🐧🦀> *cool
19:25fdobridge: <gfxstrand> Well, the machine didn't survive
19:25fdobridge: <gfxstrand> Let's try this again. 😅
19:26fdobridge: <karolherbst🐧🦀> :ferrisUwU:
20:05fdobridge: <gfxstrand> RIP
20:05fdobridge: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1170091440165093436/message.txt?ex=6557c792&is=65455292&hm=0098426656b576f4e85672471209bab6b6819d7394d95a3f45da6e51babadf0e&
20:07fdobridge: <karolherbst🐧🦀> mhhhhh
20:09fdobridge: <karolherbst🐧🦀> I wish I knew what the reg nouveau waits on to change actually does
20:09fdobridge: <karolherbst🐧🦀> `PFIFO_FLUSH.FLUSH_CTRL` mhhh
20:09fdobridge: <karolherbst🐧🦀> bit 2 means `BUSY`... odd
20:10fdobridge: <karolherbst🐧🦀> @gfxstrand mind dumping the content of that reg on timeout? though not sure if `nvkm_msec` returns an error or not...
20:20fdobridge: <gfxstrand> It took the whole machine down. 😢
20:21fdobridge: <karolherbst🐧🦀> yeah....
20:21fdobridge: <karolherbst🐧🦀> it's kinda a known issue
20:21fdobridge: <karolherbst🐧🦀> somehow the GPU stops doing stuff
20:22fdobridge: <karolherbst🐧🦀> but a bug like that is also a pain to trigger
20:22fdobridge: <karolherbst🐧🦀> so what it's doing is to trigger a cache flush, but that never completes
20:23fdobridge: <karolherbst🐧🦀> and the GPU not doing anything else also kinda means it's kinda toast
20:27fdobridge: <gfxstrand> deqp-runner on Kepler hits it pretty reliably. 😂
20:40fdobridge: <karolherbst🐧🦀> fair enough. Maybe I can try to figure this one out then
21:11fdobridge: <gfxstrand> I'm running on @airlied's GSP branch FWIW
22:32fdobridge: <karolherbst🐧🦀> mhhh
22:32fdobridge: <karolherbst🐧🦀> given that I've seen that error before, I doubt it's related 😄
22:32fdobridge: <karolherbst🐧🦀> but yeah.. maybe makes it more likely or whatever
22:34fdobridge: <phomes> I was thinking that it could be a fun project for the evening to see if I could fix a visual glitch in Serious Sam on nvk
22:34fdobridge: <phomes> I have been looking at various things without spotting anything wrong. Perhaps someone here has an idea of where I should even begin to look? 🙂
22:35fdobridge: <phomes> https://cdn.discordapp.com/attachments/1034184951790305330/1170129050904297552/Screenshot_from_2023-11-03_22-20-06.png?ex=6557ea99&is=65457599&hm=d9017e7482dbb77dd55567b425ced61513fbe7325265cd17b9a07bd864e7a803&
22:35fdobridge: <phomes> The shadows drawn in the top left corner should have covered the entire screen. The long lines along the top and left should not be there
22:35fdobridge: <karolherbst🐧🦀> the shadow?
22:35fdobridge: <karolherbst🐧🦀> right...
22:36fdobridge: <karolherbst🐧🦀> mhhhh
22:36fdobridge: <karolherbst🐧🦀> looks like some scaling problem
22:36fdobridge: <phomes> I looked at a capture with RenderDoc and there we do have a 2D color attachment of the same size as the screen
22:36fdobridge: <karolherbst🐧🦀> mhh
22:36fdobridge: <karolherbst🐧🦀> MSAA enabled/disabled?
22:36fdobridge: <phomes> the color attachment is what is in the lower right
22:36fdobridge: <karolherbst🐧🦀> but uhhh
22:36fdobridge: <karolherbst🐧🦀> weird
22:37fdobridge: <karolherbst🐧🦀> it looks like something up with coords in texturing
22:37fdobridge: <karolherbst🐧🦀> it's obviously clamping to the border
22:37fdobridge: <karolherbst🐧🦀> it's obviously clamping to the edge (edited)
22:37fdobridge: <phomes> MSAA disabled
22:38fdobridge: <phomes> sampler is clamp to edge for U and V
22:38fdobridge: <karolherbst🐧🦀> right
22:38fdobridge: <karolherbst🐧🦀> make it look funky my switching the clamp mode to repeat 😄
22:38fdobridge: <karolherbst🐧🦀> *by
22:39fdobridge: <karolherbst🐧🦀> I'd at least try it, to verify it's something with coords
22:39fdobridge: <phomes> I will try that
22:39fdobridge: <karolherbst🐧🦀> please post a screenshot as I want to know what it looks like 😄
22:42fdobridge: <phomes> https://cdn.discordapp.com/attachments/1034184951790305330/1170130815699648603/Screenshot_from_2023-11-03_23-41-47.png?ex=6557ec3e&is=6545773e&hm=2b8aad7550fd67aa270cf9964ea5a9662f44c48734a7cebb75aac0f084d47df8&
22:42fdobridge: <karolherbst🐧🦀> mhhh
22:42fdobridge: <karolherbst🐧🦀> I was thinking mirror_repeat, but uhhh
22:42fdobridge: <karolherbst🐧🦀> 😄
22:42fdobridge: <karolherbst🐧🦀> anyway
22:43fdobridge: <karolherbst🐧🦀> so it's something fucked up with coords
22:43fdobridge: <karolherbst🐧🦀> I'm sure mirror_repeat looks way better
22:43fdobridge: <karolherbst🐧🦀> not sure what's the proper value here in vulkan
22:44fdobridge: <karolherbst🐧🦀> `VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT`
22:44fdobridge: <karolherbst🐧🦀> anyway...
22:45fdobridge: <phomes> https://cdn.discordapp.com/attachments/1034184951790305330/1170131545131077632/Screenshot_from_2023-11-03_23-44-39.png?ex=6557ecec&is=654577ec&hm=e4bef8968aafe1b6a19df5b25550490d8d76b1fe4789d13eef03bf826461dc65&
22:45fdobridge: <karolherbst🐧🦀> anything suspicious the shader is doing with the coords
22:45fdobridge: <karolherbst🐧🦀> I expected that to look more cursed
22:46fdobridge: <karolherbst🐧🦀> uhh.. it's spir-v... so I guess that might be hard to figure out what's wrong here
22:46fdobridge: <karolherbst🐧🦀> @phomes anyway, could file a bug and attach traces or something... dunno
22:47fdobridge: <phomes> I can file a bug of course. I just wanted to have a little fun tonight and see if I could figure this out 🙂
22:47fdobridge: <karolherbst🐧🦀> yeah.. fair
22:48fdobridge: <karolherbst🐧🦀> anyway, image coords being messed up is the best I can help with 😄
23:09fdobridge: <mhenning> I assume you're using codegen? You can try setting `NV50_PROG_OPTIMIZE` to 0 or 1 to see if it's a backend optimization that's breaking stuff
23:10fdobridge: <mhenning> or set `NV50_PROG_DEBUG=1` to print out the shader assembly
23:13fdobridge: <mhenning> I'd probably be trying to narrow down whether the shader compiler is wrong or the rest of the driver
23:14fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> How does codegen assembly compare with NAK assembly? 🤔
23:16fdobridge: <phomes> @mhenning codegen and NAK both have same result
23:17fdobridge: <mhenning> the main difference is NAK instructions are all uppercase which makes me think the compiler is yelling at me
23:19fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I mean how optimized NAK assembly is when compared to codegen (vkcube only has 50-60 FPS with NAK)
23:20fdobridge: <mhenning> Okay, cool. then it's probably not in the backend - although it could still be a compiler bug if it's in one of NVK's lowering passes, which get shared across backends
23:20fdobridge: <phomes> NV50_PROG_OPTIMIZE=1 has same result
23:21fdobridge: <phomes> NV50_PROG_OPTIMIZE=0 crashes with "src/nouveau/codegen/nv50_ir_emit_gv100.cpp:144: void nv50_ir::CodeEmitterGV100::emitFormA(uint16_t, uint8_t, int, int, int): Assertion `insn->src(src0 & FA_SRC_MASK).getFile() == FILE_GPR' failed"
23:22fdobridge: <mhenning> I'm guessing that texture isn't mipmapped?
23:22fdobridge: <pac85> Mmm that's a screen space buffer seemingly sampled using models UVs
23:23fdobridge: <mhenning> woah, you're right - I missed that it's at a bit of an angle
23:24fdobridge: <mhenning> but also... what? how?
23:26fdobridge: <mhenning> maybe we bind the texture to the wrong slot?
23:27fdobridge: <pac85> It does get blended correctly (it gets multiplied with albedo)
23:27fdobridge: <pac85> (And this game has a weird format for shadow maps so if it was bound to that slot you'd definitely get a different result)
23:28fdobridge: <pac85> (And this game has a weird format for light maps so if it was bound to that slot you'd definitely get a different result) (edited)
23:41fdobridge: <gfxstrand> How do I tell if I successfully got GSP?
23:42fdobridge: <karolherbst🐧🦀> the errors look different
23:42fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Run `sensors`
23:42fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> If there's no nouveau then you're using GSP
23:47fdobridge: <gfxstrand> Ah. Updated linux-firmware but not nvidia-firmware
23:53fdobridge: <gfxstrand> It's time we put GSP through its paces. 😁
23:53fdobridge: <gfxstrand> I feel air coming out of the back of my turing card
23:55fdobridge: <gfxstrand> I haven't seen a flake yet.... This is already promising