13:50AndrewR: karolherbst, thanks for looking into "my" bugs!
13:53karolherbst: no problem
16:18fdobridge: <gfxstrand> @airlied What branches/patches do I need to cobble together to get this supposedly working GSP?
17:08fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I might try to do that
18:08fdobridge: <airlied> https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/nouveau-fw-535.113.01?ref_type=heads
18:09fdobridge: <airlied> https://copr.fedorainfracloud.org/coprs/airlied/nouveau-gsp/
18:10fdobridge: <airlied> Find the nvidia-gpu-firmware rpm for your distro from the copr
18:10fdobridge: <airlied> You might have to downgrade the installed one
18:17fdobridge: <airlied> @gfxstrand there may be some regressions vs non gsp, but let me know
19:10fdobridge: <airlied> I should probably rebase the copr to avoid downgrading
19:19fdobridge: <gfxstrand> @airlied doesn't build...
19:20fdobridge: <airlied> Hmm missing file or something?
19:20fdobridge: <gfxstrand> ```
19:20fdobridge: <gfxstrand> drivers/usb/typec/altmodes/displayport.c: In function ‘dp_altmode_vdm’:
19:20fdobridge: <gfxstrand> drivers/usb/typec/altmodes/displayport.c:309:33: error: too few arguments to function ‘drm_connector_oob_hotplug_event’
19:20fdobridge: <gfxstrand> 309 | drm_connector_oob_hotplug_event(dp->connector_fwnode);
19:20fdobridge: <gfxstrand> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19:20fdobridge: <gfxstrand> In file included from drivers/usb/typec/altmodes/displayport.c:17:
19:20fdobridge: <gfxstrand> ./include/drm/drm_connector.h:1984:6: note: declared here
19:20fdobridge: <gfxstrand> 1984 | void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode,
19:20fdobridge: <gfxstrand> ```
19:24fdobridge: <gfxstrand> I think I typed an okay fix
19:24fdobridge: <airlied> ah missing fix, I threw it on the branch
19:25fdobridge: <airlied> it's already in drm-next so just never landed it elsewhere
19:40fdobridge: <karolherbst🐧🦀> the timer resolution on nvidia is 1000ns, right?
19:43fdobridge: <gfxstrand> ```
19:43fdobridge: <gfxstrand> grub2-mkrelpath: error: failed to get canonical path of `/boot/vmlinuz-6.6.0-rc7-nvk-uapi+'.
19:43fdobridge: <gfxstrand> ```
19:43fdobridge: <airlied> you going to make me install f39 aren't you
19:48fdobridge: <gfxstrand> IDK
19:48fdobridge: <gfxstrand> Oh, I should really be using an f39 config for this, shouldn't I?
19:49fdobridge: <airlied> I've kicked off an update on one laptop, it might be grub2 bug
19:51fdobridge: <gfxstrand> I was running out of space on /boot so that might have had something to do with it.
19:51fdobridge: <karolherbst🐧🦀> the eternal pain of /boot
19:54fdobridge: <gfxstrand> You made me install f39. 😛
19:55fdobridge: <airlied> ah yes not enough /boot will do bad things
19:59fdobridge: <marysaka> nice reminder that I need to update all my fedora installs this week-end :vReiAgony:
20:02fdobridge: <airlied> @gfxstrand then you need to boot with nouveau.config=NvGspRm=1
20:02fdobridge: <airlied> assuming you're not on Ada
20:03fdobridge: <marysaka> did the Ampere display issue get fixed?
20:04fdobridge: <airlied> there's an ampere display issue?
20:06fdobridge: <gfxstrand> Yeah, IDK what's going wrong with kernel installs
20:06fdobridge: <marysaka> yeah without this commented out the display init doesn't work https://gitlab.freedesktop.org/skeggsb/nouveau/-/commit/3b0b5f211852a86c5f539b4e4ec91e36927dc031#7a34141ffa70af07a2d14b7dedf8d3864519a0bc_65_73
20:07fdobridge: <gfxstrand> It's not copying vmlinuz to /boot
20:08fdobridge: <marysaka> I may have forgotten to report it properly, if you want me to post it on some tracker let me know @airlied
20:09fdobridge: <airlied> doesn't work with GSP?
20:10fdobridge: <airlied> https://gitlab.freedesktop.org/drm/nouveau/-/issues is probably a place to go
20:12fdobridge: <gfxstrand> Yeah, `/usr/bin/install-kernel` is somehow busted
20:14fdobridge: <gfxstrand> And it's a binary so I can't easily debug it
20:14fdobridge: <airlied> just building a kernel now so will see in a minute and escalate if I can reproduce
20:15fdobridge: <gfxstrand> Of course it's part of the systemd package. 🤦🏻♀️
20:26fdobridge: <airlied> okay something is installing bzImage instead of vmlinuz
20:26fdobridge: <airlied> fml
20:33fdobridge: <gfxstrand> 🍿
20:36fdobridge: <karolherbst🐧🦀> oh no.. now that NAK actually lands, I don't have a good excuse for ignoring rusticl bugs on nouveau anymore :ferrisCluelesser:
20:37fdobridge: <airlied> okay I've done what I'm best at, escalating it to a mailing list, I'll probably debug it for real later
20:37fdobridge: <airlied> @karolherbst just ask for function calls 😛
20:37fdobridge: <karolherbst🐧🦀> Faith.....
20:37fdobridge: <karolherbst🐧🦀> 😄
20:38fdobridge: <karolherbst🐧🦀> but anyway
20:38fdobridge: <karolherbst🐧🦀> rusticl on nvc0 ain't _that_ bad
20:38fdobridge: <karolherbst🐧🦀> though it throws a bunch of `PUSH_NOT_ENOUGH_DATA`
20:38fdobridge: <karolherbst🐧🦀> but that could be my fault flushing in a cursed way on unused queues or something
20:39fdobridge: <karolherbst🐧🦀> though even then it shouldn't matter as we still emit the fence...
20:39fdobridge: <karolherbst🐧🦀> probably some driver bug
20:48fdobridge: <gfxstrand> Uh, IDK that you want to make nvc0 use NAK
20:52fdobridge: <karolherbst🐧🦀> mostly for CL, but then again.. maybe we just want a new driver
20:53fdobridge: <gfxstrand> Yeah, IDK.
20:53fdobridge: <karolherbst🐧🦀> or maybe it's just zink 😄
20:54fdobridge: <karolherbst🐧🦀> might want to give it a go on nvk
20:54fdobridge: <karolherbst🐧🦀> but I'd need `float_controls`
20:54fdobridge: <karolherbst🐧🦀> for correctness
21:20fdobridge: <karolherbst🐧🦀> `Pass 1995 Fails 309 Crashes 150 Timeouts 0: 100%` I think that's not tooooo bad
21:21fdobridge: <karolherbst🐧🦀> 100 fails are like conversion nonsense
21:21fdobridge: <karolherbst🐧🦀> 8/16 bit stuff is just broken and I don't know if I have the 🥄 to fix codegen
21:28fdobridge: <gfxstrand> plugs in a kepler
21:30fdobridge: <gfxstrand> ```
21:30fdobridge: <gfxstrand> crucible: info : ran 1000 tests
21:30fdobridge: <gfxstrand> crucible: info : pass 491
21:30fdobridge: <gfxstrand> crucible: info : fail 509
21:30fdobridge: <gfxstrand> crucible: info : skip 0
21:30fdobridge: <gfxstrand> crucible: info : lost 0
21:30fdobridge: <gfxstrand> ```
21:30fdobridge: <gfxstrand> Off to a good start. 😅
21:31fdobridge: <karolherbst🐧🦀> prolly something silly 😄
21:33fdobridge: <gfxstrand> Oh, my image layout code doesn't work right on Kepler. Probably some alignment thing that's different.
21:41fdobridge: <gfxstrand> ```
21:41fdobridge: <gfxstrand> crucible: info : ran 1000 tests
21:41fdobridge: <gfxstrand> crucible: info : pass 1000
21:41fdobridge: <gfxstrand> crucible: info : fail 0
21:41fdobridge: <gfxstrand> crucible: info : skip 0
21:41fdobridge: <gfxstrand> crucible: info : lost 0
21:41fdobridge: <gfxstrand> ```
21:41fdobridge: <gfxstrand> Nothing wrong with image layouts. Kepler just tiles funny
21:42fdobridge: <gfxstrand> My tests were asserting that (0, 0) is always at the first byte in the tile which isn't true on Kepler
21:42fdobridge: <orowith2os> depending on the size of the patch, one could argue that it was "only an x line patch" that went from 50% fails to 100% passes :v
21:43fdobridge: <gfxstrand> Good to know that the tiling changed, though. That'll come in handy for VK_EXT_host_image_copy
21:51fdobridge: <gfxstrand> Looks like something's wrong with the instruction heap. That or I've got an alignment off somewhere
21:52fdobridge: <karolherbst🐧🦀> first instruction needs to be 0x80 aligned
21:53fdobridge: <karolherbst🐧🦀> but that should be the same on later gens until turing
21:58fdobridge: <karolherbst🐧🦀> @gfxstrand ohh before you go crazy on debugging: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25997/diffs?commit_id=84b086d994d18a24bed01a5094cba02ea1316cd2
21:58fdobridge: <karolherbst🐧🦀> might want to land that from that MR asap 😄
22:04fdobridge: <gfxstrand> Yeah, we're not landing that
22:04fdobridge: <gfxstrand> Oh, that one patch? Yeah.
22:05fdobridge: <karolherbst🐧🦀> yeah...
22:05fdobridge: <karolherbst🐧🦀> just that one patch
22:05fdobridge: <karolherbst🐧🦀> 😄
22:09fdobridge: <gfxstrand> Something is wrong on kepler with shader caching
22:09fdobridge: <gfxstrand> Or alignments
22:12fdobridge: <karolherbst🐧🦀> what's the error you are seeing?
22:23fdobridge: <gfxstrand> faults
22:24fdobridge: <gfxstrand> But it depends on shader address and other shaders uploaded previously
22:24fdobridge: <gfxstrand> it's fun like that
22:30fdobridge: <karolherbst🐧🦀> sounds annoying
22:30fdobridge: <gfxstrand> Okay, if my shader is uploaded at 0xff00, it's fine. At 0xfb00, it breaks
22:30fdobridge: <karolherbst🐧🦀> huh?
22:31fdobridge: <karolherbst🐧🦀> is that the first instruction or the header?
22:31fdobridge: <gfxstrand> uh... header?
22:31fdobridge: <gfxstrand> Oh, this is compute. No headers
22:31fdobridge: <karolherbst🐧🦀> well that would be invalid
22:31fdobridge: <karolherbst🐧🦀> ahhh
22:31fdobridge: <karolherbst🐧🦀> okay
22:31fdobridge: <karolherbst🐧🦀> for compute it's fine...
22:32fdobridge: <karolherbst🐧🦀> hope it's not some cache invalidation stuff or something...
22:32fdobridge: <karolherbst🐧🦀> but normally if there is an error the hardware complains loudly before executing the code
22:34fdobridge: <gfxstrand> Hrm... My error depends on where it's located... WTH?!?
22:34fdobridge: <gfxstrand> Now I'm getting OOR_ADDR instead of a fault
22:36fdobridge: <karolherbst🐧🦀> mhhh
22:36fdobridge: <karolherbst🐧🦀> maybe something up with the descriptors?
22:37fdobridge: <karolherbst🐧🦀> `OOR_ADDR` is I think an out of range access on non global storage
22:37fdobridge: <karolherbst🐧🦀> maybe also on input/output stuff
22:37fdobridge: <karolherbst🐧🦀> but it's compute, soo...
22:38fdobridge: <gfxstrand> The error I get changes as I move the shader around
22:38fdobridge: <gfxstrand> Something funky is going on
22:38fdobridge: <karolherbst🐧🦀> yeah.. sounds like it
22:38fdobridge: <karolherbst🐧🦀> maybe the QMD is busted?
22:38fdobridge: <gfxstrand> Seems plausible
22:38fdobridge: <gfxstrand> But I suspect it's something else because apparently vkcube is busted
22:38fdobridge: <karolherbst🐧🦀> pain
22:38fdobridge: <gfxstrand> but QMDs are a real possibility, too.
22:38fdobridge: <gfxstrand> I'll look more tomorrow
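[editor's note] A minimal sketch of the alignment arithmetic from the exchange above, assuming the 0x80 requirement karolherbst mentions for the first instruction is the only alignment rule in play. The helper names `is_aligned` and `natural_alignment` are hypothetical and do not come from Mesa or NVK; this is plain address arithmetic, not driver code.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment (alignment must be a power of two)."""
    return addr & (alignment - 1) == 0


def natural_alignment(addr: int) -> int:
    """Return the largest power of two that divides addr (its lowest set bit)."""
    return addr & -addr


# The two upload addresses quoted in the log: one worked, one faulted.
for addr in (0xff00, 0xfb00):
    print(hex(addr), is_aligned(addr, 0x80), hex(natural_alignment(addr)))
```

Both quoted addresses pass the 0x80 check and share the same natural alignment, so under this assumption a plain alignment rule does not tell the working upload address apart from the faulting one.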
23:20fdobridge: <airlied> @gfxstrand https://bugzilla.redhat.com/show_bug.cgi?id=2239008 is the bug
23:21fdobridge: <airlied> looks like updates-testing might have the fix, will test when I get back home