00:04fdobridge: <gfxstrand> Oh, would you look at that
00:04fdobridge: <gfxstrand> ```
00:04fdobridge: <gfxstrand> [ 568.313367] nouveau 0000:17:00.0: gsp:msg fn:103 len:0x78/0x58 res:0x62 resp:0x62
00:04fdobridge: <gfxstrand> [ 568.313380] msg: 00000000: 04 00 d0 c1 04 00 d0 c1 00 00 1d de 80 00 00 00 ................
00:04fdobridge: <gfxstrand> [ 568.313385] msg: 00000010: 62 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00 b...8...........
00:04fdobridge: <gfxstrand> [ 568.313389] msg: 00000020: 00 00 00 00 04 00 d0 c1 00 00 00 00 00 00 00 00 ................
00:04fdobridge: <gfxstrand> [ 568.313392] msg: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00:04fdobridge: <gfxstrand> [ 568.313395] msg: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00:04fdobridge: <gfxstrand> [ 568.313398] msg: 00000050: 00 00 00 00 00 00 00 00 ........
00:04fdobridge: <gfxstrand> [ 568.313854] nouveau 0000:17:00.0: deqp-vk[25607]: VMM allocation failed: -22
00:05fdobridge: <gfxstrand> ```
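Note on the log above: the `-22` in the final line follows the usual Linux kernel convention of returning a negated errno value, and errno 22 is `EINVAL` (invalid argument). A minimal sketch of decoding such a value, assuming Python is at hand; the helper name `decode_kernel_errno` is hypothetical and not part of nouveau or any tool mentioned here:

```python
import errno

def decode_kernel_errno(code: int) -> str:
    """Map a negative kernel-style return code to its symbolic errno name."""
    # Kernel code conventionally returns -ERRNO, so flip the sign before lookup.
    return errno.errorcode.get(-code, "UNKNOWN")

print(decode_kernel_errno(-22))
```

This only names the error; it says nothing about which argument the VMM allocation rejected.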
00:07fdobridge: <marysaka> VMM allocation failed :vReiAgony:
00:08fdobridge: <pac85> Timing related or is it a different path in the kernel driver?
00:20fdobridge: <gfxstrand> Looks like we're not quite ready for prime time yet. Only took about 7m worth of dEQP
00:20fdobridge: <gfxstrand> @airlied ^^
01:08fdobridge: <airlied> Yes there is a patch to fix that, just not sure about it
01:35fdobridge: <airlied> I'll post it when I get to a pc
03:08fdobridge: <gfxstrand> Cool. Happy to stress it for you
03:25fdobridge: <airlied> nouveau-fw-535.113.01 has the hacky workaround on it now
03:37fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Where can I find it?
04:08fdobridge: <gfxstrand> Just threw 18 threads of dEQP at it. I'll check it in the morning to see if it survived.
04:23fdobridge: <Sid> I should get back to what I was doing but past couple weeks have been so hectic D:
04:27fdobridge: <gfxstrand> Well that could have gone better...
04:27fdobridge: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1170217827807015062/message.txt?ex=65583d47&is=6545c847&hm=a0943cc251f095ec3fe46a273bf52ef23dc00cb997623b7f4b70813bccd1cdcf&
04:29fdobridge: <gfxstrand> Survived a whole 7 minutes
04:29fdobridge: <gfxstrand> @airlied ^^
04:52fdobridge: <airlied> That's a new one I haven't seen, and doesn't immediately shout gsp
08:58fdobridge: <marysaka> I got that one once even without GSP btw
08:59fdobridge: <marysaka> (it just happened randomly and never reproduced it again)
09:07fdobridge: <airlied> It is weird since it's a CPU side mmio mapping failure
09:08fdobridge: <karolherbst🐧🦀> yeah...
09:08fdobridge: <karolherbst🐧🦀> and `4d6000` isn't an invalid offset
09:08fdobridge: <karolherbst🐧🦀> maybe the GPU disconnected or something..
09:08fdobridge: <karolherbst🐧🦀> that sometimes happens
09:09fdobridge: <marysaka> I was connecting with thunderbolt so that was my assumption at that time
09:14fdobridge: <karolherbst🐧🦀> ahhh.. yeah
09:14fdobridge: <karolherbst🐧🦀> soo
09:14fdobridge: <karolherbst🐧🦀> mhh
15:29fdobridge: <gfxstrand> I'm plugged into a real PCIe slot
15:29fdobridge: <gfxstrand> I can try again
17:20fdobridge: <marysaka> Tested with your new branch, it doesn't work with or without GSP. Going to write an issue now
17:34fdobridge: <marysaka> @airlied opened an issue with all the details and the root cause of it https://gitlab.freedesktop.org/drm/nouveau/-/issues/270
17:40fdobridge: <marysaka> If you need more logs I can provide some, sorry for reporting that this late I totally forgot to open it last month... 😅
19:08fdobridge: <mohamexiety> was planning on trying out gsp on ampere tomorrow so I guess I'll need to comment out that line?
19:11fdobridge: <airlied> Probably not
19:12fdobridge: <airlied> The ampere I have works fine
19:13airlied: Lyude: might be an issue to look into ^
19:13Lyude: will do
19:20fdobridge: <airlied> @marysaka can you add card make model info?
19:20fdobridge: <marysaka> will drop the kernel logs in the issue
19:25fdobridge: <marysaka> @airlied added my logs but it reports as ``NVIDIA GA107 (b77000a1)``
19:26fdobridge: <marysaka> it's an RTX 3050 Mobile if I'm not wrong
19:26fdobridge: <airlied> So it's in a laptop?
19:26fdobridge: <marysaka> yes
19:26fdobridge: <karolherbst🐧🦀> the issue happens on laptops with pure accelerator GPUs
19:27fdobridge: <karolherbst🐧🦀> or at least some of them
19:27fdobridge: <airlied> Please add the laptop make/model
19:27fdobridge: <karolherbst🐧🦀> like no display, no nothing, only render
19:27fdobridge: <airlied> Might need to conditionalise the change
19:27fdobridge: <karolherbst🐧🦀> @airlied they are also sometimes correctly marked as `3D` devices in lspci
19:27fdobridge: <karolherbst🐧🦀> yes
19:27fdobridge: <karolherbst🐧🦀> nouveau has to detect it
19:28fdobridge: <karolherbst🐧🦀> some GPUs have the entire display block fused off
19:28fdobridge: <karolherbst🐧🦀> we do it for non GSP afaik
19:28fdobridge: <airlied> Sounds like it's broken on non gsp
19:28fdobridge: <marysaka> the display block works tho
19:28fdobridge: <marysaka> on non GSP before that it just lists 0 displays @karolherbst
19:29fdobridge: <karolherbst🐧🦀> does the MMIO region work?
19:29fdobridge: <marysaka> yup
19:29fdobridge: <karolherbst🐧🦀> ahh
19:29fdobridge: <karolherbst🐧🦀> maybe it's just partly fused off...
19:29fdobridge: <karolherbst🐧🦀> something something
19:29fdobridge: <karolherbst🐧🦀> but yeah
19:29fdobridge: <karolherbst🐧🦀> there are GPUs like that
19:29fdobridge: <karolherbst🐧🦀> @marysaka is the GPU listed as a `3D` or `VGA` device in lspci?
19:29fdobridge: <marysaka> we had a conversation about that at the end of september need to look that up again
19:29fdobridge: <marysaka> ``0000:01:00.0 3D controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Mobile] (rev a1)``
19:29fdobridge: <karolherbst🐧🦀> yeah.. I'm too lazy to check
19:29fdobridge: <karolherbst🐧🦀> okay
19:30fdobridge: <karolherbst🐧🦀> so it's correctly identified as a 3D device
19:30fdobridge: <karolherbst🐧🦀> I know that that info is _sometimes_ bogus
19:30fdobridge: <karolherbst🐧🦀> but there should be some way through MMIO to identify it correctly
19:30fdobridge: <karolherbst🐧🦀> probably would have to check how/where non GSP nouveau bails
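Note on the `3D` versus `VGA` distinction above: lspci derives those labels from the PCI class code, where base class `0x03` is a display controller, subclass `0x00` is "VGA compatible controller" and subclass `0x02` is "3D controller". On Linux the raw value is exposed in sysfs as a string such as `0x030200`. A rough sketch of classifying such a string, assuming that sysfs format; the function name `classify_display_class` is hypothetical, and as noted in the conversation the class code itself can be bogus, so this is not a reliable test for whether display hardware is usable:

```python
def classify_display_class(class_value: str) -> str:
    """Classify a PCI class string such as '0x030200' (class, subclass, prog-if)."""
    raw = int(class_value, 16)          # accepts the '0x' prefix and trailing newline
    base_class = (raw >> 16) & 0xFF     # 0x03 means display controller
    subclass = (raw >> 8) & 0xFF        # 0x00 = VGA compatible, 0x02 = 3D controller
    if base_class != 0x03:
        return "not a display controller"
    names = {0x00: "VGA compatible controller", 0x02: "3D controller"}
    return names.get(subclass, "other display controller")

print(classify_display_class("0x030200"))
```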
19:31fdobridge: <airlied> Or just dig in openrm
19:31fdobridge: <airlied> Since I assume this change is inspired by it
19:31fdobridge: <karolherbst🐧🦀> ohh yeah.. might be good to check if openrm even handles it
19:36fdobridge: <marysaka> I don't see any usage of this in openrm https://github.com/NVIDIA/open-gpu-kernel-modules/blob/be3cd9abcb1103115ae6c3c92d8fc4ff5c912f77/src/common/inc/swref/published/ampere/ga100/dev_fuse.h#L133
19:36fdobridge: <marysaka> so I'm kind of confused 😅
19:36fdobridge: <karolherbst🐧🦀> I suspect GSP might
19:37fdobridge: <karolherbst🐧🦀> there _might_ be some GPU info field stating if display is supported
19:37fdobridge: <karolherbst🐧🦀> might make sense to check the RPC calls we have/use there
19:37fdobridge: <karolherbst🐧🦀> identifying if displays are even supported is kinda a mess, because each OEM does it differently
19:37fdobridge: <karolherbst🐧🦀> and it's uhhh... broken
19:37fdobridge: <karolherbst🐧🦀> some even expose one or two fake connectors not doing anything
19:44fdobridge: <marysaka> what I don't get is why it needs to be disabled in devinit :aki_thonk:
19:45fdobridge: <marysaka> if you don't disable it, everything inits fine in GSP and non GSP situations
19:50fdobridge: <airlied> It must have caused problems on some other hw, but if openrm never does it, it might not be necessary
20:05fdobridge: <airlied> gpuFuseSupportsDisplay_GA100 is in openrm
20:06fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Why is the word "fuse" in there?
20:06fdobridge: <airlied> because that's what it is
20:07fdobridge: <airlied> so nvidia fails kdispStatePreInitLocked if it's not supported, so I assume that is the inspiration for this
20:08fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Fuse sounds quite scary
20:08fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://wiki.archlinux.org/title/Nouveau#Phantom_output_issue
20:13fdobridge: <airlied> actually I think the fix is fairly easy, we shouldn't die on driver load there
20:17fdobridge: <airlied> @marysaka attached a patch
20:20fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> In the mailing list?
20:25fdobridge: <airlied> to the issue
20:28fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> And speaking of nouveau crashes is there a way to prevent nouveau from crashing on my laptop without `nouveau.runpm=0`? :nouveau:
20:31fdobridge: <airlied> can you file an issue with the crash? or does it die completely?
20:34fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> nouveau stops working (with some kernel warning I think) but the rest of my system is fine (I'll have to get a proper log though)
20:34fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Could it be related to my laptop not supporting runtime D3? 💻
20:35fdobridge: <airlied> seems likely, we shouldn't runpm on something that doesn't support it
21:18fdobridge: <marysaka> will try now 👍
21:33fdobridge: <marysaka> works fine @airlied
21:33fdobridge: <marysaka> will comment on the issue too for reference