00:04 fdobridge: <g​fxstrand> Oh, would you look at that
00:04 fdobridge: <g​fxstrand> ```
00:04 fdobridge: <g​fxstrand> [ 568.313367] nouveau 0000:17:00.0: gsp:msg fn:103 len:0x78/0x58 res:0x62 resp:0x62
00:04 fdobridge: <g​fxstrand> [ 568.313380] msg: 00000000: 04 00 d0 c1 04 00 d0 c1 00 00 1d de 80 00 00 00 ................
00:04 fdobridge: <g​fxstrand> [ 568.313385] msg: 00000010: 62 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00 b...8...........
00:04 fdobridge: <g​fxstrand> [ 568.313389] msg: 00000020: 00 00 00 00 04 00 d0 c1 00 00 00 00 00 00 00 00 ................
00:04 fdobridge: <g​fxstrand> [ 568.313392] msg: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00:04 fdobridge: <g​fxstrand> [ 568.313395] msg: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00:04 fdobridge: <g​fxstrand> [ 568.313398] msg: 00000050: 00 00 00 00 00 00 00 00 ........
00:04 fdobridge: <g​fxstrand> [ 568.313854] nouveau 0000:17:00.0: deqp-vk[25607]: VMM allocation failed: -22
00:05 fdobridge: <g​fxstrand> ```
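A minimal sketch for reading a dump like the one above, assuming the bytes are meant to be read as little-endian 32-bit words. That layout is an assumption, not something the log states, and the helper names `parse_dump` and `le32_words` are hypothetical. Under that reading, the word at offset 0x10 comes out as 0x62, which appears to match the `res:0x62` field in the header line.

```python
import struct

# First two rows of the message dump pasted above (offset and hex bytes only).
DUMP = """\
00000000: 04 00 d0 c1 04 00 d0 c1 00 00 1d de 80 00 00 00
00000010: 62 00 00 00 38 00 00 00 00 00 00 00 00 00 00 00
"""


def parse_dump(text):
    """Turn 'offset: hex bytes' rows into a single bytes object."""
    out = bytearray()
    for line in text.splitlines():
        # Drop the offset column, keep everything after the first colon.
        _, _, rest = line.partition(":")
        out += bytes.fromhex(rest.replace(" ", ""))
    return bytes(out)


def le32_words(data):
    """Interpret the buffer as consecutive little-endian 32-bit words (assumed layout)."""
    return list(struct.unpack("<%dI" % (len(data) // 4), data))


words = le32_words(parse_dump(DUMP))
print([hex(w) for w in words])
```

This only regroups the bytes; it does not say what any field means.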
00:07 fdobridge: <m​arysaka> VMM allocation failed :vReiAgony:
00:08 fdobridge: <p​ac85> Timing related or is it a different path in the kernel driver?
00:20 fdobridge: <g​fxstrand> Looks like we're not quite ready for prime time yet. Only took about 7m worth of dEQP
00:20 fdobridge: <g​fxstrand> @airlied ^^
01:08 fdobridge: <a​irlied> Yes there is a patch to fix that, just not sure about it
01:35 fdobridge: <a​irlied> I'll post it when I get to a pc
03:08 fdobridge: <g​fxstrand> Cool. Happy to stress it for you
03:25 fdobridge: <a​irlied> nouveau-fw-535.113.01 has the hacky workaround on it now
03:37 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Where can I find it?
04:08 fdobridge: <g​fxstrand> Just threw 18 threads of dEQP at it. I'll check it in the morning to see if it survived.
04:23 fdobridge: <S​id> I should get back to what I was doing but past couple weeks have been so hectic D:
04:27 fdobridge: <g​fxstrand> Well that could have gone better...
04:27 fdobridge: <g​fxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1170217827807015062/message.txt?ex=65583d47&is=6545c847&hm=a0943cc251f095ec3fe46a273bf52ef23dc00cb997623b7f4b70813bccd1cdcf&
04:29 fdobridge: <g​fxstrand> Survived a whole 7 minutes
04:29 fdobridge: <g​fxstrand> @airlied ^^
04:52 fdobridge: <a​irlied> That's a new one I haven't seen, and doesn't immediately shout gsp
08:58 fdobridge: <m​arysaka> I got that one once even without GSP btw
08:59 fdobridge: <m​arysaka> (it just happened randomly and never reproduced it again)
09:07 fdobridge: <a​irlied> It is weird since it's a CPU side mmio mapping failure
09:08 fdobridge: <k​arolherbst🐧🦀> yeah...
09:08 fdobridge: <k​arolherbst🐧🦀> and `4d6000` isn't an invalid offset
09:08 fdobridge: <k​arolherbst🐧🦀> maybe the GPU disconnected or something..
09:08 fdobridge: <k​arolherbst🐧🦀> that sometimes happens
09:09 fdobridge: <m​arysaka> I was connecting with thunderbolt so that was my assumption at that time
09:14 fdobridge: <k​arolherbst🐧🦀> ahhh.. yeah
09:14 fdobridge: <k​arolherbst🐧🦀> soo
09:14 fdobridge: <k​arolherbst🐧🦀> mhh
15:29 fdobridge: <g​fxstrand> I'm plugged into a real PCIe slot
15:29 fdobridge: <g​fxstrand> I can try again
17:20 fdobridge: <m​arysaka> Tested with your new branch, it doesn't work with GSP and without it. Going to write an issue now
17:34 fdobridge: <m​arysaka> @airlied opened an issue with all the details and the root cause of it https://gitlab.freedesktop.org/drm/nouveau/-/issues/270 (edited)
17:40 fdobridge: <m​arysaka> If you need more logs I can provide some, sorry for reporting that this late I totally forgot to open it last month... 😅
19:08 fdobridge: <m​ohamexiety> was planning on trying out gsp on ampere tomorrow so I guess I'll need to comment out that line?
19:11 fdobridge: <a​irlied> Probably not
19:12 fdobridge: <a​irlied> The ampere I have works fine
19:13 airlied: Lyude: might be an issue to look into ^
19:13 Lyude: will do
19:20 fdobridge: <a​irlied> @marysaka can you add card make model info?
19:20 fdobridge: <m​arysaka> will drop the kernel logs in the issue
19:25 fdobridge: <m​arysaka> @airlied added my logs but it reports as ``NVIDIA GA107 (b77000a1)``
19:26 fdobridge: <m​arysaka> it's an RTX 3050 Mobile if I'm not wrong
19:26 fdobridge: <a​irlied> So it's in a laptop?
19:26 fdobridge: <m​arysaka> yes
19:26 fdobridge: <k​arolherbst🐧🦀> the issue happens on laptops with pure accelerator GPUs
19:27 fdobridge: <k​arolherbst🐧🦀> or at least some of them
19:27 fdobridge: <a​irlied> Please add the laptop make/model
19:27 fdobridge: <k​arolherbst🐧🦀> like no display, no nothing, only render
19:27 fdobridge: <a​irlied> Might need to conditionalise the change
19:27 fdobridge: <k​arolherbst🐧🦀> @airlied they are also sometimes correctly marked as `3D` devices in lspci
19:27 fdobridge: <k​arolherbst🐧🦀> yes
19:27 fdobridge: <k​arolherbst🐧🦀> nouveau has to detect it
19:28 fdobridge: <k​arolherbst🐧🦀> some GPUs have the entire display block fused off
19:28 fdobridge: <k​arolherbst🐧🦀> we do it for non GSP afaik
19:28 fdobridge: <a​irlied> Sounds like it's broken on non gsp
19:28 fdobridge: <m​arysaka> the display block works tho
19:28 fdobridge: <m​arysaka> on non GSP before that it just lists 0 displays @karolherbst
19:29 fdobridge: <k​arolherbst🐧🦀> does the MMIO region work?
19:29 fdobridge: <m​arysaka> yup
19:29 fdobridge: <k​arolherbst🐧🦀> ahh
19:29 fdobridge: <k​arolherbst🐧🦀> maybe it's just partly fused off...
19:29 fdobridge: <k​arolherbst🐧🦀> something something
19:29 fdobridge: <k​arolherbst🐧🦀> but yeah
19:29 fdobridge: <k​arolherbst🐧🦀> there are GPUs like that
19:29 fdobridge: <k​arolherbst🐧🦀> @marysaka is the GPU listed as a `3D` or `VGA` device in lspci?
19:29 fdobridge: <m​arysaka> we had a conversation about that at the end of september need to look that up again
19:29 fdobridge: <m​arysaka> ``0000:01:00.0 3D controller: NVIDIA Corporation GA107M [GeForce RTX 3050 Mobile] (rev a1)``
19:29 fdobridge: <k​arolherbst🐧🦀> yeah.. I'm too lazy to check
19:29 fdobridge: <k​arolherbst🐧🦀> okay
19:30 fdobridge: <k​arolherbst🐧🦀> so it's correctly identified as a 3D device
19:30 fdobridge: <k​arolherbst🐧🦀> I know that that info is _sometimes_ bogus
19:30 fdobridge: <k​arolherbst🐧🦀> but there should be some way through MMIO to identify it correctly
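The `3D` vs `VGA` distinction in the lspci output comes from the PCI class code: base class 0x03 is a display controller, subclass 0x00 is VGA compatible and 0x02 is a 3D controller. A minimal sketch of that check follows, with hypothetical helper names. It reads the `class` attribute Linux exposes in sysfs, and as noted above this value is sometimes bogus, so it is only a hint about whether a display block is usable.

```python
def describe_display_class(class_code: int) -> str:
    """Classify a 24-bit PCI class code (base class, subclass, prog-if)."""
    base = (class_code >> 16) & 0xFF
    sub = (class_code >> 8) & 0xFF
    if base != 0x03:
        return "not a display controller"
    names = {
        0x00: "VGA compatible controller",
        0x01: "XGA controller",
        0x02: "3D controller",
    }
    return names.get(sub, "other display controller")


def read_pci_class(bdf: str) -> int:
    """Read the class code from sysfs, e.g. bdf='0000:01:00.0' (Linux only)."""
    with open(f"/sys/bus/pci/devices/{bdf}/class") as f:
        # The file holds a hex string such as '0x030200'.
        return int(f.read().strip(), 16)


# A GPU reported by lspci as "3D controller" has class code 0x0302xx.
print(describe_display_class(0x030200))
print(describe_display_class(0x030000))
```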
19:30 fdobridge: <k​arolherbst🐧🦀> probably would have to check how/where non GSP nouveau bails
19:31 fdobridge: <a​irlied> Or just dig in openrm
19:31 fdobridge: <a​irlied> Since I assume this change is inspired by it
19:31 fdobridge: <k​arolherbst🐧🦀> ohh yeah.. might be good to check if openrm even handles it
19:36 fdobridge: <m​arysaka> I don't see any usage of this in openrm https://github.com/NVIDIA/open-gpu-kernel-modules/blob/be3cd9abcb1103115ae6c3c92d8fc4ff5c912f77/src/common/inc/swref/published/ampere/ga100/dev_fuse.h#L133
19:36 fdobridge: <m​arysaka> so I'm kind of confused 😅
19:36 fdobridge: <k​arolherbst🐧🦀> I suspect GSP might
19:37 fdobridge: <k​arolherbst🐧🦀> there _might_ be some GPU info field stating if display is supported
19:37 fdobridge: <k​arolherbst🐧🦀> might make sense to check the RPC calls we have/use there
19:37 fdobridge: <k​arolherbst🐧🦀> identifying if displays are even supported is kinda a mess, because each OEM does it differently
19:37 fdobridge: <k​arolherbst🐧🦀> and it's uhhh... broken
19:37 fdobridge: <k​arolherbst🐧🦀> some even expose one or two fake connectors not doing anything
19:44 fdobridge: <m​arysaka> what I don't get is why it needs to be disabled in devinit :aki_thonk:
19:45 fdobridge: <m​arysaka> if you don't disable it, everything inits fine in GSP and non GSP situations
19:50 fdobridge: <a​irlied> It must have caused problems on some other hw, but if openrm never does it, it might not be necessary
20:05 fdobridge: <a​irlied> gpuFuseSupportsDisplay_GA100 is in openrm
20:06 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Why is the word "fuse" in there?
20:06 fdobridge: <a​irlied> because that's what it is
20:07 fdobridge: <a​irlied> so nvidia fails kdispStatePreInitLocked if it's not supported, so I assume that is the inspiration for this
20:08 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Fuse sounds quite scary
20:08 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> https://wiki.archlinux.org/title/Nouveau#Phantom_output_issue
20:13 fdobridge: <a​irlied> actually I think the fix is fairly easy, we shouldn't die on driver load there
20:17 fdobridge: <a​irlied> @marysaka attached a patch
20:20 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> In the mailing list?
20:25 fdobridge: <a​irlied> to the issue
20:28 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> And speaking of nouveau crashes is there a way to prevent nouveau from crashing on my laptop without `nouveau.runpm=0`? :nouveau:
20:31 fdobridge: <a​irlied> can you file an issue with the crash? or does it die completely?
20:34 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> nouveau stops working (with some kernel warning I think) but the rest of my system is fine (I'll have to get a proper log though)
20:34 fdobridge: <!​[NVK Whacker] Echo (she) 🇱🇹> Could it be related to my laptop not supporting runtime D3? 💻
20:35 fdobridge: <a​irlied> seems likely, we shouldn't runpm on something that doesn't support it
21:18 fdobridge: <m​arysaka> will try now 👍
21:33 fdobridge: <m​arysaka> works fine @airlied
21:33 fdobridge: <m​arysaka> will comment on the issue too for reference