09:55fdobridge_: <Sid> @airlied what's it mean when an RC is triggered?
09:56fdobridge_: <Sid> referring specifically to `NV_VGPU_MSG_EVENT_RC_TRIGGERED`
09:59fdobridge_: <Sid> is it literally **R**elease **C**hannel triggered
10:05fdobridge_: <Sid> ...could the mmu error be because of the GPU trying to access memory outside the assigned BAR?
10:20fdobridge_: <tom3026> cool linux-next runs my vkcube and external monitor
10:21fdobridge_: <tom3026> i smell some really wonky thing with 6.7 not even related to nouveau breaking all sorts of things 😛
10:23fdobridge_: <Sid> I figure the issues I was trying to debug are userspace
10:23fdobridge_: <Sid> since reproducibility varies wildly between different mesa builds
10:44fdobridge_: <tom3026> cyberpunk didnt run 😦
10:44fdobridge_: <tom3026> ```
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: gsp: Xid:13 Graphics SM Warp Exception on (GPC 0, TPC 0, SM 0): Out Of Range Address
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: gsp: Xid:13 Graphics SM Global Exception on (GPC 0, TPC 0, SM 0): Multiple Warp Errors
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: gsp: Xid:13 Graphics Exception: ESR 0x504730=0xc0f000e 0x504734=0x4 0x504728=0xc81eb60 0x50472c=0x1174
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:16 type:13 scope:1 part:233
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: fifo:c00000:0002:0010:[GameThread[2488]] errored - disabling channel
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: GameThread[2488]: channel 16 killed!
10:44fdobridge_: <tom3026> jan 21 11:41:57 acer kernel: nouveau 0000:01:00.0: GameThread[2488]: error fencing pushbuf: -19
10:44fdobridge_: <tom3026> ```
10:47fdobridge_: <redsheep> Isn't cyberpunk dx12 only? Vkd3d isn't really working yet.
10:49fdobridge_: <!DodoNVK (she) 🇱🇹> Xid on nouveau 👀
10:52fdobridge_: <tom3026> yeah hehe
10:52fdobridge_: <tom3026> so it seems 😄
10:58fdobridge_: <!DodoNVK (she) 🇱🇹> vkd3d-proton tests kind of work on :triangle_nvk: (so someone needs to work on these)
11:11fdobridge_: <Sid> that's what I came to report too :o
11:11fdobridge_: <Sid> xid 13 on Sea of Thieves
11:12fdobridge_: <Sid> ```
11:12fdobridge_: <Sid> [Sun Jan 21 16:40:28 2024] nouveau 0000:01:00.0: SoTGame.exe[5699]: job timeout, channel 24 killed!
11:12fdobridge_: <Sid> [Sun Jan 21 16:40:28 2024] nouveau 0000:01:00.0: SoTGame.exe[5699]: error fencing pushbuf: -19
11:12fdobridge_: <Sid> [Sun Jan 21 16:40:32 2024] nouveau 0000:01:00.0: gsp: Xid:13 Graphics SM Warp Exception on (GPC 0, TPC 0, SM 0): Out Of Range Address
11:12fdobridge_: <Sid> [Sun Jan 21 16:40:32 2024] nouveau 0000:01:00.0: gsp: Xid:13 Graphics SM Global Exception on (GPC 0, TPC 0, SM 0): Multiple Warp Errors
11:12fdobridge_: <Sid> [Sun Jan 21 16:40:32 2024] nouveau 0000:01:00.0: gsp: Xid:13 Graphics Exception: ESR 0x504730=0xc00000e 0x504734=0x4 0x504728=0x4c1eb72 0x50472c=0x174
11:12fdobridge_: <Sid> ```
11:12fdobridge_: <Sid> and that's a DX11 game
11:17fdobridge_: <Sid> Amid Evil dx12 also works™️
11:17fdobridge_: <Sid> gameplay doesn't render but the HUD works
11:17fdobridge_: <Sid> and you can interact with the game and hear the sound effects and presumably move around the map
11:24fdobridge_: <mohamexiety> hey, it's not 109 so it's a win
11:24fdobridge_: <Sid> aha!
11:24fdobridge_: <Sid> reproduced my bug again
11:24fdobridge_: <Sid> this time with a new logline on top
11:24fdobridge_: <Sid> ```
11:24fdobridge_: <Sid> [Sun Jan 21 16:53:35 2024] nouveau 0000:01:00.0: RichardBurnsRal[8660]: error fencing pushbuf: -19
11:24fdobridge_: <Sid> [Sun Jan 21 16:53:41 2024] nouveau 0000:01:00.0: gsp: mmu fault queued
11:24fdobridge_: <Sid> [Sun Jan 21 16:53:41 2024] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:24 type:31 scope:1 part:233
11:24fdobridge_: <Sid> [Sun Jan 21 16:53:41 2024] nouveau 0000:01:00.0: fifo:001001:0003:0018:[RichardBurnsRal[8660]] errored - disabling channel
11:24fdobridge_: <Sid> ```
11:25fdobridge_: <Sid> everything else is identical
11:26fdobridge_: <Sid> I still don't know if this is userspace or kernel though :frog_gears:
11:26fdobridge_: <Sid> s/if this is/if this is because of
11:26fdobridge_: <tom3026> https://forums.developer.nvidia.com/t/multiple-cuda-rtx-vulkan-application-crashing-with-xid-13-109-errors/235459/297 would be so annoying if this is a firmware issue 🥺
11:31fdobridge_: <Sid> weh, I give up on hunting this down
12:18fdobridge_: <karolherbst🐧🦀> normally those are all userspace issues
12:18fdobridge_: <karolherbst🐧🦀> basically just a bug in the command buffer
12:44fdobridge_: <Sid> hmm
18:30fdobridge_: <airlied> RC is robust channel I think
20:22fdobridge_: <airlied> maybe robust context, either way, the firmware detects the channel isn't making forward progress somehow and blows it away and tells the driver
20:26fdobridge_: <Sid> hm
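Note on the `gsp: rc` lines pasted earlier in this log: they are the driver reporting the RC event airlied describes, and they are easy to pull apart when comparing runs. Below is a minimal sketch that extracts the fields from such a line. The regex is derived only from the two lines quoted above, not from the nouveau source. Reading `engn` as hexadecimal and the other fields as decimal is an assumption (it matches `chid:16` against `0010` and `chid:24` against `0018` in the `fifo:` lines). The helper name `parse_rc_line` is hypothetical. In the pasted logs `type:13` shows up next to the `Xid:13` messages and `type:31` next to `mmu fault queued`, which suggests, but does not confirm, that `type` carries the Xid number.

```python
import re

# Pattern for the "gsp: rc ..." line, derived from the two samples in this log.
RC_LINE = re.compile(
    r"gsp: rc engn:(?P<engn>[0-9a-fA-F]+) chid:(?P<chid>\d+) "
    r"type:(?P<type>\d+) scope:(?P<scope>\d+) part:(?P<part>\d+)"
)


def parse_rc_line(line):
    """Return the rc fields as a dict of ints, or None if the line does not match."""
    m = RC_LINE.search(line)
    if m is None:
        return None
    # Assumption: every field except engn is printed in decimal.
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "engn"}
    # Assumption: engn is a zero-padded hexadecimal value.
    fields["engn"] = int(m.group("engn"), 16)
    return fields


if __name__ == "__main__":
    sample = ("nouveau 0000:01:00.0: gsp: rc engn:00000001 "
              "chid:24 type:31 scope:1 part:233")
    print(parse_rc_line(sample))
```

Running this over a saved `dmesg` and grouping by `chid` and `type` makes it quicker to see whether two crashes hit the same channel with the same fault type.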