01:32 fdobridge: <g​fxstrand> I like that my CTS runs are back down to 50 minutes.
02:25 fdobridge: <g​fxstrand> Okay, let's run Maxwell and see if my machine survives.
02:34 fdobridge: <g​fxstrand> Yeah, it's not going well...
02:34 fdobridge: <g​fxstrand> ```
02:34 fdobridge: <g​fxstrand> [209201.708804] Call Trace:
02:34 fdobridge: <g​fxstrand> [209201.708808] <IRQ>
02:34 fdobridge: <g​fxstrand> [209201.708811] ? nvkm_engn_cgrp_get+0x80/0x90 [nouveau]
02:34 fdobridge: <g​fxstrand> [209201.709204] ? __warn+0x81/0x130
02:34 fdobridge: <g​fxstrand> [209201.709213] ? nvkm_engn_cgrp_get+0x80/0x90 [nouveau]
02:34 fdobridge: <g​fxstrand> [209201.709606] ? report_bug+0x171/0x1a0
02:34 fdobridge: <g​fxstrand> [209201.709614] ? handle_bug+0x3c/0x80
02:34 fdobridge: <g​fxstrand> [209201.709618] ? exc_invalid_op+0x17/0x70
02:34 fdobridge: <g​fxstrand> [209201.709621] ? asm_exc_invalid_op+0x1a/0x20
02:34 fdobridge: <g​fxstrand> [209201.709630] ? nvkm_engn_cgrp_get+0x80/0x90 [nouveau]
02:34 fdobridge: <g​fxstrand> [209201.710020] ? nvkm_engn_cgrp_get+0x75/0x90 [nouveau]
02:34 fdobridge: <g​fxstrand> [209201.710409] nvkm_runl_rc_engn+0x3f/0xc0 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.710798] gf100_fifo_mmu_fault_recover+0x32c/0x340 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.711187] gm107_fifo_intr_mmu_fault_unit+0xf2/0x120 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.711574] gf100_fifo_intr_mmu_fault+0x51/0xb0 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.711961] gk104_fifo_intr+0x2b8/0x3a0 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.712345] nvkm_intr+0x129/0x240 [nouveau]
02:35 fdobridge: <g​fxstrand> [209201.712642] __handle_irq_event_percpu+0x47/0x1a0
02:35 fdobridge: <g​fxstrand> [209201.712651] handle_irq_event+0x38/0x80
02:35 fdobridge: <g​fxstrand> [209201.712657] handle_edge_irq+0x8b/0x230
02:35 fdobridge: <g​fxstrand> [209201.712663] __common_interrupt+0x3c/0xa0
02:35 fdobridge: <g​fxstrand> [209201.712669] common_interrupt+0x81/0xa0
02:35 fdobridge: <g​fxstrand> [209201.712674] </IRQ>
02:35 fdobridge: <g​fxstrand> [209201.712675] <TASK>
02:35 fdobridge: <g​fxstrand> [209201.712677] asm_common_interrupt+0x26/0x40
02:35 fdobridge: <g​fxstrand> ```
02:38 fdobridge: <a​irlied> Looks like a racy mmu fault recover, just fix the mmu fault :-p
02:39 fdobridge: <e​sdrastarsis> I saw the Rust project that you are helping to develop, why did you choose Rust?
02:41 fdobridge: <a​irlied> Just experimenting with things, was there another language to write rust in?
02:45 fdobridge: <a​irlied> You can write kernel drivers in C or rust and we know how to write them in C, so it doesn't leave a lot of scope to pick another language
02:49 fdobridge: <e​sdrastarsis> I asked because I wanted to know if Rust's security features help a lot in developing DRM drivers
02:50 fdobridge: <a​irlied> Not really, it just helps the same as in any driver: better lifetime handling and no use-after-frees
02:52 fdobridge: <a​irlied> Don't think it helps DRM drivers in any special way
03:17 fdobridge: <g​fxstrand> Yeah, I started again and more of the same. I'm going to give up for now.
03:28 fdobridge: <a​irlied> at least we can probably avoid blaming gsp for that 😛
03:28 fdobridge: <a​irlied> is there some more info prior to that call trace?
03:32 fdobridge: <g​fxstrand> Probably but the machine died so IDK
03:33 fdobridge: <g​fxstrand> If I get more info, is someone going to care? Should I file a bug?
03:33 fdobridge: <g​fxstrand> It would help a lot with improving Maxwell if it could survive a CTS run.
03:42 fdobridge: <a​irlied> I'd file a bug if you can get a bit more info, but fixing it probably involves finding what the crash is, NULL ptr or whatever and trying to see what is racing
05:42 fdobridge: <g​fxstrand> @asdqueerfromeu This should fix your dmesg problem: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26441
05:42 fdobridge: <g​fxstrand> @asdqueerfromeu Did you ever file a bug about that? If so, could you either paste it here or tag the MR with it?
05:43 fdobridge: <g​fxstrand> If not, that's fine. We'll just consider this a code clean-up
07:19 fdobridge: <!​DodoNVK (she) 🇱🇹> Do you mean the GR class error?
14:05 fdobridge: <g​fxstrand> Yes
14:06 fdobridge: <g​fxstrand> Yes
14:07 fdobridge: <!​DodoNVK (she) 🇱🇹> I just submitted a comment about that
14:07 fdobridge: <g​fxstrand> Okay, cool
14:07 fdobridge: <!​DodoNVK (she) 🇱🇹> I'm not sure if that's nouveau-related though (I think @tiredchiku did encounter this too)
14:07 fdobridge: <g​fxstrand> That gives us a breadcrumb at least
14:08 fdobridge: <S​id> I did
14:25 fdobridge: <!​DodoNVK (she) 🇱🇹> To clarify things, that NVK copy engine MR fixes the weird GR class errors I had on NVK (the nouveau OpenGL driver obviously has no change)
14:46 Tom^: regarding dGPU laptops: power management like the D3 one the nvidia blob does, powering the GPU off when nothing is using it. is that something that's being worked on, or is it already working? or is that part of the GSP thingie?
14:48 fdobridge: <!​DodoNVK (she) 🇱🇹> D3 support already exists (I was able to make the GPU go into D3cold mode)
14:48 fdobridge: <g​fxstrand> Cool. Yeah, I never really trusted the 2D engine. With that MR, we don't use it at all anymore.
14:49 Tom^: oh nice, just gotta wait for 6.7 to land and I'll fire up the testing on this 3060 then heh
14:49 fdobridge: <!​DodoNVK (she) 🇱🇹> What's wrong with it?
14:54 fdobridge: <g​fxstrand> It's got piles of restrictions we don't fully understand. It's also probably slower than the copy engine. We mostly used it because FillBuffer was one of the first things we ever wired up and we didn't know the copy engine could do that.
14:55 fdobridge: <g​fxstrand> Yesterday, I took one look at a blob trace and it was obvious. 🤦🏻‍♀️
16:45 fdobridge: <g​fxstrand> I like seeing all these 60 FPS numbers. 😁
16:45 fdobridge: <g​fxstrand> I'm legit looking forward to the next Phoronix benchmark run. I expect things will look WAY better than the pre-GSP run he did earlier this year.
17:03 fdobridge: <!​DodoNVK (she) 🇱🇹> This was with an unlocked framerate (so it can go higher and actually up to 600 FPS)
19:12 fdobridge: <g​fxstrand> Nice!
19:13 fdobridge: <g​fxstrand> @karolherbst is there a flag I can set on ldl/stl which will make them noop on OOB instead of faulting?
19:13 fdobridge: <k​arolherbst🐧🦀> yes and no
19:14 fdobridge: <k​arolherbst🐧🦀> ehh wait LDL?
19:14 fdobridge: <k​arolherbst🐧🦀> mhhh
19:14 fdobridge: <k​arolherbst🐧🦀> not sure
21:08 fdobridge: <g​fxstrand> Ugh... I have no idea how instruction deps are supposed to work with these dumb BMOV instructions
21:10 fdobridge: <g​fxstrand> It's clear that the barrier file is internally scoreboarded. That's fine.
21:11 fdobridge: <g​fxstrand> But some barrier instructions require `.yld`, and most seem to ignore (or require 0 for) the other dep stuff.
21:12 fdobridge: <g​fxstrand> But `bmov R B` appears to be a fixed-latency instruction?!?
21:15 fdobridge: <g​fxstrand> Requiring `.yld` makes sense if it's internally scoreboarded. Yield is really the only way the HW has to bail out if it's waiting on something.
21:15 fdobridge: <g​fxstrand> Especially requiring `.yld` on `bsync` is pretty much a "no duh"
21:15 fdobridge: <g​fxstrand> I'm less sure about `bmov` and `break`
22:22 fdobridge: <k​arolherbst🐧🦀> nvidia puts `.yld` on almost everything
22:22 fdobridge: <k​arolherbst🐧🦀> I think it means something like "this block/thread is free to be scheduled away from"
22:23 fdobridge: <k​arolherbst🐧🦀> and if you don't put it, it can't or something
22:25 fdobridge: <k​arolherbst🐧🦀> yield is also required to ensure forward progress
22:28 fdobridge: <k​arolherbst🐧🦀> nothing is internally scoreboarded
22:29 fdobridge: <k​arolherbst🐧🦀> the main difference is just if they are fixed or variable latency
22:31 fdobridge: <k​arolherbst🐧🦀> the yield flag really just means that control can be moved over to a different set of threads
22:31 fdobridge: <k​arolherbst🐧🦀> which makes sense if the thread blocks on bsync