IRC Logs of #nouveau on irc.freenode.net for 2023-08-15

02:37 fdobridge: <airlied> @gfxstrand were you having some FORTRAN flashbacks when you wrote the NAK shader dumper 🙂
02:38 fdobridge: <gfxstrand> I was making it look like the disassembler
02:38 fdobridge: <gfxstrand> I kinda want to lower-case the whole thing, though.
02:39 fdobridge: <airlied> ah does nv disasm look that bad, makes sense
03:42 fdobridge: <airlied> @gfxstrand opened another mr with the second set of nak patches for fs bits, I'm not as confident in these and I think you might have differing ideas, so feel free to do what you wish with it!, I won't be around too much for the rest of the week
03:42 fdobridge: <gfxstrand> Kk
03:44 fdobridge: <gfxstrand> I have a minion showing up in a week and I think I'm going to set her loose on 3D stages. I scanned through the patches in your other MR and wasn't sure about them. The assert fix was clear; the other encoding fix probably was, too, if I actually looked at codegen a bit. Others are going to take more thought.
03:46 fdobridge: <airlied> also things are quite broken without serial in lots of places
03:47 fdobridge: <gfxstrand> Yes, I'm well aware. I've got someone who's going to start looking at that.
03:48 fdobridge: <gfxstrand> Or I will once I sort spilling
03:50 fdobridge: <gfxstrand> Believe it or not, I don't just eat oatmeal and poop out compilers. They take time. 😅
03:50 fdobridge: <airlied> oh man I thought that was how most compilers were written
03:52 fdobridge: <gfxstrand> I'm hoping to get back to it for real tomorrow. I've been focused on NIR refactors and trying to get NVK merged the last month or two.
03:53 fdobridge: <airlied> this is mostly rust education for me 🙂
03:54 fdobridge: <airlied> I don't think any of my patches will make things worse, they just might not make things sufficiently better
03:54 fdobridge: <gfxstrand> 😅
03:54 fdobridge: <airlied> but I've gone over most of the sascha demos now
03:57 fdobridge: <gfxstrand> If you've got time in the next few weeks to throw at something, I'd love to know why parallel runs are faulting. Best guess is something around eviction. If it's hangs in one context affecting another, that shouldn't cause faults. At worst, it should cause an unexplained lost device.
04:01 fdobridge: <airlied> once I get back into things next week I'll probably start there, dakr might also have some ideas
04:03 fdobridge: <gfxstrand> Bit of good news: your allocation patch seems to speed up CTS runs by as much as 10 minutes. Maybe even 15.
04:03 fdobridge: <gfxstrand> It also maybe reduces flakes
04:04 fdobridge: <airlied> it reduced flakes for me on ampere by a massive amount
04:07 fdobridge: <gfxstrand> Seemed to reduce flakes for me by maybe as much as 40%
04:07 fdobridge: <airlied> I went down to flakes ~= crashes
04:07 fdobridge: <gfxstrand> Which also makes me think it might be migrations
04:07 fdobridge: <airlied> yeah just need to figure out a reproducer that doesn't take an hour to trigger 😛
04:08 fdobridge: <gfxstrand> The first fault reproduces in a few minutes. 🤷🏻‍♀️
04:08 fdobridge: <gfxstrand> Maybe even less
04:08 fdobridge: <airlied> yeah there's a lot of noise in 18 threads running in parallel 😛
04:09 fdobridge: <airlied> even for a few minutes
04:09 fdobridge: <airlied> I suspect it'll be more code reading, and praying it's not some missing hw TLB flush type thing
04:10 fdobridge: <gfxstrand> 😅
04:14 fdobridge: <gfxstrand> Should be able to make a crucible test that uses a bunch of memory
04:14 fdobridge: <gfxstrand> As much as I kinda hate IGT, this sort of thing is what it's good for... 😅
10:47 fdobridge: <georgeouzou> I am analyzing the failures of *pipeline.monolithic.dynamic_control_points*
10:47 fdobridge: <georgeouzou> It seems that the failures are triggered by codegen MemoryOpt optimizations
10:49 fdobridge: <georgeouzou> Specifically, if MemoryOpt::combineSt and MemoryOpt::combineLd do not run, the tests pass
10:50 fdobridge: <georgeouzou> the combination happens inside a branch on gl_in[].gl_Position and gl_out[].gl_Position
10:51 fdobridge: <georgeouzou> Any ideas?
11:36 fdobridge: <karolherbst🐧🦀> yeah.. nuke that optimization, it's broken 🙃
11:36 fdobridge: <karolherbst🐧🦀> although I think @mhenning has a patch, but I think the answer here is to do it on a nir level instead
12:16 doras: airlied: this implements what we discussed on Friday in regards to exporting GEM handles: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24648
12:17 doras: If you think you are able to review it, I'd appreciate it.
12:36 fdobridge: <georgeouzou> @karolherbst should i nuke it only for tesc shaders ? or all types?
13:09 fdobridge: <karolherbst🐧🦀> nah.. it's pretty much broken for anything besides GL
13:10 fdobridge: <karolherbst🐧🦀> maybe disable it for anything not being GL
13:10 fdobridge: <karolherbst🐧🦀> I think the nir_shader has that info?
13:10 fdobridge: <georgeouzou> I will check this out
13:10 fdobridge: <karolherbst🐧🦀> mhh.. maybe we drop that info when coming from spirv actually
13:11 fdobridge: <karolherbst🐧🦀> ahh, found the MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22118
13:13 fdobridge: <karolherbst🐧🦀> and then we need to vectorize memory loads and then we might be able to drop memoryopt
13:14 fdobridge: <karolherbst🐧🦀> or fix it..
13:14 fdobridge: <karolherbst🐧🦀> it's a bit broken but not terribly so
15:11 fdobridge: <mhenning> MemoryOpt is pretty broken. It ignores alignments so to fix it we'd either need to plumb that info through the ir or infer it later. I'd much rather fix it in nir than try to fix MemoryOpt
15:12 fdobridge: <mhenning> Deleting MemoryOpt is one of my long term goals, but doing so isn't easy without regressing perf on stuff we care about
15:12 fdobridge: <mhenning> In other words, it's a known issue and I'm working on it.
15:14 fdobridge: <georgeouzou> Ok, i will not change anything then. If we want to fix the tests we can just disable MemoryOpt for the NVK case
15:20 fdobridge: <karolherbst🐧🦀> we could move MemoryOpt to opt level 4 and make nvk only use 3 🙃
15:21 fdobridge: <karolherbst🐧🦀> as a dirty hack
15:26 fdobridge: <karolherbst🐧🦀> nv50 pain
15:26 fdobridge: <karolherbst🐧🦀> `gr: TRAP_MP_EXEC - TP 0 MP 0: 00000010 [INVALID_OPCODE] at 06e648 warp 0, opcode 0072696a 006c6a66`
15:26 fdobridge: <karolherbst🐧🦀> I think I'm screwing up uploading shaders...
16:02 fdobridge: <karolherbst🐧🦀> I'm sure the hardware wasn't made to execute shaders with 10000+ instructions
16:03 fdobridge: <pixelcluster> can I interest you in shaders with 100,000+ instructions 🙃
16:03 fdobridge: <karolherbst🐧🦀> sure, but not on my.... GT218 😄
16:04 fdobridge: <pixelcluster> oof yeah alright
16:04 fdobridge: <karolherbst🐧🦀> I think it's a GeForce 210 or something
16:04 fdobridge: <mohamexiety> and here I thought I was the weird one for using a 1030 for dev work
16:05 fdobridge: <karolherbst🐧🦀> well.. I'm fixing a bug, but passively cooled GPUs are nice, because they don't make any noise 😄
16:05 fdobridge: <karolherbst🐧🦀> and GPUs from that era with fans are kinda loud
16:05 fdobridge: <mohamexiety> yeah...
16:05 fdobridge: <karolherbst🐧🦀> not because of the fans, simply because the fans are old and kinda broken
16:06 fdobridge: <karolherbst🐧🦀> mhhh.. so I've fixed one problem.. uploading too much stuff at once which the hardware didn't like
16:07 fdobridge: <karolherbst🐧🦀> but now the hardware executes instructions which I'm sure are never uploaded
16:07 fdobridge: <mohamexiety> HW driven optimization. the GPU knows best 🐸
16:48 fdobridge: <karolherbst🐧🦀> mhhh
16:48 fdobridge: <karolherbst🐧🦀> odd
16:54 fdobridge: <karolherbst🐧🦀> now it works..
16:54 fdobridge: <karolherbst🐧🦀> uhhhh...
16:55 fdobridge: <karolherbst🐧🦀> yeah no idea...
16:57 fdobridge: <georgeouzou> Made an MR with the hack 😄
16:58 fdobridge: <karolherbst🐧🦀> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24706
16:58 fdobridge: <karolherbst🐧🦀> nice
16:59 fdobridge: <karolherbst🐧🦀> I like straightforward bugs like that
17:55 fdobridge: <georgeouzou> Does anyone get hangs when running the full vulkan CTS tests with deqp-runner?
18:49 fdobridge: <georgeouzou> https://pastebin.com/8Kden3Xd
19:18 fdobridge: <karolherbst🐧🦀> yeah....
22:08 fdobridge: <gfxstrand> Yes. They're kernel bugs, I'm pretty sure. What's wrong? That's unclear.
22:31 fdobridge: <georgeouzou> Before the change to the uapi, full runs were running normally for Turing. But now I get a hang each time I run the full suite
23:14 fdobridge: <gfxstrand> What branch are you on? My NVK kernel branch has a bunch of fixes.
23:15 fdobridge: <gfxstrand> Also, do you have an Intel Wi-Fi card in your box? The iwlwifi module is crashing for me.