00:55 fdobridge_: <r​edsheep> Ok here's vkpeak on a 4090 before any of the patches, clearly showing more than enough performance to disprove my theory from earlier. Interestingly I already see performance that is more than half of what the prop driver gets on windows, so yeah it's obviously not something just cutting it cleanly in half.
00:55 fdobridge_: <r​edsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1197705883732422686/beforePatches.txt?ex=65bc3d86&is=65a9c886&hm=bc36bafb7c7b424e68c623de88d76cd96b2aeb8aa0e3cbe0875a38d4445aa784&
01:00 fdobridge_: <r​edsheep> I am now going to attempt to build mesa with 27024, 27156, 27157, and 27159 so I can test the new perf, and with 27154 I will be able to test crucible.
01:05 fdobridge_: <g​fxstrand> I merged most of those today, BTW.
01:06 fdobridge_: <g​fxstrand> Everything except imad is either merged or on the queue.
01:09 fdobridge_: <r​edsheep> Yeah, I'd rather not wait for CI lol...
01:11 fdobridge_: <r​edsheep> Unsurprisingly I am getting hunks failed, so I might have to wait or fix that myself
01:16 fdobridge_: <g​fxstrand> Hehe
01:30 fdobridge_: <gfxstrand> Looks like everything except `ffma` and `imad` has landed and I think imad should apply clean.
01:31 fdobridge_: <g​fxstrand> I can rebase quick
01:35 fdobridge_: <g​fxstrand> Done. Pull https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159 and you should be good to go (edited)
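For readers following along, this is a minimal sketch of pulling that MR branch and rebuilding Mesa; the GitLab merge-request refspec is standard, but the meson options are assumptions about a typical NVK-only build:
```
# fetch the MR head from GitLab and check it out under a local branch name
git fetch https://gitlab.freedesktop.org/mesa/mesa.git refs/merge-requests/27159/head:mr-27159
git checkout mr-27159

# configure and build just the nouveau Vulkan driver (options assumed, adjust as needed)
meson setup build -Dvulkan-drivers=nouveau -Dgallium-drivers= -Dbuildtype=release
ninja -C build
```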
01:41 fdobridge_: <r​edsheep> Okay, 27024 doesn't apply cleanly anymore so I will just forget that one for a moment. If I pull that I still won't get 27157 yet right? That one doesn't appear to be running CI yet
01:52 fdobridge_: <r​edsheep> Nevermind, I see 27159 includes the same changes
02:10 fdobridge_: <redsheep> That did it, and the results are very impressive. We're looking at around a 60% average uplift in int32 and fp32, and more than *double* with fp64. Even better, it's now at 80-90% of the performance of the prop driver here, which is incredible progress for a single day's work.
02:10 fdobridge_: <r​edsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1197724808868270151/afterPatches.txt?ex=65bc4f27&is=65a9da27&hm=f0b9c1a331e844a3f061ac7399a899bd38ec7f61de0a7aab9fed9e5b9b9cb891&
02:12 fdobridge_: <g​fxstrand> Yeah, there's clearly a little more we can do but it'll take some looking
02:12 fdobridge_: <g​fxstrand> I'll happily take 90%
02:14 fdobridge_: <r​edsheep> The crucible test confirms it hits all the SMs, which with the results above is pretty obvious
02:14 fdobridge_: <r​edsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1197725634701566096/SMTest.txt?ex=65bc4feb&is=65a9daeb&hm=c06657e4b8298ec11216cb966e31a6b264dd5a9fbdb9ccd8dec9ee83ce707cfc&
02:26 fdobridge_: <r​edsheep> The results with 3D are less obvious, as is probably expected. VKMark went from 3158 to 3203, talos went from 74 to 75. Those could be noise. The witness has been really inconsistent so I don't know exact numbers, but it seems like it's probably better. It used to be pretty easy to get down near 20 fps and I haven't managed to get below 35 with the new patches.
02:30 fdobridge_: <g​fxstrand> Yeah, I think I've got some serious work to do on stalls
02:33 fdobridge_: <r​edsheep> Would it help for me to grab a trace of the witness or poke at it with renderdoc, or is that kind of perf stuff still further down the road? Given it renders properly but very slowly even at 720p it seems like it's probably pretty near an ideal case for investigating bottlenecks.
02:34 fdobridge_: <g​fxstrand> Yeah
02:34 fdobridge_: <g​fxstrand> I think I've got a copy
02:34 fdobridge_: <g​fxstrand> But yeah, slow at 720 is definitely worth investigating
02:51 fdobridge_: <r​edsheep> Just to close out my testing here's the vkpeak numbers from windows for reference. The fp64 comparison is the most impressive, nvk achieves 94% of the prop driver there.
02:51 fdobridge_: <r​edsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1197734953216122931/winPeak.txt?ex=65bc5899&is=65a9e399&hm=b39e82d233f747e5ad5be823a9bbc1733326e84965f0d47e0caec3b567b9f5e2&
03:14 fdobridge_: <e​sdrastarsis> This is incredible, seriously
04:15 fdobridge_: <g​fxstrand> Just wait until we start beating them. 😉
04:30 fdobridge_: <r​edsheep> Careful, you might upset a certain man in a leather jacket 😆
04:31 fdobridge_: <g​fxstrand> Oh, the proprietary Vulkan driver team can't wait.
04:32 fdobridge_: <g​fxstrand> IDK how leather jacket guy feels or if he even cares.
04:37 fdobridge_: <r​edsheep> Why is that? I suppose it could serve to help their case for getting more resources dedicated to vulkan.
04:38 fdobridge_: <g​fxstrand> Oh, I think they're just looking forward to some healthy competition. Most of the engineers over there are good folks.
04:41 fdobridge_: <r​edsheep> Yeah people love to hate on Nvidia but every time an engineer has gotten a chance to do an interview they've been super cool.
04:43 fdobridge_: <S​id> woah
04:43 fdobridge_: <S​id> so much happened while I was asleep
04:43 fdobridge_: <g​fxstrand> Yeah, I'm on pretty good terms with a bunch of the folks on the Vulkan team. I also know a few on the Linux team.
04:47 fdobridge_: <g​fxstrand> Yeah, pull and enjoy. There's a little something in today's MRs for everybody. 😁
04:47 fdobridge_: <S​id> let me build with all those MRs and try it out too
04:47 fdobridge_: <g​fxstrand> Just pull !27159 and you'll get everything
04:47 fdobridge_: <g​fxstrand> Most of it is already merged.
04:48 fdobridge_: <S​id> :o
04:58 fdobridge_: <g​fxstrand> I should fire up steam on my laptop and install The Witness
04:59 fdobridge_: <r​edsheep> It's also just a fantastic game
04:59 fdobridge_: <S​id> can confirm
05:01 fdobridge_: <g​fxstrand> I played a bit but never got hooked.
05:03 fdobridge_: <g​fxstrand> I think that'll be my tomorrow project
05:03 fdobridge_: <g​fxstrand> A nice relaxing day of figuring out why the hell it's slow
05:03 fdobridge_: <g​fxstrand> Or maybe I'll be pleasantly surprised that I accidentally fixed it today.
05:05 fdobridge_: <S​id> 27159 doesn't cleanly apply on current master what
05:06 fdobridge_: <S​id> ```
05:06 fdobridge_: <S​id> Reversed (or previously applied) patch detected! Skipping patch.
05:06 fdobridge_: <S​id> 1 out of 1 hunk ignored -- saving rejects to file src/nouveau/compiler/nak/api.rs.rej
05:06 fdobridge_: <S​id> ```
05:06 fdobridge_: <S​id> amazing
05:06 fdobridge_: <g​fxstrand> That's fine. It's just because it contained a patch from another MR.
05:06 fdobridge_: <S​id> ah
05:07 fdobridge_: <g​fxstrand> I just rebased and pushed again so it's clean now
05:07 fdobridge_: <S​id> right, I see it
05:07 fdobridge_: <S​id> `nak: Enable NIR fuse_ffmaN`
05:09 fdobridge_: <g​fxstrand> Okay, so my Intel card plays it fine at FHD with high settings. Let's try NVK.
05:10 fdobridge_: <redsheep> I have some suspicions that the witness is slow because of some kind of synchronization or feedback that most games probably don't need. Not sure exactly what features are involved but there are tons of places where the exact way something is drawn affects the game mechanically
05:11 fdobridge_: <r​edsheep> Like right near the start there's a secret puzzle that only works once you're far enough away that a rough texture mips down to be smooth.
05:13 fdobridge_: <S​id> 💪
05:14 fdobridge_: <S​id> ```
05:14 fdobridge_: <S​id> [sidpr@strogg build]$ marigold -k ./vkpeak 0
05:14 fdobridge_: <S​id> WARNING: NVK is not a conformant Vulkan implementation, testing use only.
05:14 fdobridge_: <S​id> device = TU116
05:14 fdobridge_: <S​id>
05:14 fdobridge_: <S​id> fp32-scalar = 4923.54 GFLOPS
05:14 fdobridge_: <S​id> fp32-vec4 = 5219.50 GFLOPS
05:15 fdobridge_: <S​id>
05:15 fdobridge_: <S​id> fp16-scalar = 0.00 GFLOPS
05:15 fdobridge_: <S​id> fp16-vec4 = 0.00 GFLOPS
05:15 fdobridge_: <S​id> fp16-matrix = 0.00 GFLOPS
05:15 fdobridge_: <S​id>
05:15 fdobridge_: <S​id> fp64-scalar = 182.81 GFLOPS
05:15 fdobridge_: <S​id> fp64-vec4 = 182.83 GFLOPS
05:15 fdobridge_: <S​id>
05:15 fdobridge_: <S​id> int32-scalar = 4284.78 GIOPS
05:15 fdobridge_: <S​id> int32-vec4 = 5190.36 GIOPS
05:15 fdobridge_: <S​id>
05:15 fdobridge_: <S​id> int16-scalar = 0.00 GIOPS
05:15 fdobridge_: <S​id> int16-vec4 = 0.00 GIOPS
05:15 fdobridge_: <S​id> ```
05:16 fdobridge_: <g​fxstrand> Yeah, it's playable but super-janky
05:16 fdobridge_: <g​fxstrand> My Intel GPU is definitely better
05:16 fdobridge_: <g​fxstrand> Now I just have to remember how to point it at the RenderDoc layer
05:16 fdobridge_: <redsheep> Yeah, it's rough. If you run it with mangohud the graph shows extremely uneven frame delivery
05:18 fdobridge_: <g​fxstrand> It could be shader compiles but I wouldn't expect ANV to be that much better now that we've got a cache.
05:18 fdobridge_: <g​fxstrand> Feels like resource loading of some sort
05:18 fdobridge_: <g​fxstrand> Or maybe queries going awry?
05:18 fdobridge_: <g​fxstrand> We had problems with Talos in Intel back in the day thanks to queries
05:18 fdobridge_: <r​edsheep> ... Am I seeing things or are your fp64 results already better than the blob?
05:19 fdobridge_: <S​id> you're not seeing things
05:19 fdobridge_: <r​edsheep> That's insane
05:19 fdobridge_: <S​id> they are 1-2 GFLOPS higher than blob
05:19 fdobridge_: <S​id> maybe that's within the margin of error, maybe it isn't, but it *is* higher :D
05:22 fdobridge_: <r​edsheep> So now we have at least one scenario with a cpu bound win and one with a gpu bound win, that's huge.
05:25 fdobridge_: <redsheep> It really doesn't feel like shader compilation; it doesn't correspond to anything new happening visually, and I've never had an issue with compilation stutter on any other driver I've played it on.
05:26 fdobridge_: <g​fxstrand> Yeah, it's really unclear what's going on.
05:26 fdobridge_: <g​fxstrand> I'll have a much better idea once I look at a trace
05:26 fdobridge_: <S​id> guess what
05:26 fdobridge_: <g​fxstrand> But if the game is doing a lot of dynamic feedback stuff, I could totally believe some of that is going awry
05:26 fdobridge_: <S​id> this solved the job timeouts too
05:27 fdobridge_: <S​id> or at least mitigated them greatly
05:27 fdobridge_: <g​fxstrand> \o/
05:27 fdobridge_: <r​edsheep> Oh? Don't those take a really long time to come up?
05:27 fdobridge_: <S​id> the game that was consistently throwing channel kills and job timeouts on boot just loaded to the main menu for me
05:27 fdobridge_: <S​id> (Metal: Hellsinger)
05:28 fdobridge_: <S​id> nope, for me it was pretty much as soon as anything dxvk tried to render
05:28 fdobridge_: <redsheep> What even changed today that could have done that? Is it even possible there were instruction sequences before that the hardware couldn't actually do? That's super weird
05:29 fdobridge_: <S​id> if my understanding of this is correct, we were stalling because of sending inappropriate instruction sequences
05:29 fdobridge_: <S​id> or, well, sending incorrect instructions
05:30 fdobridge_: <S​id> whatever the case was, I don't understand it very well even now e-e
05:31 fdobridge_: <S​id> ok yeah they're not fixed, more greatly mitigated
05:32 fdobridge_: <S​id> ```
05:32 fdobridge_: <S​id> [Fri Jan 19 11:01:05 2024] nouveau 0000:01:00.0: gsp: mmu fault queued
05:32 fdobridge_: <S​id> [Fri Jan 19 11:01:05 2024] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:24 type:31 scope:1 part:233
05:32 fdobridge_: <S​id> [Fri Jan 19 11:01:05 2024] nouveau 0000:01:00.0: fifo:001001:0003:0018:[Metal.exe[82910]] errored - disabling channel
05:32 fdobridge_: <S​id> [Fri Jan 19 11:01:05 2024] nouveau 0000:01:00.0: Metal.exe[82910]: channel 24 killed!
05:32 fdobridge_: <S​id> ```
05:32 fdobridge_: <S​id> but this mmu fault appears to be a different issue from the classic job timeout we were having so what do I know
05:33 fdobridge_: <S​id> maybe @airlied will find it interesting
05:33 fdobridge_: <S​id> fwiw I have the `pci=nommconf` kernel param set because it solves a bug on my system
05:34 fdobridge_: <S​id> the infamous PCI Bus Error flood
05:35 fdobridge_: <S​id> I'll try changing that to `pci=noaer` and see if it helps
05:38 fdobridge_: <S​id> guilty gear strive now runs at 35-40 fps
05:38 fdobridge_: <S​id> it used to be 25
05:42 fdobridge_: <r​edsheep> If it just happens to make it easier to get through CTS that would be incredible. Maybe I will see if I can run the 1.3 CTS.
05:42 fdobridge_: <g​fxstrand> Unlikely
05:42 fdobridge_: <gfxstrand> None of the stuff I did today should make the CTS noticeably faster
05:42 fdobridge_: <S​id> I've got CTS ready to go, are we looking for a full run?
05:43 fdobridge_: <r​edsheep> Is it possible it cleared some fails?
05:43 fdobridge_: <r​edsheep> Or crashes
05:43 fdobridge_: <g​fxstrand> And... I can't get a trace. 😭 Something is failing to allocate when I try to use the renderdoc layer. This sounds like a problem for tomorrow's me.
05:45 fdobridge_: <S​id> full CTS run is running
05:46 fdobridge_: <S​id> I'll share TestResults.qpa when it finishes
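For context, a single deqp-vk invocation producing a `TestResults.qpa` log looks roughly like this; the caselist path is a placeholder and a real full run is normally driven by a runner script:
```
# run from the CTS build tree; the caselist file here is a placeholder
./deqp-vk --deqp-caselist-file=vk-default.txt --deqp-log-filename=TestResults.qpa
```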
05:47 fdobridge_: <S​id> some news
05:47 fdobridge_: <S​id> ```
05:47 fdobridge_: <S​id> [Fri Jan 19 11:15:58 2024] nouveau 0000:01:00.0: deqp-vk[86332]: job timeout, channel 32 killed!
05:47 fdobridge_: <S​id> [Fri Jan 19 11:16:09 2024] nouveau 0000:01:00.0: deqp-vk[86332]: job timeout, channel 32 killed!
05:47 fdobridge_: <S​id> [Fri Jan 19 11:16:20 2024] nouveau 0000:01:00.0: deqp-vk[86332]: job timeout, channel 32 killed!
05:47 fdobridge_: <S​id> ```
05:48 fdobridge_: <S​id> still have the not enough memory thing too
06:08 fdobridge_: <S​id> ```Test case 'dEQP-VK.memory.pipeline_barrier.host_read_host_write.1024'..
06:08 fdobridge_: <S​id> terminate called after throwing an instance of 'vk::Error'
06:08 fdobridge_: <S​id> what(): vkd.deviceWaitIdle(device): VK_ERROR_DEVICE_LOST at vktMemoryPipelineBarrierTests.cpp:9345
06:08 fdobridge_: <S​id> Aborted (core dumped)
06:08 fdobridge_: <S​id> ```
06:13 fdobridge_: <S​id> something is borked with my network I'm not able to upload it anywhere
06:13 fdobridge_: <S​id> I'll upload it later
07:32 fdobridge_: <e​sdrastarsis> If he cares, maybe he is happy, a good quality driver on linux = more nvidia users on linux = profit = buy more leather jackets 🐸
07:35 fdobridge_: <S​id> who's leather jacket guy
07:38 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://upload.wikimedia.org/wikipedia/commons/c/c4/Jensen_Huang_%28cropped%29.jpg
07:40 fdobridge_: <S​id> ...ah
09:07 fdobridge_: <S​id> it does not
09:53 HdkR: The more you buy, the more you save!
14:27 fdobridge_: <S​id> out of curiosity, what's up with our fp16/int16?
15:07 fdobridge_: <g​fxstrand> Int16 works. @marysaka is supposed to do fp16 whenever she gets fed up with mesh. 😅
15:08 fdobridge_: <m​ohamexiety> btw is there any weirdness with doing/wiring up fp16 from the software side? because I heard that on Turing and onwards FP16 is done on the tensor cores rather than general vALUs
15:10 fdobridge_: <!​DodoNVK (she) 🇱🇹> I wonder if she's secretly trying to run Nvidium /s
15:17 fdobridge_: <!​DodoNVK (she) 🇱🇹> Feel free to review this: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27174
15:21 fdobridge_: <g​fxstrand> If, right. Budget. I forgot about that one. I'll give it a look today. Unfortunately, we don't yet have a dynamic kernel query but we should be able to do something.
15:54 fdobridge_: <m​arysaka> tbh I think I will drop the MR for mesh shader on monday and start looking fp16 that day
15:54 fdobridge_: <g​fxstrand> Cool
15:54 fdobridge_: <g​fxstrand> 🥳
15:56 fdobridge_: <marysaka> I just end up in a loop inside my brain trying to figure out how to keep track of what I want to extract when nir_cf_extract possibly invalidates my blocks and cursors 😅
15:57 fdobridge_: <m​arysaka> but yeah otherwise I cleaned up most of the mesh changes
15:58 fdobridge_: <marysaka> would be nice to implement the CS invocation count in shader thing we discussed so I could enable the statistics feature in mesh/task
16:00 fdobridge_: <m​arysaka> (I think I will look into that on Monday too as it seems mostly trivial to implement)
16:19 fdobridge_: <r​edsheep> I thought mesh would take way longer, that's super exciting. Somebody is working on raytracing too, right?
16:20 fdobridge_: <r​hed0x> @konstantinseurer has started REing the acceleration structure format but he's stopped and there's still a TON of work to do
16:27 fdobridge_: <r​edsheep> I'm super curious how that's going to look in terms of performance, it's unclear from the outside whether nvidia's dominance there is due to hardware or great drivers
16:28 fdobridge_: <r​edsheep> Really, it's probably both
16:36 fdobridge_: <p​ixelcluster> that’s what I’d say too
16:36 fdobridge_: <p​ixelcluster> I think NV's drivers pull in CUDA runtime stuff for RT?
16:38 fdobridge_: <pixelcluster> they probably profit both from the decade+ of engineering effort targeted specifically at running really complex compute workloads as fast as possible and from the actual throughput of their RT hw
16:39 fdobridge_: <r​edsheep> Yeah if you know what to look for it's clear that Nvidia was already making major changes to the architecture with an eye towards RT performance all the way back in maxwell
16:39 fdobridge_: <r​edsheep> I'm sure they had been preparing long before work even started on turing
16:58 fdobridge_: <S​id> oh shit I completely forgot
16:58 fdobridge_: <S​id> https://drive.google.com/file/d/1yl8Y5PIhPKtqck_KMv2gvqPKPX6Lzzr_/view
16:59 fdobridge_: <!​DodoNVK (she) 🇱🇹> I just rebased the ESO/GPL MR for :triangle_nvk:
17:04 fdobridge_: <!​DodoNVK (she) 🇱🇹> `vkcube-wayland: ../mesa/src/vulkan/runtime/vk_pipeline_cache.c:272: vk_pipeline_cache_object_deserialize: Assertion 'reader.current == reader.end && !reader.overrun' failed.` :cursedgears:
17:05 fdobridge_: <g​fxstrand> uh oh...
17:05 fdobridge_: <S​id> also, bringing this up again, but the 2 extensions required for us to be able to advertise vulkan 1.3 are both optional
17:06 fdobridge_: <S​id> VK_EXT_texture_compression_astc_hdr (not supported by hw) and VK_KHR_shader_float16_int8
17:07 fdobridge_: <S​id> we've implemented every other required ext for 1.1, 1.2, and 1.3
17:07 fdobridge_: <redsheep> It still doesn't pass the whole 1.3 CTS though, so isn't it still wrong to expose it?
17:08 fdobridge_: <marysaka> I mostly had everything figured out and ready since early December, the main issue is how to handle 128 local invocations when you only have 32 max in hardware... so I will just land a draft MR without the proper lowering pass for now 😅
17:09 fdobridge_: <S​id> this is what I'm also confused about
17:09 fdobridge_: <!​DodoNVK (she) 🇱🇹> Let's disable the assert and see if that code explodes like crazy ☢️
17:10 fdobridge_: <g​fxstrand> Yeah, the serialize stuff is a little crazy
17:10 fdobridge_: <!​DodoNVK (she) 🇱🇹> Anyway vkcube-wayland now works with it disabled
17:13 fdobridge_: <g​fxstrand> I just rebased. Want to give that a try?
17:16 fdobridge_: <!​DodoNVK (she) 🇱🇹> I already have my own rebase (updating Overwatch 2 right now)
17:18 fdobridge_: <tom3026> @sid hm, I'm also seeing those job timeouts even though I haven't updated mesa in a while. Are you also seeing a weird mm dma stacktrace at the beginning of boot in dmesg? Something has regressed in the kernel between rc4 and 6.7.0 not related to nouveau. Can't even boot the nvidia blob
17:19 fdobridge_: <S​id> haven't seen any stacktrace
17:19 fdobridge_: <S​id> and I'm able to boot both nouveau and proprietary driver with just kernel cmdline shenanigans (edited)
17:20 fdobridge_: <t​om3026> Hm what shenanigans? 😄
17:20 Sid127: `module_blacklist=nouveau` when I want proprietary driver
17:21 Sid127: `module_blacklist=nvidia nouveau.config=NvGspRm=1` when I want nouveau with GSP
17:21 Sid127: and you can guess what I use for nouveau without GSP :P
17:21 fdobridge_: <t​om3026> Oh thought you meant somethin else like disabling aspm or similar but yeah
17:21 Sid127: nope, don't need anything else
17:21 fdobridge_: <t​om3026> Oh well bisecting time then bleh
17:22 Sid127: in fact I think I force enable aspm
17:22 Sid127: all the best with bisect e-e
17:22 Sid127: such a long process, so much waiting
17:23 Sid127: ..I guess I could bisect it myself too, I've got the next 48 hours all to myself
17:23 Sid127: also, @tom3026, were you not seeing any job timeouts on rc4?
17:24 fdobridge_: <t​om3026> not on vkcube, now it doesnt even run at all
17:24 fdobridge_: <S​id> woah
17:25 fdobridge_: <S​id> ok, I'll take a look at identifying the issue myself as well (edited)
17:25 fdobridge_: <S​id> rc6 had a patch that allowed me to properly use it (prime setup), I'll just backport that single patch and test it all out
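For reference, backporting a single patch onto an older tag is just a cherry-pick; the commit hash below is a placeholder:
```
# start a branch from the older release candidate and apply the one fix
git checkout -b rc4-plus-prime v6.7-rc4
git cherry-pick <commit-of-the-prime-fix>
```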
17:26 fdobridge_: <t​om3026> but im getting this https://gist.github.com/gulafaran/24a52defdfc482f21685c133201b8705 at boot so something has gone bork with 6.7 , blob or nouveau doesnt matter. 6.6.8 works fine as it always has
17:26 fdobridge_: <S​id> so if the regression happened it's between rc4 and rc6 (maybe rc7), because I've been getting job timeouts ever since
17:26 fdobridge_: <!​DodoNVK (she) 🇱🇹> That's not good :cursedgears:
17:26 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197955267443306556/message.txt?ex=65bd25c8&is=65aab0c8&hm=c02158adb597935bf443e32ff18eef1227422f9299f8bb220697a022ff5028ec&
17:26 fdobridge_: <S​id> nope, haven't seen this
17:33 fdobridge_: <!​DodoNVK (she) 🇱🇹> @gfxstrand With your GPL rebase vkcube now hangs/freezes instead :cursedgears:
17:34 fdobridge_: <S​id> quick! check the dmesg!
17:34 fdobridge_: <!​DodoNVK (she) 🇱🇹> I think I only got a job timeout
17:38 fdobridge_: <g​fxstrand> Womp womp
17:38 fdobridge_: <g​fxstrand> I'll look after a bit
17:39 fdobridge_: <tom3026> I just rebuilt mesa and your vulkan-nouveau AUR package: https://i.imgur.com/UKPfSIA.png on 6.6.8 (not GSP, of course), then rebooting into 6.7: https://i.imgur.com/Hjko3nT.png
17:40 fdobridge_: <tom3026> plus my awful mm dma trace at boot not related to nouveau. Wasn't having these issues on the rc kernel I had around earlier, I think it was rc4, so something is bork bork 😛
17:40 fdobridge_: <S​id> I'm compiling 6.7-rc5 as we speak
17:41 fdobridge_: <Sid> since you said the job timeout didn't exist on rc4
17:41 fdobridge_: <t​om3026> well i didnt see it on vkcube nor cs2 but it might have? and/or this is just a new thing? and im having 2 bugs currently xD
17:41 fdobridge_: <tom3026> but the blob isn't even booting either, it hardlocks with some null pointer dereference in the nvidia driver, so yeah idk
17:42 fdobridge_: <S​id> well, only one way to find out :D
17:54 fdobridge_: <g​fxstrand> Ugh... I hate trying to GDB stuff with wine....
17:55 Sid127: why not use winedbg
17:56 Sid127: or are you trying to do something else
18:04 fdobridge_: <g​fxstrand> How do I attach the dumb thing?
18:06 fdobridge_: <t​om3026> is your igpu amd?
18:08 fdobridge_: <S​id> intel
18:08 fdobridge_: <S​id> you just do `winedbg exe.exe` instead of `wine exe.exe`
18:08 fdobridge_: <S​id> iirc
18:08 fdobridge_: <S​id> I'm not too familiar with winedbg either e-e
18:10 fdobridge_: <tom3026> okay, I'm building RCs now, I'll try to figure out which rc borks that mm, and vkcube at the same time. Doesn't seem to be GSP related, I disabled that on the boot cmdline, and if I got things right GSP isn't enabled by default on Ampere yet?
18:10 fdobridge_: <!​DodoNVK (she) 🇱🇹> Or use `gdb` with some Wine patches (the Whacker way)
18:11 fdobridge_: <z​mike.> if you're trying to debug driver stuff on a wine app, you gdb attach to the process, source https://gist.github.com/zmike/4a8059403ce1e330a9fcdff79d214fbd, and then `wr` to load all your symbols
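Roughly, that workflow looks like the sketch below; the process name and the saved script filename are placeholders:
```
# attach gdb to the already-running wine process
gdb -p "$(pidof Game.exe)"

# inside gdb: source the helper script from the gist, then load the symbols
(gdb) source wine-gdb-helpers.py
(gdb) wr
```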
18:25 fdobridge_: <S​id> @tom3026 job timeout on rc5
18:27 fdobridge_: <t​om3026> rc4 is compiling atm
18:27 fdobridge_: <S​id> same here 😅
18:27 fdobridge_: <t​om3026> go for 3!
18:27 fdobridge_: <S​id> rc3? okie
18:28 fdobridge_: <t​om3026> are you on arch? does it timeout on -lts for you? its still on 6.6.12
18:28 fdobridge_: <t​om3026> no gsp tho but still
18:28 fdobridge_: <S​id> I just have one single commit added on top to make sure GSP doesn't freeze my entire system for me
18:28 fdobridge_: <g​fxstrand> Woah! Magic!
18:28 fdobridge_: <S​id> and yeah, job timeouts happen only when GSP's enabled
18:29 fdobridge_: <S​id> so naturally they won't happen on 6.6
18:29 fdobridge_: <tom3026> does dmesg tell when gsp is loaded?
18:31 fdobridge_: <S​id> don't think it does
18:31 fdobridge_: <m​ohamexiety> it does
18:31 fdobridge_: <S​id> run sensors, if there's no nouveau then you're using GSP
18:31 fdobridge_: <S​id> oh
18:32 fdobridge_: <g​fxstrand> Ugh... DXVK is trying to use sparse images even though NVK doesn't support them.
18:32 fdobridge_: <!​DodoNVK (she) 🇱🇹> I think all of those errors partially corrupted the NVIDIA GPU
18:32 fdobridge_: <r​edsheep> That's with the witness?
18:32 fdobridge_: <g​fxstrand> Yeah
18:32 fdobridge_: <g​fxstrand> That's why RenderDoc is crashing
18:34 fdobridge_: <m​ohamexiety> ```
18:34 fdobridge_: <m​ohamexiety> [ 4.087613] nouveau 0000:01:00.0: gsp: firmware "nvidia/ga102/gsp/booter_load-5303002.bin" loaded - 42104 byte(s)
18:34 fdobridge_: <m​ohamexiety> ```
18:34 fdobridge_: <m​ohamexiety> this is old but you get the idea
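So a quick check on a running system is just to look for the GSP firmware lines in the kernel log, e.g.:
```
# GSP firmware loads are logged by nouveau when GSP is actually in use
sudo dmesg | grep -i gsp
```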
18:34 fdobridge_: <g​fxstrand> But somehow it still runs?!?
18:35 fdobridge_: <redsheep> So is this just a case where fixing that game will need to wait on that feature being added?
18:35 fdobridge_: <t​om3026> then yeah @sid im getting those without gsp on 6.7
18:35 fdobridge_: <t​om3026> just thought i failed to disable it
18:35 fdobridge_: <S​id> timeout without gsp *sweat* (edited)
18:35 fdobridge_: <S​id> what GPU do you have again?
18:35 fdobridge_: <S​id> if you were using vkd3d-proton you could configure it to avoid using specific extensions fwiw
18:36 fdobridge_: <r​edsheep> How hard are sparse images?
18:36 fdobridge_: <m​ohamexiety> it could be the game isn't using them but it's a hard req for DXVK? I am not sure though
18:36 fdobridge_: <t​om3026> 3060
18:36 fdobridge_: <g​fxstrand> I think I can probably fake it and say everything's in the miptail
18:36 fdobridge_: <m​ohamexiety> they're mostly done but I just need to finish my finals (one on sunday, one on tuesday) to clean up and fix a few small bugs (edited)
18:36 fdobridge_: <S​id> ok yeah, ampere card, gsp is disabled by default and needs the cmdline to enable it
18:37 fdobridge_: <S​id> all the best for your finals!
18:37 fdobridge_: <g​fxstrand> It's DXVK, not VKD3D
18:37 fdobridge_: <S​id> I know, just saying :P
18:38 fdobridge_: <S​id> ~~motivation to get vkd3d rendering working~~
18:38 fdobridge_: <S​id> Amid Evil runs on NVK btw
18:38 fdobridge_: <S​id> https://cdn.discordapp.com/attachments/1171505720349446276/1186936165094404117/image.png
18:39 fdobridge_: <S​id> a lot of the game isn't rendered, but you *can* interact with it, move around, fire your weapons, all that stuff
18:39 fdobridge_: <S​id> some weapon effects render and create pretty artifacts
18:43 fdobridge_: <g​fxstrand> Okay, I've got a capture now. Let's see what this thing is doing.
19:03 fdobridge_: <S​id> no immediate timeout on rc3
19:03 fdobridge_: <S​id> testing further to see if it happens later
19:04 fdobridge_: <t​om3026> i messed my rc4 build so rebuilding again.. but il check soon
19:04 fdobridge_: <S​id> no worries~
19:04 fdobridge_: <S​id> long way to go even after we identify between which two release candidates the regression occurred e-e
19:05 fdobridge_: <t​om3026> yeah :/
19:05 fdobridge_: <S​id> and it's only half past midnight where I am
19:05 fdobridge_: <S​id> so technically I've got plenty time >:D
19:06 fdobridge_: <tom3026> 20:05 here but got 2 kids having a cold so gotta get to bed early and prepare for a nightmare night 😄 but I'll test rc4 first
19:06 fdobridge_: <S​id> oof, hope they get well soon
19:06 fdobridge_: <Sid> one of the few bonuses of being a student in the dorms on a weekend is that time suddenly becomes a social construct 😅
19:09 fdobridge_: <g​fxstrand> So far some annoying but probably fine fragment shaders
19:10 fdobridge_: <r​edsheep> So the sparse image stuff wasn't causing the slowdown?
19:12 fdobridge_: <g​fxstrand> IDK yet
19:12 fdobridge_: <g​fxstrand> I got a capture on ANV to compare and we'll go from there
19:12 fdobridge_: <S​id> `[Sat Jan 20 00:41:55 2024] nouveau 0000:01:00.0: dirtrally2.exe[6217]: job timeout, channel 24 killed!`
19:12 fdobridge_: <S​id> rc3
19:13 fdobridge_: <S​id> however it didn't happen on my usual instant-timeout game, Metal: Hellsinger
19:13 fdobridge_: <S​id> and I was able to finish one stage on Dirt Rally 1 as well
19:13 fdobridge_: <S​id> so somewhere between rc3 and rc5, the timeouts got worse
19:15 fdobridge_: <S​id> and going back to M:H after encountering that one, it happened instantly again :D
19:16 fdobridge_: <!​DodoNVK (she) 🇱🇹> After a reboot vkcube is fine without the GPL patch at least
19:16 fdobridge_: <g​fxstrand> Yeah, something is definitely going wrong with occlusion
19:16 fdobridge_: <g​fxstrand> It's rendering massively more geometry on NVK
19:19 fdobridge_: <S​id> culling busted
19:21 fdobridge_: <g​fxstrand> Or not? IDK. It's hard to tell
19:21 fdobridge_: <S​id> ok yeah, a reboot later and timeouts are back to normal frequency on rc3
19:21 fdobridge_: <S​id> at this rate I'll hit rc1 🐸
19:21 fdobridge_: <z​mike.> probably the same issue I cited in all the traces I posted weeks ago
19:22 fdobridge_: <t​om3026> rc4 vkcube runs now, still got that unrelated stacktrace tho
19:23 fdobridge_: <S​id> ..I've been trying games, not vkcube D:
19:23 fdobridge_: <t​om3026> yeah vkcube instantly dies on 6.7 here not rc4 tho nor 6.6.8 :p
19:23 fdobridge_: <S​id> but then again, vkcube works fine for me even on 6.7.0, so
19:24 fdobridge_: <S​id> this is weird
19:25 fdobridge_: <!​DodoNVK (she) 🇱🇹> And Overwatch 2 too (so there's definitely something wrong with the GPL MR/patch)
19:25 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197985278850244739/Screenshot_20240119_212226.png?ex=65bd41bb&is=65aaccbb&hm=23c6afae4a665ccac91251149007b89aedb9cc0df152141c2f85ea030d7dfde9&
19:28 fdobridge_: <!​DodoNVK (she) 🇱🇹> There's this warning though (I've gotten these on NVK before in a different form) :ferris:
19:28 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197985852664582224/message.txt?ex=65bd4244&is=65aacd44&hm=5bc0160aaea76d64d05492abd3b49636b8d0caeb8927a15f22a008f39d8a1bb7&
19:31 fdobridge_: <S​id> time to try rc1
19:32 fdobridge_: <S​id> if I run into timeout there too...
19:41 fdobridge_: <g​fxstrand> Oh, yeah. My ESO branch is hosed.
19:54 fdobridge_: <S​id> gonna try rc1, then get back to 6.7.0 to play some stardew valley for a while
19:58 fdobridge_: <!​DodoNVK (she) 🇱🇹> Overwatch 2 managed to compile shaders for a little bit but then nouveau errors started piling in
19:58 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197993526013739108/message.txt?ex=65bd496a&is=65aad46a&hm=704d74684f8a30696d0da312896d42b46fa9ec3e520a425daafd1dc6a59ecd21&
20:00 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197994104081104957/message.txt?ex=65bd49f4&is=65aad4f4&hm=562df0b45965e9aec2cce63908edf1c5637cedee0b5c711bc87ed5616feacdea&
20:01 fdobridge_: <!​DodoNVK (she) 🇱🇹> Here's more dmesg errors
20:01 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197994227485913228/message.txt?ex=65bd4a11&is=65aad511&hm=e6b23ae198e4d80c2edfc3986efe6a8b4ee39a1ce8c8dc013682a75bc3942689&
20:07 fdobridge_: <!​DodoNVK (she) 🇱🇹> @airlied I found a new way to cause full system hangs with nouveau (you already fixed the previous hang cause) :nouveau:
20:07 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1197995784461877409/message.txt?ex=65bd4b84&is=65aad684&hm=9b91701302db2d7a19a4b4230ec75e5e9507fcacb5dcdc99bc61b1949a142e01&
20:08 fdobridge_: <!​DodoNVK (she) 🇱🇹> Weirdly pressing the power button on my laptop unhanged my system 🤔
20:11 f_: karolherbst It happened again!
20:11 f_: Slowdown on eDP-1 (laptop monitor)
20:11 f_: grabbing dmesg log...
20:13 f_: Looks similar to the dmesg I had before last time I had the issue
20:13 f_: https://bin.vitali64.duckdns.org/65aad7ee
20:15 f_: So basically what I have is a laggy unusable eDP-1, somehow DP-2 is fine
20:16 f_: And `nouveau 0000:01:00.0: DRM: base-1: timeout` showing up in dmesg continuously
20:18 fdobridge_: <!​DodoNVK (she) 🇱🇹> When trying to reboot the system was stuck at a black screen (Caps Lock still worked though)
20:20 fdobridge_: <S​id> @tom3026 no timeout on rc1 so far
20:20 fdobridge_: <tom3026> so yeah, on rc4 vkcube works and plugging in an external monitor works, with or without GSP. On 6.7 vkcube times out and plugging in a monitor hardlocks without errors in dmesg
20:20 f_: Once again, switching to another mode and back fixes it.
20:20 fdobridge_: <t​om3026> sadly im still getting those mm traces on rc1 so rc1 brought in some regression
20:21 fdobridge_: <S​id> rc1 is also the first implementation of GSP in mainline 🙃
20:21 fdobridge_: <tom3026> yeah but those traces are unrelated to nouveau, happens on the blob as well 😛
20:22 fdobridge_: <S​id> ah
20:22 fdobridge_: <S​id> right, then it's a different issue on your end 😅
20:22 fdobridge_: <S​id> atm am really only bothered about nouveau bugs, sorry :P
20:22 fdobridge_: <t​om3026> yeah nouveau tho is somewhere around rc4 and upwards
20:23 fdobridge_: <g​fxstrand> Actually, it's seeming kinda okay according to the CTS?!?
20:23 fdobridge_: <S​id> I'd say rc3
20:23 fdobridge_: <S​id> though I haven't tested rc2 either
20:23 fdobridge_: <S​id> I'll spend tomorrow properly bisecting 6.7 :)
20:24 fdobridge_: <!​DodoNVK (she) 🇱🇹> Not according to Overwatch 2 though
20:25 fdobridge_: <g​fxstrand> Yeah...
20:25 fdobridge_: <g​fxstrand> It does seem to maybe flake a little more but I can't repro the fails. 🙃
20:25 fdobridge_: <!​DodoNVK (she) 🇱🇹> I wonder if there's a Vulkan test that compiles a crazy amount of shaders with GPL
20:26 fdobridge_: <S​id> yeah, absolutely no timeout on rc1
20:26 fdobridge_: <S​id> even dirt rally 2 didn't throw one
20:26 fdobridge_: <g​fxstrand> If you're getting `[VIRT_WRITE]` fails, that's not the compiler.
20:26 fdobridge_: <!​DodoNVK (she) 🇱🇹> It happened during shader compilation though (so did something else explode during that compilation?)
20:27 fdobridge_: <g​fxstrand> What do you mean by "during shader compilation"?
20:27 fdobridge_: <g​fxstrand> Like during Steam's "compiling shaders" thing?
20:27 fdobridge_: <S​id> dxvk's gpl compilation methinks
20:28 fdobridge_: <!​DodoNVK (she) 🇱🇹> Yes (DXVK shows that when shaders are being compiled)
20:28 fdobridge_: <g​fxstrand> Hrm...
20:28 fdobridge_: <S​id> you need to set an env var for it
20:28 fdobridge_: <g​fxstrand> If literally the only thing happening is compiling shaders, that is odd
20:28 fdobridge_: <S​id> `DXVK_HUD=compiler`
20:28 fdobridge_: <S​id> it's different from steam's compiling shaders dialog box, it happens during gameplay
20:29 fdobridge_: <!​DodoNVK (she) 🇱🇹> I'm setting `all` for the most detailed information
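For reference, the HUD is controlled purely through an environment variable on the game's command line; the game binary below is a placeholder:
```
# show only shader-compiler activity
DXVK_HUD=compiler wine Game.exe

# or, as a Steam launch option
DXVK_HUD=compiler %command%
```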
20:29 fdobridge_: <g​fxstrand> Wait... That almost looks like maybe it's my mmap that's failing somehow.
20:30 fdobridge_: <g​fxstrand> @asdqueerfromeu If you do `lspci -vv` what memory regions does it show for your NVIDIA card?
20:31 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1198001899559977071/message.txt?ex=65bd5136&is=65aadc36&hm=447da279a8f91a27e441884edbf9a6af1810ca97632a262aab0c83f6e913d0ab&
20:32 fdobridge_: <g​fxstrand> Okay, so your BAR isn't resized. Can you poke around in your BIOS settings and try to enable resizable BAR?
20:32 fdobridge_: <!​DodoNVK (she) 🇱🇹> I can't really enable it on my laptop
20:32 fdobridge_: <S​id> hardware unsupported
20:32 fdobridge_: <g​fxstrand> Ugh...
20:33 fdobridge_: <g​fxstrand> Is it really blowing past 256M with just shaders?!?
20:33 fdobridge_: <S​id> rebar is only for 3000 series and above
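For anyone checking their own card, the BAR size shows up in the `Region` lines of the verbose lspci output; `10de` is NVIDIA's PCI vendor ID:
```
# a non-resized BAR1 typically shows up as a 256M prefetchable region
lspci -d 10de: -vv | grep -i region
```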
20:34 fdobridge_: <!​DodoNVK (she) 🇱🇹> DXVK showed almost 10000 shaders at one point so 🤷‍♀️
20:36 fdobridge_: <g​fxstrand> Yeah, that's probably it.
20:36 fdobridge_: <g​fxstrand> In `nvk_device.c`, drop the `NOUVEAU_WS_BO_LOCAL` from `shader_heap`.
20:39 fdobridge_: <redsheep> Quite a few newer titles can end up with several GBs of shaders, hundreds of thousands of them
20:39 fdobridge_: <S​id> if I manage to bisect and pinpoint the source of these two nouveau+GSP bugs...
20:40 fdobridge_: <S​id> maybe this "meme" will come true
20:40 fdobridge_: <S​id> https://cdn.discordapp.com/attachments/1034184951790305330/1198003981511835658/image.png?ex=65bd5327&is=65aade27&hm=71bab806c7d2681e31897c15f1390f4eb6bf64975fc2d2e751f37bab9413a2cd&
20:40 fdobridge_: <Sid> yeah but ideally you'd want the shaders to compile in a way that doesn't overwhelm the BAR
20:40 fdobridge_: <S​id> so there should be a queue of sorts
20:40 fdobridge_: <g​fxstrand> Well, the good news is that, now that we have VM_BIND, my options for solving this problem are vastly expanded.
20:41 fdobridge_: <S​id> instead of everything trying to compile all at once
20:41 fdobridge_: <g​fxstrand> The bad news is that ugh...
20:41 fdobridge_: <g​fxstrand> I need to give this all a very long think
20:41 fdobridge_: <g​fxstrand> We're going to have to DMA shaders.
20:41 fdobridge_: <g​fxstrand> Thanks, I hate it.
20:41 fdobridge_: <g​fxstrand> Some days I miss my UMA
20:44 fdobridge_: <r​edsheep> Careful what you wish for, the industry trend seems to make it look like dGPUs could be a thing of the past before long.
20:45 fdobridge_: <g​fxstrand> Getting PCI out of the way would be a good thing.
20:45 fdobridge_: <S​id> just give everyone powerful APUs
20:45 fdobridge_: <g​fxstrand> Hopefully with PCIe5, it'll be faster and REBAR will be mandatory
20:45 fdobridge_: <g​fxstrand> Let everything map everything else's memory, please!
20:46 karolherbst: f_: yeah.. looks like something random happened mhh...
20:46 fdobridge_: <m​ohamexiety> the big compute APUs are really insane in this regard
20:46 f_: karolherbst: Pretty much..
20:46 karolherbst: Lyude: https://bin.vitali64.duckdns.org/65aad7ee any ideas?
20:47 f_: I don't seem to have any way to reproduce this reliably.
20:47 Ellenor: I just wish the world would stop with this accelerating escalation in computing.
20:48 fdobridge_: <!​DodoNVK (she) 🇱🇹> Let's recompile now
20:48 fdobridge_: <redsheep> It will have some huge benefits, but that future will probably mean no upgradability at all. Want more RAM or storage? Want more GPU power? Probably better to buy a new computer than to use any of the much slower external or PCIe solutions.
20:48 fdobridge_: <g​fxstrand> Eh, I don't think we're actually going to get there. People have been saying it for a decade now and the PC is still here. It just uses different connectors for everything.
20:49 fdobridge_: <redsheep> True, but tightly integrated chiplets only just got good enough to ship for consumers in like the last few months
20:50 fdobridge_: <r​edsheep> It being something that's been a rumor or trend for a long time doesn't mean it won't happen.
20:50 fdobridge_: <gfxstrand> It's less about integrated chiplets and more that, as long as there are companies specializing in various pieces of the computing stack, people are going to want to be able to plug A into B.
20:50 fdobridge_: <k​arolherbst🐧🦀> the most likely thing is that RAM DIMMs go away 😛
20:51 fdobridge_: <r​edsheep> Meteor lake is basically the first true chiplet product that isn't for servers
20:51 fdobridge_: <gfxstrand> Intel doesn't have a competent APU so they're going to fight like hell to keep PCI alive.
20:51 fdobridge_: <Sid> isn't Intel working on an APU for the MSI handheld
20:51 fdobridge_: <g​fxstrand> Otherwise, they're going to become irrelevant in the datacenter and they know that.
20:51 fdobridge_: <k​arolherbst🐧🦀> yeah...
20:52 Ellenor: "Intel doesn't have a competent APU" Define competent
20:52 fdobridge_: <k​arolherbst🐧🦀> well..
20:52 fdobridge_: <k​arolherbst🐧🦀> if intel would figure out their RAM situation it might help but.. ehh
20:52 fdobridge_: <k​arolherbst🐧🦀> I doubt they will
20:52 fdobridge_: <g​fxstrand> Until Intel is able to ship a desktop CPU with a built-in APU that can compete with whatever NVIDIA's equivalent of the 4080 at that time is, they're going to want to support plugging one in.
20:52 fdobridge_: <g​fxstrand> Because the only other option is that people stop buying their CPUs.
20:53 fdobridge_: <k​arolherbst🐧🦀> you mean like do what apple is doing? 😛
20:53 fdobridge_: <g​fxstrand> Yup
20:53 fdobridge_: <k​arolherbst🐧🦀> but yeah..
20:53 fdobridge_: <k​arolherbst🐧🦀> that means DIMM has to go 😄
20:53 fdobridge_: <g​fxstrand> Apple is only able to get away with it because they have a dedicated userbase that's happy to pay 2x the cost for a non-upgradable laptop.
20:54 fdobridge_: <g​fxstrand> The rest of the computing industry is built on competitors making different components and the end user (or OEM) plugging them together.
20:54 fdobridge_: <k​arolherbst🐧🦀> not sure if DIMM is fixable even
20:54 fdobridge_: <karolherbst🐧🦀> Dell tried, but they only got +50% bandwidth, which falls a lot short of Apple's +800% 😄
20:55 fdobridge_: <g​fxstrand> I'm sure someone could figure out a bus to let you strap HBM3 onto your Intel CPU.
20:55 fdobridge_: <k​arolherbst🐧🦀> mhhh
20:55 fdobridge_: <k​arolherbst🐧🦀> maybe?
20:55 fdobridge_: <g​fxstrand> That's basically what AMD/Sony/MSFT did for the PS4/5 and XBox whatever.
20:55 fdobridge_: <k​arolherbst🐧🦀> not sure how well HBM3 would work with a CPU
20:55 fdobridge_: <k​arolherbst🐧🦀> ohh.. they used HBM?
20:55 fdobridge_: <k​arolherbst🐧🦀> I thought it was always very fast DDR
20:56 fdobridge_: <g​fxstrand> Maybe it's just very fast DDR. I thought it was HBM
20:56 fdobridge_: <S​id> karol if I write a couple kernel patches would it be possible to get them into the kernel with my name still attached :3
20:56 fdobridge_: <k​arolherbst🐧🦀> PS5 is GDDR6 (edited)
20:56 fdobridge_: <k​arolherbst🐧🦀> same for the xbox
20:57 fdobridge_: <karolherbst🐧🦀> I think the problem with HBM is that it relies too much on wide data buses, making it a bit painful to use for CPU-like workloads or something
20:57 fdobridge_: <g​fxstrand> Yeah, probably
20:57 fdobridge_: <k​arolherbst🐧🦀> and GDDR is generally good enough for GPUs
20:57 fdobridge_: <k​arolherbst🐧🦀> anyway
20:57 fdobridge_: <k​arolherbst🐧🦀> apple also just uses DDR
20:58 fdobridge_: <k​arolherbst🐧🦀> just uhh.. more channels and stuff
20:58 fdobridge_: <g​fxstrand> Yeah
20:58 fdobridge_: <g​fxstrand> They have a crazy wide bus
20:58 fdobridge_: <karolherbst🐧🦀> the issue is just, how fast can you go if you have it removable
20:58 fdobridge_: <g​fxstrand> (Or same bus width just more channels)
20:58 fdobridge_: <k​arolherbst🐧🦀> well..
20:58 fdobridge_: <g​fxstrand> Yeah.
20:58 fdobridge_: <k​arolherbst🐧🦀> the CPU can't access full speed anyway
20:58 fdobridge_: <g​fxstrand> Don't get me wrong. Connectors are massively limiting.
20:59 fdobridge_: <g​fxstrand> They're just too damn big and too far away.
21:00 fdobridge_: <g​fxstrand> Any news?
21:02 fdobridge_: <!​DodoNVK (she) 🇱🇹> I switched to GSP this time and I got a job timeout instead (and a system hang not too long after that)
21:03 fdobridge_: <S​id> dodo switch to 6.7-rc1
21:03 fdobridge_: <r​edsheep> Ponte vecchio is a pretty competent apu afaik, it's just super expensive and not entirely fabbed by Intel.
21:03 fdobridge_: <S​id> with the prime patchset applied on top
21:03 fdobridge_: <S​id> no job timeouts there
21:07 fdobridge_: <r​edsheep> Their data center relevance is hanging by a string regardless. I don't think they're incapable of making a product with hbm and enough cache for good CPU perf, or making a good laptop lpddr based apu, it's just a matter of having good enough architecture and manufacturing to make it not suck.
21:08 fdobridge_: <r​edsheep> Mostly in terms of efficiency.
21:08 fdobridge_: <r​edsheep> Nvidia and AMD have great perf/watt and production costs, and Intel is really struggling with that.
21:09 fdobridge_: <S​id> holy shit nvk and gsp is handling whatever I throw at it on rc1
21:09 fdobridge_: <S​id> haven't tried zink very well yet but stardew valley works great
21:10 fdobridge_: <S​id> no timeouts in my dmesg yet
21:10 fdobridge_: <S​id> no mmu faults being queued either
21:11 fdobridge_: <S​id> well, zink just caused a timeout but I have a feeling that's more because of zink not doing imad properly for nvidia
21:12 fdobridge_: <S​id> like we learned yesterday
21:18 fdobridge_: <r​edsheep> You're still using mesa with the new patches though right? If zink goes through NVK then wouldn't all of that new compiler work still help with that?
21:19 fdobridge_: <S​id> I am using mesa with the new patches, yeah
21:20 fdobridge_: <S​id> finally ran into a few job timeouts on sea of thieves on rc1
21:20 fdobridge_: <S​id> but I'm surprised by how... rare it is on rc1, there's definitely a regression somewhere that's exacerbating the issue
21:20 fdobridge_: <S​id> anyway, time to try out rc2
21:24 fdobridge_: <!​DodoNVK (she) 🇱🇹> I tried non-GSP again with that flag removed and I still got nouveau error spam (the shaders almost got compiled though)
21:25 fdobridge_: <!​DodoNVK (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1198015453172334722/message.txt?ex=65bd5dd6&is=65aae8d6&hm=43a1d37664ce8e1b1eb7da2c4a0c5326de28880498b515324e9496aaf8b98d09&
21:26 fdobridge_: <S​id> did you clear the shader cache
21:26 fdobridge_: <S​id> chances are it picked up where it left off if you didn't
21:26 fdobridge_: <!​DodoNVK (she) 🇱🇹> I forgot to do that this time
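For reference, assuming `MESA_SHADER_CACHE_DIR` hasn't been overridden, clearing the on-disk cache is just:
```
# wipe Mesa's default on-disk shader cache so compilation starts from scratch
rm -rf ~/.cache/mesa_shader_cache
```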
21:30 fdobridge_: <S​id> ok so
21:30 fdobridge_: <S​id> job timeouts have been a thing since rc1
21:30 fdobridge_: <S​id> `mmu fault queued` error has been there since rc3
21:31 fdobridge_: <S​id> tomorrow I'll bisect b/w rc2 and rc3 to identify and possibly fix why `mmu fault queued` happens
21:31 fdobridge_: <S​id> for job timeouts someone who actually knows what they're doing will have to look into it
21:32 fdobridge_: <S​id> but hey at least now I don't have to wait for the tagged rcs to compile again :D
21:32 fdobridge_: <S​id> also hi rhedox 👋
21:32 fdobridge_: <r​hed0x> Hi
21:32 fdobridge_: <r​hed0x> why is there a 40 second black screen between the systemd init screen and the login screen btw?
21:34 fdobridge_: <r​hed0x> when booting with nouveau that is
21:38 fdobridge_: <m​arysaka> do you have NVIDIA blobs installed?
21:38 fdobridge_: <r​hed0x> i dont think so
21:38 fdobridge_: <r​hed0x> i think i got rid of all of them before installing nvk
21:38 fdobridge_: <marysaka> hmm, I had some behavior like that with one of their services timing out after a long time
21:41 fdobridge_: <r​hed0x> anything to look out for? maybe something to look for in journalctl?
21:42 fdobridge_: <a​irlied> I spotted it using ps
21:42 fdobridge_: <a​irlied> Some NVIDIA module load script
21:45 fdobridge_: <!​DodoNVK (she) 🇱🇹> I think I got the same issue with the cache cleared so 🤷‍♀️
21:45 fdobridge_: <!​DodoNVK (she) 🇱🇹> I only have this with GSP enabled
21:45 fdobridge_: <r​hed0x> yeah i do have gsp enabled
21:52 fdobridge_: <!​DodoNVK (she) 🇱🇹> Can you disable it and see if the issue disappears?
21:53 fdobridge_: <g​fxstrand> Weird that it's all at the same address.
21:53 fdobridge_: <g​fxstrand> And those are reads, not writes
21:54 fdobridge_: <gfxstrand> Okay, ESO run is done. It doesn't look substantially worse than anything else
21:54 fdobridge_: <g​fxstrand> I'm going to start typing a DMA streamer
22:01 fdobridge_: <S​id> ok I'm tired I'm gonna call it a night
22:01 fdobridge_: <S​id> 0331 where I am...
22:02 fdobridge_: <g​fxstrand> @karolherbst We have 9039 on all the queues, right?
22:03 fdobridge_: <k​arolherbst🐧🦀> yes
22:03 fdobridge_: <k​arolherbst🐧🦀> copy _might_ be special here
22:05 fdobridge_: <r​edsheep> Now that you mention it I have this too. There's a blinking cursor in the top left and that's it, but it's more like 20 seconds for me.
22:07 fdobridge_: <S​id> quick question before I hit the sack, would it be worth bisecting through the whole kernel between rc2 and rc3, or would it be better for me to bisect only the commits on `drivers/gpu/drm/nouveau`
22:08 fdobridge_: <S​id> I'm trying to fix this regression
22:09 fdobridge_: <S​id> logic dictates I should bisect the whole tags, but brain says "no just do the nouveau commits and save yourself work"
22:11 fdobridge_: <redsheep> Binary search being what it is, it's not that many more iterations to bisect everything, right?
22:12 fdobridge_: <r​edsheep> Are you using git bisect?
22:12 fdobridge_: <S​id> yup
22:12 fdobridge_: <S​id> on drivers/gpu/drm/nouveau it's only 11 commits
22:13 fdobridge_: <S​id> but across the tag it's surely anywhere between 100-300
22:13 fdobridge_: <S​id> maybe more
22:13 fdobridge_: <r​edsheep> So 3-4 iterations there then vs probably like 8
22:14 fdobridge_: <S​id> yeah :)
22:14 fdobridge_: <S​id> and if I'm not convinced with my result on nouveau subtree, I can always bisect between those two commits again
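For reference, a path-limited bisect like the one discussed above might look like this; the good/bad tags and the test step are assumptions:
```
# only visit commits that touch the nouveau driver between the two RCs
git bisect start v6.7-rc3 v6.7-rc2 -- drivers/gpu/drm/nouveau

# at each step: build, boot, try to reproduce the job timeout, then mark it
git bisect good    # or: git bisect bad
git bisect reset   # when finished
```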