IRC Logs of #nouveau on irc.freenode.net for 2025-10-13

01:14 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1427102091826036767/image.png?ex=68eda3e4&is=68ec5264&hm=b2812b3070cf3dd0edde4d6bcb8846cfe0cfa00fdf53e49338ec9be0c789d856&
01:14 mangodev[d]: uh oh
01:15 mangodev[d]: i'll test on official to be sure, but the official discord client runs *horridly* on my system
01:15 mangodev[d]: i think my cpu just can't handle it
01:19 mangodev[d]: the driver generally doesn't like animated media in chromium though
01:19 mangodev[d]: gives some warnings about an invalid mailbox every time a page loads with a gif/animated webp
01:42 gfxstrand[d]: Is that with Zink?
01:45 gfxstrand[d]: Looks like an OOB draw
01:52 dwfreed: tangentially, do note that using anything other than the official discord client is a violation of discord's terms of service
03:26 mangodev[d]: gfxstrand[d]: both with and without (`--enable-features=Vulkan`)
03:27 mangodev[d]: dwfreed: i'm aware
03:27 mangodev[d]: it's seldom enforced, though if it changes such that it is, maybe it'd make some users healthier :P
03:30 mangodev[d]: i'm on linux (as most people here likely are), and the official client is total ass tbh
03:30 mangodev[d]: lags like hell and lacks client-side window decorations (despite the app being designed with it in mind)
03:30 mangodev[d]: although *every* alternative has its own downsides
03:31 mangodev[d]: i wish there was a frequently-used irc-forum-ish platform like discord that is used among open source peoples
03:31 mangodev[d]: not laggy like discord
03:31 mangodev[d]: not always-on like IRC
03:31 mangodev[d]: and not stupidly janky like matrix
03:33 mangodev[d]: IRC is nice for active chats, but i check this channel at certain points in the day, and keep my pc off when i'm away
03:33 mangodev[d]: and going through the paginated chat logs sorted by the time of day of the server is a pain
03:33 mangodev[d]: especially replying to those messages
03:33 mangodev[d]: or referencing older ones
03:35 mangodev[d]: though i've heard of some open source communities (both the rust and zig programming languages) moving to zulip, which from what i've heard, is effectively an open-source themed discord clone
03:35 mangodev[d]: maybe the grass is greener over there than in discord chat bridge land
06:28 airlied[d]: I'm pretty sure NVIDIA just ignore robustness on cmat load/stores
09:25 bigsmarty[d]: mangodev[d]: Are you using Wayland?
09:27 bigsmarty[d]: try running it like this:
09:27 bigsmarty[d]: discord --ignore-gpu-blocklist --disable-features=UseOzonePlatform --enable-features=VaapiVideoDecoder --use-gl=desktop --enable-gpu-rasterization --enable-zero-copy
14:25 jja2000[d]: gfxstrand[d]: tried your current branch on gp10b, vulkaninfo works, other things like vkcube and glxinfo (zink) still blow up on the error messages I got before having to do with memory (and coherency)
14:26 jja2000[d]: Unless I'm missing something, that should be the current state right?
14:27 gfxstrand[d]: It might still be disabling coherency.
14:27 jja2000[d]: https://discord.com/channels/1033216351990456371/1034184951790305330/1367211282419286158 referring to this
14:28 gfxstrand[d]: Oh, I haven't tried anything with window systems or Zink yet.
14:28 gfxstrand[d]: I don't think Zink supports non-coherent maps
14:28 jja2000[d]: Aha, alright, that explains a lot hahahahaha
14:28 chikuwad[d]: hmm how do I replace an intrinsic type
14:29 jja2000[d]: Anything else I can test?
14:29 chikuwad[d]: say, for example, if I wanna replace an fadd with an iadd
14:30 gfxstrand[d]: Typically float vs. int is a different opcode
14:31 gfxstrand[d]: jja2000[d]: I've been running the CTS but that's a pain to build unless you've got a fast arm box somewhere or a good cross-build setup.
14:33 chikuwad[d]: ..wait I might be being dumb, there's no need to pack the f16v2 into a 32-bit scalar
14:33 chikuwad[d]: I think I can just run F32 atomics on each component separately?
14:34 chikuwad[d]: wait no the bit layouts will be different
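The bit-layout point above can be illustrated with a minimal Python sketch. This is not NIR or NAK code, and the helper names are hypothetical, made up for this example. It packs two f16 values into one 32-bit word and shows that an F32 add on that word is not a component-wise f16 add.

```python
import struct

def pack_f16v2(lo: float, hi: float) -> bytes:
    """Pack two half-precision floats into 4 little-endian bytes."""
    return struct.pack("<2e", lo, hi)

def as_f32(word: bytes) -> float:
    """Reinterpret the same 4 bytes as one single-precision float."""
    return struct.unpack("<f", word)[0]

def as_f16v2(word: bytes) -> tuple:
    """Reinterpret 4 bytes as two half-precision floats."""
    return struct.unpack("<2e", word)

packed = pack_f16v2(1.0, 2.0)
bits = struct.unpack("<I", packed)[0]

# The packed word read as an f32 is neither 1.0 nor 2.0.
reinterpreted = as_f32(packed)

# An f32 add on the packed word, viewed again as two halves.
summed = as_f16v2(struct.pack("<f", reinterpreted + reinterpreted))

# What a per-component f16 add would be expected to give.
componentwise = (1.0 + 1.0, 2.0 + 2.0)

print(hex(bits), reinterpreted, summed, componentwise)
```

The two results differ because the f32 exponent and mantissa fields straddle the two f16 lanes, so a float add on the whole word only makes sense if the hardware treats the lanes separately.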
14:35 jja2000[d]: gfxstrand[d]: Hmm, well could give it a try on my laptop. Should be quite a bit quicker. Any specific settings I should test for comparison with gm20b?
14:36 gfxstrand[d]: What does your laptop have in it?
14:38 jja2000[d]: Snapdragon 8cx Gen 3 (sc8280xp)
14:39 jja2000[d]: Compile on one, transfer to the other. Any shared libs should work too, running the same distro.
14:42 jja2000[d]: (hopefully)
15:20 Lyude: tzimmermann: Reviewed-by: Lyude Paul <lyude@redhat.com>
15:25 jja2000[d]: jja2000[d]: Started building a minute ago, let's see
15:31 jja2000[d]: gfxstrand[d]: just for clarity, debug target for deqp-vk should be okay right? Or is the release version fine (may shave some compile time lol). I'm not too acquainted with this stuff yet
15:40 mohamexiety[d]: debug only if you want to be able to debug it. Otherwise on weak CPUs you should generally prefer release
15:40 mohamexiety[d]: Should shave some compile time and also runtime for the CTS
15:42 jja2000[d]: Oh it's already done
15:42 jja2000[d]: lol
15:42 jja2000[d]: 4xX1+4xA78 with a sizeable enough power budget vs a handful of A57s huh?
15:43 jja2000[d]: Good thing I upped the jobs otherwise it would've taken a while. I'll recompile the release version
15:46 snowycoder[d]: Taken from open-gpu-docs:
15:46 snowycoder[d]: > Number of NOPs for self-modifying gpfifo
15:46 snowycoder[d]: Oh god, why would you want a self-modifying command buffer
16:05 tzimmermann: Lyude, thanks
16:14 jja2000[d]: deqp-vk seems to run okay after transferring, do I just run the application with the options listed in the vulkancts readme then? Or is there an option to show the failed/passed amount like I've seen here before?
16:16 chikuwad[d]: the failed/passed shows at the end of a run
16:17 mhenning[d]: I typically run cts using `deqp-runner`, which does some nice things like parallelizing runs (and will also print out the pass/fail count as you go)
16:17 mhenning[d]: https://crates.io/crates/deqp-runner
16:18 marysaka[d]: snowycoder[d]: well usually you do not modify current command buffer but next one, you can specify that you don't want to prefetch the command buffer
16:19 mhenning[d]: yeah, we do that for indirect dispatch. there's a compute shader that generates a command buffer that we'll run in the future
16:25 jja2000[d]: chikuwad[d]: Ah, well that'll take a while on this hw I guess.
16:25 jja2000[d]: mhenning[d]: Thanks for the suggestion! Will check it out.
16:26 snowycoder[d]: mhenning[d]: That makes a lot of sense now, thx!
16:49 jja2000[d]: `Pass: 9067, Fail: 70, Crash: 6, Skip: 11205, Flake: 152, Duration: 7:26, Remaining: 17:06:12`
16:49 jja2000[d]: It's doing things
16:50 marysaka[d]: jja2000[d]: `sudo sysctl -w kernel.core_pattern=/bin/false` because those coredumps will take a while
16:50 marysaka[d]: might speed that up a bit
16:51 jja2000[d]: Yeah I need to open it in a screen window too since I'm not leaving my laptop on for that long
16:51 jja2000[d]: So I'll do that real quick
16:52 jja2000[d]: `NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1 NOUVEAU_USE_ZINK=1 meson devenv deqp-runner run --deqp ~/VK-GL-CTS/build/external/vulkancts/modules/vulkan/deqp-vk --caselist ~/VK-GL-CTS/external/vulkancts/mustpass/main/vk-default.txt --output new-run -- --deqp-visibility=hidden`
16:52 jja2000[d]: Just to be sure, running that atm.
16:52 HdkR: coredumps on my Orin took /ages/ because of the slow mmc so I also usually just disabled it there. :D
16:52 jja2000[d]: Yeah this is on SD lmao, even worse
17:09 gtx1650user: Do Turing's power management issues under nvidia-open affect Nouveau?
17:15 mhenning[d]: what power management issues?
17:18 gtx1650user: When nvidia-open originally released, it didn't work well on Turing I believe due to power management IIRC.
17:20 chikuwad[d]: it still doesn't
17:20 chikuwad[d]: no RtD3 on GSP for turing
17:21 chikuwad[d]: not sure about how it is on nouveau though
18:05 snowycoder[d]: mhenning[d]: Related to raw-access-chain, is the issue related to CTS not respecting this bit of the spec?:
18:05 snowycoder[d]: > Result type must be an OpTypePointer. Its Storage Class operand must be the same as the Storage Class of Base
18:05 mhenning[d]: snowycoder[d]: Yes, that's one of the issues
18:06 mhenning[d]: Another is that CTS omits alignment info where the spec requires it
18:06 mhenning[d]: Another is that CTS assumes out-of-bounds reads will be zero but the spec doesn't require that
18:06 snowycoder[d]: Oh, that is fun
18:07 mhenning[d]: Yeah, those are just the three that I found almost immediately
18:08 mohamexiety[d]: what would KHR_shader_fma entail for us? beyond the spec bump for Vk and SPIRV, and then adding the opcodes to spir-v, is there anything we would need to do in nir or nak?
18:08 mhenning[d]: For us? Not really, we just need to make sure nothing splits the fma in nir
18:09 mhenning[d]: There's some discussion around requirements for other hardware on the issue tracker though
18:10 mohamexiety[d]: mhenning[d]: how would that be guaranteed? :thonk:
18:10 mhenning[d]: by not having any optimization that does that
18:10 mhenning[d]: I'm not sure that we do
18:12 mohamexiety[d]: I guess this should work by itself with the spir-v opcode
18:13 mohamexiety[d]: so we just have to wait for someone to add an ffma to nir, and then implementing the ext shouldn't need anything extra on top of that
18:15 mhenning[d]: Yeah, once the nir part is sorted out it might just be flipping a boolean on for us.
18:16 mohamexiety[d]: yep. will look into other stuff then. thanks!
18:55 mohamexiety[d]: compression passes all CTS without issue except synchronization.* and synchronization2.*
18:55 mohamexiety[d]: not sure why these are broken tbh but my suspicion is we're compressing something that shouldn't be compressed:
18:55 mohamexiety[d]: <Text>First different byte at offset: 96</Text>
18:55 mohamexiety[d]: <Text>Expected data: (... 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7,
18:55 mohamexiety[d]: 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7,
18:55 mohamexiety[d]: 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5...)</Text>
18:55 mohamexiety[d]: <Text>Actual data: (... 2, 3, 5, 7, 11, 13, 17, 19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 0, 0, 0, 0, 0, 0, 0,
18:55 mohamexiety[d]: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
18:55 mohamexiety[d]: 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 2, 3, 5...)</Text>
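The expected data in the dump above is the byte pattern 2, 3, 5, ..., 31 repeated, and the actual data has zeroed stretches in it. As a rough sketch of how such a mismatch could be located offline, the Python below rebuilds the pattern, zeroes a region, and reports the first differing offset and the zero runs. The buffer length and the 32-byte zeroed region are assumptions chosen for illustration, not values taken from the CTS source.

```python
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]

def expected_pattern(length):
    """Repeat the prime sequence seen in the log to fill `length` bytes."""
    return bytes(PRIMES[i % len(PRIMES)] for i in range(length))

def first_difference(expected, actual):
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    return None

def zero_runs(data):
    """Return (start, length) for every run of zero bytes in `data`."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if b == 0 and start is None:
            start = i
        elif b != 0 and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(data) - start))
    return runs

expected = expected_pattern(256)      # hypothetical buffer size
actual = bytearray(expected)
actual[96:128] = bytes(32)            # hypothetical zeroed region

print(first_difference(expected, actual))
print(zero_runs(actual))
```

With offset 96, the expected byte is `PRIMES[96 % 11]`, which is 23. That agrees with the log, where the actual data goes to zero right after 19.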
19:09 mhenning[d]: does inserting stronger synchronization help?
19:14 gfxstrand[d]: I suspect it's because the VA we allocate on import is wrong.
19:15 gfxstrand[d]: mohamexiety[d]: and I have been DMing
19:16 mohamexiety[d]: yeah I didn't really touch the import path at all
19:17 jja2000[d]: `Pass: 184014, Fail: 1536, Crash: 121, Warn: 2, Skip: 232108, Flake: 3719, Duration: 2:22:53, Remaining: 13:42:46` hm
19:18 jja2000[d]: about 3 times the amount of fails as on gm20b I guess?
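For comparing runs like the one above, the status line can be turned into numbers. The Python below is a minimal sketch that assumes the line keeps the `Name: value, ...` shape shown in the log. `parse_status` is a hypothetical helper, not part of deqp-runner, and treating every non-skipped test as "executed" is a simplification.

```python
def parse_status(line):
    """Extract the integer counters from a status line, ignoring time fields."""
    counts = {}
    for field in line.split(", "):
        name, _, value = field.partition(": ")
        if value.isdigit():           # skips Duration/Remaining (h:mm:ss)
            counts[name] = int(value)
    return counts

line = ("Pass: 184014, Fail: 1536, Crash: 121, Warn: 2, Skip: 232108, "
        "Flake: 3719, Duration: 2:22:53, Remaining: 13:42:46")

counts = parse_status(line)
total = sum(counts.values())
executed = total - counts["Skip"]     # assumption: everything not skipped ran
fail_share = counts["Fail"] / executed

print(counts)
print(f"{total} tests so far, {fail_share:.2%} of executed tests failed")
```

A ratio like this is easier to compare between a full run and a partial run than the raw fail count, since the raw count scales with how much of the caselist was executed.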
20:13 gfxstrand[d]: Yeah, that's a bit worse than I was seeing.
20:15 mohamexiety[d]: gfxstrand[d]: following through stuff, it looks like the import/export path deals directly with the memory object so shouldn't this have all the proper info? :thonk:
20:20 mohamexiety[d]: since everything is derived directly from the memory object it should have all the correct size, alignment, etc... though it doesn't actually propagate the pte_kind?
20:38 jja2000[d]: gfxstrand[d]: I'll check tomorrow what the full result is and can send any logs if you want.
20:42 gfxstrand[d]: Oh, note that I was doing a 10% run so if you're seeing 10x the fails, that's normal.
20:54 jja2000[d]: gfxstrand[d]: Oh lol, yeah this is the entire vk-default run.
20:55 jja2000[d]: Maybe around 6.18 it'll be quicker... gpu devfreq is in drm-misc-next atm
21:18 gfxstrand[d]: Unlikely. It's mostly CPU bound. Those poor A57 cores...
21:31 jja2000[d]: Oh, well it's 2 denver2's and 4 a57's this time. 2 whole weird and very hot translation cores extra!
21:32 steel01[d]: Are they all actually up and running? I had to do some fixes and re-fixes on cpufreq for t186.
21:37 jja2000[d]: Yes, but this is without a couple of your fixes to get actual proper scaling
21:40 steel01[d]: https://lore.kernel.org/r/20250828-tegra186-cpufreq-fixes-v3-0-23a7341db254@gmail.com
21:40 steel01[d]: The site is failing to load for me atm, though. But without this, the denver cores fail to hotplug at all. Plus I did a dumb, where it'd only scale the first core in a cluster, and this has a fix for that too. So if you've just got my first commit to share policy within a cluster, then... yeah, the cpu scaling would be way screwed.
21:48 mohamexiety[d]: and there we go, synchronization.*:
21:48 mohamexiety[d]: DONE!
21:48 mohamexiety[d]: Test run totals:
21:48 mohamexiety[d]: Passed: 22458/60339 (37.2%)
21:48 mohamexiety[d]: Failed: 0/60339 (0.0%)
21:48 mohamexiety[d]: Not supported: 37881/60339 (62.8%)
21:48 mohamexiety[d]: Warnings: 0/60339 (0.0%)
21:48 mohamexiety[d]: Waived: 0/60339 (0.0%)
21:48 mohamexiety[d]: running synch2.* and will see
21:50 jja2000[d]: steel01[d]: Ah, then it's broken yeah
21:51 jja2000[d]: It's just the "vanilla" Fedora distro kernel which is at 6.16
21:55 mohamexiety[d]: synch2
21:55 mohamexiety[d]: DONE!
21:55 mohamexiety[d]: Test run totals:
21:55 mohamexiety[d]: Passed: 28996/70726 (41.0%)
21:55 mohamexiety[d]: Failed: 0/70726 (0.0%)
21:55 mohamexiety[d]: Not supported: 41730/70726 (59.0%)
21:55 mohamexiety[d]: Warnings: 0/70726 (0.0%)
21:55 mohamexiety[d]: Waived: 0/70726 (0.0%)
21:55 mohamexiety[d]: will clean up, apply mel's review notes (thanks so much for the review! <3) and then run another big run tomorrow
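As a quick sanity check on the two result summaries above, the percentages can be recomputed from the raw counts. This is a small Python sketch. The dictionary layout and the `percent` helper are made up for illustration, and only the numbers come from the log.

```python
def percent(part, total):
    """Percentage rounded to one decimal place, as in the CTS summary."""
    return round(100.0 * part / total, 1)

# Counts copied from the two "Test run totals" blocks in the log.
runs = {
    "synchronization.*":  {"passed": 22458, "not_supported": 37881, "total": 60339},
    "synchronization2.*": {"passed": 28996, "not_supported": 41730, "total": 70726},
}

for name, r in runs.items():
    # With zero failures, warnings and waivers, passed + not supported
    # should account for every test in the group.
    accounted = r["passed"] + r["not_supported"]
    print(name,
          f"passed {percent(r['passed'], r['total'])}%",
          f"not supported {percent(r['not_supported'], r['total'])}%",
          f"accounted {accounted}/{r['total']}")
```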
21:56 jja2000[d]: I'll have to wait until 6.18 drops into the repos, may just try the rawhide kernel I guess
22:28 mhenning[d]: mohamexiety[d]: What did the issue end up being?
22:32 mohamexiety[d]: mhenning[d]: We weren’t passing in the pte_kind with imports. So one side was having a compressed pte kind and the other side was having a non compressed/different one. Since ultimately the kind is what enables/disables compression it was causing a mismatch
22:32 mhenning[d]: makes sense
22:41 mhenning[d]: gfxstrand[d]: Is there anything blocking https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36692 or https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37536 ? From here, they look ready to land.
22:42 gfxstrand[d]: Not really. I think the first one was waiting on me to run it on something pre-Turing. The other is just me not getting back to it.
22:43 kicchou: i just [re]submitted a patch to the nouveau ML and the cover letter isn't showing up on patchwork
22:45 kicchou: idk what i'm doing wrong
22:49 mhenning[d]: kicchou: Is that the "implement missing DCB connector types" series? I don't see the cover letter in the moderation queue
22:51 kicchou: yeah the v2 of that
22:52 mhenning[d]: Maybe check that you're sending to the right addresses?
22:53 mhenning[d]: Also I think patches are normally supposed to CC the maintainers but the ones on the list don't show that
22:54 kicchou: git send-email says the cover letter, part 1, and part 2 all sent with '250' results to nouveau@lists.freedesktop.org and are cc'ed to my personal email
22:54 kicchou: noted with regard to the maintainers
22:54 mhenning[d]: Hmm not sure what's wrong then
22:57 kicchou: my name has UTF-8 characters; should i try again with just ASCII?
22:59 mhenning[d]: I'm not sure why that would affect only the cover letter
23:03 kicchou: fair point
23:09 kicchou: ig i'll resend a month from now with lyude and dakr cc'd
23:09 kicchou: i changed my git name just in case