IRC Logs of #nouveau on irc.freenode.net for 2025-04-20

00:04 gfxstrand[d]: I seem to recall a2lp causing problems on other hardware if you get it wrong, too.
03:06 georgehasrisen: You are the most fantastically popular fecalists, liquid shit LGBT abuse donkeys in the world, you come begging for forgiveness from me or Trump and Musk will put you away for good for the rest of your life. Such nonsense ponydonkeys harassed me on vacation fabricating issues and complaints on me. You never understood much btw. I tried to help you but it seems your abuse only goes to
03:06 georgehasrisen: bigger heights at which point i can not anymore properly save your sick asses.
03:54 gfxstrand[d]: Sometimes I don't even know...
04:12 mangodev[d]: gfxstrand[d]: are the bots getting to this channel too? i refuse to believe these are real people
04:18 mangodev[d]: i'm curious how some have gotten successful 32 bit nvk builds
04:19 mangodev[d]: i think the `lib32-mesa-git` aur package is a little too outdated to try modifying to support nvk
04:19 mangodev[d]: would it be a better idea to try modifying the regular `mesa-git` package to support a 32 bit build? it already has the build prerequisites for nvk set up
04:34 gfxstrand[d]: I could throw you my cross file some time.
04:34 gfxstrand[d]: It's not too hard
04:34 mangodev[d]: ah alr
04:35 mangodev[d]: the only changes i think i need to make are to
04:35 mangodev[d]: - include the custom `llvm32` lib
04:35 mangodev[d]: - set rust build target
04:35 mangodev[d]: - …voila?
04:40 HdkR: Biggest issue with 32-bit builds is that the LLVM dev packages for i386 on Debian and Ubuntu have been broken for like a year.
04:41 HdkR: ...and rebroken
04:42 mangodev[d]: moment of truth, attempting to build
04:42 mangodev[d]: uh oh
04:42 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1363374314136539136/image.png?ex=6805ccc6&is=68047b46&hm=525f284b0bccb309ab17ed2a1e93008c8af424cbf84848897f6d940041124daa&
04:42 mangodev[d]: something happened alright
04:48 mhenning[d]: I haven't seen that issue before but the arch official package is a good resource for building 32-bit mesa https://gitlab.archlinux.org/archlinux/packaging/packages/lib32-mesa/-/blob/main/PKGBUILD?ref_type=heads
04:54 HdkR: Those are definitely x86-64 register names rather than i386
04:54 HdkR: Sounds like it got compiled for x86-64 but linked for i386?
04:54 HdkR: Probably missing a -m32 in the compile flags or something
04:55 HdkR: export BINDGEN_EXTRA_CLANG_ARGS="-m32"
04:55 HdkR: Is in that file
04:57 mangodev[d]: HdkR: that's one of the things i was missing
04:57 mangodev[d]: the other was `--cross-file lib32`
04:57 mangodev[d]: ig that's what gfxstrand meant by cross file?
04:58 HdkR: yes
04:58 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1363378313225306314/image.png?ex=6805d080&is=68047f00&hm=1f9827c903ca2f379905c99431c12ad77d86b730e2358833ac30e4b82752b5e9&
04:58 mangodev[d]: the euphoria
05:11 mangodev[d]: i wonder
05:11 mangodev[d]: any wayland compositors known to work better under nvk? does native vulkan wsi work better than using an opengl compositor?
05:14 mangodev[d]: i'd think that native vulkan wayland would run better than doing it through a translation layer
05:14 mangodev[d]: maybe less bugs too?
05:15 mangodev[d]: currently on kde, and it works *eh*
05:15 mangodev[d]: very susceptible to system stutter, and moving my mouse between monitors causes a system-wide lag spike
06:34 orowith2os[d]: mangodev[d]: so far the only Wayland compositors that support Vk are going to be smithay and wlroots
06:36 orowith2os[d]: Kwin has plans (see https://invent.kde.org/plasma/kwin/-/issues/169)
06:36 mangodev[d]: orowith2os[d]: have they worked any better than gl compositors?
06:36 orowith2os[d]: That's very much a It Depends
06:36 mangodev[d]: i've been curious about trying a wlroots based desktop like labwc because it has been around a little longer than smithay
06:36 orowith2os[d]: You can give them a shot and see
06:37 orowith2os[d]: I know Mutter doesn't have any plans at all, at least not without porting clutter to Vk too (I should know, I filed the issue myself)
06:38 orowith2os[d]: https://gitlab.gnome.org/GNOME/mutter/-/issues/2606
06:38 mangodev[d]: ~~sway :blobcatnotlikethis:~~
06:44 orowith2os[d]: I do know wlroots had problems with screen capture on their Vulkan backend
06:44 orowith2os[d]: I'm not sure if that was ever fixed
06:44 mangodev[d]: i mean
06:44 mangodev[d]: better than smithay still having problems existing
06:44 mangodev[d]: ¯\_(ツ)_/¯
06:45 orowith2os[d]: I would really like to comment on smithay, but I haven't touched it since my compositor can't make use of it due to cross-platform needs
06:46 mangodev[d]: orowith2os[d]: wdym by "cross-platform needs"
06:46 tiredchiku[d]: BSD, probably
06:46 orowith2os[d]: Non-X11/Wayland windowing systems
06:46 tiredchiku[d]: ah
06:47 orowith2os[d]: Smithay should work fine on bsd, I think, but I haven't tried. Someone definitely has.
06:47 orowith2os[d]: Windows and macOS, though...
06:47 orowith2os[d]: That's the kicker :P
06:48 mangodev[d]: orowith2os[d]: …flinger? dwm? quartz?
06:48 orowith2os[d]: Yeah
06:49 mangodev[d]: those are the only others i can think of
06:49 tiredchiku[d]: arcan
06:49 mangodev[d]: is that the haiku one?
06:49 tiredchiku[d]: https://github.com/letoram/arcan
06:49 tiredchiku[d]: mangodev[d]: no
06:49 orowith2os[d]: It's that Xorg-lookalike, isn't it?
06:50 orowith2os[d]: :monad:
08:55 snowycoder[d]: gfxstrand[d]: how did you resolve `ILLEGAL_SPH_INSTR_COMBO`?
14:02 gfxstrand[d]: I haven't yet. Not fully, anyway.
14:02 gfxstrand[d]: But I got rid of most of them by setting the FP64 bit
14:05 gfxstrand[d]: I might be missing an FP64 instruction in my is_fp64() helper
14:07 snowycoder[d]: gfxstrand[d]: It's a header flag? Makes sense
14:09 karolherbst[d]: gfxstrand[d]: didn't you add those conversion ones?
14:09 karolherbst[d]: in codegen any 64-bit conversion involving floats is considered fp64
14:12 gfxstrand[d]: I missed frnd
14:12 gfxstrand[d]: I got the actual conversions. But frnd is also a conversion.
14:15 tiredchiku[d]: ~~damn I miss my frnd too~~ :myy_TinyGiggle:
14:17 karolherbst[d]: heh
14:54 gfxstrand[d]: Okay, I've added `OpFRnd` to `is_fp64()` and fixed legalization of `dfma`. Let's see if that gets rid of the last of the `ILLEGAL_SPH_INSTR_COMBO`. I kinda doubt it but I'm happy to let the machine run for a couple hours.
14:54 gfxstrand[d]: I find it a little unlikely that a test would use ONLY `OpFRnd` and no other fp64 ops.
14:54 gfxstrand[d]: I guess you could if you just loaded a bunch of stuff from SSBOs, rounded it and then stored it again.
15:10 gfxstrand[d]: gfxstrand[d]: I also added some PID prints so hopefully I can track where the exceptions are coming from if they still happen.
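The `is_fp64()` fix discussed above can be sketched roughly as follows. This is a minimal illustration, not NAK's actual code: the `Op` and `FloatType` enums and the `shader_uses_fp64()` function are hypothetical stand-ins, and the only thing taken from the discussion is the idea that `frnd` counts as a conversion and must mark the shader as using FP64 when it touches a 64-bit float.

```rust
// Hypothetical, heavily simplified op set. NAK's real instruction
// representation is much larger and structured differently.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FloatType {
    F32,
    F64,
}

#[derive(Clone, Copy, Debug)]
enum Op {
    DAdd,
    DFma,
    F2F { src: FloatType, dst: FloatType },
    F2I { src: FloatType },
    I2F { dst: FloatType },
    // Rounding is also a conversion, so it needs the same check.
    FRnd { src: FloatType, dst: FloatType },
    IAdd,
}

/// Returns true if the op reads or writes a 64-bit float.
fn is_fp64(op: &Op) -> bool {
    match *op {
        // Double-precision arithmetic is always FP64.
        Op::DAdd | Op::DFma => true,
        // Conversions (including rounding) count if either side is F64.
        Op::F2F { src, dst } | Op::FRnd { src, dst } => {
            src == FloatType::F64 || dst == FloatType::F64
        }
        Op::F2I { src } => src == FloatType::F64,
        Op::I2F { dst } => dst == FloatType::F64,
        Op::IAdd => false,
    }
}

/// Assumed use: decide whether the shader header's FP64 bit should be set.
fn shader_uses_fp64(ops: &[Op]) -> bool {
    ops.iter().any(is_fp64)
}

fn main() {
    // A shader that only loads, rounds, and stores doubles.
    let ops = [
        Op::IAdd,
        Op::FRnd { src: FloatType::F64, dst: FloatType::F64 },
    ];
    println!("needs FP64 header bit: {}", shader_uses_fp64(&ops));
}
```

The second case in the `match` is the one that matters here: a shader that only rounds doubles would otherwise never set the flag.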
15:52 karolherbst[d]: I was reading up on some nv50 code, and I found something interesting. On nv50 you could preload parts of shared memory when launching compute shaders. No idea if it's actually useful for anything, given you have const buffers...
15:52 karolherbst[d]: I wonder if newer gens still have it...
15:53 karolherbst[d]: well.. maybe clearing it all to 0?
17:16 gfxstrand[d]: Clearing to 0 might be useful
17:21 karolherbst[d]: I'm surprised we even got the compute class header for tesla...
17:21 karolherbst[d]: `NV50C0_SET_PARAMETER_SIZE` mhhh
17:22 karolherbst[d]: I don't think it exists anymore
17:37 gfxstrand[d]: gfxstrand[d]: Well, that was it.
17:37 gfxstrand[d]: Pass: 965245, Fail: 102215, Crash: 20, Skip: 1676519, Flake: 2, Duration: 2:20:24, Remaining: 0
17:37 gfxstrand[d]: and dmesg is 100% clean.
17:40 asdqueerfromeu[d]: karolherbst[d]: ~~Don't give 🔺 ideas~~
17:55 gfxstrand[d]: Why does `shfl.up` not work?!? `shfl.down` works fine
18:29 gfxstrand[d]: Okay, so it works if I use the immediate form of 0 but not rz?!?
18:29 gfxstrand[d]: Hardware is cursed...
18:31 karolherbst[d]: huh?
18:31 karolherbst[d]: weird
18:33 gfxstrand[d]: Yeah, IDK what's going on
18:35 snowycoder[d]: gfxstrand[d]: What does it do with rz?
18:35 gfxstrand[d]: This is fine: `shfl.up pt, r5, r5, r3, 0x0`
18:35 gfxstrand[d]: This is not: `shfl.up pt, r5, r5, r3, rz`
18:36 karolherbst[d]: ....
18:37 karolherbst[d]: mhhhhh
18:37 karolherbst[d]: sooooo
18:37 karolherbst[d]: weird
18:38 karolherbst[d]: I wonder if codegen does any lowering...
18:38 snowycoder[d]: maybe nvdisasm sees it as a register but the hardware still loads an immediate?
18:39 karolherbst[d]: that would be silly
18:39 gfxstrand[d]: My understanding is that rz is special-cased all over the hardware. There are misc. instructions that just don't handle it. I suspect SM20 `shfl` is one such instance.
18:39 karolherbst[d]: does a register with the value 0 work?
18:40 karolherbst[d]: is this kepler2 or the fermi ISA btw?
18:41 karolherbst[d]: tho shfl was new with kepler, could have screwed it up
18:41 gfxstrand[d]: Kepler A
18:42 karolherbst[d]: yeah.. so the first hw with shfl
18:42 gfxstrand[d]: Even on Turing, though, there are some instructions that just don't like rz for $reasons
18:42 karolherbst[d]: it's not like we made heavy use of shfl in codegen in the first place, so all bets are off anyway 😄
18:43 karolherbst[d]: I mean.. aren't they all instructions taking vector inputs, or also scalars?
18:43 karolherbst[d]: though... I kinda hoped they'd document it 😄
18:44 karolherbst[d]: gfxstrand[d]: do you have a list handy of instructions that are impacted by that?
18:45 karolherbst[d]: also do you see the same behavior on the 2nd source of SHFL?
18:45 karolherbst[d]: if so.. I might be able to tell which instructions are impacted
18:47 gfxstrand[d]: karolherbst[d]: Not that I can remember immediately. Maybe the data param of `st`.
18:48 karolherbst[d]: mhhh
18:49 karolherbst[d]: also for 32 bit stores?
18:49 gfxstrand[d]: yup
18:49 gfxstrand[d]: Also maybe the data source of `shfl`?
18:52 karolherbst[d]: then I have no idea
18:53 karolherbst[d]: well...
18:53 karolherbst[d]: could simply be that if it's not fetched as a trivial 32-bit value, it might not work
18:53 karolherbst[d]: which for shfl would be the 2nd and 3rd source
18:54 gfxstrand[d]: :shrug_anim:
18:55 gfxstrand[d]: But we only ever put immediates there in NAK so meh.
18:55 gfxstrand[d]: If we ever try to implement CUDA *and* decide we care about Kepler A, I can figure it out later.
18:56 gfxstrand[d]: For now I'll just use a 0 immediate
19:08 gfxstrand[d]: Sorted (well enough): https://gitlab.freedesktop.org/mesa/mesa/-/commit/cd953a7dfa3fca2e95af858e9506eda2a570e6bb
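The workaround described above (emit an immediate 0 instead of `rz` in the last `shfl` source) could look something like this sketch. The `Src` and `ShflUp` types and the `avoid_rz()` and `legalize_shfl()` functions are hypothetical names made up for illustration. They are not taken from NAK or from the linked commit, and the sketch assumes only the final operand needs rewriting.

```rust
use std::fmt;

// Hypothetical source operand: a general register, the zero register,
// or a 32-bit immediate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Src {
    Reg(u8),
    Zero,
    Imm(u32),
}

impl fmt::Display for Src {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Src::Reg(n) => write!(f, "r{}", n),
            Src::Zero => write!(f, "rz"),
            Src::Imm(v) => write!(f, "{:#x}", v),
        }
    }
}

// Hypothetical model of `shfl.up pt, dst, data, lane, c`.
struct ShflUp {
    dst: u8,
    data: Src,
    lane: Src,
    c: Src,
}

impl fmt::Display for ShflUp {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "shfl.up pt, r{}, {}, {}, {}",
            self.dst, self.data, self.lane, self.c
        )
    }
}

/// Replace the zero register with an equivalent immediate 0.
fn avoid_rz(src: Src) -> Src {
    match src {
        Src::Zero => Src::Imm(0),
        other => other,
    }
}

/// Assumption: only the last source misbehaves with rz, as in the
/// example from the discussion.
fn legalize_shfl(shfl: &mut ShflUp) {
    shfl.c = avoid_rz(shfl.c);
}

fn main() {
    let mut shfl = ShflUp {
        dst: 5,
        data: Src::Reg(5),
        lane: Src::Reg(3),
        c: Src::Zero,
    };
    println!("before: {}", shfl);
    legalize_shfl(&mut shfl);
    println!("after:  {}", shfl);
}
```

Both forms mean the same thing to the compiler, so the rewrite is purely about which encoding gets emitted.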
22:09 snowycoder[d]: gfxstrand[d]: should I squash your tex commits into one or should I keep them separate?
23:13 gfxstrand[d]: Tex commits?
23:40 gfxstrand[d]: Woo! Volta is conformant now. That's the last one. I'll merge the MR tomorrow or Tuesday. Why wait? Because there's a blog post to go along with it that I'd like to post first.
23:58 gfxstrand[d]: gfxstrand[d]: My branch only has three commits on top of main. I'm not sure what you're thinking would get squashed.
23:59 snowycoder[d]: gfxstrand[d]: I squashed all commits for texture ops in your kepler-tex into one (for a clearer git history)
23:59 snowycoder[d]: in 10 mins I'll push, I need just a bit of rewording