00:23 snowycoder[d]: gfxstrand[d]: I'm a bit out of the loop but I can help with proc-macros if that's needed
00:32 karolherbst[d]: gfxstrand[d]: I kinda got lucky that for MMA on consumer Ampere, I could use the hopper values and it worked reliably enough...
00:40 orowith2os[d]: marysaka: how's Tegra feeling with NVK lately?
00:59 gfxstrand[d]: Sickly
01:00 gfxstrand[d]: I still need to get a Tegra board up and working so I can hack on it.
01:07 gfxstrand[d]: I've got a TX1 and a Xavier sitting in my pile. I just need to find some free time and get one of them booting.
01:12 gfxstrand[d]: But I'm unlikely to find any such free time for a bit yet
01:12 gfxstrand[d]: Who am I kidding? I never have spare time. 😂
01:30 karolherbst[d]: who needs spare time when you can hack on mesa all day!
01:36 gfxstrand[d]: But mesa needs so much hacking!
04:29 airlied[d]: today seems to be another life distraction, but I think I'll probably generate the blackwell latencies from csv and just import the file for now, and do the proper generation for next release if I can
04:29 airlied[d]: adding the csv->rust translator means a bit of meson fuckery I don't really like
06:02 marysaka[d]: orowith2os[d]: as Faith said, no progress there. I've had no time to set my board back up somewhere to hack on it
06:56 mohamexiety[d]: airlied[d]: didn't Faith already do it last night?
06:56 mohamexiety[d]: https://gitlab.freedesktop.org/gfxstrand/mesa/-/commits/nvk/blackwell-latencies
07:08 rinlovesyou[d]: airlied[d]: this definitely gets rid of those specific errors
07:11 rinlovesyou[d]: now i just need to find out why my entire system freezes up after a few minutes
07:45 marysaka[d]: skeggsb9778: I was trying to understand VMM code last night but I cannot make sense of the alignment currently defined for Pascal+ for large pages https://elixir.bootlin.com/linux/v6.16-rc1/source/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c#L386
07:45 marysaka[d]: Shouldn't that be at the very least 0x1000 like the others or even 64K?
07:48 airlied[d]: marysaka[d]: no you can pack multiple descriptors into one page
07:52 marysaka[d]: airlied[d]: just to be sure the size of a descriptor is 0x8 except for PD0 where it's 0x10?
07:54 airlied[d]: yes I think that is right
07:55 airlied[d]: PGD contains an SPT and LPT ptr I think
08:07 marysaka[d]: from what I'm seeing there shouldn't be a difference on that align field compared to SPT except to differentiate them in nvkm_mmu_ptc_get
08:09 marysaka[d]: would it be fine if I do a series later that moves things to use dev_mmu.h/dev_vm.h headers? kind of started doing that to not get lost while navigating those parts of the code 😅
12:32 gfxstrand[d]: airlied[d]: I already did the meson hacking.
12:32 gfxstrand[d]: I might rewrite it in rust if I'm feeling spicy by :shrug_anim:
12:33 gfxstrand[d]: Or you can push a branch with more CSV and some hacks and I can clean it up.
12:34 mac: hello
13:04 x512[m]: gfxstrand[d]: Blackwell has no PTE flags anymore?
13:12 gfxstrand[d]: x512[m]: Nope. Everything is generic
13:59 mohamexiety[d]: https://arxiv.org/pdf/2507.10789 something here may be useful
13:59 gfxstrand[d]: kar1m0: What do `txq.texture_header_type` and `txq.texture_sampler_pos` return? In all channels?
13:59 mohamexiety[d]: GB203
14:00 gfxstrand[d]: Trying to figure out if it's useful or if I need to poke at descriptors myself.
14:00 gfxstrand[d]: gfxstrand[d]: karolherbst[d] , rather
14:01 kar1m0[d]: gfxstrand[d]: I believe that you have mistaken me for someone else
14:01 gfxstrand[d]: IDK why Discord likes to auto-complete @kar to kar1m0[d]
14:02 kar1m0[d]: gfxstrand[d]: Because my username is kar1m0
14:03 gfxstrand[d]: And I guess 1 does come before o. Damn sorting...
14:04 chikuwad[d]: gfxstrand[d]: curses, tech is working as intended :blobfistshake:
14:09 kar1m0[d]: It's good that you pinged me
14:09 kar1m0[d]: Completely forgot that I was planning to work on a little something
14:09 kar1m0[d]: Work does make you forget things
14:10 kar1m0[d]: chikuwad[d]: have you heard anything about someone working on implementing an output for nouveau? As in like with Nvidia
14:10 kar1m0[d]: Though I doubt
14:10 chikuwad[d]: wdym "an output"
14:10 kar1m0[d]: I just wasn't following for the past month
14:11 kar1m0[d]: chikuwad[d]: Gpu usage vram usage
14:11 chikuwad[d]: ah hwmon
14:11 kar1m0[d]: chikuwad[d]: Yes but it only works with old gpus
14:11 kar1m0[d]: Newer ones don't
14:11 chikuwad[d]: I don't think anyone is working on wiring it up for gsp atm
14:11 kar1m0[d]: Alright
14:19 avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1395409463372021770/image.png?ex=687a57d5&is=68790655&hm=39ff65f77ed46550990b72aa7b390d7aea102387505214e214822f6da945cbea&
14:19 avhe[d]: mohamexiety[d]: 🤔
14:20 mohamexiety[d]: avhe[d]: hm?
14:21 avhe[d]: commented-out line in their conclusion
14:21 mohamexiety[d]: ohh missed that. interesting
14:21 avhe[d]: you can download the tex source from arxiv
14:21 avhe[d]: most authors forget to strip comments
14:21 mohamexiety[d]: yeah I didn't know that's a thing :KEKW:
14:23 karolherbst[d]: gfxstrand[d]: gimme a sec
14:24 karolherbst[d]: texture_header_type: r: 0, g: 0, b: ms count, a: 0
14:24 karolherbst[d]: texture_sampler_pos: r: sampler pos in (4.12, 4.12) format, g: 0, b: 0, a: 0
14:25 karolherbst[d]: Some of those channels are reserved for future use
14:35 gfxstrand[d]: Trying to figure out what "sampler pos" means
14:35 gfxstrand[d]: Knowing that it's supposed to be 4.12 helps, I think.
14:36 gfxstrand[d]: Or not
14:57 gfxstrand[d]: Okay, I'm starting to figure out what this all means. I think I might be able to use it. Still not 100% sure, though.
15:11 gfxstrand[d]: Okay, so it's sample positions as per D3D standard sample locations for the samples in the image.
15:11 gfxstrand[d]: I think I can work with that
15:12 karolherbst[d]: it's dx/dy fyi
15:13 gfxstrand[d]: What does that mean?
15:13 karolherbst[d]: whatever IPA and PIXLD need
15:13 karolherbst[d]: or return
15:13 karolherbst[d]: or whatever
15:13 gfxstrand[d]: Yeah. It's sample positions
15:13 gfxstrand[d]: In weird signed fixed-point
15:13 karolherbst[d]: yep
15:14 karolherbst[d]: anyway.. you get the MS sample position out of it
15:14 gfxstrand[d]: Yeah, I figured that out
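[editor's note] The "(4.12, 4.12)" signed fixed-point format mentioned above can be illustrated with a minimal sketch. It assumes each component is a 16-bit two's-complement value with 12 fractional bits, and that the two components are packed as x in the low half and y in the high half of one 32-bit word. Neither the packing nor the function names come from the chat; they are hypothetical and only there to show the arithmetic.

```python
def decode_s4_12(raw):
    """Decode one 16-bit two's-complement value with 12 fractional bits.

    ASSUMPTION: "4.12" means 4 integer bits (sign included) plus
    12 fractional bits, 16 bits in total.
    """
    raw &= 0xFFFF            # keep only the low 16 bits
    if raw & 0x8000:         # sign bit set: value is negative
        raw -= 0x10000
    return raw / 4096.0      # 2**12 fractional steps per unit


def unpack_sampler_pos(word):
    """Split a 32-bit word into an (x, y) pair of decoded values.

    ASSUMPTION: x lives in bits 0..15 and y in bits 16..31. The real
    hardware packing is not stated in the discussion.
    """
    x = decode_s4_12(word & 0xFFFF)
    y = decode_s4_12((word >> 16) & 0xFFFF)
    return (x, y)
```

With this layout, a raw value of 0x0800 would decode to half a pixel and 0xF800 to minus half a pixel, which is the kind of range one would expect for sample offsets around a pixel center.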
15:14 gfxstrand[d]: The real question is if I can get the image sub-pixel coordinate out of it
15:16 gfxstrand[d]: And I think I can
15:17 gfxstrand[d]: The next question then becomes what do suld/sust do on a MSAA image
15:18 gfxstrand[d]: Yup. I can get image sub-pixel coordinates out of it if I'm careful with my maths
15:19 gfxstrand[d]: All I need is for suld/sust to work
15:20 karolherbst[d]: dunno if surface ops handle multisampling in any meaningful way?
15:20 gfxstrand[d]: I'm gonna find out!
15:20 karolherbst[d]: can you even specify the MSAA mode on the image view?
15:21 gfxstrand[d]: Yes
15:21 gfxstrand[d]: But you can't specify it on the suld/sust op
15:21 karolherbst[d]: mhhh
15:21 gfxstrand[d]: I *think* it's just treated as a 2d image that's scaled up. That's what I'm hoping anyway
15:21 karolherbst[d]: yeah...
15:21 karolherbst[d]: that would be my guess
15:21 gfxstrand[d]: It would be nice but sometimes nvidia hardware gets overly clever
15:23 gfxstrand[d]: For those who are curious: I'm trying to implement MSAA image load/store without extra descriptor bits
15:25 marysaka[d]: oh funky
15:40 gfxstrand[d]: Ugh... That's what I was afraid of
15:40 gfxstrand[d]: It treats it as a 2D image... with the original image dimensions.
15:40 gfxstrand[d]: So it gets cropped down to only part of the image
15:41 karolherbst[d]: mhhhh
15:41 gfxstrand[d]: But that might be kind of okay. I think I can work with that
15:43 gfxstrand[d]: NIL might need to learn the difference between texture and storage, though. 😢
15:46 gfxstrand[d]: So I think I can do an MSAA image descriptor with the image size in samples rather than pixels. That'll get my bounds checks for image load/store correct. Then I can use `txq.texture_header_type` to get the sample count to adjust the return value of `imageSize()`. For actual load/store, I can use `txq.sampler_pos` and a bit of math to figure out the x/y offsets to add to x/y to get the per-sample coordinates.
15:46 gfxstrand[d]: And all this without reading the image descriptor manually
15:46 gfxstrand[d]: Yeah, I think this works. It's a little nutty but I think it works.
15:49 karolherbst[d]: that means that all sust/suld need to scale their coordinates then, right?
15:55 gfxstrand[d]: Yeah, I'll have to do that in the shader
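[editor's note] The coordinate scaling described above can be sketched roughly as follows. This is not the actual NVK lowering. It assumes the samples of one pixel occupy a small rectangular grid inside the "image in samples", and the grid shapes in `SAMPLE_GRID` are illustrative guesses rather than values taken from the discussion. The per-sample offsets `sx`/`sy`, which the chat derives from `txq.sampler_pos` "and a bit of math", are simply passed in here.

```python
# ASSUMPTION: (width, height) of the per-pixel sample grid for each sample
# count. These shapes are illustrative and not confirmed by the discussion.
SAMPLE_GRID = {1: (1, 1), 2: (2, 1), 4: (2, 2), 8: (4, 2)}


def image_size_pixels(size_in_samples, samples):
    """Turn a descriptor size given in samples back into a size in pixels,
    i.e. the adjustment needed for the value returned by imageSize()."""
    grid_w, grid_h = SAMPLE_GRID[samples]
    width, height = size_in_samples
    return (width // grid_w, height // grid_h)


def sample_coord(x, y, sx, sy, samples):
    """Scale a pixel coordinate up to sample space and add the offset of
    one sample within the pixel's grid (0 <= sx < grid_w, 0 <= sy < grid_h)."""
    grid_w, grid_h = SAMPLE_GRID[samples]
    return (x * grid_w + sx, y * grid_h + sy)
```

Under these assumptions, a 4x MSAA image that is 64x32 pixels would be described as 128x64 samples, so the hardware bounds check covers every sample while the shader divides the queried size back down for `imageSize()`.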
22:39 gfxstrand[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36207
23:23 rinlovesyou[d]: rinlovesyou[d]: Looks like i was on a slightly older zink version, however after building from source we're back to not even getting to the login screen