00:06mhenning[d]: Is anyone with a blackwell card willing to run a test program under the proprietary driver for me?
00:19snowycoder[d]: mhenning[d]: I can do that tomorrow
01:08mhenning[d]: snowycoder[d]: Okay, cool. Rough steps are:
01:08mhenning[d]: 1. Boot into proprietary driver
01:08mhenning[d]: 2. Build envyhooks and nv_push_dump
01:08mhenning[d]: 3. Check out https://gitlab.freedesktop.org/mhenning/re/
01:08mhenning[d]: 4. cd into the vk_event directory
01:08mhenning[d]: 5. Edit config.sh to point to the binaries you built in step 2 and set GENERATION=BLACKWELL
01:08mhenning[d]: 6. Run `./compile.sh`
01:08mhenning[d]: 7. Run `./gen_all.sh`
01:08mhenning[d]: 8. Step 7 should have created a file named all.out in the current directory. Send this file to me.
01:08mhenning[d]: Let me know if you have any issues
08:59snowycoder[d]: mhenning[d]: I don't think that `nv_push_dump` in the main mesa branch supports `BLACKWELL`, do I need a newer branch?
08:59marysaka[d]: snowycoder[d]: it's on main but you need to build nouveau tools
09:00marysaka[d]: so like ` -Dtools=nouveau`
09:00snowycoder[d]: Right, I was using the wrong meson config and had an outdated version
09:01marysaka[d]: I should write docs on how to use all the re tools tbh might be useful
09:06snowycoder[d]: Ok no wait, https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/headers/nv_push_dump.c?ref_type=heads
09:06snowycoder[d]: The file on main literally does not support `BLACKWELL` as a generation, am I stupid?
09:07marysaka[d]: snowycoder[d]: ... that's true let me do a quick MR for that
09:08marysaka[d]: we used to not have all headers so that's probably why it's missing
09:15marysaka[d]: took me 2 years to see that I used 2 spaces in that file uuum I guess I will fix that after it
09:21snowycoder[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1418527291180585010/all.out?ex=68ce71fd&is=68cd207d&hm=2371e71d5599e29a0dd3048c245cff308f8468991bdc0a77de3d3a1a64352698&
09:21snowycoder[d]: mhenning[d]: Here's the dumped file using `BLACKWELL_A` as the header for eng3d/compute
09:36karolherbst[d]: okay.. uhm.. I think ugpr handling kinda needs to be fixed in nak, because it makes a lot of optimizations using them unpredictable
09:37karolherbst[d]: like not sure that wiring up ULDC wouldn't just cause even more issues
09:38karolherbst[d]: so as I understand things, this `if !block_uniform {` check in legalize exists in case an instruction consumes a ugpr vector, but it's not aligned
09:39karolherbst[d]: but I wonder if this block could go, and we'll deal with it once such a usage actually exists?
09:46marysaka[d]: snowycoder[d]: actually you need to use BLACKWELL_B for 3D and the compute one for compute
09:46marysaka[d]: and BLACKWELL_DMA_COPY_B for the dma one
09:46marysaka[d]: will throw a smol patch and stop hyperfixating on refactoring my mess :linatehe:
09:55snowycoder[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1418535905035161620/all.out?ex=68ce7a02&is=68cd2882&hm=a22873d4f79d5bddb7b962503ac2e692d8cee2bcf6a8df1a109769a00077094b&
09:55snowycoder[d]: Welp, sorry, this is the dumped file with BLACKWELL_B headers (using nvidia 580.82.09 on latest arch)
09:59marysaka[d]: no worries it's a bit of a mess actually... trying to add all missing classes supported to the dumper right now
10:20orowith2os[d]: I'm correct in reading that this is the OG switch, which means that Tegra can support Vulkan 1.4, right?
10:20orowith2os[d]: https://www.khronos.org/conformance/adopters/conformant-products/vulkan#submission_933
10:21orowith2os[d]: I was worried that I'd be stuck with 1.0 or 1.1 once the shield I have is able to boot into a Linux desktop.
10:21karolherbst[d]: oh the hw can do 1.4 just fine
10:21karolherbst[d]: even the previous switch can
10:43karolherbst[d]: okay.. my divergent branch detection was broken, now the stats look way more consistent
10:44karolherbst[d]: https://gist.githubusercontent.com/karolherbst/070f62ca47263d56d2b13b0511edb256/raw/4d8c46c3897aed3be01aa0b0b5005bb3ac8182a9/gistfile1.txt
11:55marysaka[d]: marysaka[d]: ended up starting typing a script to autogenerate this
12:59marysaka[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1418582356536594502/image.png?ex=68cea545&is=68cd53c5&hm=9b8e2f2b37e65080b3a3c4fc450e3b473f3c1b11561a00c9c5431c595bb4652c&
12:59marysaka[d]: uuurgh
13:00marysaka[d]: AMPERE_CHANNEL_GPFIFO_B header has the AMPERE_CHANNEL_GPFIFO_A struct def, that's annoying
13:01marysaka[d]: and some stuff was changed between those two hmm
13:11marysaka[d]: snowycoder[d]: for blackwell support and a lot more around the edges https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475
13:34snowy_coder: @karolherbst I now have a IRC bouncer/client setup but I don't see any rhys or emma in the major channels 0_o
13:35karolherbst: pendingchaos and anholt
13:35snowy_coder: I see them, thanks!
14:14kode54[d]: I see this discord is using the same cat icons bot for IRC forwarding as LGD
15:02chikuwad[d]: same bridge software
15:41lingm[d]: Both pendingchaos and anholt are here on the discord as well btw
18:15alyssa: gfxstrand: NAK's DCE is a lot more complicated than it probably needs to be
18:16alyssa: unless you're doing pre-RA control flow manipulation in backend, you can't get dead loop header phis out of NIR
18:16alyssa: and therefore can DCE in a single pass
18:16alyssa: (ACO & AGX both do it this way as of 2024, it's less code & faster)
18:19alyssa: even with pre-RA predication I don't *think* we'd get dead loop header phis so that's probably fine
18:19mohamexiety[d]: dumb q but what's DCE?
18:19alyssa: dead code elimination
18:19alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/asahi/compiler/agx_dce.c?ref_type=heads is really all you need, IMHO
18:22mohamexiety[d]: ahh thanks!
18:23mohamexiety[d]: that looks really simple
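A minimal sketch of the single-pass idea alyssa describes above, written in Python for illustration only. It is not the code from `agx_dce.c` or NAK; the `Instr` and `Block` types, the `side_effects` flag, and the function name are all hypothetical. It assumes SSA form, blocks listed so that definitions come before their uses (except through loop header phis), and, as stated in the chat, that loop header phis are never dead.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    op: str
    dests: list = field(default_factory=list)  # SSA values written
    srcs: list = field(default_factory=list)   # SSA values read
    side_effects: bool = False                 # e.g. stores; never removed


@dataclass
class Block:
    instrs: list
    loop_header: bool = False


def dce_single_pass(blocks):
    """Remove dead instructions with one reverse walk over the program."""
    live = set()

    # Assumption: loop header phis are always live, so their sources
    # (including back-edge values defined later in program order) are
    # marked live up front. This is what makes a single walk sufficient.
    for block in blocks:
        if block.loop_header:
            for instr in block.instrs:
                if instr.op == "phi":
                    live.update(instr.srcs)

    # Walk backwards: every use of a value is visited before its definition.
    for block in reversed(blocks):
        kept = []
        for instr in reversed(block.instrs):
            header_phi = block.loop_header and instr.op == "phi"
            needed = (
                instr.side_effects
                or header_phi
                or any(d in live for d in instr.dests)
            )
            if needed:
                live.update(instr.srcs)
                kept.append(instr)
        kept.reverse()
        block.instrs = kept
    return blocks
```

If a backend pass could leave dead loop header phis behind (the spilling question raised right after), the up-front marking would keep their sources alive unnecessarily, and an iterative or worklist approach would be needed to clean them up.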
18:37gfxstrand[d]: alyssa: spilling adds phis
18:39alyssa: gfxstrand[d]: dead ones?
18:39alyssa: in loop headers?
18:41steel01[d]: I see a lot of abbreviations that mean completely different things in tegra land. :p Yay alphabet soup. Conflicting alphabet soup, no less.
18:41gfxstrand[d]: I'm not sure. I haven't thought through that pass enough to be sure.
18:41gfxstrand[d]: I do know it generates dead code, though. Not sure about dead phis.
18:44alyssa: shrug
18:44alyssarosenzweig[d]: echo
18:44alyssa: echo
18:44gfxstrand: Hi, shadow of alyssa
18:45alyssarosenzweig[d]: hi!
18:45gfxstrand: Discord says you're new here and that I should say hi
18:45alyssa: hi!
18:45chikuwad[d]: https://tenor.com/view/star-wars-second-child-there-are-two-of-them-getting-out-of-hand-gif-16342964
18:45chikuwad[d]: https://tenor.com/view/star-wars-second-child-there-are-two-of-them-getting-out-of-hand-gif-16342964
18:46chikuwad[d]: gee thanks discord
18:46chikuwad[d]: amusing that the gif also posted twice
18:46gfxstrand[d]: It did
18:46alyssarosenzweig[d]: https://tenor.com/view/princess-luna-my-little-pony-friendship-is-magic-mlp-mlpfim-gif-4194811303359735317
18:46gfxstrand[d]: But it's Discord so you can delete one if you want
18:47chikuwad[d]: no I think it's funnier this way
18:47chikuwad[d]: :D
18:48chikuwad[d]: been meaning to continue w/ the compiler thing I was trying but I've been buried in phonecalls and paperwork this week
18:48chikuwad[d]: and now that it's over (got the final acceptance letter), I'm traveling to a different state for a conf
18:48chikuwad[d]: sigh
18:54mohamexiety[d]: Sid new job? :vibrate:
18:54chikuwad[d]: no(t yet)
18:55chikuwad[d]: attending [IndiaFOSS](https://fossunited.org/indiafoss/2025) because a company who'd like for me to work on their hardware sponsored my ticket to it
18:56chikuwad[d]: they're currently looking at having a prototype shipped to me
19:08mohamexiety[d]: noiice, hopefully that goes through. epic stuff and really happy for you ❤️ :vibrate:
19:09mohamexiety[d]: gfxstrand[d]: karolherbst[d] pushed to the compute mme branch the unit test, so that's ready for final review. sorry for the delay but things have been a bit bad here this month
19:09gfxstrand[d]: Cool. Thanks!
19:09karolherbst[d]: cool, thanks
19:10gfxstrand[d]: My 50 minute XDC talk has 100 slides.
19:10alyssarosenzweig[d]: girl...!
19:10gfxstrand[d]: I talk fast?
19:10mohamexiety[d]: :KEKW:
19:11gfxstrand[d]: I'm going to be deleting some of those. I tend to write lots of talk first and trim later.
19:12karolherbst[d]: you can prolly leave some of the stuff out I can talk about
19:12gfxstrand[d]: Nah. I'm going to cover everything in graphics in my talk. The rest of XDC will be a waste of everyone's time.
19:13HdkR: Split the talk between two people so you get double the time for Nouveau :P
19:13chikuwad[d]: semi-hostile takeover
19:13gfxstrand[d]: Oh, this talk isn't even about nouveau. That's a different talk.
19:13HdkR: lol
19:13gfxstrand[d]: I need to make my nouveau slides next
19:13alyssarosenzweig[d]: girl!!
19:14gfxstrand[d]: The 100 slides is just me rambling about descriptors for an hour
19:15mohamexiety[d]: looking forwards to that one :vibrate:
19:15marysaka[d]: gfxstrand[d]: so when are we rapping that talk again :3
19:16marysaka[d]: cannot type properly tonight :painpeko:
19:16mohamexiety[d]: roasting descriptors in 50 minutes
19:16mohamexiety[d]: a diss track basically
19:16Mary: oh the bot doesn't take the edits on discord, makes sense I guess
19:18mhenning[d]: snowycoder[d]: Thanks!
19:19dwfreed: Mary: edits get far too noisy quickly (the matrix people love to make 5 edits to the same message, and the IRC bridge repeats the message for each one)
19:19gfxstrand[d]: The old Discord<->IRC bridge did, too.
19:20gfxstrand: But I'm a compulsive editor so I always have to catch myself if the person I'm talking to is on IRC and the edit is big.
19:21Mary: yeah I have that a lot...
19:21Mary: discord catching sed syntax also doesn't help
19:23chikuwad[d]: escape one of the slashes in it
19:23chikuwad[d]: but yeah, pain
19:32gfxstrand[d]: Hrm... Maybe I can claim to have enabled Tegra by the end of XDC...
19:38alyssarosenzweig[d]: i sit in a cubicle all day now
19:38alyssarosenzweig[d]: no splashy xdc talk from me this year ☺️
19:39gfxstrand[d]: Ah, the famous Intel travel budget...
19:43marysaka[d]: gfxstrand[d]: maybe I can figure out the difference your board has before the end of it :maxpoeSweat:
19:47alyssarosenzweig[d]: gfxstrand[d]: What travel budget?
19:47gfxstrand[d]: Exactly!
20:21airlied[d]: Just ask NVIDIA for budget, seems to work for the upper mgmt
20:30mohamexiety[d]: something something AI depends on big fast GPU and you are all about making that GPU go fast
20:53alyssarosenzweig[d]: airlied[d]: kek
21:45mhenning[d]: could we ask nvidia for blackwell B GPFIFO headers? The header in open-gpu-kernel-modules looks incomplete and I have a hunch the old tert opcodes might have been removed and used for something else based on the amount of USE_SUBDEVICE_MASK I'm suddenly seeing in the blackwell traces