IRC Logs of #nouveau on irc.freenode.net for 2025-08-26

00:02 gfxstrand[d]: Subc switching is also high
00:02 gfxstrand[d]: Anything else?
00:03 airlied[d]: I assume karolherbst[d] has some things it would be good to land
00:03 airlied[d]: like the ones he posted earlier, LDSM at least
00:04 gfxstrand[d]: Yeah
00:05 gfxstrand[d]: Address calcs are also pretty high on the list
00:13 mohamexiety[d]: gfxstrand[d]: Compression pls
00:13 gfxstrand[d]: Yes
00:14 airlied[d]: is compression working solidly now?
00:15 mohamexiety[d]: Beyond that I have a few testing avenues for improving performance I am looking into with marysaka[d] but no MRs yet
00:15 gfxstrand[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1409692592404431039/image0.jpg?ex=68ae4e08&is=68acfc88&hm=48ef8e0b0ac46086b86189cb0f2696a4562b4e469dbda9e4086b29ad77dc21b9&
00:15 mohamexiety[d]: airlied[d]: Yes but the way we enable it is a bit janky on the nvk side so it really needs review there
00:16 mohamexiety[d]: mohamexiety[d]: And no modifiers but that could get resolved if we figure out a good sane way
00:17 mohamexiety[d]: Kernel side is p much complete, I guess missing some version change so we can enable it in nvk based on that but beyond that all good I think though I didn’t run the patch check script thingy
00:24 airlied[d]: CTS passes on a few GPUs?
00:27 mohamexiety[d]: Only tried Ada and it passes all except some device synchronization thing
00:27 mohamexiety[d]: And that one is due to how we enable it in nvk right now
00:27 gfxstrand[d]: We need to test on at least Turing-Ada and Blackwell.
00:28 mohamexiety[d]: On my end I sometimes have issues with wsi tests murdering the kernel but I don’t think this is related because this happens without the kernel changes and without the nvk changes
00:28 mohamexiety[d]: (It’s not really wsi tests, but some tests spam opening windows which just kills my kernel on one machine)
00:49 mhenning[d]: gfxstrand[d]: zcull
00:50 mhenning[d]: and maybe maintenance8, although I still want to double check the spec on that one
02:09 gfxstrand[d]: Sorry. I've been a bit out to lunch since finishing off Blackwell. There's been a lot of things tugging at me from all directions. Hopefully not much more between here and XDC, though.
02:15 airlied[d]: oh scheduler it's been a while
02:40 gfxstrand[d]: airlied[d]: Uh oh...
03:50 airlied[d]: found the place where adding a printf makes it go away 🙁
03:59 tiffanylarsson: I said , that i no longer cover the low-level permuration stuff here, it's too big variety of solutions available and it isn't sane to spam them before testing or in such amounts where they total. Something in right moments get tested and will work definitely instead however. and on wasmati it isn't a requirement to compile to wasm i assume there are pleanty of solutions to compile to
03:59 tiffanylarsson: native instead of wasm jit containers and for whatever OS there's driver available. Other stuff i posted is seeming accurate to me when i reread.
04:01 HdkR: Oh what a shame, I'm still here. Better luck next time.
04:03 gfxstrand[d]: airlied[d]: Uh oh... Progress?
04:05 airlied[d]: well I think I'm at the point I get to when I've looked in the past, inth->event->uevent->fence code has multiple layers of "allowed" at least one of which is probably broken 😛
04:06 airlied[d]: I dislike atomics
04:06 chikuwad[d]: gfxstrand[d]: device address binding report has been ready for a while too :froge:
04:08 gfxstrand[d]: Okay, tomorrow I'm going to go through the MR backlog. I'll try to merge or leave detailed review on at least one big one per day and as well as some small ones. Now that Blackwell is working for games, I can properly perf test stuff, too.
04:09 chikuwad[d]: I should also work on getting some of my draft MRs done tbh
04:13 mhenning[d]: airlied[d]: fwiw, I tried to disable some of the "allowed" code and it didn't fix the transfer queue issue
04:14 mhenning[d]: although I'm not 100% sure I did it correctly
04:14 gfxstrand[d]: airlied[d]: So do we all. Except those of us who don't. Those people scare me...
05:29 airlied[d]: it does seem like we can get the irq before we get the memory write
06:59 karolherbst[d]: airlied[d]: I have something trivial for you to review 😄 https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36391
06:59 karolherbst[d]: which can be reviewed by anybody really
07:04 karolherbst[d]: this will be required when supporting DMMA or the other funky sizes
07:04 karolherbst[d]: or well
07:04 karolherbst[d]: making it more sane to do so
07:08 marysaka[d]: gfxstrand[d]: !36970 to not always allocate on GART for nvk_cmd_mem_create in case we know we have rebar, maybe that can help more stuffs (that passes CTS on Ada for me)
07:34 karolherbst[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36841 would also be good to land
07:35 karolherbst[d]: but not sure if that's something that needs review from folks knowing stuff or just a formal review
07:35 karolherbst[d]: airlied[d]: ohh you might want to review that one as well.. at least the nir and radv bits
09:29 karolherbst[d]: Ohh, if somebody wants to do some performance testing of this MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36877
09:29 karolherbst[d]: should help everything across the board
09:29 karolherbst[d]: by a little bit
09:30 karolherbst[d]: though I can't really tell which games might be impacted the most, because we don't collect the stats where it would show a change 🥲
09:30 karolherbst[d]: maybe I should write an MR to report both values...
10:12 phomes_[d]: !36823 and !36970 each improve performance on VKD3D. Combined they give 3 to 12% better fps
10:39 phomes_[d]: karolherbst[d]: fps reported by mango hud and the game's own fps score are not showing any change with this MR sadly
10:39 karolherbst[d]: fake news 😛
10:40 karolherbst[d]: but yeah.. the expected impact is like 0.5% 😄
10:40 karolherbst[d]: if at all
10:40 karolherbst[d]: it does give me a bit more in coop matrix shaders tho
10:45 karolherbst[d]: maybe I should check in the shader-db what games are impacted more.. but for that I'll need to hack up stats
12:54 gfxstrand[d]: airlied[d]: That's possible. Annoying but possible.
12:55 gfxstrand[d]: One thing we could try is waiting on it on the command stream before kicking the interrupt.
14:05 mohamexiety[d]: sonicadvance1[d]: sorry for ping but is there anything special I have to do for cross compiling mesa 32bit on arch? getting a
14:05 mohamexiety[d]: `meson.build:1856:21: ERROR: Dependency "LLVMSPIRVLib" not found, tried pkgconfig`
14:06 mohamexiety[d]: using `'i686-pc-linux-gnu-pkg-config'` pkg-config
14:06 chikuwad[d]: do you have `lib32-spirv-llvm-translator` installed
14:07 chikuwad[d]: you need 32-bit dependencies, there's a list here https://aur.archlinux.org/packages/lib32-mesa-git?all_deps=1#pkgdeps
14:08 chikuwad[d]: additionally you need to set TARGET=i686-unknown-linux-gnu to tell the rust compiler what's going on
14:09 mohamexiety[d]: chikuwad[d]: yeah that one is done
14:09 mohamexiety[d]: chikuwad[d]: this wasnt. thanks! ❤️
14:10 chikuwad[d]: :meowsalute:
14:10 mohamexiety[d]: I hate cross compiling
14:13 gfxstrand[d]: Yeah... It sucks
14:17 mohamexiety[d]: FAILED: src/compiler/clc/mesa_clc
14:17 mohamexiety[d]: c++ -o src/compiler/clc/mesa_clc src/compiler/clc/mesa_clc.p/mesa_clc.c.o -Wl,--as-needed -Wl,--no-undefined -m32 -Wl,--start-group src/compiler/clc/liblibmesaclc.a src/compiler/nir/libnir.a src/compiler/libcompiler.a src/util/libmesa_util.a src/util/libmesa_util_simd.a src/util/blake3/libblake3.a src/c11/impl/libmesa_util_c11.a src/compiler/spirv/libvtn.a -Wl,--build-id=sha1 -fPIC -lLLVM-20
14:17 mohamexiety[d]: -pthread /usr/lib32/libSPIRV-Tools-opt.so /usr/lib32/libSPIRV-Tools.so /usr/lib32/libSPIRV-Tools-link.so /usr/lib32/libz.so -lm /usr/lib32/libzstd.so /usr/lib32/libunwind.so /usr/lib32/libdrm.so /usr/lib/libclang-cpp.so -lLLVM-20 /usr/lib32/libLLVMSPIRVLib.so -Wl,--end-group
14:17 mohamexiety[d]: /usr/bin/ld: /usr/lib/libclang-cpp.so: error adding symbols: file in wrong format
14:17 mohamexiety[d]: collect2: error: ld returned 1 exit status
14:17 mohamexiety[d]: ninja: build stopped: subcommand failed.
14:17 mohamexiety[d]: hm
14:19 mohamexiety[d]: oops
14:19 mohamexiety[d]: was using 64bit llvm-config
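The "file in wrong format" error above came from a 64-bit `/usr/lib/libclang-cpp.so` being pulled into a `-m32` link. A minimal sketch of how one could check a library's word size before linking, assuming only the standard ELF identification bytes (the helper names `elf_class` and `elf_class_of_file` are made up for illustration and are not part of Mesa or meson):

```python
# Sketch: classify an ELF file as 32-bit or 64-bit from its header.
# Byte 4 of the ELF identification (EI_CLASS) is 1 for ELFCLASS32
# and 2 for ELFCLASS64.

ELF_MAGIC = b"\x7fELF"


def elf_class(header: bytes) -> str:
    """Return '32-bit' or '64-bit' given at least the first 5 bytes of a file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]
    if ei_class == 1:
        return "32-bit"
    if ei_class == 2:
        return "64-bit"
    raise ValueError(f"unknown EI_CLASS value: {ei_class}")


def elf_class_of_file(path: str) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

Running `elf_class_of_file` on each `.so` in the failing link line would point at the one library that does not match the others.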
14:23 mohamexiety[d]: well that's interesting
14:23 mohamexiety[d]: gfxstrand[d]: your branch fixed cyberpunk, but avatar still mmu faulted :thonk:
14:23 mohamexiety[d]: [ 7254.088069] nouveau 0000:01:00.0: gsp: mmu fault queued
14:23 mohamexiety[d]: [ 7254.110937] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:50 gfid:0 level:2 type:31 scope:1 part:233 fault_addr:0000003ec6ccf000 fault_type:00000002
14:23 mohamexiety[d]: [ 7254.110942] nouveau 0000:01:00.0: fifo:d00000:0032:0032:[afop.exe[94567]] errored - disabling channel
14:23 mohamexiety[d]: [ 7254.110950] nouveau 0000:01:00.0: afop.exe[94567]: channel 50 killed!
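For comparing faults across runs it can help to pull the `key:value` pairs out of the `gsp: rc` line. A rough sketch, assuming the line keeps the layout pasted above; `parse_gsp_rc` is a hypothetical helper, and treating `fault_addr` as hexadecimal is an assumption based on the digits it contains:

```python
import re

# Matches "name:value" pairs such as "chid:50" or "fault_addr:0000003ec6ccf000".
FIELD_RE = re.compile(r"(\w+):([0-9a-fA-F]+)")


def parse_gsp_rc(line: str):
    """Return a dict of the fields after 'gsp: rc ', or None if the line is not one."""
    _, marker, tail = line.partition("gsp: rc ")
    if not marker:
        return None
    # Values stay strings because the log mixes number bases between fields.
    return dict(FIELD_RE.findall(tail))


sample = ("[ 7254.110937] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:50 "
          "gfid:0 level:2 type:31 scope:1 part:233 "
          "fault_addr:0000003ec6ccf000 fault_type:00000002")
fields = parse_gsp_rc(sample)
fault_addr = int(fields["fault_addr"], 16)  # assumed to be hexadecimal
```

In this paste `chid:50` read as decimal equals 0x32, which lines up with the `0032` in the following `fifo` line, so the channel id at least looks decimal here.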
14:26 gfxstrand[d]: mohamexiety[d]: Does Avatar work on Ampere or Ada?
14:27 mohamexiety[d]: nvm that was unrelated. or well, a different bug I think
14:27 mohamexiety[d]: next launch it started up fine and completed the benchmark
14:28 mohamexiety[d]: so I think blackwell is now parity with Ada in terms of stability
14:29 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1409907500723212298/image.png?ex=68af162e&is=68adc4ae&hm=c8bddef5054da248bd2346b1e4a4248fdeab261c77229a855c10f82dcafc98d0&
14:29 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1409907501180260446/image.png?ex=68af162f&is=68adc4af&hm=ae2a591f9db13348b5c285f6a544b3b40460cf876b053c95582c456594e64eae&
14:29 mohamexiety[d]: and the results
14:29 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1409907606591504384/image.png?ex=68af1648&is=68adc4c8&hm=98434a555465809a8746e865e5010de1f9c96247782160655054c7517af53f9c&
14:29 mohamexiety[d]: and cyberpunk
14:30 mohamexiety[d]: we're not really utilizing the gpu. dont have metrics but the fans are barely spinning and I dont think it's even doing 200W because it's almost cool to touch while it's running. on windows this would burn me :KEKW:
14:32 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1409908256721211422/image.png?ex=68af16e3&is=68adc563&hm=8c619bb082c846a123498603bea63dc1a2fc27fb72318800f4beac9ac3690ca2&
14:32 mohamexiety[d]: mohamexiety[d]: wait second one is NV prop, sorry. this is the frametime graph for nvk:
14:39 gfxstrand[d]: mohamexiety[d]: Yeah, we have a LOT of stalls. We need to figure that out.
14:40 gfxstrand[d]: The fact that DA:TV runs at 40 FPS on my 5090 is absurd
14:40 gfxstrand[d]: mhenning[d]: 's subc switching will help
14:42 mohamexiety[d]: commented on the MR for documentation and I think that MR is good to go. very nice work! ❤️
14:42 karolherbst[d]: switching between 3d and compute causes a WFI, and given a lot of games do compute these days...
14:42 mohamexiety[d]: karolherbst[d]: it doesnt on blackwell
14:42 karolherbst[d]: yay
14:42 mohamexiety[d]: unless we still insert a WFI there regardless :thonk:
14:43 karolherbst[d]: uploading the QMD via the push buffer should also help
14:43 karolherbst[d]: but...
14:43 karolherbst[d]: that's not really stalling
14:43 karolherbst[d]: ohh yeah.. we should land the DFS and static cycle count fixes, otherwise shader-db stats are annoying to create 😄
14:56 gfxstrand[d]: Well, the big thing to get rid of CS stalls is switching to the compute MME on Ampere+
14:56 gfxstrand[d]: Because right now we're running 3D MMEs for every compute job. It's... not great.
14:56 mohamexiety[d]: what would that involve? :thonk:
14:56 gfxstrand[d]: Mostly just figuring out if compute and 3D use different MME upload space and flipping it on.
14:57 gfxstrand[d]: Shouldn't be hard
14:59 marysaka[d]: yeah we do some 3D MME call right after some dispatch from what I saw on pyrowave so maybe related (likely the CS invocation counter?)
15:00 marysaka[d]: as a side note would be nice to teach nv_push_dump about the macros names we have to make it more readable
15:01 gfxstrand[d]: Yeah, having a way to name macros would be cool
15:02 gfxstrand[d]: Maybe a callback to the push dumper?
15:04 gfxstrand[d]: There's a lot of things I'd like to see done to make push dumping smarter.
15:05 gfxstrand[d]: Naming macros is just the beginning. We could dump QMDs, even if they're in memory and not inline. We could dump shader source (Intel does that; it's neat). Maybe root constants?
15:06 gfxstrand[d]: I'd also like to see the MME simulator hooked up so we can have a flag to dump what the MME would generate rather than just the MME call
15:06 marysaka[d]: we should probably add QMDs and MMEs dumps, could remove MME dumping from envyhooks too
15:06 gfxstrand[d]: Yeah, disassembling MMEs on upload would be neat, too.
15:16 karolherbst[d]: gfxstrand[d]: ... oof
15:16 karolherbst[d]: so you say if I move to compute MME my coop matrix stuff might be faster? 😄
15:18 gfxstrand[d]: If that means you can burn RH time to do it, then yes, I'm sure it will make it way faster!
15:19 karolherbst[d]: great
15:19 karolherbst[d]: though I wanted to look into `nir_opt_reassociate` first 😄
15:19 karolherbst[d]: this can reassociate divergent sources
15:19 karolherbst[d]: or well the opposite
15:19 karolherbst[d]: point is..
15:19 karolherbst[d]: it can do the `(iadd (iadd ugpr gpr) ugpr)` -> `(iadd (iadd ugpr ugpr) gpr)` opt
15:20 karolherbst[d]: (and will help with the IO stuff)
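The rewrite mentioned above relies on integer add being associative and commutative, so the uniform operands can be grouped first and the divergent one added last. A toy sketch of that idea on tuple-based expressions; this is not how `nir_opt_reassociate` is implemented, and the expression encoding and the `reassociate` helper are assumptions made purely for illustration:

```python
# Toy model: an expression is either a leaf name (str) or ("iadd", lhs, rhs).

def flatten(expr):
    """Collect the operands of a nested iadd chain, left to right."""
    if isinstance(expr, tuple) and expr[0] == "iadd":
        return flatten(expr[1]) + flatten(expr[2])
    return [expr]


def build(terms):
    """Rebuild a left-associated iadd chain from a non-empty operand list."""
    result = terms[0]
    for term in terms[1:]:
        result = ("iadd", result, term)
    return result


def reassociate(expr, uniform_names):
    """Group uniform operands first so their sum can stay in uniform registers."""
    terms = flatten(expr)
    uniform = [t for t in terms if t in uniform_names]
    divergent = [t for t in terms if t not in uniform_names]
    return build(uniform + divergent)


before = ("iadd", ("iadd", "ugpr0", "gpr0"), "ugpr1")
after = reassociate(before, {"ugpr0", "ugpr1"})
# after == ("iadd", ("iadd", "ugpr0", "ugpr1"), "gpr0")
```

The inner add in `after` only touches uniform values, which is the shape of the `(iadd (iadd ugpr ugpr) gpr)` result described in the chat.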
15:21 karolherbst[d]: but I'm kinda sad that all the other opts don't have an impact as high as the membar.cta stuff
15:21 karolherbst[d]: though
15:21 karolherbst[d]: aggressive loop unrolling helps a lot...
15:22 karolherbst[d]: should get back to that why exactly
15:23 karolherbst[d]: mohamexiety[d]: if you want to try something fun: `op.max_unroll_iterations = 1024;` in `api.rs` and do more benchmarks
15:23 mohamexiety[d]: in games?
15:23 mohamexiety[d]: huh
15:23 karolherbst[d]: maybe just one or two
15:23 karolherbst[d]: could help in shader heavy games
15:24 karolherbst[d]: not that I suggest upstreaming such a change, but it might point out where one could take a look to figure out what to improve
15:25 gfxstrand[d]: karolherbst[d]: Nothing helps as much as not synchronizing across PCI
15:25 karolherbst[d]: well .gpu doesn't sync across pci
15:26 karolherbst[d]: it's this thing: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36877
15:26 karolherbst[d]: the `emits_deferred_barrier` part of it
15:27 karolherbst[d]: setting up the barrier is pretty expensive apparently
15:27 karolherbst[d]: and it kinda blocks the GPU from scheduling more stuff and all... at least that's my current working theory on why the perf impact was _that_ high
15:28 karolherbst[d]: anyway
15:28 karolherbst[d]: that MR cleans up condition on when to emit the nop
15:30 phomes_[d]: interesting. Grabbing a screenshot of CS2 now consistently causes a MMU fault
15:31 mhenning[d]: mohamexiety[d]: Is this documented somewhere? It might mean I need to rethink part of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36823
15:32 mohamexiety[d]: mhenning[d]: yes, 1 sec
15:32 mhenning[d]: Although tbh if it's true we're probably missing a bunch of sync on blackwell
15:33 mohamexiety[d]: https://docs.nvidia.com/nsight-graphics/UserGuide/index.html#subchannel-switch-overlay
15:33 mohamexiety[d]: > On NVIDIA Blackwell Architecture GPUs and newer, subchannel switches do not occur between 3D and compute workloads. The overlay is therefore unavailable for those architectures.
15:33 mohamexiety[d]: amazing documentation really. not mentioned in the tuning guide or the whitepaper. instead it's deep in the nsight docs
15:34 mhenning[d]: Yeah, that's a good catch, thanks for finding that
15:48 karolherbst[d]: ohh right.. somebody should wire up perf counters...
15:49 karolherbst[d]: "to make coop mat go fast, I need to know why slow, therefore..."
15:49 karolherbst[d]: but it's such a pain, I'd hope that nvidia could just give us all the counters they have and how to configure them
15:53 mohamexiety[d]: I looked into this and since we don’t really have any form of documentation for perf counters our best bet is nsight
15:53 mohamexiety[d]: And nsight needs openrm support
15:53 karolherbst[d]: correct
15:54 karolherbst[d]: I think samuel reverse engineered those with nsight on the blob on older gens
15:54 karolherbst[d]: like reading them out in the shader isn't an issue, we have docs on that
15:54 karolherbst[d]: just the counter configs itself is the issue
15:59 marysaka[d]: karolherbst[d]: hmm wait are those not actually configured via those PRI regs on FALCON03/FALCON04 methods?
16:00 marysaka[d]: that kind of ring a bell to me :aki_thonk:
16:00 marysaka[d]: I think nvgpu has some api explicitly for that to do it on the CPU side via their uapi and a general whitelist per gen
16:01 karolherbst[d]: yeah..
16:01 marysaka[d]: so that might be useful for some stuffs, maybe openrm have something similar to at least give us some of those
16:01 karolherbst[d]: we used SW to do it
16:01 karolherbst[d]: like
16:01 karolherbst[d]: how to set them up isn't an issue
16:01 karolherbst[d]: we all know that
16:02 marysaka[d]: we should really create issues on gitlab to track those ideas otherwise they will get forgotten in the chat logs again 😄
16:02 karolherbst[d]: heh
16:02 karolherbst[d]: I created an issue yesterday!
16:03 karolherbst[d]: but not for that
16:03 gfxstrand[d]: karolherbst[d]: Yes. Obviously to make AI go fast we need all the tooling and all the kernel fixes. We probably also need color compression, better UBOs, and less stalling.
16:03 marysaka[d]: color compression is important for buffer actually with PLC
16:04 marysaka[d]: so we need PLC too /s
16:04 gfxstrand[d]: Just like we needed OpenCL for ray-tracing. That totally wasn't out-of-scope at all. :bim_giggle:
16:04 karolherbst[d]: heh
16:04 marysaka[d]: (PLC = L2 compression)
16:04 karolherbst[d]: but yeah.. I get to the point where it's not obvious where perf is lost
16:05 mohamexiety[d]: karolherbst[d]: That’s good. It could be anywhere so you have to work on many things to find it :Tomfoolery:
16:05 karolherbst[d]: karolherbst[d]: but yeah, that gives me like 2% more perf 😄
16:05 karolherbst[d]: it eliminates 2 nops per loop iteration
16:29 cubanismo[d]: stupid question: Where does the drm-misc-fixes branch/tree live? I see dakr referencing it when applying various fix patches (One of which was mine), but I don't see them land anywhere in git.freedesktop.org/git/drm/drm-misc. The gitlab repo for nouveau in MAINTAINERS seems either abandoned or a mirror of drm-misc.
16:29 cubanismo[d]: Feel like I'm missing a link in the chain here.
16:32 chikuwad[d]: https://gitlab.freedesktop.org/drm/misc/kernel
16:32 chikuwad[d]: it's a branch
16:33 chikuwad[d]: as is drm-misc-next (against which new patches are to be submitted)
16:33 chikuwad[d]: or that's what airlie told me to write my patch against
16:43 gfxstrand[d]: Looks like the linear patch has hit -fixes but nothing's in -next yet
16:45 dakr: Yeah, we have to wait for the fix being backmerged into -next before applying the rest of the series.
17:00 cubanismo[d]: dakr Yeah, I understand that. I was just looking at the wrong drm-misc tree it seems (git.freedesktop.org/git/drm/drm-misc instead of gitlab.freedesktop.org/drm/misc/kernel)
17:01 dakr: Yeah, this has been moved a while ago.
17:02 cubanismo[d]: Yeah, that was my main concern. I'm essentially working off the wrong tree.
17:03 cubanismo[d]: Thanks for clarifying
17:03 cubanismo[d]: Is this the de-facto nouveau development/maintenance tree as well then?
17:04 dakr: Yes, nouveau patches usually go through the drm-misc tree.
17:05 cubanismo[d]: Should I/someone send a patch to MAINTAINERS to point people looking for the nouveau tree there then? It's confusing when I load up the git repo MAINTAINERS points to and it doesn't have any changes after 2023.
17:05 cubanismo[d]: I understand the action is in Nova these days, but nouveau is what users have for now
17:06 dakr: No need to justify. :) I think it makes perfect sense to correct outdated information.
17:06 dakr: A patch is welcome!
17:06 cubanismo[d]: K, will do.
17:11 snowycoder[d]: Patches to nouveau drm should still be sent with `git send-email` or through GitLab Merge Requests?
17:18 cubanismo[d]: Merge requests don't appear to be enabled on the repo above.
17:21 mohamexiety[d]: gitlab MRs? in _my_ linux kernel?
17:23 mhenning[d]: snowycoder[d]: In practice I think it's git send-email. There was an attempt at accepting gitlab MRs for nouveau at some point but I think it fizzled out.
17:27 mohamexiety[d]: that would be a lot nicer tbh
17:28 mhenning[d]: gfxstrand[d]: Want to take a quick look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36971 before I merge? I don't think it really needs code review, I just want to make sure we're all on the same page.
17:29 mhenning[d]: mohamexiety[d]: Yeah, I'm not a fan of sending patches to the mailing list personally
17:29 mohamexiety[d]: yup, same
17:42 gfxstrand[d]: mhenning[d]: Assuming it does what it says on the tin, yes. 🚢 it!
17:53 cubanismo[d]: Yeah, gitlab/github are the only web-based review workflows I can tolerate. I'd be happy with either of those personally. Just so long as it's something indexable by search engines (i.e., if the kernel moves to a review workflow based on discord, sorry, but I won't be contributing my two patches per decade anymore).
18:04 karolherbst[d]: yeah... at some point I had plans to make it all gitlab MR based, but... it's quite a lot of work to get it working so it fits into the workflow..
18:20 ermine1716[d]: Am i the only one who is fine with both mrs and mailing lists
19:30 gfxstrand[d]: If someone wants to take over https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31648, it's a pretty easy little compiler project
19:31 gfxstrand[d]: The only thing left is hooking up the shared lowering
19:33 chikuwad[d]: I'll give it a shot :meowsalute:
19:33 chikuwad[d]: was looking for things to do yesterday, eventually went back to ext_debug_marker
20:26 phomes_[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35507 is a small one waiting for review too. Not urgent though
22:08 gfxstrand[d]: mhenning[d]: If you wanted to review or at least ack https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914, that'd be good. It fixes one of the Horizon games.
22:09 gfxstrand[d]: And it's got a NIR change so I shouldn't just smash it in.
22:22 gfxstrand[d]: gfxstrand[d]: mohamexiety[d] now has a seemingly working branch for this. MR tomorrow, probably.
22:24 gfxstrand[d]: Building a kernel with the compression patches now. I'm gonna throw it at Blackwell to make sure nothing's toast there.
22:27 mohamexiety[d]: gfxstrand[d]: on Ada, the current state is everything passes except this:
22:27 mohamexiety[d]: dEQP-VK.synchronization2.cross_instance.dedicated.*
22:27 mohamexiety[d]: so blackwell should be similar if nothing is toasted
22:28 gfxstrand[d]: Have you tested on Blackwell?
22:28 mohamexiety[d]: nope
22:28 mohamexiety[d]: was focused on Ada
22:30 gfxstrand[d]: Should I start with a game? Am I that courageous?
22:31 marysaka[d]: it works on Ada for games but weston was very broken last time I checked
22:31 marysaka[d]: *should grab some blackwell gpu*
22:32 gfxstrand[d]: Well so far Steam doesn't seem to want to start
22:34 gfxstrand[d]: Looks like I need a 32-bit build
22:34 HdkR: Those times when Chrome is the biggest enemy.
22:34 gfxstrand[d]: It's almost like building an app, some of which is 32-bit, some of which is 64-bit, and all of which is chrome wasn't a great idea
22:35 HdkR: :D
22:35 HdkR: The legacy!
22:36 gfxstrand[d]: Computers were a mistake
22:36 airlied[d]: I feel Valve leave the 32-bit pieces of steam in there just to make people ship 32-bit drivers
22:37 gfxstrand[d]: Yeah, probably
22:37 marysaka[d]: Pretty sure there are quite a bit of 32-bit games
22:38 airlied[d]: once Talos went 64-bit I never looked back 😛
22:39 marysaka[d]: *looks at touhou 20 being 32-bit in 2025*
22:39 mohamexiety[d]: gfxstrand[d]: earlier today I was struggling with 32 bit cross compiling right?
22:40 mohamexiety[d]: it was because of avatar
22:40 mohamexiety[d]: as part of their DRM they ship a launcher
22:40 mohamexiety[d]: and that launcher is 32 bit
22:40 mohamexiety[d]: in 2025
22:40 mohamexiety[d]: (well the game is from 2024 but yeah)
22:49 mohamexiety[d]: gfxstrand[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37016 will go to bed now but posting this in case someone wants to take a look in case I missed anything (beyond locking it to Ampere+)
22:53 gfxstrand[d]: Steam does not seem to like this compression branch
22:54 gfxstrand[d]: But it could also be user error
22:54 gfxstrand[d]: Nope. Up and going now. :frog_party:
22:55 mohamexiety[d]: yayy
23:05 airlied[d]: oh this might not be insane races, it might be the irq storm protection code just being bullshit
23:10 airlied[d]: okay now to figure out how to fix that
23:29 phomes_[d]: mohamexiety[d]: a few fps extra in a few games. Everything seems to work fine
23:29 mohamexiety[d]: nice, thanks! ❤️
23:29 mohamexiety[d]: yeah it wont be anywhere major in games as they dont really spam compute _that_ much but still
23:30 mohamexiety[d]: it felt like a relatively low hanging fruit to tackle
23:40 mohamexiety[d]: It might unironically help karolherbst[d]’s matmul fun though now that I think about it so that could be worthwhile to test
23:51 mhenning[d]: gfxstrand[d]: I'll take a look tomorrow.