00:38airlied: woot tensor layout tests finally pass
10:26eric_engestrom: zzyiwei: yeah, I have some medical stuff planned and I haven't figured out when I'll be able to make releases, that's why it's not in the calendar yet
10:30eric_engestrom: I'll be out starting on oct 22 (maybe 21), and I'm not sure how long until I'm able to do things; medical leave is 6 weeks, but given that release work is not a full day of work and I do it from home (and if it's too painful to sit I can just stop and resume another day), I think I'll try to do it after 2 weeks and see how it goes
10:32eric_engestrom: the branchpoint would normally be on the week before that, but I think there's no point branching and then just leaving it sitting for weeks, diverging for no reason, so I think I'll put the branchpoint 2 weeks after my surgery, ie. on Nov 5
10:32eric_engestrom: thanks for prompting me to actually think it through 😊
10:32eric_engestrom: I'll update the calendar with that
11:12eric_engestrom: (changed my mind, I'll try to do it after one week, meaning 25.3 is pushed back by only 2 weeks)
11:12eric_engestrom: release calendar update here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37435
11:12eric_engestrom: distro packagers, let me know (on the MR) if pushing back 25.3 causes issues for you :)
11:12K900: This will be cutting it very close for NixOS 25.11 end of November
11:12K900: But I think we can make it work
11:15eric_engestrom: K900: thanks for letting me know :)
11:15eric_engestrom:should write down the schedules of all the distros
11:31karolherbst: end of October? Nice, guess I need to focus on landing more stuff until then :D
14:27dcbaker: eric_engestrom: Would you like help with the next release?
14:45eric_engestrom: talked in private, dcbaker will take the 25.3 release cycle, and it will happen at the normal schedule (branchpoint on oct 15) :)
15:53zzyiwei: eric_engestrom: Oh my...no worries about the next release. At least there's no urgency from our end. Please take good care of yourself and I hope you have a good recovery from your surgery!
15:55eric_engestrom: zzyiwei: thank you ❤️ fyi the 25.3 release cycle will be managed by dcbaker so it will not be delayed and not require anything from me :)
18:12simon-perretta-img: Aside from the opcode count increasing and the fact that most drivers don't seem to need/use vec widths other than 2..5/8/16 in NIR, are there any other obvious reasons I might be overlooking for not having more?
18:18dj-death: simon-perretta-img: I think ALU instruction size
18:18dj-death: simon-perretta-img: I tried to propose vec32 at some point
18:18dj-death: alyssa: you might remember why
18:20simon-perretta-img: Ah... yeah that makes sense for upping the limit
18:22alyssa: swizzle[]
18:22alyssa: swizzle[] is on every nir_alu_src and sized to maximum vec length
18:22alyssa: if you double the vec length every ALU instruction bloats accordingly
18:23alyssa: going 16->32 makes every ffma grow by 48 bytes
18:23alyssa: if someone could make swizzle[] not worst-case allocated I wouldn't care
18:23alyssa: but so far no takers on that
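The 48-byte figure follows from one swizzle byte per component: going from 16 to 32 adds 16 bytes per source, and ffma has three sources. A minimal sketch of that arithmetic, assuming a simplified layout; `alu_src_vec16`, `alu_src_vec32` and `alu_growth_bytes` are hypothetical names for illustration, not Mesa's definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for an ALU source: a pointer-sized
 * handle followed by one swizzle byte per possible component. The real
 * nir_alu_src has a different first member; only the array matters here. */
struct alu_src_vec16 { void *src; uint8_t swizzle[16]; };
struct alu_src_vec32 { void *src; uint8_t swizzle[32]; };

/* Each source grows by exactly the 16 extra swizzle bytes. */
static_assert(sizeof(struct alu_src_vec32) - sizeof(struct alu_src_vec16) == 16,
              "one extra byte per added component");

/* Extra bytes for an ALU instruction with num_srcs sources when the
 * maximum vector length goes from 16 to 32. */
static size_t alu_growth_bytes(unsigned num_srcs)
{
   return num_srcs * (sizeof(struct alu_src_vec32) - sizeof(struct alu_src_vec16));
}
```

With three sources, `alu_growth_bytes(3)` gives the 48 bytes quoted for ffma.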
18:24alyssa: (Part of me would like to just remove swizzles from NIR but unfortunately that's a nonstarter.)
18:24simon-perretta-img: Fair
18:25simon-perretta-img: Although in which case, I guess defining some of the ops between 5,8,16 without upping the limit probably wouldn't be too controversial?
18:25alyssa: That'd probably be fine yeah
18:25simon-perretta-img: There are some pco-specific reasons for wanting to, but also I imagine nir_opt_load_store_vectorize would benefit from it
18:25alyssa: If you need vec7 or something, go for it
18:26alyssa: but no vec32 unless it comes with a patch to shrink nir_alu_src
18:26simon-perretta-img: Haha noted
18:27simon-perretta-img:sweeps Rogue DMA ops that can output vec1024 under the rug
18:27alyssa: Yeah no
18:28alyssa: Find another solution, vec1024 is not your solution.
18:29simon-perretta-img: Dw I wasn't being serious lol, the most that would ever realistically be needed is a vec18 or something
18:30alyssa: dj-death: If we still need vec32, lmk. now that I'm team bleu I might be more motivated to try to wrestle swizzle[]
18:30alyssa: ("Team Bleu? Is that French?" "Non it's just a cheesy joke")
18:31HdkR: Variable length NIR instructions weeee
18:31alyssa: HdkR: It's already var length
18:31HdkR: Dang
18:31alyssa: it's just an array-of-struct-of-arrays where the outer array is var and hence the inner cannot be
18:32HdkR: ah
18:32jenatali: Make swizzle into a pointer which can point after the sources to a second variable-length chunk of memory? Then null can be "no swizzle"
18:33jenatali: Though that makes it hard to convert from no swizzle to have a swizzle
18:33jenatali: Eh, can just reallocate the instruction with swizzles, not that bad probably
18:33alyssa: That's going to slow down the common cases with e.g. simple non-identity swizzles
18:34alyssa: lots of pointer chasing
18:34alyssa: so would need minimally like a tagged pointer thing
18:34jenatali: If it's to the same block of memory though it should be cache-friendly
18:34alyssa: "inline small swizzle or pointer to large swizzle"
18:34jenatali: Oh, for vec sizes < 8 yeah it could just be inline
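One way to read "inline small swizzle or pointer to large swizzle" is a union discriminated by the component count, which is a simpler variant than a bit-tagged pointer. A rough sketch under that assumption; `compact_swizzle`, `INLINE_COMPS`, `swizzle_make` and `swizzle_get` are hypothetical names, and nothing like this exists in NIR today:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INLINE_COMPS 8 /* hypothetical cutoff for storing the swizzle inline */

/* Hypothetical compact swizzle: small swizzles live inline, large ones
 * live in separate storage reached through a pointer. */
struct compact_swizzle {
   uint8_t num_comps; /* acts as the tag: <= INLINE_COMPS means inline */
   union {
      uint8_t inl[INLINE_COMPS];
      const uint8_t *ext;
   } u;
};

static struct compact_swizzle swizzle_make(const uint8_t *comps, uint8_t n)
{
   struct compact_swizzle s = { .num_comps = n };
   if (n <= INLINE_COMPS)
      memcpy(s.u.inl, comps, n);
   else
      s.u.ext = comps; /* caller must keep this storage alive */
   return s;
}

static uint8_t swizzle_get(const struct compact_swizzle *s, unsigned i)
{
   assert(i < s->num_comps);
   /* Only the large case pays for the extra pointer dereference. */
   return s->num_comps <= INLINE_COMPS ? s->u.inl[i] : s->u.ext[i];
}
```

In this sketch the common small case never chases a pointer, which is the concern raised above; the cost is a branch on every access.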
18:35dj-death: alyssa: it would help for load uniform blocks
18:35dj-death: alyssa: right now we're limited at 16 * 4 = 64B
18:35alyssa: dj-death: Is that enough of a reason to burn a month on this? :(
18:35dj-death: alyssa: but we could load 128 with 32 items
18:35dj-death: alyssa: really a month? :)
18:35dj-death: alyssa: I thought that would be an afternoon for you
18:35alyssa: Probably.
18:35alyssa: No
18:36alyssa: would need replumbing swizzles in 20 backends
18:36dj-death: arg
18:36jenatali: Does it really need to return an immediately ALU-able vec32 or could it return something else opaque that has smaller vectors extracted from it?
18:37dj-death: well just do ours, throw it at perf CI and if the numbers look good...
18:37dj-death: jenatali: could be opaque
18:37alyssa: I like opaque tbh
18:37dj-death: but then we don't get the vectorizer
18:37jenatali: Hm, true
18:37alyssa: We could standardize Big Vectors in nir, though
18:38alyssa: common intrinsics to swizzle/extract/build? them
18:38jenatali: Seems reasonable to me
18:38alyssa: makes vectorizer messier but then simplifies everything else
18:39alyssa: and then ideally we would make spirv-to-nir handle vec16 as big vec
18:39jenatali: At that point I wonder if you could drop ALU vec size back down? Probably not though
18:39alyssa: so we can shrink swizzle[] back down to 8
18:39alyssa: jenatali: Should be able to. No Mesa driver does vec16 ALU in hardware, it's just there because of OpenCL
18:40jenatali: Oh cool, I'd assumed someone could do some 8x16 or something
18:40alyssa: Midgard is capable of doing vec16 alu but it's not wired up and I'm nak'ing anyone who tries.
18:40simon-perretta-img: I imagine that opaque approach could also be nice for being able to do things like indexable regs without having to use scratch mem
18:40jenatali: :P
18:40alyssa: as the original author of an unmaintained compiler, I retain NAK rights
18:40alyssa: :p
18:40alyssa: simon-perretta-img: we already have that lol
18:40alyssa: load_reg/store_reg
18:41anholt: plug: for anyone looking at indexable regs, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245 may help you make better choices about what to leave as regs.
18:42simon-perretta-img: Huh, TIL
18:43alyssa: anholt: nir patch is a-b me, not sure you're going to get a good review on that
18:43jenatali: alyssa: One random thought - DXIL is actually adding arbitrary length vectors, so if there were intrinsics to operate on NIR "big vec" to do basic things like add/sub/mul without swizzles, I could translate those and get more compact shaders
18:43alyssa: but feel free to merge
18:43alyssa: jenatali: Yeah there's really two ways we could do this
18:44anholt: alyssa: I've been thinking about taking a stab at tuning someone else's driver about vars-to-scratch to increase interest. but, also, I need to sort out this copy prop vars masking because that's so much bigger of a deal and affects the same area.
18:44alyssa: jenatali: One is "big vec" is an opaque handle in NIR (like how nir reg works) and you need to do an extract/swizzle intrinsic to pass it to ALU
18:45alyssa: The other is having actual vec128 (or whatever) in NIR, and you can pass it into nir_alu_src[], but it's automatically an identity swizzle (and you need a special intrinsic to swizzle non-identity)
18:45alyssa: Both of these are ugly but could be made to work
18:45jenatali: Yeah I like that latter one, that seems useful
18:47alyssa: Both seem like a /lot/ of work to fix up all the producers. Idk
18:48alyssa: The latter has the benefit that it can at least be done incrementally with a nir_validate check
18:48alyssa: (and gives a path to shrinking all the way down to vec4 swizzles if that ends up faster.)
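The incremental check for the second approach could plausibly look like the rule below: narrow sources may still swizzle freely, wide sources must be identity. This is only one reading of the proposal; `SWIZZLE_LIMIT`, `swizzle_is_identity` and `validate_src_swizzle` are hypothetical names and not part of nir_validate:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SWIZZLE_LIMIT 4 /* hypothetical: widest source allowed a free-form swizzle */

/* True if component i reads from lane i for every component. */
static bool swizzle_is_identity(const uint8_t *swizzle, unsigned num_comps)
{
   assert(num_comps > 0);
   for (unsigned i = 0; i < num_comps; i++) {
      if (swizzle[i] != i)
         return false;
   }
   return true;
}

/* Hypothetical validation rule: sources wider than SWIZZLE_LIMIT must use
 * an identity swizzle, so their swizzle storage could later be dropped. */
static bool validate_src_swizzle(const uint8_t *swizzle, unsigned num_comps)
{
   return num_comps <= SWIZZLE_LIMIT || swizzle_is_identity(swizzle, num_comps);
}
```

Once every producer passes a check of this kind, the swizzle array could shrink to the limit without changing behaviour, which seems to be the path described above.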
18:49alyssa: anholt: Yeah that's fair
18:50simon-perretta-img: Is the reasoning behind swizzle being integrated into nir_alu_src rather than e.g. a separate swizzle op, to save a level of indirection?
18:50alyssa: historical reasons basically
18:50simon-perretta-img: Mm, fair
18:51alyssa: but see also "we have 20 backends which means any nontrivial change to core NIR data structures costs 10s of 1000s of dollars of engineering time"
18:53alyssa: what I like about the latter approach is that only NIR producers really get changed
18:54alyssa: and only producers that mess with swizzle[] manually, pure nir_builder stuff doesn't change
19:19ukleinek: is it better to report nouveau bugs on the ML or in the bug tracker?
19:38ukleinek: Context is https://lore.kernel.org/dri-devel/E79B534D-6630-4AF3-950D-2391CDABCE16.1@smtp-inbound1.duck.com/
19:45gfxstrand: How big is an image descriptor on AMD?
19:46pendingchaos: images are 32 bytes, samplers and buffers are 16 bytes
19:46pendingchaos: msaa images with fmask are two image descriptors
19:47gfxstrand: Okay, that explains the sampled image descriptor size
20:37Kayden: what happened here?
20:37Kayden: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37440
20:38Kayden: windows-msvc: fatal error C1083: Cannot open include file: 'util/format/u_format_gen.h': No such file or directory
20:38gfxstrand: Weird
20:38Kayden: no changes related to that...
20:38zmike: I've seen that before
20:38gfxstrand: Maybe a meson dep is missing and someone lost the race?
20:38zmike: it's a build system race condition
20:38zmike: there's a lot of them
20:38Kayden: ah
20:39Kayden: I'll try it again
20:42anholt: Kayden: landed a fix (I hope) https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316
20:43Kayden: anholt: thank you!
20:44anholt: tbh I wish we could convince meson to put deps on mesautil finishing before moving on to anything else. u_format generation dep correctness is just sisyphean.
20:45gfxstrand: Yeah, generated headers are just cursed from a build system POV.
20:45gfxstrand: We have some generated headers for Vulkan as well and eesh.
22:19anholt: ooh, https://reviews.llvm.org/D83032 looks like it could automatically detect missing C/C++ header deps for the builds we do in CI.
22:19anholt: someone should do something about that.
22:20anholt: (gives me a distressing number of plausible-looking failures on my build)
22:27mattst88: I haven't tested it myself, but another gentoo dev has a script that finds missing deps in build.ninja files
22:27mattst88: https://codeberg.org/eli-schwartz/eschwartz-dev-scripts/src/branch/master/install-qa-check.d/50ninja-missingdeps
22:27mattst88: it's written as a QA-check for portage, but should be trivially modifiable to standalone
22:28mattst88: he said he's wanted to add it or something like it to Mesa's CI, but hasn't had a chance yet
22:39elibrokeit: eric_engestrom previously suggested I add that script to the mesa CI, yeah. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29115#note_2472998
22:39elibrokeit: I just never found the time to do it yet. :/