04:52Lynne: airlied: I just rebooted, and nouveau works now
04:52Lynne: guess it didn't like hot modprobes
04:56Lynne: it's using ampere's gsp binary from 535, so the binary version must match with the headers
05:19Lynne: got a triangle to render! struggled a bit to find any vulkan 1.0 clients at all
05:21Lynne: vkquake runs too!
05:33dj-death: Looks like a530 runners are running into unexpected passes
05:44HdkR: 4
09:08kode54: I still wonder what the heck is up with the Xe driver and compute causing gpu crashes and resets
09:08kode54: Is there any debug logging for the crash dump interface to pull the exact job queue that caused the crash? I saw something about that
10:07DavidHeidelberg[m]: dj-death: that's nice, link please. Is it a flake or a genuine unexpected pass?
10:08DavidHeidelberg[m]: dj-death: also sorry about mess with these, I added like four batches of flakes, but some are still showing up, it's not the driver with most stable results
10:08dj-death: I put the link in the MR
10:09dj-death: seems to be flakes mostly
10:10DavidHeidelberg[m]: I see the MR, thanks!
10:11dj-death: np
10:12DavidHeidelberg[m]: dj-death: at some point (next flakes batch) I think I'll wild card those which are just adding testcases in one group each time
10:12DavidHeidelberg[m]: *wildcard
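
A rough sketch of the wildcarding idea above: flake entries that keep accumulating inside one test group are collapsed into a single regex for that group. This assumes the flakes list is a list of regexes matched against full test names; the helper name `wildcard_flakes`, the `min_cases` threshold, and the test names are all made up for illustration and are not the actual CI tooling.

```python
import re
from collections import defaultdict


def wildcard_flakes(flakes, min_cases=2):
    """Collapse flakes that share a dotted group prefix into one regex per group."""
    groups = defaultdict(list)
    for name in flakes:
        # Everything before the last "." is treated as the test group.
        prefix, _, _case = name.rpartition(".")
        groups[prefix].append(name)

    patterns = []
    for prefix, names in sorted(groups.items()):
        if prefix and len(names) >= min_cases:
            # One wildcard entry covers every case in the group.
            patterns.append(re.escape(prefix) + r"\..*")
        else:
            # Lone flakes stay as exact (escaped) entries.
            patterns.extend(re.escape(n) for n in sorted(names))
    return patterns


# Hypothetical test names, for illustration only.
flakes = [
    "dEQP-GLES31.functional.foo.case_a",
    "dEQP-GLES31.functional.foo.case_b",
    "dEQP-GLES31.functional.bar.only_case",
]
for pattern in wildcard_flakes(flakes):
    print(pattern)
```

The trade-off is that a wildcard also hides genuine regressions in that group, so a threshold like `min_cases` is one way to limit it to groups that are already known to be noisy.
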
13:19zmike: mareko: do you have a branch with your linker wip somewhere? I want to see where it fits into the shader lifetime
13:35mripard: daniels: \o/
13:35mripard: congrats
13:37daniels: mripard: thanks!
13:38mripard: (and congrats to everyone involved as well, obviously :))
13:43javierm: daniels: wow, huge news! Congrats to you and all folks involved :)
13:48MrCooper: what's this about?
13:48zmike: https://www.collabora.com/news-and-blog/news-and-events/a-helping-arm-for-panfrost.html I assume
13:49MrCooper: thanks
13:51daniels: javierm: thanks, but yeah, better directed at the ones doing the work :)
13:58MrCooper: nice indeed
14:29mareko: zmike: after varying linking and before uniform linking in the NIR linker of GLSL
14:30zmike: mareko: does running this yield store_output or is it expected that drivers are calling lower_io
14:32mareko: zmike: the linker calls nir_lower_io to lower store_deref to store_output, st/mesa and drivers only get store_output
14:32zmike: ahhh ok
14:32zmike: thanks
14:32mareko: zmike: the varying optimizer requires lowered IO, so the linker has to lower it
14:33zmike: right, I was expecting something like this but I wanted confirmation
14:33zmike: how is bindless texture io handled? e.g., passing nir_var_image in input/output
14:34mareko: bindless textures use 64-bit handles
14:34zmike: kinda?
14:35mareko: exactly
14:35zmike: on receiving them they're image types
14:35zmike: even if really they're 64bit ints
14:39mareko: zmike: they are 64-bit numbers in NIR, and radeonsi uses the numbers to load descriptors
14:40mareko: we literally replace the 64-bit handle src with a vec4 or vec8 handle src containing the AMD image/sample/buffer descriptor
14:41mareko: *sampler
14:43zmike: I was talking about the variables, but I suppose they don't matter for explicit io
14:43mareko: bindless don't have variables
14:44mareko: also GLSL has this tricky feature that it can typecast a bindless handle to a variable and vice versa
14:44mareko: we don't implement the typecasting in radeonsi
14:45mareko: or maybe it can only do the typecast from a variable to a bindless handle, not vice versa
15:04zmike: yeah there's piglit tests for that
15:05zmike: which is what I was referencing
15:05zmike: my memory is fuzzy but I expect I'll re-remember most of it in a day or two since I'm attempting an explicit io conversation
15:05zmike: conversion
15:16mareko: radeonsi still uses derefs for textures and images
15:29alyssa: hey NIR people, get excited :3
15:29alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9396
15:33jenatali: alyssa: I'm excited
15:33jenatali: zmike: :O good luck
15:34alyssa: jenatali: TBF nir_builder for big shaders will still suck
15:35alyssa: I really want a common way to write internal shaders as GLSL or OpenCL C
15:35zmike: jenatali: I was optimistic at the beginning and tried to start with xfb
15:35jenatali: alyssa: Embrace C++, have nir SSA defs as classes with proper operators
15:36jenatali: That's what WARP does and it's honestly okay to write internal shaders
15:36alyssa: jenatali: TBH I still would rather have real GLSL or CL C
15:36alyssa: There are a bunch of different approaches to this already used in tree
15:37alyssa: I would like to pick one of them and promote it to common code
15:37alyssa: - qoShaderModuleCreateInfoGLSL (GLSL, single source, aco tests, from Crucible, depends on glslang)
15:37alyssa: - float64.glsl and friends (GLSL, separate source, uses Mesa's GLSL compiler)
15:37konstantin_: GLSL is the one that doesn't require any driver changes
15:38alyssa: - anv's bvh builder (OpenCL C, separate source, uses the OpenCL compiling pipeline at build time, runs their backend compiler at build time, scary)
15:39konstantin_: D:
15:39jenatali: Yeah that's fine for writing full shaders that don't need parameterization
15:39jenatali: Being able to sprinkle builder conditional logic is quite nice
15:39alyssa: - radv's bvh building (GLSL, separate source, glslang at build-time?)
15:40alyssa: jenatali: Ideally spec constants (or, ugh, the preprocessor) could be used for that
15:40jenatali: That still... kinda sucks
15:40alyssa: I mean.. maybe?
15:40alyssa: IDK
15:41alyssa: nir_builder excels at writing lowering passes, munging shaders, inserting little bits of code
15:41konstantin_: RADV BVH building currently uses the preprocessor, I want to change it to spec constants in the future
15:41alyssa: It's considerably less good at building entire shaders out of thin air
15:41alyssa: which is... fine, honestly?
15:42alyssa: I don't want a better nir_builder as much as I want to just write GLSL or CL C when I know I have a large (100s of lines) kernel that doesn't do anything an app wouldn't do
15:43alyssa: the operator overloading sugar C++ provides would be nice for writing lowering passes I admit
15:43jenatali: Yeah I was thinking of lowering
15:43konstantin_: How would that work with a typeless IR?
15:43jenatali: Or mini tini meta shaders that should be parameterized
15:43alyssa: yeah. there are distinct use cases here
15:43jenatali: konstantin_: Add type info just for determining what operators do
15:43alyssa: The cases I have in mind are things like BVH building kernels
15:44alyssa: where you fundamentally work on a different level of abstraction
15:44alyssa: you don't care about the hw instructions or the SSA form or the phi nodes
15:44jenatali: E.g. ssa_uint vs ssa_int vs ssa_float where you can cast freely between them
15:44alyssa: or, ugh, the loops
15:44jenatali: alyssa: Yeah makes sense
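
A toy sketch of the typed-wrapper idea discussed above, written in Python rather than C++ to keep it short. The IR stays typeless (an instruction is just an opcode plus source indices); the wrapper classes carry a static type only to pick which opcode an operator emits, and casting between wrappers is free. The class names, the `Builder`, and the opcode strings are illustrative labels in the style of NIR opcodes, not Mesa's or WARP's actual API.

```python
class Builder:
    """Toy instruction list standing in for a typeless IR."""

    def __init__(self):
        self.instrs = []

    def emit(self, op, *srcs):
        self.instrs.append((op,) + srcs)
        return len(self.instrs) - 1  # the index doubles as the SSA value id


class SsaValue:
    ops = {}  # operator name -> opcode, filled in by subclasses

    def __init__(self, b, index):
        self.b = b
        self.index = index

    def _binop(self, name, other):
        if type(other) is not type(self):
            raise TypeError("mixed-type operands need an explicit cast")
        new_index = self.b.emit(self.ops[name], self.index, other.index)
        return type(self)(self.b, new_index)

    def __add__(self, other):
        return self._binop("add", other)

    def __truediv__(self, other):
        return self._binop("div", other)

    def cast(self, cls):
        # Reinterpret the same SSA value under another type; emits nothing.
        return cls(self.b, self.index)


class SsaInt(SsaValue):
    ops = {"add": "iadd", "div": "idiv"}


class SsaUint(SsaValue):
    ops = {"add": "iadd", "div": "udiv"}


class SsaFloat(SsaValue):
    ops = {"add": "fadd", "div": "fdiv"}


b = Builder()
x = SsaInt(b, b.emit("load_const", -8))
y = SsaInt(b, b.emit("load_const", 2))
signed_q = x / y                                # emits idiv
unsigned_q = x.cast(SsaUint) / y.cast(SsaUint)  # emits udiv on the same values
```

The point of the sketch is that the type lives only in the builder-side wrapper: the emitted instructions reference plain value indices, so nothing about the IR itself has to become typed.
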
15:44alyssa: NV_device_generated_commands is an excellent candidate for this
15:44alyssa: although that's also kinda lowery
15:45alyssa: ideally you could, like.. build a NIR "library" and then you have an easy lowering pass that just inlines the functions you care about
15:45alyssa: I think the float64 lowering works like this already
15:45jenatali: Yeah
15:45jenatali: That's effectively what libclc is too
15:46alyssa: right
15:46alyssa: So the operative questions are:
15:46alyssa: - OpenCL C or GLSL?
15:46alyssa: - If GLSL, Mesa's compiler or GLSLang?
15:46alyssa: - Single-source or separate files?
15:47jenatali: Seems the answers can depend on the use case
15:47alyssa: - Parametrizable? If so, how? Preprocessor or spec constants or something else?
15:47alyssa: jenatali: Yeah, for sure. Which is why there are 4 different approaches in-tree already
15:48jenatali: If I had infinite time I would honestly be interested in trying the c++ approach
15:48alyssa: I don't hate it. But it doesn't solve the same problems
15:49jenatali: Agreed
15:49alyssa: Actually -- how about I list the use cases that I have in mind for this specifically
15:49alyssa: - BVH building (duh)
15:49alyssa: - NV_dgc
15:49alyssa: - software tessellator (someday..?)
15:50alyssa: - Maybe vk_meta
15:53alyssa: One of the concrete problems I see is that... it would be suboptimal to build-time depend on glslang for core parts of Mesa
15:53alyssa: however Mesa's own glsl compiler may not be a suitable replacement for Vulkan use cases
15:54alyssa: I assume it doesn't recognize Vulkan-flavour GLSL because why would it? stuff like push constants
15:54alyssa: I'm not too sure how much of a problem this is in practice
15:55alyssa: Is a build-time (not runtime) dependency on glslang a real problem? Probably not? I mean, RADV already has that for raytracing so hopefully distros are used to it by now
15:55alyssa: I don't think there are any portability issues there
15:55alyssa: I don't think anyone loves glslang but also I don't think anyone loves the mesa glsl compiler either
15:55jenatali: Fwiw on Windows it would be a pain to depend on that
15:56jenatali: So for common code I'd prefer to avoid it if possible
15:57alyssa: Alright
15:57alyssa: That rules out vk_meta use
15:58alyssa: There's a middle-ground option of having some common solution for driver use
15:58alyssa: So as long as you don't build any drivers that use it you don't need the dep
15:58konstantin_: Or we repurpose the GLSL compiler and store serialized NIR in the driver binary
15:58alyssa: which should alleviate those concerns
15:59alyssa: konstantin_: the problem with using Mesa's GLSL compiler is that Vulkan GLSL is a bit different and IDK how bad the impedance mismatch is going to be
15:59alyssa: no push constants, no descriptor sets, etc..
15:59konstantin_: Yeah, missing BDA support
15:59alyssa: if you declare an SSBO in GLSL as recognized by Mesa's GLSL compiler, it's going to give you load_ssbo intrinsics which is .. not what your Vulkan driver wants
16:00alyssa: though maybe it's fine? maybe?
16:05mattst88: anholt: are the groups of tests that deqp-runner makes deterministic?
16:06alyssa: konstantin_: jenatali: I think maybe not supporting Vulkan GLSL might be ok
16:07alyssa: In particular, it might be reasonable to write GL-style GLSL for the meta shaders
16:07jenatali: Seems okay to me
16:07alyssa: and then have an optional lowering pass to turn the resulting GL NIR into Vulkan NIR
16:07alyssa: with some simple mapping
16:07anholt: mattst88: yes, for same input caselists / fraction / deqp-runner version, etc.
16:08alyssa: (uniforms become push constants, each gl resource type gets a corresponding descriptor set with bindings matching the gl-side index, etc)
16:08alyssa: doesn't help with BDA, though
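
A minimal sketch of the "simple mapping" described above: default-block uniforms go to push constants, and every other GL resource type gets its own descriptor set, with the binding equal to the GL-side index. The set numbering, the 4-byte slot size for uniforms, and the function name `remap_gl_resource` are assumptions for illustration; no such pass is described in the discussion beyond the outline.

```python
# Hypothetical numbering: one descriptor set per GL resource type.
SET_FOR_RESOURCE = {"ubo": 0, "ssbo": 1, "sampler": 2, "image": 3}

# Assumed size of one default-block uniform slot, in bytes.
UNIFORM_SLOT_BYTES = 4


def remap_gl_resource(kind, gl_index):
    """Map a GL-style resource reference to a Vulkan-style location."""
    if kind == "uniform":
        # Default-block uniforms become push constants.
        return {"storage": "push_constant", "offset": gl_index * UNIFORM_SLOT_BYTES}
    # Everything else keeps its GL index as the binding within a per-type set.
    return {
        "storage": "descriptor",
        "set": SET_FOR_RESOURCE[kind],
        "binding": gl_index,
    }


print(remap_gl_resource("ssbo", 3))
print(remap_gl_resource("uniform", 2))
```

As noted in the discussion, a table like this has nothing to say about buffer device addresses, which is why BDA needs frontend work rather than a remapping.
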
16:08konstantin_: Well at least for BVH building, we need BDA. So there is definitely some frontend work that needs to be done
16:08alyssa: yeah, BDA is the big elephant in the room here
16:08alyssa:whinneys
16:10mattst88: anholt: awesome, thanks
16:10alyssa: If not for jenatali's windows comment I'd be tempted to promote RADV's BVH building infrastructure to common and call it a day
16:10konstantin_: We don't have to support spec compliant Vulkan GLSL. There is potential to hack C-style pointer syntax into GLSL instead of the mess we have with build_helpers.h
16:10jenatali: alyssa: I'm just one voice, I can be overridden
16:11anholt: making glsl_compiler input some of vulkan sounds pretty reasonable to me, if we have existing syntax to follow.
16:12jenatali: But also our driver would never use a common BVH builder
16:12jenatali: Since that's the D3D driver's responsibility
16:13alyssa: jenatali: Fair enough. But not being able to use this for common code might be annoying. IDK. Might be ok
16:13konstantin_: dozen is not the only driver that can be built on Windows, right?
16:14pixelcluster: radv can be built on windows too
16:14pixelcluster: (although it's not very useful at the moment)
16:14alyssa: anholt: it'd be reasonable if the glsl compiler weren't terrifying (-:
16:15konstantin_: I smell an interesting/cursed project for the next month
16:15pixelcluster: the 3rd one?
16:15konstantin_: 3rd?
16:15pixelcluster: I mean I don't have a copy of your todo list
16:15pixelcluster: maybe it's the second
16:15pixelcluster: there's one I know of for sure :P
16:16alyssa: konstantin_: my next interesting/cursed project is FEX-related, alas :p
16:16konstantin_: BH, you're kind of right
16:16konstantin_: *TBH
16:17jenatali: For what it's worth, my concerns aren't really about myself, but about other contributors. I really like to be able to point our customers at the source and let them self-help. The setup instructions for GL are a bit of a pain (download flex/bison/pkgconfig/mako/meson). Instructions for CL are god-awful (build LLVM/Clang/LLVMSPIRVLib/libclc)
16:18jenatali: Vulkan is pretty easy right now, just meson/mako. I'd be mildly frustrated if that had to get more complicated :)
16:18anholt: alyssa: it's bad, but it's also so much smaller than it used to be.
16:19alyssa: valid
16:19alyssa: i don't think i've written any parsers since middle school
16:20alyssa: ("Weird flex?")
16:20anholt: we use standard flex. ;)
16:20psykose: what about lexers
16:20alyssa: anholt: =D
16:20alyssa: I walked right into that one damn
16:24zf: <Lynne> got a triangle to render! struggled a bit to find any vulkan 1.0 clients at all < fwiw, Wine has very lax vulkan requirements, if you're looking for test cases :-)
16:25zf: and we certainly have a lot of test cases :D
16:26zf: vkd3d as well, as a splinter project, and that might be a little friendlier to development
17:50austriancoder: does anybody know if there is a deqp that tests the maximum number of varyings (GL_MAX_VARYING_FLOATS)? Something like piglit's glsl-max-varyings?
18:02everfree: trying to update an out of tree module to 6.4. It's trying to call mutex_lock_interruptible on a dma_buf->lock, but its erroring because dma_buf doesn't have a "lock" member. Did something get moved around here in the kernel
18:03everfree: im not 100% sure it ever compiled against a mainline kernel fwiw, i just found this thing, so if dma_buf.lock was like a vendor kernel extension w/e i can deal i guess
18:03everfree: but the rest of it doesnt seem too vendor-kernely
18:11everfree: huh, friend found a kernel commit 28743e25fa1c867675bd8ff976eb92d4251f13a1 that suggests i can just... delete the lock/unlock, that its not used.
18:23Company: what's faster in GLSL: x < min (min (a, b), min (c, d)) or any (lessThan (vec4 (x, x, x, x), vec4 (a, b, c, d)))
18:24Company: and how interesting is it for performance to care about such things these days?
18:26anholt: Company: if it helps you guess, all hardware these days breaks vectors down into scalars. so that probably compiles down similarly.
18:28Company: so no need to pack stuff into vectors anymore
18:28pendingchaos: the former is 3 min and 1 comparison, the latter is 4 comparisons and 3 boolean ops
18:28pendingchaos: on AMD hardware, that's 4 standard cost VALU vs 4 standard cost VALU and 3 SALU
18:28pendingchaos: otoh, the latter might have better ILP, because the 4 VALU are independent
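
One thing worth checking before comparing costs: the two expressions as written do not compute the same result. `x < min(...)` is true only when x is below all four values, which corresponds to `all(lessThan(...))`; `any(lessThan(...))` is true when x is below at least one of them, i.e. `x < max(...)`. The operation counts quoted above are unchanged if `any` is swapped for `all`. A brute-force check in Python over small integers (a sketch that models the GLSL semantics with plain comparisons and ignores NaN behaviour):

```python
from itertools import product


def less_than_min(x, a, b, c, d):
    # x < min(min(a, b), min(c, d)): 3 min operations and 1 comparison
    return x < min(min(a, b), min(c, d))


def any_less_than(x, a, b, c, d):
    # any(lessThan(vec4(x), vec4(a, b, c, d))): 4 comparisons OR-ed together
    return (x < a) or (x < b) or (x < c) or (x < d)


def all_less_than(x, a, b, c, d):
    # all(lessThan(vec4(x), vec4(a, b, c, d))): 4 comparisons AND-ed together
    return (x < a) and (x < b) and (x < c) and (x < d)


inputs = list(product(range(4), repeat=5))
same_as_all = all(less_than_min(*t) == all_less_than(*t) for t in inputs)
same_as_any = all(less_than_min(*t) == any_less_than(*t) for t in inputs)
print("min form == all form:", same_as_all)
print("min form == any form:", same_as_any)
```

For example, with x = 1 and (a, b, c, d) = (0, 2, 2, 2), the `min` form is false while the `any` form is true.
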
18:36zmike: if you are a gallium driver owner and haven't signed off on the great vertex stride move of '23 (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24117) pls click and review
20:03austriancoder: finding the test source from a deqp name is challenging
20:16pixelcluster: had to find the deqp name from a test source once, that wasn't too fun either :3
23:16Lynne: tried to find some extension I could help with on nvk, but everything's either handled already, handled by the vulkan framework, or would step on someone else's toes
23:20karolherbst: there is some performance work which is needed
23:20Lynne: compiler-wise?
23:20karolherbst: yeah
23:21karolherbst: atm we use global memory for ubos, but we really should use ubos... but that's a bit... non trivial to deal with with the current compiler. Though one part could be to remove the descriptor indirection and promote some of those to direct ubo accesses
23:21karolherbst: we have 8/16/18 slots available depending on the stage and generation
23:23Lynne: speaking of buffers, what does a memory heap with zero flags mean?