IRC Logs of #dri-devel on irc.freenode.net for 2023-07-20

04:52 Lynne: airlied: I just rebooted, and nouveau works now
04:52 Lynne: guess it didn't like hot modprobes
04:56 Lynne: it's using ampere's gsp binary from 535, so the binary version must match with the headers
05:19 Lynne: got a triangle to render! struggled a bit to find any vulkan 1.0 clients at all
05:21 Lynne: vkquake runs too!
05:33 dj-death: Looks like a530 runners are running into unexpected passes
05:44 HdkR: 4
09:08 kode54: I still wonder what the heck is up with the Xe driver and compute causing gpu crashes and resets
09:08 kode54: Is there any debug logging for the crash dump interface to pull the exact job queue that caused the crash? I saw something about that
10:07 DavidHeidelberg[m]: dj-death: that's nice, link please. Is it a flake or a genuine unexpected pass?
10:08 DavidHeidelberg[m]: dj-death: also sorry about mess with these, I added like four batches of flakes, but some are still showing up, it's not the driver with most stable results
10:08 dj-death: I put the link in the MR
10:09 dj-death: seems to be flakes mostly
10:10 DavidHeidelberg[m]: I see the MR, thanks!
10:11 dj-death: np
10:12 DavidHeidelberg[m]: dj-death: at some point (next flakes batch) I think I'll wild card those which are just adding testcases in one group each time
10:12 DavidHeidelberg[m]: *wildcard
13:19 zmike: mareko: do you have a branch with your linker wip somewhere? I want to see where it fits into the shader lifetime
13:35 mripard: daniels: \o/
13:35 mripard: congrats
13:37 daniels: mripard: thanks!
13:38 mripard: (and congrats to everyone involved as well, obviously :))
13:43 javierm: daniels: wow, huge news! Congrats to you and all folks involved :)
13:48 MrCooper: what's this about?
13:48 zmike: https://www.collabora.com/news-and-blog/news-and-events/a-helping-arm-for-panfrost.html I assume
13:49 MrCooper: thanks
13:51 daniels: javierm: thanks, but yeah, better directed at the ones doing the work :)
13:58 MrCooper: nice indeed
14:29 mareko: zmike: after varying linking and before uniform linking in the NIR linker of GLSL
14:30 zmike: mareko: does running this yield store_output or is it expected that drivers are calling lower_io
14:32 mareko: zmike: the linker calls nir_lower_io to lower store_deref to store_output, st/mesa and drivers only get store_output
14:32 zmike: ahhh ok
14:32 zmike: thanks
14:32 mareko: zmike: the varying optimizer requires lowered IO, so the linker has to lower it
14:33 zmike: right, I was expecting something like this but I wanted confirmation
14:33 zmike: how is bindless texture io handled? e.g., passing nir_var_image in input/output
14:34 mareko: bindless textures use 64-bit handles
14:34 zmike: kinda?
14:35 mareko: exactly
14:35 zmike: on receiving them they're image types
14:35 zmike: even if really they're 64bit ints
14:39 mareko: zmike: they are 64-bit numbers in NIR, and radeonsi uses the numbers to load descriptors
14:40 mareko: we literally replace the 64-bit handle src with a vec4 or vec8 handle src containing the AMD image/sample/buffer descriptor
14:41 mareko: *sampler
14:43 zmike: I was talking about the variables, but I suppose they don't matter for explicit io
14:43 mareko: bindless don't have variables
14:44 mareko: also GLSL has this tricky feature that it can typecast a bindless handle to a variable and vice versa
14:44 mareko: we don't implement the typecasting in radeonsi
14:45 mareko: or maybe it can only do the typecast from a variable to a bindless handle, not vice versa
15:04 zmike: yeah there's piglit tests for that
15:05 zmike: which is what I was referencing
15:05 zmike: my memory is fuzzy but I expect I'll re-remember most of it in a day or two since I'm attempting an explicit io conversation
15:05 zmike: conversion
15:16 mareko: radeonsi still uses derefs for textures and images
15:29 alyssa: hey NIR people, get excited :3
15:29 alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9396
15:33 jenatali: alyssa: I'm excited
15:33 jenatali: zmike: :O good luck
15:34 alyssa: jenatali: TBF nir_builder for big shaders will still suck
15:35 alyssa: I really want a common way to write internal shaders as GLSL or OpenCL C
15:35 zmike: jenatali: I was optimistic at the beginning and tried to start with xfb
15:35 jenatali: alyssa: Embrace C++, have nir SSA defs as classes with proper operators
15:36 jenatali: That's what WARP does and it's honestly okay to write internal shaders
15:36 alyssa: jenatali: TBH I still would rather have real GLSL or CL C
15:36 alyssa: There are a bunch of different approaches to this already used in tree
15:37 alyssa: I would like to pick one of them and promote it to common code
15:37 alyssa: - qoShaderModuleCreateInfoGLSL (GLSL, single source, aco tests, from Crucible, depends on glslang)
15:37 alyssa: - float64.glsl and friends (GLSL, separate source, uses Mesa's GLSL compiler)
15:37 konstantin_: GLSL is the one that doesn't require any driver changes
15:38 alyssa: - anv's bvh builder (OpenCL C, separate source, uses the OpenCL compiling pipeline at build time, runs their backend compiler at build time, scary)
15:39 konstantin_: D:
15:39 jenatali: Yeah that's fine for writing full shaders that don't need parameterization
15:39 jenatali: Being able to sprinkle builder conditional logic is quite nice
15:39 alyssa: - radv's bvh building (GLSL, separate source, glslang at build-time?)
15:40 alyssa: jenatali: Ideally spec constants (or, ugh, the preprocessor) could be used for that
15:40 jenatali: That still... kinda sucks
15:40 alyssa: I mean.. maybe?
15:40 alyssa: IDK
15:41 alyssa: nir_builder excels at writing lowering passes, munging shaders, inserting little bits of code
15:41 konstantin_: RADV BVH building currently uses the preprocessor, I want to change it to spec constants in the future
15:41 alyssa: It's considerably less good at building entire shaders out of thin air
15:41 alyssa: which is... fine, honestly?
15:42 alyssa: I don't want a better nir_builder as much as I want to just write GLSL or CL C when I know I have a large (100s of lines) kernel that doesn't do anything an app wouldn't do
15:43 alyssa: the operator overloading sugar C++ provides would be nice for writing lowering passes I admit
15:43 jenatali: Yeah I was thinking of lowering
15:43 konstantin_: How would that work with a typeless IR?
15:43 jenatali: Or mini tini meta shaders that should be parameterized
15:43 alyssa: yeah. there are distinct use cases here
15:43 jenatali: konstantin_: Add type info just for determining what operators do
15:43 alyssa: The cases I have in mind are things like BVH building kernels
15:44 alyssa: where you fundamentally work on a different level of abstraction
15:44 alyssa: you don't care about the hw instructions or the SSA form or the phi nodes
15:44 jenatali: E.g. ssa_uint vs ssa_int vs ssa_float where you can cast freely between them
15:44 alyssa: or, ugh, the loops
15:44 jenatali: alyssa: Yeah makes sense
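A minimal sketch of the typed-wrapper idea jenatali describes, written in Python rather than C++ for brevity. The `Builder`, `SSA`, `SSAInt`, `SSAUint` and `SSAFloat` names are hypothetical and this is not Mesa or WARP code; the opcode strings merely borrow NIR-style names. The underlying value stays typeless, and the wrapper type only decides which opcode an operator emits.

```python
class Builder:
    """Toy stand-in for an IR builder: records emitted ops in order."""
    def __init__(self):
        self.instrs = []

    def emit(self, op, *srcs):
        self.instrs.append((op,) + srcs)
        return len(self.instrs) - 1  # index of the new SSA value


class SSA:
    """Typeless SSA value; subclasses only choose opcodes (hypothetical)."""
    ops = {}

    def __init__(self, b, index):
        self.b = b
        self.index = index

    def _emit(self, key, other):
        return self.b.emit(self.ops[key], self.index, other.index)

    def __add__(self, other):
        return type(self)(self.b, self._emit("add", other))

    def __lt__(self, other):
        # The comparison result is returned as an untyped value.
        return SSA(self.b, self._emit("lt", other))

    def as_type(self, cls):
        # A cast is free: same SSA index, different operator table.
        return cls(self.b, self.index)


class SSAInt(SSA):
    ops = {"add": "iadd", "lt": "ilt"}


class SSAUint(SSA):
    ops = {"add": "iadd", "lt": "ult"}


class SSAFloat(SSA):
    ops = {"add": "fadd", "lt": "flt"}


b = Builder()
x = SSAInt(b, b.emit("load_const", 1))
y = SSAInt(b, b.emit("load_const", 2))
s = x + y                                      # emits iadd
c = s.as_type(SSAUint) < y.as_type(SSAUint)    # emits ult, no cast instruction
f = s.as_type(SSAFloat) + y.as_type(SSAFloat)  # emits fadd
```

In this sketch, signed and unsigned addition share an opcode while the comparison differs, which is the kind of case where the extra type information matters.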
15:44 alyssa: NV_device_generated_commands is an excellent candidate for this
15:44 alyssa: although that's also kinda lowery
15:45 alyssa: ideally you could, like.. build a NIR "library" and then you have an easy lowering pass that just inlines the functions you care about
15:45 alyssa: I think the float64 lowering works like this already
15:45 jenatali: Yeah
15:45 jenatali: That's effectively what libclc is too
15:46 alyssa: right
15:46 alyssa: So the operative questions are:
15:46 alyssa: - OpenCL C or GLSL?
15:46 alyssa: - If GLSL, Mesa's compiler or GLSLang?
15:46 alyssa: - Single-source or separate files?
15:47 jenatali: Seems the answers can depend on the use case
15:47 alyssa: - Parametrizable? If so, how? Preprocessor or spec constants or something else?
15:47 alyssa: jenatali: Yeah, for sure. Which is why there are 4 different approaches in-tree already
15:48 jenatali: If I had infinite time I would honestly be interested in trying the c++ approach
15:48 alyssa: I don't hate it. But it doesn't solve the same problems
15:49 jenatali: Agreed
15:49 alyssa: Actually -- how about I list the use cases that I have in mind for this specifically
15:49 alyssa: - BVH building (duh)
15:49 alyssa: - NV_dgc
15:49 alyssa: - software tessellator (someday..?)
15:50 alyssa: - Maybe vk_meta
15:53 alyssa: One of the concrete problems I see is that... it would be suboptimal to build-time depend on glslang for core parts of Mesa
15:53 alyssa: however Mesa's own glsl compiler may not be a suitable replacement for Vulkan use cases
15:54 alyssa: I assume it doesn't recognize Vulkan-flavour GLSL because why would it? stuff like push constants
15:54 alyssa: I'm not too sure how much of a problem this is in practice
15:55 alyssa: Is a build-time (not runtime) dependency on glslang a real problem? Probably not? I mean, RADV already has that for raytracing so hopefully distros are used to it by now
15:55 alyssa: I don't think there are any portability issues there
15:55 alyssa: I don't think anyone loves glslang but also I don't think anyone loves the mesa glsl compiler either
15:55 jenatali: Fwiw on Windows it would be a pain to depend on that
15:56 jenatali: So for common code I'd prefer to avoid it if possible
15:57 alyssa: Alright
15:57 alyssa: That rules out vk_meta use
15:58 alyssa: There's a middle-ground option of having some common solution for driver use
15:58 alyssa: So as long as you don't build any drivers that use it you don't need the dep
15:58 konstantin_: Or we repurpose the GLSL compiler and store serialized NIR in the driver binary
15:58 alyssa: which should alleviate those concerns
15:59 alyssa: konstantin_: the problem with using Mesa's GLSL compiler is that Vulkan GLSL is a bit different and IDK how bad the impedance mismatch is going to be
15:59 alyssa: no push constants, no descriptor sets, etc..
15:59 konstantin_: Yeah, missing BDA support
15:59 alyssa: if you declare an SSBO in the GLSL recognized by Mesa's GLSL compiler, it's going to give you load_ssbo intrinsics which is .. not what your Vulkan driver wants
16:00 alyssa: though maybe it's fine? maybe?
16:05 mattst88: anholt: are the groups of tests that deqp-runner makes deterministic?
16:06 alyssa: konstantin_: jenatali: I think maybe not supporting Vulkan GLSL might be ok
16:07 alyssa: In particular, it might be reasonable to write GL-style GLSL for the meta shaders
16:07 jenatali: Seems okay to me
16:07 alyssa: and then have an optional lowering pass to turn the resulting GL NIR into Vulkan NIR
16:07 alyssa: with some simple mapping
16:07 anholt: mattst88: yes, for same input caselists / fraction / deqp-runner version, etc.
16:08 alyssa: (uniforms become push constants, each gl resource type gets a corresponding descriptor set with bindings matching the gl-side index, etc)
16:08 alyssa: doesn't help with BDA, though
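A rough sketch of the "simple mapping" alyssa outlines, assuming one descriptor set per GL resource type and a binding equal to the GL-side index. The set numbering, the function names and the packed push-constant layout are all hypothetical and ignore alignment rules; this is not an existing Mesa pass. BDA is not covered, as noted above.

```python
# Hypothetical numbering: one descriptor set per GL resource type.
SET_FOR_KIND = {"ubo": 0, "ssbo": 1, "texture": 2, "image": 3}


def map_gl_resource(kind, gl_index):
    """Map a GL-side resource to a (descriptor set, binding) pair.

    The binding simply mirrors the GL-side index.
    """
    return (SET_FOR_KIND[kind], gl_index)


def map_uniforms(uniforms):
    """Place plain uniforms at consecutive push-constant byte offsets.

    uniforms: list of (name, size_in_bytes) in declaration order.
    Returns (offsets_by_name, total_size). Alignment is ignored here.
    """
    offsets = {}
    offset = 0
    for name, size in uniforms:
        offsets[name] = offset
        offset += size
    return offsets, offset
```

For example, `map_gl_resource("ssbo", 3)` gives set 1, binding 3 under this assumed numbering, and two uniforms of 64 and 16 bytes land at push-constant offsets 0 and 64.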
16:08 konstantin_: Well at least for BVH building, we need BDA. So there is definitely some frontend work that needs to be done
16:08 alyssa: yeah, BDA is the big elephant in the room here
16:08 alyssa: *whinnies*
16:10 mattst88: anholt: awesome, thanks
16:10 alyssa: If not for jenatali's windows comment I'd be tempted to promote RADV's BVH building infrastructure to common and call it a day
16:10 konstantin_: We don't have to support spec compliant Vulkan GLSL. There is potential to hack C-style pointer syntax into GLSL instead of the mess we have with build_helpers.h
16:10 jenatali: alyssa: I'm just one voice, I can be overridden
16:11 anholt: making glsl_compiler input some of vulkan sounds pretty reasonable to me, if we have existing syntax to follow.
16:12 jenatali: But also our driver would never use a common BVH builder
16:12 jenatali: Since that's the D3D driver's responsibility
16:13 alyssa: jenatali: Fair enough. But not being able to use this for common code might be annoying. IDK. Might be ok
16:13 konstantin_: dozen is not the only driver that can be built on Windows, right?
16:14 pixelcluster: radv can be built on windows too
16:14 pixelcluster: (although it's not very useful at the moment)
16:14 alyssa: anholt: it'd be reasonable if the glsl compiler weren't terrifying (-:
16:15 konstantin_: I smell an interesting/cursed project for the next month
16:15 pixelcluster: the 3rd one?
16:15 konstantin_: 3rd?
16:15 pixelcluster: I mean I don't have a copy of your todo list
16:15 pixelcluster: maybe it's the second
16:15 pixelcluster: there's one I know of for sure :P
16:16 alyssa: konstantin_: my next interesting/cursed project is FEX-related, alas :p
16:16 konstantin_: BH, you're kind of right
16:16 konstantin_: *TBH
16:17 jenatali: For what it's worth, my concerns aren't really about myself, but about other contributors. I really like to be able to point our customers at the source and let them self-help. The setup instructions for GL are a bit of a pain (download flex/bison/pkgconfig/mako/meson). Instructions for CL are god-awful (build LLVM/Clang/LLVMSPIRVLib/libclc)
16:18 jenatali: Vulkan is pretty easy right now, just meson/mako. I'd be mildly frustrated if that had to get more complicated :)
16:18 anholt: alyssa: it's bad, but it's also so much smaller than it used to be.
16:19 alyssa: valid
16:19 alyssa: i don't think i've written any parsers since middle school
16:20 alyssa: ("Weird flex?")
16:20 anholt: we use standard flex. ;)
16:20 psykose: what about lexers
16:20 alyssa: anholt: =D
16:20 alyssa: I walked right into that one damn
16:24 zf: <Lynne> got a triangle to render! struggled a bit to find any vulkan 1.0 clients at all < fwiw, Wine has very lax vulkan requirements, if you're looking for test cases :-)
16:25 zf: and we certainly have a lot of test cases :D
16:26 zf: vkd3d as well, as a splinter project, and that might be a little friendlier to development
17:50 austriancoder: does anybody know if there is a deqp that tests the maximum number of varyings (GL_MAX_VARYING_FLOATS)? Something like piglit's glsl-max-varyings?
18:02 everfree: trying to update an out of tree module to 6.4. It's trying to call mutex_lock_interruptible on a dma_buf->lock, but it's erroring because dma_buf doesn't have a "lock" member. Did something get moved around here in the kernel?
18:03 everfree: im not 100% sure it ever compiled against a mainline kernel fwiw, i just found this thing, so if dma_buf.lock was like a vendor kernel extension w/e i can deal i guess
18:03 everfree: but the rest of it doesnt seem too vendor-kernely
18:11 everfree: huh, friend found a kernel commit 28743e25fa1c867675bd8ff976eb92d4251f13a1 that suggests i can just... delete the lock/unlock, that it's not used.
18:23 Company: what's faster in GLSL: x < min (min (a, b), min (c, d)) or any (lessThan (vec4 (x, x, x, x), vec4 (a, b, c, d)))
18:24 Company: and how interesting is it for performance to care about such things these days?
18:26 anholt: Company: if it helps you guess, all hardware these days breaks vectors down into scalars. so that probably compiles down similarly.
18:28 Company: so no need to pack stuff into vectors anymore
18:28 pendingchaos: the former is 3 min and 1 comparison, the latter is 4 comparisons and 3 boolean ops
18:28 pendingchaos: on AMD hardware, that's 4 standard cost VALU vs 4 standard cost VALU and 3 SALU
18:28 pendingchaos: otoh, the latter might have better ILP, because the 4 VALU are independent
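The two expressions can be modelled with plain scalars to make the operation counts above concrete. This is a Python sketch with hypothetical function names, not GLSL, and it ignores NaN handling. One thing it makes visible: the `min` form is true only when x is below all four values, while `any(lessThan(...))` is true when x is below at least one, so the component-wise form that matches the `min` version would use `all` instead, at the same cost of 4 comparisons and 3 boolean ops.

```python
def less_than_min(x, a, b, c, d):
    # 3 min operations + 1 comparison
    return x < min(min(a, b), min(c, d))


def any_less_than(x, a, b, c, d):
    # 4 independent comparisons + 3 boolean ORs
    return (x < a) or (x < b) or (x < c) or (x < d)


def all_less_than(x, a, b, c, d):
    # 4 independent comparisons + 3 boolean ANDs
    return (x < a) and (x < b) and (x < c) and (x < d)
```

With x = 2 and (a, b, c, d) = (1, 3, 4, 5), the `min` and `all` forms are both false, while the `any` form is true.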
18:36 zmike: if you are a gallium driver owner and haven't signed off on the great vertex stride move of '23 (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24117) pls click and review
20:03 austriancoder: finding the test source from a deqp name is challenging
20:16 pixelcluster: had to find the deqp name from a test source once, that wasn't too fun either :3
23:16 Lynne: tried to find some extension I could help with on nvk, but everything's either handled already, handled by the vulkan framework, or would step on someone else's toes
23:20 karolherbst: there is some performance work which is needed
23:20 Lynne: compiler-wise?
23:20 karolherbst: yeah
23:21 karolherbst: atm we use global memory for ubos, but we really should use ubos... but that's a bit... non trivial to deal with with the current compiler. Though one part could be to remove the descriptor indirection and promote some of those to direct ubo accesses
23:21 karolherbst: we have 8/16/18 slots available depending on the stage and generation
23:23 Lynne: speaking of buffers, what does a memory heap with zero flags mean?