00:46anarsoul|2: I'm working on adding atan and atan2 opcodes to nir. What would be the best place to add lowering for them to avoid adding another lowering pass to each backend compiler?
00:47karolherbst: anarsoul|2: nir_opt_algebraic if it's not too complex
00:48karolherbst: anarsoul|2: I hope you mean like glsl atan, right?
00:48anarsoul|2: I wouldn't dare to rewrite this amount of C in py
00:48anarsoul|2: karolherbst: yeah
00:48anarsoul|2: why?
00:49anarsoul|2: see https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/compiler/nir/nir_builtin_builder.c#L176
00:49karolherbst: yeah.. I just searched for that..
00:50karolherbst: do we actually have hardware supporting this natively?
00:50anarsoul|2: karolherbst: yeah
00:50anarsoul|2: ARM :)
00:50anarsoul|2: at least utgard (lima) and midgard (bifrost)
00:50karolherbst: .... is it actually faster?
00:51anarsoul|2: karolherbst: yes, see https://gitlab.freedesktop.org/mesa/mesa/-/issues/9323
00:51karolherbst: interesting...
00:51karolherbst: I see that nvidia has native tanh btw
00:52karolherbst: anarsoul|2: maybe... we should just check if the backends have it and either call nir_atan or emit the opcode...
00:52karolherbst: could be the easiest way out
00:52karolherbst: just add a "has_atan" flag to nir compiler options
00:53anarsoul|2: but is backend already initialized when we translate glsl to nir?
00:54karolherbst: glsl_to_nir takes a nir_shader_compiler_options at least
00:54karolherbst: let's see if this is also the driver one...
00:55karolherbst: yeah.. looks like it
00:56karolherbst: NirOptions set inside src/mesa/state_tracker/st_extensions.c, and NirOptions passed to glsl_to_nir
00:56karolherbst: so I think you can actually use it this way
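The "has_atan" idea above can be sketched roughly as follows. This is a minimal illustration, not Mesa code: `struct compiler_options`, `enum atan_path` and `choose_atan_path` are hypothetical stand-ins, and only the `has_atan` flag name comes from the discussion.

```c
#include <stdbool.h>

/* Hypothetical stand-in for nir_shader_compiler_options; only the flag
 * proposed in the discussion is modeled here. */
struct compiler_options {
   bool has_atan; /* backend implements atan/atan2 natively */
};

/* The two code paths the frontend could take for a GLSL atan call. */
enum atan_path {
   ATAN_PATH_NATIVE_OPCODE,    /* emit a single atan/atan2 opcode */
   ATAN_PATH_BUILDER_LOWERING, /* expand via the generic nir_atan-style helper */
};

/* Decide at GLSL-to-NIR translation time which path to take, based on
 * the options the driver handed to the frontend. */
static enum atan_path
choose_atan_path(const struct compiler_options *options)
{
   return options->has_atan ? ATAN_PATH_NATIVE_OPCODE
                            : ATAN_PATH_BUILDER_LOWERING;
}
```

Making the choice in the frontend means backends without native atan never see the new opcodes, so none of them needs an extra lowering pass.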
00:56karolherbst: I think we also do something with this with ffma? mhh
00:56karolherbst: maybe that was a patch of mine
01:00alyssa: anarsoul|2: the "proper" answer is nir_lower_alu but I don't know if all backends call it
01:01anarsoul|2: alyssa: unfortunately not all backends call it, I checked
01:01anarsoul|2: that was my first thought
01:01anarsoul|2: karolherbst: thanks, it works!
01:03karolherbst: though I guess nir_lower_alu would work as well.... mhhh.... annoying.. maybe st/mesa should just call nir_lower_alu?
01:04karolherbst: depends if you also want to add fancy optimizations based on atan
01:14anarsoul|2: oof, atan2 in utgard isn't really nir or ppir friendly :)
01:15karolherbst: what does that mean?
01:15karolherbst: it clobbers other registers?
01:15anarsoul|2: it should be lowered to temp.xyz = atan2_pt1(y, x); temp.x *= temp.y; result = atan2_pt2(temp.xyz)
01:15karolherbst: ohh wait...
01:15karolherbst: we have something like that on nvidia as well
01:16anarsoul|2: well, I guess temp is a reg and not ssa
01:16karolherbst: or at least we had....
01:18karolherbst: is atan2_pt1 maybe something like 1/2π?
01:18karolherbst: ehh a * 1/2π I mean
01:19karolherbst: mhh.. though doesn't make sense if it takes two values?
01:19karolherbst: guess it's just something funky
01:20anarsoul|2: no, atan2_pt1 produces a vec3
01:20anarsoul|2: and yeah, it takes 2 values
01:21anarsoul|2: atan_pt2 is common for atan and atan2, it takes vec3 and produces actual result
01:22anarsoul|2: for atan2 you also need to multiply temp.x by temp.y
01:24karolherbst: I'm sure deep down all of this makes sense
01:50alyssa: anarsoul|2: what's wrong with translating nir_op_atan2 to that series of 3 ppir instructions?
01:53anarsoul|2: alyssa: nothing, we just never had to create an additional ppir_reg in ppir before (currently each nir_reg corresponds to ppir_reg), so some plumbing is necessary
01:54anarsoul|2: slightly more work than I originally anticipated
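The three-step atan2 sequence described above, including the extra temporary register it needs, might look like this in a toy IR. Everything here is hypothetical (`toy_block`, `toy_emit`, `toy_lower_atan2`); only the opcode names follow the chat, and component swizzles (.x, .y, .xyz) are not modeled. It is a sketch of the plumbing, not ppir.

```c
#include <stddef.h>

/* Toy IR, not ppir: opcode names follow the chat, the rest is invented
 * for illustration. */
enum toy_op { TOY_ATAN2_PT1, TOY_MUL, TOY_ATAN_PT2 };

struct toy_instr {
   enum toy_op op;
   int dst;        /* destination register index */
   int src0, src1; /* source register indices, -1 if unused */
};

struct toy_block {
   struct toy_instr instrs[16];
   size_t count;
   int next_reg; /* next free register index */
};

static void
toy_emit(struct toy_block *b, enum toy_op op, int dst, int src0, int src1)
{
   b->instrs[b->count++] = (struct toy_instr){ op, dst, src0, src1 };
}

/* Expand result = atan2(y, x) into three instructions. The temp is
 * written twice, so it is allocated as a fresh register rather than
 * treated as an SSA value. */
static int
toy_lower_atan2(struct toy_block *b, int y, int x)
{
   int temp = b->next_reg++;
   int result = b->next_reg++;

   toy_emit(b, TOY_ATAN2_PT1, temp, y, x);       /* temp.xyz = atan2_pt1(y, x) */
   toy_emit(b, TOY_MUL, temp, temp, temp);       /* temp.x *= temp.y */
   toy_emit(b, TOY_ATAN_PT2, result, temp, -1);  /* result = atan_pt2(temp.xyz) */
   return result;
}
```

The register allocated inside `toy_lower_atan2` is the part with no counterpart in the source shader, which is the plumbing mentioned above.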
02:07alyssa: anarsoul|2: speaking of, can I put you down for the lima bits of https://gitlab.freedesktop.org/mesa/mesa/-/issues/9051 ?
02:07alyssa: I'm optimistic !23089 will land this week
02:10anarsoul|2: alyssa: I can't commit to any reasonable ETA at the moment. Try pinging enunes?
02:10alyssa: ack
02:11alyssa: enunes: ^
02:11anarsoul|2: is there any documentation on your rework?
02:12alyssa: MR description of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089
02:13alyssa: for lima/pp, the path of least resistance will be the nir_legacy.h api https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089/diffs?commit_id=bc29ff93c196249549e0fd76e6a3430dd35b1451#2fd7aea6f6e1a294376aa95196bcc4a119e86c1e_0_1
02:14alyssa: for lima/gp, since gpir needs explicit load/store register instructions anyway, you can just translate nir load_reg/store_reg to gpir load_reg/store_reg without anything else
02:16anarsoul|2: I'd run your lima/gp suggestion through cwabbott if he still remembers gpir details
02:23anarsoul|2: honestly, it'd be easier to rewrite ppir from scratch than writing an optimizer for its current version. Yet it's enormous amount of work and I just don't have that much spare time atm
02:23alyssa: really?
02:23alyssa: ppir looked pretty clean to me
02:23anarsoul|2: alyssa: it's hard to transform it
02:23alyssa: ah..
02:23alyssa: well, nir_legacy will be there for you then :)
02:25anarsoul|2: I guess what we lack is nir_builder equivalent for ppir
02:38alyssa: yeah..
05:14marcan: I figure if anywhere people here might know: what is the "usual"/"intended" mechanism to avoid racing KMS driver probe/hotplug with things like display managers/X?
05:15marcan: we used to have dcp returning from the probe function and initializing asynchronously, which caused problems with gdm. making it synchronous fixed that, but now I ran into a similar thing where the Fedora initial setup GUI tries to start X in the middle of the DCP probe
05:15marcan: I always assumed there had to be some udev settle thing but I guess that's not a thing these days... so how are things supposed to be sequenced to avoid these races?
05:31airlied: marcan: you should just send an updated hotplug event when you find new things connected
05:37airlied: are you still discovering connectors at that stage?
05:37airlied: or just connected/disconnected?
05:53Lynne: emulating raytracing is unvulkan
05:55airlied: all of raytracing is emulated
05:55airlied: it should have been done in a layer :-P
05:57Lynne: if raytracing is emulated, I ask, nay, demand, that the driver exposes multiple queues even if no such exist just to alleviate the overhead of externally synchronized submissions
05:57Lynne: it's literally free fps
05:58airlied: I do wonder how much of raytracing could have been done with custom compute shader launch mechanisms
06:04ishitatsuyuki: raytracing hardware is invoked from shaders, there's no separate queues
06:21marcan: airlied: still discovering connectors because there's no inherent serialization with device probe as far as I can tell?
06:21marcan: it gets even messier with the handoff from simpledrmfb, since that means one drm device goes away and another one replaces it
06:21marcan: which is fine for TTYs but we need to wait until the "real" device comes in before starting any display servers/managers if we don't want things to go really wrong
06:22marcan: the issue here was that X died with "no screens"
06:22Lynne: ishitatsuyuki: I meant in general, not specifically for raytracing, it was a spec mistake, and someone from microsoft here happily pointed out d3d12 made no such errors
06:22ishitatsuyuki: hm
06:23marcan: Lynne: just wait until you discover how much of vulkan will be emulated on Apple hardware :p
06:24Lynne: you mean moltenvk or asahi?
06:24marcan: asahi
06:25marcan: transform feedback? emulated. geometry shaders? emulated. tessellation? depending on what you do, emulated. blending? emulated.
06:27marcan: (but at least emulated on the GPU, not like macOS which does half of those things on the *CPU* for OpenGL. yes really.)
06:30airlied: marcan: how did you make it sync to fix gdm?
06:31airlied: like you could block on the first kms interaction until you have at least one connector found
06:31marcan: the kernel stuff got reworked to block driver probe until all the initial stuff is ready, that seemed to fix it. my assumption was there would be serialization between that modprobe and gdm, which at least on ALARM is the case I believe (since the modprobe happens in the initramfs while gdm obviously runs from the rootfs)
06:32marcan: previously it was just going off and kicking the init asynchronously and returning from probe quickly, before the actual DRM device got created
06:33airlied: you could do that and then just block on the first open or get resources
06:33marcan: but in this case it looks like X isn't being serialized with anything, and I'm not sure we can do much on the kernel side about that? if it doesn't at least wait until the real KMS DRM device is created, it's either going to find simpledrm and get it yanked out from under it a second later, or find no device at all
06:34marcan: airlied: not if the DRM device doesn't even exist yet
06:35airlied: well when you start modprobing it should kick out simpledrm straight away
06:35airlied: but yeah you'd need to create the device to stop something finding no devices
06:35marcan: yeah but if nothing is waiting for the probe func to return at all then there's no way to close that race
06:35marcan: which seems to be the case here I think
06:37marcan: AFAIK we do create the device before returning from probe, but clearly X isn't waiting for that
06:37airlied: X doesn't know to wait for anything
06:37marcan: and if it's not waiting for the module load to complete (which probes synchronously as far as I know), then it's always going to be racy
06:37marcan: right, which is why I thought distros had to serialize this somehow
06:38marcan: in traditional initramfs setups where you do your thing, wait for it to be done, then pivot to the real root, that point is suitably serializing
06:38marcan: but it seems these days systemd does some handoff something something and udev doesn't actually wait for settling in the initramfs or anything, it just all gets parallelized
06:38marcan: and now I'm wondering what the serialization point is, if any (or are all systems just broken and just not hitting the race by chance? :p)
06:40marcan: https://bugs.launchpad.net/ubuntu/+source/gdm3/+bug/1958488 looks like I'm not the only one running into this...
06:41marcan: https://gitlab.gnome.org/GNOME/gdm/-/commit/ecbd9694458194e24f356ba9d37fc4f3a515c3dd#note_1402863 apparently there's a CanGraphical thing...
06:43emersion: systemd CanGraphical is deprecated
06:43emersion: userspace needs to wait for a KMS device with a timeout
06:44airlied: where is the deprecation mentioned?
06:46emersion: https://gitlab.freedesktop.org/wlroots/wlroots/-/issues/2093
06:47emersion: well not deprecated, but... not the right tool
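"Wait for a KMS device with a timeout" could be sketched as a simple polling loop like the one below. This is an assumption-laden illustration: the function names, the 50 ms poll interval and the glob-based check are all made up here, and a real compositor would more likely listen for udev events than poll the filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <glob.h>
#include <stdbool.h>
#include <time.h>

/* True if at least one path currently matches the glob pattern. */
static bool
pattern_matches(const char *pattern)
{
   glob_t g = {0};
   int ret = glob(pattern, 0, NULL, &g);
   bool found = (ret == 0 && g.gl_pathc > 0);
   globfree(&g);
   return found;
}

/* Poll until a node matching `pattern` (e.g. "/dev/dri/card[0-9]*")
 * exists or `timeout_ms` has elapsed. Returns true if a device showed
 * up in time, false if the caller should give up or fall back. */
static bool
wait_for_kms_device(const char *pattern, int timeout_ms)
{
   const int step_ms = 50; /* arbitrary poll interval for this sketch */

   for (int waited = 0;; waited += step_ms) {
      if (pattern_matches(pattern))
         return true;
      if (waited >= timeout_ms)
         return false;

      struct timespec ts = { .tv_sec = 0, .tv_nsec = step_ms * 1000000L };
      nanosleep(&ts, NULL);
   }
}
```

Note that a check like this cannot tell simpledrm from the "real" device; it only closes the "no device at all" half of the race.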
06:48airlied: does CanGraphical signal for this case with simpledrm or something?
06:50marcan: SUBSYSTEM=="drm", KERNEL=="card[0-9]*", TAG+="seat", TAG+="master-of-seat"
06:50marcan: looks like it will fire for any DRM device
06:51marcan: # Allow efifb / uvesafb to be a master if KMS is disabled
06:51marcan: SUBSYSTEM=="graphics", KERNEL=="fb[0-9]", IMPORT{cmdline}="nomodeset", TAG+="master-of-seat"
06:52marcan: I'd say that udev rulefile needs to be updated to ban simpledrm... but under what conditions? doesn't "nomodeset" also break simpledrm since it's KMS?
06:53marcan: the problem though is ideally we *do* want simpledrm fallback to work... and again the only way I can think of to make that work is to just wait for all drm drivers to probe
06:54marcan: (the idea being that dcp might not exist or fail to probe under some situations, but you still want a dumb GUI to work)
06:55airlied: yeah it's the age old problem, of when to take the fallback path as the only path
06:55marcan: https://github.com/systemd/systemd/issues/10435 looks like systemd basically gave up on this
06:56marcan: and then there's the renderonly GPU/kmsro issue to deal with, if DCP probes before the GPU that's also going to break
07:20mripard: jani: if it's still relevant, you have my acked-by
07:20mripard: (and an additional: yes, finally!) :)
07:33tzimmermann: marcan, it's not really userspace's job to second-guess the DRM devices; just use whatever /dev/dri/card is there (even if simpledrm is being removed shortly afterwards); and if nothing's there userspace should wait
07:34tzimmermann: wrt Xorg: just give up, even if Xorg can be made to work correctly, it's probably either a hack or a rewrite of the DRM backend
07:34emersion: ^
07:34emersion: the renderonly case is a bit shaky though
07:35tzimmermann: i think, i missed half of the discussion
07:35emersion: i don't have good ideas to fix that one, except a list of KMS drivers which are expected to have a separate render node…
07:35emersion: in mesa
07:36emersion: so that mesa can block for a bit
07:36emersion: but that breaks as soon as there's a system with just a display device and no render device
07:37emersion: well, s/breaks/times out/, which is not a nice UX
07:37tzimmermann: i missed the first half of the discussion. what was the initial problem? there's a render node and the modesetting node comes up late?
07:37mripard: instead of a KMS drivers list, can't we use the driver features which shouldn't have DRIVER_MODESET?
07:40emersion: mripard: the compositor will wait for a display device to come up
07:41emersion: so the compositor already waits for the DRIVER_MODESET device to come up
07:41emersion: the problem is when the render device is late
07:41emersion: the compositor doesn't know when a KMS device is renderonly or not, the only way to know is to try to create a GL context
07:42emersion: at that point, we're in mesa and mesa sees only the KMS device it's been initialized with, but no render node
07:42mripard: sorry, I meant DRIVER_RENDER
07:42emersion: some systems don't have a render device
07:42emersion: we can't stall these systems for 5s just in case a render device shows up later
07:43mripard: sure, but if it causes an issue, I'd expect the list of platforms that would run a compositor without a render node to be fairly small
07:43mripard: so maybe we can have a list of devices that are known not to have GPU?
07:43mripard: and just assume they do
07:44emersion: i'd really rather have the contrary…
07:44mripard: but I'm not even sure that the driver name is the right tool for that
07:44emersion: renderonly requires mesa changes anyways
07:44mripard: I'm pretty sure we have drivers that run on both platforms
07:44emersion: right
07:45mripard: not platforms, but systems with and without GPUs
07:45emersion: that's what i was scared of
07:45emersion: is devicetree of any help here?
07:48javierm: marcan: how plymouth worked around this was to treat simpledrm as a fallback and only use it without a timeout if modeset or plymouth.use-simpledrm is in the cmdline
07:48javierm: marcan: https://gitlab.freedesktop.org/plymouth/plymouth/-/merge_requests/163/diffs?commit_id=83b385061ccbf5a46ea77f7f12c1c7bfc72a09f2
07:49jannau: javierm: s/ modeset / nomodeset / ?
07:49javierm: jannau: yes, sorry. A typo
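The cmdline check described above (with the nomodeset correction applied) can be approximated like this. It is a hedged sketch rather than plymouth's actual code: the helper names are hypothetical, and only bare space-separated tokens are recognized, so forms such as "key=value" are not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `word` appears as a whole space-separated token in `cmdline`. */
static bool
cmdline_has_word(const char *cmdline, const char *word)
{
   size_t len = strlen(word);
   const char *p = cmdline;

   while ((p = strstr(p, word)) != NULL) {
      bool starts = (p == cmdline) || p[-1] == ' ';
      bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
      if (starts && ends)
         return true;
      p += len;
   }
   return false;
}

/* simpledrm is only a fallback: use it right away when the user asked
 * for it, otherwise the caller should first wait (with a timeout) for
 * a native KMS driver to appear. */
static bool
use_simpledrm_without_timeout(const char *cmdline)
{
   return cmdline_has_word(cmdline, "nomodeset") ||
          cmdline_has_word(cmdline, "plymouth.use-simpledrm");
}
```

In practice the string would come from /proc/cmdline.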
07:49mripard: emersion: could be, we could link the GPU to the display through the device tree, and even expose that through sysfs if it makes sense
07:50mripard: also, we have platforms using ACPI that show up, so I'm not sure it's the proper solution
07:50tzimmermann: javierm, OMG i really wish they had at least tried to fix this in simpledrm
07:50emersion: iirc sima was not a fan of having the kernel involved
07:51emersion: anyways, we can have whatever logic we want in the mesa loader
07:51javierm: tzimmermann: yeah... but I guess this is inevitable. When I first replaced {efi,vesa}fb with simpledrm, a lot of stuff broke due efifb behind harcoded all over the stack
07:51javierm: tzimmermann: I guess now we have simpledrm instead :)
07:52emersion: :/
07:52javierm: *being hardcoded
07:52mripard: worst case scenario, we use the driver name like you suggested, and if some platforms break we can always use the platform name as a secondary check
07:53emersion: is the platform name different when there is no GPU?
07:53emersion: or do you mean something else than the KMS device platform name?
07:54emersion: there is a mesa MR somewhere to use driver names in the loader iirc
07:55tzimmermann: javierm, they should have looked whether the information is there (edid, orientation, etc) rather than hard-coding driver workarounds
07:56javierm: tzimmermann: yes, I don't disagree but what I tried to say is that user-space will always workaround the kernel if needed
07:57javierm: tzimmermann: another example is mutter having a deny list for atomic KMS (i.e: virtgpu due missing cursor hotspots support in atomic)
07:58emersion: ;_;
07:58emersion: fwiw, in wlroots we have a policy of no workarounds
07:59javierm: pq: speaking of which, I see that you are still discussing with vmware folks. But IIRC you just asked for more documentation ?
07:59tzimmermann: i'm beginning to look at this as a challenge >:‑)
08:00emersion: lol
08:00javierm: haha
08:02sima: marcan, airlied yeah I think the entire driver load sync is a) busted and b) the bucket is just passed around because everyone says it's too hard to solve
08:03sima: emersion, not really getting the full context from scrollback?
08:04emersion: sima, discussion about how to solve the renderonly problem, where user-space needs to figure out when a KMS device should be used with a separate render device
08:04emersion: we discussed at some point to have a kernel API "is this renderonly?"
08:04emersion: but iirc it was deemed user-space's problem to solve
08:05emersion: not sure that's enough context
08:05sima: emersion, don't you have that already? if it's not supporting kms or not listing any outputs at all it's probably a renderonly drm node?
08:05emersion: i mean, i can detect renderonly nodes just fine
08:05sima: and vice versa, if you can't instantiate a gl or vk render context on a drmfd, then it's a kmsonly one
08:06emersion: llvmpipe :-)
08:06sima: emersion, I guess then I'm not seeing the issue?
08:06sima: oh
08:06emersion: the problem is usually, i have a KMS-only node, should i try to use GL or not
08:06sima: who knows?
08:06sima: :-/
08:06emersion: and now there's the additional issue that the render node might show up late
08:06sima: yeah
08:06sima: or the kms node might show up late and you're stuck on simpledrm
08:06sima: or some of these might never show up
08:06sima: and there's not really a way to tell the difference
08:07emersion: and there are cases where the KMS-only node cannot be used with a render node
08:07emersion: vkms, displaylink stuff, etc
08:07emersion: right
08:07sima: in general we should fix these by patching the prime support I think
08:08emersion: well, if PRIME is supported, that doesn't mean i can scan out PRIME-imported buffers
08:08sima: yeah that's why xorg has lots of copy fallbacks :-/
08:08sima: I think ideally the compositor can fully cope with hotplug of anything
08:09emersion: that's… not going to work very well i think
08:09sima: so maybe you boot up with simpledrm + llvmpipe, and then more or less seamlessly upgrade to kmsonly + renderonly once these show up
08:09emersion: that's a *lot* of work
08:09sima: yeah reality probably means it's a compositor reset
08:10sima: and I have no idea how wayland clients would cope
08:10sima: probably equally badly
08:10javierm: sima, emersion: it's even more complicated because now there's a timeout for probe deferral that defaults to 10 secs :/
08:10emersion: tbh i don't think it can really work in practice
08:10sima: I guess the cope-out could be a compositor event/popup that tells the user to restart their session
08:10sima: javierm, where's that timeout? in the kernel?
08:10emersion: once some formats have been advertised, you can't take it back
08:11javierm: sima: in the kernel, yes
08:11sima: javierm, ugh ... sha1?
08:12javierm: sima: it's a long history, I tried to disable it by default and mention all the commits involved here https://lore.kernel.org/lkml/20221116120236.520017-1-javierm@redhat.com/
08:12javierm: but was told that disabling it would break other use cases...
08:13javierm: but it was added by commit 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
08:14sima: wtf why are people trying to make optional dependencies work
08:14sima: they. just. dont.
08:15sima:sighs
08:16javierm: sima: the worst thing is that Rob added it as a debugging mechanism and then somehow got "promoted" in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a longer default if CONFIG_MODULES is set")
08:16sima: javierm, ah you're trying to make this a debug hack, that looks reasonable
08:16javierm: sima: that was the original intention, see commit 25b4e70dcce9
08:16sima: this feels like i915 and the preliminary hw support
08:16javierm: but commit e2cec7d68537 changed from "this is a debug option" to "make 30 secs the default"
08:16sima: every few years you need to rename the knob
08:17javierm: for me makes no sense to have this. Either is a best effort or not, you can't have both
08:18sima: yeah, and the default must be to reprobe forever
08:18javierm: sima: exactly, otherwise you are breaking for instance modules loading
08:18javierm: anyways, I got tired to argue and gregkh attitude of we know better than you let me drop that series
08:19javierm: *led me to
08:19sima: hm gregkh not supporting this is a bit funny, since he's the one that usually insists that the kernel can't tell userspace when it's done probing all the drivers
08:19sima: (because it fundamentally can't)
08:19sima: javierm, so you just carry the last patch as a fedora fixup?
08:20javierm: sima: I don't. I just built-in some drivers to make it work and moved on
08:20pq: javierm, yes, I was failing to communicate that I want some vague doc about how input ties to output in order to define how hotspot works. And not a vmware code level spec of what their stack does.
08:21javierm: pq: cool, agreed that it would be useful (as said, it took me some time to understand what cursor hotspots and cursor commandeering was or worked)
08:21mripard: emersion: I meant the SoC name
08:22emersion: ok, i don't know about this, but this sounds more reliable
08:22emersion: but potentially a pain to list all SoCs
08:22mripard: like I said, only for drivers that support both drivers with and without GPUs
08:24javierm: mripard: but even in that case you never know whether let's say v3d will eventually show up for vc4 to use it or fallback to llvmpipe / SW rendering
08:24mripard: sure
08:25javierm: sima: to wrap up the probe deferral and time out discussion, IMO the only sensible option would be for user-space to tell the kernel to give up
08:26sima: javierm, yeah that might be one option
08:26javierm: because only user-space knows when it doesn't have for example more modules to load
08:26javierm: and if people want to do it earlier, they can do in the initramfs
08:32pq: javierm, happy to see someone at least skim my emails, because usually no kernel dev confirms nor disagrees with my comments so I can't know if what I'm saying makes sense. And it's difficult to comment when you doubt if you are not leading someone in a wrong direction.
08:34javierm: pq: I did read the emails in that thread but didn't have anything useful to say :)
08:34javierm: I mentioned earlier in that thread that agreed with you about having proper uAPI documentation about cursor hotspots since is not a trivial concept to wrap your head around
08:35sima: +1 on that, especially since with cursor hotspot defacto it was just a lot of "hacked until it did what we wanted with Xorg"
08:35pq: the last hotspot thread did run off in a tangent, but I mean also other threads like solid_fill and...
08:35sima: for some vague notion of "this makes the user feel like the entire thing is more responsive than it really is"
08:36sima: pq, imo kms uapi needs at least one ack from a compositor person (especially for the documentation part)
08:37pq: sima, that's cool :-)
08:38pq: ...VKMS stuff
08:39sima: https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#requirements hm maybe we should clarify the docs here that it's about documenting the semantics for compositors?
08:39javierm: pq: I usually read the threads you are involved and basically learn from your answers :)
08:40javierm: pq: since you are mostly focused on the uAPI side, it's a useful POV
08:40pq: heh, thanks, and yeah except for VKMS
08:40sima: or maybe a "Documentation requirement for userspace api" section is needed here https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.htm
08:41sima: I think we're more or less ready to make "cross driver interfaces, especially anything for the modeset api, must be documented" line :-)
08:41sima: emersion, ^^ ?
08:41sima: since you've typed up a lot of the missing bits in this area recently
08:41emersion: pq, i agree with all you've said
08:42pq: emersion, thanks!
08:44emersion: ah, an "l" was cut off that URL
08:44emersion: (funny, because ".htm" is a widely used extension)
08:44pq: sima, btw. "what even is atomic" doc hidden in here: https://lists.freedesktop.org/archives/dri-devel/2023-July/412530.html
08:45emersion: ty for writing this btw!
08:45emersion: i should review, but haven't had the time yet
08:45pq: it also defines "modeset" which people might want to bikeshed about
08:45pq: :-)
08:46emersion: so, in terms of new uAPI requirements, there is https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
08:46emersion: and then the link above as well
08:47sima: pq, note that the no-op out is not entirely true, it does have an impact on locking and sync
08:47emersion: ah, but that section linked above is for props only
08:47sima: so if you e.g. include a different crtc, even if you change nothing, you'll sync/block against that
08:47pq: FYI, I practically never look at the userspace coming with new UAPI, so I might not even notice it is missing.
08:47emersion: and there are no links between the 2, because we don't know how to make these in Sphinx
08:48sima: emersion, we've not been requiring docs for uapi formally, so this gap is not an oversight
08:48emersion: "Each new property introduced in a driver needs to meet a few requirements, in addition to the one mentioned above"
08:48emersion: above where?
08:48sima: but I think consensus has moved that we should require docs for at least new uapi
08:48pq: sima, is there a good reason to sync/block there?
08:48pq: rather than just check the actual change comes out to nothing, so drop it
08:49sima: pq, the kernel doesn't support dropping only part of a state after validating it
08:49pq: you can't drop it before validating?
08:49sima: I guess we could do some opt and check the current prop value, and if it matches not get that state
08:50sima: pq, we'd need to not acquire the kernel-side state struct to begin with I think
08:50sima: everything else is a bit too brittle
08:50pq: ok, sounds too complicated for any benefit
08:51sima: emersion, maybe an earlier version of that patch added this section in the uapi file?
08:51sima: I think a link to the userspace requirements and the test/validation requirements would be good
08:51sima: pq, I think checking and not grabbing the state might work without causing issues
08:51sima: hm
08:51emersion: oh right we also have a test/validation section
08:52sima: not sure, because "what is the current value of this prop" works by a) grabbing the current state b) converting the value from kernel state representation to property value
08:52sima: so we can't even really do that
08:52sima: pq, I guess adding a note that it might lead to oversynchronization across crtc would be good
08:53sima: emersion, but yeah more links for all this would be good, and then maybe 2nd patch to put down a formal requirement for new uapi to be documented and that doc being acked by userspace folks?
08:53emersion: yeah, sounds good
08:54emersion: i need to learn how to do links now :<
08:54sima: I think we're essentially there now, and the leftover gaps in existing uapi docs isn't too bad to make this an onerous requirement for a small addition
08:54sima: emersion, I have to google it every time too :-P
08:54pq: sima, indeed - will you reply on the patch, or should I deliver your greetings?
08:57sima: pq, I replied
08:58pq: thanks!
09:44dviola: bisected it: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9321
09:48dviola: gerddie: hi, I think it's your commit :)
11:43emersion: thanks for pushing the patch jani!
12:04jani: emersion: np!
12:17alyssa:regrets touching blend enums
12:20alyssa:regrets touching C
12:21alyssa: that was a long time catching the bug of "header guard collision".
12:21alyssa: great language.
12:21penguin42: alyssa: #pragma once
12:21alyssa: penguin42: in C?
12:22penguin42: alyssa: Technically non standard but everyone supports it https://en.wikipedia.org/wiki/Pragma_once
12:22alyssa: Interesting
12:22alyssa: I would love to convert Mesa to it..
12:23alyssa: if even MSVC supports it I see no reason not to use it
12:23alyssa: jenatali: ^
12:23jenatali: Sure
12:24penguin42: Libreoffice is moving to it
12:24alyssa:thought it was a C++ only thing
12:25penguin42: hmm, I'll admit to not trying it in C, although I see a few examples of it in qemu's generated headers
12:27pq: Weston uses it in C in couple of headers. The old style guards are still prevalent.
12:28pq: gcc and clang are happy with it
12:37emersion: it was rejected in C23
12:37penguin42: oh great; for making life too easy?
12:38emersion: tbh i never got bitten by a duplicate #ifdef guard
12:43CounterPillow: I've never been in a car crash, airbags clearly aren't needed
12:44penguin42: emersion: I think I've been bitten by it once in a long time; but I've seen a few close calls - typically where people start out by duplicating a header and modifying it and then forgetting to change the guards
12:45penguin42: emersion: Or where there are multiple headers of the same name and people don't agree on a scheme
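The "duplicated header, forgot to change the guard" failure can be reproduced in a single file by pasting the contents of two headers back to back, which is what the preprocessor sees after `#include`. The header names and macros below are invented for illustration.

```c
/* --- contents of a hypothetical "a/util.h" --- */
#ifndef UTIL_H
#define UTIL_H
#define A_UTIL_VERSION 1
#endif

/* --- contents of a hypothetical "b/util.h", copied from a/util.h
 *     without renaming the guard --- */
#ifndef UTIL_H
#define UTIL_H
#define B_UTIL_VERSION 2 /* never defined: UTIL_H is already set */
#endif

/* The second header was skipped silently; nothing warns about it. */
#ifdef B_UTIL_VERSION
#define SECOND_HEADER_SEEN 1
#else
#define SECOND_HEADER_SEEN 0
#endif
```

With `#pragma once` at the top of each file instead, there is no guard macro to collide, because compilers that support it track the file itself rather than a name the author has to keep unique.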
12:46emersion: CounterPillow: not a good argument for this discussion
12:46emersion: penguin42: yes, agreeing on a scheme is important indeed
12:47penguin42: emersion: I think it's rare enough not to need someone doing a global replace, but common enough that it's probably a good idea for new headers or if you're already making a close change
12:47emersion: i think standards are important
12:49penguin42: fair
12:50pq: but if #pragma once is not it, then there is no standard?
12:53emersion: i personally think the standard way (ifdef guard) isn't too much of a hassle to justify diverging from the standard
12:53emersion: but yeah, some people have no problem writing gcc-specific or gcc-and-friends-specific code
12:54pq: If you're used to it, sure, but not having to do it and just slap a #pragma once feels sooo goood, and guaranteed to not silently fail causing wtf moments.
12:55daniels: emersion: tbf, 'and friends' here does seem to include every compiler I've heard of and many I haven't
12:56emersion: i don't really care, i want someone writing a compiler from just the standard to be able to compile my code
12:56pq: if the pragma is ignored, the build failures will be very obvious - and frustrating
12:56emersion: pq, no
12:56emersion: https://en.cppreference.com/w/cpp/preprocessor/impl
12:57emersion: > Any pragma that is not recognized is ignored.
12:57pq: for C? haa haa. for Rust? There is no standard? What about Go?
12:57emersion: Go has a standard
12:57emersion: and it's quite short
12:57emersion: Rust doesn't have a standard and that's a major flaw
12:57psykose: language standard debate hour woo
12:58emersion: yeah, sorry about that
12:58emersion:goes back to writing C
12:58psykose:hands a + to emersion, just the one
12:58emersion: :D
12:58daniels: psykose: not a #?
12:58psykose: that's four +'s
12:59penguin42: emersion: I sympathise, the '#pragma once' seems to be a little unusual in being a bit wider than 'gcc and friends' but as you say still not a standard
12:59penguin42: the four plusses of the apocalypse
12:59daniels: psykose: or two ✝
13:00lumidify: https://old.reddit.com/r/C_Programming/comments/f81zyz/what_is_the_story_of_gcc_v134_playing_games_when/
13:00lumidify: In case someone is using GCC 1.34 :P
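For reference, the "ifdef guard" from the discussion above can be sketched as follows. This is only an illustration: the header name and the `example_point` identifiers are made up, and the header body is pasted twice into one file to mimic a header that is reached through two different include paths.

```c
/* Hypothetical header "example_point.h", shown inline.
 * All names here are invented for illustration. */

/* ---- first inclusion: the guard macro is not defined yet ---- */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct example_point {
    int x;
    int y;
};

static inline int example_point_sum(struct example_point p)
{
    return p.x + p.y;
}

#endif /* EXAMPLE_POINT_H */

/* ---- second inclusion: EXAMPLE_POINT_H is already defined, so the
 * preprocessor skips the body and nothing is redefined ---- */
#ifndef EXAMPLE_POINT_H
#define EXAMPLE_POINT_H

struct example_point {
    int x;
    int y;
};

static inline int example_point_sum(struct example_point p)
{
    return p.x + p.y;
}

#endif /* EXAMPLE_POINT_H */
```

With `#pragma once` the two guard lines and the trailing `#endif` would be replaced by a single line, at the cost of relying on a pragma that a compiler is allowed to ignore. If it were ignored, the second copy would produce redefinition errors rather than a silent failure.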
14:01alyssa: bbrezillon: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076
14:01alyssa: ended up reworking a pile of panfrost blend code, cc'ing you for review since you probably understand it better than I do tbh
14:06bbrezillon: alyssa: I'll try to have a look tomorrow
14:07alyssa: thx
14:26Pierce[m]: I thought I'd ask here rather than making an issue on the mesa Gitlab, but where would be the best place to ask about the pvr driver supporting different chips in the same family?
14:28kisak: (from not-a-mesa dev) Looking for support on hardware newer than what the driver supports, or on older hardware? There's a high chance there's a massive pile of technical restrictions on hardware going older.
14:30Pierce[m]: Same family, so not older or newer per se
14:33Pierce[m]: Afaict the driver only supports the AXE-1-16M and GX6250, meanwhile there are a couple RISC-V devices that use the BXM-4-64 and the BXE-4-32
14:33kisak: Looking at https://netsplit.de/channels/?chat=mesa , I'm thinking #powervr
14:34Pierce[m]: Sweet, thank you
14:55alyssa: kisak: Given the MASSIVE improvements going from RGX to AGX... I shudder to think what the older PowerVR was like...
14:58daniels: you definitely don't get Vulkan on SGX, but given it's roughly contemporary with r4xx, that's fair enough really
14:59alyssa: fair
14:59alyssa: DavidHeidelberg[m]: Any idea why my panfrost manual jobs aren't triggering? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076
14:59alyssa: Usually I just click the containers and then all the hw jobs run, for some reason panfrost doesn't want to do that anymore
15:01alyssa: so unless panfrost ci got switched to alpine when i wasn't looking..
15:01DavidHeidelberg[m]: alyssa: you also need to trigger `alpine/x86_64_lava_ssh_client`
15:01DavidHeidelberg[m]: (that's needed for every LAVA based job)
15:01daniels: but also .gitlab-ci/bin/ci_run_n_monitor.py --target '^panfrost.*'
15:02alyssa: daniels: I wanted a full run actually
15:02alyssa: (or at least, a significant one)
15:02daniels: --target '.*' :P
15:02alyssa: ...Ah, well
15:02alyssa: :p
15:03alyssa: anyway
15:03alyssa: back to vulkan~
15:21emersion: sima, gfxstrand: please correct me if i'm saying inaccurate/nonsense again https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/90#note_1996334
17:09dottedmag: Is DRM_IOCTL_MODE_DIRTYFB still relevant? I see wlroots does not use drmModeDirtyFB.
17:11sima: dottedmag, only if you do frontbuffer rendering, it does nothing when you're pageflipping
17:11sima: well, nothing useful :-)
17:12dottedmag: sima: Thanks
17:12sima: emersion, makes sense to me
17:14emersion: ty
17:22austriancoder: alyssa: if you have time it would be nice if you could have a second look at !24054
18:17zmike: anholt / DavidHeidelberg[m]: what do you think about having a couple frames from steam in ci trace jobs?
18:20DavidHeidelberg[m]: zmike: the steam overlay? Makes sense. (I guess it'll be small)
18:21zmike: like the main steam client
18:21zmike: it's pretty small
18:21DavidHeidelberg[m]: Yeah. I recall that a long (long) time ago I saw some corruption there, so I guess it would be nice.
18:22zmike: unfortunately a lot of issues in steam come from glx
18:22zmike: which can't be tested through traces
18:23karolherbst: penguin42: I hit a weird fail with device and host timer today, and the CTS complains about this: deviceStartTime: 3577608128229, deviceEndTime: 3469253489
18:23karolherbst: and... I have no idea how that can happen...
18:25karolherbst: took like 200 runs to hit it once
18:33everfree: could anyone point me at where I'd start trying to figure out why my graphics hardware can do 1920x1080 output but not 1280x1024? (embedded stuff because of course it is). I have run into this before with a different chip and I vaguely remember finding a table of supported display clocks hardcoded into the drm driver source code or something like that. I also remember not being able to figure out how
18:33everfree: to properly figure out all the numbers to add a new clock, and I'm not sure if that's how every drm driver works
18:35everfree: the hdmi output is added via an out of tree patch so that's extra fun (i.MX cadence)
19:10airlied: everfree: usually there are pll calculations
19:10airlied: instead of mode tables
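A rough sketch of what such a PLL calculation can look like, as opposed to a fixed mode table. Everything here is an assumption for illustration: the model `f_out = f_ref * m / (n * p)`, the divider ranges, the 24 MHz reference used in the usage note, and the names `pll_cfg` and `pll_find`. Real hardware adds constraints (for example a valid VCO frequency range) that this brute-force search ignores.

```c
#include <stdlib.h>

/* Hypothetical PLL model: out_khz = ref_khz * m / (n * p).
 * The divider ranges below are invented, not taken from any real chip. */
struct pll_cfg {
    long m;        /* feedback multiplier */
    long n;        /* reference pre-divider */
    long p;        /* post-divider */
    long out_khz;  /* resulting output clock in kHz (rounded down) */
};

/* Try every divider combination and keep the one whose output is
 * closest to target_khz. Returns 0 on success, -1 on invalid input. */
int pll_find(long ref_khz, long target_khz, struct pll_cfg *best)
{
    long best_err = -1; /* -1 means "no candidate stored yet" */

    if (ref_khz <= 0 || target_khz <= 0 || best == NULL)
        return -1;

    for (long n = 1; n <= 16; n++) {
        for (long p = 1; p <= 8; p++) {
            for (long m = 1; m <= 128; m++) {
                long out = ref_khz * m / (n * p);
                long err = labs(out - target_khz);

                if (best_err < 0 || err < best_err) {
                    best_err = err;
                    best->m = m;
                    best->n = n;
                    best->p = p;
                    best->out_khz = out;
                }
            }
        }
    }
    return 0;
}
```

With an assumed 24000 kHz reference, `pll_find(24000, 108000, &cfg)` finds an exact setting for the 108 MHz pixel clock of 1280x1024@60, and `pll_find(24000, 148500, &cfg)` does the same for the 148.5 MHz clock of 1920x1080@60. A driver that rejects a mode may be doing so because no divider setting lands close enough to the requested clock, or because a limit this sketch does not model is violated.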
19:50penguin42: all very pretty
20:27benjaminl: ooh, shader trap handlers sounds like a huge improvement. what I've been doing currently is binary-searching the shader by cutting it off at different points
20:27benjaminl: works, but kinda miserable
20:44gfxstrand: zmike: Mind quick looking at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089/diffs?commit_id=16e34a3caab13447f53a77712c2382de138b9717
20:44gfxstrand: zmike: The gallivm patch in the nir register MR
20:44gfxstrand: zmike: I think it looks okay but IDK what's going on with the AOS stuff. Passes CI at least.
20:47zmike: probably want airlied for a more comprehensive review
20:47zmike:-> couchmode
20:48airlied: int nae = nir_intrinsic_num_array_elems(decl);
20:48airlied: int num_array_elems = nir_intrinsic_num_array_elems(decl);
20:48airlied: consistent :-P
20:48airlied: you want airlied in a few hours maybe, this airlied isn't useful
20:49gfxstrand: I think we might actually ship this mess today
20:54airlied: okay rb for the gallivm supplied
20:56karolherbst: I need a solution for initializers on global memory I think...
20:59karolherbst: or global global mem in general I think :D
21:13gfxstrand: Yeah...
21:13gfxstrand: TBH, I don't think it should be that hard.
21:13karolherbst: well
21:13karolherbst: the details are hard
21:13karolherbst: like imagine you have 5 variables
21:13gfxstrand: Biggest problem will be linking
21:14karolherbst: and one entry point only accesses three of them
21:14karolherbst: and nir happily DCEing the others
21:14karolherbst: though
21:14gfxstrand: Don't DCE them
21:14karolherbst: we could also just bind a buffer per variable
21:14karolherbst: okay, but does vtn also ensure they are processed in the same order?
21:14gfxstrand: No, let's not bind a buffer per variable unless we really need to
21:15gfxstrand: Same order? It should
21:15karolherbst: right
21:15gfxstrand: The bigger question, IMO, is linking. What happens if there's more than one SPIR-V used to create a program?
21:15karolherbst: it doesn't matter
21:15karolherbst: it only matters for like fully linked programs
21:16karolherbst: maybe spirv-link might need some fixing there, but I think it handles it just fine
21:16karolherbst: there is also no way to actually extract the data besides through a kernel part of the final binary
21:17karolherbst: "This information is only available after a successful program executable has been built for at least one device in the list of devices associated with program." and other sentences like that
21:17karolherbst: (though that's for ctors/dtors)
21:17tleydxdy: is there a way to tell which process holds a reference to a dma-buf? I'm running into a problem unloading amdgpu because there's one dma_buf that is not released
21:18tleydxdy: I did a trace for module_put/get and it's from a dma_buf_export done by Xorg
21:19tleydxdy: but xorg is dead already so I wonder what's keeping the dma_buf alive
21:23karolherbst: gfxstrand: I think the biggest open question for me atm is, who is responsible for extracting the initializer? Should we just accept our fate and let vtn do it on any entrypoint and provide the full thing or should we have a helper function just for extracting it?
21:24robclark: tleydxdy: /sys/kernel/debug/dma_buf/bufinfo has some info, but it looks like not which processes have it imported.. perhaps lsof and/or `ls -l /proc/*/fd/|grep dmabuf`
21:25robclark: actually just `lsof | grep dmabuf` would work
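The `/proc/*/fd` check suggested above can also be done programmatically. The sketch below is Linux-only and rests on one assumption taken from the `grep dmabuf` suggestion: that the link target of a dma-buf fd contains the substring `dmabuf`. The function names are made up. References held inside the kernel rather than through a process fd would not show up this way.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: a dma-buf fd's link target contains "dmabuf". */
int link_is_dmabuf(const char *target)
{
    return target != NULL && strstr(target, "dmabuf") != NULL;
}

/* Count dma-buf fds of one process. pid is a string such as "1234"
 * or "self". Returns -1 if /proc/<pid>/fd cannot be opened. */
int count_dmabuf_fds(const char *pid)
{
    char dir_path[PATH_MAX];
    char link_path[PATH_MAX + NAME_MAX + 2];
    char target[PATH_MAX];
    struct dirent *ent;
    int count = 0;

    assert(pid != NULL);
    snprintf(dir_path, sizeof(dir_path), "/proc/%s/fd", pid);

    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue; /* skip "." and ".." */

        snprintf(link_path, sizeof(link_path), "%s/%s",
                 dir_path, ent->d_name);
        ssize_t len = readlink(link_path, target, sizeof(target) - 1);
        if (len < 0)
            continue; /* fd may have been closed in the meantime */
        target[len] = '\0';

        if (link_is_dmabuf(target))
            count++;
    }
    closedir(dir);
    return count;
}

/* Print every process that currently holds at least one dma-buf fd. */
void report_dmabuf_holders(void)
{
    struct dirent *ent;
    DIR *proc = opendir("/proc");

    if (proc == NULL)
        return;
    while ((ent = readdir(proc)) != NULL) {
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue; /* only numeric entries are processes */
        int n = count_dmabuf_fds(ent->d_name);
        if (n > 0)
            printf("pid %s: %d dma-buf fd(s)\n", ent->d_name, n);
    }
    closedir(proc);
}
```

Calling `report_dmabuf_holders()` as root covers all processes. Run unprivileged, it only sees fd tables the caller is allowed to read.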
21:25karolherbst: maybe we have to split spirv_to_nir into two functions... one for general processing with a parsed result which one then can create nir_shaders from... or something.. dunno
21:25karolherbst: might also allow for some reduction of CPU overhead
21:26tleydxdy: yeah, I tried listing /proc/*/fd but none of the fd is a dmabuf
21:26karolherbst: I suspect almost everything before "vtn_build_cfg(b, words, word_end);" could be done per spirv, and then everything after that is per entry point
21:26gfxstrand: karolherbst: We have `nir_gather_explicit_io_initializers()`
21:26karolherbst: right.. but what if you have a spirv with like 10 entry points, do you do it for every nir containing the same data?
21:26gfxstrand: So if we don't dead-code and we know all the SPIR-Vs have the same global variable order, that should do the trick.
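To illustrate why "no dead-code elimination plus a fixed variable order" is enough: if every entry point lays out all global variables in declaration order, they all compute the same offsets into one shared buffer, whether or not they use each variable. The sketch below is not Mesa code. `struct global_var` and `layout_globals` are hypothetical names, and alignments are assumed to be powers of two.

```c
#include <stddef.h>

/* Hypothetical description of one program-scope global variable. */
struct global_var {
    size_t size;    /* size in bytes */
    size_t align;   /* required alignment, assumed to be a power of two */
    size_t offset;  /* filled in by layout_globals() */
};

/* Assign buffer offsets strictly in declaration order.
 * Returns the total size of the backing buffer in bytes. */
size_t layout_globals(struct global_var *vars, size_t count)
{
    size_t cursor = 0;

    for (size_t i = 0; i < count; i++) {
        size_t mask = vars[i].align - 1;

        cursor = (cursor + mask) & ~mask; /* round up to alignment */
        vars[i].offset = cursor;
        cursor += vars[i].size;
    }
    return cursor;
}
```

For five variables with (size, align) of (4, 4), (8, 8), (1, 1), (4, 4) and (16, 16), this yields offsets 0, 8, 16, 20 and 32 and a 48-byte buffer. Dropping an unused variable from the list in one entry point would shift the later offsets, which is the mismatch that keeping all variables avoids.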
21:26karolherbst: or do we have a special case for ctors/dtors?
21:26karolherbst: mhhh
21:27karolherbst: actually...
21:27karolherbst: we could allow spirv_to_nir to create ctor nir shaders and only that would contain the initializer
21:27karolherbst: need it anyway if we want to support ctors
21:28karolherbst: Initializer execution mode I mean
21:28tleydxdy: fwiw the dma-buf has a write fence by drm_sched on gfx queue
21:29karolherbst: sooo.. from a cl runtime perspective we could either make clc report if we have Initializers (either constant data or an actual Initializer entry point) and then ask spirv_to_nir to build that one for us, or we just try it always and spirv_to_nir might fail to give one to us...
21:30karolherbst: the only constant data case might look weird then as there are no instructions...
21:30robclark: tleydxdy, hmm, sounds a bit like amdgpu needs to block unload until scheduled jobs complete or are successfully canceled, because it is deadlocking itself..
21:30karolherbst: anyway.. that's kinda the part I'm the most concerned about atm
21:30tleydxdy: the fence is signaled tho
21:31robclark: sure, that doesn't really matter
21:31robclark: hmm, or maybe something in the job cleanup path hangs
21:31tleydxdy: also if I don't see Xorg doing dma_buf_release wouldn't that be a problem in itself?
21:32karolherbst: gfxstrand: do you think it would be weird to create a nir_shader without any instructions/functions, but only variables + some global init data?
21:33karolherbst: though I guess the runtime could also just check if the function body actually has any instructions
21:33karolherbst: and like not run it if there isn't any
21:34gfxstrand: If we're going to have an init shader, we can make it do the initialization
21:34gfxstrand: Then it would have stuff
21:34gfxstrand: Or we can do the initialization with a memcpy from the CPU
21:34gfxstrand: :shrug:
21:34karolherbst: yeah well.. the thing is, those init kernels are expected to run single thread...
21:34karolherbst: *threaded
21:34karolherbst: but yeah.. we could also just always do it on the GPU side
21:35karolherbst: and then from a runtime perspective it would look the same
21:35karolherbst: yeah... maybe just do this, it sucks a little, because you'll probably end up binding a constant buffer which copies the content...
21:36robclark: tleydxdy: not sure.. but if Xorg process has exited all its file handles should be closed.. I trust the kernel process tear-down codepaths a lot more than any driver unload codepath
21:36karolherbst: I'll play around with it and see where it leads me to
21:36robclark: process cleanup paths actually get tested
21:38gfxstrand: karolherbst: A little constant buffer doesn't sound like a bad plan.
21:39tleydxdy: yeah, therefore I was thinking something must be keeping the dma-buf alive
21:39tleydxdy: but what
21:39karolherbst: yeah.. it just fails pointless to copy the exact same content to another buffer on the gpu side :D
21:39karolherbst: *feels
21:40karolherbst: gfxstrand: how about.. if the Initializer nir_shader has a constant buffer and no instructions, the runtime just uses the constant buffer to init the global one
21:40karolherbst: ...
21:40karolherbst: feels dirty, but...
21:41karolherbst: maybe I shouldn't worry about it too much
21:43karolherbst: let's see...
21:45karolherbst: ahh.. most impls these days report 64k as the max size.. whatever then
21:45karolherbst: though they used to report 1G-10G
21:46karolherbst: AMD PAL and APP.. whatever
21:55karolherbst: I think I'll look into this once I'm done with non uniform work groups :D
22:58alyssa: robclark: what's the current state of a2xx?
22:59alyssa: trying to figure out who's going to do the a2xx portion of https://gitlab.freedesktop.org/mesa/mesa/-/issues/9051
22:59alyssa: !23089 is finally reviewed and in the Marge queue
22:59alyssa: which means it's time for me to start figuring out who's doing what so we can keep things moving :)
23:00alyssa: are you maintaining a2xx? would you be up for writing the patch?
23:00alyssa: if not, do you have some way to test a2xx if I write the patch with my eyes closed?
23:00alyssa: if not, is now the time to move a2xx to amber?
23:01alyssa: karolherbst: ^ friendly ping wrt nv50, should be easy enough to do the direct translate approach
23:03alyssa: samuelig: I assume the usual crew will take care of vc4 and v3d conversion?
23:18penguin42:would hope for precisely one thing from a NonNull pointer type
23:45karolherbst: alyssa: ahh right, I'll take a look and see how painful it would be
23:45karolherbst: or do you mean just using the new stuff for now?