IRC Logs of #dri-devel on irc.freenode.net for 2023-07-21

05:13 ishitatsuyuki: good question, and apparently a 0 propertyFlags is explicitly allowed
09:09 linkmauve: “09:35:55 emersion> but dumb BOs are not for clients…”, what would be the recommended way for clients to do software rendering into buffers they expect to be used as dmabufs? I’m thinking about software image/video decoding to maximise the chances it gets promoted to a plane.
09:10 emersion: someone, compositor or client, needs to upload to the GPU
09:11 emersion: the compositor can upload to a scanout capable buffer if it wants to
10:04 linkmauve: emersion, does that mean it’s useless to try to optimise software decoding with a dumb buffer allocated from the kms device, and instead we should use GL or so to upload it anyway?
10:04 emersion: hm, so, maybe you'd want to write directly to the dumb buffer to avoid copies?
10:05 linkmauve: That’s my reasoning yes.
10:05 emersion: although shadow buffers are preferred most of the time
10:06 linkmauve: Let’s say I’m a very simple video application, using ffmpeg or similar library in software-only mode, and I want to get the best possible path towards the screen.
10:07 linkmauve: I do not use GL yet.
10:07 linkmauve: I would expect dumb buffers to provide said best possible path, at least scanout.
10:08 linkmauve: It won’t be tiled or anything, but for scanout that’s sensible on most hardware.
10:14 mripard: linkmauve: another related topic is that dumb buffers (on most ARM platforms at least) are mapped non-cacheable, so software access will be painfully slow
10:15 linkmauve: Hmm, so it’d be better to decode into a normally-allocated (e.g. malloc()) buffer, and then use wl_shm to let the compositor do what it thinks is best with the buffer?
10:17 linkmauve: That way memory access won’t be too slow on decoding.
10:28 MrCooper: certainly if there's a non-0 chance of CPU reads from the destination buffer
10:30 dottedmag: There's a capability that tells that dumb buffer is slow on reads/non-sequential writes. I'm not sure how reliable this cap is.
10:33 MrCooper: DRM_CAP_DUMB_PREFER_SHADOW
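A minimal sketch of the shadow-buffer approach discussed above, assuming the decoder writes into ordinary cached memory and the frame is then copied row by row into the scanout mapping. The function name, the buffer sizes, and the use of a `bytearray` as a stand-in for the mmap'd dumb buffer are all hypothetical; a real client would first check DRM_CAP_DUMB_PREFER_SHADOW on the device.

```python
def copy_shadow_to_scanout(shadow, scanout, width_bytes, height,
                           shadow_pitch, scanout_pitch):
    """Copy rows from a cached shadow buffer into a scanout mapping.

    The destination is only written, sequentially, and never read back,
    which is the access pattern a non-cacheable mapping tolerates best.
    """
    src = memoryview(shadow)
    dst = memoryview(scanout)
    for row in range(height):
        s = row * shadow_pitch
        d = row * scanout_pitch
        dst[d:d + width_bytes] = src[s:s + width_bytes]


# Hypothetical 4x2 XRGB8888 frame: 16 bytes per row, scanout pitch padded to 32.
width_bytes, height = 16, 2
shadow = bytearray(range(width_bytes * height))  # stand-in for decoder output
scanout = bytearray(32 * height)                 # stand-in for the mapped dumb buffer
copy_shadow_to_scanout(shadow, scanout, width_bytes, height, 16, 32)
```

The two pitches are kept separate because a dumb buffer's pitch is chosen by the driver and may be larger than the tightly packed row size.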
10:43 doras: emersion: do you happen to know if any DRM display driver currently implements a slew rate for variable refresh rate in kernel or firmware?
10:44 emersion: amdgpu has LFC
10:45 emersion: intel hw supports some stuff but not sure it's enabled on Linux
10:46 doras: emersion: do you have any pointers regarding Intel's support where I can learn more about it?
10:47 emersion: only heard about it in this channel iirc
10:47 emersion: maybe ask in the intel channel
10:48 doras: emersion, zamundaaa: have you tried implementing a slew rate in your respective compositors?
10:48 emersion: nope
10:48 zamundaaa[m]: Not yet. I should be able to soon though
10:51 doras: zamundaaa: do you plan to scope it only to passively updating apps through some heuristic, or otherwise only in case of extreme changes in refresh rate?
10:52 MrCooper: doras: during the hackfest, Manasi mentioned that i915 developers are considering adding LFC, but there seemed to be consensus that we'd rather handle it in the Wayland compositor
10:56 zamundaaa[m]: Dor Askayo: scope what? Slew rate limit has to happen whenever vrr is active
10:56 doras: MrCooper: I feel that we're missing opinions from people with knowledge about the kernel implementation of LFC for amdgpu.
10:58 doras: zamundaaa: so the second option I mentioned, then. The slew rate would kick in regardless and prevent extreme changes in refresh rate.
11:02 doras: I mean, you likely don't want the slew to affect the change in refresh rate in the case of small changes that wouldn't otherwise result in significant flicker. It would result in visual stutter in games (for example) without any benefit.
11:05 doras: MrCooper: I'm asking these questions with VRR ranges between 1-N Hz in mind, where LFC won't ever be needed anyway.
11:05 zamundaaa[m]: Oh yeah, there's definitely different use cases for the default settings. I hope though that a properly configured maximum rate of change would work for both
11:05 zamundaaa[m]: So that you limit change enough to prevent flicker but not so much that it makes it useless for games
11:06 doras: zamundaaa: yes, exactly
11:19 doras: zamundaaa: how would you handle ramping refresh rate down? Send the same frame for presentation multiple times if necessary?
11:29 zamundaaa[m]: Yes
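The slew rate limit described above could be sketched roughly as follows. This is an illustration under stated assumptions, not any compositor's actual code: the function names, the use of integer microseconds, and the 5000 µs maximum step are all hypothetical. Each presented frame may move the frame interval toward the target by at most a fixed step; when ramping the refresh rate down, the intermediate intervals would be filled by presenting the same frame again.

```python
def next_frame_interval(current_us, target_us, max_step_us):
    """Move the frame interval toward target_us by at most max_step_us."""
    delta = target_us - current_us
    if delta > max_step_us:
        delta = max_step_us
    elif delta < -max_step_us:
        delta = -max_step_us
    return current_us + delta


def ramp(current_us, target_us, max_step_us):
    """Return the frame intervals used until the target interval is reached."""
    intervals = []
    while current_us != target_us:
        current_us = next_frame_interval(current_us, target_us, max_step_us)
        intervals.append(current_us)
    return intervals


# Large change (100 Hz -> 40 Hz): spread over several repeated frames.
big_change = ramp(10000, 25000, 5000)
# Small change: within one step, so the limit does not interfere.
small_change = ramp(10000, 12000, 5000)
```

With a well chosen maximum step, small changes pass through in a single frame while large ones are spread out, which matches the goal of preventing flicker without adding stutter in games.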
13:17 karolherbst: does any hardware have group operations on a workgroup level? Like all that subgroup stuff, just one level higher
14:32 Hazematman1: OpenGL provoking vertex question. Is the provoking vertex supposed to affect the order in which vertices are processed during primitive assembly? I can't find any mention of this in the spec but I'm getting different behavior between different GPUs and the piglit test spec@!opengl 3.2@gl-3.2-adj-prims line cull-front pv-first seems to test for this
14:39 cmichael: this sounds oddly familiar ;)
14:41 Hazematman1: cmichael: Haha! When you're losing your mind trying to understand something you have to try and get all the help you can :P
14:41 cmichael: Hazematman1, oh I know that all too well ;)
14:45 mattst88: eric_engestrom: I'd kind of forgotten about it, but would you mind making a release of glu as well?
14:52 robclark: does something about dma_fence confuse kmemleak? I've got a kmemleak report of a bunch of random fence leaks (and a few sync_file).. but it's a mix of drm/sched hw_fence and user_fence and kms atomic out-fence
15:14 tleydxdy: at least some of those are real, I'm having a hard time tracking them down tho
15:19 pac85: Since somebody brought up how provoking vertex mode affects primitive assembly I wonder if somebody could shed light into the behaviour I described in this MR https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22309
16:09 eric_engestrom: mattst88: could you send me a link to the procedure for glu? I'll try to do it tomorrow :)
16:09 eric_engestrom: (the https://docs.mesa3d.org/releasing of glu)
17:08 zmike: with nir_intrinsic_store_output what does src1 mean: is that like the varying slot offset?
17:22 mattst88: eric_engestrom: I don't think there is one -- IIRC it's just the standard procedure for X packages
17:22 mattst88: e.g. `modular/util/release.sh .` and send the email
17:45 eric_engestrom: mattst88: ack, I'll do that then; what should the version be? 9.0.3, 9.1.0, 10.0.0?
17:45 eric_engestrom: has not been following the glu repo _at all_
17:47 eric_engestrom: and the tag is on master directly I assume, since I see no branches?
17:49 mattst88: eric_engestrom: yeah, tag on master. I think 9.0.3 is fine -- very few changes
17:49 mattst88: thanks very much!
17:53 eric_engestrom: mattst88: there's still autotools next to meson.build, perhaps it's time to drop autotools?
17:54 mattst88: eric_engestrom: totally fine with me
17:59 eric_engestrom: mattst88: https://gitlab.freedesktop.org/mesa/glu/-/merge_requests/11
17:59 eric_engestrom: I'll make the release after that
18:24 mattst88: looks good to me, thanks!
18:26 eric_engestrom: glu 9.0.3 out :)
18:41 jenatali: zmike: It's the driver_location of the variable that the write was lowered from
18:42 zmike: 😬
18:42 zmike: I don't think that's accurate?
18:43 zmike: or maybe it just doesn't match up in this case
19:06 jenatali: Oh no I'm wrong, it's a 0-based offset from nir_intrinsic_base, nir_intrinsic_base comes from driver_location
19:06 zmike: okay that sounds more likely
19:06 jenatali: Which is used if a varying uses multiple vec4s
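As a rough model of the addressing jenatali describes, assuming the slot written by a store_output is the intrinsic's base plus the 0-based offset source: the helper names below are hypothetical and only model the arithmetic, they are not NIR API.

```python
def output_slot(base, offset):
    """Slot written by one store: base (from driver_location) plus 0-based offset."""
    return base + offset


def slots_for_varying(base, num_vec4s):
    """All slots covered by a varying that spans num_vec4s consecutive vec4s."""
    return [output_slot(base, i) for i in range(num_vec4s)]


# Hypothetical mat4 varying whose driver_location is 3: it occupies four vec4 slots.
mat4_slots = slots_for_varying(3, 4)
```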
19:07 zmike: yeah I'm battling enhanced layouts for the trillionth time
19:07 jenatali: FWIW we end up lowering IO in the backend as a late step so that we can keep the variables
19:08 zmike: I'm deleting the variables
19:08 jenatali: That might be an easier first step (if that's not already what you're doing)
19:08 zmike: almost there...