01:31 yayiguess: So what would the best way for someone to get involved in the EVoC?
01:33 imirkin: send patches
05:02 tjaalton: jekstrand: anything beyond 20.0.x is up to the user, this is what 20.04 will ship with and is backported for 18.04.5
05:03 tjaalton: or will be
10:15 MrCooper: cwabbott: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4204 accidentally built the x86_build:2020-03-13 from scratch in your project, because it wasn't available in the main project yet; please delete it on https://gitlab.freedesktop.org/cwabbott0/mesa/container_registry
10:17 cwabbott: MrCooper: just to be sure, you mean 2020-03-13-built-by-job-1951069 on that page?
10:17 MrCooper: both
10:18 MrCooper: the *-built-by-* tag is only created when building from scratch and references the job number which built it
10:18 MrCooper: but either of them can keep the docker storage alive
10:19 cwabbott: ok, they should be deleted, I don't see them anymore
10:19 MrCooper: (identified by the hash)
10:20 MrCooper: thanks! bentiss has ideas which will hopefully allow this to be automated soon
10:45 daniels: tomeu, MrCooper: can we drop the meson-arm64-build-test job? its last execution was 1m52s, whereas the full meson-arm64 job only took 2m6s, so I don't see what this gains us at all
10:54 MrCooper: daniels: that job was added for keeping LLVM out of the ramdisk, though the ramdisk needs to be moved out of artifacts anyway
10:54 daniels: yeah, working on fixing artifacts as we speak
10:56 MrCooper: great
16:25 jekstrand: danvet: The more this e-mail goes on the less convinced I become that anyone except i915 actually implements dma-buf implicit sync properly.
16:25 jekstrand: danvet: Maybe it's all a lie and we should just burn down the house?
16:25 jekstrand: krh: How do you feel about flame-roasted noodles?
16:26 danvet: jekstrand, I haven't dared looking at any thread with you involved yet :-)
16:27 danvet: but yeah amdgpu had some really funny interpretations of what this was supposed to do
16:27 danvet: xexaxo1, buried alive
16:27 danvet: and distracted somewhat
16:28 jekstrand: danvet: Well, it sounds like V4L ignores it entirely
16:28 danvet: oh v4l is garbage
16:28 jekstrand: amdgpu is weird
16:28 danvet: you're supposed to block
16:28 danvet: it's the v4l way
16:29 jekstrand: I have no idea what freedreno, nouveau, or the Raspberry Pi do
16:29 danvet: yeah amdgpu is just weird nowadays, they did fix the cross driver dma_resv issues
16:29 danvet: so kms side should be correct
16:29 danvet: because I wrote a helper :-)
16:29 jekstrand: That's good
16:29 danvet: I do think vc4/v3d and freedreno rendering are also fairly correct
16:29 jekstrand: And it's caused some stir in the Wayland world where it looks like I may have been successful at motivating most major compositors to think about implementing the Wayland explicit sync extension.
16:29 danvet: at least the raspi side of things, since people use those rendering to external panels
16:30 jekstrand: (Or maybe they were already motivated and I just poked them a bit)
16:30 danvet: and noticed that stuff teared (because the kms side helper was missing)
16:30 danvet: nice
16:31 xexaxo1: danvet: hope not literally?
16:31 jekstrand: X11 is a bit TBD at the moment. It looks on the surface like syncobj is actually a pretty good fit for XSync but it's X11 so there be dragons. ajax seems pretty convinced it's doable; daniels is convinced it isn't. I'm unconvinced of anything whatsoever. :-)
16:32 xexaxo1: danvet: although certain world situation doesn't help either I imagine
16:33 lynxeye: danvet: v4l is broken right now, but should be easy to fix
16:34 danvet: xexaxo1, yeah not literally
16:34 danvet: jekstrand, uh so trying to convince myself that nouveau/amdgpu don't have broken atomic implicit sync
16:34 danvet: not succeeding
16:35 jekstrand: :(
16:35 lynxeye: etnaviv also implemented implicit sync from day 1, as we need it to sync to the kms side, which is a different device and across possibly multiple GPUs (separate 3D and 2D cores)
16:35 danvet: lynxeye, oh all the soc drivers are really good at this now
16:36 danvet: since we have consistent gem or cma helpers
16:36 danvet: and most wired them up correctly even!
16:36 danvet: but i915 thinks that it needs its entire hand-made commit machinery (I'm not sold, but whatever)
16:37 danvet: nouveau hand-rolls it too
16:37 danvet: and amdgpu reuses the generic one, but fails to set up the implicit fences so that the generic one hits it correctly
16:37 danvet: so uh
16:38 danvet: bbrezillon, btw do we have patches somewhere to convert vc4 over to drm_atomic_helper_commit?
16:40 daniels: jekstrand: it does seem completely doable, but it needs someone who can see through dealing with the internal present scheduling
16:42 lynxeye: jekstrand: I don't really understand what kind of hacks you need to do in Vulkan userspace to make implicit sync work. Doesn't i915 still attach all the fences to the buffers you use in the execbuf? You may ignore them if you have explicit fences, but if you share them across the wire to the winsys they should be taken into account, no?
16:43 danvet: tzimmermann, just realized that drm_gem_vram_plane_helper_prepare_fb is missing a call to drm_gem_fb_prepare_fb
16:43 jekstrand: lynxeye: The problem is that Vulkan is inherently explicit sync so there's no good point in the API to tie implicit sync in.
16:43 danvet: or is there a reason it's not there that I've forgotten
16:43 jekstrand: lynxeye: And if we just implicit sync everything, we can end up with significant over-synchronization
16:44 jekstrand: lynxeye: Particularly because we can't always track which buffers are used by a given command buffer so we have to be aggressive and just include them all in the list we pass to the kernel.
16:45 lynxeye: jekstrand: Hm, why not just override implicit sync with explicit fences when they are available? As long as you stay in your cuddly explicit world you get no oversyncing as explicit always wins, but if you cross the wire to something that doesn't know about explicit you still end up doing the correct thing.
16:47 lynxeye: How would you end up with oversync on the compositor side? It's a single buffer (maybe with aux plane) you give it to present, surely you did kick off some rendering to that buffer before, so you need to wait for the fence to signal before presenting the buffer, be it implicit or explicit.
16:48 jekstrand: lynxeye: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2834
16:49 jekstrand: lynxeye: One of the big problems is that, because we may not know what buffers are in use, we have to include all of them. This includes all the buffers in your swapchain.
16:49 jekstrand: We don't, like OpenGL, have the luxury of always knowing everything
16:49 jekstrand: The MR I posted above cut probably 3 frames of latency out of DOOM 2016
16:50 jekstrand: It was hitting the over-synchronization issues really bad
16:51 jekstrand: lynxeye: The problem is that in the case where we have to include all buffers all the time, it looks (from a sync perspective) like the 3D client is rendering to the buffer while the compositor is trying to texture from it so the composite gets blocked by rendering for future frames.
16:51 jekstrand: It's not a good situation
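The over-synchronization described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Buffer`, `submit`, and `frame_compositor_waits_on` are hypothetical names, a "fence" is reduced to a frame number, and nothing here corresponds to a real kernel or Mesa interface. It shows that when every swapchain image is listed in every submission, the image already handed to the compositor picks up the fence of a later frame.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Buffer:
    name: str
    # Frame whose rendering last attached an exclusive fence, or None.
    exclusive_fence: Optional[int] = None


def submit(frame: int, buffers: List[Buffer]) -> None:
    """Toy execbuf: every listed buffer picks up this frame's exclusive fence."""
    for buf in buffers:
        buf.exclusive_fence = frame


def frame_compositor_waits_on(track_usage: bool) -> Optional[int]:
    """Return the frame the compositor must wait for before sampling img0."""
    swapchain = [Buffer("img0"), Buffer("img1"), Buffer("img2")]
    # Frame 0 renders into img0, which is then handed to the compositor.
    submit(0, [swapchain[0]] if track_usage else swapchain)
    # The client starts frame 1 into img1 before the compositor samples img0.
    submit(1, [swapchain[1]] if track_usage else swapchain)
    return swapchain[0].exclusive_fence


print(frame_compositor_waits_on(track_usage=True))   # 0
print(frame_compositor_waits_on(track_usage=False))  # 1
```

With exact tracking the compositor waits only for frame 0; with the conservative "list everything" approach it ends up waiting for frame 1, which is the blocked-composite situation from the discussion.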
16:51 imirkin: GL has bindless too.
16:52 jekstrand: imirkin: Yeah, it does
16:53 jekstrand: imirkin: But at least with GL, you have better (I think) guarantees about your window-system buffers because they're magic
16:53 imirkin: (wrt not really knowing what is used when)
16:53 jekstrand: I think in Vulkan, you're not allowed to touch it even to read from it after you've given it to the compositor but that's always been a bit unclear to me.
16:53 lynxeye: jekstrand: Ah, I see. So you always attach exclusive fences on all your swapchain buffers, I see how this is going to hurt you.
16:54 jekstrand: lynxeye: Yeah.
16:54 jekstrand: lynxeye: It's not a good state of affairs, I'm afraid. :-(
16:54 imirkin: jekstrand: quite magic. you can even glReadPixels them even if they're MSAA.
16:54 jekstrand: We've figured out ways around it in ANV (such as the above MR) but they're very invasive to the driver.
16:54 jekstrand: In RADV, it's still a significant issue
16:54 jekstrand: Due to amdgpu weirdness about how it handles implicit fencing
16:56 lynxeye: yea as far as I understand amdgpu has a "interesting" interpretation of exclusive
16:58 MrCooper: wait what, a Vulkan driver doesn't know which buffer will be used for presenting what it's drawing?
16:59 jekstrand: I'm not sure how to answer that question.
16:59 jekstrand: I think the answer to the question as asked is no. The client has checked out some images from the swapchain which it may or may not choose to use at any given point.
17:00 MrCooper: that sucks
17:00 jekstrand: I think we could possibly do something where we implicit sync on all images currently owned by the client and not sync on ones owned by the display server.
17:01 jekstrand: But that can still lead to over-sync if the client has acquired more than one
17:03 lynxeye: jekstrand: can't you simply track which buffers you handed to the compositor and like not add them to the following execbufs?
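The ownership-based idea suggested here could look roughly like the following sketch. The `Swapchain` class and its method names are hypothetical and do not mirror the Vulkan WSI API; the model only keeps a set of images the client currently owns and builds the submission list from that set.

```python
class Swapchain:
    """Toy ownership tracker: only client-owned images go into a submission."""

    def __init__(self, names):
        self.images = list(names)
        self.client_owned = set()

    def acquire(self, name):
        # The client checks an image out of the swapchain.
        if name not in self.images:
            raise ValueError(f"unknown image {name}")
        self.client_owned.add(name)

    def present(self, name):
        # Ownership passes to the display server until the image is released.
        if name not in self.client_owned:
            raise ValueError(f"{name} is not owned by the client")
        self.client_owned.remove(name)

    def submission_list(self):
        # Images owned by the display server are left out, so no new
        # fence is attached to them by later rendering.
        return sorted(self.client_owned)


sc = Swapchain(["img0", "img1", "img2"])
sc.acquire("img0")
sc.acquire("img1")
print(sc.submission_list())  # ['img0', 'img1']
sc.present("img0")
print(sc.submission_list())  # ['img1']
```

The first printed list also shows the caveat raised in the discussion: while the client holds more than one acquired image, all of them are still listed, so some over-sync remains possible.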
17:03 danvet: imirkin, skeggsb [PATCH 0/2] drm: encoder_slave: some updates <- I think nouveau is the only serious user of this, can you pls ack?
17:04 imirkin: probably Lyude's domain -- i believe that Lyude spent some time looking at the whole encoder situation, esp as it concerns mst
17:04 jekstrand: lynxeye: Possibly. I think that's probably ok but I would need to go deeply investigate the Vulkan WSI ownership stuff.
17:04 Lyude: Yeah I'll be happy to look
17:05 Lyude: will get to it today
17:05 jekstrand: lynxeye: That said, I'm getting very tired of trying to figure out how to shoe-horn implicit sync into Vulkan and would like to push stuff towards explicit. Doing explicit in an implicit API like OpenGL is much easier than doing implicit in an explicit API.
17:05 jekstrand: lynxeye: Also, even if we can come up with a "good" solution for Vulkan WSI, that doesn't fix interop with anything else such as video encode/decode libraries.
17:06 lynxeye: jekstrand: the compositor cares about the fences being correctly attached to the buffer _before_ you hand them over the protocol. Until you receive the buf_done over the wire there should be no reason for you to touch them ever again, including adding new fences.
17:07 jekstrand: lynxeye: Yes, I guess that's true.
17:07 jekstrand: lynxeye: But also, I'm trying to get to the point where we stop passing a list of buffers to the kernel at execbuf time at all. :-P
17:08 lynxeye: jekstrand: so you got reliable restartable faults yet? ;)
17:08 danvet: imirkin, drm_encoder_slave is nv04 tvencoders
17:08 jekstrand: And, unlike GL, Vulkan allows an app to start rendering on the next frame before it hands the current frame off to the compositor so we can still run into over-sync issues even if we do some sort of ownership-based thing.
17:08 danvet: Lyude, ^^
17:08 jekstrand: lynxeye: :P
17:09 danvet: this is as far away from mst encoders as you can think
17:09 danvet: like ... about 20 years or so :-)
17:09 danvet: ok maybe just 10
17:11 imirkin: danvet: oh. heh. i have a nv34 with tvout.
17:12 imirkin: but it's sadly not presently plugged in
17:12 danvet: imirkin, if I understand nouveau correctly, you really need the nv04
17:12 imirkin: the nv04 is, but no tv out :)
17:12 danvet: or maybe nv10
17:12 danvet: past that tv encoder is on the chip
17:12 imirkin: nv4 didn't have tvout
17:12 imirkin: ahhh... it's the external encoder stuff
17:12 imirkin: then we're in trouble
17:13 imirkin: i think rm -rf may be the only solution, i doubt any such boards are still in the hands of developers && plug-in-able
17:13 jekstrand: lynxeye: The plan is to have a resident set that's explicitly managed by userspace. Not faulting.
17:13 imirkin: actually lemme check, i have a nv17 which might have tvout
17:13 danvet: ah yes code hints at nv17
17:14 danvet: and yup this is for external tv encoders
17:15 imirkin: yep, got it - nv17 and there's s-video
17:15 imirkin: and naturally i have a monitor which consumes s-video
17:15 danvet: I think nouveau is the last user of this stuff
17:15 danvet: but tbh not sure
17:16 imirkin: well i can't test anything right now, but i may try to swap it in tonight
17:16 imirkin: although i'm about to move, so timing isn't ideal
17:17 imirkin: (the timing of the move itself is also not exactly fortuitous, but ... not much to do)
17:18 danvet: imirkin, but yeah deleting would be kinda neat too
17:18 danvet: or moving it into drm/nouveau
17:19 imirkin: if nothing else needs it, moving to nouveau is fine
17:21 danvet: well there's another driver in there which isn't even a drm_encoder_slave i2c driver
17:21 danvet: and then there's some random other thing that rmk stuffed in there
17:21 danvet: which isn't even a drm driver
17:21 imirkin: this is the ch7006 and whatnot stuff, right?
17:22 imirkin: there's actually another external encoder (for dvi) that we don't know how to drive on 6150SE motherboards. that one may be lost to time though.
17:28 imirkin: people have all kinds of crazy setups, so i'd rather not remove functionality if we don't have to
17:28 imirkin: i use the s-video output coz the monitor supports PIP, which is nice
17:28 imirkin: (but not with digital inputs)
17:29 tzimmermann: danvet, i see, thanks for the info. but i've got very little time this week. if you send a patch, i'll review; otherwise i'll send out a patch ASAP (next week)
17:30 danvet: tzimmermann, it's not super important, since afaiui none of these drivers actually share buffers with anything that renders
17:30 danvet: or at least not yet
18:25 Venemo: jekstrand: is there a good way to tell whether a given output store is the last that stores that output?
18:26 jekstrand: Venemo: No, not really.
18:27 Venemo: :(
18:44 jekstrand: Venemo: If you use nir_lower_io_to_temporaries (doesn't work on TCS!) then every store will be the last one.
18:44 Venemo: yeah, TCS is the problem
18:45 jekstrand: With TCS, there's just not much you can do without significant analysis
18:46 jekstrand: Well, that's not quite true. You could, in theory, do shadow copies and sync everything at barriers.
18:46 jekstrand: But that seems worse than not having the shadow
18:46 imirkin: yeah, i was going to say that barrier is the "write" point
18:46 imirkin: and in TCS, barrier must be at the outermost level
18:46 imirkin: so there's no funny business
18:47 imirkin: however be careful with patch variables
18:47 imirkin: since you can definitely have code like
18:47 imirkin: if (invoc == 0) { patch var = 5; } barrier;
18:48 jekstrand: Yup
18:49 jekstrand: So you have to track all patch variables (at the component granularity!) that have been written between the two barriers and flush exactly those.
18:49 imirkin: but my point is ... "flush" isn't always going to be an exact science
18:49 imirkin: since it may be one or another invoc's responsibility to do the flushing
18:50 imirkin: if (invoc % 13 == 0) { patchout = gl_in[invoc-7].out } -- this sort of thing is hard to "flush" later
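The bookkeeping discussed above (remember which patch-variable components were written since the previous barrier, then flush exactly those) can be sketched as follows. `PatchWriteTracker` is a hypothetical helper, not a NIR API, and it deliberately ignores control flow, so it does not address the caveat that a write may be conditional on the invocation ID.

```python
class PatchWriteTracker:
    """Track (variable, component) pairs written since the last barrier."""

    def __init__(self):
        self._dirty = set()

    def record_store(self, var, components):
        # Component granularity: a store to .xy marks components 0 and 1 only.
        for comp in components:
            self._dirty.add((var, comp))

    def flush_at_barrier(self):
        # Return exactly the components written since the previous barrier,
        # then start a fresh interval.
        flushed = sorted(self._dirty)
        self._dirty.clear()
        return flushed


tracker = PatchWriteTracker()
tracker.record_store("patch_color", [0, 1])
tracker.record_store("patch_color", [1])
tracker.record_store("tess_outer", [3])
print(tracker.flush_at_barrier())
# [('patch_color', 0), ('patch_color', 1), ('tess_outer', 3)]
print(tracker.flush_at_barrier())  # []
```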
18:52 Venemo: jekstrand: I got a very interesting tip from mareko - he suggested to schedule all tess factor calculations as far up as possible and kill the threadgroup when the tess factors are 0. I imagine this could be done by wrapping the rest of the shader after the last tess factor write in a big if-else
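The wrapping step of that suggestion can be sketched on a toy instruction list. Everything here is an assumption made for illustration: the tuple-based IR, the opcode names, and `wrap_after_last_tess_factor_write` are invented and are not NIR. The sketch covers only the "wrap the rest in a conditional" part, not hoisting the tess factor calculations upward.

```python
def wrap_after_last_tess_factor_write(instrs):
    """Guard everything after the last tess factor store with one conditional."""
    last = max(
        (i for i, ins in enumerate(instrs) if ins[0] == "store_tess_factor"),
        default=None,
    )
    if last is None:
        # No tess factor writes: nothing to guard.
        return list(instrs)
    head = list(instrs[: last + 1])
    tail = list(instrs[last + 1 :])
    if not tail:
        return head
    # The tail runs only if the tess factors are non-zero.
    return head + [("if_tess_factors_nonzero", tail)]


shader = [
    ("alu", "a"),
    ("store_tess_factor", "outer0"),
    ("store_tess_factor", "inner0"),
    ("alu", "b"),
    ("store_output", "pos"),
]
for ins in wrap_after_last_tess_factor_write(shader):
    print(ins)
```

A real pass would additionally have to deal with barriers and control flow around the tess factor stores, which this flat list does not model.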
18:54 jekstrand: Venemo: Interesting... That sounds like a very useful general NIR pass. :-)
19:01 Venemo: jekstrand: yeah, I guessed so
19:02 jekstrand: On Intel, I think we can just HALT instead of having an if
19:02 jekstrand: But we may have to do a dummy URB write or something to make that work....
19:03 jekstrand: it'd get interesting. :/
19:03 jekstrand: Venemo: I've got an MR outstanding which moves discards. A similar scheme could be used to move tess factor calculations higher
19:05 Venemo: jekstrand: yeah, I had that in mind, too.
19:05 jekstrand: It really is a discard like thing
19:05 jekstrand: mareko: How often have you seen that in TCS?
19:09 * airlied is sure I saw heaven doing it at some point, but I may also have just been broken
19:10 imirkin: jekstrand: very common to have like if (invoc == 0) { do stuff } barrier;
19:10 imirkin: the thing i said ... probably a bit less common :)
19:10 imirkin: oh wait. you're talking about something else. nevermind.
19:12 Venemo: jekstrand: I was told that this is a technique used for primitive culling, so it is not a rare pattern
19:13 jekstrand: Fun....
19:17 Venemo: all that being said, I still have some lower-hanging fruit than this. and then tess is unlikely to be a bottleneck
19:19 Venemo: but I think it's nice to keep this in mind for after we got the more important stuff done
19:26 jekstrand: yeah
19:26 dcbaker: airlied: I can't reproduce your meson 0.53.2 build errors. By chance are you using gcc 10?
19:32 airlied: dcbaker: yes gcc-10.0.1-0.9.fc32.x86_64
19:39 Venemo: jekstrand: what's gonna happen to that discard patch, btw? I think a few weeks ago I tried to add it to aco, but I didn't see a notable improvement and sadly, didn't have time to play around with it
19:42 jekstrand: Venemo: It's just sitting for now. I've seen both significant help and significant hurt with it. :-(
19:44 Venemo: jekstrand: I was hoping for a quick silver bullet kind of thing for the witcher 3, and when I saw you mention dxvk in the mr, I just had to try
19:44 Venemo: that game seems to resist all the optimizations that we made
19:45 jekstrand: Venemo: Yeah, Witcher 3 isn't kind to drivers. :-(
19:45 jekstrand: Venemo: I think it was Skyrim that that helped
19:46 Venemo: haven't checked that one
19:46 jekstrand: Venemo: But the situation is way better now that we have demote_to_helper_invocation
19:46 jekstrand: Venemo: At the time I originally wrote it, DXVK was doing ALL discards at the very end of the shader.
19:46 jekstrand: Which is pretty much pessimal
19:48 Venemo: and I assume dxvk has improved since then
19:48 jekstrand: Yup
19:49 jekstrand: We have VK_EXT_demote_to_helper_invocation now
19:49 jekstrand: Which provides a D3D-style discard so it can just use that rather than having to move it to the bottom to emulate D3D semantics with GL-style discard.
19:51 Venemo: what exactly does a discard do? I was always too afraid to ask
19:58 dcbaker: airlied: can you try adding -fcommon to your c arguments and see if that fixes it?
20:01 airlied: dcbaker: trying now
20:05 airlied: dcbaker: had to add it to cpp_link_args
20:06 dcbaker: but adding it to c_link_args and cpp_link_args resolved it?
20:07 airlied: dcbaker: seems to
20:07 dcbaker: okay, gcc 10 switched the default from -fcommon to -fno-common
20:07 airlied: dcbaker: why did meson suddenly break it though?
20:08 dcbaker: it's not meson, it's gcc
20:08 airlied: it builds with gcc 10 and meson previous version
20:08 dcbaker: gcc switched its default in gcc 10
20:08 dcbaker: really?
20:08 dcbaker: that -fcommon thing is a gcc 10 change
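As a rough illustration of what the flag changes: with -fcommon, an uninitialized global defined in several translation units becomes a "common" symbol that the linker merges, while with -fno-common each one is a regular definition and duplicates fail at link time. The toy linker below is a sketch only; `compile_unit` and `link` are hypothetical names, the symbol kinds are simplified to two strings, and it says nothing about why the meson point release changed the outcome in this case.

```python
def compile_unit(tentative_globals, fcommon):
    """Toy compile step: map each uninitialized global to a symbol kind."""
    kind = "common" if fcommon else "defined"
    return {name: kind for name in tentative_globals}


def link(objects):
    """Toy linker: merge common symbols, reject duplicate definitions."""
    table = {}
    for obj in objects:
        for name, kind in obj.items():
            if kind == "defined" and table.get(name) == "defined":
                raise RuntimeError(f"multiple definition of '{name}'")
            # A regular definition takes precedence over a common symbol.
            if table.get(name) != "defined":
                table[name] = kind
    return table


# Two translation units that both declare `int counter;` at file scope.
units = [["counter"], ["counter"]]

print(link([compile_unit(u, fcommon=True) for u in units]))

try:
    link([compile_unit(u, fcommon=False) for u in units])
except RuntimeError as err:
    print("link error:", err)
```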
20:08 airlied: yes I was running gcc 10 fine with meson-0.53.1-1.fc32
20:08 dcbaker: hmmm
20:09 airlied: upgrade to 0.53.2-1.git88e40c7.fc32 broke it
20:09 airlied: like I've just changed that one package locally to test
20:11 dcbaker: hmmmm
20:13 airlied: dcbaker: moving libdri to the start of the list seems to help
20:13 airlied: in src/gallium/targets/dri/meson.build
20:14 dcbaker: yeah, I think for right now I'll just send a patch to add -fcommon, which should fix it. Then we can figure out whether we want -fcommon or not
20:17 airlied: dcbaker: I've filed https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220
20:17 airlied: seems to fix it here
20:34 mareko: jekstrand: a lot of games cull in TCS, but a lot of games also do: if (tess_level[0] != 0) compute_outputs;
20:35 airlied: dcbaker: seems to make it through CI builds, I'd rather not force no-common unless we really have to
20:35 dcbaker: no-common does have binary size and performance benefits
20:36 dcbaker: and apparently common isn't compatible with C20, though I could be misreading what the patch to switch the default says
20:38 airlied: dcbaker: sorry I meant I'd rather not force common
20:39 airlied: just make mesa work with gcc defaults, though I do wonder what meson changed here
20:39 airlied: maybe it reordered some linker line
20:39 dcbaker: I'm running gcc 9 with -fno-common now + your patch, if that looks good I'll send a PR to turn it on
20:40 dcbaker: I'm not sure either
20:40 dcbaker: I don't track what goes into the point releases very closely
20:42 danvet: sumits, [PATCH] MAINTAINERS: Better regex for dma_buf|fence|res <- ack on this?
20:43 sravn: mripard: I have two patches that should go into 5.7. They fix stuff from drm-misc-next. If I get it right I push to drm-misc-next-fixes - right?
20:43 sravn: mripard: This was how I understood: https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html
20:46 mripard: sravn: yep
20:52 sravn: mripard: thx, will process tomorrow.
21:51 jekstrand: anholt_, mareko: If I add a new PIPE_FORMAT is there a pile of metadata I need to fill out?
21:55 jekstrand: Found u_format.csv
22:11 anholt_: jekstrand: I think u_format.csv and the define should be it.
22:12 jekstrand: anholt_: cool.
22:12 anholt_: possibly some helper functions about properties of it, but we've been adding some unit tests
22:12 anholt_: (srgb, for example)
22:18 imirkin: svga used to be sensitive to new formats, but i think they fixed it