03:23 airlied: DemiMarie: I think virtualbox has bits of one
03:24 airlied: and don't forget https://github.com/microsoft/graphics-driver-samples
10:15 javierm: danvet, tzimmermann: about https://lists.freedesktop.org/archives/dri-devel/2023-March/396405.html, I remember having the same discussion a few months ago and IIRC there were even patches posted?
10:17 tzimmermann: javierm, that issue was reported, but I don't remember any patches. last time this was related to nvidia.ko, so priority was low. meanwhile nvidia had some patches for a console somewhere
10:18 javierm: tzimmermann: ah, I'm misremembering then
10:42 Venemo: eric_engestrom: I added a guess-based comment here, can you please take a look and see if I got it right or wrong? https://gitlab.freedesktop.org/mesa/mesa/-/issues/5772
10:51 eric_engestrom: Venemo: I didn't know about the PLT until I read that issue, but your guess of that "lazy binding" being related to the weak symbols that we use is reasonable
10:52 Venemo: I spent some time googling but I couldn't find a satisfactory explanation anywhere
11:26 javierm: tzimmermann: so you were right, judging from the answer: it happens with the nvidia driver, and the problem is that it relies on efifb/simpledrm
11:28 tzimmermann: javierm, you told me what happened there: somehow one instance of the driver killed the efifb that was used by the other instance IIRC
11:29 javierm: tzimmermann: yeah, I remember now
11:53 kusma: Sigh. Seems we're unable to merge MRs touching the root gitlab-ci.yml file now without timing out: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21953#note_1830512
12:00 javierm: tzimmermann: it seems I wasn't misremembering after all, danvet posted https://patchwork.kernel.org/project/dri-devel/list/?series=711019&archive=both
12:02 tzimmermann: javierm, oh, ok
12:03 tzimmermann: i remember now. several of the cleanup patches had regressions, so the patchset didn't make it yet
12:04 javierm: tzimmermann: yeah, and since it's nvidia's fault for not setting up their own emulated fbdev, it was considered not worth the effort to keep pushing that series
12:13 javierm: tzimmermann: answered on the list. If someone wants to fix this they're free to take over danvet's effort https://patchwork.kernel.org/project/dri-devel/patch/20230111154112.90575-11-daniel.vetter@ffwll.ch/
12:14 javierm: but since only the nvidia proprietary driver is affected I would not call it an upstream bug
12:14 tzimmermann: javierm, thanks. maybe danvet's patches can be reduced a bit to be applicable
12:15 javierm: tzimmermann: I needed to dig through my mail archive and irc logs to remember all this
12:15 javierm: I'll probably forget everything again in about a day or two :)
12:18 javierm: tzimmermann: there are some that I think just fell through the cracks, like https://patchwork.kernel.org/project/dri-devel/patch/20230111154112.90575-1-daniel.vetter@ffwll.ch/, which you said worked
12:18 DemiMarie: airlied: thanks for the link, it looks like it might be somewhat usable for Qubes.
12:21 DemiMarie: jenatali: so the context is that Qubes OS wants to be able to get at the contents of each window, but that requires writing a full WDDM driver (as opposed to a display-only driver) and so far the cost of that has been prohibitive.
14:13 eric_engestrom: kusma, DavidHeidelberg[m]: for the MR splitting the docs out of the main .gitlab-ci.yml, should we just re-assign it to marge, or is there something else that should be done first?
14:20 eric_engestrom: ^ my bad, I missed that it's already re-assigned
14:49 DavidHeidelberg[m]: eric_engestrom np, it was passing but one gitlab job didn't get to the runners..
15:28 jenatali: Demi: I'm so confused as to what Qubes has to do with Windows/WDDM
16:00 DemiMarie: jenatali: Qubes has support for Windows, but the current Windows tools don’t have good GUI integration.
16:01 jenatali: Oh I see as a guest OS
16:22 lina: I wrote a blog about explicit sync, if anyone's interested ^^ https://asahilinux.org/2023/03/road-to-vulkan/
16:25 jenatali: lina: "In Vulkan, there is no explicit synchronization of buffers." I think you meant implicit?
16:28 mivanchev: Mesa folks! Just stopping by to self-promote my newest addition to https://github.com/MIvanchev/static-wine32: a fully static Vulkan loader and Mesa Vulkan drivers :D
16:28 mivanchev: It was a very nice experience making it happen, and LTO gives us more than just confidence
16:28 lina: jenatali: Whoops, thanks! Fixed (once CI runs)!
16:31 jenatali: Good read. I'm super interested in this space because WDDM is all explicit sync. Except we don't have a way of tying together a buffer and fence, the fence needs to be marshaled separately from the buffer :(
17:26 jenatali: Weee apps ignoring reported device limits
17:37 DavidHeidelberg[m]: mivanchev: nice, btw. we originally kept patches against wine for gallium-nine; there's a chance these days we could integrate nine into the d3d9 layer again (seems like the wine people are willing to talk about it)
18:55 ngcortes: NB: Intel Mesa CI will be down momentarily for an automation update. We'll let everyone know when it's back online.
19:12 DemiMarie: jenatali: yeah, right now the support for Windows guests is not very good
19:15 jenatali: Anybody (gfxstrand) have opinions on doing a native surface -> vk surface map in the vk instance?
19:16 jenatali: Looks like some apps like to create/leak surface objects for a single native surface, and we can't really support multiple vk surface objects (that actually get used for swapchains) bound to the same native surface
20:49 gfxstrand: robclark: Ok, I think I may be missing something.
20:49 gfxstrand: robclark: What's the behavior when no deadline is set? Does no deadline mean ASAP or "whenever you get to it"?
20:50 robclark: no deadline set means "same as current behavior, ie. the driver that created the fence doesn't get any deadline hint"
20:50 gfxstrand: Ok...
20:50 robclark: for i915, that would mean no rps boost... which is current behavior
20:50 gfxstrand: Right
20:51 robclark: basically it was a way to opt in
20:51 robclark: and.. if you are doing some "housekeeping wait" in userspace, you don't want to use deadline
20:51 robclark: (like waiting to clean up after some gpu work)
20:54 gfxstrand: Okay
20:58 gfxstrand: So it's not the same as an immediate deadline or an infinite deadline?
20:59 gfxstrand: It's a secret third thing?
21:06 robclark: gfxstrand: yeah, I guess it would amount to a secret third thing.. and actually kinda worse, because I suppose a lame driver implementation could ignore the deadline and boost just because there is a deadline
21:07 robclark: (like my implementation of it for drm/i915... although I expect that patch to be replaced by someone who actually knows their way around i915)
21:10 gfxstrand: robclark: And we want vkCmdWaitForFences etc. to have a deadline of ASAP because userspace is waiting? I guess that makes sense.
21:11 robclark: I think so.. for same reason glFinish() did back in the day :-P
21:12 robclark: could make sense to have extension to expose that decision to the app
21:12 robclark: but first step is get kernel part in place ;-)
21:22 gfxstrand: robclark: So, what userspace is exercising the part where we have interesting deadlines?
21:22 gfxstrand: robclark: Or is that happening through KMS automatically?
21:23 robclark: there is some automatic kms deadline setting happening.. assuming compositor isn't sophisticated enough to be waiting for fence to signal in userspace (in which case compositor would need to use the SET_DEADLINE ioctl)
21:24 robclark: inside a vk or gl driver, we really only have choice between "gimme ASAP" and "no deadline"
21:24 robclark: tho it could be interesting to expose that via some sort of extension
21:26 gfxstrand: Yeah
21:27 robclark: fwiw, there is some intel gitlab issue about this, which I'm completely failing to find atm.. basically clvk was slower than native cl just because it was doing vkCmdWaitForFences() but native cl was doing I915_GEM_WAIT
21:27 robclark: the latter boosts freq, the former does not (before my MR+patchset)
21:29 gfxstrand: Right...
21:30 robclark: https://gitlab.freedesktop.org/drm/intel/-/issues/8014
21:32 gfxstrand: Yeah
21:34 robclark: technically for that we don't need deadline.. just a pls make go fast flag... but deadlines are useful when you have things like vsync.. and I figured we could kill two stones with one bird.. or something along those lines :-P
21:35 gfxstrand: Yeah
21:35 gfxstrand: It certainly seems better than some of the other options
21:36 gfxstrand: It may also help v3dv. Have you talked to those folks?
21:37 gfxstrand: When I reworked the v3dv sync code, I fixed a bug in their WaitForFences implementation and now they need some sort of boost.
21:38 robclark: hmm, I have not talked to them specifically.. I assume someone follows dri-devel?
21:39 clever: ive been digging thru radeontop, and ran into GRBM_STATUS; as best as i can tell, it's a bitfield representing which hw blocks are in use?
21:39 clever: and to get a usage%, your only option is to poll it at a relatively high rate, and measure it yourself?
21:39 gfxstrand: robclark: probably. IDK if they're paying attention to this particular thing.
21:41 robclark: hmm, I'm not even seeing devfreq or anything like that on kernel side for v3d..
21:43 clever: robclark: v3d runs off the CM_V3DDIV/CM_V3DCTL divider
21:43 clever: drivers/clk/bcm/clk-bcm2835.c:#define CM_V3DDIV 0x03c
21:43 clever: it is mentioned in the clock driver, but not actually used, in the kernel version i checked
21:44 clever: ah, // CLOCK_V3D is used for v3d clock. Controlled by firmware, see clk-raspberrypi.c.
21:44 clever: RPI_FIRMWARE_V3D_CLK_ID then, on a different DT device
21:45 robclark: hmm, ok
21:46 clever: the 2711 DT says v3d: v3d@7ec04000 { clocks = <&firmware_clocks 5>; }
21:46 clever: and then linux should do the rest?
21:46 clever: vc4 DT i think lacks that attribute
21:47 clever: and i'm not sure where the freq limits are defined, linux might only use this for on/off control?
21:47 robclark: I'm just trying to see how boost/deadline hints would fit in with v3d... but I guess that would require the CPU to be directing the fw somewhere..
21:48 robclark: (for a6xx we aren't really controlling the freq directly, but just telling the fw what to do)
21:48 gfxstrand: Yeah
21:48 gfxstrand: In this particular case, they were spinning in userspace or waking up every so often and it was causing things to CPU boost.
21:48 gfxstrand: It was all very indirect but it led to a perf regression.
21:49 robclark: _ahh_.. good 'ol spacebar heater ;-)
21:49 gfxstrand: Yup
21:49 gfxstrand: At least with deadlines they maybe have a "correct" way to do it.
21:49 clever: robclark: the v3d is fairly dumb, and i would say it lacks firmware, you just generate a control list that can render a frame, and then the hw just steps thru the bytecode and runs shaders as directed
21:50 clever: robclark: the smartest thing ive found in that area, is that you can put the control list into a ringbuffer, and if the hw hits the end (your write ptr) it will PAUSE, and can automatically resume when you append (update the end/write ptr)
21:51 clever: so you can just blindly append jobs to the ring, and not have to deal with race conditions
21:51 robclark: but if fw is controlling the freq, I guess fw is doing some sort of utilization-based governor?
21:51 clever: yeah, when using the official firmware, the VPU (a dual core cpu) manages the v3d clock, v3d power, and many other things
21:52 clever: but thats mostly been down-graded purely to a throttling manager
21:52 clever: in the event of overheating or undervoltage, it will reduce various clocks, to prevent a crash
21:52 robclark: so, hurry-up-and-wait unless thermal/etc
21:53 clever: in the old days, the entire opengl stack ran on the VPU, and the linux side was just an RPC shim
21:53 clever: but mesa has since taken over that job, and linux is writing the control regs in v3d directly
22:04 robclark: mareko: I guess PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION and PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE don't really imply any more than what is already needed for TC?
22:08 clever: robclark: another complication, is that CM_V3DDIV is just dividing down from one of the PLL taps, *looks*
22:09 clever: robclark: https://elinux.org/The_Undocumented_Pi#Clocks
22:10 clever: V3D is in the core muxes group, and can pull from plla, pllc, plld or pllh
22:10 clever: but it can only get something that is an integer fraction of those, i think
22:14 robclark: As long as there is more than one possible freq, there is a choice to make ;-)
22:16 clever: yep
22:16 clever: some of these clock blocks support fractional division
22:16 clever: where it basically just alternates between /5 and /6
22:16 clever: so the average clock is somewhere between the 2
22:17 clever: but that means some of the clocks can be as short as /5, and you still need to keep things within the rise-time and propagation limits
23:05 DavidHeidelberg[m]: NOTE NOTE: If your pipeline gets stuck on a specific job, from time to time GitLab doesn't send the job to the runner. HOW TO: If you notice this type of stall in the Marge queue, just re-trigger any finished job on the same runner group; GitLab then flushes the stuck job and sends it to the runners !!! <-
23:09 zmike: robclark: I think those are glthread, aren't they?
23:10 zmike: for the subdata optimization
23:44 binhani: robclark: last Saturday I was asking about the adriconf enhancements project listed on GSoC and you told me that the Wiki needed updating in general. With that said, can you help me get on track with up-to-date projects for this GSoC?