IRC Logs of #dri-devel on irc.freenode.net for 2023-12-01

00:00 Kayden: yeah, I think that would break compatibility with super legacy drivers that never transitioned
00:00 Kayden: all of mesa switched over long ago and should be fine
00:01 illwieckz: For the Unvanquished game we noticed GLVND ABI broke compatibility with legacy Nvidia drivers and current AMD oglp drivers (it's the proprietary OpenGL driver in amdgpu-pro, though discouraged, it is still shipped today)
00:02 illwieckz: but it looks like the GLVND ABI also breaks Wayland with Prime with current Mesa drivers: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9282#note_2188755
00:03 illwieckz: so, I assume the good thing to know for a developer is: don't use GLVND ABI unless you know why you need it
00:04 illwieckz: Maybe we should report to CMake that making the GLVND ABI the default is probably a mistake.
00:04 Kayden: interesting, hadn't heard of a bug there, but sounds believable. looks like pepp has a fix in progress in that issue
00:05 illwieckz: that patch was not enough for some user though
00:06 illwieckz: anyway, if glvnd allows installing multiple GL libraries, I guess we can select a specific library at runtime, a bit like OpenCL or Vulkan ICD? How does that work? (I mean what can I do to test it)
00:07 illwieckz: I currently have a setup with Mesa radeonsi and amdgpupro oglp installed so I can test.
00:08 Kayden: for EGL, there's a json ICD enumeration system sort of like vulkan: https://github.com/NVIDIA/libglvnd/blob/master/src/EGL/icd_enumeration.md
00:08 Kayden: for GLX - it gets the driver from your X server. so whatever X is running on, so will clients
00:09 Kayden: or should
00:09 illwieckz: okok
00:09 Kayden: have to head out now, sorry, but good luck! hope that was a bit of help at least
00:10 illwieckz: thx
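
A minimal sketch of the EGL side of Kayden's answer, assuming the JSON layout and the `__EGL_VENDOR_LIBRARY_FILENAMES` variable described in the linked icd_enumeration.md. The helper names, the file name `99_example.json` and the library name used in any call are hypothetical. GLX is not covered, since there the vendor is taken from the X server.

```python
import json
import os


def write_egl_vendor_json(directory, library_path, name="99_example.json"):
    """Write a libglvnd EGL vendor file pointing at one vendor library.

    The file name and the library path are illustrative placeholders.
    """
    path = os.path.join(directory, name)
    icd = {"file_format_version": "1.0.0", "ICD": {"library_path": library_path}}
    with open(path, "w") as f:
        json.dump(icd, f, indent=4)
    return path


def egl_vendor_env(json_paths, base_env=None):
    """Return a copy of the environment that restricts libglvnd's EGL
    vendor lookup to the given JSON files (colon-separated list)."""
    env = dict(os.environ if base_env is None else base_env)
    env["__EGL_VENDOR_LIBRARY_FILENAMES"] = ":".join(json_paths)
    return env
```

To test a specific driver, one could start an EGL client with `subprocess.run(cmd, env=egl_vendor_env([path]))`, where `path` is whichever vendor JSON file the distribution installed for that driver.
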
09:32 tzimmermann: javierm, thanks for reviewing. there's still screen_info code at the end of efifb.c.
09:33 tzimmermann: it tries to find the framebuffer's device on the pci bus, so that efifb can avoid power management
09:33 tzimmermann: on the framebuffer
09:34 tzimmermann: as a quick fix, maybe we can move that code into sysfb.c and use the pci device as parent for the platform device ?
09:51 javierm: tzimmermann: you mean efifb_fixup_resources() ? Yes, that makes sense. Is it something that simpledrm would need, or not because it doesn't do PM ?
09:52 javierm: there's already sysfb_apply_efi_quirks() in drivers/firmware/efi/sysfb_efi.c so makes sense to have other fixups in sysfb
09:53 tzimmermann: javierm, our platform devices don't support PM. that should mean that if we set the pci dev as parent, the parent won't do PM as well
09:53 tzimmermann: maybe that's a bug with the current simpledrm
09:54 tzimmermann: i have to verify that DECLARE_PCI_FIXUP_CLASS_HEADER runs at the right time
09:54 tzimmermann: after screen_info setup, but before sysb
09:54 tzimmermann: sysfb
09:58 javierm: tzimmermann: that's what I thought. Hence my comment that efifb_fixup_resources() would also be beneficial for simpledrm
09:58 javierm: in other words, it makes sense to move that logic to sysfb_efi
09:59 javierm: tzimmermann: btw, did you ever post the patch to move sysfb to a different initcall level ?
09:59 javierm: I remember we discussed something about that
10:00 tzimmermann: BTW, i tried to figure out how to limit screen_info exposure within the graphics code. i don't quite know where to start. so i only have few more patches to get screen_info removed from regular device drivers
10:03 javierm: tzimmermann: yes, I like to limit the exposure to that global variable
10:06 javierm: tzimmermann: if I'm reading the code correctly the pci fixups happen at fs_initcall_sync()
10:06 tzimmermann: urg, that's too late
10:07 tzimmermann: maybe we should move sysfb back to device_initcall
10:07 javierm: tzimmermann: yeah... I think we discussed moving sysfb_init() to device_initcall_sync(), which is what we agreed IIRC
10:07 javierm: let me look at my irc logs
10:07 tzimmermann: i don't remember TBH
10:09 tzimmermann: but the drivers use module_init(), which resolves to device_initcall() IIRC
10:09 javierm: 10:24 < javierm> | tzimmermann: can you mention in the thread and propose a patch? And also update the comment that should not only happen after PCI but after all the devices have been registered
10:09 javierm: 10:25 < tzimmermann>| well, it's just an idea for now. i'd like them to first debug this further
10:09 tzimmermann: so we won't get any output sooner than that
10:09 javierm: 10:25 < javierm> | tzimmermann: Ok
10:09 javierm: 10:26 < javierm> | tzimmermann: regardless, I think that you are correct that should happen after device_initcall()
10:10 tzimmermann: indeed
10:10 tzimmermann: and for now, we can sync with native drivers via sysfb_disable() as a workaround
10:11 javierm: tzimmermann: yeah
10:11 javierm: which reminds me that I need to answer to that OF/EFI thread. I'll do that now
10:11 tzimmermann: thanks
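
For reference, a small sketch of the initcall ordering this discussion relies on. The level list follows include/linux/init.h, and module_init() in built-in code maps to device_initcall(). The current placement of sysfb_init() at subsys_initcall_sync() is an assumption about the kernel tree at the time, and the PCI fixup level is taken from javierm's reading above. `runs_before` is a hypothetical helper.

```python
# Built-in initcall levels, in the order the kernel runs them at boot.
INITCALL_LEVELS = [
    "pure_initcall",
    "core_initcall", "core_initcall_sync",
    "postcore_initcall", "postcore_initcall_sync",
    "arch_initcall", "arch_initcall_sync",
    "subsys_initcall", "subsys_initcall_sync",
    "fs_initcall", "fs_initcall_sync",
    "rootfs_initcall",
    "device_initcall", "device_initcall_sync",
    "late_initcall", "late_initcall_sync",
]


def runs_before(a, b):
    """True if initcall level `a` runs strictly before level `b`."""
    return INITCALL_LEVELS.index(a) < INITCALL_LEVELS.index(b)


# Assumed current sysfb level vs. the fixup level from the discussion.
SYSFB_LEVEL_ASSUMED = "subsys_initcall_sync"
PCI_FIXUP_LEVEL = "fs_initcall_sync"
```

Under these assumptions `runs_before(SYSFB_LEVEL_ASSUMED, PCI_FIXUP_LEVEL)` is true, which is the "too late" problem, while `runs_before(PCI_FIXUP_LEVEL, "device_initcall")` is also true, so moving sysfb to device_initcall() or device_initcall_sync() would place it after the fixups.
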
10:14 pq: emersion, I almost asked if a driver needs to pull in a CRTC into an atomic commit even when userspace made absolutely no reference to it, will that also trigger a VRR scanout cycle on the unexpected CRTC. :-p
10:14 emersion: pq, yes, this can happen!
10:14 pq: yeah...
10:14 emersion: i've seen several of these in the wild
10:14 pq: but was that gated by needing ALLOW_MODESET?
10:15 emersion: hm, i'm not sure…
10:15 pq: I decided not to mention it, because things were getting hairy enough already
10:34 MrCooper: pq: my understanding is that pulling in additional CRTCs must happen only with ALLOW_MODESET
10:39 emersion: hm i remember something about a patch adding WARN_ON w/ affected_crtc checks…
10:39 emersion: not sure it got merged
13:54 sima: jani, I guess I should forward your pr?
13:54 jani: sima: up to you, otherwise it'll be next week
15:46 vsyrjala: jani: sorry about that. i did realize the dependency when i tagged the patch, but figured i'd come up with some sane way to highlight it before pushing. but then promptly forgot the whole thing and pushed anyway
16:21 pq: Apparently KMS does not work at all how I imagined.
16:21 emersion: note that this is ville's view
16:21 emersion: other drivers behave differently
16:22 emersion: pq: also note that drivers don't get the "list of things that changed"
16:22 emersion: they only get "this object changed, here are all of the properties for the new state"
16:23 emersion: so it doesn't make a difference to optimize the diff'ing per-property
16:23 emersion: if any object's property changed, then it doesn't make a difference to also include all other properties for that object
17:02 kisak: Safeties off ... fingers crossed
17:09 kisak: karolherbst: I need to revert https://gitlab.freedesktop.org/mesa/mesa/-/commit/1cc26e8b6657b5097995470ced9ae9cc7b6f01b9 if you want rusticl on any of my PPA builds. Just to check, this was put in to fail early on a build time failure, right? If I can get to build, it's fine to not have that safety?
17:10 kisak: from memory, I think it's fine because the trouble around that started with llvm 16 and I'm stuck on llvm 15 for the time being.
17:10 karolherbst: yeah.. it's probably fine... I just know that there was a bindgen bug some hit
17:11 karolherbst: or something
17:11 kisak: okay, thanks
17:13 kisak: If the libtcmalloc drama around llvm 16+ gets fixed, I'll probably need to prioritize radeonsi/llvm over rusticl/opencl, but that's a future me problem
17:16 kisak: tjaalton: the rust / bindgen packaging is extremely painful. Good luck with that. If you happen to figure out a procedure to take care of that in the course of your routine maintenance, I'd appreciate a copy of your notes, but I don't expect anything. I'm not going to do that heavy lifting on my own.
17:16 kisak: ^take care of backporting for mesa updates
17:23 tjaalton: kisak: I'm not backporting rusticl to 22.04
17:23 tjaalton: because of that
17:24 kisak: the applies to all current ubuntu releases, not just 22.04
17:24 kisak: *the note
17:24 tjaalton: well, 22.04 is the only one I backport to..
17:24 kisak: oh, okay
17:24 tjaalton: because there are others doing that ;)
17:28 sima: pq, emersion I think it's roughly the same in practice: there's a few things that have documented side effects (like a crtc generates an event even if it's otherwise a no-op and a plane generates a full damage even if it's otherwise a no-op)
17:29 sima: but drivers do not add random state updates that have an observable impact, like just always change the full resource/scaler assignment, which on some hw would mean more modesets than you've asked for
17:29 sima: also there's some stuff like if you add a plane/connector, that pulls in any crtc it's connected to
17:30 sima: pq, emersion I guess it could clarify things if you do s/try to avoid no-operation changes/try to avoid _expensive_ no-operation changes/
17:31 sima: since just always blasting the usual registers really doesn't matter, but unconditionally uploading the ctm/gamma tables does tend to matter
17:31 emersion: but drivers check whether the blob ID changed before updating the LUTs?
17:41 sima: emersion, yeah they should
18:08 ptrc: hey, did anyone here try testing the new Imagination DRM driver with a GX6250 GPU? i'm trying to do just that on an MT8173, and in the initial patch they say they tested it on an Acer Chromebook R13, but i can't find the right GPU device tree node for it
18:13 vsyrjala: luts are a special case. for those we have one of these ad-hoc _changed booleans
18:15 vsyrjala: though that's not super optimal either since it covers both luts and ctm, and on i915 changing the ctm is generally fairly cheap vs. changing luts is expensive. so currently if you change the ctm we also reprogram the luts which is not the best thing to do
18:18 vsyrjala: i suppose we could stop blindly trusting that flag and instead check if the blobs changed or not. but we'd also have to make sure we don't need to change the format in which they are being fed to the hardware
18:20 zamundaaa[m]: vsyrjala: how expensive is programming luts in general? As in, how long does it take?
18:24 vsyrjala: some dozens of microseconds. but we also have to do it during the vblank, so we spin up a high priority thread in an effort to make it in time, and we have to try to make sure the cpu is fully awake to handle the interrupt asap to schedule the thread. otherwise you are pretty much guaranteed to spill over into the next frame
18:24 vsyrjala: even that isn't 100% guaranteed to work unfortunately
18:26 vsyrjala: anyways, i think psr/etc. is where the main problems are due to the implied damage requiring us to wake up the hardware. otherwise we can let it sleep and save a ton of power
18:27 vsyrjala: having to invalidate the compressed buffer for fbc is also not very optimal, but relatively not as expensive as waking from psr
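
A toy model of the idea vsyrjala floats above: instead of one flag that covers both LUTs and CTM, compare the blob IDs in the old and new state and decide per block. This is not the DRM or i915 API. `CrtcColorState` and `color_updates` are hypothetical names, and the sketch ignores the caveat that the hardware LUT format may also need to change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrtcColorState:
    """Hypothetical slice of a CRTC state: only the color blob IDs."""
    gamma_lut_blob_id: Optional[int]
    ctm_blob_id: Optional[int]


def color_updates(old, new):
    """Decide per block what needs reprogramming by comparing blob IDs.

    The driver sees the full old and new state, not a list of changed
    properties, so the comparison has to happen here.
    """
    return {
        "luts": old.gamma_lut_blob_id != new.gamma_lut_blob_id,  # expensive path
        "ctm": old.ctm_blob_id != new.ctm_blob_id,               # cheaper path
    }
```

With this split, a commit that only swaps the CTM blob would report `{"luts": False, "ctm": True}` and could skip the LUT upload during vblank.
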
19:03 gfxstrand: daniels: Something wrong with the Windows runners?
19:03 gfxstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26407
19:41 daniels: gfxstrand: yeah, alatiera has been rebuilding one of them (the one which was faulty last week)
21:39 JoshuaAshton: Hey, has anyone been using epoll + dmabufs?
21:40 JoshuaAshton: I found that I can NULL pointer BUG a kernel if you do epoll + wait on a dmabuf + close while in epoll_wait
21:40 JoshuaAshton: :l
21:41 JoshuaAshton: I can work around it by doing epoll_ctl + EPOLL_CTL_DEL before `close`, but this is a pretty bad bug. It brings the whole system down :S
21:46 JoshuaAshton: https://lists.freedesktop.org/archives/dri-devel/2023-December/433308.html
21:46 JoshuaAshton: sent email to ML
21:48 gfxstrand: Uh... That's a bug and a pretty bad one at that.
21:49 JoshuaAshton: Yeah... Happy to help debug stuff there. I peeked at the code and my IDE said "last edited 12 years ago" so now I'm more scared :P
21:50 gfxstrand: A repro IGT test would probably help
21:50 gfxstrand: But if I had to take a stab, I'd say it was a reference counting bug or something where we're supposed to be removing ourselves from the epoll but aren't.
21:51 JoshuaAshton: I'll look at doing that
21:53 gfxstrand: Yeah, turning it into a dma-buf FD and passing that into epoll should hold a reference at least until it signals. Someone is dropping a reference early.
21:54 gfxstrand: That or it's supposed to auto-delete on close. I'm not sure exactly what epoll's semantics are there.
21:55 JoshuaAshton: It's supposed to auto-delete from the epoll on fd close
21:56 JoshuaAshton: I'm guessing the order in which something is happening there is wrong
21:56 gfxstrand: Yeah, then someone hasn't hooked that up for dma-buf properly
21:56 JoshuaAshton: such a shame, epoll is such a great mechanism for buffer latching
21:56 gfxstrand: Which is believable because dma-bufs are weird
22:02 DPA: My man pages say it's to be removed on fd close only if there are no more references to the file description of the file descriptor.
22:03 JoshuaAshton: FWIW this is also a dup'ed fence
22:03 JoshuaAshton: so maybe there's even more shenanigans there :-P
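
A minimal sketch of the two epoll points raised above, using a pipe as a stand-in for a dma-buf fd (an assumption, it does not reproduce the kernel bug). `close_watched_fd` shows the workaround order, EPOLL_CTL_DEL before close. `registration_survives_close_of_dup` illustrates DPA's point that the registration is only dropped once no file descriptor refers to the open file description any more. Both function names are hypothetical and the code is Linux-only.

```python
import os
import select


def close_watched_fd(ep, fd):
    """Workaround order: drop fd from the interest list, then close it."""
    ep.unregister(fd)  # issues EPOLL_CTL_DEL
    os.close(fd)


def registration_survives_close_of_dup():
    """Return True if an event is still reported for a registered fd that
    was closed while a dup() of it keeps the file description alive."""
    ep = select.epoll()
    r, w = os.pipe()  # stand-in for a dma-buf fd
    try:
        ep.register(r, select.EPOLLIN)
        dup_r = os.dup(r)   # second fd, same open file description
        os.close(r)         # description is still referenced by dup_r
        os.write(w, b"x")   # make the read end readable
        events = ep.poll(0)
        os.close(dup_r)     # last reference gone, registration is dropped
        return any(fd == r for fd, _mask in events)
    finally:
        os.close(w)
        ep.close()
```

On a dup'ed fd, as in the case mentioned at 22:03, closing one of the descriptors would therefore not be expected to remove the epoll registration on its own, which is one more reason to delete it explicitly first.
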