09:09 sima: just rolled drm-fixes to -rc1
09:09 sima: mripard, airlied ^^
09:13 mripard: sima: awesome, thanks
09:16 airlied: sima: did I not push that out? oops
09:16 airlied: had a disk fail in the raid1 I do builds on, just got some new disks and have been rsyncing
09:17 chaos_princess: is this the correct channel for gpu native context and virtio stuff? if so - i am working on a vmm which supports different host and guest page sizes, and need the guest to align its gpu allocations to the host page size. i thought of just aligning to the maximum page size supported by the architecture, but ran into concerns that it could bloat memory usage in the case of many small
09:17 chaos_princess: bos. the suggestion was to expose the host page size via an "extended virtio protocol". i wasn't able to figure out what specifically that is, and while i could introduce a vmm-specific side channel, that sounds wrong. how exactly should that info be communicated, and does it mean i would need to talk to whoever maintains the virtio protocol to standardize all of that?
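(For the alignment part, a minimal sketch of the round-up; how the host page size reaches the guest driver is exactly the open question above, so `host_page_size` and the helper name are assumptions:)

```c
#include <stdint.h>

/* Round a guest BO allocation up to the host page size. host_page_size is
 * a value the VMM would have to expose to the guest somehow (capset,
 * protocol extension, ...) and is assumed to be a power of two. */
static inline uint64_t
align_bo_size(uint64_t size, uint64_t host_page_size)
{
        return (size + host_page_size - 1) & ~(host_page_size - 1);
}
```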
09:18 mripard: airlied: sure, let's blame RAID :)
09:18 airlied: the disk has been gone for weeks, I just blame the new disks arriving distracting me from doing whatever I was doing :-P
09:52 karolherbst: mhh.. maybe I hate the "I'd need 100 different variants for my copy kernels" aspect of my plan enough to go straight to resource_copy_region and just pipe_cap it....
12:31 lumag: mlankhorst, mripard, tzimmermann: could you possibly merge back drm-misc-fixes into drm-misc-next (or just pick up current -rc1 into it)? It would be nice to get all the fixes for drm/display/* into the drm-misc-next branch
12:48 mripard: lumag: done
12:48 lumag: mripard, thanks!
12:49 mlankhorst: ah just got beat to it :)
13:04 tzimmermann: mripard, thanks for taking care of it
13:08 tzimmermann: mripard, mlankhorst, i'll take over drm-misc-next this cycle. there are still 2 patches left in drm-misc-next-fixes. i'll send them today, and drm-misc-next within the next few days. ok?
13:12 mripard: tzimmermann: I sent them as part of the drm-misc-fixes PR today
13:13 tzimmermann: mripard, great. thanks a lot!
13:20 mlankhorst: tzimmermann: can't remember if it's me this cycle
13:21 mlankhorst: but I think I have a break until next -next
13:21 tzimmermann: mlankhorst, it's my turn with drm-misc-next
13:21 mlankhorst: ah go right ahead, it's all yours. I think we need to disable pushing to next-fixes again
13:23 mlankhorst: seems to be already protected :)
13:24 tzimmermann: looks like it
13:24 mripard: yeah, I did it this morning as well :)
13:24 tzimmermann: thanks mripard
15:06 bbrezillon: sima: don't know if you're the right person to ask, but I'm trying to make sense of drm_syncobj_array_wait_timeout(). One thing in particular struck me: we're putting the task in an interruptible state, waiting for a dma_fence callback to wake us, but we're only registering our fence callback after going to sleep
15:06 bbrezillon: https://elixir.bootlin.com/linux/v6.13.1/source/drivers/gpu/drm/drm_syncobj.c#L1131
15:13 sima: bbrezillon, that's standard open-coded linux wait loop
15:14 sima: first you need to set your task state to sleeping, then you must re-check the wait condition, then you call schedule()
15:14 sima: if you do this in a different order, you might miss a wakeup and be stuck forever, which isn't great
15:15 heat: note that you only go to sleep in schedule() (or schedule_timeout()), despite your state being INTERRUPTIBLE
15:15 sima: the wake-up has the inverse order: 1. set the wait condition 2. call wake_up(), which sets your task back to TASK_RUNNING, so that if the wakeup happens right between steps 2 and 3 on the sleep side (3 being the call to schedule()) you don't miss it
15:15 sima: yeah schedule() is just a call into the scheduler to figure out whether anything changed, it's not a "suspend me now" call
15:15 heat: a few years back i was really confused about how set_current_state never has a problem with a timer irq arriving between set_current_state and schedule, and that's because there are safeguards for it: you can't get preempted in the middle of those
15:18 sima: yeah this predates preemption
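(A rough sketch of the ordering sima describes; `cond` is a placeholder condition and this is the generic pattern, not the syncobj code:)

```c
static DECLARE_WAIT_QUEUE_HEAD(wq);    /* the waitqueue, lives in a data structure */
DECLARE_WAITQUEUE(wait, current);      /* the on-stack waiter */

/* sleep side: 1. set the task state  2. re-check the condition  3. schedule() */
add_wait_queue(&wq, &wait);
for (;;) {
        set_current_state(TASK_INTERRUPTIBLE);
        if (cond)                      /* re-check after setting the state */
                break;
        schedule();                    /* only here do we actually sleep */
}
__set_current_state(TASK_RUNNING);
remove_wait_queue(&wq, &wait);

/* wake-up side: 1. make the condition true  2. wake_up() */
cond = true;
wake_up(&wq);
```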
15:21 bbrezillon: except that's not what I'm seeing here
15:22 bbrezillon: my thread enters sleep before registering the fence cb
15:22 bbrezillon: that is, when set_current_state(TASK_INTERRUPTIBLE) is called
15:23 heat: set_current_state(TASK_INTERRUPTIBLE) does not enter sleep
15:24 bbrezillon: okay, I need to check my tracing then
15:24 heat: set_current_state(TASK_INTERRUPTIBLE) expands to, roughly, smp_store_mb(current->__state, TASK_INTERRUPTIBLE);
15:25 sima: I think I've stumbled over a hilarious race on the wake-up side before, and there's a very special function for that, but I don't think it applies here
15:26 sima: bbrezillon, ok I think there might be a bug here
15:26 sima: you need to install the callback/put yourself onto the waitqueue before calling set_current_state
15:27 sima: otherwise again races
15:27 bbrezillon: yeah, that looks like what I'm seeing
15:28 sima: yeah so that order looks wrong
15:28 sima: it must be dma_fence_add_callback(); /* sufficient memory barrier, set_current_state() is easiest */; dma_fence_is_signaled();
15:29 sima: except we want a fast path that just checks once first (due to the lazy signalling semantics of fences) before we do the expensive dance of adding callbacks
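(For a single fence the corrected ordering would look roughly like this; a sketch only, the struct and callback names are made up and this is not the actual drm_syncobj code:)

```c
struct wait_cb {
        struct dma_fence_cb base;
        struct task_struct *task;
};

static void wake_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
{
        struct wait_cb *wait = container_of(cb, struct wait_cb, base);

        wake_up_process(wait->task);
}

/* ... in the wait path, for one fence: */
struct wait_cb cb = { .task = current };

/* fast path: check once before paying for the callback */
if (dma_fence_is_signaled(fence))
        return timeout;

/* install the callback *before* touching the task state */
if (dma_fence_add_callback(fence, &cb.base, wake_cb))
        return timeout;                /* signaled while we were looking */

while (timeout) {
        set_current_state(TASK_INTERRUPTIBLE);   /* doubles as the barrier */
        if (dma_fence_is_signaled(fence))        /* re-check after the state */
                break;
        if (signal_pending(current)) {
                timeout = -ERESTARTSYS;
                break;
        }
        timeout = schedule_timeout(timeout);
}
__set_current_state(TASK_RUNNING);
dma_fence_remove_callback(fence, &cb.base);
return timeout;
```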
15:29 bbrezillon: I guess we're better off using wait_event_interruptible_timeout() with a waitqueue we attach to fence slots then
15:30 sima: bbrezillon, same issue, just one indirection more
15:30 sima: that's not going to help I think
15:31 sima: and we still have the issue of having to listen to multiple wait queues
15:31 bbrezillon: hm, not sure I follow
15:31 bbrezillon: if the waitqueue is on the stack, you just wait on this one?
15:32 bbrezillon: but if you want to do a dummy run to avoid registering the callbacks upfront
15:32 sima: yeah so there's only one, but you still have to deal with the complexity of multiple fences and having to check them and add callbacks as needed
15:33 bbrezillon: I think the "are there any remaining unsignaled fences" check should be moved to a sub-function
15:33 sima: https://elixir.bootlin.com/linux/v6.13.1/source/drivers/gpu/drm/drm_syncobj.c#L1139 I'd untangle this monster of an if into clean control flow, and then just add another set_current_state() + recheck every time you add a new fence cb
15:33 sima: that's probably simplest
15:33 sima: yeah that maybe too, this is a bit too messy
15:33 bbrezillon: which serves both as the preliminary check and as the condition for the wait_event_interruptible_timeout()
15:34 sima: I really don't see how wait_event helps here
15:34 sima: because if your check function adds callbacks you again have the same order issue, except now it's hidden even more
15:34 bbrezillon: it helps in that you have the condition tested as part of the wait (before going to sleep)
15:34 bbrezillon: no, the check function wouldn't add the callbacks
15:35 sima: hm right, fences don't get unsignalled
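(The wait_event variant being discussed would look roughly like this, assuming the fence callbacks only decrement a counter and wake the queue; all names here are made up:)

```c
struct syncobj_wait {
        wait_queue_head_t wq;
        atomic_t pending;              /* fences still unsignaled */
};

struct syncobj_wait_entry {
        struct dma_fence_cb cb;
        struct syncobj_wait *wait;
};

static void fence_signaled_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
{
        struct syncobj_wait_entry *entry =
                container_of(cb, struct syncobj_wait_entry, cb);

        atomic_dec(&entry->wait->pending);
        wake_up(&entry->wait->wq);
}

/* caller: init the on-stack waitqueue, add fence_signaled_cb to every
 * still-unsignaled fence, then sleep; ___wait_event() re-checks the
 * condition before every schedule(), so the ordering problem goes away */
init_waitqueue_head(&wait.wq);
atomic_set(&wait.pending, unsignaled_count);
/* ... dma_fence_add_callback(fence, &entry->cb, fence_signaled_cb) per fence ... */
ret = wait_event_interruptible_timeout(wait.wq,
                                       atomic_read(&wait.pending) == 0,
                                       timeout);
```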
15:35 bbrezillon: let me experiment with this, and I'll come back to you
15:35 bbrezillon: just glad I wasn't hallucinating this
15:35 sima: you still need both a waitqueue and a waiter on-stack, feels a bit silly
15:36 bbrezillon: no, the waitqueue is your stack waiter
15:36 sima: yeah this is superb work spotting this frankly
15:36 sima: that's not enough or I'm too confused
15:36 bbrezillon: or maybe I'm mixing waitqueue with a different construct
15:39 sima: waitqueue is the list that you call wake_up() on
15:40 sima: that's usually in a datastructure
15:40 sima: the on-stack waiter is the thing wait_event puts onto that list
15:44 bbrezillon: right, the wait_queue_entry is hidden in __wait_queue()
15:45 bbrezillon: ___wait_event(), sorry
15:46 sima: yeah it's one of those where the underscores stack up real bad :-/
16:08 mivanchev: hey, developer of static-wine32 here, I wanted to ask if there's any general interest in making mesa statically compilable?
16:09 mivanchev: it's not so much effort and IMO a huge benefit through LTO
16:09 mivanchev: on my side this would also make everything simpler by not having to resolve duplicate function definitions through prefixing :D
16:11 mivanchev: anyhow for platforms like Rpi this provides a significant benefit
17:41 Mis012[m]: which Rpi are we talking about, you can't fit all that many copies of Mesa in 2GiB of RAM
18:19 mivanchev: Mis012[m], 2GB is quite an old model :D
18:19 mivanchev: but also it's often enough to have 1 Mesa in RAM
18:20 Mis012[m]: if you use dynamic linking like god intended
18:21 mivanchev: Blasphemy!
18:21 mivanchev: he intended 11/10 performance through LTO
19:30 FL4SHK[m]: does mesa require dynamic linking?
19:31 mivanchev: yes
19:31 FL4SHK[m]: hmm
19:33 FL4SHK[m]: That's something I'm gonna need to support in one of my future Binutils ports
19:35 FL4SHK[m]: But porting an OS to the hardware is already going to be a lot of work :)
19:35 FL4SHK[m]: the hardware is gonna be a lot of work as well
19:35 FL4SHK[m]: at least a lot of that is done
19:35 mivanchev: I already use static Mesa for static wine but it takes a lot of patching
19:35 mivanchev: it's doable
19:36 FL4SHK[m]: oh?
19:36 FL4SHK[m]: that'd make it easier
19:36 FL4SHK[m]: It's gonna be a while before the hardware is done, but I'll come back to this channel for help when I get to the point of porting mesa
19:38 heat: are you doing your own hardware/architecture?
19:39 FL4SHK[m]: yes
19:39 heat: linux practically requires dynamic linking (for glibc, for mesa, for various things that require plugins, python, etc)
19:39 FL4SHK[m]: And I've done quite a bit of work on that front
19:39 FL4SHK[m]: hmm then I'll just do dynamic linking
19:39 heat: you can untangle yourself from that, but it's obviously a bit of a PITA
19:39 FL4SHK[m]: Right
19:40 heat: like statically linked glibc can't do DNS and whatnot
19:40 FL4SHK[m]: well, I'm willing to do the work
19:40 FL4SHK[m]: I was gonna use musl or something
19:40 heat: yeah musl might work better
19:41 FL4SHK[m]: I might go for musl anyway even if I have support for dynamic linking
19:42 heat: in any case if you're ever in a position to run mesa with your custom hardware, that would be really impressive
19:46 mivanchev: heat, linux doesn't require dynamic linking, there are quite a few musl distros out there
19:46 mivanchev: the problem is the C++ dependency, not glibc; C++ is hard to get in a static form sadly
19:46 heat: which is why i said practically requires
19:46 heat: glibc is most definitely a big problem
19:46 mivanchev: yeah, that's true :/
19:47 FL4SHK[m]: C++ is hard to get in a static form?
19:47 heat: libstdc++ can be statically linked (with no problems, i believe), glibc can't without taking away big features
19:47 FL4SHK[m]: libstdc++ can be compiled statically
19:47 mivanchev: oh? I gotta look into it
19:47 mivanchev: thanks
19:47 FL4SHK[m]: I've done so for my game console project
19:47 FL4SHK[m]: That's my second GCC port, and the most complete one
19:48 FL4SHK[m]: That's part of my second GCC port*
19:48 FL4SHK[m]: I designed a custom instruction set
19:49 mivanchev: woah!!!
19:49 mivanchev: nice FL4SHK[m]
19:49 FL4SHK[m]: the CPU and GCC ports are almost complete btw
19:49 FL4SHK[m]: But well, that's off topic for this channel, I guess
19:50 FL4SHK[m]: You can check out my GitHub if you want to see the code though
19:51 FL4SHK[m]: libsnowshouse, gcc, those repos
19:51 FL4SHK[m]: thanks mivanchev:
20:13 mivanchev: FL4SHK[m], I'll check it out
20:14 mivanchev: this one https://github.com/fl4shk ?
20:21 FL4SHK[m]: yes
22:23 tnt: I'm a bit lost with GL and linking and stuff. So for a bit of context, I'm working with the intel-compute-runtime OpenCL runtime and trying to maintain the CL/GL sharing interop extension. And in general it's been working fine.
22:23 tnt: However I've encountered a weird issue where I can't get the symbols I need.
22:24 tnt: So the CL runtime itself doesn't link directly to any GL library. It tries to find the symbols it needs at runtime and assumes the client app will have loaded GL and created a context as needed before it tries to create a CL context.
22:25 tnt: And then I just use dlsym() to find the few symbols I need. And in general, it's been working fine.
22:26 tnt: But now I have an app for which dlsym() of GL symbols (like glGetString or such) works fine. But I get NULL for both glXGetProcAddress and eglGetProcAddress ...
22:35 HdkR: tnt: Sounds like it isn't linking to libGL but instead libOpenGL, which means it doesn't get GLX symbols, and it isn't linking against libEGL either. So effectively surfaceless
22:37 tnt: HdkR: Before that call to CL init, I used glfw to create an actual window on-screen so I definitely have a surface.
22:39 soreau: libepoxy?
22:43 karolherbst: tnt huh.... I mean rusticl also just uses dlsym
22:43 karolherbst: though.. mhh
22:43 karolherbst: if the gl lib is bound locally, that might cause issues
22:43 karolherbst: but then all bets are kinda off to be fair
22:44 tnt: karolherbst: yes, I basically modelled that behavior after what rusticl does.
22:44 karolherbst: sounds like a problem I'll have to deal with as well...
22:44 tnt: Nope, rusticl works fine for that application.
22:44 karolherbst: hah
22:44 karolherbst: then do whatever rusticl does :D
22:45 karolherbst: though let me see if we do something special for glXGetProcAddress ...
22:45 karolherbst: nope...
22:46 tnt: Do you even use it at all ?
22:54 tnt: So the issue is definitely that GLX seems to be loaded as local, because if I force loading it with RTLD_GLOBAL at the beginning of the non-working app, it starts working. I'm not sure why it works for rusticl though.
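(A rough sketch of working around that on the CL runtime side; the library names and the RTLD_NOLOAD fallback are assumptions, not what rusticl or the intel runtime actually do:)

```c
#define _GNU_SOURCE
#include <dlfcn.h>

typedef void *(*get_proc_addr_t)(const char *);

static get_proc_addr_t
find_glx_get_proc_address(void)
{
        /* works when the app (or its loader) pulled in GLX with global visibility */
        get_proc_addr_t fn =
                (get_proc_addr_t)dlsym(RTLD_DEFAULT, "glXGetProcAddress");
        if (fn)
                return fn;

        /* fallback: libGLX is already mapped by the app, so RTLD_NOLOAD only
         * hands back a handle to it; dlsym() on an explicit handle works
         * even if the library was loaded RTLD_LOCAL */
        void *handle = dlopen("libGLX.so.0", RTLD_LAZY | RTLD_NOLOAD);
        if (!handle)
                handle = dlopen("libGL.so.1", RTLD_LAZY | RTLD_NOLOAD);
        if (!handle)
                return NULL;

        return (get_proc_addr_t)dlsym(handle, "glXGetProcAddress");
}
```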
22:55 karolherbst: what do you mean use it at all?
22:55 karolherbst: glXGetProcAddress? yeah, it's used
22:56 tnt: karolherbst: Yeah sorry, I had some memory from long ago where I think you were accessing the interop function directly without going through glXGetProcAddress.
22:57 karolherbst: yeah.... it might also be that rusticl ends up with a local copy of that stuff.. I haven't actually checked if that could happen or not, but I kinda doubt it, because it's not linking against those bits at all