IRC Logs of #dri-devel on irc.freenode.net for 2025-03-18

10:56 dolphin: mripard, sima: Any continuation to the above discussion about cgroups?
10:56 mripard: dolphin: none that I'm aware of
10:57 dolphin: My understanding was also that dmem is only for device memory, and system memory usage should come via the existing tracking.
10:59 mripard: can you define "system memory" ?
11:00 dolphin: I think it's a bit biased, but my understanding is anything that would contend for the same backing store resources as malloc()
11:01 dolphin: If it's carved out and can never be utilized as regular system memory, then it's not system memory but special memory.
11:02 sima: mripard, was pondering our discussion a bit more, and I feel like looking at this from the pov of what a distro would need to auto-config limits could help move this forward?
11:03 sima: and it should work for dgpu, igpu with shmem, igpu with split display/render and also dma_alloc with cma or true carve-out
11:03 sima: and also not fall over stuff like userptr I guess
11:03 mripard: dolphin: the issue is that whether an allocation goes through dmem or memcg in such a case is defined by the platform, the (physical) device, the firmware version, and the (struct) device
11:04 sima: so starting with the use-case and not starting with a specific solution for tracking/enforcing limits
11:04 mripard: so it's impossible to get stable accounting across devices and across kernel or firmware upgrades
11:05 sima: yeah, hence why I feel like stable accounting won't cut it, and we need to solve the auto-config situation
11:05 sima: or at least be able to
11:05 sima: while still allowing custom-tailoring ofc, but that's always possible simply by having an enforcement control
11:05 mripard: I mean, it's possible to have stable accounting if we use dmem for it, no matter what the backing store is
11:06 sima: nope
11:06 dolphin: but the whole point of cgroups is to avoid the backing store contention?
11:06 sima: not if you also want to solve memcg enforcement, which you have to because userptr is a thing
11:06 sima: userptr or svm or hmm or whatever
11:06 dolphin: if you have configured your system memory limits nicely, you can't just have somebody wreak havoc by allocating system memory that bypasses them?
11:06 sima: so trying to solve this with dmem only won't cut it
11:06 sima: yup exactly this issue
11:07 sima: "everything dmem" just falls apart in other ways, but it also falls apart
11:07 mripard: I mean, I did say we had to account for it in both memcg and dmem for system memory to avoid that issue in particular
11:08 mripard: so I agree, and it's something we can fix
11:08 dolphin: I think double-accounting was shot down as a concept very early
11:08 sima: yeah, that's really messy
11:08 sima: especially with big gpu where malloc vs gem_bo alloc is somewhat interchangeable
11:09 sima: and so for the same workload but different allocator approach you'd need wildly varying dmem limits
11:09 sima: plus I think tejun was fairly strongly opposed to a dmem that just equals system memory as a general thing that malloc() also uses
11:10 sima: I think it was more murky for tracking cma regions, but there you still have the double-accounting mess
11:10 mripard: it's really awesome how everybody complains now that it works for Xe
11:10 sima: xe doesn't do enforcing yet I think, and it still doesn't track system memory
11:10 mripard: but you know, were dead silent when I was very vocal about what I wanted to do but it wasn't merged yet
11:10 sima: so only "works" in a pr sense :-)
11:11 sima: it was a tiny, tiny first step
11:11 sima: mripard, xe has not precluded any decision on system memory tracking at all, that's all still open
11:11 sima: at least wrt upstream no regressions promises
11:12 mripard: it evidently did if Tejun was strongly opposed to it, and if it was completely obvious that this approach wouldn't fly
11:12 sima: well it's not completely obvious
11:13 sima: but over the years there developed a fairly strong consensus that at least for "normal" system memory, double accounting is not good
11:13 sima: except defining what "normal" is and how to do the accounting of it precisely without gaps is still really hard and unsolved
11:13 sima: I don't think we've ever gotten to a consensus whether cma is "normal" or not
11:14 sima: there's also the mess of ttm swapping out stuff to system memory
11:14 sima: which practically needs a memcg aware shrinker of gem_bo caches
11:14 sima: which ... is also a really hard problem
11:14 sima: that's why for now we didn't try with an enforcing limit for dmem, because currently we just cannot
11:15 sima: at least for ttm drivers
11:15 sima: or maybe more correct, dgpu drivers with vram that do swap-out to system memory
11:15 mripard: I'm sorry, I feel like I'm being gaslighted here. Is what I want to do welcome or not?
11:16 mripard: because if it isn't, I'm not going to waste any more time than I already did on it
11:16 mripard: and if it is, I don't want to discover *again* a year from now that I was used to rework the DMA API but that everybody knew that it wouldn't work
11:17 sima: mripard, so still haven't caught up on mails, but it is
11:18 sima: it's just a really hard issue with ongoing discussions among various people over years
11:18 sima: and not really anything documented anywhere as a todo
11:18 sima: we should probably fix that
11:19 sima: I think we've gone over the "dmem for everything gpu" design about pre-covid over 1-2 years until consensus moved towards "eh, nope"
11:19 mripard: then if it's welcome, what are we discussing here really?
11:20 sima: ah no, sunsetting "dmem for everything" happened at lpc dublin in 22
11:20 mripard: (and ftr, I definitely don't want it to be limited to GPUs)
11:21 sima: mripard, I guess sharing the lore?
11:21 sima: which really should be documented somewhere instead of just shared in convos
11:21 mripard: limiting it to GPUs is pointless to me anyway
11:21 sima: oh yeah we agreed to that long ago
11:22 sima: I more meant dmem to track these kind of memory allocations
11:24 sima: T.J. Mercier was involved in a lot of these discussions years ago, but I think he moved on from this
11:24 sima: but helps finding some of the old rfcs
11:26 sima: android has the added fun of having to transfer the charges
11:26 sima: https://lore.kernel.org/dri-devel/20230123191728.2928839-3-tjmercier@google.com/ for binder
11:27 mripard: I'm sorry, I don't understand what that whole story is about then. I want to be able to track and limit DMA allocations consistently, so we can enforce those limits at the application level. You've been telling me yesterday and this morning that we can't and everybody disagrees. But now you're telling me that no, of course it's doable?
11:27 sima: neither
11:27 sima: I'm trying to tell you that it's _really_ hard
11:27 sima: and that there's like 5 years of design discussion in this space
11:28 sima: so it's neither a yes, nor a no, and definitely not a "of course it's doable" because I'm not sure what it should even look like exactly
11:29 sima: hence my suggestion that maybe we should start documenting the discussions and ideas thus far somewhere, maybe in a todo upstream rst doc
11:29 sima: since currently that's just floating around in a bunch of heads
11:31 sima: wrt the overall goal, that's at least a 15 year old todo item
11:31 sima: that = tracking dma allocations of heavy users like gpu
11:32 sima: or even just allocations really, since the entire dma-api or not thing is a bit an internal implementation detail
11:33 mripard: again, what is the message there? I evidently wasn't at those discussions, nor was I told what the content was, so it's not like I can do it myself. And if it's a "it's hard don't try" message, it was effective. If it was a "it's hard, you got this", it wasn't.
11:34 sima: I'm aiming for "it's really hard but worth trying"
11:34 sima: and I can help with digging out all the details that we should press into a todo doc
11:35 sima: essentially I've been trying to backfill you on these past years of discussions
11:35 sima: but yesterday I was really busy, so it was probably too stressed on my side
11:38 sima: mripard, heading out for lunch now, but my proposal would be I dig through all the opens
11:38 sima: fish out m-l references where they exist
11:39 sima: and you type it up since that's the best way to make sure I've successfully explained it?
11:39 sima: or I guess plan B is I typed it up, but in the past that was just really long chats with people individually ...
11:40 sima: from the top of my head tj mercier, tejun, mlankhorst, thellstrom, christian könig were the main folks involved thus far in the various discussions
11:44 sima: quickly scribbled a page full of notes
12:34 tomba: Could someone have a look at "[PATCH v10 00/13] drm/bridge: cdns-dsi: Fix the color-shift issue"? I think it's ready to merge, and it's mostly small cdns dsi bridge related stuff, but there's the "drm/atomic-helper: Re-order bridge chain pre-enable and post-disable"...
13:15 alyssa: glehmann: ah, right
13:15 alyssa: yeah so that should help with dxvk
14:19 sima: tomba, looks reasonable, but might be good to update the kerneldoc for these hooks to explain that they wrap both encoder and crtc enabling and disabling? with that a-b: me
14:20 sima: but would be good to have an ack from mripard and pinchartl too
14:26 tomba: sima: hmm you mean the kernel docs for drm_atomic_helper_commit_modeset_enables and drm_atomic_helper_commit_modeset_disables?
14:29 sima: nah, for drm_bridge_funcs.(atomic_)pre_enable|post_disable docs
14:29 sima: maybe also the enable/disable ones
14:29 sima: right now they only reference the encoder hooks, and very intentionally not the crtc hooks too
14:29 sima: since we kinda assumed encoder hooks are what enable the pipe to the bridge, and crtc is a driver implementation detail
14:30 sima: so essentially change all these to "both $encoder_hook and $crtc_hook" instead of just "$encoder_hook" across the board
14:30 sima: plus probably some rewording since if you do this mechanically the sentences likely become unreadable
14:31 sima: tomba, the two functions you mentioned don't talk about bridges specifically, so I don't think those need to be updated
14:39 tomba: sima: ok, so essentially also describe in the bridge enable/disable hook docs how the bridge hook is called in relation to the crtc hooks?
14:40 tomba: aradhya shows it in the patch desc, but indeed it's not really visible in the docs afaics
14:40 sima: tomba, yup, we need to keep the docs accurate to the new reality
14:43 tomba: sima: and just to be sure we have the same understanding, the "new reality" means something that happened before this series, as this series only switches the order how the hooks are called?
14:47 sima: tomba, well that switch itself is substantial and is what I think should be documented
14:48 sima: since we switch from just wrapping the encoder hooks to wrapping both crtc and encoder hooks
14:55 tomba: sima: no disagreement on the documentation. but I'm having some trouble understanding what you mean here. what do you mean with "wrapping"? don't we "just" change the sequence of the hook calls?
14:55 sima: I think we're saying the same
14:56 sima: currently the bridge hooks are only before/after encoder hooks
14:56 sima: with that patch they're before/after both crtc and encoder hooks
14:56 sima: so they "wrap" a different set of hooks
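The "wrapping" described in the three lines above can be pictured with a small toy model of the enable path. This is only a sketch: the strings stand in for hook calls and are not kernel symbols, and the two orderings are assumptions taken from the description in this chat rather than from the patch itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: each string stands in for one hook call. None of these are kernel symbols.
using Sequence = std::vector<std::string>;

// Enable path before the reorder (assumed from the chat): the bridge
// hooks sit before/after the encoder hook only.
Sequence enable_before() {
    return {"crtc_enable", "bridge_pre_enable", "encoder_enable", "bridge_enable"};
}

// Enable path after the reorder (assumed from the chat): the bridge
// hooks sit before/after both the crtc and the encoder hooks.
Sequence enable_after() {
    return {"bridge_pre_enable", "crtc_enable", "encoder_enable", "bridge_enable"};
}

// Position of `name` in `seq`; asserts that the hook is present.
std::size_t position(const Sequence& seq, const std::string& name) {
    auto it = std::find(seq.begin(), seq.end(), name);
    assert(it != seq.end());
    return static_cast<std::size_t>(it - seq.begin());
}

// True if the bridge pre_enable/enable pair surrounds the hook `inner`.
bool bridge_surrounds(const Sequence& seq, const std::string& inner) {
    return position(seq, "bridge_pre_enable") < position(seq, inner) &&
           position(seq, inner) < position(seq, "bridge_enable");
}
```

In this model `bridge_surrounds(enable_before(), "crtc_enable")` is false and `bridge_surrounds(enable_after(), "crtc_enable")` is true, which is the change the kerneldoc update would need to describe.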
14:56 tomba: alright =)
14:57 sima: maybe I'm too math brained, it's all funny spaces to me anyway
14:57 sima: so maybe I see a lot more things you can wrap than others
14:58 sima: maybe surround instead of wrap is more what you're thinking of?
14:59 tomba: yes =)
15:00 tomba: sima: Thanks for review, I'll ping Aradhya to do the updates.
15:04 sima: tomba, thx
15:07 tomba: I'm not sure why I didn't get what you meant with "wrap", now it sounds logical. In my defense, I have been debugging DSI for 10 hours today...
15:16 Ermine: the more docs the better
16:52 tjaalton: amber is still trying to dlopen libglapi, but it's gone with the move to libgallium.. should amber build its own libglapi then?
16:59 tjaalton: or disable shared-glapi perhaps
17:02 tjaalton: well that doesn't build
17:25 Company: dj-death: (pinging you because you wrote that:) https://github.com/KhronosGroup/Vulkan-ValidationLayers/blob/main/layers/state_tracker/state_tracker.cpp#L197 - can fmt_drm_props.drmFormatModifierCount be 0?
17:25 Company: dj-death: because I think I'm getting a crash in the next line (no good debug symbols atm)
21:08 dj-death: Company: I guess it should be drm_mod_props.data()
21:09 dj-death: c-ism :)
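The "c-ism" is presumably the `&vec[0]` idiom for getting a pointer to a vector's storage. A minimal sketch of why `.data()` is the safer spelling when the count can be 0 follows; `ModifierProps` and `props_pointer` are hypothetical names for illustration and are not taken from the validation layers source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-modifier properties struct.
struct ModifierProps {
    std::uint64_t modifier;
};

// Returns a pointer to the vector's storage for handing to a C-style API.
// Writing &props[0] here would index an empty vector when the count is 0,
// which is undefined behaviour and trips assertions in hardened standard
// library builds. data() is valid on an empty vector; the returned pointer
// just must not be dereferenced in that case.
const ModifierProps* props_pointer(const std::vector<ModifierProps>& props) {
    return props.data();
}

// Element count to pass alongside the pointer.
std::size_t props_count(const std::vector<ModifierProps>& props) {
    return props.size();
}
```

For a non-empty vector `props_pointer(v)` is the same address as `&v[0]`, so the change only affects the zero-count case.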
21:13 Company: dj-death: https://paste.centos.org/view/5b8362d0 if you want the stack trace and the assertion
21:14 Company: this is on a semi-broken experimental branch of mine running ycbcr experiments against lavapipe
21:17 Company: so it's either my branch or lavapipe or both being buggy and the validation layers in between trying their best and failing