IRC Logs of #dri-devel on irc.freenode.net for 2024-02-22

07:47 mripard: mlankhorst: we got build failures in a driver we merged yesterday and we probably don't want to send that as part of the PR. I'm working on fixing them up
09:08 pq: Does anyone else think that driving keyboard and mouse RGB leds through DRM KMS UAPI is... not a good fit at all? Even when you could show patterns individually addressed key lights.
09:09 pq: *patterns through individually
09:09 emersion: isn't there a new LED subsystem proposal?
09:10 emersion: https://lore.kernel.org/lkml/247b5dcd-fda8-45a7-9896-eabc46568281@tuxedocomputers.com/
09:11 pq: that one got a reply "arbitrary data backdoor, should use actual display interfaces instead"
09:11 pq: which I then replied to
09:13 javierm: pq: I haven't followed that long thread, but there's also this auxdisplay subsystem for segment LCDs and other kinds of "auxiliary displays"
09:13 emersion: oh i see, i'm behind then
09:13 emersion: i agree with your reply, pq
09:13 javierm: pq: maybe it can fit in that subsys ?
09:14 pq: I did see auxdisplay mentioned, but I know nothing of it
09:14 pq: emersion, thanks
09:14 javierm: pq, emersion: there was someone in this channel who mentioned they had a driver for a dot-matrix panel controller that was in fact a matrix of servos to control a robot :)
09:15 emersion: lol
09:15 javierm: where each monochrome pixel was to turn on and off the servo. I can't decide if it is brilliant or crazy
09:18 emersion: oh monochrome only? they should use RGB for the next version, and use the color channels as (X, Y, Z) coordinates for the robot arms orientation
09:21 javierm: emersion: haha, maybe it was 2bpp? Since at least you need 3 states: motor off, motor on in one direction, motor on in the inverse direction
09:21 emersion: nice
09:24 pq: don't servos usually position by PWM duty cycle? So a pixel value can control the duty cycle, which translates to servo position.
09:28 javierm: pq: indeed
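A minimal sketch of the mapping pq describes, where a pixel value selects a PWM duty cycle that in turn sets a servo position. The 50 Hz frame and the 1.0 to 2.0 ms pulse range are typical hobby-servo values assumed here, not figures from the discussion, and `pixel_to_duty_cycle` is a hypothetical helper name.

```python
# Assumed, typical hobby-servo timing (not taken from the discussion).
PERIOD_MS = 20.0      # 50 Hz PWM frame
MIN_PULSE_MS = 1.0    # pulse width at one end of travel
MAX_PULSE_MS = 2.0    # pulse width at the other end of travel


def pixel_to_duty_cycle(pixel: int) -> float:
    """Map an 8-bit pixel value (0..255) linearly to a PWM duty cycle (0..1)."""
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in 0..255")
    # Interpolate the pulse width, then express it as a fraction of the frame.
    pulse_ms = MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * pixel / 255
    return pulse_ms / PERIOD_MS
```

With these assumed timings, the darkest and brightest pixel values would correspond to the two ends of the servo's travel, and intermediate values to positions in between.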
09:28 javierm: emersion, pq: fyi, I found the day this was discussed: https://oftc.irclog.whitequark.org/dri-devel/2022-03-04#30695417;
09:29 javierm: it certainly is thinking outside the box :)
09:30 mripard: there was someone that was using the sun4i DPI output as a glorified DAC at some point too
09:30 mripard: I can't remember for what application it was exactly, but something similar :)
09:30 pq: oh, steppers
09:32 pq: glorified DAC; since pixel clocks can be 300 MHz or more, I've been thinking that you could generate a radio signal directly without any oscillator or anything
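As a rough illustration of the "glorified DAC" idea, the sketch below treats one 8-bit colour channel as DAC samples clocked at the pixel rate and fills a scanline with a sine carrier. It is an idealised model under stated assumptions: it ignores blanking intervals, which interrupt the sample stream on a real display link, and `carrier_pixels` is a hypothetical helper name.

```python
import math


def carrier_pixels(pixel_clock_hz: float, carrier_hz: float, n_pixels: int) -> list[int]:
    """Return 8-bit pixel values sampling a sine carrier at the pixel clock."""
    # A sampled signal can only represent frequencies up to half the sample rate.
    if carrier_hz > pixel_clock_hz / 2:
        raise ValueError("carrier must not exceed half the pixel clock")
    samples = []
    for n in range(n_pixels):
        phase = 2 * math.pi * carrier_hz * n / pixel_clock_hz
        # Shift and scale the sine from [-1, 1] into the 0..255 pixel range.
        samples.append(round(127.5 + 127.5 * math.sin(phase)))
    return samples
```

For example, `carrier_pixels(300e6, 75e6, 1920)` would fill one 1920-pixel line with a carrier at a quarter of a 300 MHz pixel clock.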
09:33 Company: I think it's a great idea that Matt Parker's Christmas tree can soon be supported with this
09:43 javierm: Company: haha, I didn't get that reference and had to search online
09:56 Company: it could finally be a reason to introduce 3D dmabuf formats (or do those exist already?)
09:56 Company: see also: Vegas Sphere
10:39 pac85: pq: yes people have turned USB VGA adapters into cheap SDRs, they even have a low enough impedance to just attach an antenna directly to them
10:42 emersion: hm
10:42 emersion: airlied: so… i pushed a patch to drm-misc-next but should've pushed it to drm-misc-fixes
10:43 emersion: but i have a more general question about the branches now
10:43 emersion: patch 1 is a fix, patch 2 is not, both need to be pushed, patch 2 depends on patch 1
10:43 emersion: what is the best way to push these?
10:50 javierm: emersion: both patches are in drm-misc-next now right ?
10:50 emersion: yeah
10:50 javierm: emersion: so I would just dim cherry-pick #1 in drm-misc-fixes
10:51 emersion:is at a loss without sima showing the way
10:51 javierm: then the fix could make it into the current release and #2 would be in the next one
10:52 emersion: ok, thanks for the advice! i'll wait for a bit to leave the chance for someone else to protest
10:56 mripard: emersion: for your first question, the ideal way would have been to merge the first in drm-misc-fixes, ask us to backport it into drm-misc-next, and then apply it to drm-misc-next
10:56 emersion: i see
10:58 javierm: yeah, or wait for drm-misc-next to get a backmerge and then push #2
10:58 emersion: yeah, since it's not super important patches, it could've waited for a backmerge
10:58 javierm: emersion: but it happens, I also got confused a couple of times and had to dim cherry-pick... the problem there is that it pollutes the git log since the commits are duplicated
10:59 emersion: indeed
11:00 mripard: javierm: committers shouldn't cherry-pick, and we shouldn't encourage them to
11:00 mripard: it's not a big deal when it happens
11:00 mripard: because, well, it happens on a regular basis indeed
11:00 mripard: but it's not something to encourage
11:01 emersion: yeah, i think javierm was strictly replying to my "oops i messed up" message
11:01 emersion: not saying it's a good thing to do in general, just saying it's a solution for my mistake
11:02 mripard: jani, rodrigovivi: ack to merge https://lore.kernel.org/all/20240219131423.1854991-1-tvrtko.ursulin@linux.intel.com/ through drm-misc?
11:13 javierm: mripard: yes, my intention wasn't to encourage emersion, but rather to assist him on how to fix it when it happens
11:13 javierm: or do you mean that cherry-pick should only be done by drm-misc maintainers and individual devs ask them to do so ?
11:14 javierm: (when they make a mistake I mean)
11:15 pq: pac85, ha! awesome
11:48 emersion: mripard: ^ let me know if that's what you meant or not
11:48 emersion: will hold off cherry-picking until i hear back from you
12:18 mripard: emersion: I'm not entirely sure what the policy is anymore, so go ahead and cherry-pick it with my Ack :)
12:19 emersion: ty
12:26 emersion: ah yes of course…
12:26 emersion: dim: FAILURE: Could not merge drm-misc/drm-misc-next
12:36 emersion: alright i hope i didn't mess anything up when resolving the conflict…
12:37 daniels: airlied: do you need to do any DRM PRs or anything in the next couple of days?
12:37 daniels: airlied: you can take it as heavily implied that the ideal answer is no :P
12:51 mripard: daniels: iirc airlied sends the drm fixes PR on fridays
12:57 tzimmermann: mripard, do you have comments to the v2 of the renesas fix? i'd like to have it in the drm-misc-next PR today
13:08 daniels: mripard: aha
13:08 mripard: tzimmermann: no, that's fine, do you want me to merge it?
13:10 KetilJohnsen: I'm investigating user submission of GPU work (Panthor), and was looking for a mechanism to notify user space about certain GPU events, like HW queue errors. I stumbled across drm_send_event(), which seems like a good match. The only "problem" I have is that it doesn't seem to be much used, so I'm worried this isn't the right way to go about it. Any comments or pointers?
13:13 daniels: KetilJohnsen: it depends how often you need to sync back
13:13 daniels: the usual way to report things like HW queue errors is an error back from your CS ioctl
13:14 daniels: but I guess with userspace submit, unless you're doing memory (un)bind ops, the only time you'd need to check in with the kernel would be for fence management
13:14 tzimmermann: mripard, yes i'll merge it in a bit. i'm going to add a Fixes tag as well
13:15 daniels: so if you do have one of those regular sync points then you could use that, otherwise a DRM event could be apt - but the problem with DRM events is that they're not in any way isolated and will appear as readable to anyone with the drm_file description, so you'd have to figure out how that isolation should work in the presence of e.g. multiple independent queues
13:30 KetilJohnsen: I don't think I have regular sync points which I can rely on.
13:35 KetilJohnsen: Not sure I quite understood your worry about isolation. I mean, I only plan to report HW queue issues to the process (fd) owning the queue. Shouldn't that give you the isolation needed?
13:45 mripard: tzimmermann: thanks!
13:53 daniels: KetilJohnsen: I guess you'd just report DEVICE_LOST for the whole device rather than trying to make errors granular to individual queues?
14:14 rodrigovivi: mripard: ack!
14:31 mripard: tzimmermann: this was the last drm-misc-next PR, right?
15:09 KetilJohnsen: daniels: you got a fair point there, reporting errors would not need to be very granular. But I would still need a reliable mechanism to provide that, given that user space (at least in theory) could be submitting work without ever interacting with kernel driver.
15:14 KetilJohnsen: But it is not just about errors, I would also want a way to notify user space about "memory fences" when they are updated. This would probably be the main thing to handle.
15:16 DemiMarie: KetilJohnsen: why do you want userspace submit? I know that getting rid of dma-fence is good, but userspace submission to firmware queues seems like something to avoid unless there is a very good reason for it, and IIUC Panthor doesn’t require it.
15:21 daniels: perf
15:39 tzimmermann: mripard, there should be an rc6 on sunday, so yes, it's the final -next PR for this cycle
16:20 sravn: tzimmermann: If I ack/rb the patches for atmel-hlcdc "Add support for XLCDC to sam9x7 SoC family", can you then push them to drm-misc?
16:20 tzimmermann: sravn, sure
16:20 sravn: I have lost my local infrastructure to push to drm-misc
16:20 sravn: thanks!
16:20 tzimmermann: oh
16:28 DemiMarie: daniels: does the perf matter in practice for reasonable frame rates, as opposed to demos that display frames much faster than a human can detect?
16:28 DemiMarie: For virtio-GPU native contexts I am thinking of forcing a copy of the commands, mostly because of TOCTOU concerns.
16:29 daniels: sure, if you need to ban userspace submission or add more copies or stick validation in the middle, those are all reasonable approaches
16:30 DemiMarie: This copy would be in virglrenderer.
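A minimal sketch of the copy-before-validate pattern DemiMarie mentions for TOCTOU concerns: snapshot the shared command buffer, validate the snapshot, and execute only the snapshot, so a later write to the shared buffer cannot change what was checked. The `validate` rule and the `execute` callback are hypothetical stand-ins, not the behaviour of virglrenderer or any real driver.

```python
def validate(cmds: bytes) -> bool:
    """Hypothetical check: accept only command bytes below 0x80."""
    return all(b < 0x80 for b in cmds)


def submit_with_copy(shared: bytearray, execute) -> bool:
    """Validate and execute a private snapshot of a shared command buffer."""
    # bytes() makes an immutable private copy; writes to `shared` made after
    # this point cannot alter what gets validated or executed.
    snapshot = bytes(shared)
    if not validate(snapshot):
        return False
    execute(snapshot)
    return True
```

Validating `shared` directly and then executing it would leave a window in which another writer could swap in unchecked commands between the two steps.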
16:30 daniels: but yeah, the impact is actually measurable, it's not just something people are doing because they're bored
16:30 DemiMarie: Do you have a rough idea of how big the impact is?
16:30 alyssa: ..guess we don't have hash_table_u64_foreach?
16:31 daniels: DemiMarie: depends on the workload and the hardware and ...
16:32 DemiMarie: daniels: hence the “rough”
16:32 DemiMarie: Are we talking a few percent or a factor of 2?
16:33 DemiMarie: I’m trying to figure out if “measurable” means “noticeable by real users”.
16:34 alyssa:guess she's writing one
16:34 DemiMarie: Or is this a “we don’t know”?
16:35 DemiMarie: I’m not too worried, BTW, as per prior discussion in this channel the firmware code should be quite simple.
16:36 sravn: tzimmermann: All patches are a-b, have sent a mail to the list where I mention that I asked you to apply them. Thanks again
17:06 daniels: DemiMarie: it is noticeable by real users when you're doing demanding workloads, yeah - on less demanding workloads you're obviously not going to be limited by job submission rate
17:33 robclark: DemiMarie: the problem is you'd need to parse the cmds to find what other cmds/shaders/buffers they point to.. but OTOH if TOCTOU is a problem, then it is an overall GPU kernel issue.. in fact there can be legit cases for userspace (guest or not) modifying buffers passed
17:35 robclark: opencl and vk with some extension even have raw pointers, so you pretty much need to have process isolation correct
17:36 DemiMarie: robclark: Ah, I see. So userspace command submission doesn’t allow userspace to do anything it could not do already, because of self-modifying code?
17:37 robclark: right
17:38 robclark: I think for older hw where cmdstream validation was required, the (host) kernel must have copied the cmdstream.. but fortunately most of those old GPUs are gathering dust these days
17:40 DemiMarie: robclark: I did not realize that userspace had write access to the command stream the kernel was sending to the firmware.
17:41 DemiMarie: robclark: Asahi GPUs absolutely require that the firmware’s command stream is not writable from userspace. Apple’s drivers let userspace write it, and Lina used that and a firmware bug to get root privileges.
17:42 robclark: yeah.. not sure about other drivers, but freedreno maps cmdstream as gpu readonly when we don't expect it to be modified (mostly just for sanity, so a driver bug doesn't scribble over cmdstream and make things harder to debug)
17:43 robclark: DemiMarie: so on adreno side, there is kernel controlled ringbuffer cmdstream, which has more privs like being able to set the pgtables... that is absolutely not writable or even visible to userspace
17:44 robclark: (or, well, I guess it is visible to userspace via the gpu but not writable.. it is also mapped gpu r/o)
17:44 robclark: in the asahi case, it sounds like a kernel bug, at any rate
17:44 DemiMarie: robclark: It was a (macOS/iOS) kernel bug indeed
17:45 DemiMarie: What I’m wondering is if userspace command submission is something to worry about
17:45 DemiMarie: The other problem is that MMIO doorbells must not be mapped into guests under any circumstances.
17:45 DemiMarie: I don’t even think Xen allows that.
17:46 DemiMarie: So the doorbells will need to be proxied.
17:47 robclark: there are certainly things that you could do incorrectly in host kernel to cause security holes.. but I don't think it is anything that virglrenderer could stop (and those are all things that would be a big problem even if VM was not in the picture, so I think driver devs give it some attention)
17:47 DemiMarie: robclark: I’m mostly concerned about TOCTOU or buffer overflow in firmware command buffer parsing
17:51 robclark: it shouldn't cause any more problem than just a gpu crash... userspace (guest or host) is always allowed to shoot its own foot, as long as it can't shoot another process
17:51 DemiMarie: What about memory corruption in the firmware?
17:53 robclark: fw memory gets reset when we reset the gpu to recover from gpu crash..
17:58 soreau: I see with apitrace, a program calls eglMakeCurrent(dpy, 0, 0, 0); followed a bit later by a call to eglMakeCurrent(dpy, 0, 0, ctx);. Does this mean context switching is in play? or is it only expensive to switch between two valid contexts
17:58 robclark: DemiMarie: btw, I'm not as familiar with the apple gpu+fw, but I'd make the general argument that _if_ the cmdstream needs to be copied to prevent TOCTOU issues, it should be done in the host kernel driver
18:04 Lynne: airlied: how did you test the av1 decode mr?
18:07 alyssa: apple doesn't do userspace submission, at least not on m1 + 13.x
18:07 alyssa: what lina found was a bug, not a fundamental design hole
18:25 DemiMarie: robclark: that assumes one cannot use FW exploits to compromise other contexts on the GPU
18:27 abhinav__: rodrigovivi jani Can we please get a review from you on https://patchwork.freedesktop.org/patch/579121/?series=130145&rev=1 ? We have a dependent series
18:42 airlied: Lynne: cts
18:50 airlied: daniels: i send fixes pull in next 8 hrs usually
18:50 airlied: then i have to wait for Linus
18:55 daniels: airlied: aight, will try to move the repo to gitlab on the weekend then
18:55 airlied: daniels: early next week is fine also
18:56 airlied: although i do some stuff on monday it's not important
19:09 rodrigovivi: abhinav__: done
19:14 Lynne: airlied: ah, cts has some issues
19:17 robclark: DemiMarie: my point is more that if there is a bug like that, it needs to be handled in host kernel (and/or fw), GPUs are sufficiently flexible that if you could trigger a bug like that by evil guest userspace modifying cmdstream after submission then you could also trigger it by evil guest userspace using a shader that generated cmdstream or something like that.. I kinda group possible exploits into "gpu exploits" and "cpu
19:17 robclark: exploits" (the latter being more frequent, like a UAF in host kernel driver).. nctx gives you some reasonable protection against the latter category, but the former category nothing in userspace is really going to be able to save you, you need to fix in host kernel. Fortunately the former category is much more rare.
19:18 DemiMarie: robclark: I see, thanks! I did not realize that GPUs had self-modifying code like that.
19:21 robclark: it varies a bit, not all do, but I know some do. But I think anything modern has general purpose load/store instructions in the shader, for example. Once you have that, you need to have proper isolation..
19:23 airlied: yeah there are vulkan exts to generate cmds on the gpu
19:25 abhinav__: rodrigovivi thanks!
19:28 DemiMarie: airlied: wow! I wonder if at some point we will see applications mostly written in shading languages.
19:30 robclark: I remember a few yrs back someone got a shader to do disk i/o via io_uring
19:31 robclark: in general, it's maybe not the ideal fit for many apps (which would prefer faster single threaded performance over massive SIMT thruput), but kind of a fun demo
19:48 jenatali: Demi: command buffer formats vary per-GPU and based on firmware, so unlikely to see user-generated command buffers shipped in an app. Shipped in a driver though is a different story
19:50 robclark: the concern is less well behaved normal app and more malicious app which decides it wants to bypass driver and build evil cmdstream directly, etc
19:55 airlied: Lynne: yeah we can fix it up later, just want to merge it to avoid rebasing forever