02:58 zzxyb[m]: excuse me, I created and processed a gbm_bo on the first DRM device. How do I display it on an output of the second DRM device once that is done?
07:13 MrCooper: zzxyb[m]: what are the two GPUs the BO is being shared between? Asking because scanning out from a shared BO doesn't work except for some special cases
08:42 pq: zzxyb[m], sounds like the kernel DRM drivers you are using are not checking everything they should, or rejecting everything they should. That may cause things like gbm_bo_import or createFB to succeed when they should not.
08:43 pq: zzxyb[m], such sharing and scanout depends completely on the hardware, so it is not a surprise you get different results on x86 vs. ARM.
08:44 pq: zzxyb[m], the various API calls you are doing are supposed to fail, when the hardware does not support the operation. Failing those calls correctly is the drivers' responsibility.
08:45 pq: zzxyb[m], there are many different ways to arrange one DRM device to render and another to scan out, and there is no single way that both works everywhere and is performant.
08:47 pq: zzxyb[m], the choices involved are: on which device you allocate a buffer, in which direction is the buffer import done, and which device is doing a copy if a copy is needed.
08:49 pq: zzxyb[m], the best choice depends on the hardware architecture and the exact graphics chips or cards chosen, and how they are connected.
08:50 pq: zzxyb[m], also, multiple different ways can work, but not all of them may have an acceptable performance, so you cannot solely rely on API calls failing to find the best solution.
08:53 DMJC: hi, I've got a question regarding Mesa/GL Drivers.
08:54 DMJC: I've got an application which I was trying to run in WINE, when I ran it on X11/AMD drivers it comes up with a black screen, but when I run it in Xephyr
08:54 DMJC: with LLVMPipe it renders
08:55 DMJC: Is there an easy way to determine whether it's AMD's Open source driver that's causing the problem?
08:55 kunzite: LLVMPipe is usually the source of truth compared to other drivers so that is concerning
08:58 DMJC: yeah if I run LIBGL_ALWAYS_SOFTWARE=1 /home/james/development/wine-git/wine secretops.exe
08:58 DMJC: the game renders/works
08:58 DMJC: but on both DRI_PRIME=1 and DRI_PRIME=2 I get a black screen
08:58 DMJC: Intel and AMD Drivers
08:59 pq: I don't think DRI_PRIME=2 is a thing, is it?
09:04 DMJC: it is when you have 3 GPUs...
09:05 DMJC: on my laptop PRIME=1 is coming up as intel, PRIME=2 and PRIME=0 come up as AMD
09:05 DMJC: there's also an NVIDIA GPU in this laptop (Intel/NVIDIA), but I've disabled it atm and use an eGPU (5700XT)
09:06 pq: that's undocumented then, the last I looked. Personally I'd use DRI_PRIME with pci device identifications to be sure I get exactly the card I want.
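A minimal sketch of pq's suggestion to pick the GPU by PCI identification rather than by index. It assumes the DRI_PRIME forms described in the Mesa documentation (a udev-style tag such as pci-0000_03_00_0, or a vendor_id:device_id pair). The PCI address and the helper names dri_prime_tag and env_for_gpu are made up for illustration.

```python
import os
import re

PCI_ADDRESS = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")

def dri_prime_tag(pci_address: str) -> str:
    """Turn a PCI address like '0000:03:00.0' into a DRI_PRIME tag.

    The tag follows the udev ID_PATH_TAG style: 'pci-' plus the address
    with every non-alphanumeric character replaced by '_'.
    """
    if not PCI_ADDRESS.fullmatch(pci_address):
        raise ValueError(f"not a PCI address: {pci_address!r}")
    return "pci-" + re.sub(r"[^0-9A-Za-z]", "_", pci_address)

def env_for_gpu(pci_address: str) -> dict:
    """Copy the current environment and pin DRI_PRIME to one device."""
    env = dict(os.environ)
    env["DRI_PRIME"] = dri_prime_tag(pci_address)
    return env

# Hypothetical address; the real one comes from `lspci -D`.
env = env_for_gpu("0000:03:00.0")
print(env["DRI_PRIME"])  # pci-0000_03_00_0
```

The resulting dictionary can be handed to subprocess.run(..., env=env) to start the application on exactly that card.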
09:06 DMJC: hmm, so at least I know whatever it is in AMD/Intel that's broken is working on closed source NVIDIA
09:06 DMJC: and on LLVMPipe
09:08 pq: arrrh, one day I will remember to go through the dri-devel email cc list and remove all skynet.ie addresses.
09:09 DMJC: I've got a list of all the GL Extensions the game uses, don't suppose there's a way to override them individually so I can eliminate the working ones?
09:10 pq: DMJC, I think Mesa had some environment variables for that.
09:12 linkmauve: DMJC, https://docs.mesa3d.org/envvars.html#envvar-MESA_EXTENSION_OVERRIDE
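A rough sketch of how the variable linkmauve links to could be used to rule extensions out one at a time. It assumes the documented format of MESA_EXTENSION_OVERRIDE, a space-separated list where a leading "-" disables an extension and a leading "+" enables it. The extension names are the ones DMJC reports finding in glxinfo later in this log; the helper names are hypothetical.

```python
EXTENSIONS = [
    "GL_EXT_framebuffer_multisample",
    "GL_EXT_framebuffer_blit",
    "GL_EXT_framebuffer_object",
    "GL_ARB_texture_non_power_of_two",
]

def one_at_a_time(extensions):
    """Yield (extension, override) pairs that disable one extension per run."""
    for ext in extensions:
        yield ext, "-" + ext

def disable_all(extensions):
    """Build a single override string that disables every listed extension."""
    return " ".join("-" + ext for ext in extensions)

for ext, override in one_at_a_time(EXTENSIONS):
    # Each line is the value to export before launching the game once.
    print(f"MESA_EXTENSION_OVERRIDE={override!r}")
```

If the game starts rendering once a particular extension is disabled, that extension's code path in the driver is the first place to look.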
09:14 emersion: pq, yeah, i get these annoying notifications too :/
10:35 zzxyb[m]: <MrCooper> "zzxyb: what are the two GPUs the..." <- amd and displaylink(udl)
10:37 pq: displaylink is a "software" device, there is no hardware behind it, and no GPU. A very special case.
10:39 noralf: "dim push-branch drm-misc-next" failed for me: "dim: FAILURE: Could not merge drm-intel/drm-intel-next", could someone please have a look. Fixing this is beyond my abilities.
10:40 zzxyb[m]: In fact, I have an ARM computer with a Mali GPU, but I cannot create a gbm_surface on displaylink (driver defect). So I thought about creating multiple gbm_surfaces on Mali and sharing them to the displaylink display
10:41 DMJC: Can't find in GLX Info:
10:41 DMJC: GL_TEXTURE_MAX_ANISOTROPY_EXT, GL_DRAW_FRAMEBUFFER_EXT, GL_READ_FRAMEBUFFER_EXT, GL_FRAMEBUFFER_EXT, glMultiTexCoordPointerEXT, glBindFramebufferEXT, wglSwapIntervalEXT, glBlitFramebufferEXT, glBindRenderbufferEXT, glFramebufferTexture2DEXT, glGenFramebuffersEXT, glRenderbufferStorageEXT, glRenderbufferStorageMultisampleEXT
10:41 DMJC: Can find in GLXinfo
10:41 DMJC: GL_EXT_framebuffer_multisample, GL_EXT_framebuffer_blit, GL_EXT_framebuffer_object, GL_ARB_texture_non_power_of_two
10:41 zzxyb[m]: pq: I have implemented this technical architecture, but the copy performance is not the best, so I am looking for an optimization solution
10:42 pq: zzxyb[m], the description of https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/810 has some background information.
10:42 pq: I've never worked on any Mali device, though.
10:43 pq: the difference between integrated and discrete graphics card matters too, whether it has dedicated VRAM or not. CPU access to dedicated VRAM directly is slow.
10:44 zzxyb[m]: pq: I have encountered a lot of mali driver bugs, unfortunately, it has greatly increased the difficulty of my work, I work on wayland
10:46 pq: zzxyb[m], gbm_surface is a collection of gbm_bo's. You don't share surfaces, you share buffers. But yes, creating a gbm_surface on the actual GPU, taking gbm_bo out, exporting it, then importing it to displaylink is how a zero-copy path might work.
10:47 pq: We can't help with proprietary drivers though, if you use one.
10:47 zzxyb[m]: <pq> "displaylink is a "software..." <- Yes, the displaylink device seems to have the DRI function, but it is an independent DRM device. Maybe I am used to calling it a GPU, but in fact it is not. The main reason is that this type of technology belongs to the multi-GPU range. Forgive me for the wrong name.
10:50 zzxyb[m]: pq: I am trying on the amd open source driver, using gbm_bo_import, but the result is not very friendly, let me show you the display effect
10:51 pq: zzxyb[m], I think much of it depends on the exact details of how you use the GBM API. E.g. do you use explicit modifiers. Do you tell AMD to allocate linear. Etc.
10:54 pq: I would guess that udl cannot handle anything that is not using DRM_FORMAT_MOD_LINEAR.
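A small sketch of the check pq's guess implies on the importing side. The constants mirror drm_fourcc.h (DRM_FORMAT_MOD_LINEAR is 0, DRM_FORMAT_MOD_INVALID is the reserved all-ones value in the low 56 bits, and the vendor lives in the top 8 bits). The function accepts_on_linear_only_device and the AMD modifier value 1 are assumptions for illustration, not udl's actual logic.

```python
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_RESERVED = (1 << 56) - 1
DRM_FORMAT_MOD_VENDOR_NONE = 0x00
DRM_FORMAT_MOD_VENDOR_AMD = 0x02

def fourcc_mod_code(vendor: int, value: int) -> int:
    """Pack a vendor id (top 8 bits) and a vendor-specific value (low 56 bits)."""
    return (vendor << 56) | (value & DRM_FORMAT_RESERVED)

# INVALID means "layout not stated" (implicit modifier). It does not mean linear.
DRM_FORMAT_MOD_INVALID = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NONE, DRM_FORMAT_RESERVED)

def modifier_vendor(modifier: int) -> int:
    return modifier >> 56

def accepts_on_linear_only_device(modifier: int) -> bool:
    """Hypothetical policy for an importer that only understands linear buffers."""
    return modifier == DRM_FORMAT_MOD_LINEAR

# Made-up AMD modifier value standing in for "some tiled layout".
amd_tiled = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_AMD, 1)

for name, mod in [("LINEAR", DRM_FORMAT_MOD_LINEAR),
                  ("INVALID", DRM_FORMAT_MOD_INVALID),
                  ("AMD tiled", amd_tiled)]:
    print(f"{name:10s} 0x{mod:016x} accepted={accepts_on_linear_only_device(mod)}")
```

Treating INVALID as acceptable is the failure mode discussed further down: the buffer imports fine but the pixels come out rearranged.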
10:56 zzxyb[m]:uploaded an image: (121KiB) < https://matrix.org/_matrix/media/v3/download/matrix.org/wagaCUScXvYyugroGEzXsrqg/IMG_20230704_185613.jpg >
10:58 pq: zzxyb[m], what should the correct image look like?
10:59 zzxyb[m]: pq: Similar to this, just with a lot more stripes
11:01 pq: can you show it? or try with a more meaningful and less abstract image, so it would be easier to see how the pixels are getting rearranged.
11:02 pq: My current guess is that the actual and expected format modifiers do not match.
11:02 zzxyb[m]:uploaded an image: (198KiB) < https://matrix.org/_matrix/media/v3/download/matrix.org/WkSunRGtvRmrIlweqpixLRVm/mmexport1688354268246.jpg >
11:04 zzxyb[m]: pq: format or stride is very likely, because it is very close to the original image
11:05 pq: it's really weird the white dots are spread, that looks like an analog problem, not digital. Or is there some scaling going on too?
11:05 zzxyb[m]:uploaded an image: (40KiB) < https://matrix.org/_matrix/media/v3/download/matrix.org/SCPhEwTPZFWUJvZkQULDdtQV/IMG_20230704_190457.jpg >
11:06 zzxyb[m]: pq: I'm sure the image size is correct, this displaylink device is just one resolution
11:06 linkmauve: Ah, I had approximately the same kind of artifacts trying to import an INVALID dmabuf exported from amdgpu to i915.
11:06 pq: are you sure the displaylink device is driving the panel correctly?
11:07 linkmauve: Make sure you allocate it as LINEAR.
11:07 pq: linkmauve, really? It can cause pixel stretching like that?
11:07 linkmauve: pq, it uses tiling, so tiles become linear in memory.
11:08 zzxyb[m]: pq: Yes, if I don't use gbm_bo_import, there is a way to light up the displaylink normally, but there will be a memory copy
11:08 pq: linkmauve, yeah, but if you look at the correct image, the white dots are very small and round. In the incorrect image they are very stretched horizontally.
11:09 linkmauve: And if the importer isn’t aware of it, or interprets this INVALID as linear, it will take the whole tile and spread it on a <width×height>×1 line.
11:09 pq: but yeah, mismatch of format modifiers is the most likely cause
11:09 linkmauve: pq, if a white dot is 2×2, it might look 4×1.
11:10 pq: hmm, right
11:11 linkmauve: I do not know the exact tiling format of this particular GPU, but I would expect some morton order which puts close pixels close in memory.
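linkmauve's "2×2 looks 4×1" point can be reproduced with a toy model. This sketch assumes a made-up layout of 2×2 tiles stored one after another (row-major inside each tile), which is far simpler than any real AMD tiling mode, and then reads the same memory back as if it were linear.

```python
def store_tiled(pixels, width, height, tile=2):
    """Write a row-major image into memory tile by tile.

    Assumes width and height are multiples of the tile size.
    """
    memory = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            for y in range(ty, ty + tile):
                for x in range(tx, tx + tile):
                    memory.append(pixels[y * width + x])
    return memory

def read_as_linear(memory, width):
    """Interpret memory as plain rows, which is what a linear-only importer does."""
    return [memory[i:i + width] for i in range(0, len(memory), width)]

WIDTH, HEIGHT = 8, 4
image = [0] * (WIDTH * HEIGHT)
for y in (0, 1):            # a 2x2 white dot in the top-left corner
    for x in (0, 1):
        image[y * WIDTH + x] = 1

misread = read_as_linear(store_tiled(image, WIDTH, HEIGHT), WIDTH)
for row in misread:
    print("".join("#" if p else "." for p in row))
# first row printed: ####....
```

The four pixels of the dot are adjacent in tiled memory, so a linear reader lays them out as one horizontal run, matching the stretched dots in the photos.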
11:12 linkmauve: I don’t remember if I ever fixed that issue on my end, I wanted to play a game rendered with radv, encode that buffer with vaapi on the Intel card, and then stream that out with waypipe.
11:15 zzxyb[m]: If I copy it to a dumb buffer, it can be displayed normally. Should I seek a solution to reduce copying from gbm_bo to the dumb buffer?
11:15 linkmauve: Have Genshin Impact for instance: https://linkmauve.fr/files/wayland-screenshot-2022-02-21_20-20-34.png
11:18 linkmauve: Or NieR: https://linkmauve.fr/files/wayland-screenshot-2020-08-21_21-31-44.png
11:18 zzxyb[m]: But this cost is very high: first map the gbm_bo to user memory, then copy that memory to the dumb buffer. The performance is not good
11:19 linkmauve: zzxyb[m], the issue is that your buffer is tiled, that is pixels are not ordered like you would expect in memory pixel per pixel and row per row.
11:20 linkmauve: So you have to either allocate it linear, and take the perf hit on every draw to it, if your GPU is even able to draw to a linear buffer.
11:20 linkmauve: Or do a copy, untiling the tiled buffer into a linear layout.
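The second option, a de-tiling copy, amounts to an address translation per pixel. This sketch assumes a hypothetical layout of 2×2 tiles stored in row-major order, not any real hardware tiling mode; tiled_offset and detile are illustrative names.

```python
def tiled_offset(x, y, width, tile=2):
    """Offset of pixel (x, y) in the toy tiled layout.

    Assumes width is a multiple of the tile size.
    """
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    within_tile = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + within_tile

def detile(tiled, width, height, tile=2):
    """Copy a tiled buffer into a linear (row-major) one."""
    linear = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            linear[y * width + x] = tiled[tiled_offset(x, y, width, tile)]
    return linear

WIDTH, HEIGHT = 8, 4
original = list(range(WIDTH * HEIGHT))

# Build the tiled buffer with the same mapping, then undo it.
tiled = [0] * (WIDTH * HEIGHT)
for y in range(HEIGHT):
    for x in range(WIDTH):
        tiled[tiled_offset(x, y, WIDTH)] = original[y * WIDTH + x]

print(detile(tiled, WIDTH, HEIGHT) == original)  # True
```

Every pixel is read once and written once, so the cost is dominated by how fast the source memory can be read, which is why an uncached mapping hurts so much.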
11:21 linkmauve: On my Mali-400 GPUs, the tiling/detiling is always done on the CPU, and detiling can be extremely slow if you map the buffer as uncached memory, this can be something to look for.
11:21 pq: zzxyb[m], note: gbm_bo_map() internally makes a copy when necessary in order to give you a linear pixel arrangement.
11:22 linkmauve: I once reported an issue with that in Lima, I don’t know whether it got fixed, but glGenerateMipmap() would access the image that just got submitted, tiled, from an uncached map.
11:22 pq: mmap() does not do such conversions, but it can be even slower.
11:23 linkmauve: It would take something like 15s on my phone to generate the mipmap for a 4096×4096 texture. ^^'
11:23 zzxyb[m]: linkmauve: I'm wondering if it can be solved in the udl driver?
11:24 linkmauve: I don’t know anything about udl, sorry.
11:28 pq: zzxyb[m], I believe udl would not accept CPU de-tiling code for Lima tiling modes, no.
11:29 pq: zzxyb[m], if Lima cannot render to linear, then you have to do a copy into linear in some way. I think such embedded systems often have some additional hardware that can do that de-tiling copy more efficiently than the CPU.
11:30 pq: zzxyb[m], did you ever benchmark glReadPixels straight into udl dumb buffer?
11:31 pq: no dmabuf, no imports, no exports, just the plain old glReadPixels
11:31 zamundaaa[m]: pq: Afaik Mesa has workarounds for hardware that can't render to linear - it resolves the image to linear on glFlush / eglSwapBuffers. So using a linear buffer should work fine
11:33 linkmauve: Looking at my phone’s drm_info, it doesn’t even support tiled formats for scanout. :|
11:34 pq: zamundaaa[m], ah, that helps. But I'm not sure it's Mesa...
11:35 daniels: specifically kmsro has code to handle it for drivers which use kmsro (not Intel/AMD) when using GBM as a target
11:41 pq: zzxyb[m], I keep getting confused with the AMD vs. Lima cases. They aren't comparable, especially if Lima is proprietary.
11:42 pq: zzxyb[m], if you need it to work on Lima, solve it on Lima. Making it work on AMD is a new problem, not the same problem.
11:43 _jannau__: Lima (mesa) or Mali (proprietary)?
11:44 pq: oh Lima was the Mesa one? I got that confused.
11:46 daniels: pq: lima is the mesa driver for old (pre-Panfrost) Mali GPUs
12:40 pcercuei: You guys are okay with a huge patch touching 85 files with a tiny change in each file, or should I have a huge patchset of 85 tiny patches each touching a file?
12:41 emersion: i'd personally favor the former, as long as no other changes are mixed in
12:44 tzimmermann: pcercuei, if the change is completely uniform across all 85 files, a single patch would be better IMHO
12:47 pcercuei: Ok, even if the files touched are scattered across subsystems?
12:48 pcercuei: (not *that* scattered - but it touches 3-4 subsystems)
12:48 MrCooper: then it should probably be split per subsystem at least
12:49 pcercuei: Hmm. But the kernel wouldn't build between commits then
12:49 pcercuei: (I update the prototype of a callback function)
12:51 MrCooper: one possible way to handle that is: add new callback with new signature, convert callback users, remove old callback(, rename new callback to old one)
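MrCooper's sequence can be sketched in miniature. The kernel code in question is C; this is only a Python model of the idea, and every name in it (DriverOps, read, read_ctx, core_read) is hypothetical. The point is that the core accepts both signatures during the transition, so each step builds and works on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class DriverOps:
    # Old signature: read(channel). Kept only until all users are converted.
    read: Optional[Callable[[int], int]] = None
    # Step 1: new callback with the new signature, added next to the old one.
    read_ctx: Optional[Callable[[int, dict], int]] = None

def core_read(ops: DriverOps, channel: int, ctx: dict) -> int:
    """Core dispatch: prefer the new callback, fall back to the old one."""
    if ops.read_ctx is not None:
        return ops.read_ctx(channel, ctx)
    if ops.read is not None:
        return ops.read(channel)
    raise NotImplementedError("driver provides no read callback")

# Step 2: drivers are converted one patch (or one subsystem) at a time.
legacy = DriverOps(read=lambda ch: ch * 10)
converted = DriverOps(read_ctx=lambda ch, ctx: ch * 10 + ctx["offset"])

print(core_read(legacy, 2, {"offset": 1}))     # 20
print(core_read(converted, 2, {"offset": 1}))  # 21

# Step 3, once nothing sets `read` any more: drop the field and the fallback,
# and optionally rename `read_ctx` back to `read`.
```

The price is a temporary second name and a fallback branch in the core, in exchange for a series that stays bisectable across subsystems.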
13:28 jfalempe: tzimmermann, did you take a look at https://patchwork.freedesktop.org/patch/543984/ ?
13:29 jfalempe: I'm not sure it's the best way to fix that, but aspeed is a bit particular in this case. Nothing is connected on DP, but you still want to use it remotely, with a decent resolution.
14:03 tzimmermann: jfalempe, oh. i didn't see it. let me comment
14:40 zzxyb[m]: pq, linkmauve, maybe you will be interested in this: a kwin developer just told me that kwin has multiple modes for how buffers are transferred between GPUs: https://invent.kde.org/plasma/kwin/-/blob/master/src/backends/drm/drm_egl_layer_surface.cpp?ref_type=heads#L343
14:41 zzxyb[m]: <zzxyb[m]> "excuse me, I created and..." <- this is a more comprehensive solution to the problem.@pq,@linkmauve
14:54 tzimmermann: javierm, FYI i have a few other patchsets for fbdev ops coming up. i also want to use the fb_ops init macros and Kconfig tokens in fbdev drivers. with the fbdev device file now being optional, i'd like to get to the point where the respective fb_ops can be left out from compilation
15:09 penguin42: karolherbst: I'm going to get back to looking at profiling; are we at about the same point as a couple of weeks back?
15:11 karolherbst: yeah
15:13 penguin42: karolherbst: Ack, I'll look at it over the next few days and let you know where I get to
15:25 javierm: tzimmermann: neat. I still owe you the review of the previous set, let me do that now
15:25 javierm: tzimmermann: I meant to do it on Friday but got busy with some stuff
15:25 tzimmermann: oh, you mean the screen_info thing?
15:26 tzimmermann: ok, sounds good
15:26 javierm: tzimmermann: yeah. Although I think you had some review from Arnd already?
15:26 tzimmermann: i wanted to look at your recent FB_CORE stuff, but didn't have time yet
15:26 tzimmermann: yes, arnd looked over it
15:27 javierm: tzimmermann: there's no rush. I believe the menu is much more organized after your suggestion (and from Geert and Arnd)
15:29 arnd: tzimmermann: one more thing I noticed the other day but didn't comment on is that CONFIG_FB_NOTIFY should be selected by FB_DEVICE now, it's no longer needed without that
15:34 javierm: arnd: but it should be "select CONFIG_FB_NOTIFY if FB", right?
15:34 javierm: because I don't see it used by the DRM fbdev emulation layer
15:37 javierm: arnd: I haven't realized that you were in this channel btw :)
15:38 arnd: javierm: good question. I see four callers of fb_register_client(), and I had assumed that at least the BACKLIGHT_CLASS_DEVICE one does get used with drm drivers, but I have not looked closely at that
15:41 javierm: arnd: yes, but that fb_register_client() call happens from backlight_register_fb() and that's only defined when #if defined(CONFIG_FB) || (defined(CONFIG_FB_MODULE)
15:41 javierm: otherwise it's just a stub
15:42 javierm: arnd: I can include that cleanup in the next iteration if you want with your Suggested-by
15:44 arnd: javierm: so in the current mainline code (before the FB_DEVICE and FB_CORE options are split out), we would always register the notifier for a drm device, but after your patches the #ifdef check is no longer true
15:46 arnd: I see this getting called from drm drivers through devm_backlight_device_register(), so I'm still not sure whether this #ifdef should be changed to checking for CONFIG_FB_CORE instead
15:48 javierm: arnd: hmm, that's a good point...
15:48 javierm: arnd: probably we could just keep as a FB select as it is in my latest version. Yes, it's not needed if FB_DEVICE isn't enabled but that's a cleanup we can do later
15:48 pcercuei:gets triggered by #ifdefs
15:49 javierm: arnd: I probably should audit all the #ifdef FB and see which ones have to be replaced by FB_CORE
15:49 arnd: javierm: I meant this more as something that could be changed along with tzimmermann's patches that introduce FB_DEVICE in the first place, it doesn't really belong in your series
15:50 arnd: javierm: the only other such check that I see is in drivers/video/backlight/lcd.c, which has the same code
15:51 javierm: arnd: right. Which isn't selected by DRM so we could just keep as is
15:53 javierm: arnd: but I think you are correct about the check in https://elixir.bootlin.com/linux/latest/source/drivers/video/backlight/backlight.c#L82 and that should be FB_CORE instead. I'll fixup that in my next rev
15:55 arnd: ah right, it seems that there are in fact only two remaining callers of lcd_device_register: drivers/hid/hid-picolcd.c and drivers/video/omap/lcd_ams_delta.c
15:57 javierm: arnd: yeah
15:57 javierm: arnd: I'll wait for your review though before re-spinning the series. There's no rush of course
17:10 swick[m]: airlied sima what was your conclusion about the color pipeline API thing?
17:52 emersion: gfxstrand: hm, if UMF becomes a thing, would it be driver-agnostic and device-agnostic like drm_syncobj?
17:53 emersion: or would it stay a driver-specific opaque object, with users needing mesa to interact with it?
19:45 sima: emersion, definitely want to at least wrap it in drm_syncobj for interop I think, but within mesa it might be more driver specific
19:47 emersion: sima: hm… would it be drm_syncobj or Something Else?
19:48 sima: well for backwards interop drm_syncobj, because semantics should match
19:48 sima: it's just that the fence materializes only right when it gets signalled :-)
19:48 emersion: it would really help me if it was just drm_syncobj, because that would mean we can work on the new Wayland protocol
19:48 sima: emersion, yeah I think for protocol purposes you can assume drm_syncobj
19:48 emersion: i see
19:48 sima: or that we owe you really big time
19:49 sima: if we rev everything at the same time for umf, we'll imo never get there, too big a jump
19:49 emersion: i agree
19:50 emersion: yeah, i think a drm_syncobj which materializes a signaled fence on UMF completion would work
19:50 sima: yeah
19:51 sima: we might be able to do better for "vk app renders into winsys buffers(*)"
19:51 sima: (*) conditions apply
19:51 sima: just not sure whether the conditions work out or not
19:51 sima: but worst case fallback should at least be functional
19:52 emersion: do you have a plan for sync_file?
19:52 sima: not possible
19:52 emersion: ok
19:53 sima: I have some really funky ideas to fix some fairly theoretical deadlock issues with sync_file and modesets out-fences
19:53 sima: but they're really kernel internals, not actual umf future fences
19:53 sima: real future fences like umf we just can't squeeze into sync_file without breaking the world
19:53 emersion: the slightly annoying thing with this idea is that there are no GL/Vulkan drm_syncobj import/export
19:53 sima: same for implicit fencing really, it would instead be "future fence semantics implicit sync" which is just entirely a different beast
19:54 emersion: KMS too
19:54 sima: hm I thought vk and exts and gl too for interop?
19:54 sima: kms you need to fish the sync_file out of the drm_syncobj fd
19:54 emersion: but then you need to wait for the fence to materialize
19:55 sima: yeah
19:55 sima: no way around that
19:55 sima: kms doesn't eat future fences
19:55 sima: we'd need to be able to somehow cancel atomic commits
19:55 emersion: vk and GL only have sync_file exts, not drm_syncobj
19:55 sima: and I'd like not to get there :-)
19:55 sima: uh
19:55 emersion: but we can fix that
19:55 emersion: it's just some more work
19:57 sima: you sure? the kernel supports baking them into fd, and there's no other use afaik than import/export to native obj when you start with a vk timeline semaphore?
19:57 emersion: for vulkan, exporting a timeline semaphore into a FD gives you the drm_syncobj, but there's no guarantee: as far as the spec is concerned, it's an opaque FD
19:58 sima: it's a platform specific fd
19:58 emersion: which can only be imported into the exact same device
19:58 emersion: and exact same driver
19:58 sima: yeah on linux it's more :-)
19:58 emersion: (UUIDs must match)
19:58 emersion: well, NVIDIA could export to not-drm_syncobj for instance
19:58 sima: well, except that one
19:58 sima: the thing is, you can just try
19:59 emersion: and any other Linux driver could do whatever
19:59 sima: yeah so just check it's mesa or something
19:59 emersion: i'd rather just add the proper ext
19:59 sima: vk_ext_this_is_mesa
19:59 emersion: we have a sync_file ext, we can have a drm_syncobj ext
20:01 jenatali: FWIW Mesa can have non-drm drivers
20:02 sima: hm right
20:03 emersion: we have VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT_KHR, we can add VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_DRM_SYNCOBJ_BIT_KHR
20:03 sima: but any fd you get you could test-import as a drm_syncobj into a drm fd, and it'll reject it if it's something else
20:03 emersion: sima, this wouldn't work for import
20:04 sima: emersion, you need to export an fd first to check :-)
20:04 sima: I should probably go ^Z instead of coming up with idiot feature checks ...
20:04 emersion: yeah i don't want to build something hacky like this
20:47 penguin42: life would be a lot easier if 'profile' wasn't used as both a way of measuring performance and as a choice of configuration
20:58 jenatali: Amen
20:58 jenatali: You just switch to the profile profile
21:04 DemiMarie: sima: can one at least assume one can get a pollable FD out of a UMF? That is required to be able to use them cross-process.
21:36 alyssa: my profile pic is a pic of a profile profile
21:38 HdkR: profile inception
21:39 alyssa:flails but you only see the shadow of her side
21:40 Danct12: alyssa, congrats on your new job at valve!
21:40 alyssa: Danct12: cat's out of the bag it appears
21:40 alyssa:meows
21:43 HdkR: Tip: Don't put cat in bag
21:44 alyssa: nya?
21:44 ccr: =(^.^)=
22:59 DemiMarie: alyssa: congratulations on the job!
23:01 lumag: robclark, a stupid question. Whom should I ping for the review of a patch like https://patchwork.freedesktop.org/patch/527953/?series=115283&rev=2 ? I'd like to reiterate the series.
23:03 robclark: probably sima