00:14 memleak: I'm back, if PS/2 dies out I'll give that a shot next
00:18 memleak: brb
00:30 memleak: Heh, that was my bad.. I went to go enable the PS/2 keyboard driver and noticed CONFIG_INPUT_EVDEV was disabled..
00:31 psykose: i've done that before too! :-)
00:31 psykose: helps to view the full .config diff when you touch it..
00:31 memleak: I did a make tinyconfig and tuned for my hardware and missed one lol
00:31 memleak: I started over my config because I was jumping from 6.1 to 6.4
00:32 psykose: tbh there's no need to do that
00:32 psykose: just use the old one and make oldconfig and you get prompted for anything new
00:33 memleak: It wanted to restart my config every time though, so instead of doing make olddefconfig I just made a new one. Anyway, all is well now!
00:35 memleak: Sorry for the noise!
00:37 anholt: sergi: what happened that the piglit uprev bot didn't bump the other image tags in d75973a1422d86799312d7aa60d0dce846fb3dba ?
06:03 zzag: emersion: do you know what gbm_bo_import() does when importing a dmabuf buffer allocated by another gpu? would gbm_bo_import() start a transfer from another gpu or something? or would it just create a gbm_bo wrapper for the specified dmabuf attributes and that's all?
06:19 RAOF: zzag: *Mostly* you don't care? It's *mostly* transparent. (ie: if you try and import it as an EGL image and then sample from it the driver will do... a thing, and sampling will work)
06:22 RAOF: The "mostly" there is that it's almost certainly not going to let you scan out of that gbm_bo. I think all the bits necessary to make that possible are there, but nothing's hooked up (see my gbm_bo_import + ALLOW_MIGRATION proposal a couple of days ago)
06:23 zzag: RAOF: well, that still leaves me wondering about what gbm_bo_import() does. we have some multi-gpu code which I would like to change, but it makes some assumptions about what gbm_bo_import() does, e.g. starting a data transfer and allocating local storage on the gpu
06:23 zzag: and I wonder whether it's the right thing to do
06:24 RAOF: What are you using gbm_bo_import for?
06:25 zzag: Multi-gpu in compositor: render on one gpu, then gbm_bo_import() that buffer on another gpu and present it
06:25 RAOF: Because if you've got a dmabuf then you can probably just import it directly into your rendering API of choice and use it directly?
06:25 RAOF: Yeah, that's the thing that doesn't work.
06:26 RAOF: Unless by "present" you mean "sample from in your rendering on the GPU that's going to be displaying it"
06:26 zzag: I mean scanout the imported buffer
06:27 kode54: zehortigoza: your branch fixed the frame flipping glitches
06:27 RAOF: Unless I misread the code, that's not expected to work.
06:28 RAOF: Or, rather, gbm_bo_import(, USE_SCANOUT) will check that it's scanoutable, but not do anything to make that happen, and GPUs pretty much only scanout of device-local memory.
06:29 RAOF: (At least, that was the state the last time I tried to scanout of foreign dmabufs and we patched the drivers to return EINVAL at add_fb time rather than silently display black when you tried)
06:32 RAOF: If I am misreading the code I'd love to know, because I'd love that to work properly :)
06:39 zzag: RAOF: "but not do anything to make that happen" yeah, that's what I'm trying to understand. I see that i915 calls drmPrimeFDToHandle() and fills other internal data structures in i915_drm_buffer_from_handle. amdgpu seems to do the same but it also does some memory stuff with amdgpu_va_range_alloc+amdgpu_bo_va_op_raw (not sure what they really do) in amdgpu_bo_from_handle
06:47 emersion: i don't really know zzag
06:48 emersion: RAOF: where is your proposal?
06:48 RAOF: "proposal" might be a bit strong; it was in this channel, a couple of days ago.
06:51 RAOF: Basically, add an ALLOW_MIGRATION flag to gbm_bo_import, use the dri blit infrastructure to actually do the migration if necessary, and then some wondering about maybe plumbing fences through.
06:51 emersion: i'd really rather not have this
06:51 emersion: GBM is an allocation library, not a rendering library
06:52 emersion: blitting involves sending command buffers, handling synchronization, etc
06:53 emersion: plus it won't fly with e.g. minigbm
06:55 emersion: is using GL or Vulkan a big problem?
06:57 RAOF: It's not a big problem, it's just annoying and there seems like there must be a better way.
06:57 emersion: why is it annoying?
06:58 RAOF: I mean, mesa's gbm literally has access to a "make this work" function pointer :)
07:00 emersion: another reason:
07:00 RAOF: It's annoying because there's a whole bunch of setup, and a whole bunch of useless state that we hope gets mostly ignored, and it's harder to plumb explicit fences through.
07:00 emersion: compositors should try to import BOs only once
07:01 emersion: if you do a blit at import time, you need to import each frame
07:01 emersion: which is not very nice
07:01 emersion: well, it's setup you need anyways, since you're doing rendering when compositing
07:02 emersion: and explicit fences are in fact something which will be hard to get right with a GBM API, whereas it's already all there in GL/Vulkan
07:02 RAOF: Not if you're compositing on the other GPU, though.
07:02 emersion: i mean, you already have some kind of GL/Vulkan abstraction
07:02 emersion: for me it's just a few function calls to blit
07:02 emersion: in wlroots i mean
07:05 emersion: zzag: i think i asked MrCooper a while ago, and the answer was iirc something like this:
07:06 emersion: in theory nothing should happen on cross-device import, but some drivers might migrate the buffer to a different location
07:10 zzag: hmm, okay.. it seems like the best option is just to do blits and avoid gbm_bo_import because there are no concrete guarantees about how it works in the multi-gpu case
07:15 emersion: i'd really like a "please fail if you're going to migrate" kind of flag
07:16 emersion: client DMA-BUFs might come from anywhere, i want to try to do direct scan-out for these, but i don't want to start any kind of heavy work when doing that
08:41 karolherbst: I'm kinda not in the mood to deal with random CI fails (softpipe, zink on lavapipe and virpipe-on-gl): https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/901704
08:42 karolherbst: looks like there are failing tests and none of them have anything to do with my MR
09:31 MrCooper: zzag: curious how you plan to do blits between GPUs without gbm_bo_import or something equivalent :)
09:33 MrCooper: emersion zzag: a BO which is shared between devices has to be accessible by all of those devices; traditionally, this means the BO has to be in system RAM (which generally means scanout can't work with dGPUs)
09:34 MrCooper: on setups where PCIe P2P DMA works, the BO can stay in the exporter device's local memory in theory
09:35 MrCooper: (not sure scanout from another dGPU can work / is a good idea in that case)
09:58 zmike: karolherbst: at least some of these were noted as having been missed in the recent piglit uprev (did someone forget to stress?), but I don't know why the CI expectations haven't been updated
09:58 karolherbst: zmike: okay.. I added a commit to my MR to just add those fails: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23110/diffs?commit_id=0930eb21518b95e08dc5ac7cff87d0fa8c8959f8
10:01 zmike: lgtm
10:01 daniels: RAOF, zzag: I'd personally be furious if gbm_bo_import was silently allocating and blitting under the hood ...
10:01 daniels: I mean, almost all problems on mobile hardware come down to memory bandwidth, so I absolutely don't want to be adding _more_ memory bandwidth as a silent 'helpful' action
10:02 daniels: but then it also pretty much breaks the whole idea of dmabuf: the client holds one BO which it renders into, the server holds another BO which is an older shadow copy of content that used to be in the client's BO, and neither realises that they no longer refer to the same storage
10:03 daniels: I can see the call for something like a gbm_bo_copy() which would use a DMA engine or something if necessary, but that would _have_ to be an explicit alloc+copy step, not changing import to be a hidden alloc+copy step rather than the lightweight ref it is today (MMU/TLB overhead notwithstanding)
10:16 MrCooper: agreed
10:17 zzag: Okay, thank you all for the comments! :)
10:19 MrCooper: zmike: no image tag bump → piglit snapshot stays the same in images used by CI → no change in results
10:23 daniels: yeah, the piglit-uprev script did break after the great job renaming; there's a fix in there which actually produces the right image tag bump now, as well as making it error out loudly when it fails to substitute
10:24 karolherbst: soo.. should I just push the updated CI fail lists or should I wait on something else?
10:29 daniels: pls push
11:01 tintou: Hi there, if anyone with Gallium/pipe-loader knowledge wants to give it a look: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23054 it avoids a crash on context creation failure on my end :)
11:20 daniels: enunes: eric_engestrom reports that the lima runner is out of disk
11:22 enunes: daniels: I'll look at it now... I wonder how, given how many disk cleanup scripts already run
11:22 daniels: enunes: are you using docker or podman? if the latter, I pushed some changes fixing failures that I saw on our shared runners
11:23 daniels: if you're still back on docker, you'd have to look at what was failing and why ... they seem to change their API pretty frequently :\
11:23 enunes: I still have the docker setup, but I will gladly move to podman if that is supported now
11:28 enunes: daniels: where did you push fixes that run on podman setups? something I need to pull and run locally too?
11:29 enunes: runners should be good for now
11:29 pq: zzag, in the end, you'll probably end up implementing all possible variations of how to get images from a GPU to a KMS device, and then you have to wonder how to pick the combination that not only works but is also performant.
11:29 daniels: enunes: it's all in https://gitlab.freedesktop.org/freedesktop/helm-gitlab-infra/-/commits/main/gitlab-runner-provision
11:29 daniels: eric_engestrom: ^ no need to pause the runner if you haven't already done so
11:29 daniels: enunes: thanks!
11:29 pq: zzag, reminds me of https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/810 - I wonder what Mutter does nowadays.
11:33 alyssa: austriancoder: congrats on the Igalia gig :~)
11:50 zamundaaa[m]: pq: we already have most variations. Atm we do gbm_bo_import -> egl with blit -> CPU copy. The first one that works gets chosen
12:06 pq: zamundaaa[m], which device do you have executing the blit?
12:08 zamundaaa[m]: the target device
12:09 pq: zamundaaa[m], do you make sure the target device is not software-rendering?
12:09 zamundaaa[m]: yes
12:09 pq: cool
12:10 zamundaaa[m]: If you're asking that way, I gotta ask back: does blitting with the source device have advantages over doing this?
12:11 emersion: i don't think blitting with the source device works
12:11 emersion: it would essentially render to a foreign buffer
12:11 emersion: (maybe it works in some cases? not sure)
12:11 * zamundaaa[m] goes and tests it
12:11 pq: the only fairly special case: a source device blit could be faster than a CPU copy into a dumb buffer, if that can be accessed at all.
12:13 pq: I'm thinking iGPU as source device, and virtual device as target.
12:15 pq: but in the DisplayLink case with iGPU, the zero-copy with source-allocated buffer worked.
12:16 pq: so maybe source device blit doesn't have use in practice
12:19 pq: There is glBlitFramebuffer or something that may not have to be the same as "rendering" hardware-wise.
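A rough sketch of the "blit on the target device" path being compared here, assuming the EGL_EXT_image_dma_buf_import entry points are already loaded and a GL context on the target device is current; the function name and FBO parameters are made up for the example:

```c
/* Sketch: wrap the client dmabuf in an EGLImage on the display GPU and
 * glBlitFramebuffer it into a locally allocated (scanout-capable) buffer.
 * The copy executes on the target GPU. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES3/gl3.h>
#include <GLES2/gl2ext.h>

void blit_dmabuf_on_target(EGLDisplay dpy,
                           PFNEGLCREATEIMAGEKHRPROC create_image,
                           PFNEGLDESTROYIMAGEKHRPROC destroy_image,
                           PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_tex,
                           int dmabuf_fd, int width, int height,
                           int fourcc, int stride,
                           GLuint src_fbo, GLuint dst_fbo)
{
    const EGLint attribs[] = {
        EGL_WIDTH, width,
        EGL_HEIGHT, height,
        EGL_LINUX_DRM_FOURCC_EXT, fourcc,
        EGL_DMA_BUF_PLANE0_FD_EXT, dmabuf_fd,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
        EGL_DMA_BUF_PLANE0_PITCH_EXT, stride,
        EGL_NONE
    };
    EGLImageKHR img = create_image(dpy, EGL_NO_CONTEXT,
                                   EGL_LINUX_DMA_BUF_EXT, NULL, attribs);

    /* Back a texture with the imported image and use it as the read FBO. */
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    image_target_tex(GL_TEXTURE_2D, (GLeglImageOES)img);
    glBindFramebuffer(GL_READ_FRAMEBUFFER, src_fbo);
    glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0);

    /* dst_fbo wraps a buffer allocated on the target device. */
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, dst_fbo);
    glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                      GL_COLOR_BUFFER_BIT, GL_NEAREST);

    glDeleteTextures(1, &tex);
    destroy_image(dpy, img);
}
```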
12:45 mareko: karolherbst: I don't know if the vectorization patch works
12:50 karolherbst: ahhh
12:50 karolherbst: mareko: well, it does seem to vectorize the loads we've discussed on the issue
12:52 karolherbst: but it would also be nice to know if any GL workloads are impacted here
13:49 alyssa: jenatali: So, I have a lowering pass to convert imageLoad to txf
13:49 alyssa: The problem is... that's not legal in NIR
13:49 alyssa: imageLoad has access qualifiers and txf doesn't
13:50 alyssa: this is causing a fail in KHR-GLES31 which has a test doing
13:50 alyssa: imageLoad()
13:50 alyssa: imageStore()
13:50 alyssa: barrier()
13:50 alyssa: imageLoad()
13:50 alyssa: as images, NIR knows better than to CSE the loads
13:50 alyssa: as txf, NIR will CSE and then get stale data
13:50 alyssa: I could do the lowering Late(TM) but that seems like a hack
13:50 jenatali: Yep, sounds about right
13:51 alyssa: you get around this by doing it only for readonly I guess
13:51 jenatali: Right
13:51 alyssa: Hmm
13:51 alyssa: I definitely need this for read/write
13:51 alyssa: So my options are either do this As Late As Possible and hope it's late enough
13:51 jenatali: A read-only image can be equivalent to a texture
13:52 alyssa: or just keep the image_load and emit the txf internally in the backend at NIR->backend IR time
13:52 jenatali: Yeah the latter is what I'd do here
13:52 alyssa: cool and good
13:52 alyssa: shouldn't be terrible I think
13:52 alyssa: will do
13:52 alyssa: thanks for the input
13:53 alyssa: also why are you online isn't it stupid early for you
13:53 jenatali: Both txf and image load map to the same DXIL intrinsic, so it's effectively what I do too
13:53 jenatali: I have a baby who wakes up at 5am
13:53 alyssa: womp
13:53 jenatali: I only do the txf translation because DXIL also cares about variable types, and an image load pointing to a texture is weird
13:55 jenatali: Oh and FYI soon enough I'll have another baby so I probably won't be around much for a few months after that
13:56 alyssa: congrats :)
13:56 jenatali: I'll still be on IRC thanks to the matrix bridge though so if you need me I'll at least hear about it
13:58 alyssa: =D
14:00 austriancoder: alyssa: thx
14:15 alyssa: jenatali: ugh, so the reason this is sticky is that I have a lot of txf lowerings
14:15 alyssa: so it would end up getting duplicated
14:15 alyssa: might be a lesser evil, but
14:17 jenatali: Like what? Just curious
14:17 alyssa: slice and dice of the sources into the order the hardware wants as backend1/backend2
14:18 jenatali: Ah fun
14:19 jenatali: Then maybe the very late option, after all CSE, is the right approach
14:19 alyssa: this sounds dodgy
14:22 alyssa: I guess duplicating the slice and dice isn't so bad
14:24 DemiMarie: alyssa: is this in C or in Rust? In Rust pattern matching should be able to make short work of stuff like this.
14:25 DemiMarie: Also Mesa would be a good use for all of Linux’s work on fallible Rust allocations.
14:32 DemiMarie: And unlike Mesa’s C code, the Rust code actually has a chance of recovering from them, at least assuming the OS doesn’t kill it.
14:37 zamundaaa[m]: > * <@zamundaaa:kde.org> goes and tests it
14:37 zamundaaa[m]: so I did, and it *kinda* works with Intel + NVidia
14:37 zamundaaa[m]: That is, importing the buffer and rendering to it seems to work fine, until I do glFlush, where it just hangs indefinitely...
14:37 zamundaaa[m]: probably not worth investigating more
14:40 MrCooper: that was a buffer from nvidia imported into an intel context?
14:40 zamundaaa[m]: yes
14:41 MrCooper: not sure how the buffer provenance would make a difference for glFlush; do you have a backtrace where it hangs?
14:43 MrCooper: hmm, one possibility might be the nvidia buffer's implicit sync object having a fence which never signals, there's currently a known nvidia bug like that
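One way to check that theory without a Mesa debug build might be to export the dmabuf's implicit fence as a sync_file and poll it; a sketch, assuming a kernel new enough to have DMA_BUF_IOCTL_EXPORT_SYNC_FILE:

```c
/* Sketch: export the fence that pending writers of the dmabuf must signal
 * before it can be read, then poll it; a never-signaling fence shows up as a
 * poll timeout. */
#include <linux/dma-buf.h>
#include <poll.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

static void check_implicit_fence(int dmabuf_fd)
{
    struct dma_buf_export_sync_file req = { .flags = DMA_BUF_SYNC_READ };
    if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &req) < 0) {
        perror("DMA_BUF_IOCTL_EXPORT_SYNC_FILE");
        return;
    }

    struct pollfd pfd = { .fd = req.fd, .events = POLLIN };
    int ret = poll(&pfd, 1, 1000 /* ms */);
    if (ret == 0)
        fprintf(stderr, "implicit fence still unsignaled after 1s\n");
    else if (ret > 0)
        fprintf(stderr, "implicit fence signaled\n");

    close(req.fd);
}
```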
14:44 eric_engestrom: daniels, enunes: I didn't pause the runner, I was in a meeting and didn't see your "ok" on doing it until I also saw the "no need anymore" ^^
14:44 eric_engestrom: enunes: thanks for cleaning it!
14:46 enunes: eric_engestrom: no problem, hopefully that should not be regularly needed... I'll plan some maintenance to switch to a podman runner with the new better cleanup script
14:58 zamundaaa[m]: MrCooper: I don't have a debug build of Mesa installed on that laptop right now, so all I can say is that under a few layers of iris_dri.so it's stuck in an ioctl
15:05 alyssa: jenatali: k, typed the thing out
15:05 alyssa: realized I definitely need to distinguish them, because even my backend does CSE (and soon scheduling), so it needs to know if a given txf can be reordered or not
15:06 DemiMarie: zamundaaa: Is this nouveau or proprietary?
15:07 zamundaaa[m]: proprietary
15:08 DemiMarie: I did not realize the proprietary driver supported dmabufs. I thought that was all EXPORT_SYMBOL_GPL.
15:12 zzag: There's an open source (kinda) nvidia driver
15:12 zzag: https://github.com/NVIDIA/open-gpu-kernel-modules
15:14 DemiMarie: Would it be best to write a new driver from scratch?
15:39 karolherbst: besides that being a massive amount of work, yes
16:35 emersion: vulkan YCbCr question: is it a big deal to use a nearest filter for implicit chroma reconstruction?
16:35 emersion: or is linear really better?
16:35 emersion: i'm talking about VkSamplerYcbcrConversionCreateInfo.chromaFilter
16:36 emersion: maybe you know gfxstrand?
16:51 dj-death: emersion: this is normally controlled by VkSamplerYcbcrConversionCreateInfo::chromaFilter
16:51 emersion: sure, but what is the effect?
16:52 emersion: how bad is nearest compared to linear?
16:52 emersion: is it reasonable to use nearest?
16:53 dj-death: never tried it
16:53 dj-death: visually I mean
16:54 dj-death: if I remember the CTS verifies that you're using the right filter
16:58 emersion: yeah, and intel disables it because of a bug
16:58 emersion: disables the linear one
17:08 dj-death: for one format
17:10 emersion: for single-plane YUV formats yeah
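For reference, the field being discussed sits in VkSamplerYcbcrConversionCreateInfo; a minimal sketch with placeholder format, color model and chroma siting (only chromaFilter is the point here):

```c
#include <vulkan/vulkan.h>

/* Sketch: create a YCbCr conversion that uses nearest chroma reconstruction,
 * i.e. the nearest stored chroma sample is picked when upsampling the
 * subsampled planes instead of interpolating between samples. */
static VkSamplerYcbcrConversion
create_nearest_chroma_conversion(VkDevice device)
{
    VkSamplerYcbcrConversionCreateInfo conv_info = {
        .sType = VK_STRUCTURE_TYPE_SAMPLER_YCBCR_CONVERSION_CREATE_INFO,
        .format = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, /* e.g. NV12 */
        .ycbcrModel = VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709,
        .ycbcrRange = VK_SAMPLER_YCBCR_RANGE_ITU_NARROW,
        .components = {
            VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY,
            VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY,
        },
        .xChromaOffset = VK_CHROMA_LOCATION_COSITED_EVEN,
        .yChromaOffset = VK_CHROMA_LOCATION_COSITED_EVEN,
        .chromaFilter = VK_FILTER_NEAREST, /* vs VK_FILTER_LINEAR */
        .forceExplicitReconstruction = VK_FALSE,
    };

    VkSamplerYcbcrConversion conv = VK_NULL_HANDLE;
    vkCreateSamplerYcbcrConversion(device, &conv_info, NULL, &conv);
    return conv;
}
```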
17:11 alyssa: Mesa: error: GL_INVALID_VALUE in glDeleteProgram
17:11 alyssa: CTS, chill
17:27 emersion: hwentlan_: do you have a plan already to merge the HDR patches? do you want to go through the AMD tree, or drm-misc?
17:28 hwentlan_: was thinking of taking the DRM patches through drm-misc, then merging the whole thing, including amdgpu patches through the AMD tree
17:29 emersion: hm, not sure it's a good idea to merge the DRM part via drm-misc
17:29 emersion: will ask danvet
17:30 sima: if it's just for a single tree usually just an ack for the drm parts is enough
17:30 hwentlan_: any reason? I don't mind taking everything through the AMD tree either, but I wouldn't want the DRM patches to go stale and cause merge conflicts if they're sitting in the AMD tree for a while
17:30 sima: it'll come back pretty quickly through the drm.git pulls anyway
17:31 sima: hwentlan_, well if you sneak them in still now before merge window freeze it's quick
17:31 sima: otherwise you kinda have a bit of trouble anyway since drm-next isn't open until about -rc2 again
17:31 hwentlan_: sima, sneak them through drm-misc?
17:31 sima: and doesn't matter whether the drm patches are stuck in drm-misc or amdgpu during that time I think
17:31 sima: hwentlan_, all through amdgpu and make sure agd5f still does a pull?
17:32 hwentlan_: in that case I guess might as well just take the whole set through the amdgpu tree
17:32 sima: I mean assuming it's all ready and all, I've been drowning terribly the last few weeks so no idea :-/
17:32 hwentlan_: I see
17:33 hwentlan_: I think they're finally ready for merge. Was going to give people an extra day before merging but if that means we'll miss the merge window I'll merge them today
17:34 emersion: yeah the DRM core part is ready for merging now
17:34 sima: hwentlan_, check with agd5f I guess
17:34 emersion: well, let me know if you want me to merge via drm-misc
17:46 agd5f: hwentlan_, sima was planning to do one more -next PR this week
17:46 agd5f: probably friday
17:46 hwentlan_: I'll merge it to amd-staging-drm-next today
21:11 karolherbst: jenatali, gfxstrand: I'm trying to figure out what's wrong with the spec id parsing inside clc. My understanding is that multiple values can be assigned the same spec constant id, right? clc atm asserts that there is a strict 1:1 relation between values and spec constants.
21:11 karolherbst: clc_helpers.cpp:340
21:11 karolherbst: and I think we can just remove that check
21:11 jenatali: Sounds reasonable without looking at it
21:11 karolherbst: yeah, but you also added that assert so I kinda want to understand why :)
21:12 jenatali: That was a long time ago, I really don't remember
21:12 karolherbst: okay
21:12 karolherbst: guess I'll just submit an MR ditching it
21:13 karolherbst: I ran into this assert with SPIR-Vs I get from HIP and SYCL
21:21 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23512
22:14 mattst88: anyone up for a pretty easy review? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23482
22:16 airlied: always happy to keep sparc64 going
22:17 mattst88: I would have used an alpha, but the sparc was faster
22:17 airlied: karolherbst: your rusticl MR needs to drop the softpipe CI change, since it came in via another MR and rebase duplicated it
22:22 daniels: airlied: thought you were all about the VAX
22:26 airlied: daniels: I had a brief period of putting Linux on university sparcs when nobody was looking :-P
22:29 daniels: I had a brief SH4 period of putting Linux on my Dreamcast, then realising I had nothing to do with it and didn't have a year to compile anything so just went back to Tony Hawk 2
22:31 airlied: the VAX at least had interesting but pointless problems to solve :-)
22:35 idr: SPARC was faster? mattst88, shut your mouth.
22:35 idr: Lololololol
22:36 mattst88: idr: gentoo's sparc devbox is 2x 16-core/32-thread CPUs @ 3.6 GHz with 512 GB of RAM :)
22:37 mattst88: it's a biiiiiit faster than my dual 1.15 GHz Alpha, and I don't have to pay for its electricity :)
22:37 idr: I sometimes forget that they keep making more advanced SPARC CPUs.
22:38 mattst88: I think most people have, including maybe oracle themselves
22:38 idr: Ha!
22:40 karolherbst: airlied: CI failed anyway :)
22:42 daniels: karolherbst: ergh yeah, the duplicate-case thing is pain; really wish we had a linter for that
22:43 karolherbst: well.. a job failed saying a line is there twice, but yeah.. would be nice to know that earlier
22:44 daniels: right, I mean something that could either be run locally or in a fast-fail check (like the existing 'sanity' job) to avoid bouncing through marge
22:45 karolherbst: yeah.... guess that could help with taking some load off CI, because it kinda is overloaded a lot of the time these days :'(
22:45 airlied: at some point, if we had a job fail, didn't we blow the pipeline up so marge would notice? or was I dreaming?
22:46 airlied: oh, maybe that was a dream; you always had a chance to retry a job before the deadline
22:49 daniels: mmm
22:50 daniels: airlied: if all jobs in the pipeline complete and one or more fails, then marge does give up and walk away - it's not just sleep(3600); check();
22:50 daniels: if one job fails and others are still running, you've got a window to retry - but we did also add in automatic retries anyway
22:51 daniels: that has the side effect of hiding some flakes, but realistically everyone just smashed it straight back at marge regardless, so it's mostly an efficiency gain