IRC Logs of #dri-devel on irc.freenode.net for 2023-07-10

00:08 benjaminl: so I'm debugging some CTS failures in NVK, and I'm pretty sure I've determined that the gpu is entirely skipping global ld instructions on fragment shader helper invocations
00:08 benjaminl: looking through envytools and the ptx docs, the only thing I can find that's supposed to behave like this is texture instructions that have 'liveOnly = 1' set
00:09 benjaminl: anybody know if this is a known/expected thing?
00:29 airlied: benjaminl: yes there is a kernel path
00:29 airlied: patch that has been floating around for ages
00:29 benjaminl: oh, lol
00:30 benjaminl: I would not have expected this to be a kernel thing :)
00:30 airlied: https://gitlab.freedesktop.org/karolherbst/nouveau/-/commit/10a2421f1e44b2ff6050a2c94c9622410c47d3f8
00:30 airlied: not sure if that is the latest patch actually
00:31 airlied: ah https://gitlab.freedesktop.org/nouvelles/kernel/-/commit/bddc8d861bff653f4cb0cb004ed0b454a0ab8451 is an older version
00:34 benjaminl: thanks!
00:35 benjaminl: for doing NVK development, is it generally a good idea to use that 'nouvelles/kernel' tree so that I don't run into this kind of thing?
00:35 benjaminl: not sure how much stuff is fixed there
00:48 airlied: nope
00:48 airlied: there isn't really one tree there, it's just a bunch of works towards getting a new user api
00:50 airlied: I think most people are just running a tree with that patch, also #nouveau or discord is where more talking happens
01:18 benjaminl: airlied: good to know. I didn't realize there was a discord, do you have a link?
01:29 airlied: https://discord.gg/8wDvvHuw
01:29 airlied: I think that should work, not really a discord expert :-P
01:33 benjaminl: seems to work, thanks!
07:16 pcercuei: Is drm-misc-next periodically merged into linux-next?
07:36 Danct12: pcercuei: pretty much
08:16 dottedmag: Has anyone got ccls or clangd working on Mesa repository (or any other LSP server)? ccls just hangs for me, and clangd does not understand #if properly.
08:34 pq: airlied, I'm sure the email I sent was not addressed to your skynet.ie address, yet googlemail still complains to me that it cannot deliver to skynet.ie. Do you know what that is about?
08:35 pq: airlied, it was addressed to your linux.ie though.
08:36 airlied: yeah linux.ie is mostly dead
08:36 pq: should I just drop it from the cc list when I see it?
08:55 dottedmag: Do non-DRI backends for libgbm exist? Open source?
08:56 emersion: dottedmag: there is minigbm but AFAIK it's still a full reimpl instead of a backend
08:56 emersion: apart from that, closed-source NVIDIA
08:56 dottedmag: Got it, thanks
08:57 emersion: there may be others, but i'm not aware of any
09:03 pq: tnt, if your desktop environment does not automatically react to hot-unplug by disabling the unplugged output, I might even call that expected, though not good, behaviour: a video signal can conceptually be emitted into a disconnected connector, and I'd guess that usb-c in HDMI(?) mode might not do anything else at the same time.
09:07 tnt: pq: there is no DE basically raw X almost. But thing is, I don't want the output to get disabled/off because then windows that were there will be moved to the remaining active output ... Ideally I just want them to not be visible until I either reconnect or I explicitly turn it off. And that's what it does on my old laptop using the 'intel' DDX. The output would be turned off but then everything that was on it would end up
09:11 pq: dottedmag, did you arrange clangd to find compile_commands.json for your build dir? I usually just symlink that file from build dir to project root, but never tried with Mesa.
09:12 dottedmag: pq: I did not, I will try. I guessed it picked it up by itself, as _some_ of things worked.
09:12 pq: dottedmag, I think it can guess a lot, but not the build config.
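A minimal sketch of the symlink approach pq describes, assuming the build directory already contains a compile_commands.json (Meson writes one there). The function name and the directory arguments are hypothetical; nothing here is specific to Mesa.

```python
import os


def link_compile_commands(build_dir, project_root):
    """Symlink build_dir/compile_commands.json into project_root.

    Returns the path of the symlink that was created.
    """
    src = os.path.join(os.path.abspath(build_dir), "compile_commands.json")
    dst = os.path.join(project_root, "compile_commands.json")
    if not os.path.isfile(src):
        raise FileNotFoundError(src)
    # Only replace a stale symlink; a regular file at dst is left alone
    # and makes os.symlink raise FileExistsError.
    if os.path.islink(dst):
        os.remove(dst)
    os.symlink(src, dst)
    return dst
```

Called as link_compile_commands("build", "."), this is equivalent to running ln -s on the file from the project root, where clangd looks for it.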
09:13 dottedmag: I'm trying to navigate Mesa from MacOS, and that means libgbm is disabled, so it does not even appear in compile_commands.json :-/ C preprocessor is a gift that keeps on giving... headaches.
09:17 pq: tnt, I can imagine how that could be caused by intel vs. modesetting DDX driver, but I don't know for sure. (It seems your long irc lines might be getting truncated.)
09:20 tnt: pq: yeah, probably, but other drivers (nvidia and amd) have a not far off behavior for the hdmi/dp output. They don't move it to a VIRTUAL output, but they just say disconnected but then they correctly detect when I replug ...
09:20 tnt: which here isn't the case for the intel driver.
09:20 dottedmag: I think I understand now how libgbm works. Creating a buffer delegates to Mesa src/loader that loads a dri/<driver>_dri.so from Mesa and calls create_resource_with_modifiers in that driver. Is this a correct rough approximation?
09:21 dottedmag: (and yes, there is a shortcut in libgbm to create a dumb buffer, and loadable backends in there etc)
09:22 emersion: yes
09:22 emersion: then each driver has a driver-specific IOCTL to allocate memory
09:24 dottedmag: Thanks.
09:34 dottedmag: I wonder why *_dri.so drivers reference each other: http://paste.debian.net/plain/1285467
09:34 karolherbst: dottedmag: because it's all the same binary
09:35 dottedmag: Ahh, right.
09:35 dottedmag: Thanks
09:35 karolherbst: yeah.. mesa simply hardlinks and the file name is picked up by the loader more or less
09:54 javierm: tzimmermann: nice work chasing down that amdgpu bug
09:54 tzimmermann: javierm, thanks
09:54 tzimmermann: of course, it's actually trivial to fix
09:55 javierm: tzimmermann: yeah, but wasn't evident. I spent like an hour looking at the code the other day with you and couldn't spot the issue
09:55 tzimmermann: i only got it when i single-stepped through the provided logs
09:56 tzimmermann: and the bug has been introduced 4 yrs ago
09:56 tzimmermann: thanks for reviewing
09:56 javierm: tzimmermann: you are welcome
12:08 javierm: tomba: hi, are you planning to push https://lore.kernel.org/all/be2c4c02-43bc-5b16-2162-b8ace8d34996@ideasonboard.com/ to drm-misc ?
12:46 Hazematman1: Is something going on with the virgl-iris-traces CI job? seems to be giving bad file descriptor errors then vcpu requested shutdown https://gitlab.freedesktop.org/Hazematman/mesa/-/jobs/45190903
13:01 DavidHeidelberg[m]: Hazematman: hmm, I noticed it was failing after Debian 12 uprev, but I see there is more info in the logs now
13:03 karolherbst: dcbaker: sooo.. there is a weirdo annoying rustup + meson problem. I suspect that meson checks if rustc changed, but with rustup it might not, even if the binaries get updated :')
13:28 tintou: anholt: Hi there, If you have some time can you give another look at https://gitlab.freedesktop.org/anholt/deqp-runner/-/merge_requests/57 ?
13:50 alyssa: kusma: I wonder if we should revisit -Wno-unused-function
13:51 alyssa: AFAICT, the actual issue is that gcc ignores unused static inline functions, but clang does not
13:52 alyssa: mesa style is mostly "if you have functions in a header, they should be inline", so..
13:52 alyssa: I kinda want to enable -Wunused-function but only on gcc? because warning about legitimately unused, non-inline functions can be useful
13:55 kusma: alyssa: If you can get things working without a ton of warnings, go for it IMO.
13:55 kusma: My motivation was to make the clang logs readable, I think
13:56 alyssa: oh IDK if I was volunteering I just needed to vent that somewhere
13:56 alyssa: now that is vented I no longer care :~D
13:56 alyssa: back to work
13:56 alyssa: jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23735#note_1994528
13:57 jenatali: Ack
13:58 jenatali: In an ideal world I'd debug it but I don't have time atm
13:59 jenatali: I really need to actually debug all of the failures in that baseline and get issues filed on them...
14:01 alyssa: Yea..
14:01 alyssa: I don't think any driver passes piglit.
14:01 jenatali: Yeah but we have a lot more fails than most
14:01 alyssa: yeah, fair
14:39 alyssa: in practice, how many patches do apps draw with tess?
14:40 alyssa: like, is the expectation that you draw just a few patches but they generate 1000s of vertices?
14:40 alyssa: I'm trying to figure out the memory budget
14:41 alyssa: with the obvious limits and packing, you get a worst case of ~80KiB per patch for the tessellator output
14:41 alyssa: If the app draws a dozen patches, that's totally fine
14:41 alyssa: if the app draws 100,000 patches... that's less ok
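The budget arithmetic above, as a quick sketch. The only input taken from the log is the ~80 KiB worst case per patch; the function name is hypothetical. With that figure, a dozen patches need a little under 1 MiB, while 100,000 patches need roughly 7.6 GiB.

```python
KIB = 1024
MIB = 1024 * KIB

# Worst-case tessellator output per patch, the ~80 KiB figure quoted above.
WORST_CASE_PER_PATCH = 80 * KIB


def tess_output_bytes(num_patches, per_patch=WORST_CASE_PER_PATCH):
    """Worst-case bytes of tessellator output for a draw of num_patches."""
    return num_patches * per_patch


for patches in (12, 100_000):
    total = tess_output_bytes(patches)
    print(f"{patches} patches: {total / MIB:.1f} MiB worst case")
```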
14:51 Company: yesterday I played with the GTK Vulkan stuff on my desktop, and my AMD gave me these memory properties: https://gitlab.gnome.org/-/snippets/5909 - and that got me wondering about how I'm meant to pick the memory type to put my textures into
14:51 Company: because there's only 2 heaps - GPU and CPU memory I suppose - but 11 different types
14:52 Company: do I just pick the first one that has the flags I want? Do I look for an exact match?
14:53 Company: why are there even 11 types - I assume it's not just to confuse people
15:02 MrCooper: Company: it's mainly about the heap (device-local VRAM vs system memory) & propertyFlags, not sure what the types 'usable for' nothing are about though
15:04 Company: what's the technical difference between those types, say types 0 and 3, the DEVICE_LOCAL vs DEVICE_LOCAL | HOST_VISIBLE?
15:04 Company: I suppose one does extra work to set up page table entries or something?
15:05 MrCooper: both are accessible by the GPU; the latter is directly accessible by the CPU as well, the former not
15:05 Company: yeah, I got that
15:06 Company: but why would I ever choose the one with less features?
15:06 CounterPillow: because it's potentially faster
15:06 Company: so my assumption should be fewer features = faster?
15:06 MrCooper: some device-local VRAM can be physically inaccessible to the CPU due to PCIe BAR restrictions
15:07 CounterPillow: Your assumption should be "pick what I actually need"
15:07 Company: MrCooper: shouldn't that end up in 2 heaps then?
15:07 MrCooper: dunno
15:08 Company: CounterPillow: so that means "don't pick the first one that includes the flags I want, instead search for the memory type that's an exact match" I guess?
15:09 CounterPillow: I'd look for an exact match, and then widen from there, but I'm no expert on this at all (someone who writes Vulkan drivers should probably chime in)
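A sketch of the "exact match, then widen" search CounterPillow suggests, written as plain Python rather than real Vulkan calls. The flag values are the VkMemoryPropertyFlagBits constants; the function name and the four-entry type table are hypothetical (the table only loosely mirrors types 0 and 3 from the discussion). The type_bits argument stands for the memoryTypeBits mask from the resource's memory requirements, which any selection has to respect.

```python
# VkMemoryPropertyFlagBits values.
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2
HOST_COHERENT = 0x4
HOST_CACHED = 0x8


def pick_memory_type(memory_types, type_bits, wanted):
    """Return the index of a suitable memory type, or None.

    memory_types: list of propertyFlags, one entry per memory type index.
    type_bits:    bitmask of type indices the resource may use.
    wanted:       property flags the caller actually needs.
    """
    allowed = [i for i in range(len(memory_types)) if type_bits & (1 << i)]
    # Pass 1: a type whose flags are exactly what was asked for.
    for i in allowed:
        if memory_types[i] == wanted:
            return i
    # Pass 2: widen to the first type that has at least the wanted flags.
    for i in allowed:
        if (memory_types[i] & wanted) == wanted:
            return i
    return None


# Hypothetical table for illustration only.
EXAMPLE_TYPES = [
    DEVICE_LOCAL,                                 # type 0
    HOST_VISIBLE | HOST_COHERENT,                 # type 1
    HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,   # type 2
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,  # type 3
]
```

With this table, asking for DEVICE_LOCAL alone returns type 0 rather than the CPU-visible type 3, which matches the "pick what I actually need" advice.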
15:09 Company: I suppose making it 2 heaps is bad because if you only care about DEVICE_LOCAL and only use type 0, then you'd be locked out of the 2nd heap
15:09 MrCooper: something to keep in mind is that any CPU reads (even implicitly for read-modify-write) from device-local VRAM will send performance down the drain
15:11 Company: well yeah, I'm not trying to do fancy stuff (like revive cairo-gl ;))
15:11 Company: I'm just trying to understand what's going on so I don't do stupid things
15:12 Company: I was quite proud that the Vulkan stuff I did on intel only wasn't just working flawlessly on this AMD, it was also fast
15:13 MrCooper: well, that's one stupid thing to avoid :) in general, it's safer to stay away from device-local VRAM if there's any CPU access, unless you're 100% sure it's one of the rare cases where it's slightly faster than system memory
15:17 Company: my assumption so far was to only vkMapMemory() if it's HOST_CACHED
15:18 MrCooper: certainly if there's a non-0 chance of any CPU reads
15:18 MrCooper: if not, uncacheable might be slightly faster
15:19 Company: the question is what to do about texture uploads
15:20 Company: do I use a custom buffer and CopyToImage()
15:20 Company: or do I map the image's memory directly?
15:21 Company: (is "upload" the term used for copy-to-vram even? Or is that just something we use in GTK?)
15:22 emersion: "upload" is somewhat commonly used
15:27 mattst88: karolherbst: did you make that MR?
15:28 karolherbst: yes
15:29 mattst88: cool, thanks. I have missed it :)
15:29 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23981
15:29 mattst88: thanks!
15:51 jani: sima: tzimmermann: mripard: mlankhorst: more acks on https://lore.kernel.org/r/20230706092850.3417782-1-jani.nikula@intel.com ?
15:52 sima: jani, I think it was only ever drm-intel that benefitted from the integration testing
15:52 sima: but yeah a-b: me
15:52 jani: sima: thanks
16:31 tzimmermann: jani, i'm not involved in any of the sound code. do what ever makes the most sense
17:29 dcbaker: karolherbst: Yeah, I've looked into it. There's not much we can do on the Meson end because of the way rustup is designed. They wrap all of the tools, and that wrapper never gets updated, so from ninja's point of view nothing has changed. This would require some kind of rustup change, either in the wrapper itself, or with some sort of file ninja can watch. Say, a consistent file on disk that just contains the toolchain version
17:31 karolherbst: dcbaker: mhh.. I suspect that's not feasible because rustup also allows you to set per directory overrides
17:32 karolherbst: there is a ~/.rustup/settings.toml file which encodes all of this, but things might change after the ninja project was created
17:36 dcbaker: I mean, in the worst case if there was some kind of token locally like .rustup_version we could check the sourcedir root for that, but for this to be at all reliable it would have to be something that Meson and Rust agreed on
17:37 karolherbst: I suspect a global `rustc --version` check is out of question?
17:38 karolherbst: could you have an always running target writing that output to a file and ninja only retriggers job based on the content changing?
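A rough sketch of the stamp-file idea karolherbst floats here: a step that runs on every build, but only rewrites the file when the version string actually changed, so the file's mtime stays put otherwise. This is not an existing Meson or rustup feature; the function names and the stamp file name are hypothetical, and the step itself still has to execute on each build.

```python
import subprocess


def write_if_changed(path, content):
    """Write content to path only when it differs; return True if written."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == content:
                return False  # unchanged: leave the file and its mtime alone
    except FileNotFoundError:
        pass
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True


def stamp_rustc_version(path="rustc_version.stamp"):
    """Record the output of `rustc --version` in a stamp file.

    Assumes rustc is on PATH; the default file name is made up.
    """
    out = subprocess.run(
        ["rustc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return write_if_changed(path, out)
```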
17:38 Company: the GALLIUM_THREAD env var isn't documented on https://docs.mesa3d.org/envvars.html - is that by design?
17:41 bl4ckb0ne: thanks again for the feedback on !18847 zmike, sorry it took so long to answer
18:57 dcbaker: karolherbst: that makes a no-op rebuild suddenly not very no-op. In the past Jussi and others have been very much of the opinion "no-op builds *must* be no-op, unless a project explicitly creates an always_dirty target"
18:58 dcbaker: I wonder if this is something that I could convince rustup to do for us... I can file a bug and see what they think
19:11 airlied: pq: yes drop it on sight
20:19 karolherbst: airlied: okay.. I've played a bit more around with non uniform workgroups and I think I'm almost done on llvmpipe with this. The only remaining issue is that simply the last block is messed up: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e5f5b00558a3ae4b8522eaf44f6a87838a889c82
20:20 karolherbst: like if 128 is the normal block size, 127 the last block and you have 16 blocks, the last block doesn't run from id 1920 (15 * 128), but rather from 1905 (15 * 127)
20:22 karolherbst: but I think this will require changing the llvmpipe shader ABI?
20:23 karolherbst: though I think this can easily be done by providing both values: block and last_block and the offsetting is always done with block or something...
20:23 karolherbst: dunno... didn't really look into how all that actually works yet
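The offset problem described above, worked through with the numbers from the log (16 blocks, normal size 128, last block 127, so 2047 items in total). This is only the arithmetic, not llvmpipe code; the function names are hypothetical. The base id of every block, including the last, has to come from the normal block size, while only the item count of the last block shrinks.

```python
def block_base_ids(total_items, block_size):
    """First global id of each block; the last block may be smaller."""
    num_blocks = -(-total_items // block_size)  # ceiling division
    # Offsets always use the normal block size, never the last block's size.
    return [b * block_size for b in range(num_blocks)]


def last_block_size(total_items, block_size):
    """Number of items in the final, possibly partial, block."""
    rem = total_items % block_size
    return rem if rem else block_size


total = 15 * 128 + 127  # 2047 items
bases = block_base_ids(total, 128)
print(len(bases), bases[-1], last_block_size(total, 128))
# The behaviour reported in the log corresponds to 15 * last_block_size instead.
print(15 * last_block_size(total, 128))
```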
20:57 karolherbst: mhhhh
20:57 karolherbst: this is all part of system value lowering...
20:59 dcbaker: karolherbst: https://github.com/rust-lang/rustup/issues/3403. Let's see what they say
20:59 karolherbst: "non_uniform_1d_basic passed" mhh
21:02 Mershl[m]: [radv] We're currently seeing a GPU hang on Baldur's Gate 3 during startup (Vulkan). A developer of the game responded but I can't make too much out of it: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6875#note_1982310
21:02 karolherbst: ohh.. only stuff with "reqd_work_group_size" fails now...
21:02 karolherbst: yeah.. I think I have to give the nir side another proper thought
22:55 karolherbst: uhhhh...... the lowering of nir_intrinsic_load_workgroup_size using the non variable workgroup_size doesn't work for CL actually...
22:56 karolherbst: so if non uniform workgroups are used with a specific workgroup size, the last block still gets a different work group size...