08:33 karolherbst: we need a better solution for new_features.txt maybe 🙃
08:34 karolherbst: couldn't we just turn it into a directory and everybody just adds new files there? would get rid of the conflicts
09:18 emersion: karolherbst: maybe .gitattributes with a union merge driver? https://git-scm.com/docs/gitattributes#_built_in_merge_drivers
09:18 karolherbst: mhh maybe that would work
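(For reference, a minimal sketch of the union merge driver emersion links to, assuming new_features.txt stays a single file at the repo root:)

```
# .gitattributes
# On merge, keep lines from both branches instead of conflicting.
# Line order is not guaranteed, so entries must be independent lines.
new_features.txt merge=union
```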
11:41 tzimmermann: javierm, thanks for the quick response
11:42 tzimmermann: i merged the original patch earlier today
11:42 tzimmermann: apologies for the fix. i didn't occur to me that there are format-helper tests for bt601
12:09 javierm: tzimmermann: yeah, I didn't catch it either. I wonder if there could be a way to make that association
12:10 javierm: because it's easy to miss unless you grep for tests
12:27 tzimmermann: idk
14:03 robclark: karolherbst: so thinking about it a bit more.. I could more easily support bindless on a7xx.. and if there was a way to support it for compute-only contexts, it wouldn't be too hard on a6xx.. maybe that is the most reasonable way to deal w/ tflite (where, I'm not even sure it won't go above 128 tex and/or img)
14:04 karolherbst: robclark: bindless is kinda a weird api, because of the resident call... though I could make all images resident and just roll with it
14:05 karolherbst: the question is just, how much of a problem would that be to make every single image resident all the time
14:05 karolherbst: alternatively, I'd do it on every kernel launch
14:07 robclark: it isn't really clear to me when they become un-resident
14:07 robclark: but tbh just make them resident immediately.. and we can optimize later if needed
14:08 robclark: need to get things running first before I can profile ;-)
14:09 robclark: I may want to intro a bindless_texture_compute_only sort of cap for a6xx (or even a7xx just to avoid exposing bindless in gl).. but we'll see
14:10 karolherbst: robclark: why would it matter to do it compute only?
14:10 karolherbst: on the GL side it's just another extension
14:11 karolherbst: I don't think anything uses bindless unless it is requested by the application
14:13 robclark: so, a6xx has 5 bindless bases.. we use that internally to have separate img/ssbo state per shader stage (which is what I meant by already using bindless internally within driver).. otherwise I'd have to munge gallium's per-stage state into a single global state like I did on earlier gens (but earlier gens didn't have gs/tess so it was a bit less painful). a7xx ups the bindless bases to 8 so we have an additional one to use for
14:13 robclark: gallium api level bindless
14:13 robclark: with compute only context, there is only a single shader stage so the limit on # of bindless descriptor sets isn't a problem
14:14 karolherbst: mhhhh, I see
14:15 karolherbst: could turn it into a shader_cap
14:16 karolherbst: robclark: oh, you also have a limit on bindless descriptors?
14:17 karolherbst: or is it limitless, but you can only have nr of sets
14:18 robclark: the size of descriptors is essentially limitless.. but the # of descriptors is fixed
14:21 karolherbst: okay
14:22 karolherbst: I mean.. going full in with bindless would allow me to ditch all the binding code, so there is definitely an advantage of doing so
14:23 karolherbst: and if I can just keep all images resident at no cost, the better
14:23 karolherbst: I also want to make buffers more bindless, but that's a bit more problematic
14:23 robclark: some other drivers might want you to keep the bindful path.. and I guess folks with <=a5xx would like to still be able to use rusticl so you might need to keep both paths :-(
14:24 robclark: (not sure how well rusticl runs on a5xx, so not sure if that is the end of the world.. but I guess panfrost/asahi/etc?)
14:25 demarchi: sima: are you good with this to fix old dim with old git versions? https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/79
14:27 sima: demarchi, ah that's some serious archeology
14:28 demarchi: I've made sure to mention stone age in the git commit message
14:28 sima: it's massively modern compared to the earliest version of git dim ran on :-P
14:29 sima: but yeah looks reasonable to me, obviously not going to set up a vm with an old git to verify :-)
14:29 sima: demarchi, is that the setup issue that iirc alyssa had? or someone else recently at least
14:29 demarchi: thankfully running arbitrary git versions is pretty easy... I just keep a checkout around, git checkout vX.Y and export on PATH
14:30 sima: well you've hacked git pile, so I think you're a bit more involved than the average user
14:31 demarchi: no, for alyssa I think it was different and fixed by 13a92ce9fd458ebd6064f23cec8c39c53d02ed26
14:32 demarchi: sima: that's how I tested git-pile with multiple git versions... create a container with all of them installed and choose which one to run during tests.... https://github.com/git-pile/git-pile/blob/master/.github/Dockerfile.ubuntu-22.04#L10-L20
14:32 demarchi: we could have something similar in maintainer-tools, but the bootstrap is not simply copy & paste :(
14:35 sima: yeah, actual functional tests with containers would be neat in dim, but a lot of work
14:37 sima: maybe just one for dim setup? that seems to be the brittle thing where breakage is unnoticed for the longest
14:37 sima: and also kinda annoying since it makes it harder for new committers to get going
14:53 alyssa: karolherbst: fwiw there's a gpu overhead penalty to bindless on m1
14:53 karolherbst: alyssa: what about indirect?
14:54 karolherbst: like I could also just manage a table of bound images/textures, but use indirects on the GPU side instead of fixed indices
14:54 karolherbst: but I could imagine it has the same overhead as bindless ones on m1?
14:56 alyssa: different overhead, I guess
14:56 alyssa: what's wrong with what we have now exactly?
14:56 karolherbst: apparently there are applications binding the same image like 128 times to different kernel args
14:57 karolherbst: I mean it's an application bug
14:57 karolherbst: but I also kinda wanted to optimize the binding model I'm currently using, because rebinding things on each launch is also kinda annoying
14:57 karolherbst: but maybe that's fine
14:59 karolherbst: at least the indirect/handle would be uniform always
14:59 alyssa: i'm not sure this is actually an optimization on all hw
14:59 karolherbst: me neither
15:00 karolherbst: I don't have concrete plans to do either, just thinking out loud
15:00 karolherbst: however, I was considering using bindless to bind more images than the driver/hardware supports
15:00 karolherbst: like the first 32 could be bound, and if the kernel uses more, they are bindless
15:01 karolherbst: like only the additional ones
15:02 karolherbst: some applications depend on a FULL_PROFILE and that means 128 samplers and 64 images
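(The hybrid split karolherbst sketches above, "first 32 bound, overflow bindless", could look roughly like this. Illustrative Rust only, not rusticl code; `assign_bindings`, `ImageBinding`, and the limit constant are made-up names:)

```rust
// Hypothetical sketch of the hybrid binding scheme: the first
// MAX_BOUND_IMAGES images use classic fixed binding slots, and only
// the overflow falls back to resident bindless handles.
const MAX_BOUND_IMAGES: usize = 32;

#[derive(Debug, PartialEq)]
enum ImageBinding {
    Slot(usize),   // classic bound image at a fixed slot index
    Bindless(u64), // resident bindless handle
}

fn assign_bindings(handles: &[u64]) -> Vec<ImageBinding> {
    handles
        .iter()
        .enumerate()
        .map(|(i, &h)| {
            if i < MAX_BOUND_IMAGES {
                ImageBinding::Slot(i)
            } else {
                ImageBinding::Bindless(h)
            }
        })
        .collect()
}

fn main() {
    // 40 kernel images: 32 land in slots, 8 overflow to bindless
    let handles: Vec<u64> = (0..40).collect();
    let bindings = assign_bindings(&handles);
    assert!(matches!(bindings[31], ImageBinding::Slot(31)));
    assert!(matches!(bindings[32], ImageBinding::Bindless(32)));
    println!("bindless overflow: {}", bindings.len() - MAX_BOUND_IMAGES);
}
```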
15:05 alyssa: make it the gallium driver's problem to advertise enough images then
15:05 alyssa: since we already implement that bindless fallback internally
15:05 alyssa: and juggling both the internal one and the state tracker one is a mess
15:05 alyssa: as robclark mentioned
15:07 karolherbst: It's good enough for me if the driver wants to deal with it internally
15:30 robclark: hmm, I guess I could emulate a higher limit with bindless internally.. and only advertise the higher virtual limits for compute shaders
17:23 karolherbst: that would work, yeah
19:26 alyssa: is there an env var to bypass the shader cache for fossilize-replay?
19:27 alyssa: oh MESA_SHADER_CACHE_DISABLE=true works for vk too ok
19:34 dj-death: alyssa: I use VK_ENABLE_PIPELINE_CACHE=0
19:34 alyssa: ooh nice
19:34 alyssa: thx
19:45 anholt: we should probably set one of those in fossil_replay.sh
19:53 alyssa: anholt: possibly although in this case I was hitting fossilize-replay directly to grab a disassembly of an affected shader
19:53 alyssa: but of course there was no disasm the second time around because it was hot
19:57 pendingchaos: last I checked, fossil_replay.sh was faster with pipeline caching
19:59 alyssa: i might be holding it wrong
19:59 anholt: hmm, yeah, guess I haven't had trouble with shader status going missing with it due to caches.
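(For reference, a sketch of the cache-bypass invocations from this exchange; the .foz path is a placeholder. MESA_SHADER_CACHE_DISABLE is Mesa's documented on-disk shader cache switch, and VK_ENABLE_PIPELINE_CACHE=0 is what dj-death reports using:)

```
# Force recompilation so shader disassembly shows up again on replay
MESA_SHADER_CACHE_DISABLE=true fossilize-replay capture.foz

# Alternative dj-death mentions: skip the Vulkan pipeline cache
VK_ENABLE_PIPELINE_CACHE=0 fossilize-replay capture.foz
```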