08:33karolherbst: we need a better solution for new_features.txt maybe 🙃
08:34karolherbst: couldn't we just turn it into a directory and everybody just adds new files there? would get rid of the conflicts
09:18emersion: karolherbst: maybe .gitattributes with a union merge driver? https://git-scm.com/docs/gitattributes#_built_in_merge_drivers
09:18karolherbst: mhh maybe that would work
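For reference, git's built-in union driver is enabled per path with a `.gitattributes` line such as `new_features.txt merge=union`; on a conflict it keeps the lines from both sides instead of writing conflict markers. The sketch below is only a simplified illustration of that behaviour for the append-only case (it is not git's actual algorithm), and the feature strings are hypothetical placeholders.

```python
def union_merge(base, ours, theirs):
    """Simplified union merge for append-only edits.

    Keeps every base line, then appends the lines each side added.
    This only illustrates the idea; git's real driver works on diff hunks.
    """
    merged = list(base)
    for side in (ours, theirs):
        for line in side:
            if line not in merged:
                merged.append(line)
    return merged


# Hypothetical entries: two branches each append one line to the same file.
base = ["feature A on driver X"]
ours = base + ["feature B on driver Y"]
theirs = base + ["feature C on driver Z"]

print("\n".join(union_merge(base, ours, theirs)))
```

Note that a union merge can still produce duplicated or oddly ordered lines when both sides touch the same entry, so the result is worth a quick look after merging.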
11:41tzimmermann: javierm, thanks for the quick response
11:42tzimmermann: i merged the original patch earlier today
11:42tzimmermann: apologies for the fix. it didn't occur to me that there are format-helper tests for bt601
12:09javierm: tzimmermann: yeah, I didn't catch it either. I wonder if there could be a way to make that association
12:10javierm: because it's easy to miss unless you grep for tests
12:27tzimmermann: idk
14:03robclark: karolherbst: so thinking about it a bit more.. I could more easily support bindless on a7xx.. and if there was a way to support it for compute-only contexts, it wouldn't be too hard on a6xx.. maybe that is the most reasonable way to deal w/ tflite (where, I'm not even sure it won't go above 128 tex and/or img)
14:04karolherbst: robclark: bindless is kinda a weird api, because of the resident call... though I could make all images resident and just roll with it
14:05karolherbst: the question is just, how much of a problem would that be to make every single image resident all the time
14:05karolherbst: alternatively, I'd do it on every kernel launch
14:07robclark: it isn't really clear to me when they become un-resident
14:07robclark: but tbh just make them resident immediately.. and we can optimize later if needed
14:08robclark: need to get things running first before I can profile ;-)
14:09robclark: I may want to intro a bindless_texture_compute_only sort of cap for a6xx (or even a7xx just to avoid exposing bindless in gl).. but we'll see
14:10karolherbst: robclark: why would it matter to do it compute only?
14:10karolherbst: on the GL side it's just another extension
14:11karolherbst: I don't think anything uses bindless unless it is requested by the application
14:13robclark: so, a6xx has 5 bindless bases.. we use that internally to have separate img/ssbo state per shader stage (which is what I meant by already using bindless internally within driver).. otherwise I'd have to munge gallium's per-stage state into a single global state like I did on earlier gens (but earlier gens didn't have gs/tess so it was a bit less painful).. a7xx ups the bindless bases to 8 so we have an additional one to use for
14:13robclark: gallium api level bindless
14:13robclark: with compute only context, there is only a single shader stage so the limit on # of bindless descriptor sets isn't a problem
14:14karolherbst: mhhhh, I see
14:15karolherbst: could turn it into a shader_cap
14:16karolherbst: robclark: oh, you also have a limit on bindless descriptors?
14:17karolherbst: or is it limitless, but you can only have a fixed nr of sets?
14:18robclark: the size of descriptors is essentially limitless.. but the # of descriptors is fixed
14:21karolherbst: okay
14:22karolherbst: I mean.. going full in with bindless would allow me to ditch all the binding code, so there is definitely an advantage of doing so
14:23karolherbst: and if I can just keep all images resident at no cost, the better
14:23karolherbst: I also want to make buffers more bindless, but that's a bit more problematic
14:23robclark: some other drivers might want you to keep the bindful path.. and I guess folks with <=a5xx would like to still be able to use rusticl so you might need to keep both paths :-(
14:24robclark: (not sure how well rusticl runs on a5xx, so not sure if that is the end of the world.. but I guess panfrost/asahi/etc?)
14:25demarchi: sima: are you good with this to fix old dim with old git versions? https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/79
14:27sima: demarchi, ah that's some serious archeology
14:28demarchi: I've made sure to mention stone age in the git commit message
14:28sima: it's massively modern compared to the earliest version of git dim ran on :-P
14:29sima: but yeah looks reasonable to me, obviously not going to set up a vm with an old git to verify :-)
14:29sima: demarchi, is that the setup issue that iirc alyssa had? or someone else recently at least
14:29demarchi: thankfully running arbitrary git versions is pretty easy... I just keep a checkout around, git checkout vX.Y and export on PATH
14:30sima: well you've hacked git pile, so I think you're a bit more involved than the average user
14:31demarchi: no, for alyssa I think it was different and fixed by 13a92ce9fd458ebd6064f23cec8c39c53d02ed26
14:32demarchi: sima: that's how I tested git-pile with multiple git versions... create a container with all of them installed and choose which one to run during tests.... https://github.com/git-pile/git-pile/blob/master/.github/Dockerfile.ubuntu-22.04#L10-L20
14:32demarchi: we could have something similar in maintainer-tools, but the bootstrap is not simply copy & paste :(
14:35sima: yeah, actual functional tests with containers would be neat in dim, but a lot of work
14:37sima: maybe just one for dim setup? that seems to be the brittle thing where breakage is unnoticed for the longest
14:37sima: and also kinda annoying since it makes it harder for new committers to get going
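The "keep a checkout around and export on PATH" trick mentioned above can be sketched as a small helper that builds an environment with a chosen git directory first on `PATH`. This is a minimal sketch, assuming the directory already contains a built `git` binary; the helper name and the example path are hypothetical and are not part of dim or maintainer-tools.

```python
import os


def env_with_git(git_dir, base_env=None):
    """Return a copy of the environment with git_dir prepended to PATH.

    Assumption: git_dir holds an already built `git` executable.
    """
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    env["PATH"] = git_dir + (os.pathsep + old_path if old_path else "")
    return env


# Usage (hypothetical path), e.g. to check which git a script would pick up:
#   import subprocess
#   subprocess.run(["git", "--version"], env=env_with_git("/opt/git-old"))
```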
14:53alyssa: karolherbst: fwiw there's a gpu overhead penalty to bindless on m1
14:53karolherbst: alyssa: what about indirect?
14:54karolherbst: like I could also just manage a table of bound images/textures, but use indirects on the GPU side instead of fixed indices
14:54karolherbst: but I could imagine it has the same overhead as bindless ones on m1?
14:56alyssa: different overhead, I guess
14:56alyssa: what's wrong with what we have now exactly?
14:56karolherbst: apparently there are applications binding the same image like 128 times to different kernel args
14:57karolherbst: I mean it's an application bug
14:57karolherbst: but I also kinda wanted to optimize the binding model I'm currently using, because rebinding things on each launch is also kinda annoying
14:57karolherbst: but maybe that's fine
14:59karolherbst: at least the indirect/handle would be uniform always
14:59alyssa: i'm not sure this is actually an optimization on all hw
14:59karolherbst: me neither
15:00karolherbst: I don't have concrete plans to do either, just thinking out loud
15:00karolherbst: however, I was considering using bindless to bind more images than the driver/hardware supports
15:00karolherbst: like the first 32 could be bound, and if the kernel uses more, they are bindless
15:01karolherbst: like only the additional ones
15:02karolherbst: some applications depend on a FULL_PROFILE and that means 128 samplers and 64 images
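The hybrid idea described here (bind the first N images normally, fall back to bindless for the rest) can be sketched as a simple partition of a kernel's image arguments. This is a minimal sketch, not rusticl code: the function name is hypothetical and the limit of 32 is just the example figure from the discussion.

```python
def split_bindings(images, max_bound=32):
    """Split image args into fixed binding slots and a bindless overflow.

    The first max_bound images get slot indices; anything beyond that
    would have to be accessed through bindless handles instead.
    """
    bound = {slot: img for slot, img in enumerate(images[:max_bound])}
    bindless = list(images[max_bound:])
    return bound, bindless


# A kernel using 40 images: 32 end up bound, 8 overflow to bindless.
images = [f"img{i}" for i in range(40)]
bound, bindless = split_bindings(images)
print(len(bound), len(bindless))
```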
15:05alyssa: make it the gallium driver's problem to advertise enough images then
15:05alyssa: since we already implement that bindless fallback internally
15:05alyssa: and juggling both the internal one and the state tracker one is a mess
15:05alyssa: as robclark mentioned
15:07karolherbst: It's good enough for me if driver wants to deal with it internally
15:30robclark: hmm, I guess I could emulate a higher limit with bindless internally.. and only advertise the higher virtual limits for compute shaders
17:23karolherbst: that would work, yeah
19:26alyssa: is there an env var to bypass the shader cache for fossilize-replay?
19:27alyssa: oh MESA_SHADER_CACHE_DISABLE=true works for vk too ok
19:34dj-death: alyssa: I use VK_ENABLE_PIPELINE_CACHE=0
19:34alyssa: ooh nice
19:34alyssa: thx
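The two variables mentioned above can be set together for a single replay run without touching the shell environment. This is a minimal sketch; the helper name is hypothetical, and the `fossilize-replay` arguments in the usage comment are placeholders rather than a recommended invocation.

```python
import os


def no_cache_env(base_env=None):
    """Return a copy of the environment with both cache knobs from the
    discussion set: MESA_SHADER_CACHE_DISABLE and VK_ENABLE_PIPELINE_CACHE."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_SHADER_CACHE_DISABLE"] = "true"
    env["VK_ENABLE_PIPELINE_CACHE"] = "0"
    return env


# Usage (placeholder database path):
#   import subprocess
#   subprocess.run(["fossilize-replay", "path/to/shaders.foz"], env=no_cache_env())
```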
19:45anholt: we should probably set one of those in fossil_replay.sh
19:53alyssa: anholt: possibly although in this case I was hitting fossilize-replay directly to grab a disassembly of an affected shader
19:53alyssa: but of course there was no disasm the second time around because it was hot
19:57pendingchaos: last I checked, fossil_replay.sh was faster with pipeline caching
19:59alyssa: i might be holding it wrong
19:59anholt: hmm, yeah, guess I haven't had trouble with shader status going missing with it due to caches.