03:52tarceri: mareko: sorry for the delay. Yes the cache does crc checks on the data after it's loaded
03:53tarceri: see parse_and_validate_cache_item() for the multifile cache example
03:55tarceri: or mesa_cache_db_read_entry() for the single file example
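[editor's note] A minimal sketch of the idea tarceri describes: store a checksum with each cache entry and verify it after the data is loaded. This is not Mesa's on-disk format. The 4-byte CRC32 header and the names `pack_entry` and `load_entry` are assumptions made only for illustration.

```python
import struct
import zlib


def pack_entry(payload: bytes) -> bytes:
    # Hypothetical layout: 4-byte little-endian CRC32, then the payload.
    return struct.pack("<I", zlib.crc32(payload)) + payload


def load_entry(blob: bytes):
    # Return the payload if its checksum matches, otherwise None,
    # which the caller would treat as a cache miss.
    if len(blob) < 4:
        return None
    (stored_crc,) = struct.unpack_from("<I", blob, 0)
    payload = blob[4:]
    if zlib.crc32(payload) != stored_crc:
        return None
    return payload
```

A corrupted entry then falls back to a normal cache miss instead of handing bad data to the driver.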
07:05MrCooper: DemiMarie: "compute and gfx are serialized" means any long-running compute job makes the GPU unusable for interactive graphics
08:39karolherbst: airlied: I might need some help debugging llvmpipe JIT code
08:42karolherbst: uhh actually.. I have an idea what's wrong
08:43karolherbst: yeah.. scratch memory is busted
08:53karolherbst: nice, it works :3
09:48cwabbott: dj-death: bad news, apparently past me wasn't careful enough because there are more failures with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25436 and it's a fundamental problem that requires a bigger rework
09:51cwabbott: the problem is that different stages contribute different pipeline_flags, and we can't combine them in vk_graphics_pipeline_state_merge() because we'd have to create a new render pass state object instead of doing a shallow copy and it's totally not setup to do that
09:53cwabbott: I think we weren't sanitizing the flags as much before this so we didn't run into it
09:53cwabbott: but it's a fundamental problem with how we're handling it
09:54cwabbott: I think the only non-terrible solution is to move the pipeline_flags up into vk_graphics_pipeline_state
09:55cwabbott: that means I'm gonna have to rewrite everything again :/
09:56dj-death: hmm I see
09:56dj-death: gfx libs again...
09:57cwabbott:hates GPL
09:57cwabbott: another example of the axiom that there are no correct GPL implementations
09:58dj-death: I don't even want to think about shader objects
10:00karolherbst: shader objects are the best
10:01karolherbst: at least nvk and nvidia should have valid and correct implementations for shader objects
10:01karolherbst: easily
10:01cwabbott: at least shader objects doesn't have any of this nonsense
10:01karolherbst: shader objects map 1:1 to nvidia hardware (more or less)
10:01cwabbott: you just pass create flags into the shader and that's it
10:01cwabbott: no futzing around with copying state everywhere that's wrong half of the time
10:02dj-death: yeah
10:02dj-death: should have been called NV_shader_objects
10:02karolherbst: well.. it makes life easier for a lot of programmers
10:07karolherbst: mareko: yeah.. the UI is way smoother now with COMPUTE_ONLY set
10:08karolherbst: though radv doesn't seem to use the compute queue for compute only vulkan contexts? Is this even a thing in vulkan?
10:08karolherbst: or maybe zink doesn't do something properly?
10:08karolherbst: oh well..
10:20glehmann: sounds like a zink issue, you can easily only create a queue from the family that supports compute but not graphics
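[editor's note] A rough sketch of what glehmann suggests: pick a queue family that advertises compute but not graphics. The flag values match the Vulkan `VkQueueFlagBits` enum. The helper name `find_compute_only_family` and the plain list of flag words standing in for `vkGetPhysicalDeviceQueueFamilyProperties` output are assumptions for illustration.

```python
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2


def find_compute_only_family(queue_flags):
    """Return the index of the first family with compute but no graphics.

    queue_flags is a list of queueFlags words, one per queue family.
    Returns None when the device exposes no such family.
    """
    for index, flags in enumerate(queue_flags):
        has_compute = bool(flags & VK_QUEUE_COMPUTE_BIT)
        has_graphics = bool(flags & VK_QUEUE_GRAPHICS_BIT)
        if has_compute and not has_graphics:
            return index
    return None
```

When no compute-only family exists, a caller would presumably fall back to a family that supports both.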
10:47karolherbst: mhh.. also the way I query zink devices only works on nvidia if nvidia-drm is used...
10:49karolherbst: mhh.. also gives me a VK_ERROR_OUT_OF_DEVICE_MEMORY...
10:49karolherbst: "566MiB / 23040MiB".. _sure_
10:51karolherbst: ahh
10:51karolherbst: zink allocates from the mappable region.. mhh
10:52karolherbst: "budget = 23461888 (0x01660000) (22.38 MiB)" figures..
10:58karolherbst: ohh probably missing BIND_GLOBAL handling or something...
11:01zmike: you can't use a compute queue in zink until there's a screen param added for compute screens
11:02karolherbst: ohh.. it needs to happen on the screen level.. right
11:07karolherbst: I have no idea why, but whatever I do, it's entirely broken on nvidia :D
11:12karolherbst: ahh. the vvl go crazy
11:14karolherbst: that's because of one of my local changes...
11:14karolherbst: uhhh
11:14karolherbst: pain
11:15karolherbst: zmike: so the issue is, that nvidia only has this 256MiB window for HOST_VISIBLE | DEVICE_LOCAL memory.. https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/zink/zink_resource.c?ref_type=heads#L879
11:16zmike: that's gpu-
11:16zmike: dependent
11:16zmike: or you have rebar disabled
11:16karolherbst: I might have rebar disabled
11:16karolherbst: though does nvidia even support it on linux?
11:16zmike: yes
11:17karolherbst: how?
11:17karolherbst: it sounds like I have to flash the vbios?
11:19zmike: pretty sure it just works
11:19karolherbst: mhh.. let me check my uefi then, because I'm mildly sure I've enabled it there
11:20zmike: be pretty hard to do any gaming at all in the current year without it
11:21karolherbst: yeah.. it's enabled in my uefi
11:21karolherbst: well
11:22karolherbst: maybe I need a newer GPU? I have no idea, on my Turing I get 256MiB out of the box and that's it
11:22karolherbst: and there is this "NVIDIA Resizable BAR Firmware Update Tool" thing
11:23karolherbst: though there seem to be patches for nvidias driver to enable it as well...
11:24tnt: lspci for my card shows Region 1: Memory at 7400000000 (64-bit, prefetchable) [size=16G]
11:25karolherbst: what nvidia gpu is that, and what distribution?
11:25zmike: I use a 2070 when I'm testing that
11:26karolherbst: I bet you have a nvidia kernel module patch for it then
11:26karolherbst: or maybe they added support for it and my driver is outdated..
11:27karolherbst: but anyway "nvidia-smi -q | grep -i bar -A 3" reports 256 MiB on my Quadro 6000
11:27tnt: karolherbst: That's a 4070 running ubuntu 22.04 with 535.(104?) drivers.
11:28karolherbst: I'm on 535.104.05
11:30tnt: But you might be right that your card needs a vbios update.
11:30karolherbst: it's one of the early turing ones, so yeah...
11:30karolherbst: anyway... having to rely on rebar might not be something which would be feasible here.. dunno
11:31karolherbst: but I suspect the vbios is just toggling some bit given that there is a module patch for it
11:32tnt: There is also some reference to a "compute mode"/"graphics mode", the former having a 8G BAR and the latter 256M.
11:36karolherbst: mhh.. interesting
11:36zmike: things should work, albeit suboptimally, without rebar, but it's definitely not a preferred mode of operation
11:37karolherbst: well.. 256 MiB isn't enough and zink fails to allocate memory at some point
11:37zmike: yeah and then it'll do a fallback to another heap
11:37karolherbst: but there isn't any compatible one
11:37zmike: compatibility is relative
11:38karolherbst: buffer allocations need HOST_VISIBLE and DEVICE_LOCAL and it doesn't look like it falls back to anything.. maybe I should debug a bit more, but it does look like it's just doing nonsense after that and crashes the GPU
11:39zmike: ctrl+f /* demote BAR allocations to a different heap on failure to avoid oom */
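[editor's note] A toy model of the fallback zmike points at: try the small BAR heap first and demote the allocation to another heap when it is full. This is not zink's actual logic. The heap dictionary, the preference list and the `allocate` helper are hypothetical and exist only to show the control flow.

```python
class OutOfDeviceMemory(Exception):
    """Raised when no heap in the preference list can satisfy the request."""


def allocate(size, heaps, preference):
    """Allocate `size` units from the first heap in `preference` with room.

    heaps maps a heap name to a dict holding its remaining "free" space.
    Returns the name of the heap that served the allocation.
    """
    for name in preference:
        heap = heaps[name]
        if heap["free"] >= size:
            heap["free"] -= size
            return name
    raise OutOfDeviceMemory(size)
```

With a 256 MiB BAR heap listed first, early allocations land there and later ones are demoted rather than failing outright.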
11:42karolherbst: ohh indeed.. still crashes the GPU though that might be something else going wrong
11:43karolherbst: mhh.. the vvl doesn't complain... let me run the CTS
11:52masteratwork: I remember few details about thing called binary buddy allocator , in theory it's somehow possible to put stack and heap both to pc-relative locations in the kernel, and have the kernels allocator defragment things , but i do not remember the details well, sortix claims to do that, but i know sbrk and mmap are not very good at this regard. Of course i can be mistaken, binary buddy allocator was for win95 that seemed to
11:52masteratwork: work great alike though.
12:01masteratwork: Maybe in case of heap it would not make too much sense, cause heap can be larger on loops of memory intensive apps than code, it's the programmer's responsibility to free it
12:02masteratwork: but if those ain't possible then the only thing to do is to index the memory in compressed format, and that is quite rough
12:09masteratwork: i mean not heap overall but the issue is with global variables likely , so that would make more sense, this can be pc-relative
12:10masteratwork: some section as to where they live
12:11masteratwork: all the heap apps allocation would be ping ponged through global variables or TLS or something like that through stack
12:17masteratwork: so it's not like i made a very huge blooper on my compression theories, it's just that i may still lack some of the kernel skills to do it more easily
12:27cwabbott: dj-death: just pushed a fixed version
12:27cwabbott: now with this turnip actually passes all the tests again
12:28cwabbott: as usual, past me was an idiot
12:43cwabbott: dj-death: fyi, because of the reworks rebasing is not going to be trivial
12:43cwabbott: also I didn't build-test anv so I'm testing it in CI now
12:45dj-death: cwabbott: I'm doing it right now
12:45dj-death: it's not too bad
12:50cwabbott: one small thing is that anv had a few places passing around the render pass state which used it just to get the pipeline_flags
12:50cwabbott: so they passed around render pass state, multisample state, etc.
12:51cwabbott: and now they pass around all those other states... plus the vk_graphics_pipeline_state struct that contains all of them
12:52cwabbott: you could collapse all of those arguments down to one now that we have to pass around the overall state anyway, but I just went for the minimal change that replaced state->rp with state
12:53dj-death: yeah
12:59alyssa: cwabbott: hard disagree
12:59alyssa: past you wrote a state-of-the-art RA that present me is still digesting, definitely non-idiot
13:04cwabbott: dj-death: ugh, one last bug that I needed to fix so I pushed again
13:06cwabbott: forgot that other drivers merge libraries first so I have to OR in the flags in vk_pipeline_flags_init
13:06cwabbott: anv pipeline libraries will probably blow up without that
14:09alyssa: airlied: so what happens with fedora 39 given the current mesa + llvm 17 lulz?
14:11karolherbst: I probably also should look into llvm 17.. pain
14:13Armote[m]: does Venus still require CONFIG_TRANSPARENT_HUGEPAGE to be disabled on the host or was this old KVM bug fixed? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/docs/drivers/venus.rst
14:19alyssa: karolherbst: i'm just confused, like
14:19karolherbst: same
14:19alyssa: why is f39 pulling in llvm17 if it breaks the mesa build
14:19karolherbst: I bet the reason is "it built, so it's fine"
14:20alyssa: mesa literally won't build
14:20karolherbst: maybe we should make life miserable for everybody by always checking for the llvm version and fail to compile on a newer one
14:20karolherbst: heh
14:20karolherbst: that's bad
14:21karolherbst: but anyway, my involvement in downstream fedora is pretty non existent, so I guess that's more like something for airlied to figure out
14:23alyssa: k
14:24psykose: alyssa: are you sure they rebuilt it yet?
14:24psykose: i've noticed a lot of specs that just 'use llvm' when i know for sure the latest llvm doesn't work with them
14:24psykose: i assume they just have some other invisible system or didn't actually rebuild
14:25alyssa: psykose: I don't know. but apparently we're supposed to be shipping f39 and um.
14:26psykose: :D
14:26HdkR: That's a spicy release schedule. LLVM 17 released 10 days ago
14:33daniels: alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23827
14:59alyssa: daniels: sadly that's just the tip of the iceberg
15:00alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9792
15:00alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9791
15:00alyssa: but the fact 23827 isn't merged yet tells me fedora 39 should be fine with building against llvm16 lol
15:01karolherbst: fedora does indeed provide different llvm runtimes, just `llvm-devel` is always the newest
15:36karolherbst: sooo.. let's take a look at this llvm-17 mess
15:37karolherbst: alyssa: I'll probably try to make that CL stuff work on llvm-17 now, not sure how long it will take, but let me try to figure things out until next week
15:37karolherbst: though I think we already had fixes for most in place
15:54karolherbst: mhhh
19:55alyssa: dcbaker: adding script with multiple outputs
19:55alyssa: i know you don't like that but not as much as everyone won't like what it's for (:
19:55dcbaker: alyssa: I'm cool with a script with multiple outputs
19:56alyssa: oh I thought that was treif?
19:56dcbaker: I'm always the one shooting down "gen_thing_c.py" + "gen_thing_h.py" and arguing that we should have "gen_thing.py" that generates both
19:56alyssa: ahhhh
19:57alyssa: this is an extremely spicy intel_clc spin-off
19:57dcbaker: oh boy, how could intel_clc get any spicier
19:57alyssa: I heard you like C, so I'm writing C to generate C from your C for your C
19:58alyssa: magic that makes arbitraryish OpenCL functions available as nir_builder with no bindings
19:58dcbaker: Xzibit approves this message
19:59dcbaker: the existing CLC stuff is annoying, as is getting people to review the meson bits I'm trying to add to make it less annoying
19:59alyssa: cc me on anything you want reviewed
20:00dcbaker: I should rephrase that "new Meson features I'm trying to add to Meson itself to make clc less annoying"
20:01dcbaker: but in particular this: https://github.com/mesonbuild/meson/pull/12258
20:01dcbaker: which I need to make dependency(..., native : 'both') work correctly
20:02alyssa: oh
20:02cmarcelo: dcbaker: oh, I thought we had issues with multiple outputs? probably can fix this for glsl_type stuff (we have three outputs, so right now three separate scripts).
20:03dcbaker: only if we use capture : true, which we shouldn't be using because it makes everything slow on Windows
20:03cmarcelo: yeah. I've moved off capture, so maybe can just merge them up. will take a note of that.
20:03dcbaker: Well, slow on Linux too, but really slow on Windows because forking is so expensive
20:04dcbaker: I would review that change
20:04alyssa: how does capture with 2 outputs work..?
20:05dcbaker: it doesn't :)
20:05alyssa: no perf issue then!
20:05alyssa: :P
20:05dcbaker: lol
20:06cmarcelo: well.... one for stderr and other for stdout. :-D
20:06dcbaker: capture is just slow in general because Meson has to wrap a custom_target that uses feed or capture inside a meson script
20:06dcbaker: so you get an extra fork
20:08alyssa: ah
20:08cmarcelo: not the case of meson, but I think you could open fds and let the child inherit them, so not touching stdout or stderr.
20:08cmarcelo: (but yeah, unrelated to the wrapper issue)
21:27karolherbst: airlied: any opinions on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24839/diffs?commit_id=599100b091f8f0b0bbeb4bffda1bb0d88eb9244e ?
22:43gfxstrand: Ugh... Found a very old bug in spirv_to_nir with tessellation patch I/O. Do I dare fix it?
22:43gfxstrand: We're using VARYING_SLOT_VARN sometimes for patch stuff
23:07gfxstrand: I kinda want a load/store_patch_output
23:39cmarcelo: gfxstrand: found by inspection or you have a test case?