01:59 fdobridge: <j​ekstrand> We're going to have a driver ID for NVK in the next header update: https://github.com/KhronosGroup/Vulkan-Docs/pull/1983#event-7819126207
02:01 fdobridge: <j​ekstrand> Don't anyone implement KHR_driver_info. There's an MR outstanding for it and I'd like Yusuf to be able to get a patch in.
02:02 fdobridge: <j​ekstrand> I'll bump the header and rebase on Friday, probably.
02:04 fdobridge: <a​lyssarzg> ✨
02:06 fdobridge: <k​arolherbst🐧🦀> :ferrisBongo:
03:28 fdobridge: <j​ekstrand> Gotta love it when an obvious C++ and C feature is still missing from rust...
03:28 fdobridge: <j​ekstrand> https://github.com/rust-lang/rust/issues/76560
03:28 fdobridge: <j​ekstrand> I can't have a const integer generic parameter and do math on it. 😫
03:29 fdobridge: <j​ekstrand> Trying to make isaspec work for Rust and I wanted to have a BitSet<N> struct where N is the number of bits. But I can't divide by 32 to size my array. 😕
03:35 mhenning: jekstrand: I'd claim that C doesn't really have that feature either
03:35 mhenning: but yeah, some of those types of features have been taking a while
03:36 fdobridge: <j​ekstrand> C doesn't have it officially in the same way as C++ does with constexpr but most C compilers will treat things as constants as long as they can reasonably fold it.
03:36 fdobridge: <j​ekstrand> Then again, C doesn't have generics so...
03:36 fdobridge: <j​ekstrand> But you can do it within a macro and it's generally fine
03:37 mhenning: you might be able to do it with a rust macro?
03:39 mhenning: yeah, I seem to be able to do `let array: [i32; 64 / 32];` locally
03:40 mhenning: so I think the restriction is on type-level integers at the moment
04:10 mhenning: jekstrand: Yeah, the macro thing seems to work. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=039959cf3419f562b42b0523498fe9dd
04:10 mhenning: although not sure about msrv for that
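[editor's sketch] The macro workaround discussed above might look roughly like the following. This is not the code from the playground link, and it is not isaspec code; the macro name `bitset!` and the type names `BitSet64` and `BitSet96` are made up for illustration. The idea is that stable Rust rejects `[u32; N / 32]` when `N` is a const generic parameter (rust-lang/rust#76560), but accepts the same arithmetic when the macro substitutes a concrete literal.

```rust
// Hypothetical macro: generates one concrete bitset type per bit count.
// The array length is computed from a literal, so no const generics are needed.
macro_rules! bitset {
    ($name:ident, $bits:expr) => {
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        pub struct $name {
            // Round up so bit counts that are not a multiple of 32 still fit.
            words: [u32; ($bits + 31) / 32],
        }

        impl $name {
            pub const BITS: usize = $bits;

            pub fn new() -> Self {
                Self { words: [0; ($bits + 31) / 32] }
            }

            pub fn set(&mut self, bit: usize) {
                assert!(bit < Self::BITS);
                self.words[bit / 32] |= 1u32 << (bit % 32);
            }

            pub fn get(&self, bit: usize) -> bool {
                assert!(bit < Self::BITS);
                (self.words[bit / 32] >> (bit % 32)) & 1 != 0
            }

            pub fn num_words(&self) -> usize {
                self.words.len()
            }
        }
    };
}

// Each invocation produces a distinct type; these names are illustrative only.
bitset!(BitSet64, 64);
bitset!(BitSet96, 96);
```

The trade-off compared to a real `BitSet<N>` is that every size is a separate, unrelated type, so code cannot be generic over the bit count without an extra trait.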
06:53 airlied: Test case 'dEQP-VK.sparse_resources.buffer.transfer.sparse_binding.buffer_size_2_10'.. Pass (Passed)
07:08 airlied: dEQP-VK.sparse_resources.image_sparse_binding.* also all passing
07:17 airlied: okay sparse residency will require some more thought I expect
07:19 airlied: dakr: have to think a bit more about sparse residency and the kernel API
07:19 airlied: we have to be able to not fault on read/writes to areas that aren't mapped
07:21 airlied: skeggsb_: you might also have considered how to deal with that case, I'll have to look at other impl to see what ideas they had
07:25 airlied: fdobridge: <jekstrand> maybe you too
07:29 airlied: might need the ability to map a range with no backing, but have it not fault, then map proper bindings over the top of it
08:51 fdobridge: <m​arysaka> I pushed my initial stuffs for the MME Fermi builder as previously discussed https://gitlab.freedesktop.org/marysaka/mesa/-/tree/maxwell-mme
08:52 fdobridge: <m​arysaka> Still quite work in progress but most operations that can be defined compared to tu104 are here
08:53 fdobridge: <m​arysaka> (There is also some commits that move the tu104 mme code to its own subdirectory, I still need to unify the mme_builder interface again but that's a story for another time)
09:13 fdobridge: <k​arolherbst🐧🦀> very cool!
15:31 dakr: airlied: wonder if from a UAPI pov it might be enough to just accept binds without a gem handle, mark the address range and map in a zero filled dummy page on fault or so. though, I hope the mmu supports something better than that.
16:37 fdobridge: <j​ekstrand> We do want the ability to do VA range reservations. In particular, the NV page tables make a distinction between fully unbound and unbound but "safely faults". The latter is needed for sparse resources so you can detect faults in the shader without taking a full fault that kills the context.
16:38 fdobridge: <j​ekstrand> IDK what happens if you do "normal" shader access or non-shader access to such a memory region. Maybe it returns zero?
16:38 fdobridge: <k​arolherbst🐧🦀> tex instructions have a predicate output for that e.g.
16:39 fdobridge: <j​ekstrand> Generally, though, "map a page on fault" with GPUs is an absurdly hard problem.
16:39 fdobridge: <j​ekstrand> Yeah, I know. The question is what happens if you don't provide a predicate or use a non-sparse form or hit it from vertex fetch or something.
16:39 fdobridge: <k​arolherbst🐧🦀> there is no no-sparse form afaik (edited)
16:40 fdobridge: <j​ekstrand> Ok, so it probably just returns zero or garbage
16:40 fdobridge: <k​arolherbst🐧🦀> there is just a predicate indicating if the access was sparse or not
16:40 fdobridge: <j​ekstrand> Or maybe black
16:40 fdobridge: <k​arolherbst🐧🦀> yeah.. no idea what the actual data will be when loaded
16:40 fdobridge: <k​arolherbst🐧🦀> but you'll know when you load from sparse
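[editor's sketch] A toy model of the three page states mentioned above: fully unbound, unbound but "safely faults", and bound. It only illustrates the "reserve a range with no backing, then map real bindings over the top of it" idea. It is not the nouveau UAPI and not a description of the hardware; every name here (`VaSpace`, `PageState`, `Access`, the `u32` handle) is hypothetical.

```rust
use std::collections::BTreeMap;

/// State of one page of GPU virtual address space (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PageState {
    /// Fully unbound: an access takes a full fault.
    Unbound,
    /// Reserved for a sparse resource with no backing: an access "safely faults".
    SparseUnbound,
    /// Backed by memory; the payload is a made-up buffer handle.
    Bound(u32),
}

/// What an access to a page would observe in this toy model.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    /// Full fault, the kind that kills the context.
    Fatal,
    /// No fault; the shader can learn the page was not resident.
    NonResident,
    /// Normal access to the backing identified by the handle.
    Resident(u32),
}

#[derive(Default)]
pub struct VaSpace {
    // Keyed by page index; pages that are absent count as Unbound.
    pages: BTreeMap<u64, PageState>,
}

impl VaSpace {
    /// Reserve `count` pages starting at `first` with no backing.
    pub fn reserve_sparse(&mut self, first: u64, count: u64) {
        for page in first..first + count {
            self.pages.insert(page, PageState::SparseUnbound);
        }
    }

    /// Map real backing over (part of) a range, replacing whatever was there.
    pub fn bind(&mut self, first: u64, count: u64, handle: u32) {
        for page in first..first + count {
            self.pages.insert(page, PageState::Bound(handle));
        }
    }

    pub fn access(&self, page: u64) -> Access {
        match self.pages.get(&page).copied().unwrap_or(PageState::Unbound) {
            PageState::Unbound => Access::Fatal,
            PageState::SparseUnbound => Access::NonResident,
            PageState::Bound(handle) => Access::Resident(handle),
        }
    }
}
```

What data a load from a non-resident page actually returns (zero, garbage, or an unchanged register) is left open in the discussion, so the model deliberately does not represent it.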
16:41 fdobridge: <🌺​ ¿butterflies? 🌸> At least for the compute side - you have as much time as you need for that.
16:41 fdobridge: <🌺​ ¿butterflies? 🌸> Includes mapping such a page from disk
16:42 fdobridge: <🌺​ ¿butterflies? 🌸> Before resuming the task
16:42 fdobridge: <j​ekstrand> In theory, yes. But getting the locking to work out inside the kernel isn't as easy as it sounds.
16:43 fdobridge: <🌺​ ¿butterflies? 🌸> Oh, I only have details on how UVM does it - never really poked into nouveau
16:43 fdobridge: <k​arolherbst🐧🦀> okay.. LDG also has a flag to indicate sparse memory
16:43 fdobridge: <k​arolherbst🐧🦀> just not LD
16:45 fdobridge: <k​arolherbst🐧🦀> wouldn't be surprised if the register content stays the same
16:45 fdobridge: <🌺​ ¿butterflies? 🌸> (Interestingly, CUDA doesn’t provide such a thing as far as I know)
16:47 fdobridge: <j​ekstrand> It's mostly a D3D12/Vulkan feature as far as I know.
16:50 fdobridge: <🌺​ ¿butterflies? 🌸> https://cdn.discordapp.com/attachments/1034184951790305330/1042481757259313172/image0.jpg
16:50 fdobridge: <🌺​ ¿butterflies? 🌸> Hmmm, it’s in OptiX too.
16:53 fdobridge: <🌺​ ¿butterflies? 🌸> CUDA sparse textures - 11.1
16:53 fdobridge: <🌺​ ¿butterflies? 🌸> https://github.com/NVIDIA/optix-toolkit/tree/master/DemandLoading
17:08 fdobridge: <🌺​ ¿butterflies? 🌸> > The OptiX Demand Loading library allows hardware-accelerated sparse textures to be loaded on demand, which greatly reduces memory requirements, bandwidth, and disk I/O compared to preloading textures. It works by maintaining a page table that tracks which texture tiles have been loaded into GPU memory. An OptiX closest-hit program fetches from a texture by calling library device code that checks the page table to see if the
17:08 fdobridge: <🌺​ ¿butterflies? 🌸> … ok so this impl isn’t like HMM land at all - but much simpler conceptually