00:04fdobridge: <gfxstrand> How's this:
00:04fdobridge: <gfxstrand> > NVK bugs can be filed in the [Mesa issue tracker]. However, right now there
00:04fdobridge: <gfxstrand> > are enough missing features and known issues that a lot of apps won't work
00:04fdobridge: <gfxstrand> > and we already know that. Please don't file "XYZ game doesn't play" issues
00:04fdobridge: <gfxstrand> > just yet. If there is a specific feature you know is missing or if you are
00:04fdobridge: <gfxstrand> > sure you've found an actual driver bug, then an issue is appropriate. We
00:04fdobridge: <gfxstrand> > have way more excited users than developers working on the driver so we
00:04fdobridge: <gfxstrand> > don't want to flood the issue tracker.
00:04fdobridge: <gfxstrand> >
00:04fdobridge: <gfxstrand> > If you do think you've found a legitimate bug, please provide as much
00:04fdobridge: <gfxstrand> > information as possible in the report. Mesa git SHA, kernel version, and
00:04fdobridge: <gfxstrand> > GPU information are all a must. If you didn't build Mesa yourself, please
00:04fdobridge: <gfxstrand> > tell us where you got your build. If you have any hack patches or had to
00:04fdobridge: <gfxstrand> > force-enable anything to get the app to work, provide that information as
00:04fdobridge: <gfxstrand> > well.
00:04fdobridge: <gfxstrand> >
00:04fdobridge: <gfxstrand> > Also, be patient. This is a very new driver. We aim to eventually get to
00:04fdobridge: <gfxstrand> > the point where most apps work but it will take time. We also need to
00:04fdobridge: <gfxstrand> > prioritize our work and do things in the order that makes sense so don't be
00:04fdobridge: <gfxstrand> > surprised if your favorite feature doesn't get implemented as quickly as
00:04fdobridge: <gfxstrand> > some others.
00:06fdobridge: <airlied> sounds good
00:06fdobridge: <karolherbst🐧🦀> yep
00:07fdobridge: <karolherbst🐧🦀> ack by me
00:30fdobridge: <gfxstrand> I think we're about ready. I just need to do a couple things once the kernel patches hit drm-misc-next and then we're ready to hand off to Marge. 🎉
00:30fdobridge: <gfxstrand> I'm going to go figure out some supper and try to relax for the rest of my evening.
00:31fdobridge: <gfxstrand> @karolherbst , you should probably go to bed. 😛
00:31fdobridge: <gfxstrand> Tomorrow's gonna be a big day. 🤩
00:31fdobridge: <karolherbst🐧🦀> mhhhh
00:43airlied: depends if dakr shows up :)
00:43airlied: I don't think I have drm-misc-next commit abilities
01:06fdobridge: <prop_energy_ball> I'm hype
01:06fdobridge: <prop_energy_ball> I have a spare NV card I normally use for Windows passthru
01:06fdobridge: <prop_energy_ball> Might give some stuff a go when I see it on drm-next :)
01:57fdobridge: <gfxstrand> I thought you had commit rights for everything. 😂
01:59fdobridge: <airlied> Seemed easier to maintain clear lines of responsibility, so I stayed out of the way 🙂
02:01fdobridge: <karolherbst🐧🦀> worst case I can push those changes
02:08fdobridge: <gfxstrand> It also wouldn't kill us to delay until Monday. 😅 It'd give Kara more time to edit my blog post.
05:25sravn: gfxstrand: An FAQ entry describing the current status for zink support would be nice
06:03airlied: I think we'd want to merge 240, then we have at least suboptimal gl4.5
06:41fdobridge: <gfxstrand> Has anyone tried running any CTS or piglit on Zink or are we all going off the list in the issue?
06:42fdobridge: <airlied> I tried a piglit run, but I didn't survive
06:42fdobridge: <airlied> I mostly tried glxinfo to see it said GL 4.5 🙂
06:43fdobridge: <airlied> and I ran heaven for a few seconds
08:25fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> You should try gamescope (you need to advertise Vulkan 1.2) but it also requires host-visible VRAM from what I could tell
10:15fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://gitlab.freedesktop.org/nouveau/mesa/-/issues/77 :nope_gears:
10:30fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I wonder if we could get Forza Horizon 5 working in 2 years :triangle_nvk:🏎️
11:09fdobridge: <esdrastarsis> Or RE4 Remake
12:11fdobridge: <gfxstrand> If Zink doesn't actually work, I'm inclined not to include it in the FAQ and leave it in the "random apps may not work" category.
12:28fdobridge: <gfxstrand> Most of that stuff falls under the "we need a new compiler" category. A lot of features will more or less fall into place once I get back to finishing up NAK.
12:28fdobridge: <gfxstrand>
12:28fdobridge: <gfxstrand> There's also sparse residency which shouldn't be too hard now that we have the new UAPI. We just need to add a few NAK helpers and wire it up. I do think we may want to come up with better internal interfaces for that stuff, though. Right now, we're leaking internal `nvk_image` details into the back-end queue code.
12:28fdobridge: <gfxstrand>
12:28fdobridge: <gfxstrand> And then there's ray-tracing... I'm not even going to think about that for quite a while. 😅
12:40fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I wonder if Bas could help build the new slowest ray-tracing implementation
12:47fdobridge: <mohamexiety> I wonder how Ray tracing will work, actually
12:48fdobridge: <mohamexiety> For AMD it's regular compute shaders. On NV there are dedicated units and so on, and NV is actually really tight-lipped about these 😮
12:52fdobridge: <gfxstrand> I'm sure we can crack it wide open. I have a half decent idea what shape I expect it to take already, based on my combined Intel/Nvidia experience. Precious few details, though.
12:58fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Software vs hardware raytracing 🐸
12:59fdobridge: <gfxstrand> I mean... It's all software at some level. Even the hardware ray-tracing has a lot of software work involved.
12:59fdobridge: <karolherbst🐧🦀> I think raytracing uses its own class
13:00fdobridge: <karolherbst🐧🦀> and we have no docs on those 🙂
13:00fdobridge: <karolherbst🐧🦀> could ask Nvidia and see if we can wiggle it
13:00fdobridge: <gfxstrand> Unless NVIDIA just baked it all into the hardware like they've done with other stuff.
13:00fdobridge: <karolherbst🐧🦀> now that we have a vulkan driver it's fairly reasonable to ask for it
13:00fdobridge: <karolherbst🐧🦀> ohhh, I'm sure it's like the other stuff
13:01fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Or you can do the nouveau way and REnouveau it :nouveau:
13:01fdobridge: <gfxstrand> Well, "class"... RT is basically stateless. They could give us the class header and it would tell us almost nothing.
13:01fdobridge: <karolherbst🐧🦀> well.. at least on how to execute them
13:02fdobridge: <gfxstrand> Sure. But that's like the easiest thing to RE.
13:02fdobridge: <karolherbst🐧🦀> yeah...
13:02fdobridge: <karolherbst🐧🦀> just mostly don't want to use our own headers for things 😄
13:03fdobridge: <gfxstrand> Sure
13:03fdobridge: <gfxstrand> No arguments there.
13:05fdobridge: <karolherbst🐧🦀> it's kinda wonky that even with all the docs I have, nothing even mentions raytracing
13:06fdobridge: <karolherbst🐧🦀> I'll try to bring it up on the next meeting with nvidia and see how the vibes are
13:06fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> For now we could do software raytracing :triangle_nvk:
13:09fdobridge: <karolherbst🐧🦀> ` Like Turing, the second-generation RT Core in GA10x includes dedicated hardware units for BVH traversal and ray-triangle intersection testing`
13:09fdobridge: <karolherbst🐧🦀> ` Once the SM has cast the ray, the RT Core will perform all of the calculations needed for BVH traversal and triangle intersection tests, and will return a hit or no hit to the SM.`
13:10fdobridge: <karolherbst🐧🦀> ohh winky
13:10fdobridge: <karolherbst🐧🦀> *wonky
13:10fdobridge: <gfxstrand> Sounds like Intel
13:10fdobridge: <karolherbst🐧🦀> ` The GA10x SM can process two compute workloads simultaneously, and is not limited to just compute and graphics simultaneously as in prior GPU generations, allowing scenarios such as a compute-based denoising algorithm to run concurrently with RT Core-based ray tracing work.`
13:11fdobridge: <karolherbst🐧🦀> but yeah... seems like we just program the RT cores and eat the result
13:12fdobridge: <karolherbst🐧🦀> https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf
13:12fdobridge: <gfxstrand> Well, actually, I expect it to be like Intel but with more automagic around stack management and call/return/resume.
13:12fdobridge: <gfxstrand> But I need to see some IR first.
13:13fdobridge: <karolherbst🐧🦀> the thing is.. I see nothing for ray tracing in the ISA docs I have
13:14fdobridge: <karolherbst🐧🦀> but as far as stack management goes
13:14fdobridge: <karolherbst🐧🦀> there is no stack anymore
13:15fdobridge: <karolherbst🐧🦀> you even have to save the return address yourself and everything
13:15fdobridge: <mohamexiety> that's in general? or RT only? 😮
13:16fdobridge: <karolherbst🐧🦀> in general
13:16fdobridge: <karolherbst🐧🦀> they got rid of the stack in volta
13:17fdobridge: <mohamexiety> that's interesting
13:18fdobridge: <karolherbst🐧🦀> well.. the stack was a pain to deal with anyway
13:18fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> So is there only the heap and registers?
13:18fdobridge: <karolherbst🐧🦀> yeah, but you have special "barrier" registers for thread mask management and other wonky stuff
13:19fdobridge: <karolherbst🐧🦀> forcing the thread group to run in quad mode is also all explicit. You save the old mask, do your quad stuff and restore the mask
13:21fdobridge: <gfxstrand> "stack" was maybe the wrong term. What I mean is that I expect local mem to "just work". This is as opposed to Intel where you get a thread ID which you have to track and then spilling is all manual via global mem.
13:22fdobridge: <karolherbst🐧🦀> ahhh
13:22fdobridge: <gfxstrand> Sure, you might have to save and restore stuff but I expect you don't have to juggle thread IDs or anything dumb like that.
13:23fdobridge: <karolherbst🐧🦀> ahh right, yeah, that's probably not as painful on nvidia
13:24fdobridge: <karolherbst🐧🦀> some hardware is really wonky on that ID stuff.. even AMD is weird
13:24fdobridge: <karolherbst🐧🦀> when trying to implement non-uniform work groups, I got told that AMD hardware doesn't know how big their blocks are 🙃
13:24fdobridge: <karolherbst🐧🦀> why make everything so painful
13:25fdobridge: <karolherbst🐧🦀> nvidia has sys vals for a bunch of random stuff, launched threads, active threads, ID on all the levels, etc...
13:26fdobridge: <karolherbst🐧🦀> uhhhhhhhhhhh
13:26fdobridge: <karolherbst🐧🦀> trying to debug some stupid nv50 regression, and it's sending commands to the hardware which the gl driver 100% doesn't send 🙃 or maybe gdb is just being silly to me today
14:13fdobridge: <georgeouzou> what about the bvh building? Is this done with normal compute shaders?
14:42fdobridge: <karolherbst🐧🦀> wouldn't be surprising if that's done in compute
14:46fdobridge: <gfxstrand> That's done in compute shaders.
14:48fdobridge: <gfxstrand> is watching drm-misc and being disappointed. 🙃
14:52fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> i think there should be a proper error message when trying to run the new uAPI NVK on old uAPI kernel :nouveau:
14:53fdobridge: <esdrastarsis> Will nouveau gallium be dropped when Zink is working on NVK?
14:54fdobridge: <gfxstrand> Maybe? 🤷🏻♀️ That's all still TBD. We need to see how well it works, first. Nouveau gallium will be the plan for pre-kepler probably forever, though.
14:55fdobridge: <gfxstrand> I can try to cook one up. Unfortunately, that's actually kind of annoying to do properly due to how things are initialized and by whom and in what order.
14:56fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Inb4 Triang3l decides to learn old NVIDIA hardware and make Fermikan 😅
14:56fdobridge: <gfxstrand> Actually... I've got an idea for a plan.
14:57fdobridge: <gfxstrand> It'll just take a bit of re-arranging.
14:57fdobridge: <esdrastarsis> I see
15:00fdobridge: <gfxstrand> Zink is a paradox of sorts. Gallium is good enough at this point that, if you build your Vulkan driver right (which I have), then by the time you have a competent Vulkan feature set, you can build a gallium driver by reusing bits of the Vulkan driver for a fraction of the cost of building one from scratch. This means that, while Zink looks like it saves you a pile of work, it doesn't really.
15:00fdobridge: <karolherbst🐧🦀> yeah...
15:01fdobridge: <karolherbst🐧🦀> moving some of the state emission into common code shouldn't be all too painful (and use it from both even). The pain point with gl drivers is mostly that you are fighting with gallium, the CTS and the hardware at once 🙃
15:02fdobridge: <gfxstrand> Reusing state emission isn't important.
15:02fdobridge: <gfxstrand> Not as long as you do it reasonably competently.
15:03fdobridge: <karolherbst🐧🦀> yeah, fair, my point is rather, in vulkan you don't have to deal with an abstraction layer and can focus on the hardware bits, and once you understand the hardware, writing a gallium driver lets you focus on the gallium bits
15:03fdobridge: <gfxstrand> The problem with nouveau GL isn't that it's a gallium driver and not a Vulkan driver. It's that it's a crufty pile of 💩.
15:03fdobridge: <karolherbst🐧🦀> it's just kinda sad that people still start with GL these days
15:03fdobridge: <karolherbst🐧🦀> 😄
15:03fdobridge: <karolherbst🐧🦀> yeah, fair
15:03fdobridge: <gfxstrand> And I mean that in the best possible way. 🙃
15:03fdobridge: <karolherbst🐧🦀> I wonder how much of that is due to how gallium was 15 years ago
15:03fdobridge: <karolherbst🐧🦀> or well.. 10
15:03fdobridge: <karolherbst🐧🦀> or 15?
15:03fdobridge: <karolherbst🐧🦀> it's been long
15:04fdobridge: <gfxstrand> Probably quite a bit
15:04fdobridge: <karolherbst🐧🦀> and also understanding of the hardware wasn't the best
15:04fdobridge: <karolherbst🐧🦀> for fun reasons, nvc0 is kinda the main driver and nv50 was a copy of that
15:04fdobridge: <karolherbst🐧🦀> and nv50 is still terrible in a few places
15:04fdobridge: <gfxstrand> There are many ways for a code base to become 💩 that don't involve it being designed poorly at the start.
15:05fdobridge: <esdrastarsis> What's funny is that the gallium driver is faster than NVK for some reason, but I haven't tested it with the new uAPI
15:05fdobridge: <karolherbst🐧🦀> anyway.. a new driver would be better
15:05fdobridge: <karolherbst🐧🦀> the gallium driver also doesn't use global mem for ubos 😄
15:05fdobridge: <karolherbst🐧🦀> ubos are one reason why stuff is fast on nvidia hardware
15:05fdobridge: <karolherbst🐧🦀> ubos are... _fast_
15:05fdobridge: <gfxstrand> Most code bases tend towards 💩 over time unless great care and effort are put into preventing the 💩ification.
15:06fdobridge: <gfxstrand> Oh, that's because we don't use real uniforms yet.
15:06fdobridge: <gfxstrand> Jinx
15:06fdobridge: <karolherbst🐧🦀> yeah.. I'm sure using ubos will speed up things by x3 at least 😛
15:06fdobridge: <karolherbst🐧🦀> ohhh
15:06fdobridge: <karolherbst🐧🦀> you can even test it in gl
15:07fdobridge: <karolherbst🐧🦀> we have code to spill ubos, might want to spill all of them.. though it might be compute only stuff
15:07fdobridge: <karolherbst🐧🦀> mhhh
15:07fdobridge: <karolherbst🐧🦀> it's a bit weird, but compute only has 8 slots...
15:07fdobridge: <karolherbst🐧🦀> and two are used for uniforms and driver stuff
15:08fdobridge: <karolherbst🐧🦀> sooo.. advertising 8 isn't as trivial as one would hope
15:09fdobridge: <karolherbst🐧🦀> gl requires 8 per stage, no idea what vulkan requires, if it requires anything at all for ubos
15:09fdobridge: <gfxstrand> We'll get there in time. Annoyingly, though, proper UBO support is also blocked on NAK. Sure, I could do something that works with legacy bound UBOs and we probably should eventually for Pascal and earlier but we really want bindless and that's not something codegen is likely to ever support.
15:11fdobridge: <karolherbst🐧🦀> yeah.. it would be quite a mess to add in codegen
15:11fdobridge: <karolherbst🐧🦀> though...
15:11fdobridge: <karolherbst🐧🦀> shouldn't be too hard
15:11fdobridge: <karolherbst🐧🦀> just add a `.bindless` flag like we did for textures 🙃
15:11fdobridge: <karolherbst🐧🦀> on `Symbol`
15:12fdobridge: <karolherbst🐧🦀> and then the indirect isn't an indirect index, but the ubo address
15:12fdobridge: <karolherbst🐧🦀> at least I think that would be the least painful way inside codegen
15:12fdobridge: <gfxstrand> You need uniform register support
15:12fdobridge: <karolherbst🐧🦀> mhhhhhhhhh....
15:12fdobridge: <karolherbst🐧🦀> I forgot about that
15:13fdobridge: <gfxstrand> Yeah...
15:13fdobridge: <gfxstrand> That's why NAK doesn't have them yet.
15:13fdobridge: <gfxstrand> I need to get spilling sorted first then UGPRs then bindless UBOs.
15:13fdobridge: <karolherbst🐧🦀> though I don't think bindless ubos are special compared to bound ones, I think the initial data uploading just happens later. I actually wonder if bindless ubos share the remaining free slots and what happens if you need more than that
15:14fdobridge: <gfxstrand> AFAICT, bindless UBOs are pure magic
15:14fdobridge: <karolherbst🐧🦀> they might also just have a page cache and unload accessed pages
15:14fdobridge: <karolherbst🐧🦀> *upload
15:14fdobridge: <gfxstrand> I think so
15:14fdobridge: <karolherbst🐧🦀> should be easy to figure out
15:15fdobridge: <karolherbst🐧🦀> just need some proper profiling
15:15fdobridge: <gfxstrand> I don't think there's any state associated with them. That's why they're bindless. 🙃
15:15fdobridge: <karolherbst🐧🦀> though uploading the entire ubo would be simple
15:15fdobridge: <karolherbst🐧🦀> you have the size in the address
15:15fdobridge: <karolherbst🐧🦀> and the hardware could just map it
15:16fdobridge: <karolherbst🐧🦀> and they just use unused ubo slots
15:16fdobridge: <karolherbst🐧🦀> "lazy binding" might be a proper term here? 😄
15:17fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Just like cats?
15:17fdobridge: <karolherbst🐧🦀> faster
15:17fdobridge: <karolherbst🐧🦀> ubos on nvidia have gpr access speed
15:17fdobridge: <esdrastarsis> gpr?
15:17fdobridge: <karolherbst🐧🦀> registers
15:17fdobridge: <karolherbst🐧🦀> using ubos can even be faster than registers
15:18fdobridge: <karolherbst🐧🦀> for instance, if an instruction can't take a full 32 bit immediate value, you'd have to move it into a register first
15:18fdobridge: <karolherbst🐧🦀> with an ubo, you can just extract such constants and have a constant ubo you use instead
15:18fdobridge: <karolherbst🐧🦀> so you even kill the mov and are faster
15:18fdobridge: <karolherbst🐧🦀> (and you might use less registers)
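A toy sketch of the constant-promotion idea described above (not how codegen or NAK actually implement it): immediates too large for an instruction's encoding get deduplicated into a driver-owned constant buffer and the source operand is rewritten to a cbuf access, so the extra mov into a register disappears. The structure names, the 20-bit cutoff, and the pool layout are all made up for illustration.
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum src_kind { SRC_REG, SRC_IMM, SRC_CBUF };

struct src {
   enum src_kind kind;
   uint32_t value;         /* immediate value, register index, or cbuf byte offset */
};

struct instr {
   const char *op;
   bool has_short_imm;     /* instruction can encode a small immediate directly */
   struct src src;
};

#define MAX_POOL_CONSTS 64

static uint32_t const_pool[MAX_POOL_CONSTS];   /* uploaded to a cbuf slot before the draw */
static unsigned const_count;

/* Return the byte offset of `v` in the constant pool, adding it if it's new. */
static uint32_t
pool_offset(uint32_t v)
{
   for (unsigned i = 0; i < const_count; i++) {
      if (const_pool[i] == v)
         return i * 4;
   }
   assert(const_count < MAX_POOL_CONSTS);
   const_pool[const_count] = v;
   return const_count++ * 4;
}

static void
promote_immediates(struct instr *instrs, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      struct instr *ip = &instrs[i];
      if (ip->src.kind != SRC_IMM)
         continue;
      /* Small immediates that fit the encoding stay where they are. */
      if (ip->has_short_imm && ip->src.value <= 0xfffff)
         continue;
      /* Otherwise read the value from the constant buffer instead of
       * emitting a separate mov into a register. */
      ip->src.value = pool_offset(ip->src.value);
      ip->src.kind = SRC_CBUF;
   }
}

int
main(void)
{
   struct instr instrs[] = {
      { "fmul", false, { SRC_IMM, 0x3f800000 } },   /* 1.0f, too big to inline */
      { "iadd", true,  { SRC_IMM, 16 } },           /* small, stays inline */
   };
   promote_immediates(instrs, 2);
   for (unsigned i = 0; i < 2; i++) {
      printf("%s %s 0x%x\n", instrs[i].op,
             instrs[i].src.kind == SRC_CBUF ? "c0[]" : "imm",
             instrs[i].src.value);
   }
   return 0;
}
```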
15:18fdobridge: <georgeouzou> Are they using the same memory as shared compute memory?
15:19fdobridge: <karolherbst🐧🦀> no
15:19fdobridge: <karolherbst🐧🦀> shared memory is L2 cache
15:19fdobridge: <karolherbst🐧🦀> and slower
15:19fdobridge: <georgeouzou> Oh this is even faster then
15:19fdobridge: <karolherbst🐧🦀> well.. as I said: UBOs are as fast as registers
15:19fdobridge: <karolherbst🐧🦀> they have blocks for ubos and you prefill them with data before draw
15:20fdobridge: <karolherbst🐧🦀> sure, you still have the memory access, but you only have to do it once
15:20fdobridge: <georgeouzou> If the data in the ubo is large they only load the data that is used by the shader?
15:20fdobridge: <karolherbst🐧🦀> no, you upload the entire thing
15:20fdobridge: <georgeouzou> Because I think they have like a 64kb limit
15:20fdobridge: <karolherbst🐧🦀> yeah
15:21fdobridge: <karolherbst🐧🦀> it's 64k
15:21fdobridge: <karolherbst🐧🦀> and you have 18 slots on modern hardware
15:21fdobridge: <karolherbst🐧🦀> you also have a register file of 64k 🙃
15:21dakr: airlied, gfxstrand: Just read your messages, I will fix up the SPDX and copyright stuff. I can also apply the series together with your uAPI fixup patch to drm-misc-next. However, I wonder if we want this to go through drm-misc?
15:21fdobridge: <karolherbst🐧🦀> ohh wait.. the register file is 256k
15:21fdobridge: <karolherbst🐧🦀> 64k * 4
15:23fdobridge: <georgeouzou> If one has multiple ubos that all together are above 64k? Some spill to global memory, right?
15:23fdobridge: <georgeouzou> I mean bound to the 18 slots
15:23fdobridge: <karolherbst🐧🦀> all 18 exist in hardware, yes
15:24fdobridge: <karolherbst🐧🦀> compute only has 8 for unknown reasons
15:24fdobridge: <georgeouzou> Hmm
15:26fdobridge: <karolherbst🐧🦀> sadly none of the nvidia whitepapers talk in detail about bindless ubos 😢
15:27fdobridge: <karolherbst🐧🦀> but uhm... I got told that one of my guesses isn't wrong
15:42fdobridge: <karolherbst🐧🦀> @mhenning if you got some time, mind reviewing https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23724 ?
15:48fdobridge: <karolherbst🐧🦀> pain... ubuntu "LTS" 22.04 updates from LTS 5.15 to EoL 6.2... :painpeko:
15:54gfxstrand: dakr: Where else would it go?
15:54gfxstrand: dakr: Genuine question. I don't actually know. (-:
15:56gfxstrand: I just believe airlied when he says branch names. 😂
16:09dakr: Dunno, the MAINTAINERS file mentions https://gitlab.freedesktop.org/drm/nouveau.git, sometimes Dave also pulls directly from Ben's tree...
16:11gfxstrand: I don't know either and Dave won't be around for a while (if at all today) :-/
16:11dakr: I mean, he clearly stated drm-misc above.. :D
16:11gfxstrand: :D
16:15karolherbst: yeah.. whatever. drm-misc or the nouveau tree or something. We can use the nouveau tree, but then it needs to be a proper pull request and uhhh.. it's kinda annoying. The hope was to use gitlab with CI and everything, but generally it's more work than just pushing through drm-misc directly
16:27dakr: So drm-misc it is. :)
16:28karolherbst: yep
17:10fdobridge: <gfxstrand> It's kinda horrible and I kinda hate it but I think this works: https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/253
17:47fdobridge: <karolherbst🐧🦀> yeah, I think that's the proper way of doing so
17:47fdobridge: <karolherbst🐧🦀> those version numbers are always kinda wonky
17:50fdobridge: <gfxstrand> Yeah, checking if `VM_INIT` succeeds seems like it should be reliable.
17:50fdobridge: <gfxstrand> I just don't like how intertwined everything is inside the winsys but part of that is because it's all intertwined inside the kernel APIs.
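For reference, a minimal sketch of the kind of probe being described, assuming the `DRM_IOCTL_NOUVEAU_VM_INIT` ioctl and `struct drm_nouveau_vm_init` from the new uAPI header; the include path and the address-space split below are illustrative, not what NVK actually uses.
```c
#include <stdbool.h>
#include <xf86drm.h>
#include "nouveau_drm.h"   /* the new-uAPI header; exact path depends on the tree */

static bool
kernel_has_new_uapi(int drm_fd)
{
   struct drm_nouveau_vm_init init = {
      .kernel_managed_addr = 0,
      .kernel_managed_size = 1ull << 32,   /* example split only */
   };

   /* Old kernels don't know this ioctl at all, so the call fails there and
    * the driver can report an informative "kernel too old" error instead of
    * silently misbehaving. */
   return drmIoctl(drm_fd, DRM_IOCTL_NOUVEAU_VM_INIT, &init) == 0;
}
```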
17:51fdobridge: <karolherbst🐧🦀> yeah...
17:52fdobridge: <gfxstrand> One day, I may decide I care enough to detangle it all
17:52fdobridge: <gfxstrand> Today is not that day
17:52fdobridge: <karolherbst🐧🦀> mhhh... on the kernel side?
17:52fdobridge: <gfxstrand> userspace could be less tangled, too.
17:52fdobridge: <karolherbst🐧🦀> where specifically though?
17:52fdobridge: <gfxstrand> I started trying to, but bo_create was being more magical than needed and that was going to make it a pain
17:53fdobridge: <karolherbst🐧🦀> ahh
17:53fdobridge: <gfxstrand> What we really want is a `nv_device_info_init(drmDevicePtr device, struct nv_device_info *info);` and then drop the info from `nouveau_device`. It becomes a VM manager and BO cache and not much else.
17:54fdobridge: <karolherbst🐧🦀> yeah.. that's probably a good idea
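A rough sketch of the split being described: a plain info struct filled straight from the drmDevice, with `nouveau_device` left to manage VMs and BOs. The field list and the body below are illustrative only, not the real `nv_device_info`.
```c
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>

struct nv_device_info {
   uint16_t pci_device_id;   /* illustrative fields only */
   uint64_t vram_size_B;
   uint32_t sm_count;
};

/* Fill `info` purely from the DRM device, with no dependency on the winsys
 * object, so probing can happen before (or without) creating one. */
static int
nv_device_info_init(drmDevicePtr device, struct nv_device_info *info)
{
   memset(info, 0, sizeof(*info));
   if (device->bustype != DRM_BUS_PCI)
      return -1;
   /* Real code would also query the kernel for chipset, VRAM size, etc.;
    * only the PCI ID is filled in here to keep the sketch self-contained. */
   info->pci_device_id = device->deviceinfo.pci->device_id;
   return 0;
}
```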
17:54fdobridge: <gfxstrand> It can all be done and I've got a rough plan. I just hit `-ENOSPOONS`
18:02fdobridge: <gfxstrand> Seems to work. I'm gonna merge it.
18:05fdobridge: <mohamexiety> did something change with how we handle initialization in a recent commit? pulled latest `nvk/main` and now I can't run
18:05fdobridge: <mohamexiety>
18:05fdobridge: <mohamexiety> ```
18:06fdobridge: <mohamexiety> [mesa] [mohamed@fedora build]$ export NVK_I_WANT_A_BROKEN_VULKAN_DRIVER=1
18:06fdobridge: <mohamexiety> [mesa] [mohamed@fedora build]$ vulkaninfo
18:06fdobridge: <mohamexiety> WARNING: [Loader Message] Code 0 : loader_scanned_icd_add: Driver /home/mohamed/dev/mesa-dev/mesa/build/src/nouveau/vulkan/libvulkan_nouveau.so supports Vulkan 1.3, but only supports loader interface version 4. Interface version 5 or newer required to support this version of Vulkan (Policy #LDP_DRIVER_7)
18:06fdobridge: <mohamexiety> MESA: error: ../src/nouveau/vulkan/nvk_physical_device.c:679: VK_ERROR_INCOMPATIBLE_DRIVER
18:06fdobridge: <mohamexiety> ERROR: [../src/nouveau/vulkan/nvk_physical_device.c:679] Code 0 : VK_ERROR_INCOMPATIBLE_DRIVER
18:06fdobridge: <mohamexiety> ERROR: [Loader Message] Code 0 : setup_loader_term_phys_devs: Failed to detect any valid GPUs in the current config
18:06fdobridge: <mohamexiety> ERROR at /builddir/build/BUILD/Vulkan-Tools-sdk-1.3.216.0/vulkaninfo/vulkaninfo.h:231:vkEnumeratePhysicalDevices failed with ERROR_INITIALIZATION_FAILED
18:06fdobridge: <mohamexiety> ```
18:06fdobridge: <karolherbst🐧🦀> yeah, you'll need linux 6.6
18:06fdobridge: <mohamexiety> ugh, now this is risky
18:07fdobridge: <mohamexiety> I thought 6.6 was needed for Turing+ only
18:08fdobridge: <karolherbst🐧🦀> nope, for everything
18:09fdobridge: <mohamexiety> I see.
18:09fdobridge: <mohamexiety> the main issue I'm worried about with upgrading is the weird firmware stuff I ran into earlier with the 3080 making me unable to boot again. currently, this system outright doesn't recognize/init the 3080 so all is fine
18:09fdobridge: <mohamexiety> buuut I guess I can try
18:09fdobridge: <karolherbst🐧🦀> don't you have grub or something?
18:09fdobridge: <mohamexiety> yes
18:09fdobridge: <karolherbst🐧🦀> but yeah.. we might want to figure out what's the problem with your 3080
18:10fdobridge: <karolherbst🐧🦀> _but_ it should be alright if your mesa is new enough, because I think you just ran into a userspace bug with gnome, no?
18:11fdobridge: <mohamexiety> the userspace bug was due to firmware init failing somewhere during boot
18:11fdobridge: <karolherbst🐧🦀> ohhh...
18:11fdobridge: <karolherbst🐧🦀> right...
18:11fdobridge: <karolherbst🐧🦀> do you have a log from that?
18:11fdobridge: <gfxstrand> You can build with `-Dnvk-legacy-uapi=true` for now
18:11fdobridge: <karolherbst🐧🦀> I think if we forward that to nvidia it might help figuring it out
18:12fdobridge: <gfxstrand> We'll probably rip out the legacy UAPI code before the next Mesa release gets branched but having it in there for a short period makes development a tiny bit easier.
18:20fdobridge: <mohamexiety> that worked, thanks!
18:22fdobridge: <karolherbst🐧🦀> system values are now TGSI free in codegen :3
18:22fdobridge: <karolherbst🐧🦀> well.. MR is assigned to marge, but it might make it easier to implement certain things
18:37fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> That kernel version doesn't exist right now 🫃
18:39dakr: gfxstrand, airlied: applied the series to drm-misc-next.
18:52gfxstrand: dakr: \o/
18:52gfxstrand: Building now....
19:02fdobridge: <gfxstrand> @karolherbst Can I please get an RB for the "Import drm_nouveau.h" patch in the NVK MR?
19:02fdobridge: <gfxstrand> I'd like that one to have a proper RB tag
19:07fdobridge: <karolherbst🐧🦀> does that commit have a different name, because I can't find it
19:10fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24326/diffs?commit_id=2ee043a82bf690a0dc0b6aca6de07bee578ad6f0
19:11fdobridge: <mhenning> Are we certain that the new uapi will land in kernel 6.6? If there's any chance it might slip it might make sense to make the error message a more generic "kernel version too old" instead
19:11fdobridge: <gfxstrand> The 6.5 window only recently closed. We'll hit 6.6.
19:11fdobridge: <gfxstrand> And if we don't, we can update the message and back-port the fix.
19:15fdobridge: <karolherbst🐧🦀> done
19:17fdobridge: <karolherbst🐧🦀> we really have to fix gpu recovery 😢 but it's such a painful thing to debug
19:18fdobridge: <gfxstrand> Pardon me while I assign this little MR to marge....
19:19fdobridge: <karolherbst🐧🦀> mhhh... gitlab needs gifs or something
19:19fdobridge: <gfxstrand> Ugh... That sysval MR is going to conflict silently.
19:19fdobridge: <gfxstrand> Can we delay it?
19:19fdobridge: <karolherbst🐧🦀> uhh...
19:19fdobridge: <karolherbst🐧🦀> rebase on top of it?
19:19fdobridge: <gfxstrand> It's going to be less painful to rebase sysvals
19:20fdobridge: <karolherbst🐧🦀> okay
19:20fdobridge: <gfxstrand> I'll do the rebase.
19:20fdobridge: <karolherbst🐧🦀> anyway, unassigned from marge
19:21fdobridge: <karolherbst🐧🦀> though the big question is if CI will agree with our plans or not 🙂
19:37fdobridge: <gfxstrand> I really need to denylist some of these synchronization tests....
19:38fdobridge: <gfxstrand> Also, I'm getting a lot of crashes in this run and I don't know why...
20:00fdobridge: <airlied> I hope someone is babysitting Marge :-p
20:11fdobridge: <karolherbst🐧🦀> it's probably fine
20:19fdobridge: <gfxstrand> And... marge fail.
20:19fdobridge: <gfxstrand> Totally my fault
20:20fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Which revision of this patch should I use?: https://patchwork.freedesktop.org/series/112994 (revision 11 and 12 look weird to me)
20:21fdobridge: <airlied> I suppose I should write a kernel blog
20:22fdobridge: <gfxstrand> Just grab drm-misc
20:23fdobridge: <airlied> for raytracing, you could get a simple demo and use my nvidia cmd tracer to work out some bits I suppose
20:24fdobridge: <karolherbst🐧🦀> that moment you rebase 16000 commits
20:33fdobridge: <airlied> I think we've entered the dreaded wait for marge to timeout phase
20:34fdobridge: <karolherbst🐧🦀> you still have 1.5 hours until it's Saturday for me, don't disappoint me, you said it's going to land today 😛
20:37fdobridge: <gfxstrand> Second build failure was @airlied 😛
20:41fdobridge: <mohamexiety> this is very exciting to watch honestly.. I can't wait!
20:43fdobridge: <gfxstrand> Third build failure is people adding too many Mesa build flags. 🙄
20:44fdobridge: <karolherbst🐧🦀> just remove them all
20:45fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> This exact error struck again :nope_gears:
20:48fdobridge: <prop_energy_ball> I mean, that could be removed, it'd just make my life more painful :-)
20:48fdobridge: <prop_energy_ball> Does NVK really not support hvv rn?
20:51fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> NVK has 2 heaps right now: a device-local heap (for the VRAM) and a host-visible/coherent heap (for the system RAM)
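For context on what gamescope is looking for here, this is plain Vulkan API usage rather than anything NVK-specific: scan the advertised memory types for one that is both DEVICE_LOCAL and HOST_VISIBLE. With only the two heaps described above, the search comes up empty and the caller has to fall back to system RAM.
```c
#include <vulkan/vulkan.h>

static int32_t
find_host_visible_vram_type(VkPhysicalDevice pdev)
{
   VkPhysicalDeviceMemoryProperties mem;
   vkGetPhysicalDeviceMemoryProperties(pdev, &mem);

   const VkMemoryPropertyFlags wanted =
      VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;

   for (uint32_t i = 0; i < mem.memoryTypeCount; i++) {
      if ((mem.memoryTypes[i].propertyFlags & wanted) == wanted)
         return (int32_t)i;   /* index of a host-visible VRAM memory type */
   }
   return -1;                 /* not advertised; use a system-RAM type instead */
}
```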
21:06fdobridge: <airlied> okay I wrote a blogpost, now to hold off
21:06fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Where can I find it?
21:06airlied: dakr: you might need to build with the nouveau svm option enabled, and send a follow-up patch if it still breaks
21:10fdobridge: <airlied> I can't publish it until marge succeeds 🙂
21:15fdobridge: <gfxstrand> All the build tests pass. This time for sure!
21:17fdobridge: <karolherbst🐧🦀> it doesn't build for me with the legacy uapi enabled
21:19fdobridge: <gfxstrand> Then don't enable legacy uapi. 😛
21:20fdobridge: <karolherbst🐧🦀> don't ship with that option then 😛
21:20fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> When was the legacy uAPI created?
21:20fdobridge: <karolherbst🐧🦀> it was before history was recorded
21:23fdobridge: <gfxstrand> Fixed. Sadly, I lost a race with marge so we'll see if that screwed anything up
21:24fdobridge: <karolherbst🐧🦀> it probably does
21:24fdobridge: <karolherbst🐧🦀> still have 36 minutes tho
21:27fdobridge: <airlied> yeah might be another 1hr timeout wait
21:28fdobridge: <karolherbst🐧🦀> 🥲
21:31fdobridge: <gfxstrand> I figured out how to cancel the pipeline so marge would kick it back to me.
21:31fdobridge: <gfxstrand> Kicked back to marge
21:49fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> 🇬🇧
21:49fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1137140354689277992/Screenshot_20230805_004629.png
21:55fdobridge: <gfxstrand> Ah, the terror of trying to land a giant MR which touches all the CI even though it affects none of it. 😅
22:02fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Have you encountered hanging issues with the new uAPI? 🐸
22:03fdobridge: <gfxstrand> Things have seemed a little less stable but not bad.
22:04fdobridge: <gfxstrand> They were more stable on the branch I was running than on drm-next. IDK why.
22:05fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> vkcube (X11) works fine until you try to close it but vkcube-wayland freezes on the first frame (push_sync has no effect and dmesg is too silent)
22:06fdobridge: <gfxstrand> I wouldn't be surprised if there are some VK <-> GL interop issues.
22:06fdobridge: <gfxstrand> Feel free to file the first NVK bug. 🙂
22:09fdobridge: <esdrastarsis> merged 🥳
22:10fdobridge: <gfxstrand> 🎉
22:11fdobridge: <georgeouzou> 🎉 nice!
22:12fdobridge: <karolherbst🐧🦀> 🎉
22:13fdobridge: <karolherbst🐧🦀> phase of the moon
22:16fdobridge: <gfxstrand> Ugh... It's my wifi driver. 🙃
22:16fdobridge: <gfxstrand> Gotta love early rc kernels...
22:17fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> That's why I'm building the stable kernel instead 🐸
22:17fdobridge: <gfxstrand> I should just denylist iwlwifi
22:19fdobridge: <gfxstrand> It's not like that machine will ever use wifi
22:24fdobridge: <phomes> 🥳
22:28fdobridge: <mhenning> 🎊
22:34fdobridge: <gfxstrand> https://www.collabora.com/news-and-blog/news-and-events/nvk-has-landed.html
22:37fdobridge: <karolherbst🐧🦀> @gfxstrand do you wanna test https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24447 or should I just assign?
22:37fdobridge: <karolherbst🐧🦀> though it shouldn't change anything... or is nvk doing weirdo things on that side?
22:37fdobridge: <karolherbst🐧🦀> should probably check
22:38fdobridge: <gfxstrand> I'm gonna test
22:38fdobridge: <gfxstrand> Give me a minute. Need to disable the wifi driver first. 😂
22:38fdobridge: <karolherbst🐧🦀> ohh.. nvk uses `nvc0_program_assign_varying_slots` mhh...
22:38fdobridge: <karolherbst🐧🦀> mhhhh
22:38fdobridge: <karolherbst🐧🦀> it's copied code
22:38fdobridge: <karolherbst🐧🦀> yeah.. `nvk_vtgp_gen_header` needs updating
22:38fdobridge: <karolherbst🐧🦀> lemme
22:40fdobridge: <gfxstrand> I didn't see that updated in nvc0_program.c
22:40fdobridge: <karolherbst🐧🦀> now it should be fine
22:41fdobridge: <karolherbst🐧🦀> probably would have failed to compile as I typed the value now
22:42fdobridge: <karolherbst🐧🦀> not sure I'm gonna find the spoons to rework the input/output stuff, but it shouldn't be _that_ terrible either
22:43fdobridge: <karolherbst🐧🦀> I should do a `s/PIPE_SHADER/MESA_SHADER/` commit 🙂
22:44fdobridge: <karolherbst🐧🦀> mhh there is no `PIPE_SHADER_TYPES` equivalent
22:46fdobridge: <karolherbst🐧🦀> but it's also used in a weirdo way..
22:46fdobridge: <karolherbst🐧🦀> ehh.. I'll deal with it later
22:52fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I think I may have missed some patches (I now rebuilt the kernel with pretty much all the required patches)
23:05fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> And that was it 🐸
23:05fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://cdn.discordapp.com/attachments/1034184951790305330/1137159317758431272/Screenshot_20230805_020420.png
23:09fdobridge: <gfxstrand> 🥳
23:10fdobridge: <karolherbst🐧🦀> wasn't there this thing to promote bindless ubos to bound ones, or was that quite the pita to actually use in vulkan?
23:11fdobridge: <gfxstrand> Not something easy and automatic to hook up.
23:11fdobridge: <gfxstrand> Some drivers do. ANV does.
23:11fdobridge: <karolherbst🐧🦀> Right..
23:14fdobridge: <karolherbst🐧🦀> might make sense to extract some of it to common code. I might even look into it, because we'll need it anyway for older gens
23:16fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Overwatch 2 still crashes with the new uAPI 🤔
23:17fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://pastebin.com/xQKFMFvW
23:19fdobridge: <karolherbst🐧🦀> uhhh
23:20fdobridge: <karolherbst🐧🦀> I think you're running into push buffer corruption 😄
23:20fdobridge: <karolherbst🐧🦀> which... shouldn't happen
23:20fdobridge: <karolherbst🐧🦀> `c597 mthd 05b0 data 20010703` that's totally bogus
23:21fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> That date is older than me 🧓
23:22fdobridge: <karolherbst🐧🦀> @gfxstrand nvk should probably start calling `nv_push_validate` again
23:24fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> @esdrastarsis Is your GPU temperature visible with GSP? And what kernel branch do you use for GSP support now?
23:25fdobridge: <esdrastarsis> No, latest Ben's kernel tree
23:25fdobridge: <esdrastarsis> Nothing related to sensors is visible with GSP
23:32fdobridge: <karolherbst🐧🦀> @asdqueerfromeu mind running with this patch and check if it asserts?
23:32fdobridge: <karolherbst🐧🦀>
23:32fdobridge: <karolherbst🐧🦀> ```patch
23:32fdobridge: <karolherbst🐧🦀> diff --git a/src/nouveau/vulkan/nvk_queue_drm_nouveau.c b/src/nouveau/vulkan/nvk_queue_drm_nouveau.c
23:32fdobridge: <karolherbst🐧🦀> index b387530d90b..1ab4cf6a375 100644
23:32fdobridge: <karolherbst🐧🦀> --- a/src/nouveau/vulkan/nvk_queue_drm_nouveau.c
23:32fdobridge: <karolherbst🐧🦀> +++ b/src/nouveau/vulkan/nvk_queue_drm_nouveau.c
23:32fdobridge: <karolherbst🐧🦀> @@ -477,6 +477,7 @@ nvk_queue_submit_drm_nouveau(struct nvk_queue *queue,
23:32fdobridge: <karolherbst🐧🦀> for (unsigned i = 0; i < submit->command_buffer_count; i++) {
23:32fdobridge: <karolherbst🐧🦀> struct nvk_cmd_buffer *cmd =
23:32fdobridge: <karolherbst🐧🦀> container_of(submit->command_buffers[i], struct nvk_cmd_buffer, vk);
23:32fdobridge: <karolherbst🐧🦀> + nv_push_validate(&cmd->push);
23:32fdobridge: <karolherbst🐧🦀>
23:32fdobridge: <karolherbst🐧🦀> list_for_each_entry_safe(struct nvk_cmd_bo, bo, &cmd->bos, link)
23:32fdobridge: <karolherbst🐧🦀> push_add_bo(&pb, bo->bo, NOUVEAU_WS_BO_RD);
23:32fdobridge: <karolherbst🐧🦀> ```
23:32fdobridge: <karolherbst🐧🦀> need asserts enabled and stuff
23:33fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Also I just noticed that the DXVK frametime graph is now rendering correctly on NVK (so someone definitely did some codegen improvements) 🐸
23:34fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Rebuilding NVK now
23:34fdobridge: <karolherbst🐧🦀> I hope you have `-Ddebug=true` set or something
23:34fdobridge: <karolherbst🐧🦀> ehhh
23:35fdobridge: <karolherbst🐧🦀> `-Db_ndebug=false`
23:35fdobridge: <karolherbst🐧🦀> why have 1 flag if you can have 100
23:35fdobridge: <karolherbst🐧🦀> in theory that change _should_ prevent any garbage to be sent to the GPU 😄
23:36fdobridge: <karolherbst🐧🦀> and I think we want that for CI runs anyway, so it's less unstable
23:36fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> `exe: ../mesa/src/nouveau/nvidia-headers/nv_push.c:28: nv_push_validate: Assertion 'push->end != push->start' failed.` 🤔
23:36fdobridge: <karolherbst🐧🦀> sounds like it's working
23:36fdobridge: <karolherbst🐧🦀> uhhh.. or maybe not
23:36fdobridge: <karolherbst🐧🦀> ehh
23:37fdobridge: <karolherbst🐧🦀> you might want to remove _that_ assert
23:37fdobridge: <karolherbst🐧🦀> I think we actually submit empty stuff
23:37fdobridge: <karolherbst🐧🦀> 😄
23:37fdobridge: <karolherbst🐧🦀> yeah, so just delete that ` /* submitting empty push buffers is probably a bug */ assert(push->end != push->start);` part
23:38fdobridge: <karolherbst🐧🦀> I wonder if I should turn it into a _full_ validator
23:39fdobridge: <karolherbst🐧🦀> that would be a _fun_ project
23:43fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I didn't hit the assert this time (I got a DEVICE_LOST though)
23:44fdobridge: <karolherbst🐧🦀> mhh
23:44fdobridge: <karolherbst🐧🦀> could also be some random memory corruption
23:45fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I do have to turn off the 1 << 11 assert for the game to load its characters though
23:46fdobridge: <karolherbst🐧🦀> mhhh?
23:49fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Ekstrand even had to do this: https://gitlab.freedesktop.org/gfxstrand/mesa/-/commit/76067b14d786b62ec003cfd87c18d25f7e9570db
23:50fdobridge: <karolherbst🐧🦀> mhhh
23:50fdobridge: <karolherbst🐧🦀> yeah, well.. that's fine
23:50fdobridge: <karolherbst🐧🦀> or at least shouldn't cause corrupted commands
23:51fdobridge: <karolherbst🐧🦀> mhhhhh
23:51fdobridge: <karolherbst🐧🦀> I think I know what happens
23:54fdobridge: <karolherbst🐧🦀> mind ditching the patch, running with `NVK_DEBUG=push_dump`, and uploading that entire log?
23:54fdobridge: <karolherbst🐧🦀> it will be _huge_ probably
23:55fdobridge: <karolherbst🐧🦀> ohh
23:55fdobridge: <karolherbst🐧🦀> forget what I said
23:55fdobridge: <karolherbst🐧🦀> run with `NVK_DEBUG=push_sync`
23:55fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> With the nv_push_validate dropped?
23:56fdobridge: <karolherbst🐧🦀> yeah
23:56fdobridge: <karolherbst🐧🦀> push_sync should dump the last failed submission
23:59fdobridge: <karolherbst🐧🦀> I'm sure nvk somewhere submits an empty command, which shouldn't happen, probably some for loop being wrong 🙂