IRC Logs of #nouveau on irc.freenode.net for 2023-07-26

00:21 fdobridge: <karolherbst🐧🦀> I think I'll probably try to figure out this null pointer access tomorrow on pre turing.. should speed up the CTS quite a lot
00:21 fdobridge: <gfxstrand> @karolherbst Do we need to get the helpers patch landed in the kernel before we can land NVK?
00:22 fdobridge: <karolherbst🐧🦀> not necessarily. Ben said we should do it in userspace for 2nd gen maxwell+, so I think NVK still has to call the macro there. It's the same code.
00:22 fdobridge: <karolherbst🐧🦀> for older gens I should ping ben again 😄
00:22 fdobridge: <karolherbst🐧🦀> but the kernel patch will land with a Cc: stable, so whatever
00:22 fdobridge: <gfxstrand> Do we need anything in the kernel to enable userspace?
00:22 fdobridge: <karolherbst🐧🦀> no
00:23 fdobridge: <gfxstrand> Oh. Okay then.
00:23 fdobridge: <gfxstrand> 😄
00:23 fdobridge: <karolherbst🐧🦀> just enable the macro on `chipset >= 0x120`
00:23 fdobridge: <karolherbst🐧🦀> or whatever the check is
00:23 fdobridge: <gfxstrand> So why have I been cherry-picking this kernel patch for months? 😂
00:23 fdobridge: <karolherbst🐧🦀> because it's needed on older gens
00:23 fdobridge: <karolherbst🐧🦀> that reminds me.. I wasn't using that patch myself 🙃
00:23 fdobridge: <karolherbst🐧🦀> should have enabled the macro....
00:24 fdobridge: <karolherbst🐧🦀> let me post an MR
00:24 fdobridge: <karolherbst🐧🦀> (and test it)
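The chipset gate described above could look roughly like the sketch below. This is only an illustration: the helper name `userspace_runs_macro` is hypothetical and not taken from NVK, and the `0x120` threshold is the value quoted in the chat ("or whatever the check is"), not a verified constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the actual NVK code.
 *
 * Per the discussion: on 2nd gen Maxwell and newer, userspace is expected
 * to run the macro itself; older generations (Kepler, 1st gen Maxwell)
 * depend on the kernel patch instead. 0x120 is the threshold mentioned in
 * the chat and is treated as an assumption here. */
static bool
userspace_runs_macro(uint16_t chipset)
{
   return chipset >= 0x120;
}
```

A caller would check this once at device setup and only emit the macro call when it returns true.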
00:25 fdobridge: <gfxstrand> Ugh... d32s8 broke... 🙄
00:25 fdobridge: <karolherbst🐧🦀> I hope it wasn't me 🙃
00:25 fdobridge: <karolherbst🐧🦀> and if it was, I disagree
00:27 fdobridge: <gfxstrand> I think it was me
00:31 fdobridge: <karolherbst🐧🦀> @gfxstrand https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/241
00:31 fdobridge: <karolherbst🐧🦀> for kepler and 1st gen maxwell we'll need the kernel patch though
00:32 fdobridge: <karolherbst🐧🦀> oh wow.. with that and my other MR it fixes a bunch of stuff
00:33 fdobridge: <karolherbst🐧🦀> `Pass: 2172, Crash: 2, UnexpectedPass: 70, ExpectedFail: 134, Skip: 10622, Duration: 2:24, Remaining: 6:10:26`
00:38 fdobridge: <karolherbst🐧🦀> I pinged Ben on the kernel patch
00:52 fdobridge: <karolherbst🐧🦀> mhhh `dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.volatile.storage_image.fmt_qual.null_descriptor.samples_1.1d.comp` is hitting that null pointer thing...
00:52 fdobridge: <karolherbst🐧🦀> I wonder why 🤔
00:55 fdobridge: <gfxstrand> Actually, a combination of @airlied , @phomes , and me. 😅
00:55 fdobridge: <karolherbst🐧🦀> mhh.. that test passes on turing
00:55 fdobridge: <gfxstrand> It's possible that null image descriptors don't work on some hardware
00:55 fdobridge: <karolherbst🐧🦀> yeah...
00:56 fdobridge: <karolherbst🐧🦀> apparently 🙂
00:57 fdobridge: <karolherbst🐧🦀> are null descriptors something drivers have to handle?
00:58 fdobridge: <karolherbst🐧🦀> I wonder if we would have to insert a null check or if there is a different way out for us
00:59 fdobridge: <karolherbst🐧🦀> ehh.. wait
00:59 fdobridge: <karolherbst🐧🦀> I think there was something dumb we have to do...
01:02 fdobridge: <karolherbst🐧🦀> ohh looks like this is an option thing at least
01:05 fdobridge: <karolherbst🐧🦀> mhhhh
01:06 fdobridge: <karolherbst🐧🦀> Nvidia seems to enable that on like all GPUs
01:10 fdobridge: <karolherbst🐧🦀> I wonder if we have to bind a fake image view
01:10 fdobridge: <karolherbst🐧🦀> the hardware can do bound checks, so if you have a NULL image and access it out of bounds, nothing bad would happen
01:33 fdobridge: <airlied> I have a weird failure mode where the machine my ampere card is in, sometimes when it dies, it takes out my crappy gigabit network switch and I have to power-cycle it
01:33 fdobridge: <karolherbst🐧🦀> nouveau is quite good at taking down the entire system
03:00 fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24326
03:01 fdobridge: <gfxstrand> Depends on the HW. Turing lets us just fill the descriptor with 0 and we get null surface behavior. IDK if there's something else we have to do on Maxwell or not.
03:35 fdobridge: <gfxstrand> @airlied Out of curiosity, why'd you pull just the new stuff from `nouveau.h` into the `drm-uapi` folder and not the whole header like we've done for other drivers?
03:38 fdobridge: <airlied> can't remember, the plan once it was upstream was to sync over the complete headers
03:39 fdobridge: <gfxstrand> Okay. I'm inclined to sync the version I have in `/usr/include/drm` now and switch everything to use that.
03:40 fdobridge: <gfxstrand> That would leave fewer weird includes scattered around
03:43 fdobridge: <airlied> yeah whatever I did wasn't meant to live a long time
03:46 fdobridge: <gfxstrand> Okay
03:48 fdobridge: <gfxstrand> Ugh... The version I have in /usr/include doesn't have the getparam stuff for some reason
03:48 fdobridge: <gfxstrand> How does anything build?!?
03:48 fdobridge: <airlied> /usr/include/libdrm maybe
03:49 fdobridge: <airlied> guess we are discovering why I didn't copy it over 🙂
03:50 fdobridge: <gfxstrand> Yeah, that's what we appear to be using.
03:50 fdobridge: <gfxstrand> Why are we using deprecated APIs?
03:50 fdobridge: <gfxstrand> Why are they deprecated and removed from the header while nouveau GL is still using them?
03:54 fdobridge: <airlied> one of those skeggsb plans that never made it all the way 🙂
03:55 fdobridge: <airlied> they are in the internal nouveau_abi16.h header file
03:55 fdobridge: <airlied> could probably just put them back in the main header
03:56 fdobridge: <airlied> since it changed 11 years ago and we haven't noticed
03:59 fdobridge: <airlied> bleh it looks like nvidia do viewport updates with an mme macro
04:07 fdobridge: <gfxstrand> Was there something to replace it?
04:10 fdobridge: <airlied> I think the idea was to expose things via the NVIF interface
04:41 fdobridge: <airlied> @gfxstrand did I remember you dumping the nvidia mme macros?
05:45 fdobridge: <airlied> @faith also is the nvk_bo_sync.[ch] include backwards in the meson.build?
07:07 fdobridge: <airlied> ah you fixed that already
07:44 fdobridge: <airlied> @gfxstrand also https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/242 is probably worth merging if you can give it a run
11:07 fdobridge: <triang3l> not yet 👀
12:57 fdobridge: <gfxstrand> 😅
14:17 fdobridge: <phomes> @gfxstrand you can grab the top commit on https://gitlab.freedesktop.org/phomes/mesa/-/commits/meta-cleanup/ as a fixup if you want to commit the meta code via nvk mr
14:19 fdobridge: <gfxstrand> Can you make that an NVK MR so it doesn't get lost. I'll pull it as part of that
14:21 fdobridge: <phomes> https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/243
14:21 fdobridge: <gfxstrand> Thanks!
14:52 fdobridge: <karolherbst🐧🦀> @gfxstrand mind merging https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/241 so we don't forget about it? 😄
14:56 fdobridge: <gfxstrand> done
14:56 fdobridge: <karolherbst🐧🦀> \o/
14:56 fdobridge: <karolherbst🐧🦀> now I think we have to figure out that image view problem or whatever that is...
14:56 fdobridge: <karolherbst🐧🦀> and I think with that pre turing should be fairly stable
14:56 fdobridge: <gfxstrand> Merged. I also went ahead and rebased it in to vk_meta.
14:57 fdobridge: <karolherbst🐧🦀> @gfxstrand where is that null descriptor stuff handled by the way?
14:59 fdobridge: <gfxstrand> nvk_device.c:179
14:59 fdobridge: <gfxstrand> VK_KHR_fragment_shader_barycentric
15:00 fdobridge: <karolherbst🐧🦀> mhhhhhhh
15:04 fdobridge: <karolherbst🐧🦀> I think you planned to assert that `null_image_index` is 0?
15:04 fdobridge: <karolherbst🐧🦀> but yeah.. we might have to actually bind a proper image view there
15:04 fdobridge: <karolherbst🐧🦀> tic entry rather
15:05 fdobridge: <karolherbst🐧🦀> let's see...
15:20 fdobridge: <gfxstrand> Do I not assert that? Feel free to add an assert.
15:22 fdobridge: <karolherbst🐧🦀> nope, just added an `ASSERTED` thing
15:22 fdobridge: <karolherbst🐧🦀> maybe you also didn't want to check for that, I dunno 😄
15:23 fdobridge: <gfxstrand> Oh, oops. 😅 Let me rebase in an assert
15:26 fdobridge: <gfxstrand> Done. I actually did have one originally. It just got lost in a refactor.
15:26 fdobridge: <karolherbst🐧🦀> ahh
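A minimal sketch of the assert being discussed, using a toy descriptor table. Every name except `null_image_index` is hypothetical, and the rationale in the comments (a zero-filled descriptor refers to slot 0, so slot 0 must be the null entry) is an assumption drawn from the conversation, not a description of the real NVK code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_TABLE_SIZE 8u

/* Toy stand-in for an image descriptor table (hypothetical layout). */
struct toy_table {
   uint32_t entries[TOY_TABLE_SIZE][8]; /* 8 dwords per entry */
   uint32_t next_free;
};

/* Hand out the next free slot, zero-filled, and return its index. */
static uint32_t
toy_table_alloc(struct toy_table *t)
{
   assert(t->next_free < TOY_TABLE_SIZE);
   uint32_t idx = t->next_free++;
   memset(t->entries[idx], 0, sizeof(t->entries[idx]));
   return idx;
}

/* Reserve the very first slot for the null image. The assumption is that
 * a zero-filled descriptor refers to index 0, so index 0 has to be the
 * null entry; the assert catches a refactor that allocates something
 * else first. */
static void
toy_table_init(struct toy_table *t)
{
   t->next_free = 0;
   uint32_t null_image_index = toy_table_alloc(t);
   assert(null_image_index == 0);
   (void)null_image_index; /* unused when asserts are compiled out */
}
```

The `(void)` cast plays the role that an unused-variable annotation such as the `ASSERTED` mentioned above would play in a build without asserts.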
15:28 fdobridge: <karolherbst🐧🦀> mhh.. so opengl seems to upload a zero sampler at least, but that's not relevant for surface ops
15:29 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Could NVK support multiple KMDs? :triangle_nvk:
15:30 fdobridge: <karolherbst🐧🦀> if they are upstream, probably
15:33 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> RADV has secret Windows KMD support, Turnip supports the msm KMD and ANV supports the weird Xe KMD (so it's NVK's turn)
15:33 fdobridge: <gfxstrand> In theory. What other KMDs are you wanting to support?
15:34 fdobridge: <gfxstrand> I'd like to make us some secret Windows KMD support but have had better things to do
15:34 fdobridge: <gfxstrand> I don't want to support nvidia's "open" thing
15:34 fdobridge: <karolherbst🐧🦀> I think we all know where this is going 😛
15:34 fdobridge: <mohamexiety> I guess the main one is open RM but what would that bring us?
15:35 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> It might be more stable (and faster) than the nouveau KMD right now
15:37 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> And you don't have to reboot to use NVK if you're using the NVIDIA driver (so the development workflow might be easier)
15:39 fdobridge: <esdrastarsis> Faster than nouveau gsp?
15:40 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> It's definitely faster than OOTB nouveau without GSP (I can't compare it to nouveau GSP because I gave up rebasing the OGK patch)
15:41 fdobridge: <esdrastarsis> I see
15:44 fdobridge: <gfxstrand> Yes, but we can't support it long-term.
15:45 fdobridge: <gfxstrand> There's no API stability, no guarantee that they won't push a change next week that breaks everything. We can't carry that burden.
15:51 fdobridge: <gfxstrand> Also, I don't want to do anything to lend further legitimacy to OGK because that will discourage Nvidia from working with us going forward.
15:52 fdobridge: <gfxstrand> So, yes, it performs better today but I don't think spending time supporting it is good long-term. We're better off fixing the problems with the nouveau KMD.
15:52 fdobridge: <gfxstrand> And, yeah, that means no NVK + CUDA. I can live with that.
16:01 fdobridge: <karolherbst🐧🦀> nouveau without GSP doesn't have reclocking, so yes, it's slower, but not because of nouveau vs OGK
18:31 fdobridge: <gfxstrand> @karolherbst @mhenning It would be awesome if the two of you could figure out what to do with those couple of codegen patches which are breaking GL. I could dig in but I really don't know that code in the codegen back-end very well. For NAK, I took an entirely different approach.
18:31 fdobridge: <karolherbst🐧🦀> yeah, already on it
18:31 fdobridge: <gfxstrand> Thanks!
18:44 fdobridge: <karolherbst🐧🦀> ohh.. I think I know what broke my patch..
18:47 fdobridge: <karolherbst🐧🦀> mhhh, maybe not..
18:53 fdobridge: <karolherbst🐧🦀> uhhh.. I might have to read up on nvk code for that. maxwell+ is special in how it sets up images and I wouldn't be surprised if gl and nvk do entirely different things here
18:57 fdobridge: <gfxstrand> We don't do anything interesting or special
18:58 fdobridge: <gfxstrand> And with NAK, I got rid of the 3D image hack.
18:58 fdobridge: <karolherbst🐧🦀> yeah.. the issue is more like how the image is bound
18:58 fdobridge: <karolherbst🐧🦀> NVK does it all bindless anyway, GL doesn't
18:58 fdobridge: <gfxstrand> Yeah
18:59 fdobridge: <karolherbst🐧🦀> so for kepler I suspect a texture view is missing, because the `image_size` actually gets lowered to a texture query. On maxwell that's probably fine because of how things are set up and everything
19:00 fdobridge: <karolherbst🐧🦀> we reserve 8 slots for image views on maxwell
19:00 fdobridge: <gfxstrand> https://gitlab.freedesktop.org/gfxstrand/mesa/-/blob/nak/main/src/nouveau/compiler/nak_nir_lower_tex.c
19:01 fdobridge: <gfxstrand> `image_size` gets lowered to `TXQ`.
19:03 fdobridge: <karolherbst🐧🦀> yeah.. but for gl things are a bit funky in this area
19:04 fdobridge: <karolherbst🐧🦀> anyway, I'll take a deeper look tomorrow, but maybe the answer will be to rework the GL side a bit, or just add a compiler flip for that
19:04 fdobridge: <karolherbst🐧🦀> I'm fairly sure the compiler change was fine on kepler for nvk
19:10 fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Does it run vkcube though? :ferris:
19:16 fdobridge: <gfxstrand> Well, I don't have VS or FS fully hooked up yet, so no. 😛
19:23 fdobridge: <airlied> btw I'm not as against getting nvk working on nvidia open/closed rm, but I'd want to see a fully working port and then evaluate its sanity rather than make a decision we'd definitely include it based on half assed patches
19:27 fdobridge: <airlied> I did run nvidia vulkan talos on the same laptop and there's a very large gap 😛
19:30 fdobridge: <gfxstrand> I think that's because userspace still sucks but time will tell as we continue to improve NVK.
19:31 fdobridge: <airlied> I commented out all pipeline flushes, I got corruption but not a huge fps increase 🙂
19:32 fdobridge: <gfxstrand> That'll do it!
19:32 fdobridge: <airlied> but yeah I'm not taking it too serious until we have a compiler
19:32 fdobridge: <gfxstrand> And a compiler that actually uses things like bindless UBOs instead of global memory fetching the universe.
19:33 fdobridge: <airlied> yup that too
19:42 fdobridge: <karolherbst🐧🦀> yeah.. UBOs will make everything fast
19:57 fdobridge: <mhenning> I'd be happy to poke at it, although I'm pretty busy with non-volunteer things for the next few days, so karol might get there first.
20:05 fdobridge: <karolherbst🐧🦀> @mhenning you sure that `dEQP-GLES31.functional.program_interface_query.uniform.array_size.default_block.types.isampler_2d_ms` crashes for you?
20:08 fdobridge: <karolherbst🐧🦀> because it doesn't work for me for weird reasons
20:08 fdobridge: <karolherbst🐧🦀> ohh wait
20:08 fdobridge: <karolherbst🐧🦀> kepler2...
20:08 fdobridge: <karolherbst🐧🦀> mhhh
20:15 fdobridge: <karolherbst🐧🦀> well.. that ain't crashing either
20:23 fdobridge: <karolherbst🐧🦀> ahh yes.. of course
20:24 fdobridge: <karolherbst🐧🦀> invalid memory read 🙂
20:27 fdobridge: <karolherbst🐧🦀> ahh yes
20:27 fdobridge: <karolherbst🐧🦀> @gfxstrand the lod indeed has to come after the sample arg
20:27 fdobridge: <karolherbst🐧🦀> I'll comment on the MR
20:28 fdobridge: <gfxstrand> Okay. That works, too.
20:28 fdobridge: <gfxstrand> As long as GL and NVK are both happy, I'm happy.
20:34 fdobridge: <karolherbst🐧🦀> I've posted a patch in the MR, if you run that through NVK that would be great
20:34 fdobridge: <gfxstrand> Sure. Give me a minute. I've got another run going now.
20:34 fdobridge: <gfxstrand> Well, 15 min. 😅
20:36 fdobridge: <karolherbst🐧🦀> I think offsets and depth compare is also wrong, not sure it matters though
20:37 fdobridge: <karolherbst🐧🦀> uhm...
20:37 fdobridge: <karolherbst🐧🦀> wait a second
20:37 fdobridge: <karolherbst🐧🦀> ISA docs disagree with offsets/depth order 🙃
20:37 fdobridge: <karolherbst🐧🦀> I'll leave that out then
20:40 fdobridge: <gfxstrand> So do I apply that diff by itself? Or revert my patch and apply that diff?
20:41 fdobridge: <karolherbst🐧🦀> revert yours and then apply that diff
20:41 fdobridge: <karolherbst🐧🦀> it's weird though...
20:45 fdobridge: <karolherbst🐧🦀> uhhh...
20:51 fdobridge: <karolherbst🐧🦀> when I was writing that code I just went with whatever worked and didn't cause regressions compared to TGSI, that part was just terrible to deal with
21:22 fdobridge: <gfxstrand> Okay, I'll try to run it now
22:15 fdobridge: <karolherbst🐧🦀> please let me know which tests regress 😄
22:15 fdobridge: <gfxstrand> Nothing on NVK on Turing.
22:15 fdobridge: <gfxstrand> Seems good
22:15 fdobridge: <karolherbst🐧🦀> nice
23:49 fdobridge: <airlied> if anyone has any ideas on depth_range_unrestricted, https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/244 has the start of implementing it, but it's defeating me at the moment.
23:57 fdobridge: <esdrastarsis> I think you enabled depth clip in both MRs
23:58 airlied: yes that one is based on the other one
23:59 fdobridge: <airlied> since they both mess around in the same register, have to keep them in order