06:08fdobridge: <Sid> for all my friends on arch based distros
06:08fdobridge: <Sid> here's a PKGBUILD I wrote that should pull and build Faith's fork of the kernel, NVK branch, at the latest commit every time
06:08fdobridge: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1236922590359191562/PKGBUILD?ex=6639c56e&is=663873ee&hm=e197b2d3427baaa074b5fe21063e137728cf87efce158d4ad0b5503c9453f361&
15:02fdobridge: <zmike.> @gfxstrand is this the ES week?
15:03fdobridge: <gfxstrand> Uh... maybe? Modifiers should be fine if I don't cross the GPUs.
15:07fdobridge: <zmike.> Uh oh
15:45fdobridge: <redsheep> I still haven't gotten around to testing a zink session on that branch and kernel, had a busy few days. I'll try to get to it today or tomorrow. I'm glad to see gamescope is working, that's promising
15:47fdobridge: <redsheep> Has anyone noted a difference in performance so far? I don't see that anyone has mentioned it here, if so
16:05fdobridge: <gfxstrand> Re-building the CTS now
16:05fdobridge: <gfxstrand> Mostly just to drop the unneeded desktop GL fixes
16:05fdobridge: <gfxstrand> so I'm not carrying patches in my submissions
16:12fdobridge: <zmike.> @gfxstrand try out my script to verify that you'll pass before you waste the time
16:12fdobridge: <zmike.> it should only take a few mins if you have enough cores
16:37fdobridge: <gfxstrand> Ugh... Now X clients can't connect to the display.
16:37fdobridge: <gfxstrand> Reboot time, I guess
16:42fdobridge: <zmike.> M :lul: D I F I E R S
16:43fdobridge: <gfxstrand> Explicit sync also landed and that's screwing things up a bit, too.
16:46fdobridge: <gfxstrand> hrm... I have to start glxgears from the desktop session and then it all works fine.
16:55fdobridge: <gfxstrand> ```
16:55fdobridge: <gfxstrand> Test case 'dEQP-EGL.functional.color_clears.multi_context.gles2.rgba8888_pbuffer'..
16:55fdobridge: <gfxstrand> MESA: error: ../src/nouveau/vulkan/nvk_queue_drm_nouveau.c:486: DRM_NOUVEAU_EXEC failed: No such device (VK_ERROR_DEVICE_LOST)
16:55fdobridge: <gfxstrand> MESA: error: ZINK: vkQueueWaitIdle failed (VK_ERROR_DEVICE_LOST)
16:55fdobridge: <gfxstrand> MESA: error: ../src/nouveau/vulkan/nvk_queue.c:285: Submit failed (VK_ERROR_DEVICE_LOST)
16:55fdobridge: <gfxstrand> MESA: debug: ../src/vulkan/runtime/vk_device.c:288: Timeline mode is ASSISTED.
16:55fdobridge: <gfxstrand> MESA: error: ZINK: vkQueueSubmit failed (VK_ERROR_DEVICE_LOST)
16:55fdobridge: <gfxstrand> ZINK: device lost detected!
16:55fdobridge: <gfxstrand> ```
16:55fdobridge: <gfxstrand> Let's try again
16:58fdobridge: <redsheep> Oh when I started reading that I got excited about clear colors since I thought you might have a reproducer. My time playing with renderdoc seemed to suggest that is the issue in unigine valley and heaven, like airlied suggested
16:59fdobridge: <redsheep> Really hard to tell for sure though since renderdoc really really likes to crash around the part where the rendering breaks
17:05fdobridge: <gfxstrand> ```
17:05fdobridge: <gfxstrand> Test case 'dEQP-EGL.functional.render.multi_thread.gles2_gles3.rgb888_pbuffer'..
17:05fdobridge: <gfxstrand> MESA: error: ../src/nouveau/vulkan/nvk_queue_drm_nouveau.c:486: DRM_NOUVEAU_EXEC failed: No such device (VK_ERROR_DEVICE_LOST)
17:05fdobridge: <gfxstrand> MESA: error: ZINK: vkQueueSubmit failed (VK_ERROR_DEVICE_LOST)
17:05fdobridge: <gfxstrand> ZINK: device lost detected!
17:05fdobridge: <gfxstrand> ```
17:05fdobridge: <gfxstrand> At least it's consistent...
17:06fdobridge: <gfxstrand> Oh, not consistent
17:06fdobridge: <gfxstrand> And the tests pass by themselves. 🤦🏻‍♀️
17:41fdobridge: <nishii_ko> is nvk in 24.1? the proprietary drivers are *really* getting on my nerves with the flickering now... -w-
17:41fdobridge: <nishii_ko> https://cdn.discordapp.com/attachments/1034184951790305330/1237096916626702447/image.png?ex=663a67c9&is=66391649&hm=bf1a4b8d8460e1408c34307ba59cea200ccdd66850f5acbdc4e096e692561955&
17:49fdobridge: <rinlovesyou> nvk is also in 24.0, at least according to the arch pkgbuild
17:50fdobridge: <Sid> yeah depending on your distro you can install it right away
17:50fdobridge: <Sid> on arch it's vulkan-nouveau
17:50fdobridge: <rinlovesyou> but it's very behind what we have now, so i'd at least wait until 24.1, and even then, NVK will still not net you the same performance as proprietary in many cases
17:51fdobridge: <rinlovesyou> it made it into the stock `mesa` package in the extra repo
17:52fdobridge: <Sid> oh
17:52fdobridge: <Sid> TIL
17:52fdobridge: <rinlovesyou> but of course you can install `vulkan-nouveau-git` and `lib32-vulkan-nouveau-git` on top of it to be up to date
17:53fdobridge: <rinlovesyou> it's 24.0 though so it's still `nouveau-experimental` there
17:53fdobridge: <Sid> um
17:53fdobridge: <Sid> did it make it into the package, or the PKGBUILD
17:53fdobridge: <Sid> ```
17:54fdobridge: <Sid> [sidpr@constructor ~]$ pacman -Ss vulkan-nouveau
17:54fdobridge: <Sid> extra/vulkan-nouveau 1:24.0.6-2
17:54fdobridge: <Sid> Open-source Vulkan driver for Nvidia GPUs
17:54fdobridge: <Sid> multilib/lib32-vulkan-nouveau 1:24.0.6-2
17:54fdobridge: <Sid> Open-source Vulkan driver for Nvidia GPUs - 32-bit
17:54fdobridge: <Sid> ```
17:54fdobridge: <Sid> .-.
17:54fdobridge: <rinlovesyou> the pkgbuild for `mesa` is for 24.0.6-2, and that's the same version that's on the package
17:55fdobridge: <rinlovesyou> so i would assume that means it made it in
17:55fdobridge: <Sid> yeah so
17:55fdobridge: <Sid> the arch mesa pkgbuild does split packaging
17:55fdobridge: <Sid> which is why I asked if it made it into the package or the pkgbuild
17:55fdobridge: <rinlovesyou> oh i see
17:56fdobridge: <Sid> :pat:
17:56fdobridge: <Sid> look at the pkgname array
17:56fdobridge: <rinlovesyou> so what happens when i just install `mesa`
17:56fdobridge: <Sid> and then the package_vulkan-nouveau() function
17:57fdobridge: <Sid> it installs whatever is packaged by the package_mesa() function in the pkgbuild
17:57fdobridge: <Sid> which is the gallium drivers
17:57fdobridge: <rinlovesyou> oh
17:58fdobridge: <rinlovesyou> god that's confusing
17:58fdobridge: <Sid> split packaging baybee
17:58fdobridge: <Sid> builds everything together
17:58fdobridge: <Sid> but packages it separately
17:58fdobridge: <rinlovesyou> like it makes sense, you don't want amd or intel drivers when you have an nvidia card
17:58fdobridge: <rinlovesyou> but still
17:58fdobridge: <Sid> to ensure the packages are compatible with each other/built on the same base
17:58fdobridge: <Sid> but also remain fairly modular
18:01fdobridge: <Sid> a lot of arch packages employ split packaging
18:01fdobridge: <Sid> including the kernel/kernel-headers/kernel-docs packages
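[Editor's sketch] The split packaging Sid describes (one build, a `pkgname` array, and one `package_<name>()` function per entry) can be modelled roughly as below. This is a toy Python model, not makepkg code: the file names and the way they are divided between packages are illustrative assumptions, not the real contents of the Arch `mesa` PKGBUILD.

```python
# Toy model of split packaging: one shared build, several package steps.
# File names and the split are illustrative assumptions, not the real
# contents of the Arch mesa PKGBUILD.

PKGNAME = ["mesa", "vulkan-nouveau", "vulkan-radeon"]


def build():
    """Build everything once; tag each artifact with the package that owns it."""
    return {
        "usr/lib/dri/nouveau_dri.so": "mesa",
        "usr/lib/libvulkan_nouveau.so": "vulkan-nouveau",
        "usr/lib/libvulkan_radeon.so": "vulkan-radeon",
    }


def package(name, artifacts):
    """Stand-in for a package_<name>() function: keep only this package's files."""
    return sorted(path for path, owner in artifacts.items() if owner == name)


artifacts = build()
packages = {name: package(name, artifacts) for name in PKGNAME}
for name, files in packages.items():
    print(name, files)
```

In this model, installing only `mesa` gives you only the files its package step selected, which is why the Vulkan driver has to be installed as its own package even though it came out of the same build.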
18:28fdobridge: <phomes_> I am testing various games with 28655 and 28937
20:04fdobridge: <gfxstrand> Should be. I'd avoid 24.0 but 24.1 should mostly work. It won't be incredibly fast but it should mostly work.
20:05fdobridge: <nishii_ko> good to know, thanks
21:25fdobridge: <gfxstrand> Should I add a fence kicker to nouveau.ko? I think I just might...
21:30fdobridge: <karolherbst🐧🦀> wouldn't be the first workaround like that I think
21:44fdobridge: <gfxstrand> @airlied IDK if you're around but did you see the thing about shared tiled images blowing up?
21:45fdobridge: <gfxstrand> I think my plan for a kicker would be to actually add it to drm_scheduler. Before it actually times anything out, it would call the "kick" callback which can go kick the tires on any outstanding fences and see if they've actually completed. Then we only process the timeout if they're still not signaled after the kick.
21:46fdobridge: <gfxstrand> Otherwise, we'd need a kick timer running constantly
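[Editor's sketch] The "kick before timing out" idea above can be modelled in a few lines. This is a user-space Python toy, not kernel code: the `kick_cb` hook, the `Fence` fields, and the `Scheduler` class are all hypothetical names chosen for illustration, and nothing here reflects an existing scheduler interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Fence:
    """A fence whose completion may have happened without being noticed."""
    hw_done: bool = False    # the GPU really finished the work
    signaled: bool = False   # software has observed the completion

    def kick(self) -> None:
        # Re-check the hardware state and signal if the work is done.
        if self.hw_done:
            self.signaled = True


def kick_all(fences: List[Fence]) -> None:
    """Hypothetical driver callback: kick every outstanding fence."""
    for fence in fences:
        fence.kick()


@dataclass
class Scheduler:
    kick_cb: Optional[Callable[[List[Fence]], None]] = None  # hypothetical hook
    pending: List[Fence] = field(default_factory=list)

    def on_timeout(self) -> List[Fence]:
        # Step 1: let the driver notice completions it may have missed.
        if self.kick_cb is not None:
            self.kick_cb(self.pending)
        # Step 2: only fences still unsignaled after the kick are treated as hung.
        hung = [f for f in self.pending if not f.signaled]
        self.pending = hung
        return hung


missed = Fence(hw_done=True)   # finished, but nobody noticed
stuck = Fence()                # genuinely not finished
sched = Scheduler(kick_cb=kick_all, pending=[missed, stuck])
hung = sched.on_timeout()
print(len(hung), missed.signaled, stuck.signaled)
```

The point of doing the kick inside the timeout path is the one made above: it avoids having a separate kick timer running constantly, at the cost of only catching missed completions once a timeout would otherwise have fired.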
21:48fdobridge: <karolherbst🐧🦀> yeah...
21:49fdobridge: <karolherbst🐧🦀> I've added something like that to fix a suspend/resume bug
21:49fdobridge: <karolherbst🐧🦀> this beauty: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6b04ce966a738ecdd9294c9593e48513c0dc90aa
21:50fdobridge: <karolherbst🐧🦀> (`nouveau_fence_wait` kicks the fence internally as well or something)
21:52fdobridge: <gfxstrand> Hrm... Maybe I can just hack up the scheduler to attempt a wait?
21:52fdobridge: <gfxstrand> That'd at least tell me if that's what's going on.
21:54fdobridge: <karolherbst🐧🦀> maybe? But I'm also convinced that something is busted with fences, and I couldn't figure out what exactly, but I don't think it's a nouveau problem per se, more like how it's integrated with the other bits
21:55fdobridge: <airlied> I did post a remove busy wait patch a few weeks ago
21:55fdobridge: <airlied> I'm around, but my focus is definitely more a blurred galaxy of stuff than a single point of light
21:56fdobridge: <karolherbst🐧🦀> same tbh
21:56fdobridge: <airlied> so I need modifiers + kernel patch + run CTS to see bad things?
21:56fdobridge: <airlied> https://patchwork.freedesktop.org/patch/589947/
21:56fdobridge: <gfxstrand> No CTS. Just two GPUs and gears
21:57fdobridge: <gfxstrand> Or any other way of using modifiers on a truly shared BO so it's forced into system RAM.
21:57fdobridge: <karolherbst🐧🦀> curious....
21:58fdobridge: <karolherbst🐧🦀> I'm not really familiar enough to tell either way
21:58fdobridge: <karolherbst🐧🦀> I just know that Ben also thinks something is busted with fences
22:00fdobridge: <airlied> I think dakr and myself agreed that code was possibly racy
22:01fdobridge: <gfxstrand> All I know is that the EGL tests pretty routinely time out (which is unrelated to the modifiers thing) and the timeouts appear to have nothing to do with the test itself.
22:01fdobridge: <karolherbst🐧🦀> I'm convinced that this code is racy
22:01fdobridge: <airlied> Throw that patch in your tree and see if it helps or hinders
22:02fdobridge: <karolherbst🐧🦀> mhh, maybe not
22:03fdobridge: <karolherbst🐧🦀> the `base.flags` part got my attention
22:03fdobridge: <karolherbst🐧🦀> (inside `nouveau_fence_done`)
22:04fdobridge: <karolherbst🐧🦀> but that might also be fine
22:04fdobridge: <zmike.> @gfxstrand if you're running on x11 sometimes the tests can time out if your display is blanked
22:05fdobridge: <zmike.> (this isn't specific to nvk)
22:06fdobridge: <gfxstrand> Display isn't blanked and I'm running with `vblank_mode=0`
22:07fdobridge: <gfxstrand> @airlied For the tiled + system RAM fault: https://gitlab.freedesktop.org/drm/nouveau/-/issues/357
22:08fdobridge: <gfxstrand> Building now
22:09fdobridge: <gfxstrand> If I thought this were restricted to modifiers, I might not care too much. However, I'm pretty sure it's a general issue and it seems like the sort of thing that would bite us in the butt hard in high-pressure situations.
22:33fdobridge: <gfxstrand> @airlied
22:33fdobridge: <gfxstrand> ```
22:33fdobridge: <gfxstrand> Test case 'dEQP-EGL.functional.render.multi_thread.gles2.rgb888_pbuffer'..
22:33fdobridge: <gfxstrand> MESA: error: ../src/nouveau/vulkan/nvk_queue_drm_nouveau.c:486: DRM_NOUVEAU_EXEC failed: No such device (VK_ERROR_DEVICE_LOST)
22:33fdobridge: <gfxstrand> MESA: error: ZINK: vkQueueWaitIdle failed (VK_ERROR_DEVICE_LOST)
22:33fdobridge: <gfxstrand> MESA: error: ../src/nouveau/vulkan/nvk_queue.c:285: Submit failed (VK_ERROR_DEVICE_LOST)
22:33fdobridge: <gfxstrand> MESA: debug: ../src/vulkan/runtime/vk_device.c:288: Timeline mode is ASSISTED.
22:33fdobridge: <gfxstrand> MESA: error: ZINK: vkQueueSubmit failed (VK_ERROR_DEVICE_LOST)
22:33fdobridge: <gfxstrand> ZINK: device lost detected!
22:33fdobridge: <gfxstrand> ```
22:34fdobridge: <gfxstrand> It's always the multi-thread tests.
22:34fdobridge: <gfxstrand> I wonder if it's something to do with sharing fences across contexts.
23:04fdobridge: <airlied> uggh don't have an easy nvidia machine with 2 PCIe slots at the moment. I suppose I could use laptop + dGPU box
23:06fdobridge: <karolherbst🐧🦀> ~~will nova support dGPU boxes from the get-go~~