00:09 gfxstrand[d]: Well, *Dragon Age: The Veilguard* is now running at 25 FPS...
00:11 gfxstrand[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33571
00:13 gfxstrand[d]: Not quite as good as the 15-20x improvement I got before, but I'll happily take 3.5x. 😁
00:15 gfxstrand[d]: "Oops, we were reading from WC maps" explains quite a bit of stalling.
00:30 gfxstrand[d]: Maybe this means we're now GPU bound and some of my other perf attempts will actually make a difference? :shrug_anim:
00:36 gfxstrand[d]: I knew 7 FPS had to mean something really stupid.
00:50 gfxstrand[d]: This does make me wonder, though... I didn't think VKD3D used HOST_ONLY descriptor sets with the EDB path.
01:15 redsheep[d]: I finally have some time to do some testing, since that is merged I think I will give main a shot to see how things are now. Might be good to rebase nvk/virt-cbs as well
01:15 gfxstrand[d]: Sure
01:16 gfxstrand[d]: I don't think today's patch will make any difference for D3D11 titles, BTW. It's used pretty heavily by some D3D12 things, though.
01:17 redsheep[d]: What do you expect for native titles?
01:17 gfxstrand[d]: No change
01:18 gfxstrand[d]: Descriptor copies aren't used much by native apps AFAIK. We mostly stuck them in the API because Microsoft did.
02:27 redsheep[d]: I have finally got a plasma wayland+zink+nvk session working for the first time on my system which I am very glad to see. Unfortunately calling it working is maybe an exaggeration. I can only have one monitor on at 60hz and any attempt to change the display config results in it either not doing it, locking my entire machine up, or crashing some parts of the session. The discord client also flashes wildly, though the old kind of corruption might be gone. Might also just be that whatever part corrupted before is getting messed up and reset now though.
02:40 redsheep[d]: Fortunately plasma x11+zink+nvk seems to be working better than before, almost working as expected. I think the discord/chromium corruption is actually gone. I see a little bit of flicker with my terminal and a couple other things but this is much closer to viable than it was a few months ago.
02:51 redsheep[d]: Oh that's cool, talos 1 has about 7% more performance, but more impressively talos 2 launches now! I mean I only get like 20 fps but... this game was absolutely impossible to run before. I tried everything I knew when I tested this a few months back.
03:09 gfxstrand[d]: Cool
03:10 gfxstrand[d]: I'm guessing the perf is from cbuf textures but I'm not sure.
03:10 gfxstrand[d]: I've never tried Talos 2
03:15 redsheep[d]: I just went from main 3 days ago to the latest and talos 2 doesn't gain any performance. Perf is probably about 20-25% that of prop but still, it works, and with some upscaling and not ultra settings it's not bad
03:16 mhenning[d]: gfxstrand[d]: or maybe inferring ldg.constant on read-only ssbos
03:18 redsheep[d]: Talos 2 is probably a very abnormal test case, being unreal 5 and all, and making very heavy use of nanite and I think software lumen as well. I think this is the first unreal 5 game I have had working, so there's probably some trickery needed to get more performance
03:20 mhenning[d]: Oh, we fixed the first descendant in https://gitlab.freedesktop.org/mesa/mesa/-/issues/12273 which is also ue5. maybe that's what fixed it
03:20 redsheep[d]: mhenning[d]: Ah, that would make sense. That's great
03:22 redsheep[d]: I might be wrong, but I think that latest MR did give a little bit of perf in the witness. It's hard to tell for certain because I don't have the test setup that I had before to be able to go back and forth between a load of relevant builds.
03:23 redsheep[d]: I had to scrap everything and start over
03:24 gfxstrand[d]: redsheep[d]: TBH, I'm going to call nanite working at all a big win at the moment. 😁
03:25 gfxstrand[d]: mhenning[d]: Oh, neat. I missed that. It happened while I was off. I should go read through the code.
03:26 mhenning[d]: Oh, yeah, that was one of the bigger changes while you were out
03:27 gfxstrand[d]: I saw a couple different bug reports about cbufs causing panics. Hopefully that fixes all of them.
03:29 redsheep[d]: I don't have any recent before testing to compare to but deep rock galactic was one of the most inexplicably low performing games on dx12 and it runs great now, like I would actually play this. Good frametimes and all, no upscaling needed
03:29 redsheep[d]: If any of my testing so far has to do with your latest MR, it's this
03:30 gfxstrand[d]: Basically all D3D12 titles are going to hit the ld.constant and descriptor thing hard. There's probably some more low hanging fruit if we can find it but I think those are probably the biggest wins we're going to find.
03:32 gfxstrand[d]: But getting the big things out of the way hopefully means we can actually see and measure the smaller things.
03:33 redsheep[d]: I think I ran this at fsr2 performance before, so 1080p, and struggled to get 60 fps. Now I get about 100 fps with native 4k. I think prop is probably still around double this but still, this is running massively better
03:34 gfxstrand[d]: If prop is 2x at 4k, that's probably image compression.
03:35 gfxstrand[d]: I expect color compression to be on the order of 20-50%, depending on title, maybe more for heavy MSAA.
03:36 gfxstrand[d]: I'm sure we're still hurting on the descriptor front, too, but I'm not 100% sure what we're going to do about that.
03:37 redsheep[d]: That's probably going to have a lot to do with whether a given game is currently memory bandwidth starved, but right now I only have guesswork to tell whether that is the case. I hope the performance gains are that big though.
03:37 redsheep[d]: That would be really awesome
03:37 gfxstrand[d]: Everything is always bandwidth starved
03:38 gfxstrand[d]: The question is whether or not the starvation is in places where compression helps.
03:39 gfxstrand[d]: But we'll see
03:40 gfxstrand[d]: I also need to figure out how to make structure buffers not suck. It may be sufficient to just add some prediction.
03:41 redsheep[d]: Looks like the last few months of changes have collectively won about 50% in doom eternal. That's still only 20 -> 30 fps but still, big improvement
03:42 redsheep[d]: I feel like this game will continue to be one that I would say is a good one to look into, that's still about 10-12% of prop performance
03:42 gfxstrand[d]: And it's a native title, isn't it?
03:42 redsheep[d]: Yeah it is
03:48 gfxstrand[d]: That might be a good one to look at then. I just wish I felt confident that looking at a game would mean I could figure out how to improve it. :frog_upside_down:
03:49 airlied[d]: sometimes looking through radv history for the game name can give hints 😛
04:02 gfxstrand[d]: Yeah...
04:09 gfxstrand[d]: I also really need to get my test rig set up so I can swap between the blob and nouveau. I have no point of comparison for anything right now.
04:10 gfxstrand[d]: Even just being able to look at timestamps on a renderdoc trace with one vs the other might tell me where to focus.
07:29 djdeath3483[d]: gfxstrand[d]: any objection to https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33548 ?
08:26 snowycoder[d]: gfxstrand[d]: I've made a tiny script that swaps between nvidia and nouveau and builds mkinitcpio. You need to reboot but you can swap in 1 minute. I use it to game on heavier titles
08:27 tiredchiku[d]: or just have 2 entries in your bootloader with different modules blacklisted kernel side
08:28 marysaka[d]: On my side I have nvidia drivers blacklisted by default and a custom grub entry that boot latest kernel without the blacklist
08:28 marysaka[d]: (tho it's quite manual and I was lazy to automate this)
08:29 snowycoder[d]: tiredchiku[d]: Yeah I didn't want to create arch upgrade hooks😅
08:29 tiredchiku[d]: I have 2 kernels installed, with the nv driver installed for only one of them
12:46 phomes_[d]: gfxstrand[d]: no win on Age of Empires IV running VKD3D 😭. But I also only see it using VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_EXT three times
22:44 gfxstrand[d]: Turns out we're not really getting the EDB path in VKD3D. Womp womp.
22:44 gfxstrand[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33589
22:50 gfxstrand[d]: https://github.com/HansKristian-Work/vkd3d-proton/pull/2346
22:59 snowycoder[d]: Sorry but, what does EDB stand for?
23:00 redsheep[d]: EXT Descriptor Buffer
23:00 redsheep[d]: Vulkan extension that's supposed to make vkd3d go fast, and it kinda does but mostly on AMD lol
23:26 gfxstrand[d]: Yeah. It's actually pretty much the same on NVK as far as I can tell.
23:41 gfxstrand[d]: Okay, I've confirmed that those two MRs should get us the EDB path and that the EDB path does not suffer from the WC map read-back issues fixed with my `HOST_ONLY` descriptors patch.
23:41 gfxstrand[d]: So we'll get this fixed multiple ways.
23:42 gfxstrand[d]: Annoyingly (but not surprisingly), the EDB path has exactly the same perf as the fixed descriptor set path.
23:47 gfxstrand[d]: redsheep[d]: EDB actually isn't any worse than NVK's current "normal" descriptor path. I intentionally designed the two to be as close as possible. All the optimizations we do for normal descriptors also work with EDB.
23:47 gfxstrand[d]: But also, I've been complaining about our normal descriptor path being shitty so take that into account. 😂
23:51 gfxstrand[d]: But also, D3D12 has lots of indirection and we'd like to get rid of that.
23:57 redsheep[d]: gfxstrand[d]: I was more referring to it being a problem for nvidia prop. Are you saying nvk doesn't have to suffer from whatever is killing perf over there? I wasn't ever clear on exactly what that was, was it just the indirection?