IRC Logs of #nouveau on irc.freenode.net for 2024-01-07

00:01 fdobridge_: <pac85> Yeah they RE the game and patch it
00:02 fdobridge_: <pac85> It's like open source with extra steps
00:03 fdobridge_: <pac85> Yeah they decompile the game and patch it (edited)
00:05 fdobridge_: <pac85> Also I think some mods have interfaces for plugin like mods
00:06 fdobridge_: <pac85> Yeah they decompile the game and distribute patches (edited)
00:42 fdobridge_: <redsheep> Yeah, most simple or well written mods can just work through the mod apis and avoid changing game code beyond what fabric or forge changes. I would be pretty surprised if that's true of the vulkan mod though, it breaks compatibility with other mods that touch rendering.
00:50 fdobridge_: <redsheep> Microsoft even publishes obfuscation maps to make it much easier for those who develop the mod apis, or those modding their code directly.
01:04 fdobridge_: <karolherbst🐧🦀> why not just keep the class/interface/method names then? 😄
01:07 fdobridge_: <redsheep> Well it's java so it compiles to bytecode, the obfuscation maps make it so you can decompile the bytecode and then easily deobfuscate it. They could just skip all those steps and publish the source, but they're Microsoft so they're obviously not going to do that.
01:08 fdobridge_: <pac85> Bytecode isn't obfuscated by default though, right?
01:08 fdobridge_: <karolherbst🐧🦀> yeah.. you can keep the original class/method names
01:08 fdobridge_: <karolherbst🐧🦀> like that's how you develop against jars
01:08 fdobridge_: <pac85> So, why would you obfuscate it then publish maps
01:09 fdobridge_: <redsheep> Oh? Yeah I don't know then. Just know what I got from developing a really basic mod a few years ago. That does seem like a weird choice if they could just toggle that.
01:10 fdobridge_: <redsheep> Does it make the bytecode smaller?
01:10 fdobridge_: <karolherbst🐧🦀> maybe they don't want to break compat 😄
01:10 fdobridge_: <karolherbst🐧🦀> yeah, also that
01:10 fdobridge_: <karolherbst🐧🦀> though a jar file is a zip file
01:11 fdobridge_: <karolherbst🐧🦀> so it really doesn't matter all that much. Maybe the CPU overhead is also slightly lower?
01:11 fdobridge_: <redsheep> Right, but it might help a little when like a billion people will redownload every new update
01:20 fdobridge_: <pac85> The C# IL thing is similar, many unity games can be decompiled and you pretty much get all the function, interface and class names
01:33 fdobridge_: <redsheep> Anyway, just to get back to the channel topic, I am still trying to figure out how to turn on tile based rasterization and the other perf related hardware features we were talking about a couple weeks ago. Nvidia's headers were mentioned as being potentially useful but after a few hours of research there I don't think I found what I was looking for.
01:33 fdobridge_: <redsheep> It was mentioned that I would need to capture and inspect some traces, but after a while researching that I can't see how I could capture the right kind of traces.
01:35 fdobridge_: <redsheep> Does anybody have a doc somewhere on how to capture those, or have anything else to point me in the right direction?
01:36 fdobridge_: <redsheep> I imagine the important interaction is happening between nvidia's userspace driver and kmd but I have no idea how you would get a trace or how you would even begin to try to inspect that.
01:37 fdobridge_: <pac85> I think you might want to capture a command stream, do you have some test applications where tiling is not enabled so you can compare register values?
01:38 fdobridge_: <redsheep> See that's just the thing, can applications even have it off? My understanding was that when maxwell introduced TBR it was transparent, they went to some lengths to make it so applications aren't aware of it.
01:40 fdobridge_: <redsheep> So I guess I would just need a command stream from the nvidia drivers and then from nvk, right? There are probably quite a few other differences that will make it harder to tease out.
01:42 fdobridge_: <pac85> I meant like, if you know a case where it gets implicitly disabled you will then have relatively similar command streams and therefore a limited amount of things to sift through
01:43 fdobridge_: <pac85> But I guess if you have the register names it could be enough to just look at a dump from the nv driver
01:45 fdobridge_: <redsheep> Hmm I wonder what might cause it to turn off, I will look into that. Are the register names published anywhere? That would be something to look into as well
01:48 fdobridge_: <pac85> https://github.com/NVIDIA/open-gpu-doc/tree/master/classes/3d
01:48 fdobridge_: <pac85> This right?
01:51 fdobridge_: <redsheep> I see what you mean now, this is the same info as https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/nouveau/headers/nvidia/classes?ref_type=heads
01:52 fdobridge_: <redsheep> I already looked through this, this is what I was referring to when I mentioned nvidia's headers, I just didn't match that up with what you were saying about register naming, that makes sense now.
01:53 fdobridge_: <redsheep> There's plenty of stuff in here about tiling but I didn't find anything that was clearly there for enablement, but maybe it is in here after all. I just didn't really understand what I was looking at.
01:54 fdobridge_: <redsheep> Ok, I think I have some direction now, thanks.
02:08 fdobridge_: <bylaws> see deko3d
02:09 fdobridge_: <bylaws> iirc the tiled cache regs aren't in the oss headers
02:11 fdobridge_: <bylaws> https://github.com/devkitPro/deko3d/blob/master/source/maxwell/engine_3d.def#L234
02:13 fdobridge_: <redsheep> Woah alright, yeah looks like they pretty much already did the work
02:13 fdobridge_: <pac85> Uh nice
02:14 fdobridge_: <bylaws> https://github.com/devkitPro/deko3d/blob/a51e76225b251141a42b11588def717aec39aab1/source/maxwell/gpu_3d_base.cpp#L231
02:14 fdobridge_: <bylaws> innit
02:14 fdobridge_: <bylaws> init (edited)
02:14 fdobridge_: <bylaws> https://github.com/devkitPro/deko3d/blob/a51e76225b251141a42b11588def717aec39aab1/source/maxwell/gpu_3d_base.cpp#L207 (edited)
02:15 fdobridge_: <pac85> They might be different on newer GPUs right? Though since tile size is there it should be easy to find them in a dump since it's easy to make it use different tile sizes
02:16 fdobridge_: <redsheep> Yeah I was about to say I don't have a maxwell gpu, but that's a good point, the trianglebin test application would potentially make it pretty easy to tease out on ada that way.
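The register-hunting approach discussed above (render with two different tile sizes, then look for the methods whose values changed) can be sketched as follows. This is only an illustration: the dump format (a mapping from method offset to value) and every offset and value below are hypothetical, not taken from any real capture or header.

```python
def diff_method_dumps(dump_a, dump_b):
    """Return {method: (value_a, value_b)} for methods whose values differ.

    A method present in only one dump is reported with None on the other side.
    """
    changed = {}
    for method in sorted(set(dump_a) | set(dump_b)):
        a = dump_a.get(method)
        b = dump_b.get(method)
        if a != b:
            changed[method] = (a, b)
    return changed


def fmt(value):
    """Format a register value as hex, or '-' when the method was absent."""
    return "-" if value is None else f"{value:#010x}"


# Hypothetical dumps: offsets and values are made up for illustration only.
run_small_tiles = {0x0100: 0x1, 0x0200: 0x00400040, 0x0300: 0x7}
run_large_tiles = {0x0100: 0x1, 0x0200: 0x01000100, 0x0300: 0x7}

for method, (a, b) in diff_method_dumps(run_small_tiles, run_large_tiles).items():
    print(f"method {method:#06x}: {fmt(a)} -> {fmt(b)}")
```

Keeping everything else about the two runs identical is what makes the diff short enough to read; any unrelated state change would show up in the same list.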
02:21 fdobridge_: <bylaws> I don't think nv moves stuff around that much
02:21 fdobridge_: <bylaws> maybe
02:24 fdobridge_: <pac85> Mmm I see, then maybe worth trying to play around with those regs in nvk. It'd be interesting to know whether it just renders to cache and the tile size won't break rendering if gotten wrong like on amd or whether it must be gotten right
02:26 fdobridge_: <redsheep> I would bet if it can be wrong at all something might still explode if the tile size is set too large, but once I can enable it at all it should be pretty easy to incrementally push it larger. I can also just count pixels when using trianglebin on the blob.
02:40 fdobridge_: <redsheep> I thought I might just go ahead and try to write down some tile sizes for AD102 and they're surprisingly massive, maybe due to the massive cache? Even at 8x msaa with 128bpp pixel format they appear to be 512x1024, and without any msaa and at 32bpp they're absolutely enormous at 4096x2048.
02:40 fdobridge_: <redsheep> No wonder I had a hard time telling the difference between tiled on and off, you could fit an entire 1080p framebuffer into one tile up to 4x msaa
02:48 fdobridge_: <redsheep> Hmm. Yeah it appears that the calculation is as simple as samples*bpp*x*y < l2 cache size with it preferentially having larger y values. That might mean that the optimal tile sizes are different between chips that cut down that cache vs ones that do not
02:54 fdobridge_: <redsheep> I don't have any other gpus to confirm but I would bet that holds true of the l2 cache on maxwell and ampere, so this might actually be really easy.
03:04 fdobridge_: <redsheep> I got somebody on a GA104 to test it and no, it doesn't appear it is that simple at all. That chip has 4MB of L2 cache, and appears to be making tiles that would be 16 or 32 MB, which doesn't make any sense. Maybe there's compression, or maybe it's just got different silicon that the data is going to.
03:07 fdobridge_: <redsheep> GA104 has 128x256 tile size at 128bpp and 8x msaa.
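The footprint behind the samples*bpp*x*y hypothesis can be worked out for the tile sizes quoted above. A minimal sketch: the function name is made up here, and the inputs are simply the figures reported in the chat, not independently verified.

```python
MIB = 1024 * 1024


def tile_footprint_bytes(width, height, bits_per_pixel, samples=1):
    """Bytes needed to hold one tile: samples * bytes-per-pixel * x * y."""
    return width * height * (bits_per_pixel // 8) * samples


# Tile sizes as reported in the chat: (chip, x, y, bpp, MSAA samples).
observations = [
    ("AD102", 512, 1024, 128, 8),
    ("AD102", 4096, 2048, 32, 1),
    ("GA104", 128, 256, 128, 8),
]

for chip, w, h, bpp, samples in observations:
    size = tile_footprint_bytes(w, h, bpp, samples)
    print(f"{chip}: {w}x{h} @ {bpp}bpp, {samples}x MSAA -> {size / MIB:g} MiB")
```

With these inputs the footprints come out to 64 MiB and 32 MiB for the two AD102 cases and 4 MiB for the GA104 case.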
06:47 fdobridge_: <!DodoNVK (she) 🇱🇹> https://youtu.be/8c2AbwLv6BY :triangle_nvk:
06:54 EisNerd: does someone have an idea how to work around #188, so I get my dedicated accelerator working again with recent kernels?
06:55 EisNerd: (compiling the kernel is no issue for me)
09:13 EisNerd: btw the log doesn't work, looks like php was disabled for the location
10:07 EisNerd: looks like the log doesn't work, as php was disabled in this location
10:09 EisNerd: Can anyone read this? I'm slowly getting the impression I'm set to no voice for some reason
10:10 HdkR: I can read it
10:11 HdkR: +M is set so you're authenticated with nickserv
10:13 HdkR: As for your question about the kernel issue, I'm unrelated so I can't give an answer.
10:40 karolherbst: uhh.. somehow gitlab never sent me any notifications for that bug...
10:41 karolherbst: EisNerd: still happening with mainline? But anyway, I don't think I ever hit this bug on any of my gpus, maybe Lyude has, but I think what might help is to figure out what commit actually broke it
11:19 EisNerd: as said I build my kernel anyway, so if someone comes up with an educated guess, we can work on it
11:19 EisNerd: the other point is, can we disable the audio part, so I can use the accelerator? As I don't use HDMI audio anyway
11:49 karolherbst: EisNerd: what would that achieve?
12:15 EisNerd: the error message seems to imply that it is in the hd audio part, so maybe if we disable this part of the code, the rest may become usable again
12:17 EisNerd: and if this works, we can focus on commits regarding the audio relevant code
12:22 EisNerd: btw does someone know if the GK106GLM [Quadro K2100M] can provide dx 11 fl 10, via vulkan + dxvk in latest wine/proton?
12:24 EisNerd: as my Intel HD Graphics 4600 (HSW GT2), seems not to be able to
12:30 EisNerd: ok, just tried to load nouveau, while laptop is running with X => immediate freeze
12:31 EisNerd: so this is still an issue
12:32 EisNerd: btw "echo 'auto' > '/sys/bus/pci/devices/0000:01:00.0/power/control'" at least stops the gpu from heating for nothing
12:35 EisNerd: I'll need to check if I can get a git checkout (kernel) to build
12:35 EisNerd: and if it works without genpatches
13:58 EisNerd: is there some specific git repo I should use?
15:19 EisNerd: seems the regression was introduced with 6.0
15:30 EisNerd: any idea if nouveau relies on code in drm/msm?
15:47 fdobridge_: <Sid> shouldn't
15:54 EisNerd: most likely b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
15:54 EisNerd: I'll try to compile a kernel right before and one right after
15:55 EisNerd: then I can boot and check if it loads
15:55 EisNerd: pulling kernel git takes serious time
15:57 EisNerd: damn can't search right now on github, can someone tell me where "nvkm_vmm_new_" could be found?
16:05 EisNerd: found it
16:17 karolherbst: EisNerd: it doesn't have anything to do with audio
16:19 karolherbst: well.. maybe the audio stream setup in dp is broken, but the warning isn't necessarily related to the crash you are seeing
16:19 karolherbst: and for the git repo, you can just use whatever kernel.org provides
16:31 EisNerd: ok, I'll try right after the above mentioned commit
19:23 EisNerd: ok, I think I'll first check if the kernel boots with nouveau blacklisted and then I'll try to load it
23:28 EisNerd: ok I found the commit as expected, the already named one
23:29 EisNerd: I built b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1 => modprobe produces error
23:30 EisNerd: I built 12b68040a5e468068fd7f4af1150eab8f6e96235 => modprobe works