IRC Logs of #nouveau on irc.freenode.net for 2024-10-11

01:44 esdrastarsis[d]: airlied[d]: Does zink-video work on nvidia proprietary driver?
01:45 airlied[d]: don't know, maybe zmike[d] has given it a spin there, I don't really care about it
01:48 zmike[d]: I haven't, but maybe next week
13:48 tiredchiku[d]: bello, vacation ends in less than 24 hours
13:48 tiredchiku[d]: caught up on the NVK talk, still need to go through Mary's mesh shader one
13:49 tiredchiku[d]: gfxstrand[d]: how do I identify whether mine's ampere A or B
13:51 tiredchiku[d]: gonna get back to figuring out depthClampZeroOne and binding address report as soon as I get home tomorrow evening
13:58 tiredchiku[d]: nvidia's wayland presentation was also quite interesting
13:58 tiredchiku[d]: I do like the push for vulkan as the graphics API across the desktop, afaik even kde had a vulkan roadmap for plasma
14:00 tiredchiku[d]: https://invent.kde.org/plasma/kwin/-/issues/169
14:01 mohamexiety[d]: tiredchiku[d]: A is just the A100 and derived parts (A10, A30, et al) so don't worry about it
14:03 tiredchiku[d]: I see, thanks
14:05 tiredchiku[d]: I don't really understand the hardware, nor the kernel side of things, nor most of the technical words thrown around in these spaces
14:05 tiredchiku[d]: but I have a fancy mallet and free time and I'm gonna contribute in whatever way I can 😅
14:05 tiredchiku[d]: https://tenor.com/view/patrick-star-dumb-duh-gif-13669009
14:06 tiredchiku[d]: me attempting to contribute, circa 2024, colorized
14:07 asdqueerfromeu[d]: tiredchiku[d]: Wouldn't this migration increase the risk of GPU hangs though? :cursedgears:
14:34 tiredchiku[d]: driver skill issue
15:52 gfxstrand[d]: The 1650 I plugged into my desktop this morning says the 1650 issues aren't with NVK. 😕
15:52 tiredchiku[d]: nouveau?
15:52 tiredchiku[d]: (the kernel module)
15:53 gfxstrand[d]: When the time comes to R/E RT, I'm going to be curious to know if they actually deleted the hardware on 1650 or if they just disabled it in SW.
15:53 gfxstrand[d]: tiredchiku[d]: Yeah. They're probably kernel issues.
15:53 tiredchiku[d]: gfxstrand[d]: the hardware is missing, afaik
15:53 tiredchiku[d]: a quick look at techpowerup reveals it
15:53 tiredchiku[d]: tho, I do remember the 1660Ti having a small number of "tensor cores"
15:54 tiredchiku[d]: desktop 1660Ti, that is
15:54 mohamexiety[d]: it's a bit weird. so the original TU11x cards have deleted hardware
15:54 mohamexiety[d]: but after a while, they started repurposing TU104 and TU106 dies into GTX 1650s and 1660s
15:56 mohamexiety[d]: those afaik do not have the hardware deleted, and there was even some old AI software that would detect tensor cores and try to use them -- but they were software throttled so it led to really poor performance
15:56 mohamexiety[d]: see https://github.com/LeelaChessZero/lc0/issues/1670
15:56 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1294327842565259344/image.png?ex=670a9c3d&is=67094abd&hm=0a12ace4400a4458a194f9760550725d9b2fae296de91a475cc5b4853e01265a&
15:56 mohamexiety[d]: this is an NV engineer
15:56 tiredchiku[d]: tl;dr, it lacks the hardware for all intents and purposes :P
15:59 tiredchiku[d]: gfxstrand[d]: I do have a 1660Ti (laptop) if you need things tested
15:59 gfxstrand[d]: This one is a TU106
15:59 gfxstrand[d]: I should ebay an older one
15:59 tiredchiku[d]: mine's a 116, iirc
15:59 tiredchiku[d]: 116M doesn't have the hardware tho
16:02 tiredchiku[d]: I'm kinda tempted to pick up one of those hacked together mobile-chips-but-for-desktop
16:02 tiredchiku[d]: but customs might screw me over .-.
17:14 redsheep[d]: I wonder if it could ever be possible to use the RT hardware on TU106 1600 cards. It's likely not practical to fuse it off, so at worst I'd expect it's just not tested for defects. But yes, the hardware is definitely missing from TU116; the transistor counts and die sizes make that pretty clear
17:22 gfxstrand[d]: According to Wikipedia, if I buy a 1660 or 1630, it'll be an actual 11x chip
17:32 gfxstrand[d]: I just ebay'd a 1660 Ti. If Wikipedia is to be believed, there is only one model and it's always a 116.
17:36 tiredchiku[d]: all the best!
17:41 gfxstrand[d]: It looks like the 1630 is always a 117 but IDK if I believe it.
19:18 airlied[d]: What's the problem with 116/117?
19:26 asdqueerfromeu[d]: airlied[d]: I have no DP audio with GSP enabled for example
19:27 airlied[d]: I doubt that is what gfxstrand[d] is referring to though
19:30 mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1294381598573199522/image.png?ex=670ace4d&is=67097ccd&hm=def4a7029ebe1dd51a83e65096bdc4b1f5dbc007d4e50d0f9d6f974c12d49e62&
19:30 mohamexiety[d]: airlied[d]: someone in xdc chat on matrix mentioned some application crashes (https://gitlab.freedesktop.org/mesa/mesa/-/issues/11901) and faith says that this isn't the only case of 1650s having issues
19:47 airlied[d]: I have tu11x cards just not plugged in
20:20 gfxstrand[d]: Also, when we start bringing up RT features, there's hardware missing and I don't know what. I mean, I know what at the marketing level but not the technical.
20:43 redsheep[d]: I'd bet most of the new instructions full turing has over pascal aren't there
21:50 gfxstrand[d]: Maybe? That would mean no bindless UBOs and a few other things.
21:50 gfxstrand[d]: That would kinda suck but I also wouldn't be too surprised.
21:51 gfxstrand[d]: At the very least, I'm guessing the tensor cores aren't there.
21:57 mohamexiety[d]: it should be just RT and tensor core stuff that gets cut (and even for the tensor cores, they actually added separate HW for some instructions that run on tensor cores on RTX GPUs: FP16 is done by the tensor cores on RTX GPUs, but GTX 16xx has dedicated HW for it), but we'll see I guess
21:59 gfxstrand[d]: karolherbst[d]: Care to try building https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594 on your macbook? It's failing the aarch64 build in CI.
22:21 redsheep[d]: gfxstrand[d]: Oh I bet that stuff is there
22:24 redsheep[d]: I'm just glad they only opted to have a weird split architecture that one time
22:35 tiredchiku[d]: wouldn't simply running vulkaninfo on GTX and RTX Turings tell us what has what?
22:38 TranquilIty[m]: This message is a test - does the Matrix bridge function from the Matrix side to the Discord & IRC side ?
22:39 blockofalumite[d]: Seems that it does, yippee (sorry for disrupting the convo 🙏)
23:23 karolherbst[d]: gfxstrand[d]: ping me again tomorrow
23:25 gfxstrand[d]: kk
23:25 redsheep[d]: Do you have a branch anywhere for the beginnings of the nvk port to windows?
23:25 gfxstrand[d]: tiredchiku[d]: No. That'll just tell us what extensions are exposed. It won't actually tell us what hardware does or doesn't exist. Things might be emulated or disabled by SW even though HW exists.
23:26 redsheep[d]: I'd definitely like to help with that, maybe more than anything else right now, especially since I can be booted into windows more often
23:26 gfxstrand[d]: redsheep[d]: No. I started doing the very early RE but got distracted building the NVKMD abstraction and never actually wrote any Windows code.
23:26 redsheep[d]: Ah, ok
23:26 redsheep[d]: I like your gcc vision
23:26 redsheep[d]: Obviously
23:26 gfxstrand[d]: If you wanted to work on it, you'd be more than welcome to.
23:27 gfxstrand[d]: The RADV code exists so you can see roughly what WDDM calls to make and the tools I used for the RE are all public.
23:27 gfxstrand[d]: But it can be kind of daunting to dive into that on your own.
23:27 redsheep[d]: Yeah I'll see if I can figure out how your radv branch works
23:27 redsheep[d]: I could even test that with my igpu as a starting point
23:28 gfxstrand[d]: It may not work out-of-the-box because I've hard-coded the device info. But I'm also not using much out of the device info so there's a good chance basic stuff will be okay as long as the HW generation is close enough.
23:28 gfxstrand[d]: Or you can copy and paste your device info from Linux.
23:28 gfxstrand[d]: I literally just printed it in GDB and copied and pasted
23:28 redsheep[d]: Hmm ok, makes sense
23:29 gfxstrand[d]: But it's a pain. RADV has a very big device info. NVK's is just PCI IDs, class versions, and like 2 other things.
23:30 gfxstrand[d]: If you want to take it on, the first thing to get working is device enumeration. If you pull !29945 as your starting point, it has the hooks for device enumeration. You just need to add a wddm2 back-end to NVKMD and start stubbing things out and plumb that through.
23:30 gfxstrand[d]: Second thing is memory allocation and mapping.
23:30 redsheep[d]: I'm worried the Nvidia kmd is further from the nouveau kmd than amdgpu is from their windows kmd
23:31 gfxstrand[d]: It is but it shouldn't be too far from the NVKMD abstraction.
23:31 gfxstrand[d]: NVKMD was very much designed with WDDM2 in mind
23:31 redsheep[d]: I think Dave was right that a good stepping stone would be porting to that on Linux
23:32 redsheep[d]: Ok, I'll poke at it. It's probably over my head but maybe I'll figure some things out
23:33 gfxstrand[d]: Even if it is a bit, you'll probably still learn things
23:34 gfxstrand[d]: And I do want to make it happen one day. There's a friend of mine at NVIDIA who keeps asking when there will be a Windows port because he wants to try it. 😂
23:35 redsheep[d]: Yeah I think the idea has huge potential to end up making mesa better
23:35 redsheep[d]: And I really like the idea of open drivers on windows, windows developers don't hate open source, there just isn't enough momentum when it comes to drivers
23:43 gfxstrand[d]: Yeah
23:45 gfxstrand[d]: And honestly running on the NVIDIA KMD shouldn't be bad. Memory allocation might be tricky to get right and IDK how sparse will work in practice (the WDDM2 bind call doesn't have a `pPrivateData` so there's no place to provide a PTE kind like on nouveau). But submission should be pretty easy. We get a mapped buffer somehow and access to the page with the doorbell and the rest is documented.
23:51 redsheep[d]: Once it works on windows I wonder how much more it would take for nsight to work, if anything. I think it ties back into being able to improve performance which is a big part of it for me