IRC Logs of #nouveau on irc.freenode.net for 2025-01-19

01:18 gfxstrand[d]: RE: Vulkan in compositors
01:18 gfxstrand[d]: I don't think it will be too many more years before there are features in Vulkan that aren't in OpenGL that compositors will want. Like the ability to look at timeline semaphores from shaders. IDK that they'll need to go Vulkan-only any time soon but the Vulkan backend may be better.
01:18 gfxstrand[d]: That's not me promising any such features on any particular timeline but my spidey senses say something like that may be coming.
01:19 gfxstrand[d]: But I would be sad if major desktop or toolkits dropped GL too quickly. And things like xfce can continue to support it forever.
01:21 orowith2os[d]: clearly everything should just use wgpu /j
01:21 orowith2os[d]: (I would unironically say that, but I know some things can't due to it being too limited in some cases)
01:28 Lynne: no pointers in wgsl; it's a toy
01:29 Lynne: I looked, I hate glsl _that_ much
03:23 gfxstrand[d]: Yeah, all the fancy features in WGSL are fake. It has to be able to compile down to some truly ancient SPIR-V variants as well as MSL.
07:06 tiredchiku[d]: how much of nvkmd_pdev and nvkmd_dev is shared with nvk's vkPhysicalDevice and vkDevice?
07:09 tiredchiku[d]: ah the entire structs are passed
07:09 tiredchiku[d]: meaning I should probably try to retain the structs as they are on nvrm too
08:42 gradualweist: As i said, it's that i have everything , and i would commit otherwise everything needed for old hw and new too, best would perform FPGA type of fabric overall. But we do not have so strong military to guard for terrorism, actually sanctions on the market has to be placed for imports into certain areas etc. That ecosystem has to be built before. In other words we have all the software , and
08:42 gradualweist: you should be patient not to commit out of shoes lines anyhow, continue your great work as nothing happened, bring up all the new hw, and i say as long as you bring in +- and memory management, i am your backup plan right away, i will help you once again out, but indeed it is not clever to introduce those algorithms under the hood to all parties in the world. Contributions of mine
08:42 gradualweist: otherwise would not be very complex.
08:43 gradualweist: i would participate in OS work or driver bring up my own too, but i have no point to do this, because no one pays me for it.
14:08 verifiedgangsta: there is no actual complexity which is why if you were anyone on this planet, you'd had figured out it long time ago, that on indexing i use three powers to measure a mate to the number, that is used top and bottom subtract operands both, where as to cancel or permit a value with the help of the mate, we use a number gotten for both from last power in pseudo code, and that is the
14:08 verifiedgangsta: simplest ever method against the magic value of all combinations, to filter that out from the data banks, but if you never ever worked hard in your life with your head, such as the estonians who will get killed soon and use computers to do fraud such as wank tunes and providing wrong info the world and lifting your moods up by trashing others, it's not only expected that you get killed
14:08 verifiedgangsta: alike , but it's also expected that you are deadly stupid too.
15:58 phomes_[d]: mhenning[d]: I tested with X4 Foundations, Serious Sam 2017, Evil Genius 2, and Counter-Strike 2. No difference in fps reported by mangohud. I should probably log the use of instancing in each game so we know whether the games are relevant for testing this...
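[Editor's sketch] One way to act on the logging idea above is to tally, per game, what share of recorded draws use an instance count above one. The record format (`game`, `instance_count`) and the function name `instancing_share` are hypothetical, chosen only for illustration; nothing here reflects how NVK or mangohud actually expose draw data.

```python
from collections import defaultdict


def instancing_share(draws):
    """Return {game: fraction of draws with instance_count > 1}.

    `draws` is an iterable of (game, instance_count) tuples. This record
    format is an assumption made for illustration only.
    """
    total = defaultdict(int)
    instanced = defaultdict(int)
    for game, instance_count in draws:
        total[game] += 1
        if instance_count > 1:
            instanced[game] += 1
    return {game: instanced[game] / total[game] for game in total}


# Made-up sample records, not measurements from any real game.
sample = [
    ("X4 Foundations", 1),
    ("X4 Foundations", 64),
    ("Counter-Strike 2", 1),
    ("Counter-Strike 2", 1),
]
print(instancing_share(sample))
```

A game whose share is close to zero would probably not be a useful test case for an instancing-related change.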
18:31 mhenning[d]: Thanks for testing!
18:33 mhenning[d]: I was hoping it would fix the framerate issues on Baldur's Gate 3 since the renderdoc capture I have shows it using quite a bit of instancing. But no change there, either.
18:36 karolherbst[d]: I doubt on that level anything is really noticeable, because shader execution is prolly more relevant. Might show up in high fps "benchmarks" tho
18:36 karolherbst[d]: or anything needing high draw counts
18:43 mhenning[d]: You might think so, but this BG3 trace has me scratching my head a bit. Improve the compiler to reduce alu usage? No change. Improve shader memory bandwidth? No change. So I was wondering if it was something silly like being bottlenecked by this MME loop but that also made no change.
18:43 karolherbst[d]: might be cache
18:43 karolherbst[d]: but dunno
18:43 karolherbst[d]: maybe it's just CPU bound?
18:44 karolherbst[d]: ohh wait.. it's a trace
18:44 karolherbst[d]: still.. might be cpu bound
18:46 mhenning[d]: My current hunch is that we're doing too many WFIs around clears, but we'll see if that's actually the issue or not
18:46 mhenning[d]: I don't think cpu usage is that high
18:46 karolherbst[d]: possibly, but then you'd still see perf gains elsewhere I think
18:47 karolherbst[d]: well....
18:47 karolherbst[d]: perf counters aren't wired up yet
18:47 karolherbst[d]: they might tell
18:48 mhenning[d]: Yeah, it might be worthwhile to invest in some better profiling
18:48 karolherbst[d]: could start with the engine counters, they aren't that hard, just needs kernel patches
18:48 karolherbst[d]: and they can tell how long all those engines are busy
18:48 karolherbst[d]: or well.. how much they are busy
18:49 karolherbst[d]: I'm sure GSP has an API for them because it's the thing you want to do in firmware
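[Editor's sketch] The "how much they are busy" measurement discussed above usually comes down to sampling a busy-time counter twice and dividing the difference by the elapsed time. The sample layout `(busy_ns, timestamp_ns)` and the function name `engine_utilization` are hypothetical; the real counter format exposed by the kernel or GSP is not described in this log.

```python
def engine_utilization(prev, curr):
    """Fraction of time an engine was busy between two samples.

    Each sample is a (busy_ns, timestamp_ns) pair of monotonically
    increasing counters. This layout is an assumption for illustration.
    """
    busy_prev, ts_prev = prev
    busy_curr, ts_curr = curr
    elapsed = ts_curr - ts_prev
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (busy_curr - busy_prev) / elapsed


# Made-up numbers: 3000 ns busy over a 10000 ns window.
print(engine_utilization((1_000, 10_000), (4_000, 20_000)))
```

Comparing such a ratio across engines for the same trace could hint at which engine, if any, is the bottleneck.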
18:52 mhenning[d]: Yeah, I'm hesitating on that because I actually haven't ventured into kernel space so far
18:52 mhenning[d]: but yeah, I might take a crack at that if none of the kernel people want to work on it
18:53 karolherbst[d]: the vgpu header has `GET_ENGINE_UTILIZATION` mhhh
18:53 mhenning[d]: I guess the old gl driver also has that method of reading some perf counters by dispatching shaders which could be done without kernel changes
18:53 karolherbst[d]: it needs kernel changes
18:53 karolherbst[d]: it's a privileged thing
18:54 karolherbst[d]: and you need to ask the kernel to set it up...
18:54 karolherbst[d]: ohh wait
18:54 mhenning[d]: at the expense of being a little more overhead
18:54 karolherbst[d]: we have this mme macro
18:54 karolherbst[d]: the thing is...
18:54 karolherbst[d]: configuring the slot is an entire black box
18:54 mhenning[d]: we might be talking about different perf counters right now
18:54 karolherbst[d]: so one needs to reverse-engineer nvidia tools to know what bits give you what counter
18:54 karolherbst[d]: nah
18:55 karolherbst[d]: the slots are configured on the context switched gr mmio regs
18:55 karolherbst[d]: and you can read them out via shaders
18:55 karolherbst[d]: we used the sw class on previous gens to configure it
18:56 karolherbst[d]: I think....
18:58 karolherbst[d]: yeah...
18:58 karolherbst[d]: `BEGIN_NVC0(push, SUBC_SW(0x0600), 1)`
18:59 mhenning[d]: I was talking about the stuff in nv50_query* in the gl driver.
18:59 karolherbst[d]: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c?ref_type=heads#L2335
18:59 karolherbst[d]: yes
18:59 karolherbst[d]: I do too
19:00 karolherbst[d]: there are some sw methods we added so userspace can configure the mp counters
19:00 karolherbst[d]: but
19:00 karolherbst[d]: I'm not at all familiar with the details
19:00 karolherbst[d]: but the SET_PRIV_REG mme macro should work for those as well
19:01 karolherbst[d]: so I don't think it will require actual changes
19:01 karolherbst[d]: but dunno.. might have to allow access to certain regs and figure out the allow/deny listing stuff
19:01 karolherbst[d]: those things are kinda security relevant because of side-channel stuff, so who knows
21:27 gfxstrand[d]: mhenning[d]: That could be. I know Nvidia has some pretty complex logic in their driver to try and avoid WFI whenever possible. We're pretty naive right now. We're being especially dumb around copies.
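[Editor's sketch] As a toy illustration of why a naive WFI (wait-for-idle) policy can emit more waits than necessary, the model below skips a WFI when no work has been recorded since the previous one. The `CommandRecorder` class and `record_clears` helper are hypothetical and are not modeled on NVK's push-buffer code or on Nvidia's driver logic, which the log only describes as "pretty complex".

```python
class CommandRecorder:
    """Toy command stream used only to count WFIs."""

    def __init__(self, elide_redundant_wfi):
        self.elide = elide_redundant_wfi
        self.commands = []
        # True if any work was recorded since the last emitted WFI.
        self.work_pending = False

    def work(self, name):
        self.commands.append(("work", name))
        self.work_pending = True

    def wfi(self):
        # With elision on, a WFI with nothing in flight is dropped.
        if self.elide and not self.work_pending:
            return
        self.commands.append(("wfi", None))
        self.work_pending = False

    def wfi_count(self):
        return sum(1 for kind, _ in self.commands if kind == "wfi")


def record_clears(recorder, n):
    # Naive pattern: idle the GPU before and after every clear.
    for i in range(n):
        recorder.wfi()
        recorder.work(f"clear{i}")
        recorder.wfi()


naive = CommandRecorder(elide_redundant_wfi=False)
eliding = CommandRecorder(elide_redundant_wfi=True)
record_clears(naive, 3)
record_clears(eliding, 3)
print(naive.wfi_count(), eliding.wfi_count())
```

Even this trivial rule halves the WFI count for back-to-back clears in the toy model; a real driver would also need to reason about which engines and resources actually depend on each other.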