01:26mangodev[d]: question
01:26mangodev[d]: what is a subchannel split? i've seen them talked about a ton with general nouveau/nvk development, but i can't really find what it means or refers to
01:26mangodev[d]: is it a type of branch?
01:37mangodev[d]: while i'm at it
01:37mangodev[d]: what *is* libpoly? is it relevant to most mesa drivers, or does it only pertain to specific ones?
01:38mangodev[d]: i'm aware that it's split from the asahi driver's source code, but i'm not 100% sure what it's for, as there's no documentation on mesa3d.org about it
01:39sonicadvance1[d]: libpoly{gon} for fancy things around tessellation.
01:39sonicadvance1[d]: and GS I think?
01:40mangodev[d]: is it for more restrained/mobile hardware? or for all hardware in general?
01:40mangodev[d]: i saw that the MR tag for it describes libpoly as "tesselation and geometry shader **emulation**"
01:40sonicadvance1[d]: Yea, that describes it pretty well. For hardware that needs tessellation and geometry emulated, which most hardware just uses a compute shader for.
01:41mangodev[d]: ahhhhh okay, i see
01:42mangodev[d]: i was thinking it was mobile related, given it originally stemmed from the asahi driver
01:58airlied[d]: a subchannel is when you have a channel and it has different classes for copy engine, gfx engine etc
01:58airlied[d]: and switch is moving between the subchannels on a channel
03:11mhenning[d]: mangodev[d]: I wrote some docs on subc switches here: https://docs.mesa3d.org/drivers/nvk/hardware_docs.html#subchannel-switches
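A toy model may help picture the answer above: a channel carries several subchannels, each bound to an engine class, and a "switch" happens whenever consecutive methods target different subchannels. This is only a sketch; the `Channel` class, the `count_subchannel_switches` helper and the subchannel-to-class labels are hypothetical illustrations, not nouveau or NVK code.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    # subchannel index -> engine class label (illustrative labels only)
    bindings: dict[int, str] = field(default_factory=dict)

    def bind(self, subc: int, engine_class: str) -> None:
        """Bind an engine class to one subchannel of this channel."""
        self.bindings[subc] = engine_class


def count_subchannel_switches(methods: list[int]) -> int:
    """Count how often consecutive methods target a different subchannel."""
    switches = 0
    for prev, cur in zip(methods, methods[1:]):
        if prev != cur:
            switches += 1
    return switches


chan = Channel()
chan.bind(0, "3D")
chan.bind(1, "COMPUTE")
chan.bind(4, "COPY")

# Each entry is the subchannel a pushed method is addressed to.
stream = [0, 0, 1, 1, 0, 4]
print(count_subchannel_switches(stream))
```

The model only counts transitions; what the hardware actually does on a switch is covered in the NVK docs linked above.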
09:08karolherbst[d]: Mhhhh I wonder what's up with that.. the 3D vs compute thing
09:09karolherbst[d]: though there is SCG and maybe they referred to that 🙃
09:11mohamexiety[d]: yeah it's weird. on windows too if you take nsight to e.g. dlss captures you can also see that there's no wfis
09:12mohamexiety[d]: so it's weird
09:12misyltoad[d]: gfxstrand[d]: Do you have any time to review my DLSS MR at some point?
09:12misyltoad[d]: It has some RBs by others but would definitely like your blessing
09:22karolherbst[d]: mohamexiety[d]: well you can do that just fine with SCG in theory
09:22mohamexiety[d]: yes but there are wfis on ada and older
09:22karolherbst[d]: mhhhh
09:22karolherbst[d]: well SCG isn't magic, so maybe they can't use it for DLSS on older gens
09:23karolherbst[d]: or maybe there isn't a WFI under certain circumstances indeed
09:25marysaka[d]: mohamexiety[d]: btw I noticed that envyhooks might be broken on Blackwell B as GpGet is now gone
09:25marysaka[d]: I rely on it to determine the progression of the command buffer dumper
09:25mohamexiety[d]: GpGet?
09:32marysaka[d]: mohamexiety[d]: it's part of the GPFIFO mapping, when you submit work you write to GP_PUT the offset of the last pushbuf entry to run
09:32marysaka[d]: GP_GET is what I was using to read the current value, to know the range of entries to dump
09:33marysaka[d]: I think I could probably keep track of the last GP_PUT values on a per channel basis in envyhooks instead anyway
09:33mohamexiety[d]: interesting. i wonder how things worked seemingly fine till now lol
09:34marysaka[d]: mohamexiety[d]: It is possible that it reads as 0 and ends up dumping everything... and the deduplication logic is kicking in
09:35marysaka[d]: I feel it would be cleaner to just keep track of GP_PUT anyway as this will map to what the user requested in that case
09:35marysaka[d]: instead of whatever is in flight at the moment of dumping
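The idea of remembering the last GP_PUT per channel, rather than reading GP_GET, could look roughly like the sketch below. It is not envyhooks code: `GpPutTracker`, `on_submit` and the ring size are hypothetical names, and it assumes GP_PUT is an entry index into a ring of `num_entries` slots, with newly submitted entries forming the half-open range from the previous GP_PUT up to the new one.

```python
class GpPutTracker:
    """Remember the last GP_PUT seen per channel (hypothetical helper)."""

    def __init__(self, num_entries: int) -> None:
        # Number of GPFIFO entries in the ring (assumed equal for all channels).
        self.num_entries = num_entries
        self.last_put: dict[int, int] = {}

    def on_submit(self, channel_id: int, new_put: int) -> list[int]:
        """Return the ring indices submitted since the previous GP_PUT."""
        start = self.last_put.get(channel_id, 0)
        # Modulo handles wraparound when new_put is numerically below start.
        count = (new_put - start) % self.num_entries
        self.last_put[channel_id] = new_put
        return [(start + i) % self.num_entries for i in range(count)]


tracker = GpPutTracker(num_entries=8)
print(tracker.on_submit(1, 3))  # first submit on channel 1
print(tracker.on_submit(1, 1))  # wraps around the end of the ring
```

Tracking this way dumps exactly what the user submitted on each write, independent of how far the GPU has progressed at dump time.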
09:36mohamexiety[d]: yeah
14:09phomes_[d]: misyltoad[d]: Jedi Fallen Order was on -90% sale so I just grabbed it. It should be UE4 with DLSS support so another data point to see if the issue I found is UE4 specific
16:00karolherbst[d]: `Pass: 14298, Skip: 14701, Timeout: 1, Duration: 6:12, Remaining: 10:03:49`
16:02karolherbst[d]: ohh I had a bad print in my local code 🥲
16:50mhenning[d]: karolherbst[d]: SCG is for async compute queues, not compute submitted to a graphics queue
16:54karolherbst[d]: mhenning[d]: it's actually the same thing
16:55karolherbst[d]: like SCG is always on the graphics queue
16:56mhenning[d]: I don't know what you're talking about
16:57karolherbst[d]: like the point of SCG is to run graphics and compute shaders simultaneously
16:57mhenning[d]: right
16:57mhenning[d]: and to do that you create two queues and pair them together
17:00karolherbst[d]: I mean it kinda depends on what level of abstraction you are looking at
17:00karolherbst[d]: it's just scheduling in the end
17:02mhenning[d]: I mean, sure we could create two hardware queues for one VkQueue and implement it that way
17:02mhenning[d]: but SCG the hardware mechanism still requires two hardware queues
17:02karolherbst[d]: I don't think there is any need for two hardware queues
17:02mhenning[d]: and WFIs are a same-queue issue
17:02karolherbst[d]: at least not on newer gpus
17:03karolherbst[d]: it might be that on pascal it still needs two queues because it's more static
17:06karolherbst[d]: unless you mean two API queues in which case that's probably how nvidia does leverage SCG in their driver