13:10fdobridge: <gouz> 100% tessellation CTS tests on turing!
13:48fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Hopefully this will check the first DXVK box 😅
13:50fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> And one zink box for OpenGL 4.0 🐸
13:59fdobridge: <gouz> Now running a complete cts run .. let's see how many fails I get
14:09fdobridge: <Rhedox> how much of this is NIR and how much of this will have to be redone for NAK?
14:25fdobridge: <gouz> Small changes in codegen overall
14:34fdobridge: <gouz> I get many fails on combined image sampler usage with tess. Generally on a full run we are +7k pass and +9.5k fails.
14:41fdobridge: <gouz> I get many fails on descriptors usage with tess. Generally on a full run we are +7k pass and +9.5k fails. (edited)
15:05fdobridge: <Conan Kudo (ニール・ゴンパ)> how'd this go?
15:06fdobridge: <karolherbst🐧🦀> the meeting would have been in three hours if the host hadn't canceled it for health reasons 🥲
15:06fdobridge: <karolherbst🐧🦀> I'll discuss it with Kevin and start a proper thread with all the people involved
15:08fdobridge: <karolherbst🐧🦀> I'll make sure you are on Cc there
15:15fdobridge: <gouz> Hmm, running it with NV50_PROG_OPTIMIZE=1 most of these fails have vanished.
15:16fdobridge: <gouz> With it a full run is +17k pass +150 fails (edited)
15:29fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Just needs some NIR lowering sauce 🍩
15:29fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> NV50_PROG_OPTIMIZE=1 is bad for performance (but NAK may be decent fairly soon)
15:41fdobridge: <gouz> Geometry shaders seem to be also affected by this
15:44fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> But using the NV50 variable destroys GTA San Andreas performance 🐸
17:54fdobridge: <gouz> There seems to be an issue with the FlatteningPass in Program::optimizePostRA
18:19fdobridge: <karolherbst🐧🦀> yes
18:24fdobridge: <karolherbst🐧🦀> I can't remember but it was doing some bogus stuff, which... is kinda okay in most cases, but sometimes totally not
18:37nvuser: RSpliet: ah, I'll try to see if I can get the same information from my maximum boost clock of 1150MHz, or what Nvidia says is 1150MHz
18:43nvuser: it seems that 1137MHz is indeed a valid boost clock for this card anyway, but I guess that info may be helpful somehow in any case
18:44nvuser: I'll give it a try soon, see if I get something different out of it when "boosting"
18:55nvuser: yeah I didn't want to make more noise about it, I guess it's mostly useful at a later time
19:51nvuser: RSpliet: ah, I have those values as well now: https://dpaste.com/DF2JFLL2H
19:52RSpliet: nvuser: it's fine to make a bit of noise if it's on-topic
19:52nvuser: Well yes, but I wanted to get all the info first
19:53nvuser: Those are the values when boosting to 1150MHz, which I think is the max for this card with its preinstalled fancy cooler, I guess maybe more for someone who has water cooling, or a hefty third party thing like the morpheus ii cooler
19:54RSpliet: Cool. I was specifically curious about those values for that mysterious 1006MHz where nouveau only sets 1000MHz
19:55nvuser: ah but it's the same case for nvboost=2, it sets 1137 there as well
19:56RSpliet: can you grab these registers when nouveau sets 1137 too?
19:57nvuser: nvboost=1 is exact on 0f, and also 0a, 07 etc are the same as nvidia, but nvidia does have something different for perf level 2, which I guess is 0d, could get something for that later maybe.. if those levels would be used
19:57nvuser: you mean grab it in the nvidia driver?
19:57RSpliet: Those PLLs you call 1150MHz by the way set the GPC to a meager 936MHz
19:58nvuser: So they wouldn't increase clock speeds on nouveau?
19:58RSpliet: nvuser: ok, so here's how I know
19:58RSpliet: that CLK0 coefficient is your GPC clock afaik
19:59RSpliet: The "base clock" going into the PLL is 405MHz. That's just from memory, don't ask me why, there's another PLL bringing a 27MHz crystal up to 405 MHz or something
19:59nvuser: yeah I was curious why it would be less somehow than 1006 mode
19:59RSpliet: You got three hex numbers in your coeff: 0x1, 0x25, 0x8
19:59RSpliet: The middle one is the multiplier
20:00RSpliet: The other two multiplied get you the final divider
20:00RSpliet: so (405 * 37) / (8 * 1) = 1873.125
20:00RSpliet: Divide by two to get what NVIDIA and nouveau print -> 937MHz or something along those lines
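The arithmetic RSpliet walks through above can be reproduced with a minimal sketch. It assumes only what the chat states: a 405MHz reference (quoted from memory), the middle coefficient as the multiplier, the other two coefficients multiplied together as the divider, and a final divide by two to get the reported clock. The function name and variable names are hypothetical and are not taken from the nouveau source.

```python
def pll_output_mhz(ref_mhz, multiplier, div_a, div_b):
    """PLL output as described in the chat: ref * multiplier / (div_a * div_b)."""
    return ref_mhz * multiplier / (div_a * div_b)


# Reference clock quoted from memory in the chat; treat it as an assumption.
REF_MHZ = 405.0

# The three hex numbers read out of the coefficient: 0x1, 0x25, 0x8.
# The middle one is the multiplier, the outer two form the divider.
div_a, multiplier, div_b = 0x1, 0x25, 0x8

pll_mhz = pll_output_mhz(REF_MHZ, multiplier, div_a, div_b)

# Divide by two to get the figure the drivers print.
reported_mhz = pll_mhz / 2

print(f"PLL output: {pll_mhz} MHz, reported clock: {reported_mhz} MHz")
```

With these inputs the reported clock lands at roughly 936 to 937MHz, which matches the figure quoted in the conversation rather than the 1150MHz shown by the NVIDIA tools.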
20:01nvuser: interesting, so nvidia driver 'black magic' is different?
20:01RSpliet: The way NVIDIA and nouveau choose these coefficients is possibly different
20:01nvuser: since it does print 1150
20:03RSpliet: I haven't looked at the NVIDIA control panel in absolute ages, so can't tell you what's going on there
20:03RSpliet: (seriously, I'm using AMD these days, and before I used nouveau, haven't done any dev in this in 10 years)
20:04nvuser: okay, what about the 1137 values.. you're interested in the ones from nouveau driver, not nvidia driver?
20:05RSpliet: I'm curious why nouveau and nvidia report different things. Either the coefficients are different, or the control panel is lying ;-)
20:05RSpliet: But bear in mind that NVIDIA changes that clock all the time based on temperature and power budget constraints. Life isn't as simple as "three modes, three clocks" anymore.
20:06nvuser: hehe, so want me to boot into nouveau driver and give you what nvapeek says?
20:06RSpliet: Don't think nouveau does all of that fancy stuff
20:06RSpliet: If you want to get to the bottom of things
20:06RSpliet: I don't know what it'll lead up to in the end, other than understanding
20:07RSpliet: By the way, here's the nouveau code for all of this
20:07RSpliet: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gk104.c?h=v6.2.10
20:07nvuser: sure, I'll go and see what it gives for nvboost=2, if there's anything worth noting I'll put it in an issue.. I guess just some issue like '780ti clock info (more perf available)' etc
20:08nvuser: ah so it all hides in gk104
20:08RSpliet: worse, calc_pll calls gt215_pll_calc(subdev, &limits, freq, &N, NULL, &M, &P); ;-)
20:08RSpliet: So there's a bit hidden there
20:10nvuser: okay, so.. I'll just reboot this to get nouveau loaded and see what it says
20:10nvuser: brb
20:25nvuser: RSpliet: here they are https://dpaste.com/6UWTQ687C
20:30nvuser: I wonder if this means the nvapeek values for 1150MHz boost are not useful, maybe I need to look elsewhere for boost clocks
23:03fdobridge: <Conan Kudo (ニール・ゴンパ)> oh yikes!
23:03fdobridge: <Conan Kudo (ニール・ゴンパ)> thanks!
23:04fdobridge: <karolherbst🐧🦀> ohh Kevin wants to discuss internally first so we can go to nvidia with a proper plan (also including the distribution pain with shipping it and everything)
23:04fdobridge: <karolherbst🐧🦀> like versioning
23:05fdobridge: <karolherbst🐧🦀> and storage usage explosion and everything
23:07fdobridge: <gouz> This makes all descriptor loads fail. I encountered it on tessellation and geometry shader tests. But I do not know yet why it is happening.. (edited)
23:08fdobridge: <Conan Kudo (ニール・ゴンパ)> should I know who Kevin is?
23:09fdobridge: <Conan Kudo (ニール・ゴンパ)> but yeah, if you want my assistance on coming up with a plan, feel free to loop me in
23:11fdobridge: <karolherbst🐧🦀> Kevin Martin
23:12fdobridge: <Conan Kudo (ニール・ゴンパ)> Ah, okay. Never met him. (edited)
23:13fdobridge: <Conan Kudo (ニール・ゴンパ)> Or if I have, I don't know that I did. 😄
23:13fdobridge: <Conan Kudo (ニール・ゴンパ)> that's one of the lovely parts of this kind of collab work 😄