01:42redsheep[d]: karolherbst[d]: Do you still see results that consistent on recent architectures? GPU boost being what it is these days I would be pretty surprised if you ever get results that consistent, even after running for hours and heat soaking the hardware as close as possible to steady state
01:45redsheep[d]: Modern boost is so sensitive that 1 degree will change perf by like 0.3-0.5%, at least from what I have seen
01:50karolherbst[d]: redsheep[d]: that was before GSP
01:51redsheep[d]: karolherbst[d]: Is there any way at all in the gsp era to ask for fixed clocks? Even if the benchmark runs really slow at boot clocks or whatever... that might be more useful in this case
01:51redsheep[d]: Any performance difference would be exaggerated and all but certain not to be the cpu
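A rough way to see why boost sensitivity matters for benchmarking: the sketch below takes the 0.3-0.5% per degree figure quoted above and estimates how much run-to-run noise a given temperature drift would add. The linear model and the function names are assumptions made for illustration, not measured behavior of any particular GPU.

```python
# Minimal sketch, assuming performance changes linearly with temperature.
# The 0.3-0.5 %/degree range comes from the discussion above; everything
# else (names, the linear model) is hypothetical.

def drift_noise_percent(temp_drift_c: float, sensitivity_pct_per_c: float) -> float:
    """Estimated performance swing (in percent) caused by a temperature drift."""
    return temp_drift_c * sensitivity_pct_per_c


def masks_effect(effect_pct: float, temp_drift_c: float,
                 sensitivity_pct_per_c: float) -> bool:
    """True if thermal noise is at least as large as the effect being measured."""
    return drift_noise_percent(temp_drift_c, sensitivity_pct_per_c) >= effect_pct


if __name__ == "__main__":
    for sensitivity in (0.3, 0.5):
        noise = drift_noise_percent(2.0, sensitivity)
        print(f"2 degrees of drift at {sensitivity} %/degree: about {noise:.1f}% noise")
```

Under these assumptions, a 2 degree drift already adds noise comparable to a 0.5% optimization, which is why fixed clocks are attractive for this kind of measurement.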
02:12clangcat[d]: I have spent the last three hours trying to get nouveau to not crash and just work while not causing other problems.
02:12clangcat[d]: :Foxy_Dead:
06:46ahuillet[d]: karolherbst[d]: I had in mind to work on a trace of Talos, FWIW
06:46ahuillet[d]: (I got as far as failing to capture one)
09:06karolherbst[d]: mhhh.. but yeah, with GSP, it might make sense to have some benchmark mode we can put nouveau in, to guarantee stable clocks and stuff...
09:33redsheep[d]: karolherbst[d]: If for whatever reason that doesn't work out, it is possible with shunt mods and the right tweaks to get a card running at or near full speed with more consistent clock behavior, at least with regard to power. Thermals would remain a challenge though. There's also the pesky issue that you can hit the voltage limit pretty easily when you have lots of power and thermal headroom, which
09:33redsheep[d]: would have really unpredictable interactions with changing around the instructions for testing.
09:34karolherbst[d]: you could disable boosting
09:34redsheep[d]: Is that possible without going to super low clocks?
09:34karolherbst[d]: yes
09:35redsheep[d]: Oh, perfect
09:37redsheep[d]: I assume I'd need a kernel patch to do that if I wanted to test?
09:37karolherbst[d]: yeah, we don't do any performance controls in nouveau yet, but I'm considering adding something like that
09:38redsheep[d]: That would be nice. I'd like to have a bunch of different controls. For instance, being able to overclock or underclock the VRAM can help characterize a bottleneck
10:32karolherbst[d]: gfxstrand[d]: any opinions on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30414 ?
12:46zmike[d]: somebody opening tickets for random smooth lines piglit fails 🤔
12:46notthatclippy[d]: karolherbst[d]: Maybe something like https://gist.github.com/mtijanic/9c129900bfba774b39914ad11b0041f6?
12:47notthatclippy[d]: notthatclippy[d]: Except obviously for nouveau send to gsp from kernel and no ioctls
12:47karolherbst[d]: yeah.. but I don't think `NV2080_CTRL_PERF_BOOST_FLAGS_CMD_BOOST_TO_MAX` is the right thing here, is it?
12:48karolherbst[d]: should use `NV2080_CTRL_PERF_BOOST_FLAGS_CMD_BOOST_1LEVEL` or even `NV2080_CTRL_PERF_BOOST_FLAGS_CMD_CLEAR`
12:48karolherbst[d]: but good to know that nvidia has something similar to what I've done in nouveau pre GSP 😄
12:48karolherbst[d]: 1level is boost only to declared boost clock vs boost as high as possible or something
12:59notthatclippy[d]: There are more precise benchmark modes internally but IIRC they're not documented. I'm out of office for two weeks but I can check about publishing it when I get back.
13:00notthatclippy[d]: It's not really sensitive stuff or anything, just things that don't matter for regular operation.
13:05karolherbst[d]: I'd also have to check how amdgpu reports the current clocks so we can do something similar
13:05karolherbst[d]: though that's probably even more work
14:06gfxstrand[d]: karolherbst[d]: RB. Please land so I can rebase the rust code motion I'm working on for Christian right now
14:20karolherbst[d]: I like how such a change causes CI to run on like all jobs
14:22gfxstrand[d]: yeah...
14:22gfxstrand[d]: But you did literally update the CI images
14:23karolherbst[d]: yeah...
14:23karolherbst[d]: I had more pointless things triggering it
14:54karolherbst[d]: :blobcatnotlikethis: the pipeline might fail, because `kernel+rootfs_x86_64` takes ages
14:54karolherbst[d]: should have triggered that one in my testing earlier...
15:01gfxstrand[d]: Yeah, it always takes a while to get those through
20:34karolherbst[d]: uhhh... I only wanted to fix the device UUID and I'm already at ` 9 files changed, 121 insertions(+), 47 deletions(-)`
20:49karolherbst[d]: anyway, the EXT_external_memory piglit tests aren't crashing anymore 😄
20:49karolherbst[d]: gfxstrand[d]: I think you wanted nvk and nouveau gl to report different UUIDs, right?
20:50karolherbst[d]: the annoying part here was that the gallium driver doesn't have any of this information available, so I had to start adding some `drmGetDevice2` handling and stuff.. uhh
20:53gfxstrand[d]: karolherbst[d]: Very very yes
20:53gfxstrand[d]: Different driverUUIDs, anyway
20:54karolherbst[d]: okay
20:55karolherbst[d]: I'll clean that mess up tomorrow
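To illustrate the device UUID versus driver UUID distinction discussed above, here is a minimal sketch in which two drivers on the same GPU report the same device UUID but different driver UUIDs. The hashing scheme, the helper names, and the PCI and build-id values are all hypothetical; this is not how nvk or the nouveau gallium driver actually compute their UUIDs.

```python
import hashlib

UUID_SIZE = 16  # both the Vulkan and GL UUIDs are 16 bytes long


def _uuid_from(*parts: str) -> bytes:
    """Hash the given strings and truncate the digest to UUID_SIZE bytes."""
    digest = hashlib.sha1("\0".join(parts).encode("utf-8")).digest()
    return digest[:UUID_SIZE]


def device_uuid(pci_vendor: str, pci_device: str, pci_address: str) -> bytes:
    """Hypothetical device UUID: depends only on which physical GPU it is."""
    return _uuid_from("device", pci_vendor, pci_device, pci_address)


def driver_uuid(driver_name: str, build_id: str) -> bytes:
    """Hypothetical driver UUID: depends on the driver and its build."""
    return _uuid_from("driver", driver_name, build_id)


if __name__ == "__main__":
    # Placeholder hardware and build identifiers, chosen for illustration only.
    gpu = ("0x10de", "0x0000", "0000:01:00.0")
    print("device:", device_uuid(*gpu).hex())
    print("nvk   :", driver_uuid("nvk", "example-build").hex())
    print("gl    :", driver_uuid("nouveau-gl", "example-build").hex())
```

The design point is only that the two identifiers are derived from different inputs, so one can match across drivers while the other does not.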
22:50anarsoul: hey folks, so I tested mesa-git (and mesa-24.1.4) with linux-6.10.2 on a GTX 1650 (as a secondary GPU, it's a laptop with the screen driven by an Intel GPU), and I'm still getting a gsp mmu fault when launching steam
22:50redsheep[d]: karolherbst[d]: Boosting to declared boost might still not be consistent. With something like furmark it's trivial to cause it to fail to reach boost.
22:51karolherbst[d]: anarsoul: with nvk installed? Might want to try 24.2 then
22:51karolherbst[d]: redsheep[d]: with furmark it's easy to also drop below "base" clocks 😄
22:51redsheep[d]: Yeah
22:51anarsoul: karolherbst[d]: yes, with nvk. I already tested mesa-git, same issue
22:52anarsoul: (and I don't see anything similar in git log, so I presume it's not fixed and likely specific to hybrid configs)
22:54airlied: I don't think one fix made it into 6.10, should probably check
22:55anarsoul: https://gist.github.com/anarsoul/1dcd68e0a8396a86cf0eb0a2522a0705
22:56karolherbst[d]: `Composite Threa` is that a web browser?
22:56karolherbst[d]: ehh.. CEF inside steam prolly
22:57anarsoul: karolherbst[d]: it's steam
22:57karolherbst[d]: mhhhh...
22:57karolherbst[d]: it's a hybrid GPU system, right?
22:57anarsoul: yeah
22:57anarsoul: and it works fine on Intel GPU
22:57karolherbst[d]: right...
22:57karolherbst[d]: I suspect it would pick the gl driver in that case
22:58airlied: hmm not sure the fix is in any tree, wonder where it went
22:58anarsoul: I have NOUVEAU_USE_ZINK=1 set :)
22:58karolherbst[d]: does it work with `NOUVEAU_USE_ZINK=0`?
22:58anarsoul: so should be zink over nvk
22:58anarsoul: let me try
22:59airlied: dakr: whatever happened to https://patchwork.freedesktop.org/patch/594068/ ?
22:59karolherbst[d]: I still utterly dislike steam always launching on the discrete GPU :ferrisUpsideDown: , but oh well
22:59airlied: can we get that into fixes, but add a Cc: stable@vger.kernel.org to it
23:00redsheep[d]: karolherbst[d]: If there is a way to force it to sit at base clocks that would probably work in all cases on Ada. I've yet to see ada fail to reach base in any workload. Ampere is a very very different story.
23:00anarsoul: karolherbst[d]: well, no gsp mmu fault in dmesg, but it keeps crashing and restarting
23:00karolherbst[d]: 🥲
23:00redsheep[d]: I've seen like 1020 MHz on a 3090
23:00karolherbst[d]: anarsoul: any idea why it crashes?
23:01anarsoul: karolherbst[d]: nah, nothing in console and I have no idea how to debug steam
23:03airlied: anarsoul: does nouveau.runpm=0 on the command line help?
23:03karolherbst[d]: here I get `X Error of failed request: BadMatch (invalid parameter attributes)` :ferrisUpsideDown:
23:04anarsoul: airlied: let me try
23:05karolherbst[d]: yeah.. in the mood of debugging steam honestly
23:07anarsoul: airlied: nope
23:07karolherbst[d]: I blame steam
23:07anarsoul: :)
23:08karolherbst[d]: not my fault it throws some weird X errors
23:09karolherbst[d]: but yeah.. with zink it also runs into the same error here
23:10karolherbst[d]: but could also be some weird modifier situation
23:10karolherbst[d]: anarsoul: is that on a wayland session? Might want to try X just in case it changes anything
23:10anarsoul: karolherbst[d]: fair enough. I don't like modern apps pulling in whole browser just to show UI either
23:10anarsoul: yeah, that's on wayland
23:12anarsoul: shouldn't matter though since it still goes through xwayland?
23:12karolherbst[d]: who knows
23:17anarsoul[m]: Yeah, my Plasma X11 session doesn't want to come up, it's stuck on black screen with cursor
23:29anarsoul: karolherbst[d]: with nouveau it actually segfaults, but coredump isn't really useful, no symbols
23:29karolherbst: huh...
23:30dakr: airlied, applied to drm-misc-fixes.
23:30anarsoul: yeah, sorry, I understand that my report isn't really useful nor actionable
23:33karolherbst[d]: not here
23:33karolherbst[d]: here it just does.. nothing
23:34anarsoul: DEBUGGER=gdb steam doesn't produce anything useful either, it looks like it crashes in a forked process
23:35karolherbst[d]: uhhh right..... steam is still 32 bit somehow
23:36anarsoul: let's try following forked process...
23:36anarsoul: nope :(
23:36anarsoul: I have no idea how to debug this overengineered beast
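For reference, "following the forked process" in gdb usually means setting `follow-fork-mode` and `detach-on-fork` before the program starts. The sketch below only builds such a gdb command line; the helper name and the `./some-launcher` target are hypothetical, and nothing here implies this is sufficient for a multi-process launcher like the one discussed above.

```python
import shlex


def gdb_follow_fork_argv(program: str, *args: str,
                         follow: str = "child",
                         detach_on_fork: bool = False) -> list[str]:
    """Build a gdb command line that follows one side of a fork.

    `set follow-fork-mode` and `set detach-on-fork` are standard gdb
    settings; the wrapper itself is a hypothetical convenience helper.
    """
    if follow not in ("parent", "child"):
        raise ValueError("follow must be 'parent' or 'child'")
    detach = "on" if detach_on_fork else "off"
    return [
        "gdb",
        "-ex", f"set follow-fork-mode {follow}",
        "-ex", f"set detach-on-fork {detach}",
        "--args", program, *args,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(gdb_follow_fork_argv("./some-launcher", "--verbose")))
```

With `detach-on-fork off`, gdb keeps both parent and child under its control, which helps when it is unclear in advance which process will crash.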
23:40karolherbst[d]: all I can see is "failed to create drawable" and those X errors
23:40karolherbst[d]: but also all this hybrid GPU stuff is super flaky with CEF
23:40karolherbst[d]: it does its own device selection afaik
23:41karolherbst[d]: and messes up modifiers and setting up rendering