01:08 karolherbst[d]: uhh...
01:08 karolherbst[d]: soo
01:09 karolherbst[d]: for all valid fp16 values (yes all), `MUFU.F16` behaves like `F2F16.RZ(MUFU.F32(F2F32))` for cos and sin, so great? yeah.. well no 🙃
01:09 karolherbst[d]: apparently for 0x4c00 (16.0) exp2 diverges
01:10 karolherbst[d]: the fp32 route gives me `7bff` (65504.0) and MUFU.F16 gives me `7c00` (inf)
01:11 karolherbst[d]: which is still close enough tho 😄
01:11 karolherbst[d]: but yeah.. it looks like MUFU.F16 rounds towards zero
01:12 karolherbst[d]: just that one value is odd..
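The two encodings reported for exp2(16.0) can be reproduced in software. What follows is a minimal hand-written fp32 to fp16 conversion; the `Round` enum and both function names are made up for illustration and are not taken from Mesa or any hardware documentation. It only shows that 65536.0 saturates to `7bff` under round-toward-zero and becomes `7c00` under round-to-nearest-even, and says nothing about how MUFU.F16 works internally.

```rust
/// Rounding mode for the narrowing conversion (hypothetical helper type).
#[derive(Clone, Copy, PartialEq)]
pub enum Round {
    TowardZero,
    NearestEven,
}

/// Decode an IEEE 754 binary16 bit pattern into an f32 (always exact).
pub fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = if h & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((h >> 10) & 0x1f) as i32;
    let man = (h & 0x3ff) as f32;
    match exp {
        // Subnormals and zero: man * 2^-24.
        0 => sign * man * 2.0f32.powi(-24),
        // All-ones exponent: infinity or NaN.
        0x1f => {
            if man == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: (1 + man/1024) * 2^(exp - 15).
        _ => sign * (1.0 + man / 1024.0) * 2.0f32.powi(exp - 15),
    }
}

/// Encode an f32 as binary16 bits using the requested rounding mode.
pub fn f32_to_f16_bits(x: f32, mode: Round) -> u16 {
    let bits = x.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xff) as i32;
    let man = bits & 0x007f_ffff;

    if exp == 0xff {
        // Infinity stays infinity; NaN keeps a non-zero mantissa.
        let nan_bit = if man != 0 { 0x0200 } else { 0 };
        return sign | 0x7c00 | nan_bit;
    }
    let e = exp - 127; // unbiased exponent
    if e > 15 {
        // Magnitude >= 2^16 does not fit: RZ clamps to the largest finite
        // value (65504.0), round-to-nearest overflows to infinity.
        return sign
            | match mode {
                Round::TowardZero => 0x7bff,
                Round::NearestEven => 0x7c00,
            };
    }
    if e < -25 {
        // Below half of the smallest fp16 subnormal: zero in both modes.
        return sign;
    }
    let sig = man | 0x0080_0000; // mantissa with the implicit leading one
    let (shift, mut h) = if e >= -14 {
        // Normal fp16 result: drop the low 13 mantissa bits.
        (13u32, (((e + 15) as u32) << 10) | (man >> 13))
    } else {
        // Subnormal fp16 result: shift the full significand further right.
        let s = (-1 - e) as u32;
        (s, sig >> s)
    };
    let rem = sig & ((1u32 << shift) - 1);
    let half = 1u32 << (shift - 1);
    if mode == Round::NearestEven && (rem > half || (rem == half && (h & 1) == 1)) {
        h += 1; // a carry into the exponent (up to 0x7c00) is intended
    }
    sign | h as u16
}
```

With these helpers, `f32_to_f16_bits(f16_bits_to_f32(0x4c00).exp2(), Round::TowardZero)` yields `0x7bff`, while `Round::NearestEven` yields `0x7c00`.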
01:12 karolherbst[d]: I should check the other ops
01:12 karolherbst[d]: okay...
01:13 karolherbst[d]: yeah the other ops are all perfect except for RCP
17:33 Siggi: I have a K2100M (NVE6 (GK106)) in my 2011 iMac. Unfortunately this has severe artifacting under nouveau. The right-hand side of the screen flickers, and sometimes the screen tears.
17:34 Siggi: Apparently this has been happening for a couple of years, but some people report suspending/resuming cleans this up. Unfortunately not an option on my hardware.
17:35 Siggi: But I'm wondering if I can figure out what's going on by comparing register settings between the nvidia driver and nouveau. I built envytools, but I'm not sure how to hold them.
17:36 Siggi: Any help or direction on how to dump register banks for comparison?
17:46 karolherbst: Siggi: 2011 iMac... that's already Intel, right?
17:51 karolherbst: the first one is from 2006, I'm feeling like 👴
17:53 Siggi: Yeah, it's a screaming monster, 4 core i5-2600 :)
17:54 karolherbst: I wonder if that's just ... something else
17:55 karolherbst: running on X? Might be worth forcing the modesetting DDX to see if that's better or try if it's better on a wayland compositor
17:55 karolherbst: the nouveau X driver is.... well.. broken in many regards
17:56 Siggi: Sadly I'm on Linux Mint Cinnamon
17:56 Siggi: I guess I can try Wayland, but it's still experimental
17:57 Siggi: I feel like this has to be the modeset, but I guess Wayland is something to try
17:57 Siggi: how do I force DDX modesetting?
18:04 karolherbst: uhh.. wait a sec, maybe I still have a config around somewhere
18:05 karolherbst: Siggi: something like this: https://gist.githubusercontent.com/karolherbst/9cce055fba10fa992099f7304cec7603/raw/5ec12e7d95b38b5b85a7d3f8acd8c5ecf6509596/gistfile1.txt but maybe check first if it's nouveau being used or modesetting, because some distributions switched the default (e.g. fedora)
18:05 Siggi: mkay, will poke around
18:07 Siggi: Though I've found a few complaints going back 2-3 years about right-hand side flickering artifacts with the K2100M
18:08 karolherbst: yeah.. not saying that it can't be a genuine driver bug or wrong display configuration, but switching the X driver is an easy quick test at least...
18:10 Siggi: gotcha, so I might be able to look at the difference between nouveau and DDX?
18:10 Siggi: https://www.reddit.com/r/linuxquestions/comments/11j8qxb/nvidia_k2100m_xorg_gives_me_better_performance/
18:10 Siggi: case in point ...
18:11 Siggi: here's another one: https://forums.linuxmint.com/viewtopic.php?t=381854
18:11 Siggi: Interesting because the artifacts vanish on suspend/resume
18:39 karolherbst: yeah.. could be something wrong with how the card gets initialized..
18:48 Siggi: From what I understand, the BIOS/vBIOS initializes the card first (which seems to work) then nouveau?
18:48 Siggi: Assuming this is in the "CRT" setup (I'm reading up on this as hard as I can), presumably it ought to be "easy" to suss out
18:49 Siggi: I'm not a total n00b here, though I do feel like one :)
18:50 Siggi: back in the deep dark past I contributed a fb driver for my iXMicro TwinTurbo card
18:50 karolherbst: yeah.. suspend/resume is a bit weird, because it partly tears down the GPU and boots it up again later. Though VRAM content is usually kept around. There was an option that might help: "nouveau.config=NvForcePost=1"
18:51 karolherbst: that will force nouveau to initialize the GPU
18:51 Siggi: it was a bit of a PITA to figure out the registers map, as if/when I messed up and the screen went dark, I'd have to reboot through MacOS :)
18:52 karolherbst: normally nouveau mostly takes over the state the GPU is in from whatever the BIOS was doing, but also nouveau can't really initialize the GPU much differently than what the BIOS is doing, because both simply run scripts that are baked into the VBIOS to set everything up. But who knows.
18:53 Siggi: Huh. Can you point me to code?
18:56 Siggi: kraloherbst: I'm finding my footing in all of this still
19:00 Siggi: karolherbst: oops, I'm finding my footing in all of this still.
19:51 karolherbst: Siggi: did the option help?
20:49 phomes_[d]: I figured out the dlss issue
20:51 phomes_[d]: no more glitches with the dlss 310.5 releases
20:52 phomes_[d]: it was in the fatbin reader
20:53 karolherbst[d]: figures
21:11 phomes_[d]: do you want to trade some rust advice for some benchmarking? 🙂
21:12 karolherbst[d]: depends
21:12 karolherbst[d]: don't have anything to benchmark tho
21:14 phomes_[d]: we had this: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37898/diffs?commit_id=8658686d838422c623f9b2ff209ad7fcbfb40156 but now max_warps_per_sm also takes a &ShaderModelInfo
21:16 karolherbst[d]: I'm confused why the impl needs to know tho...
21:18 phomes_[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37898/diffs?commit_id=56a48cf9a0b6e414e8c84eac07eb3b598fd24870#9c8ba2c4c2f29b3165afbd84f70b9344e5919e2c_0_310
21:18 karolherbst[d]: ohh it's part of the ext API? mhh
21:19 karolherbst[d]: mhh we also use it for the qmd...
21:20 karolherbst[d]: well could change the parameter.. it's not like NAK gets that information from somewhere else
21:21 karolherbst[d]: like all it uses is nak_compiler::warps_per_sm
21:44 phomes_[d]: is this crazy?
21:44 phomes_[d]: `#[no_mangle]
21:44 phomes_[d]: pub extern "C" fn nak_max_warps_per_sm(num_gprs: u32, nak: *const nak_compiler) -> u32 {
21:44 phomes_[d]: let nak = unsafe { &*nak };
21:44 phomes_[d]: let sm = ShaderModelInfo::new(nak.sm, nak.warps_per_sm);
21:44 phomes_[d]: max_warps_per_sm(&sm, num_gprs)
21:44 phomes_[d]: }`
21:44 karolherbst[d]: _that_ might be an option, just not sure how easy that is to pipe through, haven't checked
21:46 phomes_[d]: if it is good rust I have no idea. But it works
21:47 karolherbst[d]: well technically you'd mark the function as `unsafe` and add documentation around why it is unsafe and what the caller has to ensure to make it safe to call
21:48 karolherbst[d]: which in this case simply means `nak` needs to point to a valid `nak_compiler` object
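A minimal sketch of what that suggestion could look like. Everything around the wrapper is a hypothetical stand-in so the example compiles on its own: the `nak_compiler` layout, the `ShaderModelInfo` fields, and the occupancy formula (a 65536-entry register file and 32 threads per warp are assumptions) are not the real NAK definitions.

```rust
/// Hypothetical stand-in for the C-side `nak_compiler`; only the two fields
/// read in the snippet above are modeled, and their types are guesses.
#[repr(C)]
#[allow(non_camel_case_types)]
pub struct nak_compiler {
    pub sm: u8,
    pub warps_per_sm: u8,
}

/// Hypothetical stand-in for NAK's `ShaderModelInfo`.
pub struct ShaderModelInfo {
    pub sm: u8,
    pub warps_per_sm: u8,
}

impl ShaderModelInfo {
    pub fn new(sm: u8, warps_per_sm: u8) -> Self {
        Self { sm, warps_per_sm }
    }
}

/// Placeholder occupancy estimate, NOT the real NAK formula: assumes a
/// 65536-entry register file per SM and 32 threads per warp.
pub fn max_warps_per_sm(sm: &ShaderModelInfo, num_gprs: u32) -> u32 {
    let by_regs = 65536 / num_gprs.max(1).saturating_mul(32);
    by_regs.min(u32::from(sm.warps_per_sm))
}

/// Returns the maximum number of resident warps per SM for a shader that
/// uses `num_gprs` registers.
///
/// # Safety
///
/// `nak` must be non-null, properly aligned, and point to a valid
/// `nak_compiler` that is not mutated for the duration of the call.
// The export attribute from the snippet above is left out so this sketch
// builds the same way on every Rust edition.
pub unsafe extern "C" fn nak_max_warps_per_sm(
    num_gprs: u32,
    nak: *const nak_compiler,
) -> u32 {
    // SAFETY: the caller upholds the contract documented above.
    let nak = unsafe { &*nak };
    let sm = ShaderModelInfo::new(nak.sm, nak.warps_per_sm);
    max_warps_per_sm(&sm, num_gprs)
}
```

Marking the function `unsafe` moves the obligation to the call site: a Rust caller has to write `unsafe { nak_max_warps_per_sm(64, &compiler) }`, and the `# Safety` section states what that block is promising.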
21:48 karolherbst[d]: but
21:48 karolherbst[d]: I think it might be better to just change `max_warps_per_sm` instead
21:49 karolherbst[d]: and pass in `warps_per_sm` as a parameter
21:49 karolherbst[d]: dunno
21:49 karolherbst[d]: don't really have an opinion on what would look better
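One possible reading of that alternative, sketched under assumptions: `max_warps_per_sm` takes the per-SM warp limit as a plain integer, so a C-facing wrapper needs no raw pointer and no `unsafe`. The signatures and the occupancy formula here are placeholders for illustration, not the actual NAK code.

```rust
/// Placeholder occupancy estimate, NOT the real NAK formula: assumes a
/// 65536-entry register file per SM and 32 threads per warp.
/// The hardware warp limit is passed in directly instead of being read
/// from a shader-model object.
pub fn max_warps_per_sm(warps_per_sm: u32, num_gprs: u32) -> u32 {
    let by_regs = 65536 / num_gprs.max(1).saturating_mul(32);
    by_regs.min(warps_per_sm)
}

/// Hypothetical C-facing wrapper: with only integer arguments there is no
/// pointer to validate, so the function can stay safe.
pub extern "C" fn nak_max_warps_per_sm(num_gprs: u32, warps_per_sm: u32) -> u32 {
    max_warps_per_sm(warps_per_sm, num_gprs)
}
```

The trade-off is that every caller now has to look up `warps_per_sm` itself, which is cheap if, as noted above, that value is all the function needs from `nak_compiler`.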
21:51 phomes_[d]: I cargo culted the unsafe nak from surrounding code. None of those document or mark as unsafe. So maybe I can get away with it 🙂
21:53 phomes_[d]: I'll use this and make an MR with the rebased/fixed things. Then I can update for comments there. Thank you for the help
21:55 karolherbst[d]: yeah so nak doesn't care much 😄