05:22flacks: Hi community, I'm reporting a lock-up and release of a 1080 Ti in dual-head mode here https://gist.github.com/flacks/3a228b6b06138524e7e5b9af4a2d9ca1. I found it interesting that the GPU lock would be released ~37 minutes afterwards. I'd just been hard-rebooting every time but it was quite surprising to see things return to normal given time.
05:23flacks: Anyone know if NVIDIA is in contact with the nouveau community as of late? Is there any hope? :)
09:15RSpliet: flacks: There's been a drip-feed of information from NVIDIA for a while now.
09:16RSpliet: We're obvs not their main priority, but on some topics they help out with docs
12:16imirkin: flacks: maybe something like this? just a random guess. https://github.com/skeggsb/nouveau/commit/407168db348e0a5058e7339693c59bcc041e3bf5
12:27RSpliet: imirkin: I'm taking the risk of sounding silly, but... Is the vblank counter a 17 bit value? in that case, a recovery caused by a counter overflow would take (2^17)/(60*60) ~= 37 minutes
12:27imirkin: could be
12:28imirkin: (i have no idea)
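The arithmetic behind RSpliet's estimate can be checked with a short sketch. It assumes a 60 Hz refresh rate and a counter that increments once per vblank; the 17-bit width is RSpliet's guess, not a confirmed hardware fact, and `rollover_minutes` is a hypothetical helper name.

```python
def rollover_minutes(counter_bits: int, refresh_hz: float) -> float:
    """Minutes until a counter incremented once per vblank wraps around.

    Assumes exactly one increment per refresh (an assumption, not a
    documented property of the hardware discussed above).
    """
    ticks_until_wrap = 2 ** counter_bits
    seconds = ticks_until_wrap / refresh_hz
    return seconds / 60


if __name__ == "__main__":
    # Compare a few candidate widths against the ~37 minutes flacks observed.
    for bits in (16, 17, 18):
        print(f"{bits}-bit counter at 60 Hz: {rollover_minutes(bits, 60):.1f} min")
```

At 60 Hz a 17-bit counter wraps after roughly 36.4 minutes, which is close to the reported ~37 minutes; the neighbouring widths (about 18 and 73 minutes) are not. This is consistent with the guess but does not prove it.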
16:53flacks: RSpliet: good, at least they're helping somewhat. I really wonder what the reasons behind their reluctance to commit serious time and resources to nouveau are.
16:54flacks: imirkin: I guess I could try recompiling nouveau with that change, but I am by no means an expert at drivers / kernel hacking. I'm almost at a complete loss here
17:24RSpliet: flacks: I can think of many reasons. Poor (near-zero in the grand scheme of things) return-on-investment, a generational gap when it comes to assessing and dealing with company secrets and a resulting negative opinion on open source, taking their obligation to protect third party secrets (too) seriously...
17:25RSpliet: I don't agree with them, but I can come up with them ;-)
17:48flacks: sad state of affairs
19:32imirkin_: flacks: every expert was once a novice :)
19:34imirkin_: unfortunately you'll have to port that patch to a regular linux tree to test it
19:34imirkin_: the file structure is the same, just a few directory levels missing
19:34flacks: the issue here is reproducibility - my error is very spontaneous. but I will give it a go.
19:35flacks: no problem. thanks for the heads up.
19:48imirkin_: flacks: could just be something that happens rarely, like some counter rolling over
19:48imirkin_: i've definitely suspected that this is an issue for some time
20:21flacks: hm. ok. might be high time to finally finish Modern C :)
20:29imirkin_: i've had vsync issues come up in the past
20:29imirkin_: after months of uptime
20:30imirkin_: where vsync just stops working for some reason
20:30imirkin_: no errors, no nothing, all is well, just no vsync events make it back
20:30imirkin_: with a vsync-driven compositor, this could very well cause the screen to hang
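The failure mode imirkin_ describes can be illustrated with a toy model. This is only a sketch, not how nouveau or any real compositor is implemented: the vsync source is modelled as a `threading.Event`, and `wait_for_vsync` and `run_frames` are hypothetical names. A loop that waits on the event with no timeout would block forever once events stop arriving; the version below falls back to a timeout instead.

```python
import threading


def wait_for_vsync(vsync: threading.Event, timeout_s: float) -> bool:
    """Return True if a vsync event arrived, False if the wait timed out."""
    arrived = vsync.wait(timeout_s)
    vsync.clear()  # re-arm for the next frame
    return arrived


def run_frames(vsync: threading.Event, frames: int, timeout_s: float = 0.05):
    """Pace `frames` frames; count those paced by vsync vs. by timeout."""
    paced, timed_out = 0, 0
    for _ in range(frames):
        if wait_for_vsync(vsync, timeout_s):
            paced += 1
        else:
            # No event made it back: keep drawing instead of hanging.
            timed_out += 1
    return paced, timed_out


if __name__ == "__main__":
    dead_source = threading.Event()  # never set: models "no vsync events make it back"
    print(run_frames(dead_source, frames=3, timeout_s=0.01))
```

With a source that never fires, every frame is paced by the timeout rather than stalling, which is the difference between a degraded display and a screen that appears hung.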