05:22 flacks: Hi community, I'm reporting a lock-up and release of a 1080 Ti in dual-head mode here https://gist.github.com/flacks/3a228b6b06138524e7e5b9af4a2d9ca1. I found it interesting that the GPU lock would be released ~37 minutes afterwards. I'd just been hard-rebooting every time but it was quite surprising to see things return to normal given time.
05:23 flacks: Anyone know if NVIDIA is in contact with the nouveau community as of late? Is there any hope? :)
09:15 RSpliet: flacks: There's been a drip-feed of information from NVIDIA for a while now.
09:16 RSpliet: We're obvs not their main priority, but on some topics they help out with docs
12:16 imirkin: flacks: maybe something like this? just a random guess. https://github.com/skeggsb/nouveau/commit/407168db348e0a5058e7339693c59bcc041e3bf5
12:27 RSpliet: imirkin: I'm taking the risk of sounding silly, but... Is the vblank counter a 17 bit value? in that case, a recovery caused by a counter overflow would take (2^17)/(60*60) ~= 37 minutes
12:27 imirkin: could be
12:28 imirkin: (i have no idea)
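
[Editor's note: RSpliet's estimate above can be checked with a few lines of arithmetic. This is only a sketch of that calculation, assuming a counter that increments once per frame at a 60 Hz refresh rate. The 17-bit width is RSpliet's guess and is not confirmed anywhere in this log, and the function name `wrap_period_minutes` is made up for illustration.]

```python
def wrap_period_minutes(counter_bits: int, refresh_hz: float) -> float:
    """Minutes until a per-frame counter of `counter_bits` bits wraps around."""
    frames = 2 ** counter_bits       # distinct values before the counter overflows
    seconds = frames / refresh_hz    # one increment per frame (assumption)
    return seconds / 60


if __name__ == "__main__":
    # The 17-bit width is a guess from the discussion, not a documented fact.
    for bits in (16, 17, 18):
        minutes = wrap_period_minutes(bits, 60)
        print(f"{bits}-bit counter at 60 Hz wraps after {minutes:.1f} min")
```

[Under those assumptions a 17-bit counter wraps after about 36.4 minutes, which is in line with the roughly 37 minutes flacks observed. A 16-bit or 18-bit counter would give about 18 or 73 minutes instead.]
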
16:53 flacks: RSpliet: good, at least they're helping somewhat. I really wonder what the reasons behind their reluctance to commit serious time and resources to nouveau are.
16:54 flacks: imirkin: I guess I could try recompiling nouveau with that change, but I am by no means an expert at drivers / kernel hacking. I'm almost at a complete loss here
17:24 RSpliet: flacks: I can think of many reasons. Poor (near-zero in the grand scheme of things) return-on-investment, a generational gap in how company secrets are assessed and dealt with and a resulting negative opinion on open source, taking their obligation to protect third-party secrets (too) seriously...
17:25 RSpliet: I don't agree with them, but I can come up with them ;-)
17:48 flacks: sad state of affairs
19:32 imirkin_: flacks: every expert was once a novice :)
19:33 flacks: :)
19:34 imirkin_: unfortunately you'll have to port that patch to a regular linux tree to test it
19:34 imirkin_: the file structure is the same, just a few directory levels missing
19:34 flacks: the issue here is reproducibility - my error is very spontaneous. but I will give it a go.
19:35 flacks: no problem. thanks for the heads up.
19:48 imirkin_: flacks: could just be something that happens rarely, like some counter rolling over
19:48 imirkin_: i've definitely suspected that this is an issue for some time
20:21 flacks: hm. ok. might be high time to finally finish Modern C :)
20:29 imirkin_: i've had vsync issues come up in the past
20:29 imirkin_: after months of uptime
20:30 imirkin_: where vsync just stops working for some reason
20:30 imirkin_: no errors, no nothing, all is well, just no vsync events make it back
20:30 imirkin_: with a vsync-driven compositor, this could very well cause the screen to hang