09:54MrCooper: Venemo: <kode54> kde locks up waiting on the crashed xwayland
09:54MrCooper: this wouldn't happen if radeonsi didn't kill Xwayland
09:55MrCooper: mareko: "the spec should explicitly say that program termination is disallowed (not the other way around)" isn't how the spec works; if program termination isn't explicitly spelled out, it's not allowed
10:00MrCooper: also, when the spec says program termination may occur, that's surely about something like a CPU out-of-bounds buffer access resulting in a segmentation fault, not the driver saying "I don't feel like continuing, let's kill the process"
10:03kode54: MrCooper: technically Xwayland should be using the robustness APIs to make itself recoverable
10:03MrCooper: sure, and it probably will at some point, but the reality is it doesn't yet
10:04MrCooper: and it's not as simple as it may seem BTW
10:04MrCooper: though anybody's welcome to prove me wrong on that
11:47Venemo: MrCooper: sounds like a kde bug, honestly
12:58pixelcluster: MrCooper: the point is that it's also not as simple as "just don't kill xwayland"
12:59pixelcluster: if you don't kill apps you have two options: 1) no-op everything - in this case I don't think you could get any visual output from Xwayland and this would persist until the user manually restarts the DE, or 2) try carrying on as usual - this can and will cause reset loops where the gpu continually crashes because of memory corruption from previous resets
13:00bnieuwenhuizen: an even bigger problem with "no-op everything" is what you do with app readbacks
13:01bnieuwenhuizen: given that nothing is produced
13:01pixelcluster: yeah well correctness is out of the window anyway so I guess you just give uninitialized memory or something
13:01pixelcluster: might also kill the app
13:01bnieuwenhuizen: well, giving it wrong memory is very nonconformant anyway
13:02bnieuwenhuizen: so no matter what you do you're not spec conformant :shrug:
13:02pixelcluster: neither of these two solutions is acceptable IMO (and yeah I'd agree that killing xwayland isn't acceptable either - but we'll have to wait for the proper enhancements from the kernel)
14:52MrCooper: pixelcluster: kwin could kill Xwayland and start a new one
14:53pixelcluster: MrCooper: but the driver can't know or influence that?
14:53MrCooper: or the user could manually kill it
14:53MrCooper: what driver? I'm not talking about anything driver related
14:57pixelcluster: MrCooper: isn't the topic about what the driver should do when the gpu is reset?
14:58pixelcluster: also "the user could manually kill it" <-- this is terrible user experience, arguably worse than having it get killed on its own
15:01MrCooper: I don't disagree, it's not as bad as the session freezing though
21:13mareko: I'm pretty sure anything is allowed with a non-robust context when a device is lost
21:15mareko: non-robust means "it's conformant until it stops working, and then it's not conformant", robust means "it's conformant until it stops working, and you can query when that happens"
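
The robust versus non-robust distinction discussed above can be sketched as a toy model. This is not a real GL binding: `FakeContext`, `simulate_gpu_reset`, `reset_status`, and `render_frame` are hypothetical names invented for illustration, and modeling a lost non-robust context as an exception is an assumption standing in for "anything may happen". In real OpenGL the query is `glGetGraphicsResetStatus`, which always reports no error when the context was created without reset notification.

```python
from enum import Enum


class ResetStatus(Enum):
    # Names mirror the values glGetGraphicsResetStatus can report.
    NO_ERROR = 0
    GUILTY_CONTEXT_RESET = 1
    INNOCENT_CONTEXT_RESET = 2
    UNKNOWN_CONTEXT_RESET = 3


class ContextLost(Exception):
    """Toy stand-in for undefined behaviour, e.g. the process being killed."""


class FakeContext:
    """Hypothetical context object; not a real GL binding."""

    def __init__(self, robust):
        self.robust = robust
        self.lost = False

    def simulate_gpu_reset(self):
        self.lost = True

    def reset_status(self):
        # Without reset notification the query never reports a reset.
        if self.robust and self.lost:
            return ResetStatus.UNKNOWN_CONTEXT_RESET
        return ResetStatus.NO_ERROR

    def draw(self):
        # Assumption: a lost non-robust context fails hard; a lost robust
        # context keeps accepting commands but produces no output.
        if self.lost and not self.robust:
            raise ContextLost("device lost on a non-robust context")
        return not self.lost  # True when the draw produced output


def render_frame(ctx, make_context):
    """Poll for a reset and recreate the context before drawing."""
    recovered = False
    if ctx.reset_status() is not ResetStatus.NO_ERROR:
        ctx = make_context()  # the app must also re-upload all resources
        recovered = True
    ctx.draw()
    return ctx, recovered
```

In this sketch a robust client can notice the reset and rebuild its context, while the non-robust client has no way to find out that anything went wrong. The hard part glossed over in the `make_context()` comment, recreating every resource the old context owned, is plausibly why adding this to Xwayland is "not as simple as it may seem".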