21:47 Desty: Hello all. Since updating this old laptop to Ubuntu 21, Nouveau has become unstable on my GT218 (NVA8), causing temporary or permanent hangs when I do certain things (e.g. running Godot, or opening multiple windows in Firefox). Most of the errors seem related to nouveau_ttm_io_mem_free ("Trying to vfree() bad address"). How can I debug / fix this?
21:50 Desty: Is there a quickstart guide somewhere for compiling the nouveau module on its own? I've downloaded the entire kernel tree but it would be nicer to just build the required parts and add printks or whatnot, to try and figure out if a double-free or something is happening... but I'm a bit lost :)
21:50 imirkin: Desty: the simplest thing is to bisect the kernel
21:50 imirkin: don't download the tree
21:50 imirkin: download the git repo
21:50 imirkin: (do a "git clone")
21:50 imirkin: there are some good guides for bisecting
21:50 imirkin: but that's if you're pretty sure it's a kernel issue...
21:51 imirkin: which it might not be
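For readers who have not bisected before: the idea is a binary search over the commits between a known-good and a known-bad kernel. Below is a minimal Python sketch of that search, not git's actual implementation. The names `first_bad` and `is_bad` are hypothetical, and the sketch assumes a linear history in which every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in `commits`.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad, every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # In a real kernel bisect this step means: build that commit,
        # boot it, and try to reproduce the hang.
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical commit labels c0..c7, where the regression landed in c5.
commits = [f"c{i}" for i in range(8)]
culprit = first_bad(commits, lambda c: int(c[1:]) >= 5)
```

Each step halves the remaining range, so the number of kernels to build and boot grows roughly with the logarithm of the number of commits in between.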
21:51 Desty: yes, I have the RC kernel git repo. What other things could cause this apart from a kernel issue?
21:51 imirkin: i've been seeing a lot more issues reported on nouveau of late
21:51 imirkin: i suspect it's a combination of factors
21:52 imirkin: e.g. using a wayland-based compositor, more things wanting to do GL, etc
21:52 imirkin: in combination with some bitrot on the kernel side
21:53 imirkin: i'm on v5.6.7 fwiw, and that seems fairly stable
21:53 Desty: I've used git-bisect before, but not on the kernel... is it possible to do that with just a single module? Like, what happens if I try to load a really old module version with the current kernel (I'm on 5.13.0-20)
21:53 imirkin: been afraid to upgrade given all the issues that are being reported for 5.8/9/10
21:53 imirkin: you get a module load error
21:53 Desty: kind of wishing I'd stayed on that older release :(
21:54 imirkin: which says "kernel version mismatch, go away"
21:54 imirkin: modules are not separable from the kernel they're compiled for
21:54 imirkin: (in rare cases, they can be, but not in the general case)
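The mismatch described here can usually be spotted before loading anything, by comparing the module's "vermagic" string (as printed by `modinfo -F vermagic <module>`) with the running kernel release (`uname -r`). The following is a rough Python sketch of that comparison. `vermagic_release` and `built_for_running_kernel` are hypothetical helper names, and the sample vermagic string is illustrative. This is only a sanity check, not the kernel's own load-time logic, which may also check symbol versions.

```python
import platform
from typing import Optional


def vermagic_release(vermagic: str) -> str:
    """Return the kernel release: the first whitespace-separated field."""
    fields = vermagic.split()
    if not fields:
        raise ValueError("empty vermagic string")
    return fields[0]


def built_for_running_kernel(vermagic: str, running: Optional[str] = None) -> bool:
    """True if the module's vermagic names the given (or running) release."""
    if running is None:
        running = platform.release()  # equivalent to `uname -r` on Linux
    return vermagic_release(vermagic) == running


# Illustrative vermagic string, not taken from a real module.
sample = "5.13.0-20-generic SMP mod_unload modversions"
matches_same = built_for_running_kernel(sample, "5.13.0-20-generic")
matches_old = built_for_running_kernel(sample, "5.6.7")
```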
21:54 Desty: makes sense. In that case bisect might not be easy without installing a complete older kernel, right?
21:54 imirkin: s/easy/possible/
21:55 Desty: :D
21:55 imirkin: again, if you _really_ know what you're doing, you can sometimes get away with a mismatch
21:56 imirkin: but it's not worth it imo
21:56 imirkin: (for reference, i've been hacking on nouveau for the better part of a decade, and i wouldn't consider myself as "knowing what i'm doing" in that regard)
21:57 Desty: I'd like to add some kind of debugging around the TTM handling, but I'm not even sure how to just compile and test the current version of nouveau with the kernel I'm running now
21:57 Desty: it does seem kind of complex. I count 1314 files below the gpu/drm/nouveau directory :-/
21:57 imirkin: most of those files are just license headers + a couple lines :)
21:59 Desty: Ah, good to know. The stacktrace is helpful at least, although even that is quite long, going from nouveau_drm_ioctl to ttm_bo_* to nvif_object_unmap_handle, nvkm_umem_unmap and finally vunmap which dies
21:59 Desty: the call trace is 40 lines
21:59 imirkin: check on the mailing list / bug tracker
21:59 imirkin: to see if others have filed issues with similar traces
21:59 imirkin: chances are you aren't the first
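When searching the mailing list or bug tracker for similar reports, it can help to reduce a long call trace to just its function names. The Python sketch below assumes the usual `symbol+0xoffset/0xsize` form of kernel call-trace lines. `trace_symbols` is a hypothetical helper, and the sample lines are made up for illustration: only the function names come from the trace described above, while the timestamps and offsets are invented.

```python
import re
from typing import List

# Matches e.g. "nvkm_umem_unmap+0x4b/0x60", optionally preceded by "? ".
_FRAME = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def trace_symbols(log_text: str, include_unreliable: bool = False) -> List[str]:
    """Extract function names from call-trace lines, in order of appearance."""
    symbols = []
    for line in log_text.splitlines():
        m = _FRAME.search(line)
        if not m:
            continue  # not a stack frame (header line, other log output)
        if m.group(1) and not include_unreliable:
            continue  # a leading "? " marks a frame the unwinder is unsure of
        symbols.append(m.group(2))
    return symbols


# Made-up sample in the general shape of a dmesg call trace.
sample = """\
[  100.000000] Call Trace:
[  100.000001]  vunmap+0x10/0x20
[  100.000002]  nvkm_umem_unmap+0x30/0x40 [nouveau]
[  100.000003]  ? nvif_object_unmap_handle+0x50/0x60 [nouveau]
[  100.000004]  nouveau_drm_ioctl+0x70/0x80 [nouveau]
"""
signature = " ".join(trace_symbols(sample))
```

The resulting space-separated string of function names is usually a more useful search query than the raw 40-line trace.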
22:01 Desty: hmm good call, will check there. I submitted an issue to Ubuntu's tracker a while back but there wasn't much interest (probably because it's old hardware)