00:56imirkin: marek pushed some changes which mess with refcounting ... someone should check them out with nouveau - it's going to be esp sensitive as i think it's the only driver that doesn't use u_vbuf.
17:47karolherbst: meh... now I can't blame GPU crashes on multithreading anymore :D
18:00HdkR: I'm sure there will be new multithreading bugs to look forward to
18:02imirkin: there's a lot more wrong than just multithreading
18:02imirkin: i think we have VM issues
18:06karolherbst: something is funky with CL at least
18:06karolherbst: I get... "random" context crashes
18:06karolherbst: like random
18:06karolherbst: stuff runs fine for minutes and then suddenly it crashes
18:07karolherbst: and yeah.. we have this silly multithreading issue inside the kernel as well
18:07imirkin: which kind of crashes?
18:07imirkin: like SIGSEGV?
18:07imirkin: or like "write error"
18:07karolherbst: killed context
18:07imirkin: what causes it to get killed?
18:07imirkin: some PTE/PDE thing right?
18:07karolherbst: mhh, no
18:07karolherbst: fifo: fault 00 [VIRT_READ] at 0000000006a00000 engine 40 [GR] client 11 [GPC0/GCC] reason 00 [PDE] on channel 2 [00ff8f6000 test_bruteforce]
18:08karolherbst: ohh, yeah
18:08karolherbst: I missed the PDE
18:08imirkin: so yeah. i think this points to our VM issues
18:08karolherbst: something is funky
18:08karolherbst: but at least tsan doesn't complain
18:08imirkin: either some core problem with synchronization between TLB flush and whatever
18:08imirkin: or some dumber problem where we accidentally don't claim to use some buffer that we need to use
18:08karolherbst: or some stale GPU state or whatever
18:08karolherbst: at least this also happens when running singlethreaded
18:12imirkin: i've pored over our buffer usage submissions, and i'm moderately sure our logic is fine there
18:12imirkin: coz there were problems and i fixed them :)
18:12imirkin: that said, the problems were super-subtle, and i couldn't rule out additional (but different) super-subtle problems
20:13karolherbst: mhh, yeah... I think it's probably best to work on dumping all required data to tell what went wrong.
20:27karolherbst: imirkin: anyway, I will try to push out the fixes in smaller batches. One to fix the fencing stuff, one to fix the pushbuffer stuff and I have a third lock on the cur_ctx thing, which is also a bit racy. But it seems like those can be all fixed independently of each other. Will probably test chromium with multithreading enabled to see how well all of this works though.
20:27karolherbst: or do you know an "easy" way of testing it? I think chromium requires a recompilation or something?
20:27imirkin: just update nouveau -> noveau
20:28karolherbst: good idea
20:29imirkin: check warsow 2 as well
20:29karolherbst: I think you mentioned it before :D
20:29imirkin: and then some like random games
20:29karolherbst: dolphin also does multithreading, but just for compilation with some vbo stuff to work around some drivers
20:29HdkR: For the nvidia blob specifically :P
20:30karolherbst: but anyway, I got CL test to run multithreaded without races :)
20:31imirkin: and pass? :p
20:31imirkin: mpv --vo=gpu
20:31imirkin: on a vp4/5 board
20:31imirkin: should reveal some problems
20:32karolherbst: ahh yeah...
20:32imirkin: i.e. fermi/kepler
20:32imirkin: you probably didn't touch the video stuff?
20:32karolherbst: I fixed this atm more on a "tsan reported stuff so I fixed it" basis
20:32karolherbst: not at all
20:32HdkR: Will the multithreading fixes be in 21.0 or is this going to need 21.1 as a release?
20:32imirkin: 21.0 is basically done
20:32imirkin: whereas these changes are still mid-development
20:32HdkR: ah, k
20:32karolherbst: but I am surprised that libdrm isn't causing any issues anymore :)
20:33karolherbst: there is just one minor annoying detail I still need to find a nice solution for, but crappily it's all fixed
20:34karolherbst: imirkin: anyway, I already posted the race on the name, which is mostly harmless sadly :D https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8765
20:34imirkin: ah ok. fyi i don't get gitlab notifications.
20:49karolherbst: you could at least subscribe to the nouveau label :p
20:49imirkin: i could, but i don't