10:48karolherbst: omeringen: ahh, you were always offline when I saw your replies :D
10:48karolherbst: sorry for that
10:49karolherbst: anyway, the log is basically saying "we crashed the GPU context"
10:50karolherbst: but we're probably also doing something terribly wrong
10:53karolherbst: oh wow.. dpaste kills my browser when searching through the page
10:54karolherbst: mhh, others also reported issues with plasmashell.. I think I will try to take a look on my machine and see what's going on there
10:54karolherbst: probably something stupid
11:20omeringen: @karolherbst thanks for the answer, i am tracking what you write on the logs at freedesktop webpage :D
11:37omeringen: if dpaste is buggy for you, you can check my google drive folder for my GF108M issue: https://drive.google.com/drive/folders/19kBULZ3OKNISdoFBkmCF66uqq4vLwVI5
19:37omeringen: so i am the only one who reports this since there is nobody else using GF108M nowadays :D
19:39karolherbst: omeringen: probably it's something related to plasma
19:40omeringen: hmmm, related to this specific card or general issue ?
19:40karolherbst: could be both
19:44omeringen: i can try fedora workstation 36 beta live, it's using gnome. Downloading. . .
19:44karolherbst: if you try booting up fedora, you can also test plasma there.. sometimes I even hit issues only happening on one distro
19:44karolherbst: it's very weird sometimes
19:46omeringen: there is no v36 plasma version yet, already tried v35 plasma a long time ago. it hangs the same as on arch.
20:55mattst88: karolherbst: does https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/3573407/ seem like an appropriate hack to work around nouveau's multithreaded bugginess?
20:56karolherbst: this just makes the kernel sync on the pushes
20:56karolherbst: but the corruption happens in userspace
20:57mattst88: okay, thanks
20:58karolherbst: mattst88: I have it mostly fixed here though: https://gitlab.freedesktop.org/karolherbst/mesa/-/tree/nvc0_threading
20:59karolherbst: just some things I have to change to not regress on older hw
21:01mattst88: where does 320M fall on the older/newer spectrum? :)
21:02karolherbst: it's not fermi so it's probably not working out alright
21:02mattst88: okay, thanks
21:02karolherbst: but I actually need review on this MR, because I think it's fine to merge: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10752
21:03karolherbst: it just doesn't fix all the issues
21:03karolherbst: but some
21:04mattst88: okay, and then your nvc0_threading branch is more fixes on top of that MR?
21:04karolherbst: this MR fixes most races outside the command buffer
21:04karolherbst: the others fix the command buffer itself
21:05karolherbst: the main problem is that I start to push the command buffer way more often and that makes older hardware fail in weird situations
21:06karolherbst: didn't figure out what's wrong there, but I might go to a queueing approach, where I just queue submission into a worker which crunches through those or something.. dunno yet
21:06karolherbst: something something
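The queueing idea mentioned above (contexts hand finished command buffers to one worker, which submits them in order) could look roughly like the sketch below. It is only an illustration in Python, not nouveau or Mesa code: `SubmissionWorker`, `enqueue`, and the `submit` callback are hypothetical names, and a real driver would do this in C against the kernel's pushbuf ioctl.

```python
import queue
import threading


class SubmissionWorker:
    """Hypothetical worker that serializes command-buffer submissions."""

    def __init__(self, submit):
        # `submit` stands in for whatever actually hands a buffer to the kernel.
        self._submit = submit
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, pushbuf):
        # Called from any context thread; never blocks on the hardware.
        self._jobs.put(pushbuf)

    def _run(self):
        while True:
            job = self._jobs.get()
            try:
                if job is None:  # sentinel: stop the worker
                    return
                self._submit(job)
            finally:
                self._jobs.task_done()

    def close(self):
        # Drain everything queued so far, then stop.
        self._jobs.put(None)
        self._thread.join()
```

Because only the worker thread ever calls `submit`, submissions cannot interleave, at the cost of an extra hop between building a buffer and the hardware seeing it.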
21:10karolherbst: mattst88: I think your patch might fix _something_ else, which is a bit worrying tbh
21:10mattst88: heh :|
21:10karolherbst: there seems to be an issue with fencing, which I don't understand and I doubt anybody does
21:10karolherbst: but we think there is some bug somewhere doing something
21:12mattst88: the author says it avoids some crashes, so it must be working around _something_ :)
21:12karolherbst: skeggsb: seems like setting NOUVEAU_GEM_PUSHBUF_SYNC works around some weird video accel crashes :/
21:13karolherbst: mattst88: soo what this patch does is block the ioctl submitting the command buffer until the hw is done executing it
21:13karolherbst: it might help to prevent races in userspace though if the threads are constantly in waiting state?
21:14mattst88: yeah, that could be
21:15skeggsb: karolherbst: doesn't at all shock me, as i've said a few times, i'm reasonably sure our fences don't work properly sometimes (i suspect weird flushing issues when switching between engines) - i'm getting to that side of things, working on pushing out the rest first
21:16karolherbst: okay, cool
21:16mattst88: good to hear
21:16skeggsb: i think some of the random mmu faults with piglit are from gr<->ce switches and fencing
21:16karolherbst: that would solve tons of random fails
21:17karolherbst: and I should finalize the mt fixes without wanting to rewrite the entire driver :D
21:17karolherbst: skeggsb: do you think it would be fine to flush on every draw because that's my current solution to not having to rewrite the entire thing
21:18skeggsb: i suspect that'd slaughter performance, fix the issue instead :P
21:18karolherbst: (although I could also mark everything dirty and just roll with it)
21:18karolherbst: skeggsb: the issue is, that we have weirdo state tracking
21:18skeggsb: i'm well aware, the 3d driver sucks :P
21:18karolherbst: so our submissions are diffs, so if one "draw command stream" is built on some state, I have to make sure nothing gets submitted in the meantime
21:19karolherbst: so I flush it out
21:19karolherbst: marking everything as dirty will also tank perf as we will constantly reupload data :(
21:19karolherbst: it's just terrible from every perspective
21:19karolherbst: or we just synchronize each context and don't do threading at all...
21:19skeggsb: the vk method is probably best: private command buffers + after a flush, resubmit all state (if there's more than one context sharing a channel)
21:20karolherbst: the problem isn't state resubmission
21:21karolherbst: so multiple contexts get their command streams built, but that always happens based on some tracked state
21:21karolherbst: but the entire command stream becomes invalid once another thread changes that
21:21karolherbst: you could probably restore the state you build your buffer on I guess
21:21karolherbst: in case you meant that
21:22skeggsb: yes, that's why i'm saying resubmit all state for a context after it flushes, to its *own* push buffer, so any submission has the right state for the right context
21:22karolherbst: what do you mean "after"? it has to happen before
21:24karolherbst: or do you mean restore what was there before submitting?
21:24karolherbst: we don't track GL or mesa state, we only track hw state
21:25skeggsb: <initial state><draw><draw><draw><...><flush to kernel><resubmit state><draw> ...
21:25karolherbst: okay, so you meant the initial state
21:26karolherbst: I don't see how that's going to even work though, because won't you end up resubmitting some old state over and over again if the contexts are heavily threaded?
21:26skeggsb: no, i meant, once you send a context's pushbuf to kernel, consider the channel state invalid (if there's multiple contexts, no point if there's only one) and resubmit *all* the state after a context flushes just in case another thread draws first
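The scheme skeggsb describes (private push buffer per context, and after each flush the context re-emits all of its state when the channel is shared) can be modelled with a toy like the one below. This is a sketch under assumptions, not driver code: `Channel`, `Context`, the `("set", key, value)` / `("draw", name)` command tuples, and the idea that a context's state is a flat dictionary are all invented for illustration.

```python
class Channel:
    """Toy stand-in for a hardware channel shared by several contexts."""

    def __init__(self):
        self.hw_state = {}
        self.contexts = 0
        self.draws = []  # (context name, hw state seen by that draw)

    def submit(self, pushbuf):
        for op, *args in pushbuf:
            if op == "set":
                key, value = args
                self.hw_state[key] = value
            elif op == "draw":
                (name,) = args
                self.draws.append((name, dict(self.hw_state)))


class Context:
    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        channel.contexts += 1
        self.state = {}    # everything this context has ever set
        self.pushbuf = []  # private, never shared with other contexts

    def set_state(self, key, value):
        if self.state.get(key) != value:
            self.state[key] = value
            self.pushbuf.append(("set", key, value))

    def draw(self):
        self.pushbuf.append(("draw", self.name))

    def flush(self):
        self.channel.submit(self.pushbuf)
        self.pushbuf = []
        if self.channel.contexts > 1:
            # Another context may draw before we do: treat channel state
            # as invalid and start the next buffer with all of our state.
            for key, value in self.state.items():
                self.pushbuf.append(("set", key, value))
```

In this model every submission is self-contained, so the order in which contexts flush no longer matters. The toy only re-emits keys the context itself has set; a real implementation would have to cover every piece of hardware state the context depends on, which is where the "significant changes" come in.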
21:26karolherbst: we can't
21:26skeggsb: yes, we can, i've prototyped it before but i messed up something else
21:27karolherbst: because it won't work in our current driver
21:27karolherbst: you already have the commands in the pb building upon some state
21:27karolherbst: you can't fix those
21:27skeggsb: it requires significant changes, yes, but they're worth doing
21:27skeggsb: or you wait for the vk driver and use zink :P
21:28karolherbst: I don't plan on rewriting the entire driver, which was kind of my point the entire time, no?
21:28karolherbst: ehh this is just annoying
21:28karolherbst: probably easier to trash the entire driver and start from scratch :D
21:30skeggsb: maybe, maybe not.
21:30karolherbst: skeggsb: anyway, my plan is kind of to get something working without having to just rewrite everything... _maybe_ it would be simple to just restore the state to how it looked before submitting... but dunno
21:31karolherbst: sadly zink is just for vulkan capable hardware, because it would be faster than our mesa driver anyway
21:33karolherbst: skeggsb: I think what I'll do... we just store in each context the state it builds its pb upon and resubmit everything if something differs
21:34karolherbst: probably the easiest solution for now
21:34karolherbst: contexts on flush will update the screen's state. before they flush they compare that to what they assume was the state, if it differs -> mark everything as dirty
21:35karolherbst: and do two submissions: 1. restoring state 2. doing the actual work
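The compare-then-restore plan just outlined can be sketched as follows. It is a simplified model, not Mesa code: `Screen`, `Context`, the `base` snapshot, and the `"restore"` / `"work"` labels are hypothetical, state is a flat dictionary with a fixed set of keys, and commands are plain `(key, value)` pairs.

```python
class Screen:
    """Toy screen holding the last state submitted to the hardware."""

    def __init__(self, initial_state):
        self.hw_state = dict(initial_state)
        self.submissions = []  # labels only, to show what was sent

    def submit(self, label, commands):
        self.submissions.append(label)
        for key, value in commands:
            self.hw_state[key] = value


class Context:
    def __init__(self, screen):
        self.screen = screen
        self.base = dict(screen.hw_state)  # state the push buffer is built on
        self.state = dict(self.base)       # base plus this context's changes
        self.pushbuf = []

    def set_state(self, key, value):
        if self.state.get(key) != value:
            self.state[key] = value
            self.pushbuf.append((key, value))

    def flush(self):
        if self.screen.hw_state != self.base:
            # Someone else flushed in between: first put back the state
            # our diffs assume (the "mark everything dirty" case).
            self.screen.submit("restore", list(self.base.items()))
        self.screen.submit("work", self.pushbuf)
        self.pushbuf = []
        self.base = dict(self.state)  # next buffer builds on what we left
```

When no other context has flushed, the comparison succeeds and only the diff is submitted, so the single-threaded case keeps its current cost; the extra restore submission is only paid when contexts actually interleave.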
21:40karolherbst: or I just give up and do the crappy solution: use one pb and make it not deadlock.....
21:40karolherbst: and we just focus on making the vulkan driver good
21:40karolherbst: and if people want performance with GL they use zink then
21:40karolherbst: (which they need in either case if we don't write a new GL driver)
23:09mangix: long live vulkan
23:10mangix: I think valve plans on removing their opengl backend
23:38HdkR: zink the world
23:48karolherbst: mangix: for future games that makes sense
23:49karolherbst: what's actually the main reason for vulkan not being supported on a bit older hardware?
23:50karolherbst: probably something with memory, but what exactly?
23:53mangix: I imagine drivers
23:53karolherbst: I heard there are actually reasons like in hardware
23:53mangix: doesn't vulkan capability map to OpenGL support?
23:58mangix: google says "Initial specifications stated that Vulkan drivers can be implemented on any hardware that supports OpenGL ES 3.1 or OpenGL 4.x and up. As Vulkan support requires new graphics drivers, this does not necessarily imply that every existing device that supports OpenGL ES 3.1 or OpenGL 4."