10:15 omeringen: :|
10:48 karolherbst: omeringen: ahh, you were always offline when I saw your replies :D
10:48 karolherbst: sorry for that
10:49 karolherbst: anyway, the log is basically saying "we crashed the GPU context"
10:50 karolherbst: but we probably also are doing something terribly wrong
10:53 karolherbst: oh wow.. dpaste kills my browser when searching through the page
10:54 karolherbst: mhh, others also reported issues with plasmashell.. I think I will try to take a look on my machine and see what's going on there
10:54 karolherbst: probably something stupid
11:20 omeringen: @karolherbst thanks for the answer, i am tracking what you write on the logs at freedesktop webpage :D
11:37 omeringen: if dpaste is buggy for you, you can check my google drive folder for my GF108M issue: https://drive.google.com/drive/folders/19kBULZ3OKNISdoFBkmCF66uqq4vLwVI5
16:41 ciscon: https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Ampere-Firmware-Blobs
19:37 omeringen: so i am the only one who reports this since there is nobody else using GF108M nowadays :D
19:39 karolherbst: omeringen: probably it's something related to plasma
19:40 omeringen: hmmm, related to this specific card or general issue ?
19:40 karolherbst: could be both
19:44 omeringen: i can try fedora workstation 36 beta live, it's using gnome. Downloading. . .
19:44 karolherbst: if you try booting up fedora, you can also test plasma there.. sometimes I even hit issues only happening on one distro
19:44 karolherbst: it's very weird sometimes
19:46 omeringen: there is no v36 plasma version yet, already tried v35 plasma a long time ago. it hangs the same as on arch.
19:46 karolherbst: okay
20:55 mattst88: karolherbst: does https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/3573407/ seem like an appropriate hack to work around nouveau's multithreaded bugginess?
20:56 karolherbst: no
20:56 karolherbst: this just makes the kernel sync on the pushes
20:56 karolherbst: but the corruption happens in userspace
20:57 mattst88: okay, thanks
20:58 karolherbst: mattst88: I have it mostly fixed here though: https://gitlab.freedesktop.org/karolherbst/mesa/-/tree/nvc0_threading
20:59 karolherbst: just some things I have to change to not regress on older hw
21:01 mattst88: where does 320M fall on the older/newer spectrum? :)
21:02 karolherbst: it's not fermi so it's probably not working out alright
21:02 mattst88: okay, thanks
21:02 karolherbst: but I actually need review on this MR, because I think it's fine to merge: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10752
21:03 karolherbst: it just doesn't fix all the issues
21:03 karolherbst: but some
21:04 mattst88: okay, and then your nvc0_threading branch is more fixes on top of that MR?
21:04 karolherbst: yes
21:04 karolherbst: this MR fixes most races outside the command buffer
21:04 karolherbst: the others fix the command buffer itself
21:05 mattst88: okay
21:05 karolherbst: the main problem is that I start to push the command buffer way more often and that makes older hardware fail in weird situations
21:06 karolherbst: didn't figure out what's wrong there, but I might go to a queueing approach, where I just queue submission into a worker which crunches through those or something.. dunno yet
21:06 karolherbst: something something
21:10 karolherbst: mattst88: I think your patch might fix _something_ else, which is a bit worrying tbh
21:10 mattst88: heh :|
21:10 karolherbst: there seems to be an issue with fencing, which I don't understand and I doubt anybody does
21:10 karolherbst: but we think there is some bug somewhere doing something
21:12 mattst88: the author says it avoids some crashes, so it must be working around _something_ :)
21:12 karolherbst: skeggsb: seems like setting NOUVEAU_GEM_PUSHBUF_SYNC works around some weird video accel crashes :/
21:13 karolherbst: mattst88: soo what this patch does is block the ioctl submitting the command buffer until the hw is done executing it
21:13 karolherbst: it might help to prevent races in userspace though if the threads are constantly in waiting state?
21:14 mattst88: yeah, that could be
21:15 skeggsb: karolherbst: doesn't at all shock me, as i've said a few times, i'm reasonably sure our fences don't work properly sometimes (i suspect weird flushing issues when switching between engines) - i'm getting to that side of things, working on pushing out the rest first
21:16 karolherbst: okay, cool
21:16 mattst88: good to hear
21:16 skeggsb: i think some of the random mmu faults with piglit are from gr<->ce switches and fencing
21:16 karolherbst: ahh
21:16 karolherbst: that would solve tons of random fails
21:17 karolherbst: and I should finalize the mt fixes without wanting to rewrite the entire driver :D
21:17 karolherbst: skeggsb: do you think it would be fine to flush on every draw because that's my current solution to not having to rewrite the entire thing
21:18 skeggsb: i suspect that'd slaughter performance, fix the issue instead :P
21:18 karolherbst: (although I could also mark everything dirty and just roll with it)
21:18 karolherbst: skeggsb: the issue is, that we have weirdo state tracking
21:18 skeggsb: i'm well aware, the 3d driver sucks :P
21:18 karolherbst: so our submissions are diffs, so if one "draw command stream" is built on some state, I have to make sure nothing gets submitted in the meantime
21:19 karolherbst: so I flush it out
21:19 karolherbst: marking everything as dirty will also tank perf as we will constantly reupload data :(
21:19 karolherbst: it's just terrible from every perspective
21:19 karolherbst: or we just synchronize each context and don't do threading at all...
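To make the "submissions are diffs" problem concrete, here is a minimal toy model, not nouveau code. Hardware state is assumed to be a plain dict, and `build_diff` and `submit` are hypothetical helpers invented for this illustration. It only shows why a diff built against tracked state goes stale once another context submits first.

```python
# Toy model with hypothetical names: hardware state is a dict, and a command
# stream is the list of (key, value) writes needed to get from the tracked
# state to the state a draw wants.

def build_diff(tracked, wanted):
    """Emit only the writes for keys whose value differs from the tracked state."""
    return [(k, v) for k, v in wanted.items() if tracked.get(k) != v]


def submit(hw, stream):
    """Apply a command stream to the shared hardware state."""
    for k, v in stream:
        hw[k] = v


hw = {"blend": "off", "depth": "off"}

# Context A builds its stream against the state it currently tracks.
wanted_a = {"blend": "off", "depth": "on"}
stream_a = build_diff(dict(hw), wanted_a)  # contains only the depth write

# Before A submits, context B submits and changes blend.
submit(hw, build_diff(dict(hw), {"blend": "on", "depth": "off"}))

# A's diff is now stale: it never rewrites blend, so A's draw would run
# with B's blend value instead of the one A assumed.
submit(hw, stream_a)
stale = hw != wanted_a
```

In this model, flushing on every draw shrinks the window in which B can sneak in, and marking everything dirty turns the diff into a full state upload; both match the trade-offs described in the discussion.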
21:19 skeggsb: the vk method is probably best: private command buffers + after a flush, resubmit all state (if there's more than one context sharing a channel)
21:20 karolherbst: the problem isn't state resubmission
21:21 karolherbst: so multiple contexts get their command streams built, but that always happens based on some tracked state
21:21 karolherbst: but the entire command stream becomes invalid once another thread changes that
21:21 karolherbst: you could probably restore the state you build your buffer on I guess
21:21 karolherbst: *built
21:21 karolherbst: in case you meant that
21:22 skeggsb: yes, that's why i'm saying resubmit all state for a context after it flushes, to its *own* push buffer, so any submission has the right state for the right context
21:22 karolherbst: what do you mean "after"? it has to happen before
21:24 karolherbst: or do you mean restore what was there before submitting?
21:24 karolherbst: we don't track GL or mesa state, we only track hw state
21:25 skeggsb: <initial state><draw><draw><draw><...><flush to kernel><resubmit state><draw> ...
21:25 karolherbst: okay, so you meant the initial state
21:26 karolherbst: I don't see how that's going to even work though, because won't you end up resubmitting some old state over and over again if the contexts are heavily threaded?
21:26 skeggsb: no, i meant, once you send a context's pushbuf to kernel, consider the channel state invalid (if there's multiple contexts, no point if there's only one) and resubmit *all* the state after a context flushes just in case another thread draws first
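The scheme described here (private push buffers, plus a full state resubmit after each flush when a channel is shared) can be sketched as a toy model. This is an illustration under assumptions, not the prototype mentioned in the discussion: `Channel`, `Context`, and the command tuples are hypothetical, and the model assumes all contexts exist before the first flush.

```python
class Channel:
    """Shared hardware channel; records the state each draw actually ran with."""

    def __init__(self):
        self.hw = {}
        self.num_contexts = 0
        self.draw_log = []

    def submit(self, pushbuf):
        for cmd in pushbuf:
            if cmd[0] == "set":
                self.hw[cmd[1]] = cmd[2]
            else:  # ("draw", name)
                self.draw_log.append((cmd[1], dict(self.hw)))


class Context:
    def __init__(self, channel, state):
        self.channel = channel
        self.state = dict(state)     # full state this context wants
        self.pushbuf = []            # private, never shared with other contexts
        self.need_full_state = True  # nothing has been emitted yet
        channel.num_contexts += 1

    def set_state(self, key, value):
        self.state[key] = value
        if not self.need_full_state:
            self.pushbuf.append(("set", key, value))
        # Otherwise the full state is emitted before the next draw anyway.

    def draw(self, name):
        if self.need_full_state:
            # <resubmit state>: emit everything, not a diff.
            self.pushbuf.extend(("set", k, v) for k, v in self.state.items())
            self.need_full_state = False
        self.pushbuf.append(("draw", name))

    def flush(self):
        self.channel.submit(self.pushbuf)
        self.pushbuf = []
        # With several contexts on the channel, treat the channel state as
        # invalid after a flush, because another thread may draw first.
        if self.channel.num_contexts > 1:
            self.need_full_state = True


chan = Channel()
a = Context(chan, {"blend": "off"})
b = Context(chan, {"blend": "on"})
a.draw("a1"); a.flush()
b.draw("b1"); b.flush()
a.draw("a2"); a.flush()
blend_per_draw = [state["blend"] for _, state in chan.draw_log]
```

Each draw in the log sees its own context's blend value, at the cost of re-emitting the whole state after every flush whenever the channel is shared.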
21:26 karolherbst: ahh
21:26 karolherbst: well
21:26 karolherbst: we can't
21:26 skeggsb: yes, we can, i've prototyped it before but i messed up something else
21:27 karolherbst: because it won't work in our current driver
21:27 karolherbst: you already have the commands in the pb building upon some state
21:27 karolherbst: you can't fix those
21:27 skeggsb: it requires significant changes, yes, but they're worth doing
21:27 skeggsb: or you wait for the vk driver and use zink :P
21:27 karolherbst: :P
21:28 karolherbst: I don't plan on rewriting the entire driver, which was kind of my point the entire time, no?
21:28 karolherbst: ehh this is just annoying
21:28 karolherbst: probably easier to trash the entire driver and start from scratch :D
21:30 skeggsb: maybe, maybe not.
21:30 karolherbst: skeggsb: anyway, my plan is kind of to get something working without having to just rewrite everything... _maybe_ it would be simple to just restore the state to how it looked before submitting... but dunno
21:31 karolherbst: sadly zink is just for vulkan capable hardware, because it would be faster than our mesa driver anyway
21:33 karolherbst: skeggsb: I think what I'll do... we just store the state in each context what it builds its pb upon and resubmit everything if something differs
21:34 karolherbst: probably the easiest solution for now
21:34 karolherbst: contexts on flush will update the screen's state. before they flush they compare that to what they assume was the state, if it differs -> mark everything as dirty
21:35 karolherbst: and do two submissions: 1. restoring state 2. doing the actual work
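The plan outlined above (each context remembers the state its pushbuf was built upon, compares it with the screen's state before flushing, and restores it in a separate submission if they differ) can be sketched roughly like this. It is a toy model under assumptions: `Screen`, `Context`, and their fields are hypothetical names, state is a plain dict, and "mark everything as dirty" is modelled as re-emitting the whole base state.

```python
class Screen:
    def __init__(self, hw_state):
        self.hw_state = dict(hw_state)  # state after the last flush by anyone
        self.submissions = []           # what went to the kernel, in order

    def submit(self, writes):
        writes = list(writes)
        self.submissions.append(writes)
        self.hw_state.update(writes)


class Context:
    def __init__(self, screen):
        self.screen = screen
        self.base = dict(screen.hw_state)  # state the pushbuf is built upon
        self.pushbuf = []

    def set_state(self, key, value):
        self.pushbuf.append((key, value))

    def flush(self):
        if self.screen.hw_state != self.base:
            # Someone else flushed in between.
            # Submission 1: restore the state this pushbuf assumed.
            self.screen.submit(self.base.items())
        # Submission 2 (or the only one): the actual work.
        self.screen.submit(self.pushbuf)
        self.pushbuf = []
        self.base = dict(self.screen.hw_state)


screen = Screen({"blend": "off", "depth": "off"})
a, b = Context(screen), Context(screen)
a.set_state("depth", "on")   # built on blend=off, depth=off
b.set_state("blend", "on")
b.flush()                    # base matches the screen: one submission
a.flush()                    # screen changed: restore, then the work
```

A real driver would also have to make the compare and the two submissions atomic with respect to other threads; this sketch leaves locking out entirely.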
21:40 karolherbst: or I just give up and do the crappy solution: use one pb and make it not deadlock.....
21:40 karolherbst: and we just focus on making the vulkan driver good
21:40 karolherbst: and if people want performance with GL they use zink then
21:40 karolherbst: (which they need in either case if we don't write a new GL driver)
23:09 mangix: long live vulkan
23:10 mangix: I think valve plans on removing their opengl backend
23:38 HdkR: zink the world
23:48 karolherbst: mangix: for future games that makes sense
23:49 karolherbst: what's actually the main reason for vulkan not being supported on a bit older hardware?
23:50 karolherbst: probably something with memory, but what exactly?
23:53 mangix: I imagine drivers
23:53 karolherbst: I heard there are actually reasons like in hardware
23:53 mangix: doesn't vulkan capability map to OpenGL support?
23:55 karolherbst: dunno
23:58 mangix: google says "Initial specifications stated that Vulkan drivers can be implemented on any hardware that supports OpenGL ES 3.1 or OpenGL 4.x and up. As Vulkan support requires new graphics drivers, this does not necessarily imply that every existing device that supports OpenGL ES 3.1 or OpenGL 4."
23:59 karolherbst: right..