00:37 imirkin: screen-wide objects will be in each bufctx
00:37 imirkin: also a particular thing could be bound to multiple contexts
07:35 tjaalton: imirkin: hi, I just noticed fdo#111218, where it was said that ubuntu builds with asserts enabled. but we already have '-Db_ndebug=true', which was supposed to fix these.. is something else needed?
07:36 tjaalton: it was a regression when switching to meson
07:51 tjaalton: looks like debugoptimized enables asserts? and that's the default, so blame the defaults then
08:00 tjaalton: ..and b_ndebug=true should disable them, if not then there's a bug somewhere
08:18 imirkin: tjaalton: i don't really know how one is supposed to use meson, so i can't really help you
08:19 imirkin: tjaalton: however the outcome was that someone using ubuntu builds of mesa was getting asserts
08:19 imirkin: could be that this person had a build from before you made that change, i really don't know
08:20 imirkin: they were fairly responsive, so i suspect you could get more info from them should you need it
10:22 tjaalton: nope, that version inherited the change from debian with b_ndebug=true
10:23 tjaalton: meson in fedora seems to default to that nowadays, mesa was only rebuilt against it in march
10:26 karolherbst: imirkin: mhh, sure, but the issue is, that the .head list is identical at some point, because changing it on one bufctx ended up changing it inside the other as well
10:26 karolherbst: I was more talking about the list object in general
10:27 karolherbst: also, it's a list of nouveau_bufref
10:27 karolherbst: not buffers
10:28 karolherbst: and two bufctx having the same nouveau_bufref object sounds very wrong to me
10:32 karolherbst: anyway.. sounds like a serious issue inside libdrm, or mesa... but I bet it's the former
10:33 karolherbst: and fixing this might actually solve a lot of weird issues
10:33 karolherbst: because the issue I'm currently trying to track down is one where a bufref gets its bo set to NULL for no reason at all, and I think that's interference from a different bufctx
11:50 karolherbst: yeah.. so there is a list entry bound to a different bufctx.. and it exists in both lists... mhh, that's obviously a bug
12:11 karolherbst: ohh uhm.. that's because I am stupid :(
12:21 karolherbst: head is of course the structure, because the bufctx is inside a list of another object ... ... mhh, now I need a new explanation for that bo = NULL thing
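The mix-up above is easy to reproduce with any intrusive list. A minimal sketch, assuming a hypothetical `struct ctx` with a `head` member that links the context into its owner's list and a separate `pending` member that anchors the context's own entries (the names `list_node`, `owner`, `ctx` and the helpers are illustrative, not the libdrm definitions). Following `head` from one context leads to its sibling contexts, which can look like two contexts sharing one list:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal intrusive list, not the libdrm implementation. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
    n->prev = n;
    n->next = n;
}

static void list_add_tail(struct list_node *item, struct list_node *list)
{
    item->prev = list->prev;
    item->next = list;
    list->prev->next = item;
    list->prev = item;
}

/* Recover the enclosing object from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct owner {
    struct list_node ctx_list;   /* anchors the list of contexts */
};

struct ctx {
    struct list_node head;       /* membership node in owner.ctx_list */
    struct list_node pending;    /* anchors this context's own entries */
};

static void ctx_init(struct ctx *c, struct owner *o)
{
    list_init(&c->pending);
    list_add_tail(&c->head, &o->ctx_list);
}

/* Count the contexts reachable from the owner's list. */
static int owner_count_ctx(const struct owner *o)
{
    int n = 0;
    assert(o != NULL);
    for (const struct list_node *it = o->ctx_list.next;
         it != &o->ctx_list; it = it->next)
        n++;
    return n;
}
```

With two contexts registered on the same owner, `a.head.next` points at `b.head`, while `a.pending` and `b.pending` stay fully separate.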
13:44 karolherbst: uff.. seems like the pending list can end up being empty if we're calling pushbuf_validate
17:07 karolherbst: mhh, okay, now I know what's going on with the last race condition... but no idea how to fix it yet
17:08 karolherbst: nouveau_screen_fence_finish can be called in any thread and the passed in context isn't guaranteed to be made current on the thread :/
17:10 karolherbst: mhh.. actually..
19:16 karolherbst: imirkin: I think I reached the point where I am saying as well, that fixing that mt stuff directly on top of libdrm isn't possible
19:17 karolherbst: worst solution: add a layer inside mesa between nouveau and libdrm to serialize concurrent access + some smartness to make it not suck that bad... and in the second step we can replace libdrm with whatever we want to do
19:18 imirkin: i think just not using libdrm_nouveau at all is the way to go
19:18 imirkin: just have to be careful not to repeat all the same mistakes
19:21 karolherbst: yeah... but for that I have to understand how all the buffer submission actually works
19:21 karolherbst: + we might get a new API anyway
19:21 karolherbst: if I can make it not suck with the layer in between, we can reimplement it when we've got the new API
19:23 karolherbst: the main thing I want to make it a bit smarter is to still have per thread push buffers, but serialize all that into a single buffer inside that layer... which should give applications the mt benefits by not blocking all too much... just have to see how easy that would be. What I don't want to have is a lock on every call into the layer...
19:23 karolherbst: maybe that's not possible anyway
19:23 karolherbst: then I'd just reimplement it
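The per-thread push buffer idea could look roughly like the following. This is only a sketch under assumptions: `local_push`, `shared_push` and the capacities are hypothetical names and values, not anything from nouveau or libdrm. Each thread records into its own buffer without locking, and the shared lock is taken once per batch rather than once per call:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define LOCAL_CAP  64
#define SHARED_CAP 1024

/* Hypothetical: one per thread, only ever touched by its owning thread. */
struct local_push {
    uint32_t data[LOCAL_CAP];
    size_t   count;
};

/* Hypothetical: the single submission stream shared by all threads. */
struct shared_push {
    pthread_mutex_t lock;
    uint32_t data[SHARED_CAP];
    size_t   count;
};

static void shared_push_init(struct shared_push *sp)
{
    pthread_mutex_init(&sp->lock, NULL);
    sp->count = 0;
}

/* Move a whole local batch into the shared stream; returns 0 on success. */
static int local_push_flush(struct local_push *lp, struct shared_push *sp)
{
    int ret = 0;

    assert(lp->count <= LOCAL_CAP);
    pthread_mutex_lock(&sp->lock);
    if (sp->count + lp->count <= SHARED_CAP) {
        memcpy(&sp->data[sp->count], lp->data,
               lp->count * sizeof(lp->data[0]));
        sp->count += lp->count;
        lp->count = 0;
    } else {
        ret = -1;   /* shared stream is full; the caller must submit first */
    }
    pthread_mutex_unlock(&sp->lock);
    return ret;
}

/* Record one word without taking the lock; flush only when the batch is full. */
static int local_push_emit(struct local_push *lp, struct shared_push *sp,
                           uint32_t word)
{
    if (lp->count == LOCAL_CAP) {
        int ret = local_push_flush(lp, sp);
        if (ret)
            return ret;
    }
    lp->data[lp->count++] = word;
    return 0;
}
```

Batches stay contiguous in the shared stream, so commands from one thread are not interleaved with another thread's within a batch. Ordering between batches of different threads is whatever order the flushes happen in, which a real layer would still have to reason about.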
19:25 karolherbst: my last try was working out quite well though.. was able to run around 900 tests on average from ./deqp-egl --deqp-case='*multithread*' until crash... without any patches it was more like 10
19:25 imirkin: yeah, i mean it's not tooo hard to get to the 99% mark
19:26 imirkin: the final 1% however points out the approach is just fundamentally flawed
19:26 karolherbst: yeah.. the fencing stuff was super annoying
19:27 karolherbst: but that might be a result of the restructuring I did
19:27 imirkin: the basic issue is that nouveau_bo_map can trigger a wait
19:27 imirkin: which in turn can cause a submit
19:27 imirkin: which in turn can cause a fence callback
19:27 imirkin: which in turn can trigger work
19:27 imirkin: which in turn can want to do something nasty
19:27 karolherbst: I got that under control though
19:27 karolherbst: that was fixed on my branch afaik
19:27 karolherbst: what was not, was that a thread is calling eglSync*(), where other threads are still doing work on a context
19:28 karolherbst: and now that one thread is triggering all kinds of fences
19:28 karolherbst: attached to all kinds of contexts
19:28 karolherbst: and I could fix this by throwing a bunch of locks in there....
19:28 karolherbst: but... at that point it becomes a bit pointless
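The map -> wait -> submit -> fence callback -> work chain described above is a reentrancy problem even on a single thread. One common way to tame it, sketched here with entirely hypothetical names (`toy_ctx`, `toy_submit`, `toy_map`; this is not what nouveau or the branch mentioned above does), is to park any work that arrives while a submit is in progress and run it only once the submit state is consistent again:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*work_fn)(void *data);

/* Hypothetical per-context state; none of these names come from nouveau. */
struct toy_ctx {
    bool    in_submit;                    /* true while a submit is running */
    work_fn deferred[MAX_DEFERRED];
    void   *deferred_data[MAX_DEFERRED];
    int     num_deferred;
    int     reentrant_attempts;           /* how often work arrived mid-submit */
};

/* Run work now, or park it if we are inside a submit. */
static void toy_schedule(struct toy_ctx *ctx, work_fn fn, void *data)
{
    assert(fn != NULL);
    if (ctx->in_submit) {
        ctx->reentrant_attempts++;
        /* A real implementation must handle overflow; this sketch drops. */
        if (ctx->num_deferred < MAX_DEFERRED) {
            ctx->deferred[ctx->num_deferred] = fn;
            ctx->deferred_data[ctx->num_deferred] = data;
            ctx->num_deferred++;
        }
        return;
    }
    fn(data);
}

/* Submit: signals a fence, whose callback schedules follow-up work. */
static void toy_submit(struct toy_ctx *ctx, work_fn fence_cb_work, void *data)
{
    ctx->in_submit = true;
    toy_schedule(ctx, fence_cb_work, data);  /* stands in for the fence callback */
    ctx->in_submit = false;

    /* Drain parked work only after the submit state is consistent again. */
    for (int i = 0; i < ctx->num_deferred; i++)
        ctx->deferred[i](ctx->deferred_data[i]);
    ctx->num_deferred = 0;
}

/* Mapping a busy buffer has to wait, and waiting has to submit first. */
static void toy_map(struct toy_ctx *ctx, bool busy,
                    work_fn fence_cb_work, void *data)
{
    if (busy)
        toy_submit(ctx, fence_cb_work, data);
}

/* Example work item: bump a counter. */
static void toy_count_work(void *data)
{
    (*(int *)data)++;
}
```

This only covers the single-context case. It does nothing for the situation where one thread signals fences attached to contexts that other threads are still working on.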
23:36 karolherbst: imirkin: uff.. seems like nv30 and nv50 are doing a bit weirdo stuff we might want to change
23:36 karolherbst: like the pending/current list iterations on the bufctx
23:37 imirkin: probably not on purpose for nv50
23:37 karolherbst: mhh, nv50_bufctx_fence
23:38 imirkin: and what does nvc0 do?
23:39 karolherbst: ehh.. something similar
23:39 karolherbst: I didn't get to nvc0 yet
23:39 imirkin: :)
23:40 karolherbst: mhhh, but having those lists exposed to the drivers can cause all kinds of weirdo issues :/
23:41 karolherbst: although those bufctx are per context.. screen flushes will still trigger those fences on random threads so random threads might actually change those lists, while a current context might run through the state_validate stuff
23:42 karolherbst: maybe iteration through callbacks instead?
23:44 karolherbst: I would like to just get rid of those lists entirely and do magic.. but I guess it won't be that easy
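Iteration through callbacks, as floated above, might be sketched like this. The names (`toy_refset`, `toy_refset_for_each`) and the fixed-size array are assumptions made for illustration; the point is only that drivers never see the storage and every walk happens under the container's own lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_REFS 16

/* Hypothetical stand-in for a buffer reference. */
struct toy_ref {
    int bo_id;
};

/* Hypothetical container that keeps its storage private behind a lock. */
struct toy_refset {
    pthread_mutex_t lock;
    struct toy_ref  refs[MAX_REFS];
    size_t          count;
};

typedef void (*toy_ref_cb)(struct toy_ref *ref, void *data);

static void toy_refset_init(struct toy_refset *set)
{
    pthread_mutex_init(&set->lock, NULL);
    set->count = 0;
}

/* Append a reference; returns 0 on success, -1 if the set is full. */
static int toy_refset_add(struct toy_refset *set, int bo_id)
{
    int ret = -1;

    pthread_mutex_lock(&set->lock);
    if (set->count < MAX_REFS) {
        set->refs[set->count++].bo_id = bo_id;
        ret = 0;
    }
    pthread_mutex_unlock(&set->lock);
    return ret;
}

/* Visit every reference while holding the lock for the whole walk. */
static void toy_refset_for_each(struct toy_refset *set, toy_ref_cb cb,
                                void *data)
{
    assert(cb != NULL);
    pthread_mutex_lock(&set->lock);
    for (size_t i = 0; i < set->count; i++)
        cb(&set->refs[i], data);
    pthread_mutex_unlock(&set->lock);
}

/* Example callback: accumulate the ids into an int. */
static void toy_sum_ids(struct toy_ref *ref, void *data)
{
    *(int *)data += ref->bo_id;
}
```

The callback must not call back into the same set, since the default mutex is not recursive and that would deadlock. It also reintroduces a lock on every walk, which is the cost the discussion above wanted to avoid.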