00:00danvet: more like old classic I think
00:04danvet: I guess I should go sleep now, before I get banned here too :-P
00:05karolherbst: you were kind of off topic here, ngl
00:06danvet: yeah if I stay banned for much longer we might need that fdo-memes bridge indeed
02:27airlied: dakr: need to also work out a plan for syncobjs when the channel crashes
02:27airlied: currently we get stuck forever waiting
02:31dakr: airlied: yeah, guess you run into such a case?
02:32airlied: dakr: yeah a lot of vulkan tests cause some faults :-)
02:33airlied: dakr: it should be easy enough to write a test, execute a copy without binding a vm for one of the addrs
02:55dakr: airlied: got it reproduced...
02:55dakr: the fence that's attached to the syncobj actually gets signaled if the channel fails..
02:57dakr: the issue is that for subsequent exec jobs the channel is already dead and we run into a NULL pointer dereference in nv84_fence_emit() and then the syncobj of the subsequent exec job stalls.
03:00airlied: dakr: ah makes sense
03:00dakr: So, when the channel is killed we should probably remove all subsequent jobs from the queue.
03:01airlied: and refuse to process any
03:24dakr: airlied: yeah, probably enough to just refuse the exec ioctl?
03:25dakr: returning -ENODEV
03:35airlied: dakr: if (unlikely(atomic_read(&chan->killed)))
03:35airlied: return nouveau_abi16_put(abi16, -ENODEV);
03:35airlied: yeah just taking that from the pushbuf ioctl should be good
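A minimal userspace sketch of the check discussed above, assuming a simplified channel that carries only a killed flag. The names `sketch_chan`, `sketch_chan_kill` and `sketch_exec` are hypothetical and are not nouveau functions; C11 atomics stand in for the kernel's `atomic_read()`.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a channel: only the state needed for the check. */
struct sketch_chan {
    atomic_int killed;   /* becomes 1 once the channel has faulted */
    int jobs_submitted;  /* number of jobs accepted so far */
};

/* Mark the channel as dead, e.g. after a fault. */
static void sketch_chan_kill(struct sketch_chan *chan)
{
    atomic_store(&chan->killed, 1);
}

/*
 * Hypothetical exec entry point: refuse new work on a dead channel
 * instead of queueing a job whose fence could never be emitted.
 */
static int sketch_exec(struct sketch_chan *chan)
{
    if (atomic_load(&chan->killed))
        return -ENODEV;

    chan->jobs_submitted++;
    return 0;
}
```

With this shape, a job submitted after the kill is rejected up front, so no syncobj is left waiting on a fence that never signals.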
03:46Wally: Does nvidia hw (tu1xx+) require any PCI config register setting or any of the sort before being able to map one of the PCI bars?
04:14dakr: airlied: pushed the patch..
04:50airlied: jekstrand, karolherbst: do we expect a VkDevice will have its own channel and vm?
04:52airlied: I suppose we do create a new context for each vkdevice
04:57airlied: dakr: btw you got unlocked and locked mixed up :-)
04:57airlied: unlocked is the one that has to take the lock
05:01dakr: airlied: oops, just the naming though, functionally it's fine. :-)
05:09airlied: yeah I just reworked the nouveau_uvmm wrapper here and got confused :-)
05:15dakr: airlied: fixed it.
05:16airlied:has a test getting deadlocked somehow, just rebuilding my kernel with lockdep
05:26airlied: dakr: https://paste.centos.org/view/raw/2625963e if you want to wonder why locking is messy :-P
05:54dakr: airlied: yeah, that looks messy. :/ though not related to the new uapi. :)
05:55airlied: the problem I'm getting is unfortunately related to it, but happens after that one so lockdep won't catch it
05:57dakr: airlied, can you get the stack of the hanging processes?
06:01airlied: dakr: multithreaded, one is hung getting one of the locks in uvmm
06:01airlied: https://paste.centos.org/view/f3e35c84 is the kernel traces from each thread
06:03airlied: so one thread is waiting for a reservation while holding the uvmm lock
06:03airlied: not sure why it never gets it
06:07airlied: hmm I wonder do we need to have the object reserved to destroy the vma
06:10dakr: airlied: yeah, nouveau_gem_object_close() takes the reservation and then waits for the uvmm lock, while the job has the uvmm lock and waits for the reservation.
06:10dakr: that's bad.
06:10airlied: yeah definitely seems to be an ordering violation
06:11airlied: moving the uvmm teardown outside the resv seems to fix it, but I'd have to convince myself that was correct
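The ordering violation described above can be illustrated with a toy pairwise lock-order tracker. This is only a sketch under stated assumptions: `LOCK_RESV`, `LOCK_UVMM`, `record_order` and `reset_orders` are hypothetical names, and real lockdep tracks full dependency chains rather than single pairs.

```c
#include <stdbool.h>
#include <string.h>

/* The two locks involved in the reported deadlock. */
enum { LOCK_RESV, LOCK_UVMM, LOCK_COUNT };

/* before[a][b] is true once some path took lock b while holding lock a. */
static bool before[LOCK_COUNT][LOCK_COUNT];

/* Forget every recorded ordering. */
static void reset_orders(void)
{
    memset(before, 0, sizeof(before));
}

/*
 * Record that `inner` is acquired while `outer` is held.
 * Returns false if the opposite order was recorded earlier,
 * i.e. two paths could deadlock against each other.
 */
static bool record_order(int outer, int inner)
{
    if (before[inner][outer])
        return false;

    before[outer][inner] = true;
    return true;
}
```

In this model the close path records reservation then uvmm, and the job path then fails when it records uvmm then reservation. If the uvmm teardown is moved outside the reservation, the close path never records the first edge and the job path's order is accepted.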
06:21dakr: airlied, that's probably fine I guess.
06:25airlied: dakr: another one https://paste.centos.org/view/raw/df0fe16b
06:28dakr: airlied, yeah, that one was by mistake..
06:29dakr: gonna look into both tomorrow, it's already after 7am :D
06:30airlied: dakr: no worries, go sleep!
06:34airlied: hmmm need to consider if it's okay to call vma map twice with same params and just ignore the second one
06:34airlied: as vulkan allows aliasing images over the same memory range
06:35airlied: though not sure wtf we should do if you alias an image and buffer and have a ->kind set
07:04fdobridge: <LaughingMan> karolherbst: danvet: airlied: regarding those discord bans and their speculated remedies: i've never had a problem with being banned. no twitter/steam/etc are linked, no 2fa, no oauth provider. always accessing discord via their website in firefox with ublock, umatrix, and an enabled tracking protection.
07:42danvet: LaughingMan ... huh ...
07:49danvet: I'm still blocked and no answers :-/
08:04fdobridge: <jekstrand> airlied: Yes, each VkDevice should have its own VM. Maybe a channel per queue, maybe a channel per device. IDK yet.
08:05airlied: we have a channel per device, but shared vma at the moment
08:06airlied:is now trying to work out who should take care of aliasing, and if it's possible ->kind can be different which would be not a good thing
08:07airlied: not sure if we should just allow the kernel to accept binds that overlap a previous bind exactly
08:07airlied: but also have some tests that do an image in 4096 and an 8192 buffer from the one memory object
08:07airlied: so that might require extending a vm allocation
08:08airlied: tomorrow's problem
08:35jekstrand: airlied: We very much need to support two different VA bindings with different kinds pointing at the same BO range.
08:35jekstrand: s/VA bindings/VA ranges/
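One way to picture the requirement above is a bind record where only overlapping VA ranges conflict, while overlapping BO ranges are allowed to alias with different kinds. This is a sketch under assumptions: `sketch_bind` and its helpers are hypothetical and do not reflect the actual uapi layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bind record: one VA range mapped onto one BO range. */
struct sketch_bind {
    uint64_t va_start;   /* start of the virtual address range */
    uint64_t bo_offset;  /* offset into the backing buffer object */
    uint64_t size;       /* length of both ranges in bytes */
    uint8_t kind;        /* page kind used for this mapping */
};

/* True if the half-open ranges [a, a + a_size) and [b, b + b_size) intersect. */
static bool ranges_overlap(uint64_t a, uint64_t a_size,
                           uint64_t b, uint64_t b_size)
{
    return a < b + b_size && b < a + a_size;
}

/* Two binds conflict only when their VA ranges intersect. */
static bool sketch_binds_conflict(const struct sketch_bind *a,
                                  const struct sketch_bind *b)
{
    return ranges_overlap(a->va_start, a->size, b->va_start, b->size);
}

/* Two binds alias when they cover intersecting parts of the same BO. */
static bool sketch_binds_alias(const struct sketch_bind *a,
                               const struct sketch_bind *b)
{
    return ranges_overlap(a->bo_offset, a->size, b->bo_offset, b->size);
}
```

Under this model a 4096-byte image and an 8192-byte buffer placed at the same BO offset can coexist with different kinds, as long as each gets its own VA range.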
10:16airlied: jekstrand: yeah i think i might have the wrong model in the vk driver right now, will dig more tomorrow
14:44jekstrand: airlied: Probably
14:44jekstrand: Pass: 223920, Fail: 472, Crash: 339, Warn: 4, Skip: 1335730, Flake: 98, Duration: 14:28
14:44jekstrand: Alright, sportsfans. I'm merging MSAA. There are still compiler bugs but I'll deal with those with the new compiler.
14:58karolherbst: sounds like a plan :)
14:58karolherbst: the sooner we get rid of codegen the better
16:36jekstrand: fifo: PBDMA0: 01000000  ch 33 [017f7ef000 deqp-vk] subc 0 mthd 0000 data 00000000
16:36jekstrand: Uh, what?
16:37karolherbst: yeah, that sometimes still happens, maybe submitted an empty buffer?
16:37jekstrand: I don't think so
16:38jekstrand: Yeah, no. That's not possible.