00:00 danvet: https://imgflip.com/i/6z1n7l
00:00 danvet: more like old classic I think
00:00 karolherbst: definitely
00:02 danvet: https://imgflip.com/i/6z1niu
00:03 danvet: https://imgflip.com/i/6z1nlz
00:04 danvet: https://imgflip.com/i/6z1npe
00:04 danvet: I guess I should go sleep now, before I get banned here too :-P
00:05 karolherbst: :D
00:05 karolherbst: scared?
00:05 karolherbst: you were kind of off topic here, ngl
00:05 danvet: https://imgflip.com/i/6z1nvn
00:06 danvet: yeah if I stay banned for much longer we might need that fdo-memes bridge indeed
02:27 airlied: dakr: need to also work out a plan for syncobjs when the channel crashes
02:27 airlied: currently we get stuck for ever waiting
02:31 dakr: airlied: yeah, guess you ran into such a case?
02:32 airlied: dakr: yeah a lot of vulkan tests cause some faults :-)
02:33 airlied: dakr: it should be easy enough to write a test, execute a copy without binding a vm for one of the addrs
02:55 dakr: airlied: got it reproduced...
02:55 dakr: the fence that's attached to the syncobj actually gets signaled if the channel fails..
02:57 dakr: the issue is that for subsequent exec jobs the channel is already dead and we run into a NULL pointer dereference in nv84_fence_emit() and then the syncobj of the subsequent exec job stalls.
03:00 airlied: dakr: ah makes sense
03:00 dakr: So, when the channel is killed we should probably remove all subsequent jobs from the queue.
03:01 airlied: and refuse to process any
03:24 dakr: airlied: yeah, probably enough to just refuse the exec ioctl?
03:25 dakr: returning -ENODEV
03:35 airlied: dakr: if (unlikely(atomic_read(&chan->killed)))
03:35 airlied: return nouveau_abi16_put(abi16, -ENODEV);
03:35 airlied: yeah just taking that from the pushbuf ioctl should be good
03:46 Wally: Does nvidia hw (tu1xx+) require any PCI config register setting or any of the sort before being able to map one of the PCI bars?
04:14 dakr: airlied: pushed the patch..
04:50 airlied: jekstrand, karolherbst: do we expect a VkDevice will have its own channel and vm?
04:52 airlied: I suppose we do create a new context for each vkdevice
04:57 airlied: dakr: btw you got unlocked and locked mixed up :-)
04:57 airlied: unlocked is the one that has to take the lock
05:01 dakr: airlied: oops, just the naming though, functionally it's fine. :-)
05:09 airlied: yeah I just reworked the nouveau_uvmm wrapper here and got confused :-)
05:15 dakr: airlied: fixed it.
05:16 airlied:has a test getting deadlocked somehow, just rebuilding my kernel with lockdep
05:26 airlied: dakr: https://paste.centos.org/view/raw/2625963e if you want to wonder why locking is messy :-P
05:54 dakr: airlied: yeah, that looks messy. :/ though not related to the new uapi. :)
05:55 airlied: the problem I'm getting is unfortunately related to it, but happens after that one so lockdep won't catch it
05:57 dakr: airlied, can you get the stack of the hanging processes?
06:01 airlied: dakr: multithreaded, one is hung getting one of the locks in uvmm
06:01 airlied: https://paste.centos.org/view/f3e35c84 is the kernel traces from each thread
06:03 airlied: so one thread is waiting for a reservation while holding the uvmm lock
06:03 airlied: not sure why it never gets it
06:07 airlied: hmm I wonder do we need to have the object reserved to destroy the vma
06:10 dakr: airlied: yeah, nouveau_gem_object_close() takes the reservation and then waits for the uvmm lock, while the job has the uvmm lock and waits for the reservation.
06:10 dakr: that's bad.
06:10 airlied: yeah definitely seems to be an ordering violation
06:11 airlied: moving the uvmm teardown outside the resv seems to fix it, but I'd have to convince myself that was correct
06:21 dakr: airlied, that's probably fine I guess.
06:25 airlied: dakr: another one https://paste.centos.org/view/raw/df0fe16b
06:28 dakr: airlied, yeah, that one was by mistake..
06:29 dakr: gonna look into both tomorrow, it's already after 7am :D
06:30 airlied: dakr: no worries, go sleep!
06:34 airlied: hmmm need to consider if it's okay to call vma map twice with same params and just ignore the second one
06:34 airlied: as vulkan allows aliasing images over the same memory range
06:35 airlied: though not sure wtf we should do if you alias an image and buffer and have a ->kind set
07:04 fdobridge: <L​aughingMan> karolherbst: danvet: airlied: regarding those discord bans and their speculated remedies: i've never had a problem with being banned. no twitter/steam/etc are linked, no 2fa, no oauth provider. always accessing discord via their website in firefox with ublock, umatrix, and an enabled tracking protection.
07:42 danvet: LaughingMan ... huh ...
07:49 danvet: I'm still blocked and no answers :-/
08:04 fdobridge: <j​ekstrand> airlied: Yes, each VkDevice should have its own VM. Maybe a channel per queue, maybe a channel per device. IDK yet.
08:05 airlied: we have a channel per device, but shared vma at the moment
08:06 airlied:is now trying to work out who should take care of aliasing, and if it's possible ->kind can be different which would be not a good thing
08:07 airlied: not sure if we should just allow the kernel to accept binds that overlap a previous bind exactly
08:07 airlied: but also have some tests that do an image in 4096 and an 8192 buffer from the one memory object
08:07 airlied: so that might require extending a vm allocation
08:08 airlied: tomorrow's problem
08:35 jekstrand: airlied: We very much need to support two different VA bindings with different kinds pointing at the same BO range.
08:35 jekstrand: s/VA bindings/VA ranges/
10:16 airlied: jekstrand: yeah i think i might have the wrong model in the vk driver right now, will dig more tomorrow
14:44 jekstrand: airlied: Probably
14:44 jekstrand: Pass: 223920, Fail: 472, Crash: 339, Warn: 4, Skip: 1335730, Flake: 98, Duration: 14:28
14:44 jekstrand: Alright, sportsfans. I'm merging MSAA. There are still compiler bugs but I'll deal with those with the new compiler.
14:58 karolherbst: sounds like a plan :)
14:58 karolherbst: the sooner we get rid of codegen the better
16:36 jekstrand: fifo: PBDMA0: 01000000 [] ch 33 [017f7ef000 deqp-vk[148565]] subc 0 mthd 0000 data 00000000
16:36 jekstrand: Uh, what?
16:37 karolherbst: :(
16:37 karolherbst: yeah, that sometimes still happens, maybe submitted an empty buffer?
16:37 jekstrand: I don't think so
16:38 jekstrand: Yeah, no. That's not possible.
16:38 karolherbst: weird...