04:33imirkin: skeggsb: not sure if you saw scrollback, but emersion was looking to make flipping wait for fences on buffers
04:34imirkin: like the implicit ones from pushbuf submits
04:34imirkin: i saw that the display hw has semaphores, but it didn't look like we used them (at least i couldn't find it)
04:34imirkin: and also not sure if they'd be appropriate for this sort of usage
04:36skeggsb: they're sorta meant for that, and we used to use them too, i killed that temporarily when switching to atomic, until we don't risk stalling display forever when channels crash..
04:36imirkin: yeah, i was concerned about that too
04:37imirkin: we switched to atomic quite a while ago now...
04:37imirkin: anyways, i was hoping you'd have some pointers for him. coz i didn't :)
04:40fling: export GALLIUM_DRIVER=llvmpipe
04:40fling: export LIBGL_ALWAYS_SOFTWARE=true
04:40fling: my box crashes when I start X if I add these lines to my .xinitrc ^
04:40fling: X starts, I see the wallpaper being set up but it crashes soon after
04:40fling: somewhere around awesome exec
04:40fling: or should it be an awesome bug? :D
04:40skeggsb: there's not much to it really, except you can't really wait on the fences directly, disp doesn't have ACQUIRE_GE last i checked, so our counting fences don't work
04:41skeggsb: to have it fully done on the GPU you'd have to have a host channel do a semaphore acquire on it, then a release of the semaphore disp is waiting on
04:45skeggsb: there's other things i'd rather see there first in the area than implementing that again though (until we have more robust recovery), like using interrupts instead of polling for flip/modeset completion
04:45skeggsb: that's got some tricky aspects to it too, because the hw doesn't quite behave like you'd expect
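[Editor's sketch] The semaphore hand-off skeggsb describes above can be modelled in a few lines. This is a toy model, not nouveau code: the `Semaphore` class, `host_bridge`, and the sequence numbers are all hypothetical, counter wraparound is ignored, and it assumes (as the discussion implies but does not state outright) that display can only wait for an exact value. It shows why a counting fence cannot be waited on with an equality-only acquire, and how a host channel with a greater-or-equal acquire could bridge the two.

```python
from dataclasses import dataclass


@dataclass
class Semaphore:
    """A memory word that one engine writes and another polls (toy model)."""
    value: int = 0


def acquire_equal(sem: Semaphore, target: int) -> bool:
    # Passes only when the word holds exactly `target`.
    return sem.value == target


def acquire_ge(sem: Semaphore, target: int) -> bool:
    # Passes once the word has reached or passed `target` (wraparound ignored).
    return sem.value >= target


def host_bridge(fence: Semaphore, fence_seq: int,
                disp_sem: Semaphore, release_value: int) -> bool:
    # Hypothetical host-channel step: GE-acquire on the counting fence,
    # then release (write) the value that display is waiting for.
    if not acquire_ge(fence, fence_seq):
        return False
    disp_sem.value = release_value
    return True


fence = Semaphore()
disp_sem = Semaphore()

# Rendering for the flip was submitted as sequence 5, but by the time anyone
# looks, later submissions have already bumped the counter to 7.
fence.value = 7

direct_equal = acquire_equal(fence, 5)   # fails: the counter skipped past 5
direct_ge = acquire_ge(fence, 5)         # passes: 7 >= 5
bridged = host_bridge(fence, 5, disp_sem, release_value=1)
disp_ready = acquire_equal(disp_sem, 1)  # display's exact-value wait succeeds
```

The risk raised earlier in the log still applies to this scheme: if the host channel crashes before the release, nothing ever writes the value display is waiting for.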
06:43Zeem: I hope this is the right place for troubleshooting the driver
06:43Zeem: I've run into a problem switching from proprietary nvidia 340.108 driver to nouveau 1.0.16
06:43Zeem: I used to be able to run Underrail through Wine and Proton without issues, but after moving to nouveau severe artifacting starts after a couple minutes of running the game
06:43Zeem: There is an error message:
06:43Zeem: "nouveau: kernel rejected pushbuf: Cannot allocate memory
06:43Zeem: nouveau: ch11: krec 0 pushes 477 bufs 52 relocs 0"
06:43Zeem: Proceeded by what appears to be a buffer dump
06:43Zeem: I'm running Void Linux with kernel 5.9.16 and my GPU is NVIDIA GeForce G 105M
07:39imirkin: Zeem: that means you've run out of vram. nouveau's not very good at dealing with this.
07:39imirkin: sorry =/
07:41Zeem: I suspected as much
07:41Zeem: Any plans to improve this?
07:41imirkin: not really
07:45imirkin: (this isn't like a trivial thing to fix)
07:47imirkin: iirc that G 105M has like 128 or 256MB of VRAM, which isn't much
07:47Zeem: I understand, just wanted to know if it's something I can expect from Nouveau later or just stick to proprietary drivers and upgrade ASAP
07:49imirkin: skeggsb: well, i think emersion's focus is on making wlroots work better on nouveau, and making pageflip wait for fences is probably part of it
07:49imirkin: the specific mechanism for waiting probably doesn't matter to him though
07:51skeggsb: it already does
07:51imirkin: where does it do that?
07:51imirkin: it didn't seem to, in at least one of his tests
07:52skeggsb: wherever we call drm_atomic_helper_wait_for_fences(), we assign the fence in the atomic state from prepare_fb()
07:53imirkin: and that waits on the implicit fence that gets set when submitting a pushbuf?
07:54skeggsb: things would be a lot more wonky if we didn't do that...
07:54imirkin: hrmph ok. he was having issues with glFlush(), but it worked with glFinish()
07:55imirkin: would you expect there to be a difference between a pitch and blocklinear surface?
09:25emersion: yeah, maybe linear triggers it somehow. i'll try to hack weston to pick linear and see how it goes
09:29emersion: skeggsb: i've seen that we extract the fence, but where do we wait on it?
09:48emersion: ok, weston displays a black screen if i force linear
09:48emersion: so it definitely has something to do with linear
09:48emersion: and nothing to do with dmabufs/fbos
17:58imirkin: emersion: fwiw rendering to linear only works if there's no depth attachment
17:58imirkin: you'd see stuff in dmesg about stuff failing if you tried to do that
17:59emersion: ah, interesting
17:59imirkin: and we have asserts about this, although you wouldn't get that in a production build
17:59emersion: we don't attach depth though, so we're fine
17:59imirkin: yeah, figured you weren't doing too much depth testing :)
17:59imirkin: but thought i'd mention it.
18:00imirkin: [or stencil]
18:00imirkin: actually i could imagine you using stencil
18:00imirkin: emersion: btw, are you familiar with EXT_window_rectangles? that was supposed to be helpful for compositors.
18:00imirkin: (and implemented by nouveau)
18:02emersion: we don't use stencil, but we do use scissors
18:02emersion: ah, no, let me have a look
18:02imirkin: scissors aren't a problem
18:02imirkin: stencil is part of the "zeta" attachment, i.e. same thing as depth
18:03emersion: oh, it sounds like EXT_window_rectangles would be something like multi-scissor?
18:03imirkin: pretty much
18:03imirkin: also has include/exclude
18:03imirkin: i.e. anti-scissors :)
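[Editor's sketch] The "multi-scissor with include/exclude" idea can be illustrated with a small model of the per-fragment test. This is not the GL API itself (the extension's entry point is `glWindowRectanglesEXT` with `GL_INCLUSIVE_EXT` or `GL_EXCLUSIVE_EXT`); the function names and the example rectangles below are hypothetical, and boxes are assumed to be given as x, y, width, height in window coordinates.

```python
from typing import Sequence, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def in_box(px: int, py: int, box: Box) -> bool:
    # A pixel is inside when it lies in the half-open range on both axes.
    x, y, w, h = box
    return x <= px < x + w and y <= py < y + h


def fragment_passes(px: int, py: int, boxes: Sequence[Box],
                    inclusive: bool) -> bool:
    # Inclusive mode: keep fragments inside at least one rectangle.
    # Exclusive mode ("anti-scissors"): keep fragments outside all of them.
    inside_any = any(in_box(px, py, b) for b in boxes)
    return inside_any if inclusive else not inside_any


# Hypothetical damage regions a compositor might want to repaint.
damage = [(0, 0, 100, 100), (200, 50, 64, 64)]

kept = fragment_passes(10, 10, damage, inclusive=True)       # inside first box
dropped = fragment_passes(150, 10, damage, inclusive=True)   # between the boxes
masked = fragment_passes(10, 10, damage, inclusive=False)    # excluded region
```

For a compositor, inclusive mode maps naturally onto repainting only a set of damage rectangles in one draw, where a single scissor would need one draw per rectangle.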
18:03emersion: sounds pretty useful
18:04imirkin: anyways, it's implemented by nouveau (and NVIDIA)
18:04imirkin: not sure anyone else picked this up
18:04emersion: sounds like radeonsi has it as well
18:04imirkin: iirc AMD did have some sort of hw for it
18:04imirkin: wasn't sure if they implemented it - but cool if they did
18:05emersion:adds to todo-list
21:20Lyude: btw skeggsb - have you run any of the nouveau crc tests on ampere?
23:35Lyude: skeggsb: have you ever managed to get an evo gpu into a state where display commits suddenly start slowing down like crazy?
23:37imirkin: i've definitely seen some problems, after like 3-6mo of uptime
23:37imirkin: vsync randomly stops working
23:38Lyude: somehow I've managed to find a way with some of the igt tests I've been running on nouveau to get the gpu into this bizarre state where it looks like atomic commits suddenly start taking so much time between writing each method that we start hitting timeouts on page flips
23:39Lyude: i'm probably going to do some more troubleshooting to make sure it's not this zero csc stuff (which I am getting very close to just giving up on, in which case i'll just go through and make sure igt tests stop relying on being able to disable the primary plane)
23:39Lyude: since i've already managed to hit some very unusual bugs with it (I -thought- I fixed them, but I keep finding new ways to break things)
23:40imirkin: "zero csc"?
23:40imirkin: i.e. set the matrix elements to 0?
23:40Lyude: imirkin: for the core channel in order to make the core surface appear all black
23:40Lyude: it was the trick I mentioned nvidia gave me for "disabling" the base channel without turning off the head
23:40imirkin: well, it won't necessarily appear all black
23:41imirkin: but yeah, that's clever. i like it.
23:41imirkin: what's the issue?
23:41imirkin: i mean, you still have to give it a legit plane
23:41Lyude: it appears to in my testing (i've got some tests to compare it to just setting the base to black)
23:41imirkin: well, depends on the contents of the gamma lut
23:41imirkin: the input to the gamma lut will be 0
23:41imirkin: but the gamma lut could make that come out to whatever fixed color it wants
23:42imirkin: (frequently 0 does work out to black, of course.)
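[Editor's sketch] imirkin's point about the gamma LUT can be shown with a toy pipeline: a zeroed CSC matrix forces every pixel to 0 before the LUT, so the visible color is whatever the LUT holds at entry 0. This is only a model under stated assumptions: 8-bit channels, a 3x3 matrix, and one shared 256-entry LUT, none of which is claimed to match the real EVO hardware layout. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def apply_csc(matrix: Sequence[Sequence[int]], rgb: RGB) -> RGB:
    # Multiply the color by a 3x3 matrix and clamp each channel to 8 bits.
    out = [sum(m * c for m, c in zip(row, rgb)) for row in matrix]
    r, g, b = (min(255, max(0, v)) for v in out)
    return (r, g, b)


def apply_lut(lut: List[int], rgb: RGB) -> RGB:
    # Look each channel up in a shared 256-entry table.
    r, g, b = rgb
    return (lut[r], lut[g], lut[b])


ZERO_CSC = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
identity_lut = list(range(256))
inverting_lut = [255 - i for i in range(256)]

pixel = (200, 120, 30)              # arbitrary surface content
lut_input = apply_csc(ZERO_CSC, pixel)
black = apply_lut(identity_lut, lut_input)       # LUT[0] == 0, so black
not_black = apply_lut(inverting_lut, lut_input)  # LUT[0] == 255, so white
```

So the zero-CSC trick yields black only as long as entry 0 of whatever LUT sits downstream maps to 0.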
23:42Lyude: ahhh - we don't need to worry about that afaik, as this is for the core csc and not the base csc. I think the only possible situation we could end up with something in the core lut is with C8 surfaces
23:42imirkin: ah. and you're flipping the bit presumably?
23:42imirkin: for whether to use core or base csc?
23:43imirkin: iirc skeggsb said there was some funny synchronization between those
23:43Lyude: imirkin: yeah - we're actually already doing that
23:43Lyude: imirkin: that sounds very much like what I might be seeing here
23:43Lyude: i've already hit a bug where somehow it doesn't try enabling CRCs until the display commit after the one that would have enabled it
23:43imirkin: that's a very faint recollection though
23:45Lyude: tbh I might try getting rid of this trick anyway and just fixing igt because I have a feeling it's going to be tough to figure out if we can actually make this work properly in hardware (note that while nvidia gave us this trick, their driver currently doesn't use it supposedly) until we have a wider range of tests enabled so we can exercise all the different combinations of evo methods with this.
23:45Lyude: (you'd think that last part wouldn't matter, but it apparently does!)
23:56Lyude: part of me is starting to wonder if this is an issue with us not making sure non-blocking commits are being run with a high enough task priority
23:57Lyude: especially since I'm seeing times of upwards of multiple ms per individual method we're writing into the evo pushbuffer, which is just bizarre, and I can't really think of any sensible way that could happen otherwise
23:58Lyude: ...also because I can't seem to reproduce now the way I did last time