00:12 karolherbst: imirkin: :O
00:12 karolherbst: it works?!?
00:12 karolherbst: I mean...
00:12 karolherbst: for some definition of works
00:13 karolherbst: but the channel is up and running
00:13 karolherbst: the game is just... well.. not rendering as it should be ;D
00:13 karolherbst: apparently I have to replace the pushbuffer as well
00:14 karolherbst: imirkin: do I only have to ~0 dirty_3d and dirty_cp or also all the other dirty fields on a context?
00:14 karolherbst: I guess I have to...
00:14 imirkin: also the textures/etc
00:14 karolherbst: yeah...
00:14 imirkin: and ALLLSO careful with the graph state
00:14 karolherbst: why? can't I just set it to 0 and be done with it if everything is dirty?
00:15 karolherbst: it's a new channel
00:17 imirkin: the "global" graph state is outside of the dirty logic
00:17 karolherbst: ahh
00:17 imirkin: i forget what's in it
00:17 imirkin: but really you just need to set those global vars
00:17 imirkin: to whatever the "initial" value is
00:18 karolherbst: yeah
00:18 imirkin: and/or send down some commands which make the hw match whatever the graph state's values are
00:18 karolherbst: I just split out screen_create
00:18 karolherbst: but maybe I should set it to 0 as well
00:18 karolherbst: only patch_vertices is set to 3
00:21 imirkin: i guess better to make the graph state match up to what was already there
00:21 imirkin: it's mostly simple settings
00:21 imirkin: just expensive to change iirc?
00:21 karolherbst: no clue.. going for the simplest change for now
00:25 karolherbst: uhh
00:25 karolherbst: the dirty things are sized...
00:25 karolherbst: ewww
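The exchange above separates two things: per-context dirty masks, which can simply be set to all ones on a new channel, and the "global" graph state, which sits outside the dirty logic and has to be put back to its initial values. A minimal sketch of that split, assuming hypothetical `toy_*` types (only the names `dirty_3d`, `dirty_cp` and the initial patch vertices value of 3 come from the chat; the field widths and stage count are invented to illustrate the "sized" dirty fields):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_STAGES 6 /* hypothetical number of shader stages */

/* Hypothetical stand-in for the screen-global graph state, which is
 * not covered by the per-context dirty logic. */
struct toy_graph_state {
    uint32_t patch_vertices; /* initial value 3, as mentioned in the chat */
    uint32_t other_setting;  /* placeholder for any other global setting */
};

/* Hypothetical context with dirty masks of different widths. */
struct toy_context {
    uint32_t dirty_3d;
    uint32_t dirty_cp;
    uint32_t textures_dirty[TOY_STAGES];
    uint16_t constbuf_dirty[TOY_STAGES];
    uint8_t  images_dirty[TOY_STAGES];
};

/* Put the global state back to what a freshly created screen would have. */
static void toy_graph_state_init(struct toy_graph_state *gs)
{
    assert(gs != NULL);
    memset(gs, 0, sizeof(*gs));
    gs->patch_vertices = 3;
}

/* Mark every dirty bit so the next validation re-emits all state.
 * Setting every byte to 0xff handles each field width uniformly. */
static void toy_context_mark_all_dirty(struct toy_context *ctx)
{
    assert(ctx != NULL);
    memset(ctx, 0xff, sizeof(*ctx));
}
```

After a channel replacement, a driver following this pattern would call both helpers before the next draw; whether zeroing the rest of the global state is right for real hardware is exactly the open question in the conversation.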
00:59 Lyude: hey do we have anyone who runs opensuse here?
00:59 Lyude: i just noticed something strange. a friend of mine just did an out of the box install and they've got a gtx1060, but it appears that it's defaulting to llvmpipe
00:59 karolherbst: uhh
00:59 Lyude: yeah…
00:59 karolherbst: maybe firmware stuff?
01:00 Lyude: oh maybe
01:01 Lyude: oh
01:01 Lyude: https://www.phoronix.com/scan.php?page=news_item&px=openSUSE-42.2-RC2
01:01 Lyude: ok it is what i thought
01:02 Lyude: well that sucks
09:47 karolherbst: Lyude: oh wow
13:53 karolherbst: ehh
13:53 karolherbst: the kernel rejected the push restoring state...
14:52 karolherbst: okay...
14:52 karolherbst: I think I have something working now :)
14:53 felco: karolherbst there is a way to test your patch?
14:53 karolherbst: not yet
14:53 felco: ok
15:09 karolherbst: mhh.. but visually it is still broken :/
15:10 karolherbst: imirkin: I think we need better state tracking :/
15:10 karolherbst: we can't recover the same state with the info we store atm
15:10 imirkin: ok
15:10 imirkin: not completely surprising.
15:10 imirkin: since we've never done something like this before
15:10 karolherbst: yeah.. but not sure _what_ is missing :D
15:11 imirkin: we only track the state that changes between contexts, not screen-level state
15:11 imirkin: karolherbst: so like ... i assume you're running the "magic" init right?
15:11 karolherbst: sure
15:11 karolherbst: everything
15:12 karolherbst: imirkin: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e6a45701caae35f4f97c336f6d012af8749162c0
15:12 imirkin: sorry, i can't really look now
15:13 karolherbst: mhh but do we actually set so much state on a screen level?
15:13 karolherbst: ohh.. I think the issue is something else
15:13 imirkin: not _so_ much
15:13 imirkin: but some :)
15:13 karolherbst: sure, but bo contents are preserved
15:13 imirkin: yea, but like ...
15:14 imirkin: hm, cgit is down too
15:14 karolherbst: I am more wondering about things we set from gallium API calls
15:14 imirkin: yea, can't look now
15:14 karolherbst: without actually storing the stuff
15:16 karolherbst: let me see what I override in save_state
15:17 karolherbst: I think those num_* thingies should be left alone
15:24 karolherbst: maybe something else is up
15:24 karolherbst: it essentially just stops rendering
15:25 karolherbst: well
15:25 karolherbst: or stops displaying stuff
15:26 karolherbst: mhh probably shader related
16:06 karolherbst: imirkin: nvc0_render_condition is one example :/
16:07 karolherbst: but not sure if that is relevant here
16:08 karolherbst: we need a debugfs file to just kill a channel :D
16:09 karolherbst: nvc0_screen_bind_cb_3d is probably an issue as well
16:13 karolherbst: ehh vbo stuff as well. mhh
16:14 karolherbst: okay.. there is quite a bit which could get lost
20:28 karolherbst: ehhhhhh
20:30 karolherbst: imirkin: my current assumption is that stuff inside nvc0_surface.c could cause issues after a channel reset :/
20:31 karolherbst: but that's just copying stuff, no?
20:31 imirkin: and clear i think
20:31 imirkin: yea
20:31 karolherbst: mhh
20:31 imirkin: (and blit/etc)
20:31 karolherbst: wouldn't explain a stale output
20:31 karolherbst: so the problem is.. after the reset I get one frame and it stays like that forever
20:32 karolherbst: no errors
20:32 karolherbst: the vbo path is executed normally
20:33 karolherbst: is there something stupid which has to get sent in order to get any output at all? but the entire state is invalid after a reset ... :/
20:33 karolherbst: ehh
20:33 karolherbst: dirty
20:33 karolherbst: I hope...
20:35 karolherbst: ehh.. the game uses occlusion queries... maybe something with the query setup is broken?
20:35 karolherbst: but still
20:39 karolherbst: maybe I should set vbo_dirty to true as well
20:41 karolherbst: and uniform_buffer_bound to false :)
20:43 karolherbst: uhh.. this will be so annoying with MT fixed as well...
20:46 karolherbst: :O :O :O
20:46 karolherbst: imirkin: that was it!
20:46 karolherbst: and it works!!!
20:46 imirkin: w00t!
20:47 karolherbst: mhhh.. now it hangs..
20:47 karolherbst: but I was able to press "continue" and it just looked as if nothing had happened after recovery
20:48 karolherbst: ahh
20:48 karolherbst: nouveau_bo_wait blocks forever :)
20:48 karolherbst: I guess some details need to be figured out still
20:48 imirkin: that's a long time.
20:48 karolherbst: yeah..
20:49 karolherbst: but it rendered for like 20 seconds after recovery
20:49 karolherbst: and everything was correct
20:49 karolherbst: so.. .. I guess there are just bos we wait on for random reasons
20:49 karolherbst: I did signal all fences though
20:49 karolherbst: mhh
20:49 karolherbst: DRM_NOUVEAU_GEM_CPU_PREP returns -1
20:52 karolherbst: mhh it's a query
20:52 karolherbst: I guess I have to deal with those as well
20:52 karolherbst: and set them all to NVC0_HW_QUERY_STATE_READY
20:58 imirkin: it'll get set to READY after a bo_wait() at worst
20:58 imirkin: i think
20:59 karolherbst: sure.. but it stalls for several seconds for each call to nvc0_hw_get_query_result :/
20:59 karolherbst: and no
20:59 karolherbst: it returns false as the wait failed
21:00 karolherbst: but I also don't know what DRM_NOUVEAU_GEM_CPU_PREP actually does
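The idea floated above, forcing every outstanding query to a ready state after a reset so that fetching a result no longer waits on a buffer that will never be written, could look roughly like this. It is only a sketch: the `toy_*` names and state enum are hypothetical stand-ins (the chat names only `NVC0_HW_QUERY_STATE_READY`), and reporting 0 for a lost query is an assumption, not something the conversation settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query states; only the notion of READY comes from the chat. */
enum toy_query_state {
    TOY_QUERY_READY,
    TOY_QUERY_ACTIVE,
    TOY_QUERY_ENDED
};

struct toy_query {
    enum toy_query_state state;
    uint64_t result;
};

/* After a channel reset nothing will ever write results for in-flight
 * queries, so mark them ready with a zero result (an assumption). */
static void toy_queries_reset(struct toy_query *queries, size_t count)
{
    assert(queries != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (queries[i].state != TOY_QUERY_READY) {
            queries[i].result = 0;
            queries[i].state = TOY_QUERY_READY;
        }
    }
}

/* Non-blocking result fetch: returns false if the query is not ready. */
static bool toy_get_query_result(const struct toy_query *q, uint64_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->state != TOY_QUERY_READY)
        return false;
    *out = q->result;
    return true;
}
```

Queries that had already completed keep their results; only the ones lost with the channel are zeroed.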
21:07 karolherbst: it looks funny if the context gets killed every 2-3 seconds :D
21:23 karolherbst: [ 3933.712471] nouveau 0000:01:00.0: Xwayland[6393]: nv50cal_space: -16 fun
21:23 karolherbst: doesn't seem ultra reliable this channel recovery thing
21:24 imirkin: that means that the pushbuf ring got filled up
21:24 imirkin: submits going too fast
21:24 imirkin: or gpu stuck
21:24 imirkin: or
21:24 imirkin: ...
21:25 karolherbst: I should start to file gdb bugs
21:26 karolherbst: maybe I use this gen counter for fences as well
21:26 karolherbst: then I won't have to touch the entire list...
21:28 karolherbst: mhh
21:29 karolherbst: sometimes getting those "gr: SHADER a204020e, sph: 0x04020e, stage: 0x22" after recovery
21:31 karolherbst: but no idea what could be wrong with those shaders..
21:31 karolherbst: I guess depends on when the recovery happens
21:32 karolherbst: not fun when the recovery happens _while_ validating the state :)
21:33 karolherbst: on the other hand.. retrying to set the state might just crash the channel again
21:34 karolherbst: ehh.. happened during dri_flush and still those shader errors
21:34 karolherbst: mhh
21:34 karolherbst: imirkin: do you think I should reupload all shaders or something?
21:34 imirkin: you set the text page, yea?
21:34 karolherbst: yes
21:36 karolherbst: but there is also nothing of interest inside nvc0_screen_resize_text_area
21:37 karolherbst: besides setting the code page
22:03 karolherbst: ohh... stupid bug
23:25 karolherbst: skeggsb: I think we need to clear all reservation objects on bos that are part of a push buffer submission from killed channels
23:25 karolherbst: otherwise you can't wait on those bos, as nouveau_gem_ioctl_cpu_prep will time out
23:27 karolherbst: or whatever we have to do
23:30 karolherbst: ahh.. calling validate_fini_no_ticket essentially on "broken" submissions or so mhh
23:30 karolherbst: I think
23:31 karolherbst: that might also fix some clients doing bo_waits on broken channels