03:22 pmoreau: Are there any specific requirements before interacting with the flip mechanism (like flip_stop or flip_next)?
11:47 Karlton: "[17889.874618] nouveau E[ DRM] GPU lockup - switching to software fbcon" Can I fix this without rebooting?
11:48 imirkin: what are you trying to fix?
11:48 imirkin: no way to get hw fbcon back afaik
11:48 imirkin: but sw fbcon is just fine...
11:50 Karlton: imirkin: I am trimming an api trace but KDE is slow as hell now after I broke it while trying to launch a game xD
11:50 imirkin: ah ok. yeah, dunno... the gpu lockup recovery is variably successful
11:51 imirkin: you can try a suspend/resume cycle
11:51 specing: lol
11:51 specing: Karlton: you know what is funny?
11:51 buhman: gk106
11:51 specing: [676056.784498] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
11:52 imirkin: specing: are you using bumblebee?
11:52 imirkin: and/or manually futzing with acpi
11:53 specing: imirkin: nope
11:53 specing: faulty G84
11:53 imirkin: ah
11:54 specing: I'm going to see how long it lasts on nouveau
11:54 specing: currently at 21 days
11:54 specing: nvidia-drivers lasts ~59 days
11:56 Karlton: gawd, sw fbcon sucks!
11:57 imirkin: Karlton: why are you even using fbcon?
11:58 Karlton: imirkin: Isn't that where it goes by default after GPU lockup?
11:59 imirkin: Karlton: fbcon = the thing that drives vt's
12:01 Karlton: imirkin: Then what should I be using?
12:02 imirkin: i dunno... X comes to mind
12:02 Karlton: well I was before it died...
12:02 imirkin: so... restart it
12:03 specing: we don't just restart our computers
12:03 imirkin: specing: glad you understand :)
12:04 imirkin: up 64 days, 12:10
12:10 specing: up 65 days, 6:38
12:10 specing: har har
12:11 imirkin: i'll catch up soon!
12:13 imirkin: well that's just great. i managed to (accidentally) repro the "multiple buffers on list" issue
12:13 imirkin: mlankhorst: looks like it happens when i use vdpau and drag the window around
12:15 imirkin: mlankhorst: which is weird coz the thing that gets printed by nouveau as the rejected thing doesn't actually have the same buffer on there multiple times
12:15 imirkin: it complains about the last buffer on the list though
12:16 imirkin: mlankhorst: skeggsb: http://hastebin.com/ocuvocoxir.sm
12:16 imirkin: this is on 3.17
12:16 Karlton: imirkin: restarting X made everything freeze up, had to do a hard reboot
12:17 imirkin: libdrm 2.4.60
12:17 imirkin: Karlton: doh, i guess it didn't really recover
12:17 imirkin: Karlton: i think suspend/resume tends to work more often as a "fix it" tactic
12:19 imirkin: mlankhorst: skeggsb: looks like a new issue in 2.4.60 -- 2.4.59 works fine
12:20 imirkin: which has all the thread-safe fixes :p
12:20 imirkin: i'll bisect... but i kinda know where it's going to lead me...
12:25 imirkin: mlankhorst: yeah, as i suspected, commit 5ea6f1c3262888 is first bad
12:34 mlankhorst: uh oh o.O
12:37 mlankhorst: imirkin: can you try to reproduce with valgrind?
12:37 imirkin: yeah
12:38 imirkin: valgrind's all happy
12:38 imirkin: i can add debug things or valgrind flags or whatever you want... just let me know
12:38 mlankhorst: bleh i should try to get my debugfs patches upstream, would have made this easier to debug
12:38 mlankhorst: try with --free-fill=cc ?
12:40 imirkin: valgrind's all good
12:41 imirkin: looks like it's always the last buffer
12:41 imirkin: and also the 13th buffer, but that could just be mplayer-specific
12:41 imirkin: does it not repro for you?
12:41 mlankhorst: yeah no surprise there..
12:41 mlankhorst: I can only test on gk20a right now
12:41 imirkin: mplayer -vo vdpau -- just move it around
12:41 mlankhorst: which probably has no vdpau
12:41 imirkin: ah
12:41 imirkin: no, not on the chip anyways
12:41 imirkin: it has a separate video encoder/decoder thing
12:42 mlankhorst: it's always the last buffer because of a lifetime thing, probably..
12:43 mlankhorst: can you dump the BO in userspace?
12:43 imirkin: "the bo"?
12:43 imirkin: you mean the bo list?
12:44 mlankhorst: yeah
12:44 imirkin: which userspace? mesa before it submits? or in libdrm?
12:44 mlankhorst: libdrm
12:44 mlankhorst: hm..
12:44 imirkin: k, let's see if i can figure it out....
12:45 imirkin: know offhand which function i should be looking at?
12:45 mlankhorst: does the pushbuf ioctl unref somewhere?
12:48 imirkin: so i set NOUVEAU_LIBDRM_DEBUG=1
12:48 imirkin: which dumps all the things before submitting them
12:48 imirkin: and it was the same as the thing printed when it errors
12:50 mlankhorst: meh, can you check from lsof how many times card0 was opened?
12:51 imirkin: let's say i don't know how to do that...
12:54 mlankhorst: lsof /dev/dri/card*
12:55 imirkin: that just shows me who's using it
12:55 imirkin: not how many times it was opened
12:55 imirkin: right?
12:55 imirkin: anyways, right now i have 3 from X, and 1 plugin-container
12:56 imirkin: mplayer also opens it 3 times
12:56 imirkin: mlankhorst: http://hastebin.com/epitikoroy.vbs
12:56 imirkin: (X also has 1 "mem" and 2 numbered fd's)
13:01 mlankhorst: imirkin: eep, 2 fds. shouldn't be the case..
13:01 imirkin: mlankhorst: X does the same thing
13:01 mlankhorst: oh well then
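A minimal sketch of the distinction raised above: lsof shows which descriptors currently point at the device, not how many times it was opened. Assuming a Linux /proc filesystem, the same information can be read from /proc/<pid>/fd. The function name count_fds_for is hypothetical, and the result only counts descriptors that are open right now; duplicated or already-closed descriptors mean it is not a count of open() calls.

```python
import os

def count_fds_for(pid, target):
    """Count descriptors of process `pid` whose /proc symlink resolves to `target`."""
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to whatever the descriptor refers to.
            if os.readlink(os.path.join(fd_dir, name)) == target:
                count += 1
        except OSError:
            # The descriptor was closed between listing and reading it.
            continue
    return count

if __name__ == "__main__":
    # Example: how many descriptors in this process point at the first DRM card.
    print(count_fds_for(os.getpid(), "/dev/dri/card0"))
```

Running this against the pid of mplayer or X would give the per-process numbers quoted in the chat, under the same caveat about what is actually being counted.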
13:01 mlankhorst: imirkin: does glxinfo list vdpau_interop?
13:01 imirkin: not to say that X is the metric for perfection, but merely pointing it out :)
13:02 imirkin: probably. but mplayer doesn't use X
13:02 imirkin: er
13:02 imirkin: gl
13:02 imirkin: yeah, it's listed
13:02 mlankhorst: ok probably fine then
13:02 imirkin: this is just straight-up vdpau
13:03 imirkin: but it does open/close the device a few times for good luck
13:03 mlankhorst: imirkin: yeah but still should be fine, screen's shared.
13:03 mlankhorst: does it use any kind of multithreading?
13:03 imirkin: mesa 10.5.1 ftr
13:03 imirkin: i *highly* doubt it
13:04 imirkin: let's see what mpv does
13:04 imirkin: ah, i've uninstalled it.
13:04 imirkin: o well
13:05 imirkin: oh, but fwiw, windowmaker likes to do that crazy X call when moving windows around
13:05 mlankhorst: imirkin: hm, do pushbufs have refcounting for bo's on the validation list?
13:05 imirkin: XBlockClients or something
13:05 imirkin: i forget
13:05 imirkin: the one that destroys the universe for a short while
13:05 imirkin: pushbuf_refn?
13:06 imirkin: tbh i dunno too much about all those various details
13:10 joi: maybe mmt+demmt can answer this question?
13:13 imirkin: ooh, good idea
13:15 imirkin: how do i know which thing it errors on?
13:17 joi: if you add -e ioctl-raw you should see ioctl return values
13:24 joi: oh, actually, you don't need it - when ioctl returns error it's printed as "err: %d"
13:25 imirkin: joi: how are ioctl errors displayed in demmt?
13:26 joi: "err:" is appended to decoded text
13:26 imirkin: hmm
13:26 imirkin: not seeing that in the trace =/
13:26 imirkin: running with -e all
13:27 joi: https://github.com/envytools/envytools/blob/master/demmt/drm.c#L680
13:29 imirkin: joi: hm, well i guess it's never seeing the error
13:29 imirkin: which is odd, since both the kernel and libdrm definitely complain
13:29 joi: but you see "nouveau: kernel rejected pushbuf: Invalid argument" in dmesg when it happens?
13:30 joi: eh, not in dmesg of course
13:30 imirkin: that message is in the stdout/stderr of the app
13:30 joi: yeah
13:30 imirkin: dmesg has the validate failures
13:30 joi: hmm, you could catch stderr/stdout with mmt
13:31 joi: --mmt-trace-stdout-stderr
13:42 joi: imirkin_: on completely different topic, did you look into why shaders@glsl-fs-lots-of-tex does not pass?
13:42 imirkin: well, i guess we know the answer to that question -- no, i can't catch stderr/stdout
13:43 imirkin: display updates died somehow... nothing in dmesg =/
13:44 imirkin: i'll play with this some more later when my raid array rebuilds. gr.
13:45 buhman: needs more ssd
13:45 imirkin: meh, it's a bunch of 2T drives
13:46 RSpliet: yes, because SSD are so reliable that you never have to rebuild your RAID array
13:46 buhman: needs more magnetic tape
13:46 imirkin: they don't make SSD's that big in my price range
13:46 imirkin: (certainly not back when i bought this set up)
13:46 imirkin: magnetic tape also not as big (or reliable) as many people believe
13:46 imirkin: if you want a few T of data, the *cheapest* thing is a hdd
13:47 buhman: I wonder if a sufficiently large raid0 tape array would provide acceptable performance
13:47 buhman: or, volume group composed of tape drives, or something like that
13:51 imirkin: let me know when you try it out
13:51 imirkin: joi: re glsl-lots-of-tex -- the test is wrong
13:51 imirkin: joi: should just be removed
13:52 imirkin: coz with the CSE that the glsl ir does, it doesn't even test the thing it claims
13:53 joi: I'm wondering why it passes on intel
13:54 imirkin: coz it's not THAT wrong :)
13:54 imirkin: it's only a little wrong
13:54 imirkin: basically the intermediate value is like 2.55 / 256
13:54 imirkin: it's expecting that it gets rounded up to 3
13:55 imirkin: but nvidia rounds it down to 2
13:55 imirkin: (when storing to the RGBA8888 unorm surface)
13:55 imirkin: however this is totally unspecified behaviour
13:55 joi: ok
13:56 imirkin: (i might have the specific numbers wrong, but that's the general idea of why it fails)
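A minimal sketch of the rounding difference described above, assuming the float-to-unorm8 conversion is a scale by 255 followed by either round-to-nearest or truncation. The function names are hypothetical, the 2.55 / 256 value is the approximate figure quoted in the chat, and this does not claim to model what any particular GPU does.

```python
def to_unorm8_nearest(x):
    """Clamp to [0, 1], scale to 0..255, round to nearest."""
    x = min(max(x, 0.0), 1.0)
    return int(x * 255.0 + 0.5)

def to_unorm8_truncate(x):
    """Clamp to [0, 1], scale to 0..255, drop the fraction."""
    x = min(max(x, 0.0), 1.0)
    return int(x * 255.0)

if __name__ == "__main__":
    value = 2.55 / 256  # approximate intermediate value from the chat
    print(to_unorm8_nearest(value), to_unorm8_truncate(value))
```

With this value the two conversions disagree by one step, which is the kind of off-by-one a test with an exact expected color would trip over.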
14:55 whompy: imirkin: I saw in the notes on the y-tiling patch that you need a piglit run on an nv50. I haven't run much with piglit on nouveau yet, but seem to remember hearing about threading issues or some such.
14:56 imirkin: whompy: yeah, see http://people.freedesktop.org/~imirkin/ for how i run it
14:56 imirkin: whompy: if you could just try that one piglit test first, that'd be great
14:57 imirkin: apparently G80 has some sort of additional restrictions on 3d tiling that are gone in G84+
14:57 imirkin: difficult to bring myself to care though :)
14:58 imirkin: maybe if i found one i'd care more...
15:03 whompy: Which test is it that you would like to see?
15:04 imirkin: whompy: bin/texelFetch fs sampler3D 98x129x1-98x129x9 -auto -fbo
15:04 whompy: Ok, thanks!
15:04 imirkin: should fail on master, pass with my change
15:04 imirkin: if it still fails, pastebin the text output and i'll have more questions for you :)
15:23 whompy: imirkin: fails before, passes after on nv50 as intended.:)
15:23 imirkin: whompy: awesome, thanks for checking
15:24 whompy: No problem. Gave me an excuse to fix my piglit setup on here anyway.
15:24 imirkin: this was with a nva5 right?
15:25 whompy: Yep
15:35 imirkin: calim: thoughts on http://lists.freedesktop.org/archives/nouveau/2015-April/020449.html ? intuitively makes sense since we only had one tile mode to texture setup... but not sure why you had that limit in the first place
16:39 imirkin: glennk: do you know if it's common to have 3d textures whose height is > 16 * 2^depth
16:39 glennk: you mean z rather than height?
16:40 imirkin: no, i mean height :)
16:40 imirkin: y
16:40 imirkin: basically there was an issue in the minification of 3d textures
16:41 imirkin: s.t. if you ended up with a miplevel whose depth was 1 but height > 32, then fail.
16:41 imirkin: wondering if this can happen in real life, or only piglit
16:41 imirkin: [height > 16 on nv50]
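A minimal sketch of how to check whether a given 3D texture hits the case described above, assuming the usual mipmap rule of halving each dimension with a floor of 1. The helper names are hypothetical; the limits of 16 (nv50) and 32 are the ones quoted in the chat, and the actual tile-mode selection in the driver is not modeled here.

```python
def mip_chain(width, height, depth):
    """Return (w, h, d) for every mip level, halving each dimension down to 1."""
    levels = []
    w, h, d = width, height, depth
    while True:
        levels.append((w, h, d))
        if w == 1 and h == 1 and d == 1:
            break
        w, h, d = max(1, w // 2), max(1, h // 2), max(1, d // 2)
    return levels

def risky_levels(width, height, depth, height_limit):
    """List (level, dims) where depth has collapsed to 1 but height exceeds the limit."""
    return [
        (level, dims)
        for level, dims in enumerate(mip_chain(width, height, depth))
        if dims[2] == 1 and dims[1] > height_limit
    ]

if __name__ == "__main__":
    # A 98x129 base with depth 4, one of the sizes in the piglit range above.
    print(risky_levels(98, 129, 4, 16))
    print(risky_levels(98, 129, 4, 32))
```

For a 98x129x4 texture the depth reaches 1 at level 2, where the height is still 32, so it is flagged with the limit of 16 but not with the limit of 32.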
16:43 glennk: only thing i can think of are some fluid simulations with only a few z layers, but more detail in x/y
16:44 imirkin: like water in some random game?
16:45 glennk: those tend to be just flat 2d
16:46 imirkin: what's a fluid simulation then?
16:47 glennk: this one comes to mind http://www.nvidia.com/coolstuff/demos#!/box-of-smoke
16:48 glennk: hmm, or cloud rendering
16:48 imirkin: ah heh ok
16:48 imirkin: so like fog
16:53 calim: imirkin: it would be nicer if you moved that condition into nv50_tex_choose_tile_dims, just pass level0depth to it ...
16:54 calim: or some such parameter
16:54 calim: nice bug catch btw.
16:55 calim: eeeehlegance :)
16:55 imirkin: calim: hm, yeah, that makes sense too
16:55 calim: the limit is what the blob does (or did), I figured it's a performance optimization so I copied it
16:56 imirkin: ah ok. it (sorta) makes sense... good with keeping it
16:57 imirkin: that last param will probably be more like layout_3d && pt->depth0 > 1