09:53imirkin: mogorva: i'll try to take a look... i suspect the STACK_UNDERFLOW issue is a compiler bug. however it may be unrelated to the hangs... those are due to "PGRAPH TLB flush idle timeout fail", which is basically the GPU saying "goodbye"
09:57imirkin: mogorva: btw, you're having *disproportionate* trouble with nouveau compared to most users... not to say that it works perfectly for everyone, but most people just have one bug and then they're good. you seem to be hitting issues in every game you play :(
09:58imirkin: [but in the process helping improve nouveau for everyone, so i appreciate the reports!]
10:54mogorva: imirkin: Hi-ho! I'm back.. Maybe people who have problems with open source drivers are just reluctant to report the bugs they encounter, and they switch over to the binary drivers or use Windows if they want to play games.
10:55mogorva: Just have a look at this log I gathered during a short session. I just tried a few games that run natively on Linux. At least 3 of them generated bogus messages in the log: http://pastebin.com/1igpPpYv
10:56mogorva: Yet my system didn't crash, I was able to reboot when I was fed up
10:57imirkin: INVALID_OPCODE at 0742c8 warp 0, opcode e0990c05 e0990c05
10:57imirkin: well that opcode sure is invalid!
10:59imirkin: some of those stack mismatches are weird too
10:59imirkin: so like it makes sense for something like "f0000001 e0000002"
10:59imirkin: but not the other ones
11:00imirkin: oh... i bet we report the current opcode for all MP's not just the faulting ones
11:01imirkin: anyways, i'll try to have a look later
11:01mogorva: with casual desktop application (browser, video playback) nouveau is pretty stable here, it's just the games that I have problems with
11:03mogorva: there's a very common problem in almost every game I tried whether they are Windows games under Wine or native games: after a few minutes I see some random polygons flashing on the screen. They pop up at random locations on the screen for 0.5 seconds then they disappear
11:07mogorva: so far I bumped into only a handful of games that completely bring my system down. For example I tried the trace from https://bugs.freedesktop.org/show_bug.cgi?id=90513
11:08mogorva: when replaying the trace my system freezes in the first 1-2 minutes
11:31mogorva: imirkin: by the way, it was one of the d3d gurus from Wine who suggested that I report bugs in nouveau/Mesa
11:31mogorva: imirkin: quote from him: "Unrelated, but what's mostly interesting with Nouveau is the Mesa version installed. There have been a few significant bug fixes for your GPU family (NV50) in recent releases and I know Ilia Mirkin would love to get more bug reports and fix bugs :)"
11:53shakesoda: mogorva: I've run into some bugs and it's really unclear where to even go about them
11:53shakesoda: especially since I don't have lots of test hw
11:54shakesoda: latest example: blender is screwed up on nouveau (on NVA5 at least). selections are wonky in edit mode.
11:54shakesoda: last week: cities: skylines takes out X
11:55shakesoda: was going to say something about it after getting home later today or tomorrow
11:55mogorva: shakesoda: have you opened a bug report yet?
11:56shakesoda: mogorva: not yet, no
11:57mogorva: is it the native linux version of blender that you have problems with ?
11:59mogorva: shakesoda: i can give it a try on my system
12:01shakesoda: mogorva: default cube -> edit mode (solid display mode) -> should have visible triangles which shouldn't be there and selection will work incorrectly if you go for certain places.
12:02mogorva: huh..i'm not familiar with blender, the interface looks a bit complicated :)
12:03shakesoda: mogorva: assuming all default settings, just press tab and right click in some places on the cube
12:03shakesoda: (make sure mouse is in the center to begin with)
12:03shakesoda: also, I tested with blender 2.74 / latest stable. if you're using a repository version it is likely older.
12:06mogorva: shakesoda: this is what I see in blender 2.74 -> http://i.imgur.com/M0nB2Xi.png
12:07shakesoda: mogorva: what'd it look like before moving anything?
12:07shakesoda: broken looks like this: http://i.imgur.com/jkH2Abp.png
12:08shakesoda: selection in the darker regions is what gets weird.
12:10shakesoda: also I should have specified, it's face selection that gets messed up (ctrl+tab or the button with the highlighted polygon on the cube)
12:10mogorva: the starting screen here: http://imgur.com/gbcnz1C
12:11shakesoda: http://i.imgur.com/HRUeXkw.png bad selection in action, life-ring looking thing is where I clicked, highlighted polygon is what got selected.
12:15pmoreau: shakesoda: Same behaviour on my NVAC
12:16pmoreau: Not sure if I would blame Nouveau for that though, at least not for the incorrect selection
12:17shakesoda: pmoreau: where to place the blame in matters like this gets hazy really fast :)
12:17shakesoda: (especially for people like myself who aren't driver developers, I guess)
12:19pmoreau: I agree :-)
12:19shakesoda: it doesn't happen with llvmpipe or on nvidia blobs, at the least, which leaves... gallium or nouveau?
12:19pmoreau: It's way more obvious if you get tons of spam from Nouveau in dmesg :D
12:19pmoreau: Oh, ok
12:20shakesoda: yeah, dmesg spam makes things nice and clear.
13:20imirkin: shakesoda: so the basic deal is that i don't really have time to learn every program and futz with it. however if you make an apitrace that demonstrates the issue, and you include an explanation of what the thing ought to look like, i can take a look
13:21imirkin: dmesg spam is often incredibly unclear, but every so often it's useful
13:22shakesoda: imirkin: I should be able to make a good and bad trace when I get home and have a bit of time.
13:24imirkin: well ideally there's just one trace
13:24imirkin: that renders incorrectly on nouveau and correctly on... not-nouveau
13:25imirkin: wait, can you explain what's wrong?
13:25imirkin: i might know what the issue is... nv50 doesn't support edgeflags, which iirc blender uses
13:26imirkin: (edgeflags specify which lines get drawn when the polyfill mode is "lines")
13:28imirkin: or is the issue the shading of the polygons, that they look darker than they should? that could be a much simpler error
13:29shakesoda: I think it's two issues, the shading only happens in one draw mode (solid) but the selection issue happens in all of them.
13:30shakesoda: *the shading problem
13:30imirkin: ok. well, make the trace, and then try to replay it with LIBGL_ALWAYS_SOFTWARE=1
13:30imirkin: if it renders properly with that, just let me know, and i'll be able to easily compare
13:30imirkin: if it also renders improperly there, try to describe the issues
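The hardware-versus-software comparison described above can be scripted. This is a minimal sketch, assuming the `apitrace` CLI is installed and that `apitrace replay` is the replay subcommand in the installed version; `replay_command` and the trace filename are hypothetical names used only for illustration.

```python
import os


def replay_command(trace_path, software=False, base_env=None):
    """Return (argv, env) for replaying a trace, optionally on Mesa's software renderer."""
    env = dict(os.environ if base_env is None else base_env)
    if software:
        # Mesa reads this variable and uses a software rasterizer instead of the GPU driver.
        env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    else:
        # Make sure a stale setting does not leak into the hardware run.
        env.pop("LIBGL_ALWAYS_SOFTWARE", None)
    return ["apitrace", "replay", trace_path], env


# Hypothetical usage (not executed here):
#   import subprocess
#   argv, env = replay_command("blender.trace", software=True)
#   subprocess.run(argv, env=env, check=True)
```

Running the same trace once with `software=True` and once with `software=False` gives the "renders correctly on not-nouveau" reference that makes a single trace sufficient.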
13:31imirkin: nv50 (and nvc0) also has an issue if you set the shade model to flat, but then set some of the colors to smooth in the glsl.
13:32glennk: imirkin, looks to me like selection might be using depth readback
13:33imirkin: all i know about blender is that if it's an outdated way to do something, it does it.
13:36shakesoda: trace replays properly with LIBGL_ALWAYS_SOFTWARE set
13:36imirkin: awesome. put it up somewhere, i'll take a look
13:37shakesoda: I was looking at blender's drawing code a while back and it's really nasty and outdated, although things are finally getting a bit more modern... slowly.
13:38imirkin: if i can repro on nvc0, things will get fixed much easier than if it's nv50-only problems
13:38imirkin: i should probably plug a nv50 in instead of the nv40 that i have now
13:39shakesoda: it's at least confirmed on NVAC, too
13:39imirkin: yeah, that's all nv50-era
13:39shakesoda: oh, I see.
13:39imirkin: those are tesla family. nvc0 is fermi.
13:40imirkin: you mentioned you have a nva5? that actually has some features nvac doesn't... it's DX10.1 while nvac is DX10
13:45shakesoda: https://dl.dropboxusercontent.com/u/1960937/blender-selection-bug.trace here you go. frame 44/45 should be the ones with the bad behavior for selection, 34/35 are the first with bad display
13:58imirkin: hm, sad. looks to get drawn just fine on my nvc0
14:03imirkin: shakesoda: can you put up what your frame 45 looks like?
14:22shakesoda: imirkin: https://dl.dropboxusercontent.com/u/1960937/blender.trace_frame_827651-1_0.png
14:22imirkin: ok cool, thanks
14:22imirkin: so... looks like it's drawing stuff with GL_SHADE_MODEL flat
14:23Yoshimo: skeggsb: What is this big thing Phoronix mentions that you didn't get done in time for the last kernel merge window?
14:23imirkin: glennk: is there any way to specify a color varying to be smooth without using glsl (which blender doesn't appear to use)?
14:23shakesoda: imirkin: note that 45 was a good selection, 49 is bad (my bad, it's the same thing except I had the polygon on the left side)
14:24shakesoda: display is wrong either way, though
14:25glennk: imirkin, blender uses glShadeModel(GL_FLAT) and GL_SMOOTH
14:25glennk: and it likes drawing stuff where thats the only state that changes
14:25glennk: had a r600 bug where it failed to detect that
14:28glennk: see commit 0fb221
14:30imirkin: well, nv50/nvc0 handle the shade model incorrectly
14:30imirkin: and in different ways, unfortunately
14:30imirkin: it seems like this bug could be accounted for by a shade model handling issue =/
14:31imirkin: unfortunately there's no good way to fix it... have to introduce shader variants and do it all "by hand"
14:31imirkin: and nouveau doesn't currently do shader variants at all
14:31glennk: hmm, thought nv had hardware support for this on older chips at least
14:32imirkin: yeah, it does. but it doesn't handle some of the stupidity added by GL3
14:32imirkin: i.e. if you want to flat-shade, you can. but then you can't change your mind and enable smooth shading for some things.
14:33glennk: you mean its all done in the shader?
14:33imirkin: the hw just disables interpolation of the colors
14:33imirkin: so there's nothing a shader can do to get it back
14:33glennk: oh that
14:33glennk: not an issue for blender at least
14:34imirkin: yeah i guess not
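The difference between the two shade models discussed above can be illustrated with a toy model. This is only a sketch of the general concept, not of how nv50/nvc0 hardware or the nouveau driver implements it; `shade` is a hypothetical helper, it ignores perspective correction, and it assumes the last vertex is the provoking vertex (OpenGL's default convention).

```python
def shade(vertex_colors, bary, flat):
    """Colour at a point inside a triangle.

    vertex_colors: three (r, g, b) tuples, one per vertex.
    bary: barycentric weights of the point, summing to 1.
    """
    if flat:
        # Flat shading: interpolation is off, so every fragment takes a
        # single vertex's colour (assumed here: the last vertex).
        return vertex_colors[-1]
    # Smooth shading: weight each vertex colour by its barycentric coordinate.
    return tuple(
        sum(w * c[i] for w, c in zip(bary, vertex_colors)) for i in range(3)
    )
```

Once interpolation is disabled at this level, nothing downstream can recover the per-vertex blend, which is the limitation being described for shaders that want some colors smooth while the shade model is flat.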
14:34imirkin: the varying setup on nv50 is quite confusing in its own right
14:34glennk: ditto on r600
14:34imirkin: and quite amusingly, there are differences between the original G80 and G84+. i.e. it's the same things but i think they're indexed differently.
14:35imirkin: haven't quite worked it out though
14:35imirkin: and it's *really* hard to care about the G80
15:33imirkin: shakesoda: hey, can you load it up in qapitrace and do a draw @47769 and @47786 (you can use ^G) and check that in the first one you have a cube from just the lines, and that the cube shading in the second one is already messed up
15:36nfk: [ 825.065851] nouveau E[vlc] push 1 buffer not in list
15:36nfk: [ 825.067989] vlc: segfault at 7fd800000000 ip 00007fb3d2251ef1 sp 00007fb3a214b8c8 error 4 in libdrm_nouveau.so.2.0.0[7fb3d224f000+6000]
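A kernel segfault line like the one above carries enough information to compute where in the library the fault happened. A minimal sketch, assuming the usual `ip ... in lib[base+size]` layout of the message; `fault_offset` is a hypothetical helper, and the offset only maps directly onto the file if the reported mapping starts at the beginning of the library.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, offset of the faulting instruction in its mapping), or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip, base, size = (int(m.group(k), 16) for k in ("ip", "base", "size"))
    offset = ip - base
    if not 0 <= offset < size:
        # The instruction pointer is not inside the reported mapping.
        return None
    return m.group("lib"), offset
```

With debug symbols installed, the resulting offset could be passed to a tool such as `addr2line` to find the source line.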
15:37imirkin: nfk: i pushed a change which might help with that...
15:37nfk: if it's known, then that's okay, i was just testing color management, normally i do not touch vlc
15:37nfk: and my moms watches her dvds on the laptop which has radeon pos
15:37imirkin: nfk: well, i dunno if *that* one is known, but i fixed an issue which presented in a similar way
15:38imirkin: nfk: http://cgit.freedesktop.org/mesa/mesa/commit/?id=78d58e642549fbf340fdb4fca06720d2891216a8
15:38nfk: nothing else but vlc has given me that error yet
15:38imirkin: they'd have to be using an indirect draw or doing something *very* odd with queries
15:38nfk: vlc and getting it right, hell would thaw sooner than that
15:39imirkin: is this a nv50 or nvc0+?
15:39nfk: nvc3, iirc
15:39imirkin: k, so that would have indirect draw
15:41nfk: [ 4.314478] nouveau [ CLK][0000:01:00.0] 0f: core 850 MHz memory 1900 MHz // has my memory always been that fast?
15:41imirkin: it never has. that's the highest perf level, if perf level switching were available in nouveau for your gpu. it isn't.
15:43imirkin: shakesoda: either way, please file this issue at bugs.freedesktop.org Xorg -> DRI/nouveau (iirc)
15:43nfk: i just suddenly feel that 1900 MHz is kinda out of place. i'm sure it's normal, but it's too much effort to find the box, which is probably in the cupboard behind me under and behind some junk
15:43imirkin: GDDR5 is fast
15:44nfk: yeah, i was hoping for some GDDR5 RAM the other year but we're still stuck at DDR3 for most mobos
15:44nfk: and those 32 GB DDR4 kits are also hella expensive
15:45imirkin: hm, well looking at the vlc source, i don't see it doing any indirect drawing
15:45nfk: in fact, why does my laptop from 6 years ago have about as much RAM and resolution (albeit shitty TN panel) as today's commoner laptops?
15:47imirkin: and it doesn't appear to do anything with TF, so i dunno if my changes would fix anything
15:47imirkin: otoh i'm not too familiar with vlc, so who knows
15:48imirkin: i also threw a debugging patch into libdrm to prevent that sort of fail, so if you build libdrm from git with debugging, it should assert-fail
15:53it_tard: imirkin, it worked the second time albeit it went black in few frames in fullscreen and in windowed mode performance was super bad but at least i could see it was using vdpau (yes, i have the firmware loaded) which probably means via the va-api compatibility and then it hung x11 so well that i had to SysRq+REISUB
15:54nfk: ah, time to do some exorcism
15:55imirkin: nfk: it might be helpful to do the debug patch to libdrm which should cause it to assert
15:55imirkin: rather than do things with the bad pointer
15:55nfk: how annoying is that on gentoo
15:56nfk: as you might imagine, i don't care much about vlc so my motivation is not that high
15:56nfk: in fact, it will probably go away if i reconfigure vlc not to use vdpau/vaapi
15:58imirkin: nfk: probably. i'm pretty sure asserts should actually be enabled by default
15:58imirkin: i would advise just checking out the drm tree, building it with --prefix=/home/user/install
15:58imirkin: and then running with LD_LIBRARY_PATH=/home/user/install/lib
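The "install to a private prefix and point the loader at it" approach can be sketched as follows. This assumes a Linux dynamic loader that honors `LD_LIBRARY_PATH`; `env_with_local_libs` is a hypothetical helper, and the prefix is the example path from the conversation.

```python
import os


def env_with_local_libs(prefix, base_env=None):
    """Return an environment whose LD_LIBRARY_PATH searches <prefix>/lib first."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(prefix, "lib")
    existing = env.get("LD_LIBRARY_PATH")
    # Prepend so the locally built libdrm is found before the system copy.
    env["LD_LIBRARY_PATH"] = (
        lib_dir if not existing else lib_dir + os.pathsep + existing
    )
    return env


# Hypothetical usage (not executed here):
#   import subprocess
#   subprocess.run(["vlc"], env=env_with_local_libs("/home/user/install"))
```

Nothing system-wide changes this way, so the distribution's libdrm stays in place for every other program.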
16:00nfk: imirkin, but how will that differ from gentoo version then?
16:00imirkin: nfk: it will have my patch, which i pushed to that repo
16:00nfk: ah, nouveau drm?
16:00imirkin: well, libdrm_nouveau.so yea
16:00imirkin: but it's all part of libdrm
16:01nfk: so why isn't normal libdrm enough then?
16:01nfk: or you mean your patch for that issue??
16:02imirkin: i mean my patch should detect misuse of the libdrm api
16:03imirkin: i just pushed it, it's not part of a released version
16:04imirkin: shakesoda: btw, one thing to test out is to comment out the first if() in nv50_fp_linkage_validate [which tries to bail early if the twosided-lighting setting hasn't changed]
16:11nfk: imirkin, i decided i don't care about vlc enough to bother debugging this, sorry
16:11nfk: though if it shows up elsewhere, i'll be glad to get off my fat arse and debug it properly
17:11koz_: Hi all!
17:11koz_: Was wondering if anything's happening with respect to the bug I listed about the core speed not reclocking properly.
17:12imirkin: which bug?
17:13koz_: imirkin: I posted it quite a while ago. I have an asus geforce gtx 650 (IIRC), and when I try to reclock it, the *memory* reclocks just fine, but the core frequency stays put.
17:13koz_: Because of a voltage issue or something.
17:13imirkin: ah ok. no progress afaik.
17:14koz_: imirkin: Sadface. Might just have to buy another card.
17:14koz_: Because I really don't wanna run the blob.
17:14koz_: What cards reclock well, in your experience?
17:15imirkin: sorry, no personal experience with it
17:16koz_: imirkin: Would anyone on this channel know more?
17:17imirkin: mmmaybe... the thing is that it's extremely board-dependent
17:17imirkin: every manufacturer uses diff settings/etc...
17:18imirkin: i have a GK208 which reclocks just fine iirc... came in a dell, doubt it could be bought easily in a store though. that's my only kepler.
17:19imirkin: it's a chip... things like "GTX 650" aren't particularly specific.
17:19koz_: imirkin: Do you game on it much?
17:19imirkin: there are like 20 diff model names that have the same chip... and oftentimes a single model name can be multiple diff chips
17:19imirkin: i don't game at all
17:20koz_: imirkin: Ah.
17:20imirkin: anyways, it's a pretty weak card
17:20koz_: That's the main reason why it's not enough - for normal use, my card works just fine.
17:21koz_: I guess I might have to bite the bullet and try a 760.
17:21imirkin: koz_: skeggsb is the one who knows about all this stuff, but he can be pretty busy
17:22koz_: imirkin: Thanks. Is there a time he's usually available on this channel?
17:22imirkin: he's in AU, so... daytime there, but generally not on weekends
17:22koz_: imirkin: Ah, that works well, since I'm in NZ.
17:23koz_: Well, thanks. Now I have to wait around for him and save money.
17:23koz_: Is it true that 780Tis outperform the blob on Nouveau?
17:25koz_: Awww. :(
17:29imirkin: i mean, maybe playing d3d9 games with gallium/nine. but still unlikely since it won't reclock to the highest level
17:55koz_: I see.
19:57imirkin: mogorva: btw, is the sam3 issue with the latest mesa? joi debugged an issue in it a while back: http://cgit.freedesktop.org/mesa/mesa/commit/?id=f9e2295560f9b4869fa2a94933c1881ec7970af4
19:57mogorva: yes, with current git
19:58mogorva: oh, it's not Serious Sam 3, it's Sam n Max 3
19:59mogorva: the system you mean, right?
19:59mogorva: *system freeze
20:04imirkin: i guess it's a diff game :)
20:04mogorva: yes it is
21:14mogorva: i see these in dmesg after playing dota2 for a few minutes and the game finally crashed: nouveau E[steam] fail set_domain
21:14mogorva: nouveau E[steam] validating bo list
21:14mogorva: nouveau E[steam] validate: -22
21:16imirkin: is that dota2 reborn, or dota2 original?
21:16imirkin: not sure what "fail set_domain" is tbh
21:16mogorva: i think it's the original, just tried to reproduce https://bugs.freedesktop.org/show_bug.cgi?id=86356
21:18mogorva: the rendering issue reported there is gone when using your proposed fix from https://bugs.freedesktop.org/show_bug.cgi?id=90887#c9
21:19imirkin: oh very cool
21:19imirkin: i'm waiting on calim to get back to me about that fix... it's touching compiler guts that i tend to normally avoid
21:19mogorva: lots of tiny, dark dots are present on surfaces though: http://i.imgur.com/QoGIUQn.png
21:20imirkin: that reads like shader execution fail to me
21:20imirkin: or at least texturing fail
21:26mogorva: what do you advise if I see those failing messages from nouveau in dmesg? is a reboot required because nouveau is dying, or can I keep running the current session?
21:27imirkin: those things tend to be errors where the application asks the kernel to do something, and the kernel can't
21:27imirkin: so it says so
21:27imirkin: and the application doesn't take kindly to such things :)
21:27imirkin: and crashes
21:28imirkin: there's a very half-hearted attempt in nouveau mesa drivers to deal with such situations but... it tends to fall rather short
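The `validate: -22` line in the dmesg output above is a negated errno value, which matches the description of the kernel refusing a request from the application. A minimal sketch for decoding such codes on Linux; `describe_kernel_error` is a hypothetical helper name.

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negative kernel return value such as -22 to its errno name and message."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)
```

On Linux, 22 is `EINVAL` (invalid argument), so the kernel rejected the buffer list it was asked to validate rather than failing on its own.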
22:21mogorva: how reliable is it to work with a trimmed trace? I have a trace that consists of 1233 frames, but only frames 1022 to 1089 show the gameplay where the problem occurs; the rest of the frames are the menus and loading screens.
22:23imirkin: sometimes it works well. most of the time it's a pretty big fail.
23:20mogorva: yesterday I mentioned a problem that exists in almost every game I tried: black polygons appear randomly, even when the player is not moving and they tend to appear in the menus too, here's a video where you can catch some of those flashing objects: https://drive.google.com/open?id=0B-tTbLKBl-tOSG4xd09CRExFelk&authuser=0
23:21imirkin: probably some sort of flush issue :(
23:27imirkin: mogorva: btw, i assume that the Civ5 issues you filed a bug about happen with all the various fixes i've sent around applied?
23:28imirkin: iirc i have a fix for some form of Civ5:BE issue btw... it tries and fails to spill a b96, at least on nvc0. not sure how it'd shake out on nv50.
23:28mogorva: actually, i didn't try any of your patches in Civ5, but i'll try them now
23:29imirkin: well, no point in trying the crash fixers if it's not crashing
23:30imirkin: but there's at least the weirdo indirect addressing issue
23:31mogorva: what other patches do you mean? I know of your patch from bug #90887 that fixes the garbled screen issue in several games
23:31imirkin: the other one as well.. hold on
23:31imirkin: i mean the one from https://bugs.freedesktop.org/show_bug.cgi?id=91056
23:32mogorva: ah yes indeed...i'll give it a try with Civ5 asap
23:33imirkin: i'll plug a nv50 family card in tomorrow hopefully and try to figure out what the actual restrictions are
23:34mogorva: cool, thanks
23:34imirkin: maybe it's just that indirect constbufs really can't be loaded from any ops
23:34imirkin: maybe it's something more subtle
23:36imirkin: that civ5 trace plays back fine on my nvc1 btw
23:36mogorva: a nouveau-related commit has just landed in git, have you seen it? http://cgit.freedesktop.org/mesa/mesa/commit/?id=a98600b0ebdfc8481c168aae6c5670071e22fc29
23:36imirkin: take a look at the 'committer' field
23:37imirkin: and Reviewed-by tag
23:37mogorva: i must have been blind :)
23:37imirkin: these things are all fake-able of course
23:37imirkin: but yes, i did. it's fixing a corner case of a corner case.
23:38imirkin: that in practice you end up only hitting with ZaphodHeads enabled
23:46mogorva: looks like the card I have (NV92) has more trouble with nouveau than others. So far you could only repro the issue that I reported in bug #91117, right?
23:48imirkin: mogorva: it's a different family
23:48imirkin: i have a fermi in my main box, which is a totally different (and more flexible) isa
23:49imirkin: it's also probably a bit more debugged... dunno
23:50imirkin: for each bug i have a different excuse :) but that's of little consolation to you.
23:50imirkin: the bug in 91117 was just a very simple "doh" bug where the optimizer was sticking the modifier on the wrong arg, so that'll happen to all gpu's
23:51imirkin: bug 90887 seems like it should affect everything, but the only way i've managed to *actually* trigger it doing something bad has been with the nv50-family TXL lowering code
23:51mogorva: your proposed patch from bug #91056 doesn't fix the issue in Civ5. Any other pending patches that can be related and worth trying?
23:52imirkin: hmmm... don't think so
23:52imirkin: can you see if it's either AlgebraicOpt or ModifierFolding that's killing it?
23:53imirkin: (since you said that both opt=0 and opt=1 work... those are the opt=2 passes which do "simple" things)
23:55mogorva: how could I see that?
23:55imirkin: comment out the RUN_PASS line
23:55imirkin: and see if it magically fixes things