08:41mlankhorst: hakzsam: argh was busy :(
11:59imirkin_: fun, someone figured out the sched codes for maxwell: https://github.com/NervanaSystems/maxas/wiki/Control-Codes
15:00imirkin_: Karlton: "good" news -- i can repro the terasology fail on my GK208 as well.
15:00imirkin_: now to stare at a ton of shaders :)
15:02tobijk: imirkin_: if it helps, i still have the trace of terasology which i once inspected more closely
15:03imirkin_: tobijk: mmmm... i'd rather look at it myself tbh
15:04imirkin_: since it works on nvc0, i'm going to concentrate on the changes between fermi and kepler
15:04imirkin_: like the stupid texdep things
15:04tobijk: imirkin_: as far as i remember it had something to do with batching up draw calls
15:05tobijk: the terasology problem
15:06imirkin_: i'm not familiar with "batching up draw calls"
15:07tobijk: let me look at the trace again, maybe i'll find it again ;-)
15:11imirkin_: gah! FFMA.FTZ R9, R9, 1, R16
15:11imirkin_: i thought we optimized that out :(
15:11tobijk: thats it already?
15:11imirkin_: just stupid code
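
The FFMA above multiplies by the constant 1 and then adds, so it could have been emitted as a plain add. A minimal sketch of that kind of peephole rewrite follows. It is a toy model, not nouveau's optimizer: the `Insn` tuple and `fold_mul_by_one` are hypothetical names, and modifiers such as `.FTZ` are ignored.

```python
from typing import NamedTuple, Tuple, Union

Operand = Union[str, float]  # register name or immediate (toy model)


class Insn(NamedTuple):
    op: str                      # e.g. "FFMA" or "FADD"
    dst: str                     # destination register
    srcs: Tuple[Operand, ...]    # source operands


def is_one(operand: Operand) -> bool:
    """True only for a numeric immediate equal to 1."""
    return isinstance(operand, (int, float)) and operand == 1


def fold_mul_by_one(insn: Insn) -> Insn:
    """Rewrite FFMA d, a, 1, c (or FFMA d, 1, a, c) as FADD d, a, c."""
    if insn.op != "FFMA" or len(insn.srcs) != 3:
        return insn
    a, b, c = insn.srcs
    if is_one(b):
        return Insn("FADD", insn.dst, (a, c))
    if is_one(a):
        return Insn("FADD", insn.dst, (b, c))
    return insn


# The instruction from the log, minus its .FTZ modifier.
print(fold_mul_by_one(Insn("FFMA", "R9", ("R9", 1.0, "R16"))))
```
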
15:12Karlton: imirkin_: nice, I wish you great success in your endeavor :)
15:15tobijk: ah now i remember: glCallList(s)
15:16imirkin_: tobijk: that shouldn't be related to any problems
15:16tobijk: mh you were suspecting them back then as well, dunno just what i remember :)
15:19imirkin_: probably where the problematic draw comes from
15:20imirkin_: however by the time it reaches nouveau, it's not a call list at all
15:20imirkin_: just a regular draw like any other
15:20tobijk: mh ok
15:25imirkin_: uh oh. looks like there's a branch to a sched instruction
15:25imirkin_: can't imagine that'd be a good thing...
15:26tobijk: sounds reasonable: execute in the wrong order and we get garbage
15:26tobijk: good you found the sched insn thing a few hours ago?
15:28imirkin_: oh, that's for maxwell
15:28imirkin_: and it's not like we didn't know that it was there... just didn't know what it meant
15:34a1fa: imirkin_: does vdpau acceleration work on NV43 - 6600 GT (AGP). I did the firmware extract, but I guess no suitable firmware is loaded
15:35imirkin_: a1fa: check the video accel page i pointed you to, it has all the info you need
15:35imirkin_: tobijk: hrm, well it renders differently now, but... still fail :(
15:36tobijk: imirkin_: after changing the insn we talked about above?
15:36imirkin_: after changing the branch to point to the actual instr
15:37imirkin_: tobijk: http://hastebin.com/wehinuzude.mel something like that, but in emit_nvc0 for you
15:37imirkin_: actually if you could test whether that changes anything that'd be nice -- there could be more issues with gk110
15:38tobijk: on it
15:38a1fa: ah imirkin_ vp1
15:38a1fa: N/A ;(
15:38imirkin_: a1fa: yes. works fine with xvmc (for mpeg1/2)
15:39a1fa: anything in AGP that i could get that would get vdpau support?
15:40imirkin_: tobijk: this is the emit_nvc0 version: http://hastebin.com/akajocigoy.mel
15:40imirkin_: a1fa: to decode h264? hmm... maybe there were agp r600's?
15:40tobijk: already have it inserted, but thanks
15:40imirkin_: i think G80+ are all PCI-E (and a few with PCI versions)
15:41imirkin_: [oh, and naturally that will destroy the world for fermi, but i'll fix that up]
15:45tobijk: imirkin_: looks the same, or at least not that different
15:45imirkin_: hm ok
15:45imirkin_: o well
15:45tobijk: at least i dont see a difference :/
15:46imirkin_: you did add it to the right place yeah?
15:46imirkin_: i.e. not in f->pp == OP_CALL, but below it?
15:46tobijk: right before the " // currently we don't want absolute branches" naturally
15:47imirkin_: that should be ok
15:49imirkin_: hm you might be right that it doesn't change anything
15:49imirkin_: i realized i was on another branch between the before/after
16:21imirkin_: texture2D(shadowMap, texCoord + vec2(1.0/SHADOW_MAP_RESOLUTION, 0)).x
16:22imirkin_: that code is just so painful
16:22imirkin_: this is what textureOffset is for...
16:22imirkin_: i guess they want to stay in GL2-land =/
16:24tobijk: wild guess, but i guess thats not the cause, happens in GL3 land as well
16:28imirkin_: the code is written for #version 120
16:28imirkin_: it's unfortunately not taking advantage of various GL3 features. that's what i'm complaining about
16:28imirkin_: anyways... looks like the green channel is just not making it out =/
16:28imirkin_: the red is correct, but the green is missing
16:29tobijk: so sometimes the green one makes it, sometimes not? memory access problems *urgh*
16:30imirkin_: no, probably something dumb
16:39tobijk: what bothers me is that sometimes things are see-through in my trace :/ makes me wonder if there are not several problems with that game
16:40imirkin_: yeah, i see a bunch of rendering artifacts beyond the colors
16:52imirkin_: ok, found a big issue :(
16:52imirkin_: stupid conditional logic. why can't shaders just not have any funny business.
16:52imirkin_: turns out that texture instructions sometimes occur inside of conditional logic
16:52imirkin_: and furthermore, the RA likes to insert constraint moves
16:53imirkin_: and furthermore, if that constraint move sticks around, it needs a texbar before it.
16:53imirkin_: but it looks like the thing that inserts texbars runs before those constraint moves get added
17:00imirkin_: so i don't forget, here is a program that causes this: http://hastebin.com/ipirusuves.avrasm
17:00tobijk: i read virus :D
17:03imirkin_: hmmmm... insertTextureBarriers runs post-ra
17:05tobijk: shader is way bigger for e7 compared to c0
17:06imirkin_: 7 instruction difference... probably the texbars
17:07imirkin_: keep in mind that kepler also gets 8 bytes of sched info in every 64 bytes of instructions
17:07tobijk: oh right
17:07imirkin_: i.e. 8 byte sched + 7 8-byte instructions
17:07imirkin_: while nvc0 also has 4-byte instructions. although iirc we don't use them
17:08imirkin_: not sure why
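
The layout described above (one 8-byte sched word plus seven 8-byte instructions in every 64 bytes) can be sketched as simple offset arithmetic. This assumes the sched word is the first slot of each 64-byte group. The helper names are hypothetical, and the branch fix-up only illustrates the idea of pointing a branch at the next real instruction; it is not the actual patch.

```python
GROUP_BYTES = 64      # one sched word + seven instructions
INSN_BYTES = 8        # every slot, sched or instruction, is 8 bytes
INSNS_PER_GROUP = 7


def is_sched_slot(offset: int) -> bool:
    """True if a byte offset lands on a sched word (assumed first in each group)."""
    return offset % GROUP_BYTES == 0


def fix_branch_target(offset: int) -> int:
    """Move a branch target off a sched word onto the instruction that follows it."""
    if offset % INSN_BYTES != 0:
        raise ValueError("branch target must be 8-byte aligned")
    return offset + INSN_BYTES if is_sched_slot(offset) else offset


def code_size(num_insns: int) -> int:
    """Bytes needed for num_insns real instructions, including sched words."""
    groups = -(-num_insns // INSNS_PER_GROUP)  # ceiling division
    return (num_insns + groups) * INSN_BYTES


print(fix_branch_target(64), code_size(7), code_size(8))
```

Under this model, an eighth instruction starts a new group and costs 16 bytes rather than 8, which is one reason the same shader comes out larger on kepler than on nvc0.
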
17:29imirkin_: tobijk: this fixes it for me: http://hastebin.com/camogiqeqa.hs
17:30imirkin_: obviously that's a BS fix, i need to think about wtf it's doing and figure out what to do with it
17:30imirkin_: calim: comments appreciated ;)
17:34tobijk: imirkin_: confirmed
17:34imirkin_: well that was fun
17:36imirkin_: i have no idea why that thing triggers... need a lot more prints
17:37imirkin_: that condition actually seems ok, so i'm a bit confused....
17:42tobijk: erhm yeah, dont know, playing with that condition is not very useful :D
18:00imirkin_: i think this is a deeper problem in the whole tex use logic
18:01imirkin_: way too clever for its own good :(
18:03tobijk: that is?
18:04toptwo: for nouveau firmware, it is enough to just have the nvidia firmware files exist in /lib/firmware/nouveau or more needs to be done beyond that for working accel?
18:21imirkin_: tobijk: well, some things get merged that probably shouldn't, and it decides that the tex instruction itself is a use, which in turn dominates the *real* use in the bb, and the code only adds the first use in any particular bb. ugh.
18:22Karlton: imirkin_: yes, that diff resolves the artifacts and purpleness! :D
18:29imirkin_: Karlton: thanks for confirming. now to figure out a proper way of fixing this issue
18:29imirkin_: and by 'now' i of course mean 'later' ;)
18:36tobijk: that insertTextureBarriers pass does not get read easily :/
18:37imirkin_: it's actually quite simple
18:37imirkin_: the problem is that it's getting garbage input
18:37imirkin_: or at least unexpected input
18:37imirkin_: the tex's inputs and outputs got merged into one thing
18:38imirkin_: which causes hilarity to ensue
18:38imirkin_: that's the problem with doing these things post-RA
18:39imirkin_: but RA will sometimes insert mov's too, so you can't just do it after...
18:39imirkin_: er, before
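
The failure mode described above (the tex's input and output merged into one register, the tex itself counted as the first use, and only the first use per block recorded) can be reproduced in a toy model. This is a sketch under those assumptions, not nouveau's real insertTextureBarriers pass. `Insn`, `first_use`, `insert_texbars` and the `scan_from_tex` flag are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    op: str
    dsts: Tuple[str, ...]
    srcs: Tuple[str, ...]


def first_use(block: List[Insn], reg: str, start: int) -> Optional[int]:
    """Index of the first instruction at or after start that reads reg."""
    for idx in range(start, len(block)):
        if reg in block[idx].srcs:
            return idx
    return None


def insert_texbars(block: List[Insn], scan_from_tex: bool) -> List[Insn]:
    """Place a texbar before the first recorded use of each tex result.

    scan_from_tex=True models the bug: the search starts at the tex itself,
    so a tex whose source and destination share a register is its own
    "first use" and the real consumer never gets a barrier.
    """
    barrier_before = set()
    for idx, insn in enumerate(block):
        if insn.op != "tex":
            continue
        start = idx if scan_from_tex else idx + 1
        for reg in insn.dsts:
            use = first_use(block, reg, start)
            # Only the first use is recorded; a barrier before the tex
            # itself would be pointless, so that case yields nothing.
            if use is not None and use != idx:
                barrier_before.add(use)
    out: List[Insn] = []
    for idx, insn in enumerate(block):
        if idx in barrier_before:
            out.append(Insn("texbar", (), ()))
        out.append(insn)
    return out


# After RA, the tex reads and writes r0; the mov is the real use.
block = [Insn("tex", ("r0",), ("r0",)), Insn("mov", ("r1",), ("r0",))]
print([i.op for i in insert_texbars(block, scan_from_tex=True)])
print([i.op for i in insert_texbars(block, scan_from_tex=False)])
```
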
21:13imirkin_: skeggsb_: not sure if you saw in the scrollback, but looks like someone took the time to work out the maxwell control code stuff. let me know if you plan on trying to integrate it into mesa
21:15skeggsb_: imirkin_: oh, i missed that
21:16imirkin_: skeggsb_: https://github.com/NervanaSystems/maxas/wiki/Control-Codes
21:16skeggsb_: yeah, reading it now :P
21:34imirkin_: btw, i have a patch that fixes branch targets on kepler/maxwell to avoid jumping to sched entries
21:34imirkin_: not sure if that'd help your gm200 woes
21:35imirkin_: in case you're interested: http://hastebin.com/uyixuhaxag.diff
23:36imirkin: skeggsb_: btw, also wanted to direct your attention to https://bugs.freedesktop.org/show_bug.cgi?id=90276#c9 in case you missed it
23:36imirkin: looks like someone bisected the PDISP fails to an innocuous-seeming commit