00:16pmoreau: Anyone familiar with PFIFO/EVO who could try some blind guess on the problem here? http://pastebin.com/YVD4ZmSh
00:17pmoreau: PFB seems unlikely, as its default state and its state after being reset through PCI are almost identical; the few differences aren't consistent across multiple tests.
00:36RSpliet: pmoreau: did you ask gdb what line in evo_wait that is?
00:40pmoreau: I don't think so, but I'm quite sure it's the "dmac->ptr[put] = 0x20000000" one
00:40pmoreau: Given how **big** put is
00:41RSpliet: likely, which means that either ptr is uninitialised, or put is a bogus value
00:42pmoreau: Probably the second option, as it worked before
00:42RSpliet: doesn't hurt to print dmac, ptr and put before executing the evo_wait()
00:42RSpliet: oh right
00:42RSpliet: that's the 3fffffff
00:42pmoreau: put is already printed
00:43RSpliet: yes, that's a problem
00:44RSpliet: apparently that resetting pushbuffer thing fails to do so properly :p
00:44pmoreau: Oh, one thing I forgot to say, is that if I happen to check for ``put > PAGE / 4`` and return NULL, then I get an interrupt in PMC
00:45RSpliet: is that after applying http://cgit.freedesktop.org/~darktama/nouveau/commit/?id=1c27bd1522d33075c9d9b625182530ea2bffd09b ?
00:45pmoreau: The "resetting pushbuffer" log is only there to say we entered the if conditional ``if (put + nr >= (PAGE_SIZE / 2) - 8)``, nothing more
00:46pmoreau: I tried that patch hoping it would fix the issue, but it didn't.
00:46pmoreau: Actually, let me run it again. It might be the laptop still hangs, but for a different reason
00:47RSpliet: please do
00:56pmoreau: RSpliet: What's the command again for checking the line with gdb?
00:58pmoreau: (But I still get an "unable to handle page request" in evo_wait, so quite likely the same issue)
01:06RSpliet: l *functionname+0x1337 off the top of my head
01:07pmoreau: Ah, indeed thanks!
01:08pmoreau: So yeah, it's that line which is failing
01:09pmoreau: (even with the patch)
01:12RSpliet: mm, so clearly the pushbuf isn't initialised properly
01:13RSpliet: it either overflows or is never set to sane values
01:14pmoreau: It had some sane values before the switch from channel 837e to 887d
01:14pmoreau: Or before methods were submitted to channel 837e
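[Editor's sketch] The failure discussed above comes down to an index check on `put` before `dmac->ptr[put]` is written. A toy Python model of that check follows. It is not the driver code: the 4096-byte page size, the function name `classify_put`, and the `wrap_limit` parameter are assumptions made for illustration. As plain arithmetic, a 32-bit register that reads back as all ones and is divided by 4 gives 0x3fffffff, the value mentioned in the chat; whether that is what happened on this machine is only a guess.

```python
PAGE_SIZE = 4096                   # assumed page size in bytes
DWORDS_PER_PAGE = PAGE_SIZE // 4   # 32-bit words that fit in one page


def classify_put(put, nr, wrap_limit=DWORDS_PER_PAGE - 8):
    """Classify dword index `put` before writing `nr` more dwords.

    "bogus": put already points outside the one-page buffer.
    "wrap":  the write would reach wrap_limit (hypothetical parameter),
             so the buffer should be reset to the start first.
    "ok":    the write fits.
    """
    if put >= DWORDS_PER_PAGE:
        return "bogus"
    if put + nr >= wrap_limit:
        return "wrap"
    return "ok"


# A register that reads back as all ones, converted from bytes to dwords.
print(hex(0xffffffff // 4), classify_put(0xffffffff // 4, 2))
print(classify_put(16, 2))
```

Rejecting a bogus index up front avoids the page fault, but as noted in the chat it only moves the failure elsewhere; the real question is why `put` became invalid.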
01:23RSpliet: yeah, same for me
01:23RSpliet: sth. with work :p
01:37mlankhorst: evil compiz is still using dri2.. wonder why
01:47mlankhorst: ah didn't kill it properly
01:57sigod: compton is my friend
02:10mlankhorst: gnurou: tegra doesn't set preferred_depth cap :P
02:27gnurou: mlankhorst: well it's not a display :P
02:28mlankhorst: I'm fighting flips at the moment
02:28mlankhorst: modesetting does something with pitch 4096
02:41pmoreau: RSpliet: Eh eh (back btw)
03:09mwk: imirkin_: wrt the Tesla branch thing, you know about the WATCHDOG thing, right?
03:15mlankhorst: I got the correct pitch now, but the bo allocated by dri3 has the wrong pitch so it falls back to a blit anyway.. :P
03:29mlankhorst: hm, dri3 passes bind_shared, which causes the bo to always be in gart
04:18mlankhorst: actually not the linear part it seems
04:30mlankhorst: mupuf: would 1600 mhz ddr3 be faster than boot speed gddr5?
04:31mupuf: no idea
04:31mupuf: ddr3 is slow though
04:45mlankhorst: yeah no flips
04:45mlankhorst: only blits for now..
04:45mlankhorst: and if it flips it might flip once
04:59RSpliet: on GT21X I'd say no, boot speed GDDR3/5 is better
04:59RSpliet: but there's too many factors to give a definite answer
07:01imirkin: mwk: not really
07:03imirkin: mwk: hm, what's the 0x18 value mean?
07:06imirkin: mwk: hm, looks like blob writes 0x1e in there
07:17mwk: 0x18 should be.. a lot
07:18mwk: 1 << 24 ticks
07:18mwk: 0x1e is 1 << 30
07:18mwk: 0x1f is unlimited IIRC
07:18mwk: anyway, there's a doc...
07:19imirkin: tick == 1 instruction executed?
07:19mwk: I'd guess clock tick
07:20imirkin: so... at 1ghz clock, 0x18 is 0.016 seconds
07:20imirkin: which isn't an *outrageous* quantity of time, but definitely a lot
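[Editor's sketch] The timeout arithmetic above can be checked with a few lines of Python. The assumptions are the ones made in the chat: the value is an exponent (timeout = 1 << value ticks), one tick is one clock cycle, and the clock runs at 1 GHz. The function name `watchdog_seconds` is made up for this example.

```python
def watchdog_seconds(exponent, clock_hz=1_000_000_000):
    """Return the timeout in seconds for 1 << exponent ticks at clock_hz."""
    return (1 << exponent) / clock_hz


for exponent in (0x18, 0x1e):
    ticks = 1 << exponent
    print(f"0x{exponent:02x}: {ticks} ticks, "
          f"{watchdog_seconds(exponent):.4f} s at 1 GHz")
```

Under these assumptions 0x18 is about 0.017 s and 0x1e is about 1.07 s, so switching between the two values changes the allowed run time by a factor of 64.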
07:21imirkin: anyone with an nv50-family gpu who wouldn't mind doing a glretrace?
07:22imirkin: there's an apitrace in https://bugs.freedesktop.org/show_bug.cgi?id=78161, try retracing it with NOUVEAU_SHADER_WATCHDOG=false, and if that helps, try switching the 0x18 to 0x1e in nv50_screen.c
10:07mlankhorst: hm, flip fails because of ENOSPC..
10:28mlankhorst: gnurou: if I add a hack to make pageflipping work with the generic modesetting driver, another hack to make nouveau ddx use different alignment for shared dma-bufs so tiled pixmaps can be flipped, and a rebuild of the compiz opengl plugin to use dri3 instead of dri2 then I should have page flipping working now..
10:31mlankhorst: seems so, nice tiled scanout visible
10:33imirkin_: why tiled? is that what nouveau generates?
10:33imirkin_: it seems like tegradrm should be able to handle tiled
10:33mlankhorst: it does
10:33imirkin_: just no api's for them to talk to one another?
10:34mlankhorst: i didn't do the ioctl yet
10:54librin: good day!
10:55librin: I just asked about SLI on the mailing list and got directed here...
10:55imirkin_: hi :)
10:55imirkin_: any particular reason you were interested in SLI beyond having the hw on hand?
10:56librin: For starters, actually using the hardware at hand
10:56librin: i.e. not having my second card doing nothing all the time
10:57imirkin_: did the things i talked about make any sense to you (without doing the research)
10:57imirkin_: if not, SLI might not be a great first project :)
10:57librin: I do know quite a bit of theoretical stuff
10:57librin: I just haven't really coded much with that knowledge
10:58librin: and yes, it all made sense to me (most was stuff I already know, even)
10:59imirkin_: ok awesome
10:59imirkin_: you've heard of tiled renderers?
10:59librin: Heard of them, but that's mostly it
11:01glennk: there's also alternate frame rendering, probably easier to get working
11:01librin: I just read through the wikipedia article on it :V
11:02mlankhorst: I'm not sure it's easier to get working though
11:02librin: Alternate frame rendering would probably be more useful in case I end up tackling my dream of making a stereoscopic 3D capability in MESA
11:02librin: ...or if someone else makes that
11:04imirkin_: librin: but... what is a frame :)
11:04imirkin_: this isn't the 90's anymore where you render to a frame and you're done... a single frame is actually like 1000 different draws
11:05glennk: i think the favored approach these days is let the game engine handle the multi gpu programming, and just provide the facilities to let it do so
11:05librin: I suppose I should tackle this issue from a different angle:
11:05glennk: at least thats what mantle/vulkan/dx12 appear to be doing
11:06imirkin_: glennk: right, that won't work for GL
11:06imirkin_: librin: i would _really_ encourage you to start with something smaller
11:06imirkin_: librin: you can look at some ideas at https://trello.com/b/ZudRDiTL/nouveau
11:06glennk: oh yeah, definitely start with something that looks stupidly trivial
11:06imirkin_: librin: if you're feeling ambitious, try running Unigine Heaven -- there's a funny issue there that i haven't been able to track down.
11:08librin: semi-rhetorical: to "make SLI work", is there even anything that needs to be added into kernel portion of the driver or is the existing multicard infrastructure enough and only Mesa work is needed?
11:08imirkin_: mmmm.... good question
11:09imirkin_: you might need to make it possible for card1 to wait on fences from card0
11:09imirkin_: should be possible with the new dma-fence stuff though
11:09imirkin_: or whatever it's called
11:09librin: I have a few funny issues on a certain game and I am almost sure that Nouveau is at fault (seems to run fine on Intel and FOSS AMD)
11:09librin: maybe I should tackle THAT
11:10imirkin_: step 1: build mesa git, and see if the issue persists
11:10imirkin_: also describe the issue, perhaps someone will have ideas
11:10imirkin_: (is it black textures? then you're missing libtxc_dxtn)
11:10librin: I am always using mesa git
11:11imirkin_: also make sure you have --enable-texture-float enabled
11:11librin: I am updating Mesa so often, it might look like I have OCD about it
11:11librin: it's either:
11:12librin: a) driver complaining and failing time something out (don't remember the exact message it spews into dmesg at the moment)
11:12imirkin_: that's a gpu hang :)
11:13mlankhorst: imirkin_: yeah unfortunately the patch that I wrote that made nouveau do a non-blocking sync across different gpu's was not merged :(
11:13mlankhorst: so instead it will wait during cs time now
11:13librin: unlike most GPU hangs, this one is fully recoverable by killing the game, it seems
11:14imirkin_: librin: that's true of many gpu hangs... unfortunately often the process that needs killing is X
11:15librin: killing X never seems to help with other hangs
11:15librin: starting a new X after getting rid of the old just insta-hangs the new one
11:15librin: I wouldn't call that "recoverable"
11:15imirkin_: perhaps you don't wait long enough for X to actually die? dunno.
11:15librin: especially if no graphics output is possible, not even fb
11:16librin: Oh, I do wait
11:16librin: and check if it's really dead
11:16mlankhorst: yeah.. hang recovery doesn't work
11:17librin: so yeah, with that game, the only kind of hang I observed that appears to be fully recoverable
11:18librin: ...I should probably look into nouveau debug output options
11:19imirkin_: gpu hang debugging is something we stink at
11:19imirkin_: we just try our hardest not to hang the gpu
11:22librin: >Clock gating in list DRM Power Management (kernel)
11:22librin: which kernel do I need if I want to work on that?
11:23imirkin_: latest kernel is always best for development
11:23imirkin_: but there's nothing special about clock gating re kernel versions
11:23librin: I remember Nouveau wiki pointing out I need some special version, but I can't seem to find the page
11:23librin: the Nouveau wiki is rather a PITA to navigate :V
11:23imirkin_: you just poke registers, measure effect on power consumption, move on to the next register
11:24imirkin_: if you have suggestions for improving it, let me know
11:24imirkin_: i redid the front page a while ago, imho it was an improvement over what was there before
11:24imirkin_: the 'special' tree you may be thinking of is at http://cgit.freedesktop.org/~darktama/nouveau
11:25librin: it said something about "Linus' kernel might not be up to date to [whatever is latest nouveau-wise]"
11:26librin: wiki page
11:26imirkin_: which one?
11:27librin: but I might be remembering it wrong, it's been a while since I read that one
11:32Karlton: the darktama one will have the latest code before it gets sent upstream, no?
11:32imirkin_: pretty much
11:32imirkin_: ben develops in his out-of-tree thing, and then syncs that around
11:33librin: >"If you don't have any git kernel tree yet, clone Linus' tree from kernel.org first"
11:33librin: that's linux-next, right?
11:33librin: ...mainline? :V
11:33imirkin_: linux-next is a random amalgamation of a boatload of diff trees
11:34imirkin_: anyways, if you're looking at clock gating and whatnot, you don't need to futz with the kernel
11:34imirkin_: grab envytools, that'll have the tools needed to poke the card
11:34imirkin_: i think mupuf did a bunch of work on that at one point
11:34imirkin_: don't remember that anything came of it
11:35librin: "brb ddg'ing envytools"
11:36librin: wait, does this mean I will need to run the binary blob at some point?
11:37imirkin_: blob certainly helps to see what the blob does :)
11:38librin: might as well build a new computer with my secondary card
11:38imirkin_: librin: the *gentlest* introduction to nouveau development, i think, would be to take some piglit test that's failing on nouveau
11:38imirkin_: and figure out why it's failing and fix it
11:52mlankhorst: where's the fun in that ;)
11:52mlankhorst: but yeah good way to start
11:52imirkin_: i list a few fails in that trello
11:53imirkin_: there are others of course
11:53imirkin_: at the same time, i basically started with a too-big project and things worked out ok
11:54librin: I suppose this is mostly going to be mesa-sided code
11:55mlankhorst: meh I have no good way to see tiling on a pixmap from a different driver
12:39imirkin_: librin: yep. which is what you were interested in in the first place if the end goal is sli
14:57librin: question: should I turn off other running OpenGL programs when running piglit or is it ok to have OpenGL stuff running?
14:58imirkin_: librin: a full piglit run, or a single piglit?
14:58imirkin_: librin: i'd avoid doing as much as possible during a full piglit run... also see http://people.freedesktop.org/~imirkin/ for my recommended piglit cmdline
15:26librin: imirkin_, the command line there, that "./piglit-run.py"
15:27librin: I seem to be unable to find that
15:29imirkin_: librin: it's in the piglit top-level dir
15:33librin: imirkin_, I suppose it doesn't get placed with "make install"
15:33librin: welp, that was stupid of me
15:33imirkin_: librin: uhm... you should definitely not be doing 'make install' with piglit
15:34imirkin_: i haven't a clue what that does
15:34imirkin_: but i can't imagine it's anything good
15:35librin: I did install into a "special" prefix
15:36librin: it does create a "piglit" script that seems can be used to run a "full" test
15:36librin: along with all the files it needs
15:36imirkin_: piglit-run.py == 'piglit run'
15:37librin: gonna try to run now
15:39librin: is it supposed to hang the driver on the 510th test when using that command line as defined in http://people.freedesktop.org/~imirkin/ ?
15:39imirkin_: no, in fact that command line generally avoids hangs
15:39imirkin_: what killed it? max-texture-size?
15:40imirkin_: that's another one i exclude
15:40librin: [40561.232817] nouveau E[ X] Unknown handle 0x0000002c
15:40librin: [40561.232822] nouveau E[ X] validate_init
15:40librin: [40561.232824] nouveau E[ X] validate: -2
15:40librin: [40561.236344] nouveau E[ PFIFO][0000:01:00.0] read fault at 0x0005442000 [PTE] from GR/GPC0/PE_3 on channel 0x007f6af000 [X]
15:40librin: [40561.236347] nouveau E[ PFIFO][0000:01:00.0] PGR engine fault on channel 5, recovering...
15:40librin: beats me
15:40imirkin_: oh nice, X complained
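[Editor's sketch] When dmesg fills up with lines like the PFIFO fault pasted above, pulling the fields apart can make repeated faults easier to compare. The regular expression below is derived only from the single fault line in this log; the general message format, the function name `parse_fault`, and the field names are assumptions.

```python
import re

# Pattern modeled on the one fault line pasted in the chat (assumed format).
FAULT_RE = re.compile(
    r"(?P<access>\w+) fault at (?P<address>0x[0-9a-fA-F]+) "
    r"\[(?P<reason>[^\]]+)\] from (?P<unit>\S+) "
    r"on channel (?P<channel>0x[0-9a-fA-F]+) \[(?P<process>[^\]]+)\]"
)


def parse_fault(line):
    """Return a dict of fault fields, or None if the line does not match."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["address"] = int(fields["address"], 16)
    fields["channel"] = int(fields["channel"], 16)
    return fields


line = ("[40561.236344] nouveau E[ PFIFO][0000:01:00.0] read fault at "
        "0x0005442000 [PTE] from GR/GPC0/PE_3 on channel 0x007f6af000 [X]")
print(parse_fault(line))
```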
15:40librin: >[00510/18641] fail: 2, pass: 358, skip: 150 /
15:40librin: piglit output standing still on this
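[Editor's sketch] The progress line quoted above carries enough information to tell whether the runner has stalled: the per-status counts add up to the number of finished tests. A small parser, assuming the line format is exactly what was pasted in the chat (the function name `parse_progress` is made up here):

```python
import re


def parse_progress(line):
    """Parse '[done/total] name: n, name: n ...' into (done, total, counts)."""
    header = re.search(r"\[(\d+)/(\d+)\]", line)
    if header is None:
        return None
    done, total = int(header.group(1)), int(header.group(2))
    counts = {name: int(value)
              for name, value in re.findall(r"(\w+):\s*(\d+)", line)}
    return done, total, counts


done, total, counts = parse_progress(
    "[00510/18641] fail: 2, pass: 358, skip: 150 /")
print(done, total, counts, sum(counts.values()) == done)
```

If two samples taken a few minutes apart show the same `done` value, the runner is waiting on a single test binary, which is what `ps auxwww` then identifies.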
15:41imirkin_: check top... some tests can take a while
15:41librin: it's sleeping
15:41imirkin_: what is the piglit test being run?
15:41librin: and the "spinning" thing at the end of line
15:41librin: is stuck to '/'
15:41imirkin_: ps auxwww
15:41imirkin_: 'piglit' is just a test runner
15:41imirkin_: it's not an actual test
15:42imirkin_: the tests are separate binaries that execute
15:43librin: it's sleeping alright
15:44imirkin_: what is 'it' that's sleeping?
15:44librin: >rin 13175 0.7 0.3 180528 25544 pts/2 S 00:36 0:03 /mnt/ext/code/piglit/piglit/bin/texelFetch fs sampler2D 1x281-501x281 -auto -fbo
15:44imirkin_: ah no, that test just takes forever
15:44imirkin_: give it a minute
15:44imirkin_: or two
15:45librin: and I am getting dmesg spammed with nouveau errors :V
15:45imirkin_: oh. then there's probably some deeper issue
15:45imirkin_: kill it (^\)
15:46imirkin_: mlankhorst: do you really need .60?
15:46mlankhorst: imirkin_: for dri3
15:46mlankhorst: that call was added then
15:46imirkin_: ok, just checking
15:48mlankhorst: night :)
15:53librin: imirkin_ what is the normal crashing count when running piglit with that command line?
15:53imirkin_: 1 or 2
15:53imirkin_: maybe 5 at the outside, i don't remember
15:54librin: >at the outside
15:54librin: what does that mean
15:54imirkin_: at most
15:54librin: sorry for being such a n00b :C
16:08JethroTux: I've got this error in dmesg: "nouveau E[ PFIFO][0000:01:00.0] CACHE_ERROR - ch 1 [Xorg] subc 6 mthd 0x0000 data 0x80000013" anybody know what it could be?
16:13imirkin_: JethroTux: what gpu?
16:22librin: imirkin_, I'm already at six crashes ;]
16:23imirkin_: i may be misremembering... or some of those are new
16:24librin: and snap, it takes a while. And I used to think running wine tests took long... :E
16:27glennk: imirkin_, still haven't fixed those concurrency issues with piglit?
16:28imirkin_: glennk: nope, no clue how to even begin
16:28imirkin_: librin: should take ~40 mins... you can speed it up if you clock your card up
16:29glennk: well, should be able to narrow down which tests clobber each other's state
16:30imirkin_: glennk: pretty sure it has nothing to do with that
16:31imirkin_: glennk: it tends to be the tests that pop up X windows that mess up the universe
16:31imirkin_: glennk: i guess i should double-check, but i suspect that with gbm we can do concurrent runs just fine
16:32glennk: so the x ddx then?
16:32imirkin_: this is the problem... it's not likely to be _anything_ in particular
16:32imirkin_: yet it mega-dies
16:32imirkin_: something with fencing... who knows
16:33librin: imirkin_, it took ~35 mins with a clocked up card. But it exploded in the end and seems to have not generated the "main" results file
16:33librin: so I am re-running it
16:33imirkin_: librin: you can still do summaries on the exploded files
16:33buhman: piglit crashes things?
16:33buhman: how does that happen?
16:33imirkin_: but yeah, the current situation is annoying... it used to still just generate a single file
16:34imirkin_: and now it's 100000 diff files =/
16:34librin: well, it was mostly me thinking control-c can be used to kill a hanged test
16:34imirkin_: buhman: pretty straightforward... run some commands on gpu, crash computer.
16:34imirkin_: librin: use ^\
16:34librin: and then it bit me in the end when all tests were done
16:34imirkin_: that sends a quit message
16:34buhman: I mean erm
16:34buhman: well I guess with my gk106 experiences, I shouldn't be surprised ;p
16:34librin: I just "kill -9 [pid of the hanged test]"
16:34librin: that works
16:35imirkin_: librin: ^C will kill the current test
16:35imirkin_: ^\ will kill piglit
16:36imirkin_: librin: but ^C isn't as strong as kill -9
16:36librin: ^C did jack s***, but instead >KeyboardInterrupt on the piglit script after the last test was done
16:36imirkin_: hm, that's not how i remember it
16:36imirkin_: perhaps things changed
16:37librin: and it seems test #510 hanged yet again, with dmesg spam
16:37librin: exciting C:
16:38imirkin_: you know, texelFetch dying sounds really familiar
16:38imirkin_: there was a bug with non-debug builds
16:38imirkin_: but i fixed that a while ago
16:38imirkin_: and you use git mesa, right?
16:38glennk: the name of the test is a bit more meaningful than the order piglit happened to run it
16:38librin: texelFetch was near the end
16:38librin: and I did have to kill it
16:39librin: err, scratch that
16:39imirkin_: this was the commit in question: fb1afd1ea5fd
16:40imirkin_: but that was just test failures, not dmesg spam
16:41librin: shader_runner was another one I had to kill, if I am not mixing things up
16:41librin: others that crashed died naturally
16:42librin: question: is there a way to query used/free video memory in Nouveau?
16:44librin: I assume the kernel side of the driver would have to know that regardless
16:44librin: but is there a way for a "user" to query it?
16:44imirkin_: sssort of
16:44imirkin_: there's no such thing, really
16:45glennk: there's a GL extension for that somewhere
16:46imirkin_: it's like "free" ram
16:46imirkin_: there's no such thing
16:47imirkin_: you can malloc(100TB)
16:47imirkin_: and the malloc call will succeed
16:47librin: oh and texelFetch had to be killed, too J:
16:47glennk: pages used
16:49librin: FWIW, malloc(100TB) won't succeed...
16:49librin: ...on Windows
16:50glennk: on linux it depends if you have overcommit enabled or not
16:50glennk: GL_ATI_meminfo, GL_NVX_gpu_memory_info
16:50specing: does there exist a nice GUI config utility for nouveau as there is for nvidia's blobs?
16:51librin: specing, not that I'm aware of
16:53glennk: GLX_MESA_query_renderer then returns the physical amount of memory on the card
16:53imirkin_: glennk: overcommit is default on linux
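[Editor's sketch] On Linux the overcommit policy referred to above is exposed in /proc/sys/vm/overcommit_memory, with 0 meaning heuristic overcommit (the default), 1 meaning always overcommit, and 2 meaning strict accounting. Note that in the default heuristic mode the kernel may still refuse an obviously excessive request, so a very large malloc is only guaranteed to succeed in mode 1. The helper names below are made up for this example.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Map the text read from overcommit_memory to a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return the current policy description, or None if unavailable."""
    try:
        with open(path) as handle:
            return describe_overcommit(handle.read())
    except OSError:
        return None  # not Linux, or /proc is not mounted


print(read_overcommit())
```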
16:54librin: question: how's the interaction between nouveau and gallium-nine, development wise?
16:54imirkin_: specing: gui utility that does what?
16:54imirkin_: librin: it works
16:54librin: independent enough for neither needing to care about the other?
16:54librin: I know it works – I run it regularly =D
16:54librin: I was asking development-wise
16:54imirkin_: librin: there are a few caveats...
16:54imirkin_: nine is a gallium state tracker
16:55imirkin_: nouveau is a gallium driver
16:55imirkin_: as long as nine sticks to the api, it's fine
16:56librin: I suppose this means my assumption that nouveau and gallium-nine are separate enough to not care about one another development-wise, is true.
16:56imirkin_: until nine wants to extend the gallium api, at which point all drivers have to be updated
16:56librin: fair enough
16:57librin: ...what does that mean?
16:59specing: imirkin_: I don't know, provide clicky clicky?
17:18imirkin: ah, nothing like some snow to remind you it's the first day of spring...
17:18imirkin: specing: ... which does what?
17:19imirkin: specing: i.e. what do you feel is missing from the regular gui tools provided by your environment of choice?
17:22librin: ...there are "regular" tools?
17:22librin: BTW, imirkin
17:22librin: > *** Error in `/mnt/ext/code/piglit/piglit/bin/texelFetch': corrupted double-linked list: 0x0000000000fd2050 ***
17:22librin: this didn't happen the last time :V
17:22imirkin: librin: can you valgrind it and see what's up?
17:24librin: not really. Valgrind doesn't run on my machine. Doesn't yet understand the whole ISA of my CPU and complains on startup
17:24imirkin: mmmm... that's surprising
17:24imirkin: which isa does it hate?
17:24imirkin: sse4? coz that got updated a long time ago
17:25librin: I am running the very latest valgrind
17:25librin: and it's
17:25librin: [something from whatever that was newly added with piledriver]
17:25librin: IIRC it was certain encodings of AVX
17:26librin: but I might as well be very much wrong
17:26librin: it was an open issue in valgrind bug tracker the last time I checked
17:27librin: for that I do my valgrind needs on my laptop, running Linux Mint where libc is compiled to the lowest common denominator
17:27librin: but I can't run nouveau tests on THAT
17:27librin: oh and it's done
17:28librin: now, where's that "main" file? :V
17:29librin: I'll assume that's the results.json
17:34imirkin: yeah... it got renamed
17:35librin: imirkin, do You want my results?
17:35imirkin: mmmm... sure, send them on... xz -9 them first :)
17:36librin: "raw", summary'd, html-summary'd or all?
17:45librin: imirkin, here Ya go: https://seriouss.am/nvXX-2015-03-21-rin_results.json.xz
17:46imirkin: i assume that should be nve6?
17:47librin: it seems
17:47librin: nve6 seems to be the mobile variant
17:48librin: this ain't mobile
17:48imirkin: nve6 is certainly not mobile
17:49imirkin: there are mobile versions of most chips
17:49librin: whoops I mean
17:49librin: mine is GTX 770
17:50librin: which is listed under nve4
17:50librin: nve6 has mostly mobile chips listed
17:50librin: >01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)
17:50imirkin: that list is far from complete fyi
17:50librin: GK104 also translates into nve4
17:50imirkin: anyways, you do have a GK104 so yeah
17:51librin: well, my card (GTX 770) is clearly under nve4
17:51librin: it is in the list
17:55librin: thanks for all the guidance, imirkin!
17:56librin: I'm going to catch some shut-eye; it's 3 AM here
17:58imirkin: see ya