03:27 imirkin: hakzsam: fyi, elemental demo rendering is a bit messed up for me - the colors seem off for some of the scenes
03:29 imirkin: the initial scene is pretty flickery
03:41 imirkin: skeggsb: copy engine can't scale right?
03:54 skeggsb: imirkin: i don't believe so, no
03:54 imirkin: gr. so that's why i never fixed that bug
03:54 imirkin: it'd require me to do annoying things which are annoying.
03:55 skeggsb: yeah, from looking at nvidia's class header for it, that's a definite no
03:55 skeggsb: though, they've been known to omit things before
03:56 imirkin: the api it exposes definitely doesn't look like it does scaling either
03:57 skeggsb: i guess you're stuck using twod or the 3d class :/
03:57 imirkin: well, i'm stuck calling nvc0_blit()
03:57 imirkin: it's just ... not convenient.
03:58 skeggsb: can twod do everything you need it to do for the function you need scaling in?
03:58 imirkin: maybe.
03:58 imirkin: nvc0_blit avoids using 2d for 8x msaa resolve and for 3d textures
03:59 imirkin: but yeah, i should just tell them to suck it and use 2d. this is only for msaa surfaces being mapped, which means no 3d
04:00 imirkin: and i can't get myself to care about quality if you're using glReadPixels() on a winsys msaa surface
04:00 imirkin: moral of the story: don't do that.
04:00 imirkin: good call.
04:01 skeggsb: btw, have you ever toyed with using the (async, not the one on the gr channel) copy engines for transfers?
04:01 imirkin: we use it in the kernel, right?
04:02 skeggsb: for ttm stuff, yeah, but i'm thinking more for mesa's use
04:02 imirkin: right
04:02 skeggsb: might be able to have some stuff run asynchronously with rendering
04:02 imirkin: well, first we'd have to wean ourselves off of libdrm_nouveau
04:02 airlied: probably more us in vulkan
04:02 airlied: use
04:03 skeggsb: true
04:03 airlied: or opencl
04:03 imirkin: and the whole locking situation is ... mindblowing
04:03 skeggsb: imirkin: libdrm_nouveau supports that kind of thing if you use it properly
04:03 imirkin: maybe
04:03 imirkin: seems easier to just not worry about it
04:05 imirkin: anyways, making use of those copy engines certainly seems like a good idea
04:06 imirkin: i fully endorse that idea :)
04:06 imirkin: i also fully endorse the idea of getting a handle on how to analyze why XYZ is slow
04:06 skeggsb: patches welcome? ;)
04:06 imirkin: and i more-than-fully endorse the idea of making it so any little error in mesa doesn't take the kernel with it
04:07 imirkin: and yes, patches welcome for all of the above
04:29 imirkin: unfortunately there's more than enough bugs running around that trying to optimize things seems like a far-away task
06:18 hakzsam: imirkin, yeah, that happens to write the wrong fix
06:19 hakzsam: mmh, which gpu?
06:19 hakzsam: I do see some flickering on fermi as weel
06:19 hakzsam: *well
06:19 hakzsam: but colors seem not totally incorrect
11:20 karolherbst: imirkin: I plan to write a tool which takes nvidia generated binaries and compares the sched opcodes with what mesa-nouveau would produce for the same thing. I assume we have no disassembling stuff going on inside mesa. Maybe it makes more sense to write something based on envydis instead then
11:22 karolherbst: is there any solid way to write gpu assembly code and let nvidia fill the sched opcodes? like with those cuda tools?
11:24 RSpliet: karolherbst: It's probably most sane to tap binaries from NVIDIA using valgrind-mmt, then let your disassembler (envydis?) convert that to an assembly format that you can convert to NV50_IR post-RA?
11:25 karolherbst: RSpliet: good hint regarding mmt
11:25 karolherbst: but the IR actually misses a few things
11:26 karolherbst: but it shouldn't matter
11:26 RSpliet: I presume for a fair comparison of scheduler headers you don't want to apply nouveau's optimising passes and RA, as that could change scheduling decisions
11:26 karolherbst: I just care about sched opcodes
11:27 RSpliet: yeah, but RA has an influence on dual issue ;-)
11:27 RSpliet: (due to false register sharing complications)
11:28 RSpliet: so make sure to isolate what you want to quantify as well as you can :-)
11:31 karolherbst: mhh
11:31 karolherbst: I just want to run the method calculating the sched opcodes on the stuff
11:31 karolherbst: not build the entire cfg and all the bbs
11:32 karolherbst: mhh
11:32 karolherbst: but maybe I have to for this
11:32 karolherbst: :/
11:34 karolherbst: I think I will write something from scratch (except the actual disasm bits)
11:34 karolherbst: and implement what I actually need for this
11:39 hakzsam: karolherbst, sched codes for maxwell? or kepler?
11:41 karolherbst: kepler, but that tool could be used for both actually
11:41 karolherbst: it should just verify we do the right thing
11:42 karolherbst: or that we at least use higher wait times than nvidia
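The check karolherbst describes (nouveau should wait at least as long as NVIDIA on every instruction) can be sketched as below. This is a minimal sketch, not the tool itself: `SchedEntry`, its fields, and `find_short_waits` are hypothetical names, and it assumes the wait counts have already been decoded from the sched opcodes by a disassembler. The decoding is not shown.

```python
from typing import List, NamedTuple


class SchedEntry(NamedTuple):
    """One instruction with its already-decoded wait count (hypothetical layout)."""
    opcode: str  # mnemonic as printed by the disassembler
    stall: int   # wait count decoded from the sched opcode


def find_short_waits(reference: List[SchedEntry],
                     candidate: List[SchedEntry]) -> List[int]:
    """Return the indices where `candidate` waits less than `reference`.

    `reference` is the NVIDIA-generated binary, `candidate` is what
    nouveau produced for the same instruction stream.
    """
    if len(reference) != len(candidate):
        raise ValueError("instruction streams differ in length")

    short = []
    for index, (ref, cand) in enumerate(zip(reference, candidate)):
        if ref.opcode != cand.opcode:
            # The comparison only makes sense on identical instructions.
            raise ValueError(f"opcode mismatch at instruction {index}")
        if cand.stall < ref.stall:
            short.append(index)
    return short
```

An empty result means the candidate never waits less than the reference. It says nothing about dual issue or other scheduling fields, which would need their own comparison.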
15:23 RSpliet: hakzsam: do you think you can talk me through using demmt to extract OpenCL kernels?
15:25 RSpliet: I have the puniest tiny little kernel (euclidean distance), and would love to have its assembly on my monitor... but the -q output of demmt only lists unknown opcodes with size 0
15:25 imirkin: RSpliet: pastebin what you're talking about?
15:26 hakzsam: RSpliet, look for "CODE:" in the mmt trace
15:27 RSpliet: hakzsam: not found it seems...
15:28 hakzsam: weird
15:28 RSpliet: hakzsam: probably incompetence on my side rather than weirdness :-)
15:28 hakzsam: are you sure the trace is not corrupted?
15:28 hakzsam: do you see some "COMPUTE" commands in the trace?
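The two checks hakzsam suggests (look for "CODE:" and for "COMPUTE" commands) amount to a substring search over the decoded trace text. A minimal sketch, assuming the demmt output has been saved as plain text and is read line by line; `count_markers` is a hypothetical helper and nothing here depends on the actual demmt output layout.

```python
from typing import Dict, Iterable, Sequence


def count_markers(lines: Iterable[str],
                  markers: Sequence[str] = ("CODE:", "COMPUTE")) -> Dict[str, int]:
    """Count how many lines contain each marker string."""
    counts = {marker: 0 for marker in markers}
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts
```

A count of zero for both markers would be consistent with a truncated or corrupted trace, or with demmt not finding the IB, but it cannot tell those cases apart.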
15:28 imirkin: it might not be finding the ib
15:29 hakzsam: yeah
15:29 RSpliet: well, this isn't very helpful: https://paste.fedoraproject.org/419894/14728301/
15:29 hakzsam: imirkin, sounds familiar :)
15:29 imirkin: RSpliet: valgrind died
15:30 imirkin: did someone have patches... hm
15:30 hakzsam: I should have a patch for the IB stuff
15:31 imirkin: nah
15:31 imirkin: that won't help
15:31 imirkin: here mmt died while tracing
15:31 hakzsam: yeah, I saw
15:32 hakzsam: but maybe he will hit the other issue as well ;)
15:32 hakzsam: RSpliet, try this patch if demmt fails at finding the ib http://hastebin.com/qexisujenu
15:35 RSpliet: hakzsam: doesn't seem to make a difference
15:35 hakzsam: yeah, because valgrind died as ilia said
15:35 RSpliet: and, to be fair, the regular output overwhelms me, I don't know where to look to understand the issue
15:35 RSpliet: while -q is a bit underwhelming, apart from that error :-P
15:37 hakzsam: is your valgrind-mmt up-to-date?
15:37 hakzsam: maybe you can send me your CL sample
15:37 RSpliet: hakzsam: built it this afternoon
15:37 RSpliet: CL sample is the "nn" kernel from Rodinia (nearest neighbour)
15:38 RSpliet: do you need me to send you that? or did you already obtain Rodinia?
15:38 hakzsam: yes, please send :)
15:39 RSpliet: the whole rodinia is 360MB though
15:39 imirkin: RSpliet: do you have the output of the mmt run? including the cmdline you used?
15:39 hakzsam: RSpliet, link?
15:39 imirkin: [i don't mean the mmt it generated, but the stdout]
15:40 RSpliet: valgrind --tool=mmt --mmt-trace-nvidia-ioctls --log-file=nn_small.bin ./nn filelist.txt -r 5 -lat 30 -lng 90 -t
15:40 RSpliet: let me fetch the full output
15:41 imirkin: hmmmm
15:41 RSpliet: hmm, it's just not a very interesting output
15:41 imirkin: i dunno if it's enough
15:41 hakzsam: it is
15:41 imirkin: i think i tend to add --mmt-trace-file=/dev/nvidia0 --mmt-trace-file=/dev/nvidiactl
15:41 imirkin: bbl
15:42 hakzsam: I use the same options as RSpliet usually
15:42 RSpliet: doesn't make a difference
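The two invocations compared above differ only in the extra `--mmt-trace-file` options. A minimal sketch that assembles either variant as an argument list, using only the options quoted in the conversation; `build_mmt_command` is a hypothetical helper, and the list can be passed to `subprocess.run` on a machine with valgrind-mmt installed.

```python
from typing import List, Sequence


def build_mmt_command(log_file: str,
                      app_argv: Sequence[str],
                      trace_files: Sequence[str] = ()) -> List[str]:
    """Build a valgrind-mmt command line from the options quoted in the log."""
    cmd = [
        "valgrind",
        "--tool=mmt",
        "--mmt-trace-nvidia-ioctls",
        f"--log-file={log_file}",
    ]
    # Optional extra device nodes to trace, as in imirkin's variant.
    cmd += [f"--mmt-trace-file={path}" for path in trace_files]
    # The traced application and its arguments go last.
    cmd += list(app_argv)
    return cmd
```

Calling it with `trace_files=("/dev/nvidia0", "/dev/nvidiactl")` gives imirkin's variant; leaving `trace_files` empty gives the form RSpliet and hakzsam use.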
15:42 RSpliet: ok, one thing at a time... Rodinia :-)
15:43 hakzsam: yep :)
15:43 imirkin: RSpliet: what's your goal of this btw? i.e. why do you want to look at that shader?
15:43 imirkin: RSpliet: another thing to try potentially is to downgrade blob versions
15:44 RSpliet: imirkin: it's a simple thing that helps me set up a sane inspection env
16:04 RSpliet: this hint helped me not crash valgrind https://bugs.kde.org/show_bug.cgi?id=352742
17:10 RSpliet: NVIDIA might actually have reshuffled their IOCTLs
17:10 imirkin_: hence the "try an older driver" idea
17:10 RSpliet: yeah
17:12 RSpliet: not stubbornness, but hakzsam's one-liner quickfix had broken demmt as it no longer tried to disassemble the OpenCL kernel even from his trace
17:13 RSpliet: just spent some time regenerating the output he got, and was happy to find the opcodes somewhere in my trace as well :-)
17:13 hakzsam: RSpliet, right, I tested it and it didn't work
17:14 hakzsam: it works only when you have both 3D and CP actually
17:14 hakzsam: for compute shaders
17:14 hakzsam: anyway, you got your shader code, that's fine :)
17:14 RSpliet: hakzsam: yeah. Now to see if I can replicate the result on my own machine
17:14 RSpliet: well
17:14 RSpliet: next week
17:14 RSpliet: thanks for all the help btw!
17:15 hakzsam: np
19:12 karolherbst: hakzsam: when will you need reator over the weekend?
19:46 librin: hello
19:47 librin: if a game segfaults in nouveau_fence_trigger_work(), is that a likely candidate for a nouveau bug?
19:52 RSpliet: librin: that depends on whether your software stack is synchronised with upstream
19:53 imirkin_: librin: yes. usually one related to locking.
19:53 imirkin_: librin: i have a 'locking' branch which may fix some of those bugs: https://github.com/imirkin/mesa/commits/locking
19:54 imirkin_: librin: but it's not 100% ready for prime time yet, unfortunately
19:56 librin: RSpliet: always git HEAD
19:56 librin: B)
19:57 librin: imirkin_: gee, it seems to crash in a different part of nouveau each time
19:57 imirkin_: librin: yeah, so that's almost definitely a locking issue
20:01 librin: imirkin_: ah, okay, thanks
20:01 librin: guess I don't need to do any bug-reporting, then
20:01 librin: xP
20:02 imirkin_: well, if you want the app to work better, you could try my branch
20:02 imirkin_: i should probably do a rebase at some point
20:09 librin: imirkin_: thanks a bunch!