03:06 kloofy: it's all the positive energy and earlier achievements which produce correct in the brain about how things work
03:08 kloofy: finally after having taught a lot about it to the idiots i get somewhat less daily basis picketations, it's entirely understood why those happened though
03:08 kloofy: reason is simple, most people can't use their minds/brain
03:09 kloofy: it's an inherited or inherent capability to be able to think legit and good
03:15 kloofy: people here tie a tie on violate and bluff extensively feeling important, most of the time they need real facts to be notified and unfortunately most the time the true ones how big idiots ones are
03:18 kloofy: so this is how quartus altera tools work, this is how americans work, it needs a bit intelligence to understand it
03:19 kloofy: the thing is optimizing altera synthesis can take any sort of noc on, and optimize it to fit
03:20 kloofy: current example is stanford noc switch is being optimized 22times of area with quartus II v11
03:21 kloofy: so we can't say such crap on wikipedia that outer world does not know how to use their technology
03:22 kloofy: so it is safe to trust available tools doing a very neat stuff to optimize netlist generation, what i always have said
03:27 imirkin: hakzsam: fyi, elemental demo rendering is a bit messed up for me - the colors seem off for some of the scenes
03:29 imirkin: the initial scene is pretty flickery
03:41 imirkin: skeggsb: copy engine can't scale right?
03:54 skeggsb: imirkin: i don't believe so, no
03:54 imirkin: gr. so that's why i never fixed that bug
03:54 imirkin: it'd require me to do annoying things which are annoying.
03:55 skeggsb: yeah, from looking at nvidia's class header for it, that's a definite no
03:55 skeggsb: though, they've been known to omit things before
03:56 imirkin: the api it exposes definitely doesn't look like it does scaling either
03:57 skeggsb: i guess you're stuck using twod or the 3d class :/
03:57 imirkin: well, i'm stuck calling nvc0_blit()
03:57 imirkin: it's just ... not convenient.
03:58 skeggsb: can twod do everything you need it to do for the function you need scaling in?
03:58 imirkin: maybe.
03:58 imirkin: nvc0_blit avoids using 2d for 8x msaa resolve and for 3d textures
03:59 imirkin: but yeah, i should just tell them to suck it and use 2d. this is only for msaa surfaces being mapped, which means no 3d
04:00 imirkin: and i can't get myself to care about quality if you're using glReadPixels() on a winsys msaa surface
04:00 imirkin: moral of the story: don't do that.
04:00 imirkin: good call.
04:01 skeggsb: btw, have you ever toyed with using the (async, not the one on the gr channel) copy engines for transfers?
04:01 imirkin: we use it in the kernel, right?
04:02 skeggsb: for ttm stuff, yeah, but i'm thinking more for mesa's use
04:02 imirkin: right
04:02 skeggsb: might be able to have some stuff run asynchronously with rendering
04:02 imirkin: well, first we'd have to wean ourselves off of libdrm_nouveau
04:02 airlied: probably more us in vulkan
04:02 airlied: use
04:03 skeggsb: true
04:03 airlied: or opencl
04:03 imirkin: and the whole locking situation is ... mindblowing
04:03 skeggsb: imirkin: libdrm_nouveau supports that kind of thing if you use it properly
04:03 imirkin: maybe
04:03 imirkin: seems easier to just not worry about it
04:05 imirkin: anyways, making use of those copy engines certainly seems like a good idea
04:06 imirkin: i fully endorse that idea :)
04:06 imirkin: i also fully endorse the idea of getting a handle on how to analyze why XYZ is slow
04:06 skeggsb: patches welcome? ;)
04:06 imirkin: and i more-than-fully endorse the idea of making it so any little error in mesa doesn't take the kernel with it
04:07 imirkin: and yes, patches welcome for all of the above
04:29 imirkin: unfortunately there's more than enough bugs running around that trying to optimize things seems like a far-away task
06:18 hakzsam: imirkin, yeah, that happens to write the wrong fix
06:19 hakzsam: mmh, which gpu?
06:19 hakzsam: I do see some flickering on fermi as weel
06:19 hakzsam: *well
06:19 hakzsam: but colors seem not totally incorrect
11:20 karolherbst: imirkin: I plan to write a tool which takes nvidia generated binaries and compares the sched opcodes with what mesa-nouveau would produce for the same thing. I assume we have no disassembling stuff going on inside mesa. Maybe it makes more sense to write something based on envydis instead then
11:22 karolherbst: is there any solid way to write gpu assembly code and let nvidia fill the sched opcodes? like with those cuda tools?
11:24 RSpliet: karolherbst: It's probably most sane to tap binaries from NVIDIA using valgrind-mmt, then let your disassembler (envydis?) convert that to an assembly format that you can convert to NV50_IR post-RA?
11:25 karolherbst: RSpliet: good hint regarding mmt
11:25 karolherbst: but the IR actually misses a few things
11:26 karolherbst: but it shouldn't matter
11:26 RSpliet: I presume for a fair comparison of scheduler headers you don't want to apply nouveau's optimising passes and RA, as that could change scheduling decisions
11:26 karolherbst: I just care about sched opcodes
11:27 RSpliet: yeah, but RA has an influence on dual issue ;-)
11:27 RSpliet: (due to false register sharing complications)
11:28 RSpliet: so make sure to isolate what you want to quantify as well as you can :-)
11:31 karolherbst: mhh
11:31 karolherbst: I just want to run the method calculating the sched opcodes on the stuff
11:31 karolherbst: not build the entire cfg and all the bbs
11:32 karolherbst: mhh
11:32 karolherbst: but maybe I have to for this
11:32 karolherbst: :/
11:34 karolherbst: I think I will write something from scratch (except the actual disasm bits)
11:34 karolherbst: and implement what I actually need for this
11:39 hakzsam: karolherbst, sched codes for maxwell? or kepler?
11:41 karolherbst: kepler, but that tool could be used for both actually
11:41 karolherbst: it should just verify we do the right thing
11:42 karolherbst: or that we at least use higher wait times than nvidia
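
[editor's note] The check karolherbst describes (flag any instruction where nouveau waits less than NVIDIA does) could be sketched roughly as below. This is only an illustration under stated assumptions: the per-instruction wait times are assumed to have already been decoded from the sched opcodes into plain integers, the two instruction streams are assumed to line up one-to-one, and the function name `find_short_waits` and its parameters are hypothetical, not part of mesa or envytools.

```python
from typing import List, Tuple


def find_short_waits(
    nvidia_waits: List[int], nouveau_waits: List[int]
) -> List[Tuple[int, int, int]]:
    """Return (index, nvidia_wait, nouveau_wait) for every instruction
    where nouveau's wait time is lower than NVIDIA's.

    Assumption: both lists hold already-decoded wait times for the same
    instruction sequence, in the same order. Decoding the sched opcodes
    themselves is not shown here.
    """
    if len(nvidia_waits) != len(nouveau_waits):
        raise ValueError("instruction streams differ in length")
    return [
        (i, nv, ours)
        for i, (nv, ours) in enumerate(zip(nvidia_waits, nouveau_waits))
        if ours < nv
    ]
```

An empty result would mean nouveau waits at least as long as NVIDIA everywhere, which is the "at least use higher wait times" condition mentioned above. It says nothing about whether the longer waits are optimal.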
15:23 RSpliet: hakzsam: do you think you can talk me through using demmt to extract OpenCL kernels?
15:25 RSpliet: I have the puniest tiny little kernel (euclidean distance), and would love to have its assembly on my monitor... but the -q output of demmt only lists unknown opcodes with size 0
15:25 imirkin: RSpliet: pastebin what you're talking about?
15:26 hakzsam: RSpliet, look for "CODE:" in the mmt trace
15:27 RSpliet: hakzsam: not found it seems...
15:28 hakzsam: weird
15:28 RSpliet: hakzsam: probably incompetence on my side rather than weirdness :-)
15:28 hakzsam: are you sure the trace is not corrupted?
15:28 hakzsam: do you see some "COMPUTE" commands in the trace?
15:28 imirkin: it might not be finding the ib
15:29 hakzsam: yeah
15:29 RSpliet: well, this isn't very helpful: https://paste.fedoraproject.org/419894/14728301/
15:29 hakzsam: imirkin, sounds familiar :)
15:29 imirkin: RSpliet: valgrind died
15:30 imirkin: did someone have patches... hm
15:30 hakzsam: I should have a patch for the IB stuff
15:31 imirkin: nah
15:31 imirkin: that won't help
15:31 imirkin: here mmt died while tracing
15:31 hakzsam: yeah, I saw
15:32 hakzsam: but maybe he will hit the other issue as well ;)
15:32 hakzsam: RSpliet, try this patch if demmt fails at finding the ib http://hastebin.com/qexisujenu
15:35 RSpliet: hakzsam: doesn't seem to make a difference
15:35 hakzsam: yeah, because valgrind died as ilia said
15:35 RSpliet: and, to be fair, the regular output overwhelms me, I don't know where to look to understand the issue
15:35 RSpliet: while -q is a bit underwhelming, apart from that error :-P
15:37 hakzsam: is your valgrind-mmt up-to-date?
15:37 hakzsam: maybe you can send me your CL sample
15:37 RSpliet: hakzsam: built it this afternoon
15:37 RSpliet: CL sample is the "nn" kernel from Rodinia (nearest neighbour)
15:38 RSpliet: do you need me to send you that? or did you already obtain Rodinia?
15:38 hakzsam: yes, please send :)
15:39 RSpliet: the whole rodinia is 360MB though
15:39 imirkin: RSpliet: do you have the output of the mmt run? including the cmdline you used?
15:39 hakzsam: RSpliet, link?
15:39 imirkin: [i don't mean the mmt it generated, but the stdout]
15:40 RSpliet: valgrind --tool=mmt --mmt-trace-nvidia-ioctls --log-file=nn_small.bin ./nn filelist.txt -r 5 -lat 30 -lng 90 -t
15:40 RSpliet: let me fetch the full output
15:41 imirkin: hmmmm
15:41 RSpliet: hmm, it's just not a very interesting output
15:41 imirkin: i dunno if it's enough
15:41 hakzsam: it is
15:41 imirkin: i think i tend to add --mmt-trace-file=/dev/nvidia0 --mmt-trace-file=/dev/nvidiactl
15:41 imirkin: bbl
15:42 hakzsam: I use the same options as RSpliet usually
15:42 RSpliet: doesn't make a difference
15:42 RSpliet: ok, one thing at a time... Rodinia :-)
15:43 hakzsam: yep :)
15:43 imirkin: RSpliet: what's your goal of this btw? i.e. why do you want to look at that shader?
15:43 imirkin: RSpliet: another thing to try potentially is to downgrade blob versions
15:44 RSpliet: imirkin: it's a simple thing that helps me set up a sane inspection env
16:04 RSpliet: this hint helped me not crash valgrind https://bugs.kde.org/show_bug.cgi?id=352742
17:10 RSpliet: NVIDIA might actually have reshuffled their IOCTLs
17:10 imirkin_: hence the "try an older driver" idea
17:10 RSpliet: yeah
17:12 RSpliet: not stubbornness, but hakzsam's one-liner quickfix had broken demmt as it no longer tried to disassemble the OpenCL kernel even from his trace
17:13 RSpliet: just spent some time regenerating the output he got, and was happy to find the opcodes somewhere in my trace as well :-)
17:13 hakzsam: RSpliet, right, I tested it and it didn't work
17:14 hakzsam: it works only when you have both 3D and CP actually
17:14 hakzsam: for compute shaders
17:14 hakzsam: anyway, you got your shader code, that's fine :)
17:14 RSpliet: hakzsam: yeah. Now to see if I can replicate the result on my own machine
17:14 RSpliet: well
17:14 RSpliet: next week
17:14 RSpliet: thanks for all the help btw!
17:15 hakzsam: np
17:45 kloofy: well children, uncle mart is here, the famous one...
17:46 kloofy: mart the roof, mart the big one
17:48 kloofy: ah never mind it's like it used to be at old times.. so that, world connection was managed via estonian stars
17:50 kloofy: i.e. russians were mining the ore, estonian ones were buffering that to the rest of the world to do electronic achievements
17:50 kloofy: just like mexican gangsters buffer cocaine to united states
17:52 kloofy: so tell me who's your daddy, who helps you to think properly and find some calm moments?
19:03 kloofy: is that the real deal that is from e......?
19:12 karolherbst: hakzsam: when will you need reator over the weekend?
19:46 librin: hello
19:47 librin: if a game segfaults in nouveau_fence_trigger_work(), is that a likely candidate for a nouveau bug?
19:52 RSpliet: librin: that depends on whether your software stack is synchronised with upstream
19:53 imirkin_: librin: yes. usually one related to locking.
19:53 imirkin_: librin: i have a 'locking' branch which may fix some of those bugs: https://github.com/imirkin/mesa/commits/locking
19:54 imirkin_: librin: but it's not 100% ready for prime time yet, unfortunately
19:56 librin: RSpliet: always git HEAD
19:56 librin: B)
19:57 librin: imirkin_: gee, it seems to crash in a different part of nouveau each time
19:57 imirkin_: librin: yeah, so that's almost definitely a locking issue
20:01 librin: imirkin_: ah, okay, thanks
20:01 librin: guess I don't need to do any bug-reporting, then
20:01 librin: xP
20:02 imirkin_: well, if you want the app to work better, you could try my branch
20:02 imirkin_: i should probably do a rebase at some point
20:09 librin: imirkin_: thanks a bunch!