14:57Tom^2: karolherbst: is envytools borked? was just about to get your temp readings from nvidia and nvaforcetemp just errors with: WARN: Can't probe 0000:01:00.0 and PCI init failure!
14:57karolherbst: you have to run that as root
14:58Tom^2: yea i am
14:58karolherbst: pretty sure you aren't :p
14:59Tom^2: karolherbst: http://i.imgur.com/9GAPw7J.png well. *shrug*
15:00karolherbst: run "id"
15:00Tom^2: or want a scrot of that too? xD
15:01karolherbst: for me it works
15:01karolherbst: mwk: any ideas?
15:02Tom^2: it has worked before, so either something changed or i'm missing something
15:13Tom_2_: karolherbst: its something nvidia has changed then, because with nouveau booted i can issue the nva tools without errors
15:14karolherbst: which driver version?
15:14karolherbst: although that shouldn't matter at all
15:15Tom_2_: karolherbst: 364.19
15:15karolherbst: super odd
15:15karolherbst: I have like 367.18
15:16Tom_2_: unless envytools required a bunch of lib32-libs that got brought in because of lib32-mesa-libgl
15:16Tom_2_: but that shouldnt be the case
15:17karolherbst: I am sure you messed something up :p
15:18Tom_2_: karolherbst: https://gist.github.com/anonymous/c9ea95e191ca0fbbed53d13c61ec210f well no idea but i'll bring it back.
15:19karolherbst: Tom_2_: anyway, the nvidia driver shouldn't have any effect on those nva tools
15:21Tom_2: karolherbst: it sure does here
15:21karolherbst: Tom_2: well gdb helps then :p
15:25Tom_2: karolherbst: https://gist.github.com/anonymous/5252fa1633587327aaaf81e082e34d99 that resource0 file doesnt exist.
15:27karolherbst: this is like super odd then
16:13karolherbst: Tom^: found anything?
16:13Tom^: nope, that file is missing and it makes nva_init bork. i think
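[Editor's sketch: the missing-file diagnosis above can be checked directly. This assumes the file in question is the standard Linux sysfs BAR0 entry for the card at 0000:01:00.0; the gist itself is not quoted in the log, so the exact path is an assumption, and the helper names below are hypothetical.]

```python
import os

# Assumption: the tools map the GPU's BAR0 through the sysfs "resource0" file.
DEFAULT_SYSFS_ROOT = "/sys/bus/pci/devices"


def bar0_path(pci_addr, sysfs_root=DEFAULT_SYSFS_ROOT):
    """Build the path to the BAR0 resource file for a PCI address."""
    return os.path.join(sysfs_root, pci_addr, "resource0")


def check_bar0(pci_addr, sysfs_root=DEFAULT_SYSFS_ROOT):
    """Return 'missing', 'no-access' or 'ok' for the BAR0 resource file."""
    path = bar0_path(pci_addr, sysfs_root)
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK | os.W_OK):
        return "no-access"
    return "ok"


if __name__ == "__main__":
    # PCI address taken from the warning quoted in the log.
    print(check_bar0("0000:01:00.0"))
```

[A result of "missing" while running as root would match what Tom^ reports here; "no-access" would instead point at the not-root theory from earlier in the conversation.]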
16:14Tom^: and then i rebooted to windows and indulged my senses in pointless procrastination.
16:15karolherbst: very odd though
20:30gregory38: karolherbst: hello
20:30gregory38: where did you get the register allocation info ?
20:30karolherbst: which one?
20:30karolherbst: you mean the ir dump?
20:31gregory38: yesterday you told me that my new shader used 20 registers
20:31karolherbst: I checked which was the highest used reg
20:31karolherbst: but there is a way to print that info
20:31imirkin_: gregory38: we output stats via debug info
20:31imirkin_: gregory38: you can see it in glretrace, or you can add your own debug handler in your application
20:31imirkin_: only in debug contexts though
20:32karolherbst: imirkin_: maybe it makes sense to print it with NV50_PROG_DEBUG enabled though
20:32imirkin_: karolherbst: i'm going to revamp the env vars at some point
20:32imirkin_: the current thing is dumb for lots of reasons
20:32gregory38: oh ok
20:33gregory38: So far I set the env var in PCSX2 and I redirect stderr to file
20:33imirkin_: gregory38: https://cgit.freedesktop.org/mesa/shader-db/tree/run.c#n575
20:33imirkin_: that will give you notifications whenever we compile
20:33imirkin_: (in a debug gl context)
20:33gregory38: ok, I will redirect it
20:34imirkin_: well, it's up to you what you do with it - it just calls a function
20:34imirkin_: that function can print the message to stderr, or it can go make coffee - doesn't really matter
20:34gregory38: love coffee :p
20:34imirkin_: anyways, that includes a bunch of metrics i felt were interesting to keep track of
20:35imirkin_: instruction count, gpr usage, local memory, and byte size
20:35imirkin_: for nvc0+, byte size == 8 * instruction count
20:35imirkin_: but nv50 has both 4- and 8-byte encodings
20:35imirkin_: [actually nvc0 does too, but neither we nor the blob use them]
20:36gregory38: Nice :)
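[Editor's sketch: the relation between byte size and instruction count described above can be turned into a small sanity check on reported stats. The function name and the boolean flag are hypothetical; only the 8-byte rule for nvc0+ and the mixed 4/8-byte encodings on nv50 come from the conversation.]

```python
def code_size_plausible(inst_count, byte_size, nvc0_or_later):
    """Check whether a reported byte size is consistent with an instruction count.

    nvc0+: every instruction is emitted as 8 bytes, so size == 8 * count.
    nv50:  instructions are 4 or 8 bytes, so the size lies between
           4 * count and 8 * count and is a multiple of 4.
    """
    if inst_count < 0 or byte_size < 0:
        raise ValueError("counts must be non-negative")
    if nvc0_or_later:
        return byte_size == 8 * inst_count
    return (4 * inst_count <= byte_size <= 8 * inst_count
            and byte_size % 4 == 0)


if __name__ == "__main__":
    print(code_size_plausible(10, 80, nvc0_or_later=True))
    print(code_size_plausible(10, 60, nvc0_or_later=False))
```

[On nv50 the gap between byte_size and 8 * inst_count gives a rough idea of how many short encodings the compiler managed to use.]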
20:36imirkin_: if local memory > 0, that means you lost
20:36imirkin_: you really want it to be 0
20:36gregory38: why ?
20:36imirkin_: GFxxx / GK10x have 64 registers, GK110+ has 256 registers
20:36imirkin_: coz it's slow
20:37imirkin_: but we have to use it if you e.g. have an array that you indirectly access
20:37imirkin_: (so don't do that)
20:37imirkin_: or if you use too many registers and we have to spill
20:37imirkin_: (so don't do that)
20:38gregory38: 64 registers <= I guess for a group of threads? Or per thread?
20:38imirkin_: or if you're using opencl and have an alloca() call. so don't do that :)
20:38imirkin_: 64 registers per thread
20:38imirkin_: but there's a finite number of registers in a SM
20:38gregory38: by thread, I mean a shader run for a fragment (or a vertex)
20:38imirkin_: so by using more registers per thread, you end up losing parallelism
20:39imirkin_: for fermi, there's 32K registers/SM, for kepler+ there's 64K
20:39imirkin_: [except the mythical GK210 which has 128K]
20:40Calinou: 640K ought to be enough for anybody.
20:40imirkin_: Tesla K80
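[Editor's sketch: the parallelism trade-off described above is simple division. This uses only the per-SM register file sizes quoted in the conversation and deliberately ignores every other hardware limit (allocation granularity, maximum resident threads per SM, and so on), so it is an upper bound rather than a real occupancy figure. The table keys and function name are hypothetical.]

```python
# Register file size per SM, as quoted in the conversation.
REGS_PER_SM = {
    "fermi": 32 * 1024,
    "kepler": 64 * 1024,
    "gk210": 128 * 1024,
}


def max_threads_by_registers(family, regs_per_thread):
    """Upper bound on resident threads per SM imposed by register usage alone."""
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    return REGS_PER_SM[family] // regs_per_thread


if __name__ == "__main__":
    # A 20-register shader versus one that uses all 64 registers.
    for regs in (20, 64):
        print("fermi", regs, max_threads_by_registers("fermi", regs))
```

[Going from 20 to 64 registers per thread cuts this bound by roughly a factor of three, which is the loss of parallelism imirkin_ is referring to.]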
20:40gregory38: are registers 32/64/128 bits ?
20:40karolherbst: Calinou: not there yet :p
20:41imirkin_: on G80, each register is actually 16-bit, but most registers are accessed via 32-bit views on that space.
20:41imirkin_: and there are 128 32-bit registers, or 256 16-bit ones, depending how you look at it
20:42imirkin_: but short encodings can only address up to 64 regs, so it's better to keep under that bound
20:42gregory38: thanks for all the info.
20:42imirkin_: probably a lot more than you wanted to hear :)
20:43gregory38: Well I'm quite curious so it is fine
20:43karolherbst: gregory38: well if you see stupid things the compiler does, you can always tell us. I tried to find some simple things we could improve, but somehow the simple things are mostly gone :/
20:44imirkin_: yeah, one thing to remember is that it's really easy to optimize things in your head, but a compiler can have a more difficult time
20:44gregory38: yeah I know
20:45gregory38: Better tune glsl a bit
20:45karolherbst: but I think some of my opts are safe enough, though I think I will still hold them for some time because I seem to run into an issue every week
20:46gouchi: the backtrace I got for mpv: pushbuf.c:238: pushbuf_krel: Assertion `bkref' failed.
20:47gouchi: I will have to recompile mpv with debug symbols
20:47imirkin_: gouchi: expected.
20:47karolherbst: multithreading again?
20:47imirkin_: and my patch won't help him either
20:47imirkin_: gouchi: use mplayer.
20:47gouchi: imirkin: mpv, which is a fork of mplayer and mplayer2
20:48imirkin_: gouchi: mplayer. mplayer. not mplayer fork. mplayer.
21:03gregory38: is it normal to use that much gpr ?
21:04gregory38: the glsl shader just moves stuff around (creates 2 triangles (a quad) from a line)
21:05imirkin_: it's unfortunate
21:06imirkin_: the issue is that something decides to (a) load ALL the inputs and then (b) emit stuff
21:06imirkin_: [something = varying packing]
21:06imirkin_: and we don't have an instruction scheduler
21:06imirkin_: which means that we also don't do anything to reduce register pressure/etc
21:07gregory38: ah ok. Doesn't help
21:07imirkin_: if you can redo things to always use vec4's, that would "solve" that problem, since it wouldn't add stupid varying packing bs
21:07imirkin_: oh, and just fyi, $r63 == hard wired to 0
21:08gregory38: well I need to move either the X/U or the Y/V stuff
21:08imirkin_: or i think if you use explicit locations, that will also avoid the varying packer
21:08gregory38: you mean for the interface?
21:09gregory38: I'm waiting for tim's patches to support the missing 4.4 extensions
21:09gregory38: but good to know
21:09imirkin_: you mean enhanced layouts?
21:09gregory38: I'm using interface block
21:10imirkin_: i think you can set explicit locations on items within the iface block, no?
21:10gregory38: in enhanced layouts ;)
21:11gregory38: anyway, it is a low priority
21:14imirkin_: i wish the varying packer didn't suck so much on geometry shaders
21:14karolherbst: though nouveau sucks here too
21:14imirkin_: yeah, but it's creating unnecessary suckage for nouveau
21:15imirkin_: and i think ken fixed it for tess
21:15imirkin_: just have to hook that same thing up for geom
21:15karolherbst: I am sure this entire thing can be done in like 30 instructions if we were clever about it
21:17gregory38: well you need 18 export + 6 emit
21:18gregory38: 30 instructions feels a bit optimistic IMHO ;)
21:24imirkin_: it's better on nv50 where you can export as you load :)
21:25karolherbst: yeah... do we have geometry shaders there already?
21:25imirkin_: yep. that was my second large contribution to nouveau :)
21:25karolherbst: now I know why my tesla spit out 3.0 today
21:26karolherbst: my glxinfo was just too old then